cta-taped should log the FST being used for a data transfer
When cta-taped opens a disk file it uses a URL to the EOS MGM. The MGM redirects the open request to the appropriate FST. When an FST is running slowly or is not responding we want to know the network name of the machine. This is currently not logged in the /var/log/cta/cta-taped.log file.
Today we first find the "open disk file" log entry, for example:
[1607945232.949140000] Dec 14 12:27:12.949140 tpsrv029 cta-taped: LVL="INFO" PID="5102" TID="9420" MSG="Opened disk file for writing" thread="DiskWrite" tapeDrive="I3551411" tapeVid="I56914" mountId="44332" threadCount="10" threadID="0" fileId="1498975153" dstURL="root://eosctacms.cern.ch//eos/ctacms/archive/cms/store/data/Run2015D/JetHT_0T/MINIAOD/27Jan2016-v1/60000/9C0049C9-91C8-E511-B627-0025907750A0.root?eos.lfn=fxid:59588bb1&eos.ruid=0&eos.rgid=0&eos.injection=1&eos.workflow=retrieve_written&eos.space=retrieve&oss.asize=1156826765" fSeq="31134" actualURL="root://eosctacms.cern.ch//eos/ctacms/archive/cms/store/data/Run2015D/JetHT_0T/MINIAOD/27Jan2016-v1/60000/9C0049C9-91C8-E511-B627-0025907750A0.root?eos.lfn=fxid:59588bb1&eos.ruid=0&eos.rgid=0&eos.injection=1&eos.workflow=retrieve_written&eos.space=retrieve&oss.asize=1156826765"
Next we go to the MGM logs, find the redirection log entry and extract the network name of the FST. In this case the FST network name is eosctafst0130.cern.ch:
[root@eosctafst0124 ~]# grep 12:27 /var/log/eos/mgm/xrdlog.mgm | grep /eos/ctacms/archive/cms/store/data/Run2015D/JetHT_0T/MINIAOD/27Jan2016-v1/60000/9C0049C9-91C8-E511-B627-0025907750A0.root | grep redirection | sed 's/ /\n/g' | grep redirection
redirection=eosctafst0130.cern.ch?&cap.sym=<...>&cap.msg=<...>&mgm.logid=4dfdf530-3dff-11eb-a188-b8599f4010a2&mgm.replicaindex=0&mgm.replicahead=0&mgm.id=59588bb1&mgm.event=sync::closew&mgm.workflow=retrieve_written&mgm.instance=eosctacms&mgm.owner_uid=22014&mgm.owner_gid=1399&mgm.requestor=root&mgm.requestorgroup=root&mgm.attributes=c3lzLmFyY2hpdmUuZmlsZV9pZD0xNDk4OTc1MTUzOzs7c3lzLmFyY2hpdmUuc3RvcmFnZV9jbGFzcz1jbXM7OztzeXMuY3RhLm9iamVjdHN0b3JlLmlkPVJldHJpZXZlUmVxdWVzdC1Gcm9udGVuZC1jdGFwcm9kdWN0aW9uZnJvbnRlbmQwMS5jZXJuLmNoLTE4MzkxLTIwMjAxMjEwLTE0OjA3OjAyLTAtMzI3MzI=
[root@eosctafst0124 ~]#
The step of determining the result of the redirection could be eliminated if cta-taped could get this information itself and log it to /var/log/cta/cta-taped.log. This would greatly help operators identify slow or unresponsive FSTs.
At Vlado's request I contacted Michal, who is responsible for the XRootD client library, and asked him whether we could use the XRootD client library API to get the result of the redirection. The short answer is yes. Michal's reply was as follows:
Hi Steve,
Yes it is possible, after successful open you can call on your XrdCl::File object the
XrdCl::File::GetProperty method with "DataServer" key, the returned value will
be the disk server host name colon port (e.g. myhost.cern.ch:1094), or you could
use "LastURL" if you want the full url.
Cheers,
Michal
The cta-taped daemon should be modified to get the result of the redirection and log it in a place convenient for operators within the /var/log/cta/cta-taped.log file.
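As a starting point, Michal's suggestion could be sketched roughly as below. This is only an illustration, not the actual cta-taped change: the helper names getDataServer and hostOf and the FakeOpenedFile type are hypothetical, and the sketch assumes only that the file object offers GetProperty(name, value) returning a bool, as described for XrdCl::File in the reply above. The template lets the same helper be tried out without the XRootD headers.

```cpp
#include <string>

// Returns the "DataServer" property ("host:port") of an already opened file,
// or an empty string if the property is not available.
// FileT is assumed to provide
//   bool GetProperty(const std::string &name, std::string &value)
// which is what XrdCl::File is described as offering in the reply above.
template <typename FileT>
std::string getDataServer(FileT &file) {
  std::string value;
  if (!file.GetProperty("DataServer", value)) {
    return "";
  }
  return value;
}

// Strips a trailing ":port" so that only the host name is left.
// A string without a colon is returned unchanged.
std::string hostOf(const std::string &dataServer) {
  const std::string::size_type colon = dataServer.rfind(':');
  if (colon == std::string::npos) {
    return dataServer;
  }
  return dataServer.substr(0, colon);
}

// Hypothetical stand-in for an opened XrdCl::File, used only so that the
// sketch can be compiled and exercised without XRootD.
struct FakeOpenedFile {
  std::string dataServer;

  bool GetProperty(const std::string &name, std::string &value) const {
    if (name != "DataServer" || dataServer.empty()) {
      return false;
    }
    value = dataServer;
    return true;
  }
};
```

In cta-taped the helper would be called on the real XrdCl::File right after a successful open, and the result added as an extra parameter to the existing "Opened disk file for writing" log line. The name of that parameter is still to be decided. Logging the full "host:port" value keeps the port visible, while hostOf gives just the FST network name that operators look for today.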