BaBar RAL Xrootd Maintenance - Revision history<br />
Fergus babar wilson at 13:26, 15 September 2009<p></p>
<p><b>New page</b></p><div>[[Category:BaBar RAL Tier A Operations]]<br />
<br />
==Redirectors and Servers==<br />
<br />
We currently have two redirectors (xrootd260 and xrootd274) and four servers (gdss147, gdss148, gdss164 and gdss165), which can be reached from the BaBar front-end machines:<br />
<br />
<pre><br />
ssh babar.rl.ac.uk<br />
ssh xrootd260.gridpp.rl.ac.uk<br />
</pre><br />
<br />
On the redirector xrootd260, the logs show each file request and the server to which it was redirected:<br />
<br />
<pre><br />
more /opt/xrootd/logs/xrdlog<br />
081123 23:00:00 17925 Copr. 2007 Stanford University, xrd version 20071101-0808p1_dbg<br />
081123 23:00:00 17925 xrootd anon@xrootd260.gridpp.rl.ac.uk:1094 running.<br />
081124 09:29:02 17925 XrootdXeq: babarmc.15394:15@lcg0860 login<br />
081124 09:29:02 17925 odc_send2Man: babarmc.15394:15@lcg0860 redirected to gdss165.gridpp.rl.ac.uk:1094 by xrootd-rdr1 path=/store/cfg/2008/05/CfgDB-20080514T224248.root<br />
081124 09:29:04 17925 odc_send2Man: babarmc.15394:15@lcg0860 redirected to gdss147.gridpp.rl.ac.uk:1094 by xrootd-rdr1 path=/store/cfg/2008/05/CfgDB-20080514T224248.root<br />
081126 09:29:04 17925 odc_send2Man: babarmc.15394:15@lcg0860 redirected to gdss147.gridpp.rl.ac.uk:1094 by xrootd-rdr1 path=/store/test.root<br />
</pre><br />
<br />
If your job was having problems accessing the file test.root, you can log into gdss147 (the server that the log above shows the request was redirected to) and restart olbd and xrootd (see below). Also restart olbd and xrootd on both redirectors.<br />
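The lookup described above can be sketched as a small shell function. This is only an illustration: the function name <code>redirect_targets</code> is hypothetical, and it assumes that redirect lines keep the layout shown in the log excerpt (<code>redirected to HOST:PORT</code>, with <code>path=...</code> as the last field).

```shell
#!/bin/sh
# Hypothetical helper (not part of xrootd): read redirector log lines on
# stdin and print, for one file path, how often each server was chosen.
# Assumes the line layout shown in the log excerpt above.
redirect_targets() {
    # $1 = the path exactly as it appears after "path=" in the log
    awk -v want="path=$1" '
        /redirected to/ && $NF == want {
            for (i = 1; i < NF; i++)
                if ($i == "redirected" && $(i + 1) == "to") {
                    split($(i + 2), hp, ":")   # host:port -> host
                    count[hp[1]]++
                }
        }
        END { for (h in count) print count[h], h }
    ' | sort -k2
}
```

On the redirector it could be called as <code>redirect_targets /store/test.root < /opt/xrootd/logs/xrdlog</code>; each output line is a count followed by a server name.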
<br />
==Starting and Stopping Services==<br />
<br />
A script has been created to stop and restart the xrootd and olbd services on all the servers. Log into <code>xrootd260</code> as <code>bbdatsrv</code>. To see the status of the services, run the following script:<br />
<br />
<pre><br />
~/xrootd_manage.sh status<br />
</pre><br />
<br />
You should see something like the following if the services are up and running:<br />
<pre><br />
gdss147<br />
xrootd (pid 4024) is running...<br />
olbd (pid 3961) is running...<br />
gdss148<br />
xrootd (pid 30828) is running...<br />
olbd (pid 30764) is running...<br />
.<br />
.<br />
.<br />
</pre><br />
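To spot a stopped service without reading the whole listing, the status output can be filtered. The sketch below is an assumption-laden illustration: <code>check_status</code> is a hypothetical name, it treats any single-word line as a host name, and it flags every service line that does not end in <code>is running...</code>, since the exact wording printed for a stopped service is not shown on this page.

```shell
#!/bin/sh
# Hypothetical filter for the status listing above. Reads the listing on
# stdin, prints "host: line" for each service line that is not reported
# as running, and returns non-zero if any such line was found.
check_status() {
    awk '
        BEGIN { bad = 0 }
        # A line holding a single word (other than dots) is taken as a host name.
        NF == 1 && $1 !~ /^\.+$/ { host = $1; next }
        # Any other multi-word line must end in "is running...".
        NF > 1 && !/is running\.\.\.$/ { print host ": " $0; bad = 1 }
        END { exit bad }
    '
}
```

A possible use is <code>~/xrootd_manage.sh status | check_status</code>, which prints nothing when every service is up.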
<br />
To stop and restart all services, run the following script:<br />
<br />
<pre><br />
~/xrootd_manage.sh restart<br />
</pre><br />
<br />
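There are six machines in total (gdss147, gdss148, gdss164, gdss165, xrootd260 and xrootd274). To stop, start or restart a service on a single machine, log into it as <code>bbdatsrv</code> and issue the following command, where SERVICE is one of <code>xrootd</code>, <code>olbd</code> or <code>mps</code> and ACTION is one of <code>stop</code>, <code>start</code> or <code>restart</code>:<br />
<br />
<pre><br />
sudo /sbin/service SERVICE ACTION<br />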
</pre><br />
<br />
If you don't have passwordless ssh login you will have to log into the machines independently from the front ends. For example:<br />
<br />
<pre><br />
[lcgui0359] /home/csf/bbdatsrv > ssh bbdatsrv@gdss147.gridpp.rl.ac.uk<br />
<br />
-bash-3.00$ sudo /sbin/service xrootd stop<br />
Shutting down xrootd server: [ OK ]<br />
-bash-3.00$ sudo /sbin/service olbd stop<br />
Shutting down olbd server: [ OK ]<br />
-bash-3.00$ sudo /sbin/service olbd start<br />
Starting olbd server: [ OK ]<br />
-bash-3.00$ sudo /sbin/service xrootd start<br />
Starting xrootd server: [ OK ]<br />
</pre><br />
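The session above stops xrootd before olbd and starts them in the reverse order. A minimal sketch of that sequence as a function follows; <code>restart_in_order</code> and the <code>SERVICE_CMD</code> variable are hypothetical names introduced here so the command can be swapped for <code>echo</code> in a dry run.

```shell
#!/bin/sh
# Hypothetical wrapper reproducing the order used in the session above:
# stop xrootd, stop olbd, start olbd, start xrootd.
# SERVICE_CMD is an assumption of this sketch; set it to "echo" for a dry run.
SERVICE_CMD=${SERVICE_CMD:-"sudo /sbin/service"}

restart_in_order() {
    # Stops are attempted regardless of result (a service may already be down).
    $SERVICE_CMD xrootd stop
    $SERVICE_CMD olbd stop
    # Start xrootd only if olbd came up.
    $SERVICE_CMD olbd start && $SERVICE_CMD xrootd start
}
```

Running <code>SERVICE_CMD=echo restart_in_order</code> prints the four service/action pairs without touching anything, which is a cheap way to check the order before running it for real.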
<br />
==To check the stage queues on all the staging servers==<br />
Again log into one of the front ends as <code>bbdatsrv</code> and run:<br />
<br />
<pre><br />
# ral-stagers.txt lists the staging servers, one host name per entry<br />
for server in $(cat ral-stagers.txt)<br />
do echo "$server"<br />
ssh -x "$server" 'sort --unique /opt/xrootd/stageQ/PreStageQ.0'<br />
done<br />
</pre><br />
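When only the queue lengths matter, the same loop can report a count per server instead of the full queue contents. This is a sketch under assumptions: <code>stage_queue_depths</code> and <code>RUN_REMOTE</code> are hypothetical names, the host file is assumed to hold one name per line, and each unique line of <code>PreStageQ.0</code> is assumed to be one queued request.

```shell
#!/bin/sh
# Hypothetical variant of the loop above: print "server count" per stager.
# RUN_REMOTE is an assumption of this sketch so the remote command can be
# replaced in tests. ssh -n keeps ssh from consuming the host list on stdin.
RUN_REMOTE=${RUN_REMOTE:-"ssh -x -n"}

stage_queue_depths() {
    # $1 = file listing the staging servers, one host name per line
    while read -r server; do
        [ -n "$server" ] || continue          # skip blank lines
        # Arithmetic expansion strips any padding that wc adds.
        n=$(( $($RUN_REMOTE "$server" 'sort --unique /opt/xrootd/stageQ/PreStageQ.0' | wc -l) ))
        printf '%s %d\n' "$server" "$n"
    done < "$1"
}
```

Called as <code>stage_queue_depths ral-stagers.txt</code>. Note that a server that cannot be reached would also show a count of 0, so a zero is not proof of an empty queue.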
<br />
==What services should be running==<br />
<br />
On RH73 boxes, each of these processes shows up multiple times in ps (one parent plus one or more children); on SL boxes, only one process shows up.<br />
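Because the number of entries per service differs between RH73 and SL, a check only needs each expected name to appear at least once. The sketch below takes that approach; <code>missing_services</code> is a hypothetical name, and it assumes the process names listed in this section appear verbatim in the process listing fed to it.

```shell
#!/bin/sh
# Hypothetical check: report expected process names that do not appear in
# a process listing. One or many matching entries both count as present.
missing_services() {
    # stdin: one command name per line, e.g. from `ps -e -o comm=`
    # args:  the process names expected on this machine
    seen=$(cat)
    status=0
    for name in "$@"; do
        if ! printf '%s\n' "$seen" | grep -qxF -- "$name"; then
            echo "missing: $name"
            status=1
        fi
    done
    return $status
}
```

On any machine this could be run as <code>ps -e -o comm= | missing_services xrootd olbd</code>, adding the stager-only names on the staging servers.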
<br />
===On All Machines===<br />
<br />
<code>xrootd</code><br><br />
<code>olbd</code><br />
<br />
===On Stagers===<br />
<br />
<code>mps_prep</code> as a subprocess of <code>xrootd</code><br><br />
<code>mps_MigrPurg</code><br><br />
<code>mps_PreStage</code><br></div>