QMUL

From GridPP Wiki

The content of this page (last updated in 2007) has been removed from the live wiki; an archived copy is kept below. To contact the [https://goc.egi.eu/portal/index.php?Page_Type=Site&id=503 QMUL grid site] please use: edg-site-admin@qmul.ac.uk

== Cross Site support at QMUL ==
Here is a list of things you can do (a scripted version is sketched after this list):

*ce02:
::<tt>-bash-2.05b$ sudo tail -1 /var/log/messages</tt>
::<tt>-bash-2.05b$ sudo /etc/rc.d/init.d/globus-gatekeeper status</tt>
::<tt>-bash-2.05b$ sudo /etc/rc.d/init.d/globus-gridftp status</tt>
::<tt>-bash-2.05b$ sudo /etc/rc.d/init.d/globus-mds status</tt>

*ce01:
::<tt>-bash-2.05b$ sudo /etc/rc.d/init.d/bdii status</tt>

*se01:
::<tt>-bash-2.05b$ sudo tail -1 /var/log/messages</tt>
::<tt>-bash-2.05b$ sudo /etc/rc.d/init.d/dpm status</tt>

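A minimal sketch (not part of the original page) of how these checks could be scripted from a machine with ssh access to the three hosts; the ssh aliases and sudo rights are assumptions:

<pre>
#!/usr/bin/python
# Hypothetical helper: run the support checks listed above over ssh.
# Assumes ssh aliases ce02, ce01, se01 resolve and that the caller may run sudo there.

import commands

CHECKS = {
    'ce02': ['sudo tail -1 /var/log/messages',
             'sudo /etc/rc.d/init.d/globus-gatekeeper status',
             'sudo /etc/rc.d/init.d/globus-gridftp status',
             'sudo /etc/rc.d/init.d/globus-mds status'],
    'ce01': ['sudo /etc/rc.d/init.d/bdii status'],
    'se01': ['sudo tail -1 /var/log/messages',
             'sudo /etc/rc.d/init.d/dpm status'],
}

for host in ['ce02', 'ce01', 'se01']:
    for cmd in CHECKS[host]:
        status, output = commands.getstatusoutput('ssh %s \'%s\'' % (host, cmd))
        print '[%s] %s (status %d)' % (host, cmd, status)
        print output
</pre>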
== Local resources ==

Currently the HTC consists of a total of 174 machines (348 processors). There are 160 "compute nodes": 128 dual 2.8 GHz Intel Xeon nodes with 2 Gbyte RAM and 32 dual 2.0 GHz AMD Athlon nodes with 1 Gbyte RAM. There is a total of about 40 Tbyte of disk storage (25 Tbyte on RAID arrays and 15 Tbyte distributed amongst the cluster nodes). All the nodes are connected together on a dedicated Gbit Ethernet network and are also connected to the London MAN via a Gbit link.

[http://www.esc.qmul.ac.uk/cluster/ e-Science High Throughput Cluster]

== Upgrades ==

=== Glite 3.0 ===

==== 06/07/06 ====

===== Creation of the VO users =====

*Made a script for the creation of users and groups in NIS from the users.yaim file.
*The users.yaim file is created via qmul/scripts/updateVO.pl, which is invoked by running make in qmul/config.
*The <tt>updatevo.pl</tt> script takes three files as input:
::<tt>passwd</tt>
::<tt>currentVO.cfg</tt>
::<tt>site-info.def.main</tt>
*currentVO.cfg has the following format, one entry per VO:
<pre>
[geant4]
gname=geant4
gid=32001
numusers=50
voms_server_uri="vomss://lcg-voms.cern.ch:8443/voms/geant4?/geant4/"
vomses="'geant4 lcg-voms.cern.ch 15007 /C=CH/O=CERN/OU=GRID/CN=host/lcg-voms.cern.ch geant4'"
</pre>
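The actual conversion is done by the site's updateVO.pl; purely as an illustration, a minimal Python reader for the format shown above could look like this (the function name and the printed fields are my own choice, not taken from the original scripts):

<pre>
#!/usr/bin/python
# Illustrative sketch only: parse the currentVO.cfg format shown above.
# Section headers ([geant4]) are VO names; the remaining lines are key=value pairs.

import sys

def parse_current_vo(filename):
    vos = {}
    current = None
    for line in open(filename):
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        if line.startswith('[') and line.endswith(']'):
            current = line[1:-1]
            vos[current] = {}
        elif '=' in line and current is not None:
            key, value = line.split('=', 1)
            vos[current][key.strip()] = value.strip()
    return vos

if __name__ == '__main__':
    for vo, settings in parse_current_vo(sys.argv[1]).items():
        print vo, settings.get('gid'), settings.get('numusers')
</pre>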
*The <tt>updatevo.pl</tt> script produces three files:
::<tt>site-info.def</tt>
::<tt>users.yaim</tt>
::<tt>vos.yaim</tt>
*From the users.yaim we need to create the <tt>passwd</tt> and <tt>group</tt> files for the NIS server.
*This is done with the script <tt>makePasswd.py</tt> in the config directory.
*<tt>makePasswd.py</tt> is invoked as: <tt>makePasswd.py [users.yaim] [pwdfile] [gfile] [homedir]</tt>
*The generated passwd and group files need to be added to the passwd.local and group.local files on the NIS server that contain the local users.
*On the NIS server the two files need to be present in <tt>/var/yp/src</tt>.
*Then run make in <tt>/var/yp</tt>, which will create the relevant entries in <tt>/var/yp/htc</tt>.
*On the WNs the home directories are created from the NIS entries by running:
::<tt>/etc/init.d/lcg2 start</tt>

====== makePasswd script ======

<pre>
#!/usr/bin/python
# Script to create the group and passwd files for NIS from a YAIM users file.
# usage: ./makePasswd.py users.yaim passwdfile groupfile /scratch/lcg2/

import string
import sys


def getUserAndGroup(filename):
        # Each users.yaim line is colon-separated: uid:login:gid:group:...
        infile=open(filename)
        content=infile.readlines()
        infile.close()
        poolinfo=[]
        for line in content:
                sline=string.split(line,":")
                current=[[sline[0],sline[1],sline[2],sline[3]]]
                poolinfo=poolinfo+current
        return poolinfo


def extractGroup(poolinfo):
        # Map each group name to its gid (first occurrence wins).
        groups={}
        for apool in poolinfo:
                guid=apool[2]
                group=apool[3]
                if(not groups.has_key(group)):
                        groups[group]=guid
        return groups


def createPasswdFile(newpassfile,poolinfo,homedir):
        outfile=open(newpassfile,"w")
        for i in poolinfo:
                uid=i[0]
                login=i[1]
                guid=i[2]
                group=i[3]
                passwdline=login+':x:'+uid+':'+guid+':mapped user for group ID '+guid+':'+homedir+login+':/bin/bash\n'
                outfile.write(passwdline)
        outfile.close()


def createGroupFile(groupfile,groups):
        outfile=open(groupfile,"w")
        for i in groups.keys():
                groupline=i+':x:'+groups[i]+':'+'edguser\n'
                outfile.write(groupline)
        outfile.close()


def main(arg):
        if(len(arg)==5):
                poolinfo=getUserAndGroup(arg[1])
                createPasswdFile(arg[2],poolinfo,arg[4])
                groups=extractGroup(poolinfo)
                createGroupFile(arg[3],groups)
        else:
                print "Usage is makePasswd.py [users.yaim] [passwordfile] [groupfile] [homedir]"


main(sys.argv)
</pre>

===== Certificates =====
* Giuseppe has received renewals for ce01, mon01, ce02 and se01, but none of the serials in the mail corresponds to the serials on the machines. These are probably certificates that were requested previously and never used.
* We have requested two certificates:
:: wn01.esc.qmul.ac.uk (currently the name of the CE)
:: ce02.esc.qmul.ac.uk

===== Problem with submission at QMUL =====
SFTs show that job submission is failing to match. We observed that the BDII is alive but not responding on port 2170, hence the problem. We could submit a job to the short queue from Imperial as dteam.

==== 17/07/06 ====

===== Installation of the rpms on wn01 =====
The yum.conf file used is:
<pre>
[glite3-base]
name=GLITE - 3_0_0 Repository For sl3 $basearch
baseurl=http://kickstart/RPMS/GLITE/3_0_0/sl3/$basearch/base

[glite3-externals]
name=GLITE - 3_0_0 Repository For sl3 $basearch
baseurl=http://kickstart/RPMS/GLITE/3_0_0/sl3/$basearch/externals

[glite3-updates]
name=GLITE - 3_0_0 Repository For sl3 $basearch
baseurl=http://kickstart/RPMS/GLITE/3_0_0/sl3/$basearch/updates

[CA]
name=CA Repository
baseurl=http://kickstart/RPMS/LCG_CA
</pre>
*yum -c yum.conf install lcg-CE
*yum -c yum.conf install lcg-CA

===== makeGroupConf.py to populate groups.conf =====
*The groups.conf file can be populated from the users.conf file, since the last two populated columns of each line are the VO and the role (an illustrative example follows the script).
<pre>
#!/usr/bin/python
# Script to create the YAIM groups.conf file from the users.conf file.
# usage: ./makeGroupConf.py users.yaim groups.conf

import string
import sys


def getVOandRoles(filename):
        # Build a dictionary mapping each VO to the list of roles found for it.
        infile=open(filename)
        content=infile.readlines()
        infile.close()
        voroleDict={}
        for line in content:
                sline=string.split(line,":")
                vo=sline[-3]
                role=sline[-2]
                if(voroleDict.has_key(vo)):
                        if(not role in voroleDict[vo]):
                                voroleDict[vo]=voroleDict[vo]+[role]
                else:
                        voroleDict[vo]=[role]
        return voroleDict


def createGroupConf(filename,vorole):
        # Map YAIM user flags to VOMS role names.
        map={'sgm':'lcgadmin','prd':'production'}
        outfile=open(filename,"w")
        for vo in vorole.keys():
                for role in vorole[vo]:
                        if(role!='' and role not in map.keys()):
                                print 'role='+role+' not found, skipping'
                                continue
                        if(role==''):
                                line='''"/VO='''+vo+'''/GROUP=/'''+vo+'''"::::'''
                        else:
                                line='''"/VO='''+vo+'''/GROUP=/'''+vo+'''/ROLE='''+map[role]+'''":::'''+role+''':'''
                        outfile.write(line+'\n')
        outfile.close()


def main(arg):
        if(len(arg)==3):
                vorole=getVOandRoles(arg[1])
                createGroupConf(arg[2],vorole)
        else:
                print "Usage is makeGroupConf.py [users.yaim] [groups.conf]"


main(sys.argv)
</pre>
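As an illustration (these users.conf lines are made-up examples, not QMUL's real pool accounts), two input lines such as

<pre>
30101:dteam001:3010:dteam:dteam::
30150:dteamsgm:3010:dteam:dteam:sgm:
</pre>

would make the script emit the following groups.conf lines:

<pre>
"/VO=dteam/GROUP=/dteam"::::
"/VO=dteam/GROUP=/dteam/ROLE=lcgadmin":::sgm:
</pre>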

===== Torque Client =====
* Commands used:
<pre>
[root@wn01 etc]# cat fstab
[root@wn01 etc]# cat fstab | tail -2
pbs:/var/spool/pbs    /var/spool/pbs          nfs    ro    0 0
# i.e. we added the above line to the /etc/fstab file
yum install torque-clients
cd /etc/
mkdir pbs
cd pbs/
echo "pbs.htc.esc.qmul" >  /etc/pbs/server_name
mount /var/spool/pbs/

# to test that it worked:
[root@wn01 pbs]# qstat -q
</pre>
Note that we first have to install the torque clients and only then mount the /var/spool/pbs directory; otherwise the installation of the torque-clients rpm will override the contents of the freshly mounted directory.

 
===== Maui =====
* After configuring the nodes I realized that the information system was reporting an empty tree. In the log files I could see that the vomaxjob plugin did not return anything. The reason is that <tt>diagnose -g</tt> was not available, because the installation did not install the maui client tools.
* The configuration file that specifies to run vomaxjob is <tt>/opt/lcg/etc/lcg-info-dynamic-scheduler.conf</tt>.
* Maui client tools:
** <tt>yum install maui-clients</tt> provides the clients from the LCG distribution, i.e. the rpm <tt>maui-client-3.2.6p11-2_SL30X.i386.rpm</tt>. This version is not compatible with the version compiled by Alex on fe03, built for FC2 (maui-3.2.6p10-4.fc2.qmul).
** Tried to copy the <tt>diagnose</tt> binary from gfe03, but when run it complains with: <tt>./diagnose: /lib/tls/libc.so.6: version `GLIBC_2.3.4' not found (required by ./diagnose)</tt>
** As a temporary solution we have written a dummy diagnose command that does:
<pre>
#!/bin/sh
# forward all arguments to the real diagnose command on the maui/pbs host
ssh root@fe03.htc.esc.qmul "diagnose $*"
</pre>
This should be changed in the future.

 
===== Information system =====
After the maui tricks above, the information system was still not publishing the right information. I realized that there was still one lcg rpm missing, hence the final list to install is:
<pre>
yum install lcg-info-dynamic
yum install lcg-info-dynamic-pbs
yum install lcg-info-dynamic-scheduler-generic
yum install lcg-info-dynamic-software
yum install lcg-info-dynamic-scheduler-pbs
yum install lcg-info-templates
yum install lcg-info-generic
</pre>
The missing one was <tt>lcg-info-generic</tt>.
*'''BUG''': the lcg-CE dependency tree should contain those rpms. Need to check whether they are really missing from its dependencies. It generally works elsewhere because people are updating existing systems.

 
===== Yaim configuration =====
*Yaim needs to create the rgma, edguser and edginfo users. Alex prefers that we do not put them in yp (NIS) since they are only used on the service nodes.
*Yaim will create those users automatically, but the /home directory is automounted and yaim crashes because it cannot create the three users above.
*We had to comment out the entry for that home directory in /etc/auto.master.
*Yaim needs the users.yaim and groups.conf files, and they cannot be empty. So we have deleted the config_users function from the <tt>node-info.def</tt> file to avoid it creating the pool accounts.
*'''Issue''': prd users are not in the yellow pages. They will be added if necessary.

==== 18/07/2006 CE ====
* The day before (not logged) we could submit a job to QMUL with globus-job-submit from gfe03. The problem is that the output cannot be retrieved; usually this is because the pbs output could not be retrieved.
*We tried to submit a job from wn01, now renamed ce02 (which caused us to rerun yaim).
*From <tt>dteam001</tt> on ce02, <tt>qsub -q lcg2_short test.sh</tt> returns <tt>Bad UID for Job execution</tt>.
*Alex found that this is because ce02 was not listed in the shosts.equiv of the pbs server. After checking today it has disappeared from there.
* After that we could submit a job but the output was not returned. Alex told us he was not expecting to support two CEs.
* We understood that the authentication of the cn when the job output comes back is done via an agent that is started in the prologue of the script. So two things need to be done, on the <tt>cn</tt> and on the <tt>ce</tt> (a small check sketch follows this list):
** <tt>ce</tt>: the home directories should contain an authorized_keys file that is readable by root only.
** <tt>cn</tt>: /etc/ssh/ssh_known_hosts should contain the public key of ce02.
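A minimal sketch (my own illustration, not from the original log) of how those two conditions could be checked; the hostname and file locations follow the text above, everything else is an assumption:

<pre>
#!/usr/bin/python
# Hypothetical checks for the two conditions listed above.
# On the ce: run check_authorized_keys() for each pool-account home directory.
# On a cn:   run check_known_hosts().

import os
import stat
import sys

def check_authorized_keys(homedir):
    # Assumed location: <homedir>/.ssh/authorized_keys. The page says the file
    # should be readable by root only; here we just verify it exists and is not
    # group/other accessible.
    path = os.path.join(homedir, '.ssh', 'authorized_keys')
    if not os.path.exists(path):
        return path + ': missing'
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0077:
        return path + ': mode %o is group/other accessible' % mode
    return path + ': ok'

def check_known_hosts(ce='ce02.esc.qmul.ac.uk'):
    # /etc/ssh/ssh_known_hosts on the cn should contain the public key of ce02.
    for line in open('/etc/ssh/ssh_known_hosts'):
        fields = line.split()
        if fields and ce in fields[0]:
            return ce + ': found in ssh_known_hosts'
    return ce + ': NOT found in ssh_known_hosts'

if __name__ == '__main__':
    if len(sys.argv) > 1:
        print check_authorized_keys(sys.argv[1])
    else:
        print check_known_hosts()
</pre>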
 
==== Mon Box upgrade ====
* First of all I needed to disable the automount of /home, which was also in the way when running the installation on ce02.
* The auto.master contains:
<pre>
# /misc /etc/auto.misc  --timeout=60
#/home /etc/auto.home  --timeout=300
#/opt /etc/auto.opt  --timeout=300
/opt/shared /etc/auto.opt --timeout=300
/mnt/auto /etc/auto.mnt  --timeout=60
</pre>
* /etc/init.d/autofs stop
* Updated the yum.conf with the following repositories:
<pre>
[glite3-base]
name=GLITE - 3_0_0 Repository For sl3 $basearch
baseurl=http://kickstart/RPMS/GLITE/3_0_0/sl3/$basearch/base

[glite3-externals]
name=GLITE - 3_0_0 Repository For sl3 $basearch
baseurl=http://kickstart/RPMS/GLITE/3_0_0/sl3/$basearch/externals

[glite3-updates]
name=GLITE - 3_0_0 Repository For sl3 $basearch
baseurl=http://kickstart/RPMS/GLITE/3_0_0/sl3/$basearch/updates

[CA]
name=CA Repository
baseurl=http://kickstart/RPMS/LCG_CA
</pre>
* Removed the old meta-package lcg-MON (rpm -e lcg-MON).
* Installing glite-MON (yum install glite-MON) '''fails''' with:
<pre>
Package edg-rgma-api-perl needs librgma-c.so.0, this is not available.
Package edg-rgma-api-perl needs librgma-c.so.0, this is not available.
Package edg-rgma-api-perl needs librgma-c.so.0, this is not available.
Package edg-rgma-api-perl needs librgma-c.so.0, this is not available.
Package edg-rgma-api-perl needs edg-rgma-api-c, this is not available.
Package edg-rgma-api-perl needs librgma-cpp.so.0, this is not available.
Package edg-rgma-api-perl needs edg-rgma-api-cpp, this is not available.
Package edg-rgma-api-perl needs edg-rgma-base, this is not available.
</pre>
* The package edg-rgma-api-perl is not in the glite stack, hence I removed it: <tt>rpm -e edg-rgma-api-perl</tt>
* <tt>yum install glite-MON</tt>
* We have put all the yaim configuration in /opt/glite/yaim/config.
* <tt>configure_node /opt/glite/yaim/config/site-info.def MON</tt> fails with <tt>Java Location not set</tt>; the reason is that the java version on the mon box is older than the one on ce02.
* We have installed the latest version, <tt>j2sdk-1_4_2_12-linux-i586.rpm</tt>.
*'''Note''': this should be put in the QMUL repository for the other machines.
* Running the configuration: <tt>configure_node /opt/glite/yaim/config/site-info.def MON</tt>
* Everything is fine apart from the fmon (GridIcE) part, which we can ignore.
* Checked that the apel cron job (<tt>/etc/cron.d/edg-apel-publisher</tt>) works; it gets the following message:
<pre>
org.glite.apel.core.ApelException: org.glite.rgma.RGMAException: Error registering producer table in Registry for table: LcgRecords
Caused by: cannot service request, client hostname is currently being blocked
</pre>
This is because we were using a too-old version of the rgma servlet, which causes problems for the registry. I (Ovda) have written a mail to Alastair Duncan asking to unblock us.
* '''Checking what services are running''': we decided to shut down:
** cupsd (stopped)

 
==== SE Upgrade (19/07/07) ====
* First of all, to avoid disasters, we back up the MySQL database as advised in [[MySQL Backups | backup your MySQL database]].
* We have stored the backup in /root/mysql.backup.gz
* Removed the metapackage with the lcg dependencies: <tt>rpm -e lcg-SE_dpm_mysql</tt>
* Installed the grid software on the machine: <tt>yum install glite-SE_dpm_mysql</tt>
* Copied the config files from mon01:
<pre>
[root@se01 scripts]# cd /opt/glite/yaim/
scp -r mon01:/opt/glite/yaim/config .
</pre>
* Created user accounts:
<pre>
[root@se01 root]# /opt/glite/yaim/config/setup_lcg_pool_accounts
</pre>
* Installed a newer version of the java package:
<pre>
j2sdk-1_4_2_12-linux-i586.rpm
</pre>
* Deleted the <tt>config_users</tt> function from the file <tt>/opt/glite/yaim/scripts/node-info.def</tt>
* Configured the machine:
<pre>
[root@se01 scripts]# ./configure_node /opt/glite/yaim/config/site-info.def SE_dpm_mysql
</pre>
* Used Alex's recipe:
<pre>
[root@se01 etc]# cp resolv-pub.conf resolv.conf
[root@se01 etc]# /etc/rc.d/init.d/rfiod restart
[root@se01 etc]# /etc/rc.d/init.d/dpm-gsiftp restart
[root@se01 etc]# /etc/rc.d/init.d/dpnsdaemon restart
[root@se01 etc]# /etc/rc.d/init.d/dpm restart
[root@se01 etc]# /etc/rc.d/init.d/srmv1 restart
[root@se01 etc]# /etc/rc.d/init.d/srmv2 restart
[root@se01 etc]# cp resolv-priv.conf resolv.conf
</pre>
* Tested the machine:
<pre>
[mazza@gfe03 mazza]$ globus-url-copy file:////`pwd`/pippo gsiftp://se01.esc.qmul.ac.uk:2811/dpm/esc.qmul.ac.uk/home/dteam/ol12
</pre>

 
==== cn069 tarball (20/07/07) ====

* Configuration of the tarball on cn069.
* Created a gridadmin user in the NIS.
* Ran yaim as gridadmin.
* Had to get the CRL cron job installed as gridadmin.
* Had to run install_cert_userland to make sure that the certs are in the tarball.
* We had to make an rpm out of the tarball.
*: The rpm would contain the certificates, the CRL cron job and the creation of links so that grid-security in /etc points to the one inside the tarball. (I know rgma is not using the X509_ environment variables properly.)

==== cn069 testing (21/07/07) ====

* The problem is to be sure that dteam and ops jobs are going to the node.
* We did that by setting a property on the cn069 node in /var/spool/pbs/nodes, with properties=lcg2shortattr.
* In qmgr we allowed only dteam jobs on the queue and specified that it requires the lcg2shortattr property:
<pre>
set queue lcg2_short acl_group_enable=true
set queue lcg2_short acl_group+=dteam
set queue lcg2_short resources_default.neednodes = lcg2shortattr
</pre>
A similar thing can be done in maui.cfg by defining an sft partition (the entry here was left blank in the original log; a sketch follows).
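From memory of Maui 3.2 partition syntax (my own sketch, not taken from the original page, so check it against the Maui admin guide before use):

<pre>
NODECFG[cn069]  PARTITION=sft
GROUPCFG[dteam] PLIST=sft PDEF=sft
GROUPCFG[ops]   PLIST=sft PDEF=sft
</pre>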

=== CA 1.9 (20/09/2006) ===
The procedure to upgrade the CAs is far from perfect:
* First we have to retrieve the CA rpms from this location: http://linuxsoft.cern.ch/LCG-CAs/current/RPMS.production/
* On fe02, in /mnt/installs/RPMS, there is a script <tt>get_LCG_CA</tt> that takes the latest rpms and creates the header files. Remember to remove the old rpms.
* On each machine you then have to run yum update lcg-CA.

For the WNs we have to rebuild the glite rpm that contains the whole software:
* The build procedure is done on ce01 in /usr/src/redhat
* First get a fresh version of the tarball from cn120.
* Untar it in /usr/src/redhat/SOURCES/temp/
* Add the new certificates in ./grid-security/certificates/
* cd /usr/src/redhat/SPECS/
* Edit <tt>glite-qmul.spec</tt> and increase the version number
* Run <tt>rpmbuild -ba glite-qmul.spec >& glite-qmul-8.log</tt>

Testing:
* To test, install on cn362 and direct jobs there by assigning the sft partition to OPS and DTEAM
:: See GROUPCFG[ops]
* After running the SFTs, verify that they were OK [https://lcg-sft.cern.ch:9443/sft/sitehistory.cgi?site=ce02.esc.qmul.ac.uk here]
 
== Site log ==

=== SC4 Transfer Test ===

==== 16/01/2006 ====
Realized that the srm client distributed in dcache-client-1.6.6-4 gives <tt>srmcp error: nulljava.lang.NullPointerException</tt>.

Used instead the version distributed in http://www.dcache.org/downloads/dcache-v1.6.5-2.tgz and unpacked the client:
<pre>
rpm2cpio d-cache-client-1.0-100-RH73.i386.rpm  > client.cpio
cpio --make-directories -F client.cpio -i
</pre>

Defined SRM_PATH as the path of the unpacked srm client. Then Graeme's [http://www.physics.gla.ac.uk/~graeme/scripts/ scripts] worked.

<pre>
nohup ./filetransfer.py --ftp-options="-p 10" --number=2  --delete -s
https://fts0344.gridpp.rl.ac.uk:8443/sc3ral/glite-data-transfer-fts/services/FileTransfer
srm://se2-gla.scotgrid.ac.uk:8443/dpm/scotgrid.ac.uk/home/dteam/tfr2tier2/canned1G
srm://se01.esc.qmul.ac.uk:8443/dpm/esc.qmul.ac.uk/home/dteam/can1G
</pre>

Which resulted in:

<pre>
Transfer Bandwidth Report:
  2/2 transferred in 237.424527884 seconds
  2000000000.0 bytes transferred.
Bandwidth: 67.389836015Mb/s
</pre>
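As a quick sanity check on the quoted figure (my own throwaway calculation, not part of the original log):

<pre>
#!/usr/bin/python
# 2 x 1 GB files (2000000000.0 bytes) moved in 237.42 s, converted to megabits per second.
bytes_transferred = 2000000000.0
seconds = 237.424527884
print '%.2f Mb/s' % (bytes_transferred * 8 / seconds / 1e6)   # prints 67.39 Mb/s
</pre>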

The network [[Network_Testing|bandwidth]] obtained with [http://dast.nlanr.net/Projects/Iperf/ iperf] was 94 Mb/s, which indicates that there is a 100 Mb trunk in the line. When doing the same bandwidth test from [[IC-HEP]] we get 392 Mb/s, clearly indicating that the 100 Mb trunk is on the [[QMUL]] side.
 
Then submitted 50x1G:

<pre>
nohup ./filetransfer.py --ftp-options="-p 10" --number=50  --delete -s
https://fts0344.gridpp.rl.ac.uk:8443/sc3ral/glite-data-transfer-fts/services/FileTransfer
srm://se2-gla.scotgrid.ac.uk:8443/dpm/scotgrid.ac.uk/home/dteam/tfr2tier2/canned1G
srm://se01.esc.qmul.ac.uk:8443/dpm/esc.qmul.ac.uk/home/dteam/can1G
</pre>

I had to cancel the transfer since nohup crashed.

Tried to submit a 500 GB transfer, but nohup did not keep filetransfer.py up and I could not get the outcome.

==== 17/01/2006 ====
Submitted a 10 GB transfer test: 10 files, two streams per file.

<pre>
Transfer Bandwidth Report:
  10/10 transferred in 1245.54273701 seconds
  10000000000.0 bytes transferred.
Bandwidth: 64.2290285376Mb/s
</pre>

Alex and Giuseppe have moved se01.esc.qmul.ac.uk to a Gb connection. I have rescheduled a transfer for 18h00.

Have submitted using the 0.3.0 filetransfer script:
<pre>
filetransfer.py --ftp-options="-p 2" --number=500 --background  --delete -s https://fts0344.gridpp.rl.ac.uk:8443/sc3ral/glite-data-transfer-fts/services/FileTransfer srm://se2-gla.scotgrid.ac.uk:8443/dpm/scotgrid.ac.uk/home/dteam/tfr2tier2/canned1G srm://se01.esc.qmul.ac.uk:8443/dpm/esc.qmul.ac.uk/home/dteam/can1G
</pre>

<pre>
Child:  /opt/glite/bin/glite-transfer-status -l 6b7363f8-884b-11da-a18f-e44be7748cb0 -s https://fts0344.gridpp.rl.ac.uk:8443/sc3ral/glite-data-transfer-fts/services/FileTransfer
FTS status query for 6b7363f8-884b-11da-a18f-e44be7748cb0 failed:
FTS Error: status: getFileStatus: requestID <6b7363f8-884b-11da-a18f-e44be7748cb0> was not found
</pre>

I could definitely see the transfer on the se01 node and the machine load rising, so the transfer was going on.
I could not cancel the transfer; it was giving a SOAP error.

Tried to destroy the myproxy credential, which did not have a direct effect, but after 66 files had transferred it stopped:

<pre>
Submit time:    2006-01-18 17:54:07.000
Files:          500
        Done:          66
        Active:        0
        Pending:        0
        Canceled:      0
        Failed:        0
        Finished:      0
        Submitted:      0
        Restarted:      0
</pre>

The bandwidth can be seen here:

[[Image:Qmul-transfer1.gif]]

[[Image:Qmul-fts1.gif]]

 
==== 15/02/2006 ====

I used the following command:

<pre>
[mazza@grid05 mazza]$ filetransfer.py --background --ftp-options="-p 2" --number=500  --delete srm://se01.esc.qmul.ac.uk:8443/dpm/esc.qmul.ac.uk/home/dteam/canned2G srm://dcache.gridpp.rl.ac.uk:8443/pnfs/gridpp.rl.ac.uk/data/dteam/qmul
</pre>

The bandwidth can be seen here:

[[Image:QMUL-RAL_graph.gif]]

[[Image:QMUL-RAL_graph_2.gif]]

The mean bandwidth is 172.8 Mbit/s.
 
==== 22/02/2006 ====

I used the following command:

<pre>
[mazza@grid05 mazza]$ filetransfer.py --background --ftp-options="-p 2" --number=500  --delete srm://dcache.gridpp.rl.ac.uk:8443/pnfs/gridpp.rl.ac.uk/data/dteam/tfr2tier2/canned2G srm://se01.esc.qmul.ac.uk:8443/dpm/esc.qmul.ac.uk/home/dteam/canned2G_from_RAL
</pre>

The bandwidth can be seen here:

[[Image:060222_RAL-QMUL.gif]]

The mean bandwidth is 118.00 Mbit/s.

=== SC4 Throughput Test ===

The throughput tests are meant to stress-test the RAL Tier1 production network by pulling data from different Tier2s into the Tier1.

More details can be found [[SC4_Aggregate_Throughput|here]].

 
== Monitoring links ==

[http://goc.grid.sinica.edu.tw/gstat/QMUL-eScience/ GSTAT for QMUL-eScience]

[http://www.gridpp.ac.uk/storage/status/gridppDiscStatus.html GridPP storage status]

[[Category:London Tier2]]
