Sam-client-pps-ral

From GridPP Wiki

PPS SAM (pps-sam.gridpp.rl.ac.uk) Client rebuild instructions

These instructions are primarily for Tier1 staff at the Rutherford Appleton Laboratory (RAL), UK, but the wider EGEE community is welcome to use them if they find them useful. More generic instructions for glite-UI on SL4 are at https://www.gridpp.ac.uk/wiki/Glite-UI-on-SL4

The PPS SAM client is a Xen guest system hosted on the SL5 machine pps-xen-285.gridpp.rl.ac.uk (in the upper machine room). If you also need or want to reinstall the host system, then on touch.gridpp.rl.ac.uk run

/pps/reinstall/pps-xen-285-sl5

and reboot the machine pps-xen-285.

Make sure you are booted into the Xen kernel (uname -a) and that xend is running. Try

xm list
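The two checks above can be scripted. The following is a minimal sketch, assuming that the running Xen kernel has "xen" in its release string (as the stock SL5 Xen kernels do). The function name is_xen_kernel is made up for this page and is not part of any tool mentioned here.

```shell
#!/bin/sh
# Sketch only: is_xen_kernel is a hypothetical helper.
# Assumption: a Xen kernel has "xen" somewhere in its release string.
is_xen_kernel() {
    case "$1" in
        *xen*) return 0 ;;
        *)     return 1 ;;
    esac
}

if is_xen_kernel "$(uname -r)"; then
    echo "Xen kernel detected: $(uname -r)"
    # xm list only succeeds when xend is running
    if command -v xm >/dev/null 2>&1 && xm list >/dev/null 2>&1; then
        echo "xend is running"
    else
        echo "xend does not seem to be running (xm list failed)"
    fi
else
    echo "Not a Xen kernel: $(uname -r); reboot into the xen kernel first"
fi
```

Run it as root on the host, since xm normally needs root privileges.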

Make sure Xen networking is set up with the bridge technique. Check the file

/etc/xen/xend-config.sxp 

for presence of something like this:

 (network-script 'network-bridge vifnum=1 bridge=xenbr1')
 (vif-script 'vif-bridge bridge=xenbr1')

(The 'ifconfig' and 'brctl show' commands should list xenbr1 or xenbr0 among others.)

The network address translation (NAT) / masquerading approach should also work, but I have not tested it. (With NAT, 'ifconfig' and 'brctl show' should display virbr0 or virbr1.)
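The check of /etc/xen/xend-config.sxp can be automated along these lines. This is a sketch under the assumption that the bridge setup is recognisable by uncommented network-script and vif-script lines that mention network-bridge and vif-bridge; the helper name has_bridge_config is hypothetical.

```shell
#!/bin/sh
# Sketch only: has_bridge_config is a hypothetical helper.
# It succeeds when the given xend config file enables both the
# network-bridge and the vif-bridge script on uncommented lines.
has_bridge_config() {
    conf="$1"
    grep -q "^[[:space:]]*(network-script[[:space:]].*network-bridge" "$conf" &&
    grep -q "^[[:space:]]*(vif-script[[:space:]].*vif-bridge" "$conf"
}

CONF=/etc/xen/xend-config.sxp
if [ -r "$CONF" ]; then
    if has_bridge_config "$CONF"; then
        echo "bridge networking is configured in $CONF"
    else
        echo "no bridge settings found in $CONF"
    fi
fi
```

Lines starting with # are ignored, so a commented-out example in the file does not count as a working configuration.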

Make sure you have the LVM (Logical Volume Manager) packages installed. Try either or both of the following:

lvm version
lvcreate --version

Download the xen-strap tool from http://www.gridpp.ac.uk/wiki/Xen-strap or from http://www.gridpp.rl.ac.uk/pps/xen-strap/ and make it executable:

wget http://www.gridpp.rl.ac.uk/pps/xen-strap/xen-strap
chmod 755 ./xen-strap

The next command is the magnum opus. It takes about 30 minutes, most of which is spent installing the middleware:

Comment: If you want to customize the post-install scripts used in the next command, download them with wget, customize them, and give the path to the file instead of the URL: --post-install-in="host:/root/glite-ui-sl4-customized". If you don't specify the host: or dom0: prefix, the file is looked for in the guest/target system, which is the wrong place.
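Because a forgotten host: prefix is easy to miss, a small wrapper can add it. This is only a sketch: with_host_prefix is a hypothetical helper, and it assumes that anything containing :// is a URL that should be passed through unchanged.

```shell
#!/bin/sh
# Sketch only: with_host_prefix is a hypothetical helper.
# It adds "host:" to a local path unless the value is a URL or
# already carries a host: or dom0: prefix.
with_host_prefix() {
    case "$1" in
        *://*|host:*|dom0:*) printf '%s\n' "$1" ;;
        *)                   printf 'host:%s\n' "$1" ;;
    esac
}

# Typical use on the Xen host, shown as comments so nothing is run here:
#   wget -O /root/glite-ui-sl4-customized http://www.gridpp.rl.ac.uk/pps/files/glite-ui-sl4
#   vi /root/glite-ui-sl4-customized
#   then pass --post-install-in="$(with_host_prefix /root/glite-ui-sl4-customized)"
#   to xen-strap together with the other options shown on this page
with_host_prefix /root/glite-ui-sl4-customized
# prints: host:/root/glite-ui-sl4-customized
```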

Tier1 staff at RAL create pps-sam.gridpp.rl.ac.uk with the command:

./xen-strap -a -u -y -b --name=pps-sam --ip=130.246.187.101 --ntp -i mc \
--adduser="sam-ops-pps-ral,http://wwwinstall.gridpp.rl.ac.uk/yum/pps/ks/pps_authorized_keys" \
--post-install-in="http://wwwinstall.gridpp.rl.ac.uk/yum/pps-ks/ks/glite-ui-sl4; \
http://wwwinstall.gridpp.rl.ac.uk/yum/pps-ks/ks/sam-ui-sl4 sam-ops-pps-ral" \
sl4 lvm:xen/pps-sam:3G

Other people will use:

./xen-strap -a -u -y -b --name=sam --ip=<your IP> --ntp -i mc \
--adduser="<the user name>,http://path_to_your/authorized_keys" \
--post-install-in="http://www.gridpp.rl.ac.uk/pps/files/glite-ui-sl4; \
http://www.gridpp.rl.ac.uk/pps/files/sam-ui-sl4 <the user name>" \
sl4 lvm:xen/pps-sam:3G

Description:

These next two arguments are mandatory:
lvm:xen/pps-sam:3G  creates an LVM image of size 3G in the volume group xen with the logical volume name pps-sam
sl4  installs the latest Scientific Linux 4.x on the system (4.5)
The next options are optional:
-a  installs the apt system rather than the default yum (for sl4, apt is much faster than yum)
-u  updates the system immediately after installation
-y  answers yes to install questions, as in: apt-get install -y ....
-b  boots the xen system immediately after successful installation
--ntp  installs an ntp server; (for sl4) it also synchronizes the time with pool.ntp.org in the guest system before starting the server
-i mc  installs the package mc (Midnight Commander)
--name=pps-sam   uses this name in the xen system; it will be listed in xm list and also used as the hostname in the guest system
                  if not specified, xen-strap tries to get the name from the IP
--ip=130.246.187.101  this IP is used in the guest/target system. The gateway, netmask and network are detected from the host system
         this expects that you use the bridge network approach (as opposed to nat or route) in the xen host config file /etc/xen/xend-config.sxp
         like this
         (network-script 'network-bridge vifnum=1 bridge=xenbr1')
         (vif-script 'vif-bridge bridge=xenbr1')

--adduser="sam-ops-pps-ral,http://wwwinstall.gridpp.rl.ac.uk/yum/pps/ks/pps_authorized_keys"
      adds this user to the system and copies the specified file to .ssh/authorized_keys (it is safe to publish a public key on the internet)
      the password for the user is not set
--post-install-in="http://wwwinstall.gridpp.rl.ac.uk/yum/pps-ks/ks/glite-ui-sl4"
      downloads the script and sources it in the chrooted environment after the system is installed and configured and before the xen guest is booted
--post-install-in="http://wwwinstall.gridpp.rl.ac.uk/yum/pps-ks/ks/sam-ui-sl4 sam-ops-pps-ral"
      downloads the specified script and sources it with the argument sam-ops-pps-ral
      installs sam-specific packages and pps-sam.gridpp.rl.ac.uk instance-specific scripts in the chrooted environment into the sam-ops-pps-ral home directory
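To make the lvm:xen/pps-sam:3G argument concrete, the sketch below splits it into volume group, logical volume and size. parse_lvm_target is a hypothetical helper written for this page; it is not taken from xen-strap, and the lvcreate line in the comment is only an assumed manual equivalent of the volume creation.

```shell
#!/bin/sh
# Sketch only: parse_lvm_target is a hypothetical helper, not part of xen-strap.
# It splits "lvm:<volume group>/<logical volume>:<size>" into its three parts.
parse_lvm_target() {
    spec="${1#lvm:}"      # drop the leading "lvm:"
    vg="${spec%%/*}"      # text before the first "/"
    rest="${spec#*/}"     # text after the first "/"
    lv="${rest%%:*}"      # text before the ":"
    size="${rest#*:}"     # text after the ":"
    printf 'vg=%s lv=%s size=%s\n' "$vg" "$lv" "$size"
}

parse_lvm_target "lvm:xen/pps-sam:3G"
# prints: vg=xen lv=pps-sam size=3G

# Assumed manual equivalent of the volume creation (not taken from xen-strap):
#   lvcreate -L 3G -n pps-sam xen
```

Before running the installation it is worth checking with vgs or vgdisplay that the volume group xen exists and has at least 3G free.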

A more detailed explanation of the command is available from ./xen-strap --help or ./xen-strap --examples


After some time (30 min), if everything is OK, you should end up at a login prompt in the xen guest system with the UI middleware installed. It is recommended that you log in via ssh rather than from the host system, because the console doesn't work properly from the host. Press CTRL + ] to leave the guest system, then log in via ssh from anywhere:

ssh sam-ops-pps-ral@pps-sam.gridpp.rl.ac.uk

You are authenticated with the copied public keys. The root password for the guest system is taken from the host system.

Check that the certificates are in place in .globus/*.pem (non-RAL readers have to copy their own certificate there). You will have to put the password for the certificate's private key in /home/sam-ops-pps-ral/.globus/passwd. (This is not secure, but it works. We should probably use a MyProxy server.) For security reasons I will not say here what the password is. Ask me in person or use your own certificates with ops membership. Keep in mind that the password of your personal certificate is exposed, so you should give access to the machine only to trusted users.

The personal certificate (public and private key in the name of Marian Klein) for the sam-pps client is stored on the touch.gridpp.rl.ac.uk server in /pps/ks/globus_cert/*.pem (the keys are protected by a password). If you are not Marian Klein, you are encouraged to use your own certificates. Once you have the certificate you also have to ask for OPS membership, which might take some time (about 1 week).


The time and timezone should be set properly, in line with the host system. The guest clock is not bound to the host clock; it is controlled by the ntp server. Make sure you have London local time (maybe UTC is a better idea; this needs to be negotiated with the other site) by creating a soft link.

[sam-pps@pps-sam sam-pps]$ ls -l /etc/loc*
lrwxrwxrwx    1 root     root           33 Jul  2 15:25 /etc/localtime -> /usr/share/zoneinfo/Europe/London
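If the link is missing or points elsewhere, it can be recreated as in the following sketch. set_localtime is a hypothetical helper; the optional second and third arguments exist only so that it can be tried out on a scratch directory before touching /etc/localtime.

```shell
#!/bin/sh
# Sketch only: set_localtime is a hypothetical helper.
# Usage: set_localtime ZONE [LINK] [ZONEINFO_DIR]
# LINK defaults to /etc/localtime, ZONEINFO_DIR to /usr/share/zoneinfo.
set_localtime() {
    zone="$1"
    link="${2:-/etc/localtime}"
    zonedir="${3:-/usr/share/zoneinfo}"
    if [ ! -f "$zonedir/$zone" ]; then
        echo "unknown time zone: $zone" >&2
        return 1
    fi
    # -s creates a symbolic link, -f replaces an existing file or link
    ln -sf "$zonedir/$zone" "$link"
}

# As root in the guest:
#   set_localtime Europe/London
#   ls -l /etc/localtime
#   date
```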

Make sure the grid is configured and working with one test submission under ops:

voms-proxy-init --vo ops
glite-wms-job-submit ...

Make sure SAM is configured and working properly with a SAM test submission:

/home/sam-ops-pps-ral/same-exec-pps CE

[sam-ops-pps-ral@pps-sam ~]$ cat crontab

30 1,3,5,7,9,11,13,15,17,19,21,23 * * * /home/sam-ops-pps-ral/same-exec-pps CE
36,38  1,3,5,7,9,11,13,15,17,19,21,23 * * * /home/sam-ops-pps-ral/same-exec-pps --publish CE

The SAM tests are executed from crontab at specific times. Currently two sites run the PPS clients for redundancy (PPS-RAL and CYFRONET-PPS). The time slots are interleaved and have to be negotiated by email (Lukasz Skital <l.skital@cyfronet.pl>). CE and gCE are two-phase SAM tests: they consist of a submission stage and a publishing stage (the latter during the next execution of the two-phase test). All other tests are one-phase tests. At CYFRONET the two-phase tests run at odd hour + 15 min (1:15, 3:15, ...). So PPS-RAL has to use even hour + something in the same timezone (or in any other timezone with an even shift), or odd hour + something in local London time (provided there is a 1 hour time shift). To make it easy to see on the SAM page which line was created by which site, I submit with a shift of +30 min rather than +15 min.
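When the slots are renegotiated, the crontab lines can be regenerated rather than edited by hand. The sketch below uses two hypothetical helpers, hours_with_parity and sam_cron_lines. It assumes the publishing runs stay 6 and 8 minutes after the submission minute, as in the crontab shown above, so the submission minute must be 51 or lower.

```shell
#!/bin/sh
# Sketch only: hours_with_parity and sam_cron_lines are hypothetical helpers.
# hours_with_parity 1 lists the odd hours, hours_with_parity 0 the even ones.
hours_with_parity() {
    h="$1"
    list=""
    while [ "$h" -le 23 ]; do
        list="${list:+$list,}$h"
        h=$((h + 2))
    done
    printf '%s\n' "$list"
}

# sam_cron_lines PARITY MINUTE SCRIPT
# prints a submission line at MINUTE and a publishing line at
# MINUTE+6 and MINUTE+8 (assumed offsets, copied from the crontab above)
sam_cron_lines() {
    hours=$(hours_with_parity "$1")
    printf '%s %s * * * %s CE\n' "$2" "$hours" "$3"
    printf '%s,%s %s * * * %s --publish CE\n' "$(($2 + 6))" "$(($2 + 8))" "$hours" "$3"
}

sam_cron_lines 1 30 /home/sam-ops-pps-ral/same-exec-pps
```

With parity 1 and minute 30 this prints the same schedule as the crontab file above; sam_cron_lines 0 30 would give the even-hour variant.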

If you are satisfied, then install the crontab:

crontab crontab

Check that the crond is running.

The scripts in /home/sam-pps are very simple and can be studied quickly. The description of the files is in

https://twiki.cern.ch/twiki/bin/view/LCG/PPSSamInstallation

Instructions for the production SAM client

http://sam-docs.web.cern.ch/sam-docs/index.php?dir=./admin/client/&

How to update PPS SAM Client after CA upgrade

apt-get update
apt-get install lcg-CA

Look at http://grid-deployment.web.cern.ch/grid-deployment/gis/SAM/slc3/RPMS.SAM/ and find the latest version of lcg-sam-client-sensors-*.rpm. Download it:

wget http://grid-deployment.web.cern.ch/grid-deployment/gis/SAM/slc3/RPMS.SAM/lcg-sam-client-sensors-1.3.0-2.noarch.rpm

and install it:

apt-get install lcg-sam-client-sensors-1.3.0-2.noarch.rpm

If you have any problems, consult l.skital@cyfronet.pl, a.retico@cern.ch or sam-support@cern