DPM 1.7 Upgrade

From GridPP Wiki
Revision as of 11:48, 11 September 2009 by Wahid bhimji (Talk | contribs)

This page documents our experience of upgrading to DPM 1.7.2-4 with a MySQL database on the DPM head node.

It can be used alongside the official instructions.

RPMS to install

The official instructions list some packages, but these are the actual package names (in particular, DPM-gridftp-server is in the DPM-DSI package):

  • DPM-client (some of the later packages depend on this so will pull it in)
  • DPM-server-mysql
  • DPM-copy-server-mysql
  • DPM-name-server-mysql
  • DPM-srm-server-mysql
  • DPM-rfio-server
  • DPM-DSI

The only other dependency we found that was not already installed (or not at a suitable version) was:

  • glite-security-voms-api-cpp

Backups

The RPMs only install files in /opt/lcg, /opt/glite and /etc, so you may as well back these directories up.
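
One way to take that backup is a single tar archive. The sketch below is only an illustration: the function name backup_dirs and the archive path in the example are made up for this page, and it assumes there is enough free space wherever you write the archive.

```shell
# Sketch: archive the directories the DPM RPMs write to before upgrading.
# backup_dirs and the example archive path are illustrative names only.
backup_dirs() {
    dest="$1"      # archive to create
    shift          # remaining arguments are the directories to save
    tar -czpf "$dest" "$@" || return 1
    # Listing the archive is a cheap check that it can be read back.
    tar -tzf "$dest" > /dev/null
}

# Example (run as root on the head node):
#   backup_dirs /root/pre-dpm-1.7-backup.tar.gz /opt/lcg /opt/glite /etc
```

tar strips the leading / from member names, so restoring means extracting from / (or extracting elsewhere and copying back only the files you need).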

You should also ensure the MySQL database is being backed up, following instructions such as those in the MySQL Backups page.

Increase InnoDB pool size for mysql

Edit

/etc/my.cnf

add (in the [mysqld] section)

set-variable=innodb_buffer_pool_size=256M

and restart

/sbin/service mysql restart
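
If you would rather script the edit, something like the following could be used before the restart. It is a sketch under assumptions: add_innodb_pool_size is a made-up helper name, the file is assumed to have a [mysqld] section header on a line of its own, and a copy of the file is left next to it with a .bak suffix.

```shell
# Sketch: add the InnoDB buffer pool line to a MySQL configuration file,
# unless an uncommented setting of that name is already there.
# add_innodb_pool_size is an illustrative name, not part of MySQL or DPM.
add_innodb_pool_size() {
    conf="$1"            # path to the MySQL configuration file
    size="${2:-256M}"    # value used on this page
    if grep -q '^[^#]*innodb_buffer_pool_size' "$conf"; then
        echo "innodb_buffer_pool_size is already set in $conf"
        return 0
    fi
    if ! grep -q '^\[mysqld\]' "$conf"; then
        echo "no [mysqld] section found in $conf" >&2
        return 1
    fi
    cp -p "$conf" "$conf.bak" || return 1
    # Print every line, adding the new setting straight after [mysqld].
    awk -v line="set-variable=innodb_buffer_pool_size=$size" '
        { print }
        /^\[mysqld\]/ { print line }
    ' "$conf.bak" > "$conf"
}

# Example:
#   add_innodb_pool_size /etc/my.cnf 256M
```

Running it twice is harmless, because the second run finds the existing line and leaves the file alone.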

Stop services

On disk servers

/sbin/service dpm-gsiftp stop 

On DPM head node

/sbin/service srmv2.2 stop
/sbin/service srmv2 stop
/sbin/service srmv1 stop
/sbin/service dpm stop

On disk servers

/sbin/service rfiod stop 

On DPM head node

/sbin/service dpnsdaemon stop
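
With several head-node services to stop in a fixed order, a small wrapper can save some typing. This is a sketch only: service_each and the SERVICE_CMD variable are names invented here, and setting SERVICE_CMD=echo gives a dry run that just prints what would be called.

```shell
# SERVICE_CMD is an assumption of this sketch: set it to "echo" for a dry run.
SERVICE_CMD="${SERVICE_CMD:-/sbin/service}"

# Run one init-script action against each named service, in the order given.
# Carries on after a failure (a service may already be stopped) but returns
# non-zero at the end if any call failed.
service_each() {
    action="$1"
    shift
    failed=0
    for svc in "$@"; do
        if ! "$SERVICE_CMD" "$svc" "$action"; then
            echo "WARNING: '$action' failed for $svc" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Head node, first pass (same order as the steps above):
#   service_each stop srmv2.2 srmv2 srmv1 dpm
# Head node, once rfiod has been stopped on the disk servers:
#   service_each stop dpnsdaemon
```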

Clean up anything else

Use the downtime to do any other cleaning up, perhaps simply by rebooting the nodes.

Install head node RPMS

Ensure

enabled=1

in

/etc/yum.repos.d/glite-dpm-mysql.repo

Then

yum update DPM-server-mysql
yum install DPM-copy-server-mysql
yum update DPM-name-server-mysql
yum update DPM-srm-server-mysql
yum update DPM-rfio-server 
yum update DPM-DSI
yum update DPM-client

The last command won't do anything, as DPM-client will already have been pulled in as a dependency.

Run schema upgrade migration script

Check you have enough space

Ensure you have enough space on /var.

ls -lh /var/lib/mysql 
df -h /var

You will need enough space for around double the current contents of /var/lib/mysql.

If you do not have that much, you could try pruning some of the contents of the MySQL database.
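
The comparison can also be done by a script rather than by eye. The sketch below reads the advice conservatively and asks for free space of at least twice the current size of the database directory; has_room_for_migration is a made-up name and the factor of 2 is just the "around double" rule of thumb from this page.

```shell
# Sketch: is there free space for FACTOR times the database directory
# on the filesystem that holds it?
# has_room_for_migration is an illustrative name.
has_room_for_migration() {
    dbdir="$1"           # e.g. /var/lib/mysql
    factor="${2:-2}"     # "around double"
    used_kb=$(du -sk "$dbdir" | awk '{print $1}')
    # df -P keeps each filesystem on one line; column 4 is the free space.
    avail_kb=$(df -Pk "$dbdir" | awk 'NR==2 {print $4}')
    if [ -z "$used_kb" ] || [ -z "$avail_kb" ]; then
        echo "could not measure $dbdir" >&2
        return 2
    fi
    needed_kb=$((used_kb * factor))
    echo "database: ${used_kb} KB, needed: ${needed_kb} KB, available: ${avail_kb} KB"
    [ "$avail_kb" -ge "$needed_kb" ]
}

# Example:
#   has_room_for_migration /var/lib/mysql || echo "not enough space on /var"
```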

Run script

YAIM will do this for you or you can easily run it yourself.

cd /opt/lcg/share/DPM/dpm-db-310-to-320

Put the database password into a password file (/root/pwd in this example), then run:

./dpm_db_310_to_320 --db-vendor MySQL --db localhost --user root --pwd-file /root/pwd --dpm-db dpm_db --verbose

We did this with root as the user, but you may have a different DPM account.
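
Since the password file holds a database password in clear text, it is worth making sure nobody else can read it. A minimal sketch, assuming the file should contain just the password on a single line (this page does not say what format the script expects) and using the made-up helper name write_pwd_file:

```shell
# Sketch: write a password read from standard input to a file that only
# its owner can read or write. write_pwd_file is an illustrative name.
write_pwd_file() {
    file="$1"
    (
        umask 077                     # a newly created file gets mode 600
        IFS= read -r password || [ -n "$password" ] || exit 1
        printf '%s\n' "$password" > "$file" || exit 1
        chmod 600 "$file"             # also tighten a file that already existed
    )
}

# Example: run this, type the password and press Enter, so the password
# never appears on a command line or in the shell history.
#   write_pwd_file /root/pwd
```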

This takes quite a while (approx 2-3 hours for our 9 GB database) - so you can do the disk server install below AND have a sandwich in the meantime.

Install disk server RPMS

You can install the packages (DPM-rfio-server and DPM-DSI) using yum in a similar way to above.

Because of the way the disk servers are set up at Edinburgh, we installed the RPMs (and required dependencies) using rpm directly:

rpm --nodeps -iv http://glitesoft.cern.ch/EGEE/gLite/R3.1/glite-SE_dpm_disk/sl4/x86_64/RPMS.release/glite-security-voms-api-cpp-1.8.12-1.slc4.x86_64.rpm
rpm --nodeps -iv http://glitesoft.cern.ch/EGEE/gLite/R3.1/glite-SE_dpm_disk/sl4/x86_64/RPMS.updates/DPM-client-1.7.2-4sec.slc4.x86_64.rpm
rpm --nodeps -iv http://glitesoft.cern.ch/EGEE/gLite/R3.1/glite-SE_dpm_disk/sl4/x86_64/RPMS.updates/DPM-rfio-server-1.7.2-4sec.slc4.x86_64.rpm
rpm --nodeps -iv http://glitesoft.cern.ch/EGEE/gLite/R3.1/glite-SE_dpm_disk/sl4/x86_64/RPMS.updates/DPM-DSI-1.7.1-2sec.slc4.x86_64.rpm

Check / replace configuration files

The upgrade will have overwritten some configuration files, so you can either:

1. Rerun yaim to write out all configuration.

2. Edit these files by hand.

We only edited

/etc/sysconfig/dpm-gsiftp 

to reinstate

DPNS_HOST=srm.glite.ecdf.ed.ac.uk
DPM_HOST=srm.glite.ecdf.ed.ac.uk
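
With more than a couple of disk servers, the same two lines can be reinstated by a script. The helper below is a sketch: set_sysconfig_var is a made-up name, and it assumes the file uses plain KEY=VALUE lines with the key at the start of the line.

```shell
# Sketch: set KEY=VALUE in a sysconfig-style file, replacing an existing
# uncommented KEY= line or appending one if there is none.
# set_sysconfig_var is an illustrative name.
set_sysconfig_var() {
    file="$1"
    key="$2"
    value="$3"
    tmp=$(mktemp) || return 1
    if awk -v k="$key" -v v="$value" '
            index($0, k "=") == 1 { print k "=" v; found = 1; next }
            { print }
            END { if (!found) print k "=" v }
        ' "$file" > "$tmp"; then
        # Write back through the original file to keep its owner and mode.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Example, using our head node name:
#   set_sysconfig_var /etc/sysconfig/dpm-gsiftp DPNS_HOST srm.glite.ecdf.ed.ac.uk
#   set_sysconfig_var /etc/sysconfig/dpm-gsiftp DPM_HOST srm.glite.ecdf.ed.ac.uk
```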

Copy patched srmv2.2 binary

DPM versions 1.7.0 to 1.7.2-5 have an unresolved bug reported here.

This should only affect a heavily used node, but you can copy a patched binary into place as follows:

cd /opt/lcg/bin
mv srmv2.2 srmv2.2.orig
wget http://cern.ch/~dhsmith/DPM-server-mysql-1.7.2-4sec.slc4.x86_64_patch_for_53568/srmv2.2
chmod 755 srmv2.2
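
Note that if the wget fails after the mv, you are left with no srmv2.2 binary at all. A slightly more defensive variant is to download to a temporary name first and only swap the file in once it has arrived. The function below is a sketch of that swap; install_patched_binary and the /tmp path in the example are made-up names.

```shell
# Sketch: put an already downloaded binary in place of an installed one,
# keeping the original as <target>.orig.
# install_patched_binary is an illustrative name.
install_patched_binary() {
    new="$1"       # the downloaded file
    target="$2"    # the installed binary to replace
    if [ ! -s "$new" ]; then
        echo "refusing to install: $new is missing or empty" >&2
        return 1
    fi
    # Save the original once only, so that running this again does not
    # overwrite the saved copy with an already patched binary.
    if [ -e "$target" ] && [ ! -e "$target.orig" ]; then
        cp -p "$target" "$target.orig" || return 1
    fi
    cp "$new" "$target.new" &&
        chmod 755 "$target.new" &&
        mv -f "$target.new" "$target"
}

# Example, after downloading the patched binary to /tmp/srmv2.2.patched:
#   install_patched_binary /tmp/srmv2.2.patched /opt/lcg/bin/srmv2.2
```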

Restart services

On the head node

/sbin/service dpnsdaemon start

On the disk servers

/sbin/service rfiod start
/sbin/service dpm-gsiftp start

On the head node

/sbin/service dpm start
/sbin/service srmv1 start
/sbin/service srmv2 start
/sbin/service srmv2.2 start
/sbin/service dpmcopyd start

Test

RFIO on pool nodes

rfmkdir pool2.glite.ecdf.ed.ac.uk:/tmp/wahid
rfcp bob pool2.glite.ecdf.ed.ac.uk:/tmp/wahid
rfdir pool2.glite.ecdf.ed.ac.uk:/tmp/wahid
rfrm pool2.glite.ecdf.ed.ac.uk:/tmp/wahid/bob

SRM and GSIFTP

Mimicking the Steve Lloyd SE test

Copy and register

lcg-cr -v --vo atlas -d srm.glite.ecdf.ed.ac.uk file:/phys/linux/wbhimji/bob

List replicas

lcg-lr --vo atlas lfn:<whatever was given to you in the output of the command above>

Copy file back

lcg-cp -v --vo atlas srm://srm.glite.ecdf.ed.ac.uk/dpm/ecdf.ed.ac.uk/home/atlas/generated/2009-07-30/file3830946d-62f8-4efe-9a15-20658ad0a60a file:/phys/linux/wbhimji/bobbyD
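
It is worth confirming that the file that came back is identical to the one that went in, rather than only checking that the commands exited cleanly. A minimal sketch, where check_roundtrip is a made-up name and the example uses the local file names from the commands above:

```shell
# Sketch: compare the file that was uploaded with the copy fetched back.
# check_roundtrip is an illustrative name.
check_roundtrip() {
    original="$1"
    returned="$2"
    if cmp -s "$original" "$returned"; then
        echo "OK: $returned matches $original"
    else
        echo "MISMATCH: $returned differs from $original" >&2
        return 1
    fi
}

# Example:
#   check_roundtrip /phys/linux/wbhimji/bob /phys/linux/wbhimji/bobbyD
```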

Clean

lcg-del -v --vo atlas -s srm.glite.ecdf.ed.ac.uk lfn:/grid/atlas/generated/2009-07-30/file-3dd98ccf-001a-41ea-8f17-7765e9bee488

This will probably use SRM v1, so you can also test SRM v2 with

lcg-cr -D srmv2 -v --vo atlas -d srm.glite.ecdf.ed.ac.uk file:/phys/linux/wbhimji/bob

Check the SAM Tests