Difference between revisions of "DPMUpgradeTips"

From GridPP Wiki

Revision as of 10:30, 4 March 2015

This page is a collection of tips determined by various sites upgrading DPM versions.

See also:

Tips

Generic: DPM 1.8.9 Extreme logging

If you don't use Puppet, you need to manually add LogLevel 1 after the line LoadPlugin plugin_config /usr/lib64/dmlite/plugin_config.so in /etc/dmlite.conf.
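
A minimal sketch of that manual edit, assuming a POSIX shell with grep, awk and mktemp available. The function name add_loglevel is hypothetical and not part of DPM or dmlite. It prints LogLevel 1 directly after the LoadPlugin plugin_config line and leaves the file untouched if any LogLevel line is already present.

```shell
# Hypothetical helper (not part of DPM): add "LogLevel 1" after the
# "LoadPlugin plugin_config" line of a dmlite config file.
add_loglevel() {
    conf="$1"
    # Leave the file alone if a LogLevel directive is already present.
    if grep -q '^LogLevel' "$conf"; then
        return 0
    fi
    tmp="$(mktemp)" || return 1
    # Print every line; after the plugin_config line, also print LogLevel 1.
    awk '{ print }
         /^LoadPlugin[ \t]+plugin_config[ \t]/ { print "LogLevel 1" }' \
        "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage, as root, after taking a backup of the file:
#   add_loglevel /etc/dmlite.conf
```

Writing the result back with cat keeps the ownership and permissions of the original file. If a LogLevel line with a different value already exists, the sketch does not change it.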

Glasgow: SRM daemon unstable after upgrade to 1.8.7

For some reason, the srmv2.2 daemon became very unstable after the update. This was fixed by re-adding the line

export GLOBUS_THREAD_MODEL="pthread"

to /etc/sysconfig/srmv2.2
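
As a sketch, the line can be re-added in a way that is safe to repeat after later updates. The helper name ensure_line is hypothetical; it only assumes a POSIX shell and grep.

```shell
# Hypothetical helper: append a line to a file unless that exact line
# is already present.
ensure_line() {
    file="$1"
    line="$2"
    # -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage, as root, followed by a restart of the srmv2.2 service:
#   ensure_line /etc/sysconfig/srmv2.2 'export GLOBUS_THREAD_MODEL="pthread"'
```

Because the check is for an exact whole-line match, running it twice does not duplicate the export.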

Generic: DPM 1.8.7 "/etc/dmlite.conf.d/adapter.conf missing" and other oddness diverging from the expected

Make sure you have updated dpm-yaim (release version: 4.2.16-1) as well as the dpm packages themselves (the two are not linked). Also make sure that dmlite-libs has a release like 0.6.1-1.


Generic: DPM 1.8.7 "permission denied" errors, SL5 and SL6

The dpm database may need to be updated to replace "null" entries with "0" entries in the banned field. This is a result of a change in the parsing used by xrootd and other dmlite-backed transfer protocols. To do this:

mysql -u dpmmgr -p
< type the password, e.g. the one in /usr/etc/DPMCONFIG >
use cns_db
update Cns_groupinfo set banned = 0 where banned is null;
update Cns_userinfo set banned = 0 where banned is null;
commit;
quit
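
The same fix can be sketched non-interactively, with a count before and after so you can see whether any NULL entries existed. The function name banned_fix_sql is hypothetical; the table and column names are the ones used above. The function only prints SQL, and nothing touches the database until the output is piped into mysql.

```shell
# Hypothetical helper: print SQL that counts the NULL "banned" entries,
# applies the two updates from above, then counts again.
banned_fix_sql() {
    cat <<'EOF'
select count(*) from Cns_groupinfo where banned is null;
select count(*) from Cns_userinfo where banned is null;
update Cns_groupinfo set banned = 0 where banned is null;
update Cns_userinfo set banned = 0 where banned is null;
select count(*) from Cns_groupinfo where banned is null;
select count(*) from Cns_userinfo where banned is null;
EOF
}

# Usage (prompts for the dpmmgr password):
#   banned_fix_sql | mysql -u dpmmgr -p cns_db
```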

Update (25 Oct 13): This is still needed. It was believed to be fixed with the latest YAIM, but the problem has been encountered since then.

Oxford: Head node (and Disk nodes) DPM 1.8.6 (EMI2) -> DPM 1.8.7 (EMI 3 and EPEL) SL5 ; Upgrade in place

Xrootd transfers on the disk servers stopped working with:

130916 16:58:06 13158 Xrd: CheckErrorStatus: Server [t2se01.physics.ox.ac.uk:1094] declared: Unable to open /dpm/physics.ox.ac.uk/home/atlas/bob; unknown error 1020(error code: 3005) 
Last server error 3005 ('Unable to open /dpm/physics.ox.ac.uk/home/atlas/bob; unknown error 1020')

This turned out to be because the path of the physical file on the disk server starts with

/dpm

If this hits you, the current workaround is to edit /etc/dmlite.conf.d/adapter.conf and change

LoadPlugin plugin_fs_rfio /usr/lib64/dmlite/plugin_adapter.so

to

LoadPlugin plugin_fs_io /usr/lib64/dmlite/plugin_adapter.so

and restart xrootd (/sbin/service xrootd restart)
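
A minimal sketch of that edit as a script, assuming the LoadPlugin line appears exactly as shown above. The function name use_fs_io_plugin is hypothetical.

```shell
# Hypothetical helper: switch the adapter plugin line from
# plugin_fs_rfio to plugin_fs_io in the given config file.
use_fs_io_plugin() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    # Only a LoadPlugin line naming plugin_fs_rfio is changed.
    sed 's/^LoadPlugin plugin_fs_rfio /LoadPlugin plugin_fs_io /' "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage, as root, on an affected disk server:
#   cp -p /etc/dmlite.conf.d/adapter.conf /root/adapter.conf.bak
#   use_fs_io_plugin /etc/dmlite.conf.d/adapter.conf
#   /sbin/service xrootd restart
```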


ECDF: Head node (and Disk nodes) DPM 1.8.6 (EMI2) -> DPM 1.8.7 (EMI 3 and EPEL) SL5 ; Upgrade in place

A number of packages are now only in EPEL and have been removed from EMI3. The metapackage is in EMI. I did the upgrade as this transition was taking place (6 Sep 13), so some of the steps may not be needed once all the packages have been transitioned.

Upgrade to EMI3

rpm --import http://emisoft.web.cern.ch/emisoft/dist/EMI/3/RPM-GPG-KEY-emi
rpm -Uvh http://emisoft.web.cern.ch/emisoft/dist/EMI/3/sl5/x86_64/base/emi-release-3.0.0-2.el5.noarch.rpm

Clear the yum cache

rm -rf /var/cache/yum/*

Update

yum update --exclude=kernel* 

Rerun yaim

xrootd didn't start. Some packages had not been updated from the latest EPEL and didn't appear to be in my EPEL mirror, so I disabled protection and priorities in

/etc/yum.repos.d/emi3-base.repo 

and changed the EPEL repo in

/etc/yum.repos.d/epel.repo

to

baseurl=http://dl.fedoraproject.org/pub/epel/5/x86_64/

I then ran yum update and reran yaim, and everything started fine.

Following a similar procedure for the disk nodes, I encountered one further issue: emi-dpm_disk currently depends on dmlite-plugins-mysql, which caused my xrootd transfers to fail. You need to make sure you have either

DMLITE="yes"

in your yaim site config file, or at least NOT "no". An alternative is to remove that package (on the disk node only):

rpm -e --nodeps dmlite-plugins-mysql-0.6.0-2.el5.x86_64

ECDF: Head node glite3.1 (!) -> EMI 2 (DPM 1.8.3) SL5 ; Fresh install on new box - moved hostname

  • Completely set up the box (i.e. ran the yum install, ran yaim and tested transfers) with a different hostname before moving the hostname. That may have caused some of the problems below, but it also meant I could be sure ports etc. were open before the downtime.
  • Access denied for root@localhost: had to run mysql_upgrade
  • In some places, e.g. /etc/sysconfig/rfio and /etc/cron.monthly/create-default-dirs-DPM.sh, DPM_HOST and DPNS_HOST were set to the old hostname. The files /usr/etc/DPMCONFIG and /usr/etc/NSCONFIG also had the old hostname, which causes database problems. This would not happen if you did not run yaim before moving the hostname.

  • When dpm started up it could not contact half of the disk servers and marked them DISABLED. This was the host iptables firewall.
  • lcg-cp got stuck on transfer for any disk server. This was the reverse DNS being cached on part of the cluster.
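
For the old-hostname problem in the list above, a quick check can be sketched as follows. The function name files_with_hostname and the hostname old-se.example.org are placeholders, not values from this site.

```shell
# Hypothetical helper: list which of the given files still mention a hostname.
files_with_hostname() {
    host="$1"
    shift
    # -l prints only the names of matching files,
    # -F treats the hostname as a fixed string rather than a regex.
    grep -lF -- "$host" "$@" 2>/dev/null
}

# Usage, with old-se.example.org standing in for the old hostname:
#   files_with_hostname old-se.example.org \
#       /etc/sysconfig/rfio /etc/cron.monthly/create-default-dirs-DPM.sh \
#       /usr/etc/DPMCONFIG /usr/etc/NSCONFIG
```

Any file name printed still needs editing (or regenerating with yaim) before the services are started under the new hostname.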

UPDATE: yum updated this node to 1.8.4, which is supposed to have "pthread" set automatically in the sysconfig scripts (see the link to the 1.8.3 release notes at the top of the page). Unfortunately that did not happen for me, and it updated the sysconfig scripts I already had, which caused the dpns daemon to be unstable. Adding the "pthread" line fixed it.

ECDF: Disk node glite3.2 -> EMI 2 (DPM 1.8.4) SL5 ; upgrade in place

I followed the steps on the DPM trac page mentioned at the top of this page. Note that the yum remove step in that guide removes the whole of dpm, so I stopped the services before that.

Disable the old repos:
/etc/yum.repos.d/glite.repo
/etc/yum.repos.d/glite-SE_dpm_disk.repo
/etc/yum.repos.d/dag.repo
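
One way to disable the repo files listed above is to flip their enabled=1 lines, sketched here with a hypothetical helper named disable_repo_file. This assumes every section in the file has an explicit enabled line; a section without one stays enabled and would need enabled=0 added by hand.

```shell
# Hypothetical helper: set every "enabled=1" line in a yum .repo file to
# "enabled=0". Sections with no explicit "enabled" line are not handled.
disable_repo_file() {
    repo="$1"
    tmp="$(mktemp)" || return 1
    sed 's/^enabled[ ]*=[ ]*1[ ]*$/enabled=0/' "$repo" > "$tmp" \
        && cat "$tmp" > "$repo"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage, as root:
#   for f in glite.repo glite-SE_dpm_disk.repo dag.repo; do
#       disable_repo_file "/etc/yum.repos.d/$f"
#   done
```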

yum install yum-priorities yum-protectbase
rpm -ivh http://emisoft.web.cern.ch/emisoft/dist/EMI/2/sl5/x86_64/base/emi-release-2.0.0-1.sl5.noarch.rpm
wget http://mirror01.th.ifl.net/epel//5/x86_64/epel-release-5-4.noarch.rpm
yum install epel-release-5-4.noarch.rpm 
/sbin/service xrootd stop 
/sbin/service dpm-gsiftp stop 
/sbin/service rfiod stop 
yum remove vdt_globus_essentials vdt_globus_data_server lcg-service-proxy glite-yaim-dpm

I had to move the old globus directories, and possibly the lcg directory too:

mv /opt/globus/ OLDOPTGLOBUS
mv /opt/lcg OLDOPTLCG
yum install  emi-dpm_disk

I had to update yaim core and install mkgridmap, neither of which was pulled in at the correct version (the former causes all kinds of service startups to fail, as it looks in the wrong places):

yum update glite-yaim*
yum install edg-mkgridmap

  • Had to set extra yaim variables: SITE_SUPPORT_EMAIL=lcg_support@nesc.ac.uk; DMLITE="no" (the latter could be "yes", but then even more variables seem to be required)
/opt/glite/yaim/bin/yaim -c -s siteinfo/site-info.def -n emi_dpm_disk
  • As I have xrootd on these nodes, I had to copy back the config, disable the old test repo (as xrootd is now in EMI externals) and yum install it
cp /etc/xrootd/xrootd-dpmdisk.cfg.rpmsave /etc/xrootd/xrootd-dpmdisk.cfg
emacs -nw /etc/yum.repos.d/dpm-xrootd5.repo 
yum install dpm-xrootd
/sbin/service xrootd restart

EXTRA: On rerunning yaim, I found that certain variables are required for xrootd: DPM_XROOTD_SHAREDKEY and DPM_XROOTD_DISK_MISC (for ATLAS). See the DPM trac pages for details of what to set them to.

EXTRA2: After a while I also found that lcg-expiregridmapdir had not been (properly) updated: it was at 2.1.0 instead of 3.0.1, so the cron script pointed to a non-existent EMI script. This was resolved with rpm -e --nodeps lcg-expiregridmapdir-2.1.0-1.noarch; yum install lcg-expiregridmapdir

EXTRA3: To upgrade from an existing EMI (1.8.3) release to 1.8.4, I just ran

yum update --enablerepo=epel --enablerepo=EMI-2-base --enablerepo=EMI-2-contribs --enablerepo=EMI-2-third-party --enablerepo=EMI-2-updates emi-dpm_disk *yaim*

and reran yaim with DMLITE="no", DPM_XROOTD_SHAREDKEY and DPM_XROOTD_DISK_MISC set.

Brunel MICE SE

My experiences with the MICE-specific SE at Brunel are here.