RAL Tier1 Upgrade Plan

From GridPP Wiki

Upgrade Plan for 2.1.9 (DRAFT)

The following contains the technical details and schedule for the proposed upgrade of CASTOR to 2.1.9. This is a complex upgrade, and the plan has yet to be reviewed by the Configuration Control Board (CCB) and other parties such as the PMB. Nevertheless, certain steps must be taken in order to achieve this upgrade as transparently as possible. Dates are subject to change and require agreement from the relevant experiments.


This document does not cover the configuration and use of the gridFTP "internal" protocol, since it is not suitable for use at RAL: it requires a one-to-one mapping between disk pools and service classes, which does not hold in some of the setups used at RAL.

The following lists the dependencies and possible future issues that have been considered, as requested by the CCB. The list may not be exhaustive, and the CCB may wish to add to it.

Issue | Attention required | Notes
Do client tools on worker nodes and the UI need updating? | We do not believe so | Some functional tests have been performed successfully with 2.1.7 clients; more exhaustive testing will be done during VO testing. However, if xrootd clients are required, an upgrade will be needed.
Impact on lcg-utils | None | Will be tested during VO testing.
Will central logging be affected by the use of rsyslog? | No | Provided the correct rsyslog config file is rolled out to all affected nodes, all required information should be redirected to the central loggers. However, space on the central loggers will need to be monitored.
Will the local log files cause space problems on the /var partition? | Possibly | Anecdotally, the log files seem larger. However, since CASTOR logs can now be sent to the central loggers, we may need to keep fewer on local machines.
FTS | None | Will be tested during VO testing.
CIP and BDII | None | Jens has tested the CIP with the pre-production system and there appears to be no impact. The CIP has also been successfully tested against 2.1.9 with the S2 tests written by Flavia Donno, which verify capacity conformance.
Nagios tests | Require rewriting | JPK is doing this. The issue of switching tests during a rolling upgrade is still to be investigated.
Restarters | Require rewriting | The restarters on the stager node for pre-production are done; the rest need rewriting.
LFC | None | The LFC is decoupled from CASTOR; the upgrade will have no impact.
CastorMon | None | The calls used to derive information for CastorMon do not change with this upgrade, so no impact is foreseen. In addition, CastorMon is a non-critical service, so we can potentially work without it for a time until any problems are fixed.

Preliminary Work

The following preliminary work must take place prior to the upgrade. This preliminary work can be undertaken at any point prior to the upgrade and can be done in a rolling manner on each instance. The configuration has been tested already and we are confident that this step is low risk. For each instance there will be a short (< 0.5 hours) service interruption while all CASTOR services are restarted. While it would be possible to do this at-risk it may be more prudent to request a short outage so all services are brought up cleanly and coherently. The steps to be undertaken for each instance are as follows:

  • Install a local nameserver on the DLF machine for each instance. This will need to be at the same version as the current CASTOR version (2.1.7-27) to avoid library incompatibilities.
  • Modify the CastorConfig table in each stager schema as follows:
    UPDATE CastorConfig SET value = 'castorns.ads.rl.ac.uk' 
        WHERE class = 'stager' AND key = 'nsHost';
  • Modify castor.conf on all the CASTOR head nodes and disk servers to tell the system to use the local nameserver rather than the central one. This requires adding/modifying the CNS HOST entry to reference the DLF machine for that instance.
  • Restart all CASTOR daemons on the CASTOR head nodes and disk servers.
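As a sketch of the last two steps, the castor.conf edit might look like the following. The DLF hostname is hypothetical, and a scratch file stands in for /etc/castor/castor.conf so the commands can be tried safely:

```shell
# Stand-in for /etc/castor/castor.conf; the hostname below is invented.
CONF=./castor.conf.example
LOCAL_NS=cdlf0123.gridpp.rl.ac.uk    # the DLF node for this instance

# A minimal example config still pointing at the central nameserver.
cat > "$CONF" <<EOF
CNS HOST castorns.ads.rl.ac.uk
STAGE HOST stager.example.rl.ac.uk
EOF

# Replace the existing CNS HOST entry, or append one if it is missing.
if grep -q '^CNS[[:space:]][[:space:]]*HOST' "$CONF"; then
    sed -i "s|^CNS[[:space:]][[:space:]]*HOST.*|CNS HOST ${LOCAL_NS}|" "$CONF"
else
    echo "CNS HOST ${LOCAL_NS}" >> "$CONF"
fi

grep '^CNS' "$CONF"
# All CASTOR daemons on the node would then be restarted.
```

On the real nodes the same edit would be applied to /etc/castor/castor.conf (by hand or via the configuration management system), followed by the restart of all CASTOR daemons.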

This use of local nameservers has been tested and is currently used in production for the SRMs. The same method is used for upgrades at CERN.

In addition we will carry out some preliminary work to remove redundant data from the stager schemas which would otherwise slow down the later schema upgrade process. This has been tested on snapshots of the production schemas. The procedure for the stager schemas is as follows:

  • Clean up entries in the castorfile table which relate to non-existent service classes. These can be found using the following:

Once these svcClass ids have been found, the corresponding entries can be deleted from the castorfile table:

    DELETE FROM CastorFile WHERE svclass=<id of svcClass>
  • Clean up all entries in the diskcopy table that do not have a corresponding entry in the castorfile table. These can be identified using the following:

In some cases at RAL we have also found entries in the diskcopy table with a castorfile value of 0; these can also be deleted. In addition, we will need to assign an appropriate file class to the nameserver top-level directories. In the current production system this means only the directories '/' and '/castor'.

  • Revalidate the DiskCopy to CastorFile constraints.
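To illustrate the kind of queries involved, the following toy example reproduces the two orphan-hunting steps on a throwaway SQLite database. Production is Oracle and the real stager schema is richer; the table and column names here are simplified for illustration:

```shell
# Build a tiny stand-in for the stager schema and find the orphans.
rm -f demo.db
sqlite3 demo.db <<'EOF'
CREATE TABLE SvcClass   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE CastorFile (id INTEGER PRIMARY KEY, svcClass INTEGER);
CREATE TABLE DiskCopy   (id INTEGER PRIMARY KEY, castorFile INTEGER);
INSERT INTO SvcClass   VALUES (1, 'atlasTape');
INSERT INTO CastorFile VALUES (10, 1), (11, 99);       -- 99: service class no longer exists
INSERT INTO DiskCopy   VALUES (100, 10), (101, 5555);  -- 5555: no such castorfile

-- castorfile entries that reference a non-existent service class
SELECT id FROM CastorFile
 WHERE svcClass NOT IN (SELECT id FROM SvcClass);

-- diskcopy entries with no corresponding castorfile entry
SELECT id FROM DiskCopy
 WHERE castorFile NOT IN (SELECT id FROM CastorFile);
EOF
```

In production the matching rows would then be removed with the corresponding DELETE statements, as in the castorfile example above.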


Upgrade the Central Nameserver

Once all the instances are configured to run the local nameserver, the central nameserver can then be upgraded to 2.1.9. NOTE: This will be a pure upgrade of the relevant RPMs (nameserver, CUPV, VMGR and VDQM). The nameserver upgrade itself is low risk since by this point nothing should be using it. The upgrade of CUPV, VMGR and VDQM does carry some additional risk since these are shared by all instances. However, since there are no schema changes between our current version (2.1.8-3) and the latest 2.1.9, this additional risk is minimal. If needed, it could be further reduced by using local daemons, but this configuration has not been tested.

The list of CASTOR-related RPMs following the upgrade is shown here.

Upgrade CASTOR Services

Details of the upgrade plan are available here

The upgrade of the CASTOR instances can be done in a rolling manner, but will require about 3 days of downtime for each experiment. This downtime covers not only the time to perform the necessary steps but also allows for internal testing before reopening the service to users. It will also require a drain of the FTS queues and disabling of the batch system for the affected VOs. Based on our internal testing, the steps required are as follows; see also the additional notes for the upgrade of the GEN instance, where xrootd is required.

  • Drain the batch system for the affected VOs
  • Drain and stop FTS for the affected VOs
  • Stop all CASTOR services on CASTOR head nodes and disk servers
  • Take a copy of the rhServer init.d startup script (testing has shown this gets overwritten)
  • Upgrade all RPMs on CASTOR head nodes and disk servers and replace the rhServer init script
  • Upgrade the relevant stager database schema to 2.1.9
  • Modify all restarters to use the new daemon names
  • Modify the appropriate nagios scripts to use the new daemon names
  • Restart all CASTOR services on head nodes
  • Perform functional tests on the upgraded system
  • Reopen FTS and batch system.
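The rhServer init-script precaution above can be illustrated as follows; a scratch directory stands in for /etc/init.d and the file contents are invented:

```shell
# Toy demonstration: keep a copy of the local rhServer init script, since
# testing showed the RPM upgrade overwrites it, then restore it afterwards.
mkdir -p ./initd-demo
echo "local RAL rhServer script" > ./initd-demo/rhserver

cp -p ./initd-demo/rhserver ./initd-demo/rhserver.saved    # before the upgrade
echo "packaged rhServer script" > ./initd-demo/rhserver    # the RPM overwrite
cp -p ./initd-demo/rhserver.saved ./initd-demo/rhserver    # restore our version

cat ./initd-demo/rhserver
```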

The following pages show the RPMs that should be installed on the stager node, LSF node, DLF node and disk servers.

Additional Information for DLF Replacement

In 2.1.9, the DLF application has been deprecated and replaced with a combination of rsyslog and a new application, logprocessord. rsyslog needs to be installed on all of the CASTOR head nodes, including the nameserver and the SRM when it is upgraded to 2.1.9, and on all the disk servers. The logprocessord only needs to be installed on the head nodes. The following procedure should be followed:

    • Make sure all CASTOR services are down.
    • Ensure /etc/castor/DLFCONFIG is installed on all the head nodes, and the correct entry is in the tnsnames.ora file.
    • Install rsyslog.
    • Install /etc/rsyslog.conf from https://svnweb.cern.ch/trac/CASTOR/browser/CASTOR2/trunk/debian/rsyslog.conf.server
    • Start rsyslog (NOTE: this should also be chkconfig'd on)
    • Install castor-logprocessor-server
    • Copy /etc/castor/logprocessord.conf.example to /etc/castor/logprocessord
    • Start logprocessord. (NOTE: - at RAL it seems that TNS_ADMIN is ignored by logprocessord; tnsnames.ora must be in /etc as well as /etc/castor.)
    • Start all CASTOR processes
    • Check that everything is working by checking the files /var/log/castor/logprocessord.log.dlf-syslog-to-db and /var/log/dlf/syslog.input
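A simple check along the lines of the last step could look like this; the check itself is our own sketch, not a CASTOR tool, and the paths are the ones listed above:

```shell
# Warn if a log file that should be receiving messages is missing or empty.
check_log() {
    if [ -s "$1" ]; then
        echo "OK: $1 is present and non-empty"
    else
        echo "WARN: $1 missing or empty"
    fi
}

check_log /var/log/castor/logprocessord.log.dlf-syslog-to-db
check_log /var/log/dlf/syslog.input
```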

Additional Information for xrootd

Detailed instructions are available here, but the following gives an outline. xrootd needs to be installed on all disk servers and on the manager node(s). It is not necessary to install any xroot-specific packages on the head nodes unless one or more of them is designated as the manager.

  • On the disk servers and manager(s) install the following RPMs:

For ALICE, the following RPMs are also required:

  • On disk servers and managers, create a directory /opt/xrootd/keys
  • On manager, add /opt/xrootd/lib to LD_LIBRARY_PATH
   export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/xrootd/lib
  • Configuring the Manager
    • Following procedures in the above link create a service key in /opt/xrootd/keys directory
    • Copy the public key to the /opt/xrootd/keys directory on all disk servers (or push it out via puppet/quattor/whatever)
    • Copy /etc/sysconfig/xrd.example to /etc/sysconfig/xrd
    • Edit /etc/xrd.cf. Some examples of this configuration are shown in the link provided; additionally, see the Manager Configuration File which has been set up for pre-production at RAL. Note that this sample configuration file has been created with minimal security; tests have not yet been carried out with SSL or Kerberos authentication. There are a few important points to note when setting this up.
      • The all.manager item must contain the result of hostname -f, not any convenient alias.
      • In the xcastor2.stagemap, if multiple service classes are associated with the nameserver path, then the first service class will be used for all reads and writes. In order to get xroot to search through all service classes for a file when reading, it is necessary to add the nohsm option to the xcastor2.stagerpolicy item. However, this option also prevents tape recalls from xrootd.
    • Start the manager process
    service xrd start
    Starting xrootd as manager Starting with -n manager -r -c /etc/xrd.cf -l /var/log/xroot/xrdlog.manager -b
  • Configuring Disk Servers
    • Copy /etc/sysconfig/xrd.example to /etc/sysconfig/xrd
    • There is no need to modify the /etc/xrd.cf supplied with the xrootd RPM
    • Start xrd process on the disk servers
    service xrd start
    Starting xrootd as server Starting with -n server -c /etc/xrd.cf -l /var/log/xroot/xrdlog.server -b -R stage
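To make the manager-side points concrete, a minimal xrd.cf fragment might look like the following. The hostname and nameserver path are hypothetical, and the exact directive syntax should be checked against the linked examples:

```
# Manager endpoint: must be the output of `hostname -f`, not an alias.
all.manager castor-xrdmgr.gridpp.rl.ac.uk:1213

# Map a nameserver path to its service class(es); with several classes
# the first listed is used for all reads and writes unless nohsm is set.
xcastor2.stagemap /castor/ads.rl.ac.uk/alice aliceDisk

# nohsm makes reads search all service classes for the file, but also
# prevents tape recalls via xrootd.
xcastor2.stagerpolicy nohsm
```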

Post Upgrade Phase

Following the main upgrade of each instance it will be necessary to upgrade the nameserver schema and reset each instance to use the central nameserver. This will require a complete downtime for CASTOR of around 2 hours to make the relevant changes. Since this upgrade involves a schema change (dropping a column), it cannot be done until all nameservers are at a version greater than 2.1.9-3. The following steps are required:

  • Stop all CASTOR services on the CASTOR head nodes (including the SRMs)
  • Upgrade the database schema to the latest 2.1.9
  • Modify castor.conf on all head nodes, SRMs and disk servers to ensure the CNS HOST parameter refers to the central nameserver
  • Restart all CASTOR daemons on head nodes, SRMs and disk servers.

We propose to stop the local nameserver on the DLF node at this point, but will not remove the RPM so this method can be used in any future upgrades.


The proposed schedule is as follows:

  • Switch to local nameserver
    • GEN
    • ATLAS
    • CMS
    • LHCb
  • Upgrade Central nameserver
  • Upgrade CASTOR Head nodes
    • GEN
    • ATLAS
    • CMS
    • LHCb
  • Revert to Central nameserver.

Summary of Downtimes

This plan requires a total of three downtimes for each experiment, of varying lengths.

  1. The first will be a relatively short interruption while we switch daemons to use a local nameserver. This can be done on a rolling basis and should require < 1 hour downtime for each instance.
  2. The second will be for the upgrade of each instance. This will require about 2 days of downtime and a drain and stop of FTS and the batch system (TBC) prior to the upgrade. It will be done in a rolling fashion at a time convenient for each experiment; we will give as much notice as possible.
  3. A short (about 2-hour) downtime for all experiments while we upgrade the nameserver schema and switch back to using the central nameserver.

Current Issues Identified in Testing

  • The logprocessord seems to die randomly.
  • The NSFILEID is not logged in DLF.
  • service logmonitord always reports failed.
  • We have had one rmMasterDaemon crash.
  • After first starting logprocessord, all DLF reporting failed. It is necessary to restart each service to re-register it with DLF. This will need to be done every time the DLF database is recreated.