Difference between revisions of "VOMS"

From GridPP Wiki

Revision as of 09:00, 6 November 2018

VOMS Service Setup

Replication

The current setup consists of one master server hosted in Manchester and two slaves in Oxford and Imperial.

The master server hosts both the VOMS daemons, which issue the attribute certificates (ACs), and the VOMS-admin interface, which VO managers use to manage their VOs and users use to join a VO or update their VO membership. The slaves run the VOMS daemons but no VOMS-admin interface: they can be used to get ACs, but not to modify VOs.

All servers run a local MySQL database. Each slave has a local copy of the master database which is kept up-to-date using MySQL replication. The connections between master and slaves are secured with SSL.
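
Whether replication is running on a slave can be checked with MySQL's SHOW SLAVE STATUS statement. The following is a minimal sketch, not part of the existing tooling: the function name replication_ok is made up for illustration, and the usage line assumes that the mysql client can log in without extra options.

```shell
# Hypothetical helper (not part of the VOMS setup): reads the output of
# "SHOW SLAVE STATUS\G" on stdin and succeeds only if both replication
# threads report "Yes".
replication_ok() {
    awk -F': *' '
        $1 ~ /Slave_IO_Running$/  { io  = $2 }
        $1 ~ /Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Usage on a slave (assumes credentials are configured, e.g. in ~/.my.cnf):
#   mysql -e 'SHOW SLAVE STATUS\G' | replication_ok && echo "replication running"
```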

Restarting Services

Normally, the services only have to be restarted if their configurations have changed or if newer versions have been installed during an upgrade.

VOMS daemons can be either restarted or reloaded. Configuration changes can be applied by reloading the daemons. A restart is only necessary if there are problems with the service or after software upgrades.

A single VOMS daemon can be reloaded or restarted by running

service voms reload [VO]     # reloads the daemon for [VO]
service voms restart [VO]    # restarts the daemon for [VO]

If the VO name is omitted then all VOMS daemons are reloaded / restarted.

Reloading the service does not read the configuration immediately. The daemon reads it when the next client connects, before it processes that client's request. Opening a connection to the daemon's port is therefore enough to force the configuration file to be read, e.g.,

echo "" | telnet voms.gridpp.ac.uk 15050
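
The reload and the forced connection can be combined in a small wrapper. This is only a sketch: the function names and the DRY_RUN switch are invented here, "myvo" is a placeholder VO name, and the port belonging to each VO has to be supplied by the caller. It uses bash's /dev/tcp pseudo-device instead of telnet, which is a substitution made for this example.

```shell
# Sketch only: with DRY_RUN set, commands are printed instead of executed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Reload the daemon of one VO, then open and close a TCP connection to its
# port so that the new configuration is read straight away.
# Arguments: VO name, host, port.
reload_and_poke() {
    vo=$1 host=$2 port=$3
    run service voms reload "$vo" || return 1
    # bash's /dev/tcp replaces the telnet call; timeout guards against a
    # connection attempt that hangs.
    run timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port"
}

# Example ("myvo" is a placeholder VO name):
#   reload_and_poke myvo voms.gridpp.ac.uk 15050
```

Running it with DRY_RUN=1 first shows the commands without touching the service.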

To restart the Java service container, run

service voms-admin restart

This will reload all VOMS-admin instances as well. To restart a single VOMS-admin instance it has to be removed and added again (there is no restart option and the stop/start actions don't seem to work):

service voms-admin undeploy <VO>
service voms-admin deploy <VO>

IMPORTANT, please read before restarting any services:

Restarting services makes them inaccessible for a period of time. Restarting the VOMS daemons takes a few seconds at most. The command to restart VOMS-admin returns almost immediately, but the restart itself takes a lot longer to complete: the command only starts the process, which then runs in the background, and does not wait for it to finish.

EVEN MORE IMPORTANT

VOMS-admin stores the last notification date in memory only (see VOMS notifications). Every time VOMS-admin or a VO is restarted, emails are sent out to VO admins! See Dealing With Notification E-Mails for how to avoid the problem.

Adding New CAs

A restart of services is not required after installing new CA certificates on the server. VOMS-admin requires a CRL to be present before accepting the CA. The CRLs of new CAs can be downloaded by manually running

fetch-crl

The list of CAs that the Java service container accepts can be shown with the following command (on the server):

echo "" | openssl s_client -connect localhost:8443 -CApath /etc/grid-security/certificates/ \
 -cert /etc/grid-security/hostcert.pem -key /etc/grid-security/hostkey.pem
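
The openssl output is long, and the CA list is only one part of it. A filter along the following lines can pull out just the CA names. It is a sketch under assumptions: the function name is made up, and it relies on the "Acceptable client certificate CA names" header that openssl s_client prints and treats the lines after it that contain "=" as subject names.

```shell
# Hypothetical filter: prints the CA subject names found in "openssl s_client"
# output read on stdin. Assumes the names directly follow the header line and
# that every name contains at least one "=".
accepted_ca_names() {
    awk '
        /^Acceptable client certificate CA names/ { in_list = 1; next }
        in_list && /=/ { print; next }
        in_list { exit }
    '
}

# Usage on the server:
#   echo "" | openssl s_client -connect localhost:8443 \
#       -CApath /etc/grid-security/certificates/ \
#       -cert /etc/grid-security/hostcert.pem \
#       -key /etc/grid-security/hostkey.pem 2>/dev/null | accepted_ca_names | sort
```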

Adding New VOs

This section contains obsolete information

Due to major changes to the VOMS deployment with the upgrade from EMI-2 to EMI-3, the way of adding new VOs described in this section stopped working. There is no yaim, and the script used on the master server was not installed on the server because it stopped working after the upgrade. This section will be updated once the replacement script and procedures are in place.

Master Server

Using yaim to add a VO is not advisable, because it disrupts services (multiple restarts of Tomcat and the VOMS daemons) and has other unwanted side effects (see comments in Restarting Services).

Instead, use the man-voms-add-vo script located in /opt/glite/bin. It will deploy the VO, update the configuration files with our default settings, update the ACL of the VO, and start the processes for the VO. It can also be used to update the yaim configuration files. All options are explained in the usage instructions, which can be viewed by running

/opt/glite/bin/man-voms-add-vo -h

In most cases the following example should be sufficient when deploying a VO:

/opt/glite/bin/man-voms-add-vo --vo <vo name> --email <email> --hostname voms.gridpp.ac.uk --restart --yaim-config /etc/yaim

The --restart parameter restarts the VOMS admin siblings web application that shows the list of VOs (/vomses). Without restarting it, the list will not contain the new VO. It will not trigger any emails to VO admins.

Replicated Servers

The VO has to be configured on the replicated servers after it has been deployed on the master. Currently, the replicated servers are configured by yaim, so they need the relevant yaim configuration blocks for the new VOs. These can be generated by running

/opt/glite/bin/print_slave_yaim_vo_config <vo>

on the master. Multiple VOs can be specified on the command line. Note that this script only generates the yaim options that the man-voms-add-vo script supports. Other yaim options, or direct changes to the VO configuration files, are not covered: they have to be added to the yaim configuration or, if the option is not available in yaim, applied manually to the VO configuration on the replicated servers after running yaim. The printed information can be sent to the sites hosting the servers, which have to put it into the services/glite-vomsdaemons configuration file and re-run the configure_slave_yaim script. Site admins should not run yaim directly on the replicated servers; they should always use the wrapper script.

Removing VOs

Replicated Servers

VOs have to be removed from the replicated servers first. The hosting sites have to do the following steps for each VO (replace <vo> with the name of the VO):

  • run 'service voms stop <vo>'
  • remove the /etc/voms/<vo> directory
  • remove <vo> from the VOS variable in site-info.def
  • remove the configuration section of the VO in services/glite-vomsdaemon

There is no need to run yaim on the replicated servers to remove VOs.
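
The first three steps for a replicated server can be sketched as follows. Both function names are made up for this example. Nothing is executed or edited: the first function only prints commands for review, and the second only prints what the new value of the VOS variable would be.

```shell
# Hypothetical helper: prints the commands for stopping the daemon and
# removing the VO directory. Nothing is executed.
print_slave_removal() {
    echo "service voms stop $1"
    echo "rm -r /etc/voms/$1"
}

# Hypothetical helper: prints a space-separated VO list with one VO removed,
# e.g. to work out the new value of the VOS variable in site-info.def.
# Arguments: VO to remove, followed by the current list of VOs.
vos_without() {
    remove=$1
    shift
    result=""
    for name in "$@"; do
        [ "$name" = "$remove" ] || result="$result${result:+ }$name"
    done
    echo "$result"
}

# Example (placeholder VO names):
#   print_slave_removal vo2
#   vos_without vo2 vo1 vo2 vo3
```

The configuration section in the yaim services file still has to be removed by hand.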

Master Server

Once the VO has been removed from all replicated servers, it can be removed from the master. To do this, follow these steps for each VO (replace <vo> with the name of the VO):

  • run 'service voms stop <vo>'
  • run 'service voms-admin undeploy <vo>'
  • create a backup of the configuration directories and log files:
export vo=<vo>
(umask 0077; cd /etc/voms; tar cjvf /var/backups/voms/$vo.tar.bz2 $vo)
(umask 0077; cd /etc/voms-admin; tar cjvf /var/backups/voms/$vo-admin.tar.bz2 $vo)
(umask 0077; cd /var/log/voms; tar cjvf /var/backups/voms/$vo-logs.tar.bz2 voms.$vo*)
(umask 0077; cd /var/log/voms-admin; tar cjvf /var/backups/voms/$vo-adminlogs.tar.bz2 voms-admin-$vo*.log)
  • make a final backup of the database. The easiest way is to run the backup script, which creates backups for all VOs, and then copy the VO's backup file:
/opt/bin/voms_dbbackup
export vo=<vo>
(umask 0077; ls /var/backups/voms/mysql/$vo.*.sql | sort | tail -n1 | xargs -n1 -I{} cp {} /var/backups/voms)
bzip2 /var/backups/voms/$vo.*.sql
  • check the database name in the configuration file /etc/voms/<vo>/voms.conf (--dbname=<dbname>)
  • remove the /etc/voms/<vo> and /etc/voms-admin/<vo> directories.
  • delete the database
mysql
mysql> drop database <dbname>;
mysql> quit
  • copy the VO backup files to the backup server
    Log files and database backup files should be kept in compliance with current laws and Grid policies[1].
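
Because dropping the database cannot be undone, it may be worth checking the backups first. The two helpers below are a sketch, not existing scripts: their names are invented, and they only assume the file names produced by the commands above and the --dbname=<dbname> line in voms.conf.

```shell
# Hypothetical helper: prints the database name from a voms.conf file
# (the value of its --dbname=<dbname> line).
voms_dbname() {
    sed -n 's/^--dbname=//p' "$1"
}

# Hypothetical helper: succeeds only if the four tar archives and at least
# one compressed database dump for a VO exist and are non-empty.
# Arguments: backup directory, VO name.
backups_complete() {
    dir=$1 vo=$2 status=0
    for f in "$dir/$vo.tar.bz2" "$dir/$vo-admin.tar.bz2" \
             "$dir/$vo-logs.tar.bz2" "$dir/$vo-adminlogs.tar.bz2"; do
        [ -s "$f" ] || { echo "missing or empty: $f" >&2; status=1; }
    done
    # The dump has a variable part in its name ($vo.*.sql.bz2).
    found=0
    for f in "$dir/$vo".*.sql.bz2; do
        if [ -s "$f" ]; then found=1; fi
    done
    [ "$found" -eq 1 ] || { echo "no database dump for $vo in $dir" >&2; status=1; }
    return "$status"
}

# Example ("myvo" is a placeholder; run voms_dbname before removing /etc/voms/myvo):
#   voms_dbname /etc/voms/myvo/voms.conf
#   backups_complete /var/backups/voms myvo && echo "safe to drop the database"
```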

Dealing With Notification E-Mails

Notification emails are sent to the VO admins or users every time the web application of a VO or the Java service container is restarted (see comments in Restarting Services). The easiest way of avoiding this problem is to prevent sendmail from sending out any emails during maintenance. This can be done by running the command

/opt/bin/disable_sendmail

on the server. The script configures sendmail to redirect all outgoing email to the mailbox of the local root account. Those emails can be viewed with a local email client such as mutt. Important emails that are not related to automated notifications can be forwarded to the users once sendmail is re-enabled.

Currently, there is no script to re-enable sendmail, but the following steps can be used to do it:

cp /usr/share/sendmail-cf/m4/proto.m4.org /usr/share/sendmail-cf/m4/proto.m4
cp /etc/mail/sendmail.mc.org /etc/mail/sendmail.mc
make -C /etc/mail
service sendmail restart

The *.org files were created manually before the first run of disable_sendmail. If they are not present then look for the backup copies that disable_sendmail creates in /var/backups/sendmail.
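
Until a proper script exists, the steps above could be wrapped roughly as follows. This is a sketch: both function names are made up, and the only behaviour added beyond the commands listed above is that it stops if a *.org file is missing.

```shell
# Hypothetical helper: restores FILE from FILE.org, or fails with a hint if
# the .org copy is missing.
restore_from_org() {
    if [ -f "$1.org" ]; then
        cp "$1.org" "$1"
    else
        echo "no $1.org; look for a backup in /var/backups/sendmail" >&2
        return 1
    fi
}

# Re-enable sendmail; stops at the first step that fails.
enable_sendmail() {
    restore_from_org /usr/share/sendmail-cf/m4/proto.m4 &&
    restore_from_org /etc/mail/sendmail.mc &&
    make -C /etc/mail &&
    service sendmail restart
}
```

Stopping at the first failure avoids rebuilding and restarting sendmail with only one of the two files restored.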

Undelivered emails should be checked after restoring sendmail. Notification emails can be ignored; all other emails should be forwarded to the respective users. Emails with the following subjects are notification emails that can be ignored:

  • [VOMS Admin] Expired members notice for VO <vo>
  • [VOMS Admin] Membership expiration notice for VO <vo>
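
To spot the emails that still need forwarding, the subject lines in root's mailbox can be filtered against the two patterns above. This is a sketch only: the function name is made up, and the mailbox path in the usage line is an assumption about the local setup.

```shell
# Hypothetical filter: reads "Subject:" lines on stdin and prints only those
# that are NOT one of the two VOMS-admin notification subjects listed above.
non_notification_subjects() {
    grep -Ev '\[VOMS Admin\] (Expired members notice|Membership expiration notice) for VO '
}

# Usage (mailbox path is an assumption):
#   grep '^Subject:' /var/spool/mail/root | non_notification_subjects
```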

Installing Updates

Host Certificate Updates

The server uses two host certificates: the local host certificate (local server name) and the public host certificate (voms.gridpp.ac.uk). All certificates are updated by Puppet. Puppet does not restart the services automatically, because VOMS-admin sends out emails on each restart; the restart has to be done manually (see Restarting Services). The following services use the local host certificate and have to be restarted when it is renewed:

  • MySQL server

The following services use the public host certificate and have to be restarted when it is renewed:

  • VOMS daemons (service certificate only)
  • VOMS-admin container (service certificate only)

History of host certificate updates

  • 05/06/2018
  • 23/05/2017
  • 10/05/2016
  • 01/05/2015
  • 10/04/2014 (heartbleed bug)
  • 11/03/2014
  • 04/03/2013
  • 14/02/2012
  • 08/02/2011
  • 01/02/2010

References

  1. Grid Security Traceability and Logging Policy

This page is a Key Document, and is the responsibility of Robert Frank. It was last reviewed on 2018-11-06 when it was considered to be 70% complete. It was last judged to be accurate on 2018-11-06.