OldEMITarball

Overview

Note (23 July 2015): This page is now obsolete. Please see https://www.gridpp.ac.uk/wiki/EMITarball instead.


The production, maintenance and testing of the relocatable EMI2 and EMI3 worker node and UI middleware (the tarball) has been taken over by members of the UK NGI (GridPP), with help from David Smith at CERN. Work started fairly late in 2012, and the first "production" release was made available on 14 January 2013. Work continues on EMI3 versions, with a release hoped for by mid-January 2014.

Status

The tarball is produced and tested at the Lancaster Tier-2 (UKI-NORTHGRID-LANCS-HEP), and has been deployed successfully at other sites. The setup is slightly more involved than we'd like it to be (there is no YAIM support, although residual yaim functions remain); we hope to improve this.

The Lancaster test cluster runs using torque, interfacing with a DPM SE, so other batch/storage combinations are not as well tested.

Glexec support is also on the roadmap, and the tarball is available in cvmfs (although the version in cvmfs lags the main version by a couple of days).

The EMI2 SL5 version of the tarball has passed staged rollout and is officially "production ready".

Format

Similar to the gLite tarball, the EMI2 tarball is split into two parts: the core EMI2 components drawn from epel and the EMI2 repos, and the additional dependencies for these components drawn from the OS repos. These are kept separate so that sites that can install the OS components on their machines (rather than have the tarball provide them) can do so.

In essence the tarballs are unpacked rpms with a few additional steps taken to ensure that they work in a relocatable format.
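As an illustration of the idea (not the actual build script - the rpm name and target directory here are placeholders), a single package can be unpacked under a relocatable prefix like this:

    # Unpack an rpm's payload under a prefix instead of installing it to /
    mkdir -p /opt/mytarball
    cd /opt/mytarball
    rpm2cpio /path/to/some-emi-package.rpm | cpio -idm
    # The resulting /opt/mytarball/usr/bin, /opt/mytarball/usr/lib64 etc. are then
    # added to PATH and LD_LIBRARY_PATH by the site environment script.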

OS tarball production

The "os" tarball is produced using the dependency list generated on an SL5 machine installed using a minimal install, with the only other package installed being yum-utils. Lighter workernode installations (or other distros) might need additional packages installed, or a custom tarball built (once the tools are available, more on this later), but we hope to have mader this tarball as universal as possible. In addition to these rpms a small handful have been added by hand, to maintain coherency with the gLite tarball. Additional packages to be included or distros to be supported will depend in part on user feedback.

Tarball toolkit

Eventually we aim to release a DIY toolkit, allowing users to produce their own tarball. The general plan will be for users to run the build script on an example workernode, ensuring that nothing is missed. This would also be a possible solution for sites running an odd distro. However our current efforts are focused on releasing production ready tarballs.

Tarball download and setup

Downloads

The official place to get either the production or test releases of the UI and WN tarballs is the EGI repository: http://repository.egi.eu/mirrors/EMI/tarball/

(currently only the SL5 version of the EMI2 emi-wn tarball has passed staged rollout, so the production branch is a little empty)

There is also a scripts directory containing an example environment setup script (current version is v7).

Development versions of the tarball will be released on the UK HEP sysadmin site: https://www.sysadmin.hep.ac.uk/svn/smwgtest/EMI2tarball/


SL5

The current production release of the wn tarball (2.6.0-1_v1) is available here:

http://repository.egi.eu/mirrors/EMI/tarball/production/sl5/emi2-emi-wn/


SL6

SL6 versions of the tarball have yet to go through a staged rollout process, although they have been running at a number of sites without too many problems. The recommended version is 2.6.0-1_v1; although a later version is available (2.4.10-1_v1), it contains a possibly problematic openssl version that could have problems with 512-bit binaries. A fresh tarball will be made first thing in the New Year.

SL6 notes:

  • JAVA_HOME within the SL6 release is $base/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre, which differs from the SL5 version.
  • As there is no version of python within these tarballs (python2.6 is native to SL6), the site-python directory has been removed; PYTHONPATH just needs to include $base/usr/lib64/python2.6/site-packages.

CVMFS

The contents of the EMI2 wn tarball can be found in cvmfs (thanks to Jakob Blomer), within the grid.cern.ch repository. We've tested the SL5 version at Lancaster and found no problems with it. Some different versions of the tarball, for SL5 and SL6, can be found in the repository.

We hope to make these cvmfs repositories as stand-alone as possible, but currently they require a setup script and X509_CERTIFICATE_DIR.

Setting up grid.cern.ch is fairly simple: just add grid to the list of CVMFS_REPOSITORIES in /etc/cvmfs/default.local and restart cvmfs. You will need to edit a local version of the setup scripts and point them at the cvmfs emi wn base.
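For example (the repository list shown is illustrative - keep whatever repositories your site already mounts):

    # /etc/cvmfs/default.local - add grid.cern.ch to the existing list
    CVMFS_REPOSITORIES=atlas.cern.ch,grid.cern.ch

    # then restart cvmfs and check the repository mounts
    cvmfs_config probe grid.cern.ch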

Setup

This assumes that the workernode has been set up to work within the batch system, and that the users and groups have been set up. It would technically be possible to use the config_users function of yaim that is currently still within the tarball, but this hasn't been, and won't be, tested.

Download and unpack the tarball to a shared area, preferably read-only to the clients (at Lancaster we admin our tarball areas through our CEs). Create/edit a user environment script covering the following points (a sketch follows the list):

  • EMI_TARBALL_BASE (new) is a flag that should be set for tarball nodes - it acts as a pointer to the tarball area, allowing users to identify the node as using the tarball and showing them the way to the tarball paths. This is not used yet, but will likely be utilised in the near future by nagios probes and atlas install jobs.
  • LD_LIBRARY_PATH, PATH and (ironically) GLITE_LOCATION, as well as other variables (listed in the example), need to point at the corresponding tarball areas, as well as local areas for the path-like variables. Tarball paths should precede locally installed ones.
  • Other site specific environment variables (e.g. VO software areas, default SEs, SITE_NAME, BDII_LIST) need to be set up. All this used to be done by yaim.
  • X509_CERTIFICATE_DIR & X509_VOMS_DIR will need to point at directories that contain the certificates (extracted from the CA rpms) and the contents of the vomsdir respectively. The certificates could be unpacked using a similar method to the tarball creation; the vomsdir could be produced using a yaim-like method, or both could be shared from another node. Sharing from another node (for example the CE or ARGUS server) is possibly the easiest method, and benefits from not needing any other steps to be taken (e.g. fetch-crl) and being easy to update, but there may be other considerations that need to be taken into account.
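Below is a minimal sketch of such a script, assuming the tarball was unpacked in /opt/emi2-wn; every path, hostname and site name in it is a placeholder to be replaced with your own values (and it is not a complete list of what your VOs may need).

    #!/bin/sh
    # Sketch of a tarball user-environment script covering the points above.
    base=/opt/emi2-wn                                             # where the tarball was unpacked
    export EMI_TARBALL_BASE=$base
    export GLITE_LOCATION=$base/usr
    export PATH=$base/bin:$base/usr/bin:$base/usr/sbin:$PATH
    export LD_LIBRARY_PATH=$base/usr/lib64:$base/usr/lib:$LD_LIBRARY_PATH
    export X509_CERTIFICATE_DIR=/etc/grid-security/certificates   # shared from the CE, for example
    export X509_VOMS_DIR=/etc/grid-security/vomsdir
    export SITE_NAME=MY-SITE-NAME                                 # placeholder
    export BDII_LIST=bdii.example.ac.uk:2170                      # placeholder
    export DPM_HOST=se.example.ac.uk                              # placeholder default SE
    export VO_DTEAM_SW_DIR=/exports/grid/sw/dteam                 # placeholder VO software area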

PYTHONPATH

The PYTHONPATH has been the source of so much trouble that it gets its own section! For SL5 the PYTHONPATH needs to point at the site-python directory within the tarball (with no trailing slash, i.e. export PYTHONPATH=/opt/mytarball/site-python:$PYTHONPATH). The site-python directory contains a custom site.py (written by David Smith) that will set up the system paths depending on the architecture and version of the python being used - necessary for python2.4 on the OS and python2.6 in the tarball to coexist. The site-python directory should be the first entry in the PYTHONPATH, and to work properly it must contain the first site.py found on the PYTHONPATH.

This, however, is not a perfect solution: a few of the major VOs do extensive modification of the PYTHONPATH that includes their own site.py, and python only expands the first site.py file that it comes across in the PYTHONPATH (ATLAS), or their bespoke python doesn't get on with the tarball's site.py and causes odd behaviour (CMS production - this might be fixed in the latest version of the site.py).

The solution in the first case is to explicitly expand the PYTHONPATH. Some examples:

    PYTHONPATH=$base/site-python:$base/usr/lib64/python2.6/site-packages:$PYTHONPATH (thanks to Rod for this one)
    PYTHONPATH=$base/site-python:$base/usr/lib64/python2.6:$base/usr/lib64/python2.6/site-packages:$PYTHONPATH

or, for one case at Victoria in Canada:

    PYTHONPATH=$base/site-python:$GLITE_LOCATION/lib64/python2.4.site-packages:$PYTHONPATH

(https://ggus.eu/ws/ticket_info.php?ticket=91230 - thanks to Ryan and Asoka for helping me through that one).

In the latter case, where the site.py is incompatible with whatever the user is doing, it may be easier to simply remove the PYTHONPATH from that user's environment.

Of course, as SL6 has python 2.6 on the node and no other versions of python are used, the picture there is much simpler.

Final stages

Once you've created the environment scripts, distribute them to your workers and ensure that they've mounted the tarball area; if this is replacing an older gLite tarball setup, make sure that all trace of the old environment is purged first. It's then advisable to test the setup, checking that users see the correct variables. A helpful test is running "which" on a few commands, and trying to "import lfc" within python and python2.6.
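A few example checks (the command names here are just common ones users rely on; substitute whatever your VOs actually use):

    which lcg-cp voms-proxy-init                  # should resolve to paths under the tarball area
    echo $EMI_TARBALL_BASE $GLITE_LOCATION
    python -c "import lfc; print lfc.__file__"    # OS python (2.4 on SL5)
    python2.6 -c "import lfc; print lfc.__file__" # tarball python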

Some problems found along the way

  • While there are no world-writable files/directories in the tarball, there were a few nonsensical ownerships, fixed in the latest release. Thanks to John Green for spotting this.
  • The LD_LIBRARY_PATH has to explicitly list $TARBALL/usr/lib64/dcap within it, otherwise dccp and other dcap use will fail. Thanks to Daniela for spotting this. Note that this probably only affects those who wish to use gsi-authenticated dcap.
  • To get ops nagios wn-version checks to pass you'll need to link /etc/emi-version to the corresponding file in the tarball (see the sketch after this list). This is a far from ideal situation; we will contact the nagios test writers to see if we can come up with a better solution. I have requested that the tests are updated to take into account the emi2 tarball: https://ggus.eu/ws/ticket_info.php?ticket=90768 The atlas local area setup also seems to use this mechanism (see the next point).
  • ATLAS software setup also seems to check for the existence of /etc/emi-version in $ATLAS_LOCAL_AREA/setup.sh in order to choose which libraries to load. Atlas have been contacted about this.
  • There were some broken links within the tarball, (one in lib64/, several more in usr/lib64), these have been manually cleaned up.
  • In order to pass CA checks Daniela had to install perl-XML-SAX.noarch and perl-XML-Parser.x86_64 on her nodes. These were included in the latest release.
  • The example environment script didn't have an entry for the BDII_LIST variable, the scripts (from v3) do. The latest v4 script includes the 32-bit as well as 64 bit paths for the PERL5LIB variable.
  • Testing of the srm commands showed that they were stuck in an extremely verbose mode. Thanks to Ryan for bringing this to our attention. The issue was tracked in https://ggus.eu/tech/ticket_show.php?ticket=91007 and has been fixed in the 2.6.0-1_v1 tarball.
  • CMS production jobs use their own version of python but don't sanitise their environment first, and they don't get on with the tarball site.py. Tarball sites that support CMS are advised to add a line to their cmsset_local.(c)sh in /SITECONF/[sitename]/JobConfig/ that unsets the PYTHONPATH (also shown in the sketch below). Ref: https://savannah.cern.ch/support/?135148 Thanks again to Daniela for spotting this and coming up with a solution.
  • The $base/etc/vomses directory doesn't exist in the tarball yet; it will be put into a later release (but not populated). Thanks again to Ryan for this one!
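A sketch of two of the fixes above (the tarball location is a placeholder):

    # Satisfy the nagios wn-version check by linking /etc/emi-version to the
    # copy shipped inside the tarball
    ln -s /opt/emi2-wn/etc/emi-version /etc/emi-version

    # CMS workaround: add this line to cmsset_local.(c)sh under
    # /SITECONF/[sitename]/JobConfig/ so production jobs drop the tarball PYTHONPATH
    unset PYTHONPATH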

UI Tarball

We now have an "official" beta EMI2 UI tarball for SL5 and SL6:

http://repository.egi.eu/mirrors/EMI/tarball/test/sl5/emi2-emi-ui/
http://repository.egi.eu/mirrors/EMI/tarball/test/sl6/emi2-emi-ui/

It's still largely untested, but contains several features requested by users (such as gsissh). We'll be interested in hearing how it goes if you do try this out. There is still no vomses setup within the tarball, so users will have to cobble that together themselves.


UI Post Install Setup

The "v3" versions of the tarball UI were found to need some post-unpacking tweaking to get to work (thanks again to Andreas and Asoka for finding this out the hard way)- these have been implimented in the "v4" UI tarball with the exception that the SL5 tarball still needs to have the pathList edit in wmsui_checks.py):</br> From Andreas:

I got your UI-tarball for SL6 running; even using yaim.

The following changes were needed (in each snippet the commented-out "#AG" line is the original and the line below it is the replacement):

YAIM:


    vi $INSTALL_ROOT/opt/glite/yaim/defaults/emi-ui_tar.post
    ...
    #AG FUNCTIONS_DIR="${INSTALL_ROOT}/glite/yaim/functions"
    FUNCTIONS_DIR="${INSTALL_ROOT}/opt/glite/yaim/functions"

    vi $INSTALL_ROOT/opt/glite/yaim/bin/yaim
    ...
    #AG for i in ${YAIM_ROOT}/glite/yaim/etc/versions/*; do
    for i in ${YAIM_ROOT}/opt/glite/yaim/etc/versions/*; do

glite-wms-job-status:


    vi $INSTALL_ROOT/usr/lib64/python2.6/site-packages/wmsui_checks.py
    ...
    #AG pathList = ['/','/usr/local/etc' , ]
    installroot = os.environ['EMI_UI_CONF']
    pathList = ['/','/usr/local/etc', installroot]
    ...

(In SL5 this file lives under python2.4/site-packages.)


$INSTALL_ROOT/usr/libexec/grid-env.sh + ENV:


    export GLOBUS_LOCATION=${INSTALL_ROOT}/usr
    export GLITE_LOCATION=${INSTALL_ROOT}/usr
    export SRM_PATH=${INSTALL_ROOT}/usr/share/srm
    export PATH=${INSTALL_ROOT}/usr/sbin:$PATH
    export PERL5LIB=${INSTALL_ROOT}/usr/share/perl5
    export PYTHONPATH=${PYTHONPATH}:${INSTALL_ROOT}/usr/lib64/python2.6/site-packages:${INSTALL_ROOT}/usr/lib/python2.6/site-packages

From Asoka:

Installed it and tested only the grid-cert-info command and it failed because they have hard-coded the paths in the script.

To get it to work, I had to define $GLOBUS_LOCATION and then fix the datadir line in 3 files (grid-cert-info, globus-script-initializer, globus-sh-tools.sh) as such:

    > diff grid-cert-info grid-cert-info.old
    47c47
    < datadir="${prefix}/share"
    ---
    > datadir="/usr/share"

(Changing the top level will not work since it gets reset ...)
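A scripted version of Asoka's change might look like this (a sketch: the three file names come from the text above, and their locations under the tarball are searched for rather than assumed):

    # Point GLOBUS_LOCATION at the UI tarball and fix the hard-coded datadir
    export GLOBUS_LOCATION=${INSTALL_ROOT}/usr
    for name in grid-cert-info globus-script-initializer globus-sh-tools.sh; do
        find ${INSTALL_ROOT}/usr -name "$name" 2>/dev/null | while read f; do
            sed -i 's|datadir="/usr/share"|datadir="${prefix}/share"|' "$f"
        done
    done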

EMI 3

We've started to create EMI3 versions of the tarballs; the SL5 worker node tarball has been built and has undergone testing at Lancaster. Test EMI3 tarballs for the UI and WN will be produced in early January 2014.

Roadmap

To come

Contact

If you have any questions/comments please e-mail me at: m<dot>doidge@SPAMMENOT<dot>lancaster<dot>ac<dot>uk

I'd also appreciate word from any sites trying to install the tarball, their successes and failures, and the setup that they tried the tarball on.

There is a current GGUS ticket concerning the production of the EMI tarballs here: https://ggus.eu/ws/ticket_info.php?ticket=81496

Any support requests, bug reports or other tickets can be directed to the new "UI WN Tarball" support unit in GGUS.

Acknowledgements

Thanks again to David Smith for giving us the tools to get started, Jakob Blomer for helping us get stuff into cvmfs, John Green for casting his security-minded eye over the tarball, and Daniela and Jean-Michel for being the first admins to try and run the tarball (and giving great feedback).

EMI3 Tarball

Overview

Following on from the work on the EMI2 tarball, the EMI3 version of the SL6 WN tarball is ready for testing after passing the first round of tests at Lancaster (where it appears to work for a number of different VOs). There is also an SL5 version, but due to a lack of SL5 machines at Lancaster this has only undergone rudimentary tests.

Sites are encouraged to test the new tarball and provide feedback, either to the tarball support list (tarball-grid-supportATSPAMNOTcern.ch) or by submitting a GGUS ticket to the WN UI Tarball support group.


Instructions

As with the EMI2 tarball (and gLite before that), the EMI3 tarball comes in two parts: the "base" tarball, constructed from RPMs from the EMI repo, and the "os-extras" part, built from the dependencies of those packages obtained from the SL and epel repos. These tarballs are built on a barebones but up-to-date SL install. It should be possible (but is untested) to run with just the base tarball, installing the dependencies listed in the os-extras.txt on the nodes from the SL and epel repositories.

Downloading

The SL6 WN Tarball can be downloaded from here: http://repository.egi.eu/mirrors/EMI/tarball/test/sl6/emi3-emi-wn/emi-wn-3.7.3-1_v2.sl6.tgz

With the corresponding "os-extras" tarball here: http://repository.egi.eu/mirrors/EMI/tarball/test/sl6/emi3-emi-wn/emi-wn-3.7.3-1_v2.sl6.os-extras.tgz

Note that earlier versions of the emi3 tarball (emi-wn-3.7.1-1_v2) have a version of the 32-bit openssl package vulnerable to the Heartbleed exploit (CVE-2014-0160) in the os-extras package. openssl.i686 has been excluded from the tarball as of emi-wn-3.7.3-1_v1. Sites not using the "os-extras" tarball should be safe; for others it's recommended that you upgrade to emi-wn-3.7.3-1_v1 as soon as possible.

For SL5 these files are:

http://repository.egi.eu/mirrors/EMI/tarball/test/sl5/emi3-emi-wn/emi-wn-3.7.2-1_v1.sl5.tgz
http://repository.egi.eu/mirrors/EMI/tarball/test/sl5/emi3-emi-wn/emi-wn-3.7.2-1_v1.sl5.os-extras.tgz

Within those directories there are corresponding md5sum files and text files containing rpm lists.
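As a worked example (the install location is a placeholder; the md5sum files referred to are those published alongside the tarballs):

    cd /tmp
    wget http://repository.egi.eu/mirrors/EMI/tarball/test/sl6/emi3-emi-wn/emi-wn-3.7.3-1_v2.sl6.tgz
    wget http://repository.egi.eu/mirrors/EMI/tarball/test/sl6/emi3-emi-wn/emi-wn-3.7.3-1_v2.sl6.os-extras.tgz
    # Compare against the md5sum files in the same repository directory
    md5sum emi-wn-3.7.3-1_v2.sl6.tgz emi-wn-3.7.3-1_v2.sl6.os-extras.tgz
    # Unpack both parts into a shared area visible to the worker nodes
    mkdir -p /opt/emi3-wn
    tar -xzf emi-wn-3.7.3-1_v2.sl6.tgz -C /opt/emi3-wn
    tar -xzf emi-wn-3.7.3-1_v2.sl6.os-extras.tgz -C /opt/emi3-wn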

Setup

(Out of laziness this section is largely cut and pasted from the EMI2Tarball page - my apologies).

This assumes that the workernode has been set up to work within the batch system, and that the users and groups have been set up. It would technically be possible to use the config_users function of yaim that is currently still within the tarball, but this hasn't been, and won't be, tested.

Download and unpack the tarball to a shared area, preferably read-only to the clients (at Lancaster we admin our tarball areas through our CEs). Create/edit a user environment script covering the following points (the script sketched in the EMI2 section above applies here too):

  • EMI_TARBALL_BASE is a flag that should be set for tarball nodes - it acts as a pointer to the tarball area, allowing users to identify the node as using the tarball and showing them the way to the tarball paths. This is not used yet, but will likely be utilised in the near future by nagios probes and atlas install jobs.
  • LD_LIBRARY_PATH, PATH and (ironically) GLITE_LOCATION, as well as other variables (listed in the example), need to point at the corresponding tarball areas, as well as local areas for the path-like variables. Tarball paths should precede locally installed ones.
  • Other site specific environment variables (e.g. VO software areas, default SEs, SITE_NAME, BDII_LIST) need to be set up. All this used to be done by yaim.
  • X509_CERTIFICATE_DIR & X509_VOMS_DIR will need to point at directories that contain the certificates (extracted from the CA rpms) and the contents of the vomsdir respectively. The certificates could be unpacked using a similar method to the tarball creation; the vomsdir could be produced using a yaim-like method, or both could be shared from another node. Sharing from another node (for example the CE or ARGUS server) is possibly the easiest method, and benefits from not needing any other steps to be taken (e.g. fetch-crl) and being easy to update, but there may be other considerations that need to be taken into account.

PYTHONPATH

(Only really an issue for anyone still running SL5.) The PYTHONPATH has been the source of so much trouble that it gets its own section! For SL5 the PYTHONPATH needs to point at the site-python directory within the tarball (with no trailing slash, i.e. export PYTHONPATH=/opt/mytarball/site-python:$PYTHONPATH). The site-python directory contains a custom site.py (written by David Smith) that will set up the system paths depending on the architecture and version of the python being used - necessary for python2.4 on the OS and python2.6 in the tarball to coexist. The site-python directory should be the first entry in the PYTHONPATH, and to work properly it must contain the first site.py found on the PYTHONPATH.

This, however, is not a perfect solution: a few of the major VOs do extensive modification of the PYTHONPATH that includes their own site.py, and python only expands the first site.py file that it comes across in the PYTHONPATH (ATLAS), or their bespoke python doesn't get on with the tarball's site.py and causes odd behaviour (CMS production - this might be fixed in the latest version of the site.py).

The solution in the first case is to explicitly expand the PYTHONPATH. Some examples:

    PYTHONPATH=$base/site-python:$base/usr/lib64/python2.6/site-packages:$PYTHONPATH (thanks to Rod for this one)
    PYTHONPATH=$base/site-python:$base/usr/lib64/python2.6:$base/usr/lib64/python2.6/site-packages:$PYTHONPATH

or, for one case at Victoria in Canada:

    PYTHONPATH=$base/site-python:$GLITE_LOCATION/lib64/python2.4.site-packages:$PYTHONPATH

(https://ggus.eu/ws/ticket_info.php?ticket=91230 - thanks to Ryan and Asoka for helping me through that one).

In the latter case, where the site.py is incompatible with whatever the user is doing, it may be easier to simply remove the PYTHONPATH from that user's environment.

Of course, as SL6 has python 2.6 on the node and no other versions of python are used, the picture is much simpler.

Things to Watch Out for

  • It is advised that nodes running the tarball are kept up to date, but it is very important that the openssl rpms be updated to ensure everything works as intended. It is also important to have both 32- and 64-bit openssl installed on your system (a generic example follows this list).
  • The voms tools have been rewritten in Java and required some comprehensive modification to work within a tarball. This included creating some symlinks within the tarball that might cause problems if sites try to run the base tarball without the os-extras.
  • Additionally, the voms tools only seem to work with the "openjdk" java packages.
  • The arc and xrootd client tools have been added to the WN tarball to provide extra functionality.
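As a generic example of the first point (not a prescribed procedure):

    # Make sure both the 64-bit and 32-bit openssl packages are present and current
    yum -y install openssl.x86_64 openssl.i686
    yum -y update openssl.x86_64 openssl.i686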

UI

Preliminary releases of the SL6 and SL5 UI tarball have been made available. The SL5 UI tarball has the same epel-testing versions of the gfal2 packages as the latest SL5 WN tarball.

SL6 UI:

http://repository.egi.eu/mirrors/EMI/tarball/test/sl6/emi3-emi-ui/emi-ui-3.7.3-1_v2.sl6.tgz
http://repository.egi.eu/mirrors/EMI/tarball/test/sl6/emi3-emi-ui/emi-ui-3.7.3-1_v2.sl6.os-extras.tgz

As with the SL6 EMI WN, the previous version of the UI tarball has a vulnerable version of openssl.i686 in the os-extras tarball. This has been removed in emi-ui-3.7.3-1_v1; sites using the previous os-extras tarball are advised to upgrade as soon as possible.

SL5 UI:

http://repository.egi.eu/mirrors/EMI/tarball/test/sl5/emi3-emi-ui/emi-ui-3.7.2-1_v1.sl5.tgz
http://repository.egi.eu/mirrors/EMI/tarball/test/sl5/emi3-emi-ui/emi-ui-3.7.2-1_v1.sl5.os-extras.tgz


UI Notes

In order to use the gfal2 features within the SL6 UI tarball you need to set two new environment variables:

    GFAL_CONFIG_DIR=$base/etc/gfal2.d
    GFAL_PLUGIN_DIR=$base/usr/lib64/gfal2-plugins/

Additionally, in order to tell the arcproxy tools where the vomses are kept, you need to set X509_VOMSES to point to the right place (see the combined example below).
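Putting those UI notes together, the relevant environment fragment might look like this; the vomses location under $base is an assumption, so point X509_VOMSES at wherever your vomses files actually live:

    # gfal2 configuration and plugin directories, as given above
    export GFAL_CONFIG_DIR=$base/etc/gfal2.d
    export GFAL_PLUGIN_DIR=$base/usr/lib64/gfal2-plugins/
    # Tell arcproxy where the vomses are kept (path is an assumption)
    export X509_VOMSES=$base/etc/vomses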

The EMI3 Tarball in CVMFS

EMI3 WN and UI tarballs have been made available in cvmfs for user testing, in the grid.cern.ch repository (add grid to your CVMFS_REPOSITORIES variable).

/cvmfs/grid.cern.ch/emi-ui-3.7.3-1_sl6v2

/cvmfs/grid.cern.ch/emi-wn-3.7.3-1_sl6v2

Example setup scripts for the environment can be found in etc/profile.d/setup-emi3-[wn|ui]-example.sh within each area; vomsdir and vomses directories have also been configured within these areas.
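On a node that already mounts grid.cern.ch, picking up the WN tarball can be as simple as the following (the script name is taken from the path above; copy and adapt the example script if you need site-specific values):

    export EMI_TARBALL_BASE=/cvmfs/grid.cern.ch/emi-wn-3.7.3-1_sl6v2
    source $EMI_TARBALL_BASE/etc/profile.d/setup-emi3-wn-example.sh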


Acknowledgements

Thanks to David Smith for providing the tools that are still used to create the base tarballs, and special thanks to Fabio Martinelli for his preliminary testing efforts that revealed the problems making VOMS relocatable. Thanks also to Asoka for his effort testing the UI tarball, and to Jakob for unpacking the tarballs into cvmfs. Also thanks to everyone for being patient!

Matt Doidge, 5th March 2014. Updated 10th April 2014.