MultiVO admin users


Overview

This guide covers the processes needed to effectively manage and run the multi-VO Rucio instance at RAL. For a general overview of multi-VO (without RAL-specific information), see here.

Configuration

Rucio version

M-VO Rucio was introduced in version 1.23.0; however, integration with some features did not arrive until later releases:

  • 1.23.2
    • Authentication & Authorisation: Add authentication options for multi-VO
    • Documentation: generic m-VO documentation
    • Testing: Change m-VO tests to use postgres
  • 1.23.4
    • Core & Internals: Script to convert DB between VOs
    • Deletion: VO handling for daemons
    • WebUI: Make m-VO compatible with WebUI
  • 1.23.5
    • Core & Internals: M-VO database conversion improvements
    • Messaging: VO missing from reaper messages

Nodes

The RAL instance is set up across various nodes which handle the different aspects of Rucio. In a multi-VO context, most of these nodes handle functionality that is "shared" between all VOs. In these cases the config file should have `multi_vo = True` set in the `common` section. If a node is associated with a specific VO, `vo = <vo>` should also be set in the `client` section. As with single-VO Rucio, the addresses for the database, server, etc. should be set in `etc/rucio.cfg` as usual.

  • Shared Rucio functionality
    • Database: ruciodb02.gridpp.rl.ac.uk
    • Rucio server/auth/WebUI: rucio-server.gridpp.rl.ac.uk
    • Rucio daemons: rucio-daemon01.nubes.stfc.ac.uk
    • Bastion (client login): rucio-bastion.gridpp.rl.ac.uk
  • Monitoring
    • ActiveMQ: rucio-activemq01.nubes.stfc.ac.uk
    • External ELK: rucio-mon.gridpp.rl.ac.uk
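
For illustration, a node providing shared functionality might carry a `rucio.cfg` along the following lines. This is a sketch only: the hostnames are the nodes listed above, but the protocol, ports and database credentials are placeholders, and the exact set of sections depends on the node's role.

[common]
multi_vo = True

[database]
default = postgresql://rucio:xxxxxxxx@ruciodb02.gridpp.rl.ac.uk/rucio

[client]
rucio_host = https://rucio-server.gridpp.rl.ac.uk:443
auth_host = https://rucio-server.gridpp.rl.ac.uk:443
# only on nodes tied to a specific VO:
vo = <vo>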

New VOs

Database conversion

If the prospective VO has data in an existing Rucio instance that they wish to keep, this data can be migrated. First, depending on the Rucio version they were running, the database may need to be upgraded so that both it and the RAL database have the same tables, columns etc. It should be sufficient to have the same main version (e.g. 1.23.X), as schema upgrades only occur in these releases. Then, the process described here should be followed. Note that as the `super_root` will already exist, the argument to create it should not be passed. Once the database is in a format compatible with m-VO, the data can then be inserted into the RAL database.
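
As a sketch, assuming the source instance uses Rucio's standard Alembic migrations (the location of `alembic.ini` is an assumption and depends on the deployment), the schema upgrade amounts to:

$ alembic -c /opt/rucio/etc/alembic.ini current
$ alembic -c /opt/rucio/etc/alembic.ini upgrade head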

Note that as the account/identity mappings are preserved with the database, no further setup should be required beyond creating a client and pointing it at the server/auth server.
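
A quick way to confirm that the migrated mappings work is to authenticate from the new client and ask Rucio who it thinks you are. A minimal sketch, assuming a 1.23-or-later client whose `rucio.cfg` already points at the RAL server/auth server (the VO can be set in the config or passed explicitly as below):

$ python
>>> from rucio.client import Client
>>> client = Client(vo='<vo>')   # rucio_host/auth_host are picked up from rucio.cfg
>>> client.whoami()              # should return the migrated account rather than an error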

Creating VO from scratch

If there is no pre-existing data, or the VO does not want to use it in the new instance, the VO will need to be added by running the following commands on the server node:

$ python
>>> from rucio.api.vo import add_vo
>>> # 'super_root' issues the call; the final 'def' is the VO that super_root belongs to
>>> add_vo('<new_vo>', 'super_root', '<description>', '<email>', 'def')

Note that this method will create a `root` account at the new VO with all the same identities (authentication options) as the `super_root`, so it's advisable to create a new identity for the new `root` that can be used by the VO admin (as any of the identities it starts with would also give access to `super_root`).
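
One way to do this is with the identity API on the server node, in the same style as above. A minimal sketch, assuming the `rucio.api.identity.add_account_identity` signature of the 1.23 release line and an X.509 DN supplied by the VO admin (the DN and email are placeholders):

$ python
>>> from rucio.api.identity import add_account_identity
>>> add_account_identity('<admin_dn>', 'X509', 'root', '<admin_email>',
...                      issuer='root', vo='<new_vo>')

Once the VO admin can authenticate as `root` with their own identity, the identities inherited from `super_root` can be removed from the new `root` account if desired.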

ELK monitoring

The Kibana dashboards used for monitoring the instance have VO-specific and instance-wide versions with the same panels. Monitoring data is kept separate for each VO through indexing and the use of Elastic users/roles.

Indexing

When running in m-VO, the messages sent by Hermes include the VO in the payload. The messages are indexed by Logstash, at which point the VO is included in the index name so that messages can be sorted by VO when creating dashboards in Kibana. The format for m-VO is `rucio-<vo>-YYYY.MM`. If a message does not have the VO field, it defaults to `def`.
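
For example, the Elasticsearch output of the Logstash pipeline can build the index name from the VO field along these lines. This is a sketch only: the exact field path for the VO, the hosts, and whether a defaulting filter is needed all depend on the pipeline configuration in use.

filter {
  # assumption: the VO arrives inside the Hermes payload; fall back to "def" if it is missing
  if ![payload][vo] {
    mutate { add_field => { "[payload][vo]" => "def" } }
  }
}

output {
  elasticsearch {
    hosts => ["rucio-mon.gridpp.rl.ac.uk:9200"]       # monitoring node; port assumed
    index => "rucio-%{[payload][vo]}-%{+YYYY.MM}"     # e.g. rucio-ska-2020.09
  }
}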

Historical data from the s-VO SKA instance was indexed as `rucio-YYYY.MM.DD`. Indexing by month was preferred for the new format as it gives better shard sizing. It's worth noting that `rucio-*` will match both the old and new formats, so it can be used to see all Rucio data on the same dashboard.

Creating dashboards

Once a new VO has joined, a dashboard will need to be created for them. First, they need to start generating messages with Hermes so that these can be indexed. The web interface for Kibana allows you to create new index patterns, but only if the pattern matches results. The pattern should have the format `rucio-<vo>-*` to match the indices for all dates, and have `@timestamp` as its time field. When creating the pattern, Kibana assigns it an ID. It's also possible to specify a custom one, which is recommended, as a consistent naming scheme makes modifying the dashboards a lot easier. This ID can be seen in the URL when viewing the pattern; for example, the default `rucio-*` index pattern has ID `5daaae20-ee72-11e9-99dc-b5ce30c341b6`. However, it's easier to create index patterns as part of the dashboard import/export process (below), as this will preserve custom fields such as the success ratio.
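
If you prefer to script this step, the same thing can be done through Kibana's saved objects API. A minimal sketch in Python, assuming Kibana 7.x with security enabled; the port, credentials, chosen ID and the `rucio-ska-*` example pattern are all placeholders:

import requests

KIBANA = "https://rucio-mon.gridpp.rl.ac.uk:5601"   # external ELK node listed above; port assumed
AUTH = ("elastic", "xxxxxxxx")                      # or any user allowed to manage saved objects

# Create the index pattern with a predictable ID, which makes the later
# find/replace over dashboard exports much simpler.
resp = requests.post(
    f"{KIBANA}/api/saved_objects/index-pattern/rucio-ska-index-pattern",
    headers={"kbn-xsrf": "true"},
    json={"attributes": {"title": "rucio-ska-*", "timeFieldName": "@timestamp"}},
    auth=AUTH,
)
resp.raise_for_status()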

In order to make a copy of an existing dashboard for use by a new VO, first export the dashboard as an NDJSON file. This should have the index pattern as the first line, each of the visualisations on its own line, and the dashboard itself last. All IDs and titles that refer to the existing index/VO need to be replaced to reference the new VO. For randomly generated IDs this can be extremely tedious, so it's recommended to use a dashboard such as the [[TEST] Event Dashboard](https://rucio-mon.gridpp.rl.ac.uk:5601/app/management/kibana/objects/savedDashboards/event_dashboard_TEST), which has been modified so that every ID and title can be renamed by find/replacing `TEST` with the name of the new VO (e.g. `SKA`) and the index pattern `test-*` with the new pattern (e.g. `rucio-ska-*`).
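
The find/replace can be done in any editor, or scripted. A short sketch (the file names are placeholders; `TEST` and `test-*` are the placeholders used by the template dashboard above):

from pathlib import Path

new_vo = "SKA"               # name of the new VO as it should appear in IDs/titles
new_index = "rucio-ska-*"    # index pattern created for the new VO

text = Path("event_dashboard_TEST.ndjson").read_text()
text = text.replace("test-*", new_index)   # replace the placeholder index pattern...
text = text.replace("TEST", new_vo)        # ...and every placeholder ID/title
Path(f"event_dashboard_{new_vo}.ndjson").write_text(text)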

Finally, import the modified NDJSON. This can be done through the saved objects tab. When importing there should be no overwrites, as all the IDs for every object should be new; if you are prompted to overwrite, it's likely that one of the IDs still refers to the old dashboard. Accidental overwrites can be remedied by keeping up-to-date exported copies of the dashboards and reimporting them in the event of loss or overwrite.
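
The import can also be scripted against the saved objects API, which reports conflicts rather than silently overwriting when `overwrite` is not requested. A sketch under the same assumptions as the earlier Kibana example:

import requests

KIBANA = "https://rucio-mon.gridpp.rl.ac.uk:5601"   # placeholders, as before
AUTH = ("elastic", "xxxxxxxx")

with open("event_dashboard_SKA.ndjson", "rb") as f:
    resp = requests.post(
        f"{KIBANA}/api/saved_objects/_import",
        headers={"kbn-xsrf": "true"},
        files={"file": ("event_dashboard_SKA.ndjson", f, "application/ndjson")},
        auth=AUTH,
    )
resp.raise_for_status()
print(resp.json())   # success count, plus any conflicts that would have been overwrites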

Security

Security for the entire ELK stack can be enabled/disabled with the `xpack.security.enabled` flag in the `elasticsearch.yml` config file. When disabled, anyone can access and modify the Kibana dashboards. Enabling the security option (and restarting Elasticsearch) means a username/password is needed to access Kibana. The predefined `elastic` user has full permissions to access and modify the dashboards, as well as the security settings for roles and users.
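
In `elasticsearch.yml` this is a single setting; a sketch (any other security-related options in the file are left untouched):

# elasticsearch.yml -- restart Elasticsearch after changing
xpack.security.enabled: true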

When adding a new VO, after creating the index and dashboard, the next step is to create role(s) for that VO. While logged in as `elastic`, navigate to Stack Management in the Kibana web interface, then Roles. Each role has a set of defined permissions, and there are several pre-defined "reserved" roles that exist by default. Limiting access to a single VO requires the creation of a new role. After naming the role, it should have index privileges set to allow read access for the pattern associated with its VO (or, if you wish to allow monitoring of all VOs, a more general pattern like `rucio-*`). You will also need to add a Kibana space privilege allowing read access. The result is a role that can view, but not edit, monitoring data, and only where that data matches its index pattern(s). Note that its users will still be able to open other VOs' dashboards, but none of the events will display in the panels.
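
The same role can instead be created through Kibana's role management API. A sketch in Python, assuming Kibana 7.x, the default space, and a hypothetical role name `ska_monitoring` (host, port and credentials are placeholders, as in the earlier sketches):

import requests

KIBANA = "https://rucio-mon.gridpp.rl.ac.uk:5601"
AUTH = ("elastic", "xxxxxxxx")

role = {
    "elasticsearch": {
        # read-only access to this VO's indices (use "rucio-*" to cover every VO)
        "indices": [{"names": ["rucio-ska-*"], "privileges": ["read"]}]
    },
    # read-only Kibana access in the default space, so dashboards can be viewed but not edited
    "kibana": [{"base": ["read"], "spaces": ["default"]}],
}

resp = requests.put(
    f"{KIBANA}/api/security/role/ska_monitoring",
    headers={"kbn-xsrf": "true"},
    json=role,
    auth=AUTH,
)
resp.raise_for_status()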

Next, navigate to Users; once again this displays the "reserved" defaults. Here the passwords for existing users can be changed and new users created. Creating a user requires a name, email address and password, and it's at this point that the user's role(s) are assigned. Multiple users can share the same role (so everyone at a VO can have the same permissions) and one user can have multiple roles (so someone associated with several VOs can see the data for each of them).
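
Users can likewise be created via the Elasticsearch security API rather than the web interface. A sketch, assuming Elasticsearch is reachable on the monitoring node (the username `ska_monitor`, password, role and email below are all placeholders):

import requests

ELASTIC = "https://rucio-mon.gridpp.rl.ac.uk:9200"   # Elasticsearch endpoint; port assumed
AUTH = ("elastic", "xxxxxxxx")

user = {
    "password": "xxxxxxxx",
    "roles": ["ska_monitoring"],        # role(s) created in the previous step
    "email": "ska-admin@example.org",   # placeholder contact address
    "full_name": "SKA monitoring user",
}

resp = requests.post(f"{ELASTIC}/_security/user/ska_monitor", json=user, auth=AUTH)
resp.raise_for_status()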

Further details on how to set up security (including options other than basic username/password) can be found in the Elastic documentation: