Difference between revisions of "New Information System"

From GridPP Wiki

Revision as of 14:45, 21 February 2019

Introduction

This document describes a static, JSON-based information system that is intended to replace parts of the BDII. See WLCG Information System Evolution (https://twiki.cern.ch/twiki/bin/view/EGEE/WLCGISEvolution).


What's wrong with the current information system

The current BDII system, in use all over the grid, has a number of problems. The schema is bloated and the information content is huge. It is based on server technology that is very flaky. The client software is hugely overcomplicated and poorly written. There is ambiguity in the field semantics. For most users it is hard to query by hand. It is hard for site admins to configure in the first place and to keep current on an ongoing basis. The data content, when checked, is often unreliable, and it is hard to verify. BDII clients regularly lock up or fall over. It is the source of many errors, and it is difficult to extend for new requirements. In short, the BDII has had its day and it’s time to try something else.

Principle of how this proposal works

Site admins will be familiar with GOCDB, since each site must already have an entry describing it. A site entry in GOCDB supports Extension Properties; an extension property is a name/value pair that can be used for arbitrary purposes. In this scheme, a site would enter an extension property named "InformationSystem", whose value is a URL that links to a JSON document describing the site in question. For example, the Liverpool site (UKI-NORTHGRID-LIV-HEP) has this entry:

Extension property

Extension Property Name    Value
InformationSystem          http://hepgrid4.ph.liv.ac.uk/liv.json


A list of all sites’ JSON URLs can be obtained by querying GOCDB:

https://goc.egi.eu/gocdbpi/public/?method=get_service_endpoint&extensions=(InformationSystem=)
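As a rough illustration of how a consumer might use this query, the sketch below builds the same URL and pulls the InformationSystem values out of the XML that comes back. The element names used here (EXTENSION, KEY, VALUE) and the sample fragment are assumptions made for illustration; check them against the GOCDB PI documentation before relying on them. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

GOCDB_PI = "https://goc.egi.eu/gocdbpi/public/"


def build_query_url(key="InformationSystem"):
    """Return the GOCDB PI query URL quoted in the text above."""
    return f"{GOCDB_PI}?method=get_service_endpoint&extensions=({key}=)"


def extract_json_urls(xml_text, key="InformationSystem"):
    """Collect the values of every extension property named `key`.

    Assumption: each extension property appears as an EXTENSION element
    with KEY and VALUE children, wherever it sits in the response.
    """
    root = ET.fromstring(xml_text)
    urls = []
    for ext in root.iter("EXTENSION"):
        if ext.findtext("KEY") == key:
            value = (ext.findtext("VALUE") or "").strip()
            if value:
                urls.append(value)
    return urls


# Made-up response fragment, for illustration only (not real GOCDB output).
SAMPLE_XML = """
<results>
  <SERVICE_ENDPOINT>
    <EXTENSIONS>
      <EXTENSION>
        <KEY>InformationSystem</KEY>
        <VALUE>http://hepgrid4.ph.liv.ac.uk/liv.json</VALUE>
      </EXTENSION>
      <EXTENSION>
        <KEY>SomeOtherProperty</KEY>
        <VALUE>ignored</VALUE>
      </EXTENSION>
    </EXTENSIONS>
  </SERVICE_ENDPOINT>
</results>
"""

if __name__ == "__main__":
    print(build_query_url())
    print(extract_json_urls(SAMPLE_XML))
```

Parsing is kept separate from the network call so that the extraction logic can be tried out on a saved response without contacting GOCDB.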

How this proposal partially solves the problem

A standard exists for the JSON document that each site must create and deploy on a web server. At present, the initiative covers information for site CEs, but it will later be extended to cover storage.

For example, at Liverpool, we have installed a web server on hepgrid4.ph.liv.ac.uk (which is also our BDII server, as it happens). Information consumers who wish to know about the Liverpool site will access the GOCDB, obtain the InformationSystem property and access the information at the URL provided in that property.
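The consumer side of this workflow can be sketched as follows: given one or more URLs taken from the InformationSystem property, fetch each document and parse it as JSON. This is a minimal sketch, not a reference client; the function names are hypothetical, and nothing is assumed about the fields inside the site document beyond it being valid JSON.

```python
import json
from urllib.request import urlopen


def fetch_bytes(url, timeout=10):
    """Download the raw document served at `url`."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def load_site_description(url, fetch=fetch_bytes):
    """Fetch one site's JSON document and return it as Python data.

    `fetch` is injectable so the parsing can be exercised offline.
    """
    return json.loads(fetch(url).decode("utf-8"))


def load_all(urls, fetch=fetch_bytes):
    """Load every site document; unreachable or invalid ones map to None."""
    results = {}
    for url in urls:
        try:
            results[url] = load_site_description(url, fetch)
        except (OSError, ValueError):
            # OSError covers network failures (URLError is a subclass);
            # ValueError covers bad encoding and malformed JSON.
            results[url] = None
    return results


if __name__ == "__main__":
    # URL taken from the Liverpool example above; it may no longer be served.
    print(load_all(["http://hepgrid4.ph.liv.ac.uk/liv.json"]))
```

Mapping failures to None rather than raising means one misconfigured site does not stop a consumer from reading the rest.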

Full documentation of the GOCDB programmatic interface can be found here: https://wiki.egi.eu/wiki/GOCDB/PI/Technical_Documentation



For the time being, the standard for the JSON file is available here:

https://docs.google.com/document/d/1pg_5Kibc_-Z4JF4_HJyW5xL6GVYKwXxOU7DXf2QP9Ag/edit

It was partially based on this proposal, which has not been superseded:

https://twiki.cern.ch/twiki/pub/EGEE/WLCGISEvolution/CE-json-proposal.txt

There is also a US standard here: https://github.com/opensciencegrid/topology/blob/master/topology/Brookhaven%20National%20Laboratory/Brookhaven%20ATLAS%20Tier1/BNL-ATLAS.yaml


Various other deployment options (steps etc.)

TBD

Deadlines/timelines for implementation

TBD