Monitoring Resource Usage of Jobs with cAdvisor
Revision as of 17:20, 12 May 2015
Google's cAdvisor (https://github.com/google/cadvisor) reports the resources used by containers. It exposes a web UI at http://hostname:port/ and can also export its data to a central database (InfluxDB in the setup described here). At sites running a batch system with cgroups enabled, cAdvisor can report on the jobs running on worker nodes.
The main page of the UI shows an overview of CPU, memory, network and disk usage of the whole node.
Screenshots: Media:Cadvisor1.png, Media:Cadvisor2.png, Media:Cadvisor3.png, Media:Cadvisor4.png, Media:Cadvisor5.png
You can then drill down and view information about individual jobs.
Screenshot: Media:Cadvisor6.png
Installing InfluxDB
Download and install the rpm:
wget https://s3.amazonaws.com/influxdb/influxdb-latest-1.x86_64.rpm
rpm -ivh influxdb-latest-1.x86_64.rpm
Then start the service:
service influxdb start
In a browser, go to http://hostname:8083/ and log in with the default username (root) and password (root). To create a database for cAdvisor, enter a database name under 'Database Details' in the 'Create a Database' section and click 'Create Database'. Once the database has been created, click on its name and create a user by entering a username and password in the 'Create a New Database User' section.
More information is available at http://influxdb.com.
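The database and user can also be created from the command line instead of the web UI. The following is a minimal sketch, assuming the installed package is an InfluxDB 0.8-series release whose HTTP API listens on port 8086 and accepts the /db and /db/<name>/users endpoints; later releases use a different API. The function names, variable names and example values are hypothetical and not part of InfluxDB.

```shell
#!/bin/sh
# Sketch only: endpoint paths assume the InfluxDB 0.8 HTTP API.
# All function and variable names here are hypothetical helpers.
INFLUX_HOST="${INFLUX_HOST:-hostname:8086}"
INFLUX_ADMIN_USER="${INFLUX_ADMIN_USER:-root}"
INFLUX_ADMIN_PASS="${INFLUX_ADMIN_PASS:-root}"

# Print the URL used to create a database (authenticated as the admin user).
create_db_url() {
    printf 'http://%s/db?u=%s&p=%s\n' \
        "$INFLUX_HOST" "$INFLUX_ADMIN_USER" "$INFLUX_ADMIN_PASS"
}

# Print the JSON body naming the new database ($1).
create_db_body() {
    printf '{"name": "%s"}\n' "$1"
}

# Print the URL used to add a user to database $1.
create_user_url() {
    printf 'http://%s/db/%s/users?u=%s&p=%s\n' \
        "$INFLUX_HOST" "$1" "$INFLUX_ADMIN_USER" "$INFLUX_ADMIN_PASS"
}

# Print the JSON body for a new database user ($1 = name, $2 = password).
create_user_body() {
    printf '{"name": "%s", "password": "%s"}\n' "$1" "$2"
}

# Send the requests; -f makes curl return non-zero on an HTTP error.
create_db() {
    curl -fsS -X POST "$(create_db_url)" -d "$(create_db_body "$1")"
}
create_db_user() {
    curl -fsS -X POST "$(create_user_url "$1")" -d "$(create_user_body "$2" "$3")"
}
```

With these definitions loaded, `create_db database_name` followed by `create_db_user database_name user password` would mirror the two steps performed in the web UI. The values shown are placeholders and should be changed as appropriate.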
Installing Grafana
Download and install the rpm:
rpm -ivh https://grafanarel.s3.amazonaws.com/builds/grafana-2.0.2-1.x86_64.rpm
Then start the service:
service grafana-server start
Building cAdvisor
Running cAdvisor
Example usage on an HTCondor worker node:
/usr/local/bin/cadvisor -storage_driver=influxdb -storage_driver_host=hostname:8086 -storage_driver_db=database_name \
    -storage_driver_password=password -storage_driver_user=user -storage_driver_secure=false -storage_driver_table=stats
where the InfluxDB hostname, database name, username and password should be changed as appropriate.
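When the same command has to be deployed on many worker nodes, it can help to keep the site-specific values in variables. The following is a minimal sketch that assembles the command line shown above from such variables; the variable names and the cadvisor_cmd function are hypothetical, and only the flags taken from the example above are used.

```shell
#!/bin/sh
# Sketch only: variable and function names are hypothetical.
# The defaults are the placeholder values from the example command.
CADVISOR_BIN="${CADVISOR_BIN:-/usr/local/bin/cadvisor}"
INFLUX_HOST="${INFLUX_HOST:-hostname:8086}"
INFLUX_DB="${INFLUX_DB:-database_name}"
INFLUX_USER="${INFLUX_USER:-user}"
INFLUX_PASS="${INFLUX_PASS:-password}"

# Print the full cAdvisor command line on a single line.
cadvisor_cmd() {
    printf '%s' "$CADVISOR_BIN"
    # printf reuses the format string once per remaining argument.
    printf ' %s' \
        "-storage_driver=influxdb" \
        "-storage_driver_host=$INFLUX_HOST" \
        "-storage_driver_db=$INFLUX_DB" \
        "-storage_driver_password=$INFLUX_PASS" \
        "-storage_driver_user=$INFLUX_USER" \
        "-storage_driver_secure=false" \
        "-storage_driver_table=stats"
    printf '\n'
}
```

Calling `cadvisor_cmd` prints the command so that it can be reviewed before being placed in a startup script. The function only formats text and does not start cAdvisor itself.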