Glasgow Cluster YPF Adding A New Host

From GridPP Wiki
Revision as of 10:24, 5 April 2007 by Andrew elwell (Talk | contribs)


To add a new machine to the cluster:

  • Ensure that the machine will netboot from the interface connected to the internal network. This is the only interface where DHCP and BOOTP will work.
  • Add the new machine to, at minimum, the hosts database. At the moment this is done with the clusterdb SQL_QUERY command, e.g.,
 clusterdb 'insert into hosts values("svr024.beowulf.cluster","00:30:48:42:95:1A","10.141.255.24");'
 clusterdb 'insert into hosts values("svr024.gla.scotgrid.ac.uk","00:30:48:42:95:1A","130.209.239.24");'

If the host is associated with APC ports or network ports, add entries in these db tables as well.

Note that the SQL query has to be passed as a single argument and that hostnames are FQDNs. This is terribly crude right now; we should have much better (and less dangerous) utilities.
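Because the query must reach clusterdb as one argument, the quoting is easy to get wrong. A minimal sketch of a wrapper, assuming the column order (FQDN, MAC, IP) seen in the examples above; the function name hosts_insert_sql is hypothetical and not an existing cluster utility:

```shell
# Hypothetical helper: build the SQL for one row of the hosts table.
# Column order (FQDN, MAC, IP) is taken from the examples above.
hosts_insert_sql() {
    fqdn=$1
    mac=$2
    ip=$3
    printf 'insert into hosts values("%s","%s","%s");' "$fqdn" "$mac" "$ip"
}

# Usage: the command substitution is double-quoted, so clusterdb
# receives the whole query as a single argument.
#   clusterdb "$(hosts_insert_sql svr024.beowulf.cluster 00:30:48:42:95:1A 10.141.255.24)"
```

The helper does no escaping or validation, so it is no safer than typing the query by hand; it only removes the nested-quote juggling.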

An alternative is to populate the file /home/alt/etc/extra-hosts with the IP address, hostname and MAC address of each new host, e.g.:

10.141.255.26 svr026.beowulf.cluster 00:30:48:42:9C:3C
130.209.239.26 svr026.gla.scotgrid.ac.uk 00:30:48:42:9C:3D
10.141.255.27 svr027.beowulf.cluster 00:30:48:42:C8:24
130.209.239.27 svr027.gla.scotgrid.ac.uk 00:30:48:42:C8:25

and then run the script extrahosts2clusterdb. When mkhosts runs it will munge the .beowulf.cluster names and add a plain hostname alias.
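Before running extrahosts2clusterdb it may be worth checking the file by eye or with a quick script. The following is only a sketch: it assumes every non-blank line consists of exactly the three fields shown above (IPv4 address, FQDN, MAC address), which is inferred from the example rather than from the script itself. The function name check_extra_hosts is hypothetical.

```shell
# Hypothetical checker for the extra-hosts format shown above.
# Prints each suspect line and returns non-zero if any were found.
check_extra_hosts() {
    awk '
        BEGIN {
            h = "[0-9A-Fa-f][0-9A-Fa-f]"
            mac = "^" h ":" h ":" h ":" h ":" h ":" h "$"   # six hex octets
            ip = "^[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+$"      # dotted quad (no range check)
        }
        NF == 0 { next }                                    # skip blank lines
        NF != 3 || $1 !~ ip || $2 !~ /\./ || $3 !~ mac {
            print "line " NR ": " $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Usage:
#   check_extra_hosts /home/alt/etc/extra-hosts && extrahosts2clusterdb
```

The check is deliberately loose: it does not verify that octets are below 256 or that addresses are unique.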

  • Now regenerate the hosts tables for the cluster:
 # mkhosts > /var/cfengine/inputs/skel/common/etc/hosts
 # cp /var/cfengine/inputs/skel/common/etc/hosts /etc/hosts    # N.B. Temporary until svr031 uses cfengine

Send a HUP to dnsmasq on svr031 so the local DNS gets reloaded.

When svr031 uses cfengine, then do a cfagent -qv. This will HUP dnsmasq automatically.

  • And also regenerate the dhcp configuration:
 # mkdhcpdconf > /etc/dhcpd.conf

Restart dhcpd.

A better script could do this for you.
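One hazard with the redirections above is that the shell truncates the target before the generator runs, so a failing mkhosts or mkdhcpdconf leaves an empty file behind. A minimal sketch of a safer wrapper follows; the function name regen is hypothetical, and it assumes that an empty output always indicates a failure.

```shell
# regen TARGET COMMAND [ARGS...]
# Run COMMAND with its output captured in a temporary file, and overwrite
# TARGET only if COMMAND succeeded and produced some output.
regen() {
    target=$1
    shift
    tmp=$(mktemp) || return 1
    if "$@" > "$tmp" && [ -s "$tmp" ]; then
        # cat rather than mv, so TARGET keeps its owner and permissions
        cat "$tmp" > "$target"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        echo "regen: '$*' failed or produced nothing; $target left unchanged" >&2
        return 1
    fi
}

# Usage (as root):
#   regen /var/cfengine/inputs/skel/common/etc/hosts mkhosts
#   regen /etc/dhcpd.conf mkdhcpdconf
```

The wrapper only covers regenerating the files; sending the HUP to dnsmasq and restarting dhcpd still have to be done as described above.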

  • Generate an appropriate ssh key for the new machine using the gensshkey command (N.B. use the short hostname here):
 # gensshkey svr024

At the moment the key is stored in a directory tree, but storing it in clusterdb would be much better.
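Since the hosts database holds FQDNs while gensshkey wants the short name, a one-line conversion avoids typing mistakes. This is a small sketch using standard shell parameter expansion; the function name short_hostname is hypothetical.

```shell
# Strip everything from the first dot onwards,
# e.g. svr024.beowulf.cluster becomes svr024.
short_hostname() {
    printf '%s\n' "${1%%.*}"
}

# Usage:
#   gensshkey "$(short_hostname svr024.beowulf.cluster)"
```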

  • Regenerate the ssh_known_hosts and shosts.equiv.
 # cd /home/alt/private/key
 # genknownhosts *

This is rather crap - it would be much better if ssh keys were properly in clusterdb and the known hosts file were generated directly from there. Also note the flakiness in working out whether a host has a routed address.

NB: Until svr031 is managed by cfengine you need to copy /var/cfengine/inputs/skel/common/etc/ssh/ssh_known_hosts to /etc/ssh.

  • Generate a cfengine keypair for the new host.
    There is no easy way to do this right now - cfengine, unhelpfully, always tries to write to /var/cfengine/ppkeys/localhost.{pub,priv} when run as root. Generating the keys as a non-root user writes them, more helpfully, to $HOME/.cfagent/ppkeys. These can then be copied to /home/alt/private/cfengine/HOST. Then copy the localhost.pub key to /var/cfengine/ppkeys/root-10.141.XXX.YYY.pub, where 10.141.XXX.YYY is the new host's internal address. As with ssh keys, the cfengine key pairs would be better stored in the clusterdb.
  • If the machine is a server, unpack the host certificate into /home/alt/private/cert/HOST/host{cert,key}.pem. The little script unpackcerts might be useful...
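The copying in the cfengine keypair step above can be sketched as a dry run that only prints the commands for review. It assumes the keypair has already been generated as a non-root user (with cfengine 2 this would normally be cfkey, which is an assumption, as the command is not named above) and that the destination directory may need creating. The function name cfkey_plan is hypothetical.

```shell
# cfkey_plan HOST IP
# Print (do not run) the commands that stage a freshly generated cfengine
# keypair for HOST, whose internal address is IP.
cfkey_plan() {
    host=$1
    ip=$2
    src="\$HOME/.cfagent/ppkeys"             # left unexpanded on purpose
    dest="/home/alt/private/cfengine/$host"
    cat <<EOF
mkdir -p $dest
cp $src/localhost.priv $src/localhost.pub $dest/
cp $dest/localhost.pub /var/cfengine/ppkeys/root-$ip.pub
EOF
}

# Usage: review the output, then run the lines by hand.
#   cfkey_plan svr024 10.141.255.24
```

Remember to move or remove the generated keys from $HOME/.cfagent/ppkeys before generating a pair for the next host, otherwise the same keys may be staged twice.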