Hall C Compute Cluster
Systems and (nominal) Functions
The Hall C compute cluster is composed of roughly 4 'classes' of machines. Hosts within these classes are intended to be largely interchangeable, allowing for easier upgrades and failover.
CODA / DAQ nodes (rackmount)
- cdaql5, cdaql6
Compute / Fileserver nodes (rackmount)
'hcdesk' / User nodes (desktop)
- cvideo1 -- rackmount machine that hosts the munin service and runs the 'motion' software that manages the cameras
- cvideo2 -- desktop host that handles the 2 left-most large wall display screens
- cvideo3 -- desktop host that handles the 4 newest display screens
- cmagnets (VM) -- A Win10 VM hosted on cdaqfs1 that handles the 'go_magnets' spectrometer magnet controls
- skylla10 -- a rackmount Win10 host that hosts the Rockwell HMI software used to interact with the SHMS/HMS spectrometer PLCs
- cdaqbackup1 -- a rackmount host used to provide backups of (linux) Hall C systems. See #System Backups
CNAMES (DNS 'aliases' allowing systems to be pointed at a new physical host with a single DNS change)
- hcpxeboot -> cdaqfs1
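A quick way to confirm where one of these aliases currently points is to query DNS directly; the commands below are a generic illustration (not a documented procedure), using the hcpxeboot alias listed above.
 # Ask DNS which physical host the alias resolves to.
 host hcpxeboot                      # applies the local search domain; prints "... is an alias for ..."
 dig +search +short CNAME hcpxeboot  # prints just the canonical (target) name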
Puppet Configuration Management
- The Hall C cluster machines are configured and maintained using the open-source [Puppet] system.
- Main repo is hosted at: email@example.com:brads/hallc-puppet.git
- Updates/Upgrades are handled manually to minimize any surprises during Production
- Brad uses 'cssh' to periodically run global updates and/or push out configuration changes -- bug him for support.
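For concreteness, a manual cluster-wide update might look roughly like the sketch below. The host list is illustrative and the exact commands are an assumption (an agent run vs. a local 'puppet apply' depends on how the repo is deployed), not a documented procedure.
 # Open one synchronized terminal per host; cssh mirrors keystrokes to all of them.
 cssh cdaql5 cdaql6 cdaqfs1 cvideo1
 # Typed once in the cssh console and echoed to every host:
 sudo yum update            # OS updates are pulled manually, never automatically
 sudo puppet agent --test   # apply the current configuration (or 'puppet apply' if masterless)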
System Backups
- cdaqbackup1 is an older rackmount host repurposed to provide backups of some important systems
- All cdaqfs1 NFS exports are backed up nightly (rsync images; no snapshotting)
- cdaqfs1:home/ is backed up nightly with snapshots
- The backup software is [Borg Backup]
- This is handled by the script: cdaqbackup1:/data1/cdaqfs-backup/BACKUP-borg/borg-backup-cdaqfs-home.sh running on cdaqbackup1
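As a rough sketch of the two backup flavors described above: the real logic lives in the borg-backup-cdaqfs-home.sh script named above; the repository location, source paths, and retention settings below are assumptions for illustration only.
 # Plain nightly image of an NFS export (rsync; no history kept) -- paths are hypothetical.
 rsync -a --delete cdaqfs1:/export/work/ /data1/cdaqfs-backup/work/

 # Home areas get versioned snapshots via Borg -- repo path and retention are hypothetical.
 REPO=/data1/cdaqfs-backup/BACKUP-borg/cdaqfs-home.borg
 borg create --stats "$REPO"::home-{now:%Y-%m-%d} /mnt/cdaqfs1/home
 borg prune --keep-daily 14 --keep-weekly 8 "$REPO"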
Network Configuration / Management
- All systems on the Hall C network should be registered with the lab's central DNS/DHCP system (CNI). Talk to Brad Sawatzky and he will set you up quickly.
- Do not throw something on the network with a hardcoded IP address. That was fine 15 years ago, but it is not a good plan on a modern network.
- The network layout is roughly described on the Hall_C_Network page, but that is deprecated and may be out of date. JNET should be considered canonical.
- vxWorks hosts presently boot off cdaql1 (18.104.22.168)
PXE boot (Intel/Linux ROCs)
- Intel/Linux ROCs boot using the PXE mechanism. The PXE stanza is delivered by the CNI DHCP service to hosts on the 168 subnet:
 tftp-host: hcpxeboot                   # TFTP server
 tftp-path: linux-diskless/pxelinux.0   # Bootloader program
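As a hypothetical sanity check (not a procedure from this page), the bootloader named in that stanza should be retrievable over TFTP from the hcpxeboot alias:
 tftp hcpxeboot                                   # plain tftp client on any Hall C Linux host
 tftp> get linux-diskless/pxelinux.0 pxelinux.0   # fetch the bootloader into the current directory
 tftp> quit
 ls -l pxelinux.0                                 # a non-zero size means the DHCP stanza points at a valid path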