Hall C CODA/DAQ Layout

From HallCWiki
Revision as of 20:57, 26 April 2022 by Brads (Talk | contribs) (CODA process and file locations)

Jump to: navigation, search

Hall C CODA Layout

  • Detailed 'User' instructions are on the Hall_C_DAQ page. That includes the ROC layout, standard recovery procedures, etc. Read and understand that first.

CODA process and file locations

See also: Hall C Compute Cluster

There are two primary hosts dedicated to running the SHMS and HMS DAQs:

  • HMS: coda@cdaql5
  • SHMS: coda@cdaql6

When running in 'coincidence' mode, all ROCs (SHMS+HMS) are picked up by the 'SHMS' configuration running under coda@cdaql6

There is nothing 'special' about those machines however. If needed, failover to another host by replacing 'cdaql6' with a new/different host in

  • coda:bin/coda_user_setup
  • coda:bin/run-vncserver
  • CODA msqld server
    • See coda:bin/run-msqld It is presently started through crontab under coda@cdaql6.

They share a common NFS mounted 'coda' directory. The /home/coda mount is 'special'. It is hosted on cdaqfs1 along with the rest of the filesystems, but is handled differently to avoid filesystem size limitations with binary components in the CODA 2.6 environment (and perhaps vxworks cross-compiler toolchain). That limitation should be removed when Hall C migrates to CODA 3.0.

Data Movers

The 'data mover' algorithm takes care of copying CODA data files from the Hall C system(s) to tape.

The initial copy is done using 'coda:coda/scripts/copy_file_to_tape.sh', triggered when each CODA run is stopped. It initiates a 'jput' from the system that has the data drive mounted to minimize unnecessary network traffic. See the script for details.

Clean-up of the local files is managed by the 'jmirror' tool through the following cron entry and associated script running under coda@cdaql1 (again, this should run on the system with the data file system physically attached to avoid unneeded network traffic).

  • crontab on cdaq@cdaql1
## 'jmirror' verifies (via crc/md5sum) that all files in the raw.copiedtotape/ directory
## are in fact on tape.  (If they are not, it will copy them now).   Once they are
## verified to be on tape, it will remove them from the Hall C file system.
## Leave files on local disk for a nominal 48 hours before removing.
##   Files will only be removed from local disk if both the original and 'dup'
##   copies have been written and verified to be on tape.
@daily /$HOME/bin/jmirror-sync-raw.copiedtotape 2>&1 | egrep -v 'Found 0 files, comprising 0 bytes.|^WARN  Ignoring zero-length file:|already exists|^WARN  Unable to load previously calculated MD5'
## Sanity check of file count in ~/data/raw
## (Should be small unless there's an issue with CODA crashing before end of run, or file transport to tape)
## - Should probably convert this crontab entry to munin plugin at some point
@daily if [ `ls /$HOME/data/raw/ | wc -l` -gt 6 ]; then ( date; echo "$USER@$HOST"; echo "Warning: Extra files in $HOME/data/raw.  Verify things are working and manually move (non-active!) files to ../raw.copiedtotape.  A cronjob will ensure they are pushed to tape later." ); fi; 
## clean up $HOME/debug_logs/
@daily /usr/sbin/tmpwatch -qs --ctime 7d /$HOME/debug_logs/

Git repos

All of the CODA configurations, coda:bin/, and other directories are maintained with git.

The remote repos are stored on the 'hallcgit.jlab.org' server.