Difference between revisions of "Hall C CODA/DAQ Layout"

From HallCWiki

Revision as of 14:07, 9 June 2022

Hall C CODA Layout

  • Detailed 'User' instructions are on the Hall C DAQ page. That includes the ROC layout, standard recovery procedures, etc. Read and understand that first.
    • Includes instructions on updating F250 pedestals, switching DAQ modes, recovery procedures, etc.

CODA process and file locations

See also: Hall C Compute Cluster

There are two primary hosts dedicated to running the SHMS and HMS DAQs:

  • HMS: coda@cdaql5
  • SHMS: coda@cdaql6

When running in 'coincidence' mode, all ROCs (SHMS+HMS) are picked up by the 'SHMS' configuration running under coda@cdaql6.

There is nothing 'special' about those machines, however. If needed, fail over to another host by replacing 'cdaql6' with the new host in:

  • coda:bin/coda_user_setup
  • coda:bin/run-vncserver
  • CODA msqld server
    • See coda:bin/run-msqld. It is presently started through crontab under coda@cdaql6.
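
Before failing over, it can help to list where the old hostname is referenced. The following is a minimal sketch, not part of the Hall C tooling: the function name `list_host_refs` is hypothetical, and it assumes a grep that supports `-w` (GNU and BSD grep both do).

```shell
#!/bin/sh
# Hypothetical helper (not part of the Hall C scripts): print every line in the
# given files that mentions a hostname, so the references can be reviewed
# before they are edited by hand.
# Usage: list_host_refs <hostname> <file>...
list_host_refs() {
    host=$1
    shift
    # -n: show line numbers
    # -w: whole-word match, so 'cdaql6' does not match 'cdaql60'
    # -F: treat the hostname as a fixed string, not a regular expression
    # /dev/null: forces grep to prefix each match with its file name
    grep -nwF -- "$host" "$@" /dev/null
}
```

For example, `list_host_refs cdaql6 ~/bin/coda_user_setup ~/bin/run-vncserver ~/bin/run-msqld`, run as the coda user, would show the lines to change. This assumes that 'coda:bin/' refers to bin/ under the coda account's home directory.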

Both hosts share a common NFS-mounted 'coda' directory. The /home/coda mount is 'special': it is hosted on cdaqfs1 along with the rest of the filesystems, but is handled differently to avoid filesystem size limitations with binary components in the CODA 2.6 environment (and perhaps the vxworks cross-compiler toolchain). That limitation should be removed when Hall C migrates to CODA 3.0.

CODA support software

  • The start-/end-of-run scripts, EPICS logger scripts, RunStart GUI, and Prescale GUI are located in coda:coda/scripts/
  • There are multiple 'README' files in that directory and its children that describe the intended execution flow and 'best practices'.
  • Log files in coda:debug_logs/ may be useful in understanding problems.

ROC code

  • Run 'golinuxroc' as coda@cdaql1 to change into the ROC software directories for the Linux/Intel ROCs
    • Follow the instructions and type 'go_hcvme0X' (X = 1, 2, 4, 5, 7, 8) after running the above script
    • The files are stored on cdaqfs1 and NFS mounted at /net/cdaqfs1/cdaqfs-coda-home/pxeboot/
    • The PXE boot options are delivered by the JLab central DHCP server to all hosts (non-PXE systems ignore them). At present they are:
    filename "linux-diskless/pxelinux.0";     # Bootloader program
    next-server hcpxeboot.jlab.org;           # TFTP server (hcpxeboot is a CNAME for cdaqfs1 at present)


  • Run 'govxworksroc' as coda@cdaql1 to change into the vxworks directory and set up the PPC cross-compiler environment, etc.
    • The files are physically located on the NFS mount (cdaqfs1): /home/coda/coda/{crl,boot}/
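
A PXE client combines the two DHCP options listed above: it fetches the 'filename' path from the 'next-server' host over TFTP. The sketch below illustrates that by turning the two option lines into a TFTP URL. The function name `pxe_boot_url` is hypothetical, and the input is assumed to be in the same dhcpd-style syntax shown above.

```shell
#!/bin/sh
# Hypothetical helper: read dhcpd-style option lines on stdin and print the
# TFTP URL from which a PXE client would fetch its bootloader.
# Returns non-zero if either option is missing.
pxe_boot_url() {
    awk '
        { sub(/#.*/, "") }                  # strip trailing comments
        $1 == "filename"    { f = $2 }      # bootloader path
        $1 == "next-server" { s = $2 }      # TFTP server host
        END {
            gsub(/[";]/, "", f)             # drop quotes and semicolons
            gsub(/[";]/, "", s)
            if (f == "" || s == "") exit 1
            printf "tftp://%s/%s\n", s, f
        }
    '
}
```

Feeding it the two lines above prints tftp://hcpxeboot.jlab.org/linux-diskless/pxelinux.0. A TFTP-capable client such as curl could then be pointed at that URL to check that the bootloader is reachable.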

Experiment Changeover Tasks

  • Update '/coda/coda/scripts/DATAFILE-Locations.sh' to point at the new 'raw/' tape destination
    • This file is used by CODA and the data mover scripts to move raw CODA files at end of run, and to watch for and correct file transfer interruptions, etc.
  • Update the 'T1, T2, ... T6' cables into the Trigger Master module(s) to match the experimental requirements.
    • The EDTM system is designed to trigger all detector pretriggers (3/4, PbGl, Cerenkovs, etc) with timing similar to what the physics will generate (including SHMS+HMS coincidences), so it can get you quite close pre-beam. However, the timing will need to be checked/tweaked when beam arrives (of course).
  • Confirm the trigger table mapping is consistent with the experimental requirements.
    • This table sets the 'trigger bits' that the Trigger Master adds to its data header to flag whether a particular trigger involved the SHMS, HMS, etc.
    • See hcvme01.c (SHMS-single and COIN DAQ configurations), and hcvme02.c (HMS single-arm DAQ configuration). Helper scripts to generate the table are in the respective ROC code directories under 'hallC_triggerTable/'.
  • See also DAQ/Trigger Run Check Lists (NOTE: this is getting a little dated)
  • Trigger timing log entries / snapshots are in the logbook and are also recorded here: Trigger History

Data Movers

The 'data mover' algorithm takes care of copying CODA data files from the Hall C system(s) to tape.

The initial copy is done using 'coda:coda/scripts/copy_file_to_tape.sh', triggered when each CODA run is stopped. It initiates a 'jput' from the system that has the data drive mounted to minimize unnecessary network traffic. See the script for details.

  • Log files in coda:debug_logs/ may be useful in understanding problems.

Clean-up of the local files is managed by the 'jmirror' tool through the following cron entry and associated script running under coda@cdaql1 (again, this should run on the system with the data file system physically attached to avoid unneeded network traffic).

There are other crontab entries under coda@cdaql1 that monitor the file transfers for 'stuck' files or other issues and will email responsible parties.

  • See 'coda:bin/jmirror-sync-raw.copiedtotape -h' for a list of options (runs jmirror in a few different "modes")

crontab on coda@cdaql1

## 'jmirror' verifies (via crc/md5sum) that all files in the raw.copiedtotape/ directory
## are in fact on tape.  (If they are not, it will copy them now).   Once they are
## verified to be on tape, it will remove them from the Hall C file system.
##
## Leave files on local disk for a nominal 48 hours before removing.
##   Files will only be removed from local disk if both the original and 'dup'
##   copies have been written and verified to be on tape.
@daily /$HOME/bin/jmirror-sync-raw.copiedtotape 2>&1 | egrep -v 'Found 0 files, comprising 0 bytes.|^WARN  Ignoring zero-length file:|already exists|^WARN  Unable to load previously calculated MD5'
## Sanity check of file count in ~/data/raw
## (Should be small unless there's an issue with CODA crashing before end of run, or file transport to tape)
## - Should probably convert this crontab entry to munin plugin at some point
@daily if [ `ls /$HOME/data/raw/ | wc -l` -gt 6 ]; then ( date; echo "$USER@$HOST"; echo "Warning: Extra files in $HOME/data/raw.  Verify things are working and manually move (non-active!) files to ../raw.copiedtotape.  A cronjob will ensure they are pushed to tape later." ); fi; 
## clean up $HOME/debug_logs/
@daily /usr/sbin/tmpwatch -qs --ctime 7d /$HOME/debug_logs/
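
The 'extra files' sanity check in the crontab above is compact but hard to read as a one-liner. As a sketch only, the same logic could be written as a small function. The name `check_raw_count` and its arguments are hypothetical; the default threshold of 6 is taken from the crontab entry.

```shell
#!/bin/sh
# Hypothetical, more readable form of the 'extra files' check in the crontab.
# Usage: check_raw_count <directory> [max_files]
# Prints a warning and returns 1 when the directory holds more than max_files
# entries; returns 2 if the directory does not exist.
check_raw_count() {
    dir=$1
    max=${2:-6}    # 6 matches the threshold used in the crontab entry
    if [ ! -d "$dir" ]; then
        echo "Error: $dir is not a directory."
        return 2
    fi
    # Count entries the same way the crontab one-liner does; 'tr' strips the
    # padding that some 'wc' implementations add.
    count=$(ls "$dir" | wc -l | tr -d ' ')
    if [ "$count" -gt "$max" ]; then
        echo "Warning: $count files in $dir (expected at most $max)."
        return 1
    fi
    return 0
}
```

Because cron mails any output to the account owner, calling `check_raw_count "$HOME/data/raw"` from a wrapper script would stay silent in the normal case, as the original entry does.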

Git repos

All of the CODA configurations, coda:bin/, and other directories are maintained with git.

The remote repos are stored on the 'hallcgit.jlab.org' server.