Information for advanced cluster users

VIDIA enables select users to submit computing jobs to the UB CCR compute cluster.

UB CCR Computing Center

Monitoring your cluster jobs

As a VIDIA 'submit' user, recall that you need to submit a small script to the cluster to invoke these SLURM monitoring commands.

Monitor a running cluster job: 'squeue'
Cancel a job: 'scancel'
Check node status: 'sinfo'
Check job status: 'squeue' and 'stimes'
Show job history
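
As a rough illustration of the "small script" idea, the Python sketch below builds the text of a short bash script that wraps one of the monitoring commands listed above. It only generates the script text; it does not submit anything, and how the script is handed to the cluster through VIDIA is not shown here. The helper names (monitoring_command, wrapper_script), the "#SBATCH" directives, the user name "alice", and the job ID are assumptions made for the example, not part of VIDIA or CCR documentation.

```python
import shlex


def monitoring_command(action, user=None, job_id=None, partition=None):
    """Return the argument list for one SLURM monitoring command.

    'action' is a hypothetical label used only in this sketch:
    "queue" -> squeue, "cancel" -> scancel, "nodes" -> sinfo.
    """
    if action == "queue":
        cmd = ["squeue"]
        if user:
            cmd += ["-u", user]  # limit the listing to one user's jobs
    elif action == "cancel":
        if job_id is None:
            raise ValueError("cancel requires a job_id")
        cmd = ["scancel", str(job_id)]
    elif action == "nodes":
        cmd = ["sinfo"]
        if partition:
            cmd += ["-p", partition]  # limit the listing to one partition
    else:
        raise ValueError(f"unknown action: {action}")
    return cmd


def wrapper_script(cmd, job_name="monitor", minutes=5):
    """Return the text of a small bash script that runs 'cmd'.

    The #SBATCH lines are an assumption about how the script is scheduled;
    bash itself treats them as comments.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --time={minutes}",  # a bare number is read as minutes
        "",
        shlex.join(cmd),  # quote arguments safely for the shell
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: a script that lists the jobs of a hypothetical user "alice".
    print(wrapper_script(monitoring_command("queue", user="alice")))
```

Building the command as a list and quoting it with shlex.join keeps user names, job IDs, and partition names from being misread by the shell when the script runs.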

Advanced cluster user

Checking cluster metrics with XDMoD

View cluster wait times, jobs running, and more, using the XDMoD tool. For example, to view wait times:

1. Open a browser and navigate to
2. Click the "Usage" tab
3. Set Duration to the period of interest, for instance "7 day"
4. In the left-hand panel, select Jobs Summary: Wait Hours: Total
5. Click on the plot and select a dimension from the drill-down menu, for example:
 Resource: shows the academic cluster (ub hpc)
 Job Size (core count)
 Job Wall Time
 Node Count
 Queue: shows general-compute, etc.
 Group/Account: shows VIDIA jobs as IITG vidia
6. Drill down several times in succession to combine these views, for instance:
 Wait hours on ub hpc resource
 general-compute queue
 vidia user
 job size (core count)