Etiquette

The BRC Cluster is run largely without restrictions or limits, so that you can get a large amount of memory or a large number of cores when you need them. That has worked pretty well so far, and we would like to keep running it the same way. So please follow these guidelines:

  • Do not bog down the head node!
  • You can run some stuff on the head node:
    • Editing of scripts.
    • Data preparation (unless it is computationally intensive).
    • Tests of your code (maybe on cut-down data sets).
    • Compilation.
  • Try not to bog a compute node down when other users' jobs are running on that node.
    • Don't use too much memory.
    • Don't use more cores than you requested in the job submission.
    • If in doubt, take a whole node (using --exclusive) and monitor progress (example later).
  • Try not to queue hundreds of jobs that will take up the entire cluster.
    • Use an “array job” which lets you control the number of nodes you use very easily. This is the preferred technique.
    • Consider queuing the jobs on a small number of nodes (use -x to exclude some nodes).
    • Consider queuing a subset of your jobs at one time.
    • Write fancier scripts to control your jobs.
  • Try not to take up a large proportion of the cluster.
    • If your jobs are really short then it's OK.
    • If they take a really long time, it definitely isn't OK.
    • And there's a grey area in the middle.
      • As a guideline, think twice before taking up more than 4 whole nodes for multiple days.
  • Don't leave interactive jobs running on a node when you are not actually interacting with them.
    • e.g. A shell started with “srun --pty bash -i”.
    • This “uses up” a core on the node and may prevent the node you are running on from being given to a user who needs a node in exclusive mode.
  • Similarly, don't deliberately submit one job to each of the nodes that are currently free.
  • Consider using the HPC Center BRC queue.
    • 400 cores (with priority to BRC members).
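
The array-job approach recommended above might look something like the job script below. This is only a sketch: the file name array_job.sh, the task count, the limit of 8 concurrent tasks, the memory figure and the input file naming are all assumptions made for illustration, not BRC settings. The "%8" suffix on --array is the part that limits how many tasks run at the same time.

```shell
#!/bin/bash
# Hypothetical job script (call it array_job.sh); all names and sizes are placeholders.
# 200 array tasks, with at most 8 allowed to run at the same time (the %8 part).
#SBATCH --array=1-200%8
# Ask for what each task really uses, and then stay within it.
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G

# Slurm sets SLURM_ARRAY_TASK_ID inside each array task. Defaulting to 1
# lets the script be tried by hand on a cut-down data set first.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical naming scheme: one input file per task.
input_file="input_${task_id}.txt"

# Replace this line with the real work for one task.
echo "task ${task_id} would process ${input_file}"
```

Submitting this once with "sbatch array_job.sh" queues all 200 tasks as a single job, and the concurrency limit keeps them from spreading across the entire cluster at the same moment. Lowering the number after the "%" is a simple way to leave more room for other users.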
etiquette.1599675838.txt.gz · Last modified: 2020/09/09 14:23 by root