What is a Cluster

A cluster, or compute cluster, is a group of computers connected to one another by a network. At least one of these computers is designated as a “head node” or “login node”, which the users of the cluster log in to. The other computers are generally referred to as “compute nodes”. The head node runs job management software that allows users to send “jobs” (in the form of scripts to be executed) to the compute nodes.

The job management software keeps track of the resources on each compute node, such as how many processor cores and how much memory are currently in use and how much remains available. A job submitted to the cluster may have to wait in a “queue” if the resources it needs are currently in use. The job is then run when those resources become available.
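
The resource tracking and queueing described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not how any particular job manager is implemented: the names `Job`, `Node` and `ToyScheduler` are hypothetical, jobs are started strictly in the order they were submitted, and each job runs on a single node. Real job management software typically uses more elaborate policies.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    """A job and the resources it requests (hypothetical structure)."""
    name: str
    cores: int
    memory_gb: int


@dataclass
class Node:
    """A compute node with its total and currently used resources."""
    name: str
    total_cores: int
    total_memory_gb: int
    used_cores: int = 0
    used_memory_gb: int = 0

    def can_fit(self, job):
        # A job fits if enough cores AND enough memory are still free.
        free_cores = self.total_cores - self.used_cores
        free_memory = self.total_memory_gb - self.used_memory_gb
        return free_cores >= job.cores and free_memory >= job.memory_gb


class ToyScheduler:
    """Toy stand-in for the job management software on the head node."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.queue = deque()   # jobs waiting for resources
        self.running = {}      # job name -> (job, node it runs on)

    def submit(self, job):
        # New jobs join the back of the queue, then we try to start jobs.
        self.queue.append(job)
        self._dispatch()

    def finish(self, job_name):
        # Release the resources of a completed job, then try to start
        # jobs that were waiting for them.
        job, node = self.running.pop(job_name)
        node.used_cores -= job.cores
        node.used_memory_gb -= job.memory_gb
        self._dispatch()

    def _dispatch(self):
        # Strict first-in, first-out: only the job at the front of the
        # queue may start; if it does not fit anywhere, everything waits.
        while self.queue:
            job = self.queue[0]
            node = next((n for n in self.nodes if n.can_fit(job)), None)
            if node is None:
                break
            self.queue.popleft()
            node.used_cores += job.cores
            node.used_memory_gb += job.memory_gb
            self.running[job.name] = (job, node)


# One compute node with 4 cores and 16 GB of memory.
sched = ToyScheduler([Node("node1", total_cores=4, total_memory_gb=16)])
sched.submit(Job("a", cores=4, memory_gb=8))   # starts immediately
sched.submit(Job("b", cores=2, memory_gb=4))   # no free cores: waits
sched.finish("a")                              # frees cores: "b" starts
```

In this sketch, job "b" stays in the queue while job "a" occupies all four cores, and is started as soon as "a" finishes and its cores are released.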

The job management software also provides tools for checking which jobs are running or waiting in the queue, and for checking the outcome of completed jobs.

what_is_a_cluster.1629295986.txt.gz · Last modified: 2021/08/18 10:13 by root