Fairshare scheduling is our way of making sure that every user in an LSF cluster gets a fair share of the CPU resources over time. If we used “first-come, first-served” scheduling instead, a user could submit 500 jobs at once, and everyone who submitted jobs after that would have to wait until those 500 jobs had finished before their own jobs could run.
Each LSF user has a dynamic priority that is based on the number of shares assigned to that user (usually 1), the dynamic priority formula, and the amount of CPU time and run time used recently by all of that user’s jobs. As you use more resources, your dynamic priority decreases. As your jobs finish, your dynamic priority increases. Note that CPU time used recently is weighted more heavily than CPU time used in the past.
The higher your dynamic priority, the more likely it is that your pending job will be the next one to be run by LSF.
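The relationship between shares, recent usage, and priority can be sketched in code. This is only an illustration, not LSF's exact formula: the real calculation and its weighting factors (e.g. CPU_TIME_FACTOR and RUN_TIME_FACTOR) are configured by the cluster administrator, and the factor values below are placeholders.

```python
# Simplified sketch of fairshare-style dynamic priority (illustrative only;
# the real LSF formula and weighting factors are set by the administrator).
def dynamic_priority(shares, cpu_time, run_time,
                     cpu_time_factor=0.7, run_time_factor=0.7):
    # More shares -> higher priority; more recent usage -> lower priority.
    # The +1 keeps the denominator nonzero for users with no recent usage.
    return shares / (cpu_time * cpu_time_factor
                     + run_time * run_time_factor + 1)

# A user with little recent usage outranks a heavy user with equal shares,
# so the light user's pending job is dispatched first.
light_user = dynamic_priority(shares=1, cpu_time=10, run_time=60)
heavy_user = dynamic_priority(shares=1, cpu_time=50000, run_time=36000)
assert light_user > heavy_user
```

As the heavy user's jobs finish and their recent usage decays, the denominator shrinks and their priority recovers, which is the behavior described above.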
You can redirect your job’s standard output to a file using the “-o” option on the “bsub” command. When you do this, LSF appends to this file the job summary information you would normally receive in email after the completion of the job.
The following LSF job submission command is an example of using the “-o” option with %J, which LSF replaces with the job ID:
bsub -q queue -R resource -o out.%J executable
Below is an example of partial LSF job output:
Sender: LSF System <lsfadmin@c-186-30>
Subject: Job 552334: <matlab -nodisplay -nojvm -nosplash -singleCompThread -r testjob -logfile testjob_run1.out> Done

Job <matlab -nodisplay -nojvm -nosplash -singleCompThread -r testjob -logfile testjob_run1.out> was submitted from host <killdevil-login2> by user <deep> in cluster <killdevil>.
Job was executed on host(s) <12*c-186-30>, in queue <gpu>, as user <deep> in cluster <killdevil>.
</nas02/home/d/e/deep> was used as the home directory.
</nas02/home/d/e/deep/Test/Esub_Test/Matlab_jobs/Commandline_ver> was used as the working directory.
Started at Thu Feb 23 16:57:06 2012
Results reported at Thu Feb 23 17:00:49 2012

Your job looked like:

# LSBATCH: User input
matlab -nodisplay -nojvm -nosplash -singleCompThread -r testjob -logfile testjob_run1.out

Resource usage summary:

CPU time     : 40.84 sec.
Max Memory   : 15 GB
Max Swap     : 15 GB
Max Processes: 5
Max Threads  : 11

The output (if any) follows:
Notice the “Max Memory” listed under the Resource usage summary section. This number indicates the maximum amount of memory used by your job. Note that for MPI jobs this number is summed across all processes so you will have to divide by the number of MPI processes (i.e. the value specified by the “-n” flag in your submission).
Suppose you are working with a 2700 by 30000 matrix and that you are using double precision (i.e. 8 bytes per entry). Then your matrix would use 2700 × 30000 × 8 = 648,000,000 bytes, or about 0.6 GB.
Note. A gigabyte (GB) is 2^30, or 1,073,741,824, bytes (sometimes 1×10^9 is used). This is just the conversion factor from bytes to gigabytes.
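The arithmetic above generalizes to any dense matrix. A minimal sketch (the helper name is ours, purely for illustration):

```python
# Estimate the memory footprint of a dense matrix (illustrative helper).
def matrix_gb(rows, cols, bytes_per_entry=8):
    """Size of a rows x cols matrix in GB (1 GB = 2**30 bytes)."""
    return rows * cols * bytes_per_entry / 2**30

# The 2700 x 30000 double-precision example from the text:
size = matrix_gb(2700, 30000)
print(f"{size:.2f} GB")   # roughly 0.60 GB
```

An estimate like this, multiplied by however many such arrays your program holds at once, is a reasonable starting point for the “-M” value discussed below.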
What is the purpose of the new LSF memory limit for the KillDevil cluster and how will it affect me?
The KillDevil compute cluster is a large shared resource used by many researchers across campus. Increasingly, we are seeing large-memory jobs exhaust the memory on a node, disrupting the running of the cluster and adversely impacting the jobs of other users. One of the goals of the KillDevil cluster is to enable the running of large-memory jobs; however, to do this effectively without jobs disrupting one another, we have to implement memory limits.
Default memory limit:
The default memory limit, if you don’t specify one, is 4 GB. This means that if any one process of your job exceeds 4 GB in its resident set size, the job will be terminated by LSF and a corresponding message will appear in your job output.
Note. Most users’ jobs do not exceed the 4 GB threshold, so most users will not have to change anything in their job submission command.
The following is an important exception to the 4 GB default memory limit:
- If you request exclusive use of a node, this limit is raised to 46 GB.
Changing the default memory limit:
To change the default memory limit (of 4 GB), submit your job with the “-M m” flag in your LSF job submission command (i.e. the bsub command). Here m is an integer number of GB specifying your memory requirement. For example, the command
bsub -M 10 -q day …
reserves 10 GB of memory for your job (and therefore will only start on a node that has at least 10 GB free). You can specify bsub options in any order. Please note that you cannot use decimal numbers or units with the “-M m” flag.
Note. Do not specify memory requirements for your job in other ways; in particular, a resource string such as “rusage[mem=xx]” will be silently ignored. Always use the -M flag to specify the memory limit in your LSF job submission command if you do not want the default memory limit.
Determining how much memory your job is using:
You can find out the maximum memory used by your job by looking at the “Resource usage summary” section of the LSF output, where it is reported as “Max Memory”. For an example, see above.
As noted earlier, for MPI jobs this number is summed across all processes, so you will have to divide by the number of MPI processes (i.e. the value specified by the “-n” flag in your submission). We have noticed that the “Max Memory” value may not be reported for small-memory or short-running jobs, and it appears to be subject to some sampling error as well.
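To illustrate the per-process arithmetic, here is a small sketch (the parsing helper is ours, not part of LSF) that reads a “Max Memory” line like the one in the sample output above and divides it by the MPI process count:

```python
import re

# Hypothetical helper: extract "Max Memory" (in GB) from LSF job output
# and convert it to a per-process figure for an MPI job.
def per_process_gb(lsf_output, n_procs):
    m = re.search(r"Max Memory\s*:\s*([\d.]+)\s*GB", lsf_output)
    if m is None:
        return None   # LSF may omit Max Memory for small or short jobs
    return float(m.group(1)) / n_procs

sample = "Max Memory : 15 GB"
print(per_process_gb(sample, 12))   # 15 GB across 12 MPI ranks -> 1.25 each
```

The per-process figure is the one to compare against the 4 GB default limit, since the limit applies to the resident set size of each individual process.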