Getting Started on Emerald

Table of Contents

Introduction

System Information

Logging In

Work/Scratch Space

Mass Storage

Software

Compiling Serial Codes

Compiling Parallel Codes

Submitting Jobs

High Memory/AIX Resources

Transferring Files

Additional Help

Introduction

This document describes how to use the Research Computing cluster called Emerald. The intended use of this cluster is for UNC-CH affiliated researchers to do research-related computing.

System Information

Research Computing manages a heterogeneous cluster of multi-core CPU hosts, collectively known as the Emerald cluster, for campus researchers. Most nodes are based on Red Hat Enterprise Linux 4.0 (32-bit) or Red Hat Linux v5 (64-bit), but the cluster also includes four Power5-based large memory hosts, which run the AIX operating system. The Linux compute nodes are Intel Xeon IBM BladeCenter nodes (1.8, 2.0, 2.4, 2.8, and 3.2 GHz), and they communicate via a 10-Gigabit Ethernet network. Job management is handled by LSF (Load Sharing Facility). While working on Emerald, you will have access to several shared scratch file systems described later in this document.

Logging In

To obtain or manage your account on our servers, please visit the Onyen Services page, click on the Subscribe to Services button and select Emerald Cluster. Once you have an Emerald account, you can log in to Emerald using Secure Shell (ssh):

ssh  emerald.unc.edu

Telnet access is not allowed. Even though the cluster has many compute nodes, you never actually log in to any of them. Instead, you log in to the cluster as above, which takes you to "login node" resources that have been set aside for user access. From here you edit and compile your code, then use the LSF job scheduler to submit your code to the compute nodes for processing. When you log in to Emerald, your home directory will be in AFS space rather than being local to Emerald.

Note
LSF jobs that you run on Emerald will not have access to files in your AFS home directory nor to any other AFS file space that requires an AFS token. It is suggested that you use scratch space, described below, for your work files.

Work/Scratch Space

1. GPFS (General Parallel File System) scratch space

Scratch space (temporary storage for files associated with jobs you are currently running) is provided as a shared resource for all users. As of December 2008, additional scratch space, based on GPFS, is available on all Emerald Linux and AIX nodes. This space should be used as your primary working directory now. The scratch space is shared temporary storage intended for work files used in job processes. It uses standard UNIX permissions to control access to files and directories. (AFS is a separate file system which is also mounted on Emerald for legacy purposes, and access permissions there are governed by AFS permissions, known as ACLs). By default other users in your Unix group (graduate, faculty, etc) have read access to your scratch directory. You can easily remove their read access with the “chmod” command.
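As a minimal sketch of the “chmod” step mentioned above, the helper below removes all group and other permissions from a directory. The function name restrict_dir and the example path are illustrative assumptions, not part of the Emerald tooling.

```shell
# Remove all group and other permissions from a directory,
# leaving the owner's permissions unchanged.
restrict_dir() {
    chmod go-rwx "$1"
}

# Example call (hypothetical path; substitute your own Onyen):
#   restrict_dir "/largefs/$USER"
```

Running “ls -ld” on the directory afterwards should show permissions of the form drwx------.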

There are two directories in the GPFS-based scratch space: /smallfs and /largefs.

Note
/smallfs has 15 TB of scratch space intended for research data files smaller than 1 megabyte, and /largefs has 18 TB of scratch space intended for research data files larger than 1 megabyte. These GPFS file systems are not available on other Research Computing clusters or systems.

To access your directory on GPFS space, use the commands:

cd /smallfs/[onyen]

or

cd /largefs/[onyen]

You can access this space from Emerald, and your jobs running on compute nodes will also be able to access this space. This is the directory where you do your work.

To see how much scratch file space Emerald has access to, use the “df” command:

df  -h  /smallfs/

or

df -h /largefs/

To see how much space your files are taking up in your GPFS directory use the “du” command:

du -h /smallfs/[onyen]

or

du  -h  /largefs/[onyen]

If you have multiple subdirectories and you just want to see a summarization use the “-s” option:

du  -hs  /smallfs/[onyen]

or

du  -hs  /largefs/[onyen]

Please note that scratch space is not backed up and is, therefore, not intended for permanent data storage. See the "Mass Storage" section below about archiving permanent data.

Note
Since this storage space is shared by many other users, please remove any files there that are not associated with currently running jobs. Scratch space is not intended for long-term storage. A cleanup policy is in place for /smallfs and /largefs: since January 12, 2009, an automated process deletes any file that has not been used or modified in the last 21 days. Automated deletion is necessary to keep this limited, shared resource available to everyone; without it, the file system would routinely fill up, causing many users' jobs to suspend or fail.
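To see which of your files are approaching the 21-day limit, something like the following sketch may help. It lists files that have been neither read nor modified for more than a given number of days. The function name list_stale_files and the 14-day default are assumptions chosen to leave time for archiving; the actual cleanup process may measure file age differently.

```shell
# List regular files under a directory that have been neither read nor
# modified for more than a given number of days (default: 14).
#   $1 = directory to scan, $2 = age threshold in days (optional)
list_stale_files() {
    stale_dir=$1
    stale_days=${2:-14}
    find "$stale_dir" -type f -atime +"$stale_days" -mtime +"$stale_days"
}

# Example call (hypothetical path; substitute your own Onyen):
#   list_stale_files "/largefs/$USER" 14
```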

2. Netscratch space

Note
For many years, the “/netscr” file system served as the working scratch directory for Emerald. With the implementation of the GPFS file systems for scratch, you should routinely use “/smallfs” or “/largefs” for your work – they have more capacity and job performance will be better. For the near term, “/netscr” will continue to be mounted and will remain the default LSF work space.

Netscratch space is a shared temporary work directory space intended for work files used in job processes. Scratch space uses standard UNIX permissions to control access to files and directories. (AFS is a separate file system which is also mounted on Emerald and access permissions are mostly controlled by AFS permissions, known as ACLs). By default other users in your group (graduate students, faculty, employees) have read access to your Netscratch directory. You can easily remove this read permission with the “chmod” command. To see how much scratch file space Emerald has access to, use the “df” command:

df -h /netscr/

To see how much space your files are taking up in your NetScratch directory use the “du” command:

du  -h  /netscr/[onyen]

If you have multiple subdirectories and you just want to see a summarization use the “-s” option:

du -hs /netscr/[onyen]

Please note that scratch space is not backed up and is, therefore, not intended for permanent data storage. See the "Mass Storage" section below about archiving permanent data.

Note
Scratch space is an NFS-mounted file system and thus shared by all Emerald users as well as the users of other Research Computing systems.

Your scratch directory is created the first time you ssh into Emerald after your account has been created, provided you use any ssh client other than X-Win 32's StarNet SSH client:

/netscr/[onyen]

For example:

/netscr/mason

would be the scratch directory of the person whose Onyen is “mason”. In general, your directory for work and scratch files is /netscr/[onyen]. You can access this space from Emerald, and your jobs running on compute nodes will also be able to access this space. This is the directory where you do your work.

Note
Since this storage is shared with many other users, please remove any files there that are not associated with currently running jobs. Scratch space is not intended for long-term storage. A cleanup policy is in place for /netscr: since January 12, 2009, an automated process deletes any file that has not been used or modified in the last 21 days. This policy is necessary to ensure that all users have access to this limited and shared storage resource; without it, the file system would routinely fill up, causing users' jobs to suspend or fail.

Mass Storage

The Mass Storage system (also known as SAM-FS or /ms) is intended for archiving files and storing very large files, files that are too large to fit in your AFS quota. Files located in mass storage are not accessible to jobs running in LSF. Mass storage is not to be used as a work directory or as a backup location for local disk drives, operating systems, or software. In general, files that change often or directories with more than a thousand files in them will cause performance problems and consume tape resources. The Iron Mountain PC backup software provided by UNC might be an alternative solution rather than having to copy your PC files to mass storage.

Mass Storage is similar to an ordinary disk file system in that it keeps an inode (for recording data location, etc.) and data blocks for each file. For the user of mass storage, this file system appears to be a subdirectory of the user's AFS home directory. Files can be moved in and out of mass storage by using simple UNIX commands such as “cp” and “mv” or by using sftp/scp. As the Mass Storage system is optimized for archiving data, your programs should not directly read or write from the Mass Storage system. Instead copy your data from “~/ms” to “/largefs/[onyen]” or “/smallfs/[onyen]”.

If you are routinely storing large numbers of small files (more than several hundred files at a time) in mass storage, you should “tar” or “zip” those smaller files into one tarball or zip file outside of mass storage and then move that tarball or zip file to mass storage. You are not required to compress the tarball or zip file since the mass storage tape drive hardware will compress your data. Reducing the number of individual small files will help the overall performance of the SAM-FS Mass Storage system. See the more detailed list of things to avoid.
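The bundling step described above could look like the following sketch. It packs one directory into a single uncompressed tar file outside of mass storage, which can then be moved there as one file. The function name bundle_dir and the example paths are illustrative assumptions.

```shell
# Pack a directory of small files into one uncompressed tar archive.
#   $1 = directory to pack, $2 = archive file to create
bundle_dir() {
    tar -cf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Example call (hypothetical paths; substitute your own Onyen):
#   bundle_dir "/largefs/$USER/run42" "/largefs/$USER/run42.tar"
```

The archive stores paths relative to the parent directory, so extracting it recreates a single top-level directory rather than scattering files.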

To access Mass Storage from Emerald, type:

cd  ~/ms

Any files in the scratch space that you wish to save can be moved to mass storage, preferably in tar or zip format.

If you are doing any large moves or copies of data (such as to or from mass storage), please use the LSF “bsub” command:

bsub -R ms cp /netscr/myonyen/output/* /ms/home/m/y/myonyen/saved_output

This bsub command, issued with the "-R ms" parameters, will submit your copy or move job to a host with very good connectivity to the mass storage system. We expect these hosts to handle multiple data moves well, removing this burden from the login nodes.

Software

The Emerald Linux cluster mounts software applications from the AFS file system. This provides you access to many scientific, statistical and mathematical software packages. Among the more popular applications are SAS, NWChem, Amber, and Matlab. Several compilers are also available for use on the cluster, including Fortran compilers from Intel, Absoft and the Portland Group.

Note
Though software applications are made available from AFS space, your AFS home directory will not be available to either read or write from a job you submit via LSF, even interactive LSF jobs. Any files you want your job to read or write should be in “/largefs/[onyen]” or “/smallfs/[onyen]”.

Many software applications are installed in AFS; but most are not part of your default working environment. To access a particular software application, you will need to add it to your environment with the ipm command. For example, to add Portland Group compilers to your environment, you would execute the command:

ipm  add  pgi

A subset of the most frequently used utilities and applications has already been added to your working environment. This default set of tools includes a number of editors, including vi, ne, pico, nedit and emacs.

A note of caution as you add applications: each package added will increase the length of your $PATH. If it gets too long, parts of it will be lost and some commands will fail to execute as you expect. If this situation arises, you will need to remove some packages from your environment:

ipm  remove  package_name

We would recommend that you remove some of the less frequently used packages. The command:

ipm  query

lists, at the very end of its output, the packages that are currently in your environment. You can also read about ipm in more detail.
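To judge whether your $PATH is getting long, a quick check along these lines may be useful. It is a sketch using only standard shell tools; the function names are made up for illustration and are not part of ipm.

```shell
# Count the colon-separated entries in a PATH-style string.
count_path_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | wc -l | tr -d ' '
}

# Print entries that appear more than once in a PATH-style string.
duplicate_path_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | sort | uniq -d
}

# Report on the current session.
echo "PATH is ${#PATH} characters long, with $(count_path_entries "$PATH") entries"
```

Calling duplicate_path_entries "$PATH" shows directories that have been added more than once.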

Note
To use an X-Win 32 StarNetSSH session to connect to emerald.unc.edu you need to set the location of your ".Xauthority" file in the Command window during the configuration of the StarNetSSH session. Read the section "Creating sessions for Research Computing hosts using the Session Wizard" on the help page for X-Win 32.

Compiling Serial Codes

There are four commonly used compilers, namely, GNU, Intel, Portland Group and Absoft Profortran. The following table lists the corresponding package name and compiler command for the FORTRAN 77/90 and C/C++ programming languages:

Table 1. Available Compilers

                                  Programming Language Command
Compiler  Package Name            FORTRAN 77  FORTRAN 90/95  C     C++
GNU       gcc                     g77         ---            gcc   g++
Intel     intel_fortran intel_CC  ifc         ifc            icc   icc
PGI       pgi                     pgf77       pgf90          pgcc  pgCC
Absoft    profortran              f77         f90            ---   ---

To subscribe to one compiler package such as pgi, type:

ipm  add  pgi

After the compiler package has been added, to compile your serial FORTRAN 77 code, for example, “source.f”, type:

pgf77  -O  -o  source.x  source.f

An executable, “source.x”, is then generated.
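After adding a compiler package, it can be worth confirming that its command is actually on your $PATH before compiling. The sketch below returns the first command from a list that is available. The function name first_available is an assumption; the compiler command names in the example come from Table 1.

```shell
# Print the first command in the argument list that is found on PATH.
# Returns non-zero if none of them is available.
first_available() {
    for fa_cmd in "$@"; do
        if command -v "$fa_cmd" >/dev/null 2>&1; then
            echo "$fa_cmd"
            return 0
        fi
    done
    return 1
}

# Example: check for a FORTRAN 77 compiler, preferring Intel, then PGI, then GNU.
#   first_available ifc pgf77 g77 || echo "no compiler found; try ipm add"
```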

In our experience, the Intel compilers generally perform best of the four. Compiler performance is code-dependent, but we encourage use of the Intel compilers on Emerald.

Compiling Parallel Codes

MPI parallel codes in FORTRAN 77/90/95 and C/C++ can be run on the distributed-memory environment of the cluster. To compile your MPI codes, you need to pick a compiler (Intel, PGI, Absoft or GNU) and the kind of machine/CPU on which your code will run. Possible combinations are tabulated in Table 2 below:

Table 2. Available Compilers and Packages to be added for each kind of compiler and CPU

              Fortran77            FORTRAN90            C               C++
MPI Command   mpif77               mpif90               mpicc           mpiCC

Intel Blade CPU (packages to add)
Intel         intel_fortran mpich  intel_fortran mpich  intel_CC mpich  intel_CC mpich
PGI           pgi mpich            pgi mpich            pgi mpich       pgi mpich
GNU           gcc mpich            ---                  gcc mpich_gm    gcc mpich_gm
Absoft        profortran mpich     profortran mpich     ---             ---

Note that the order in which packages are added with ipm is important. Add the compiler first and then the MPICH package. For example, to compile MPI FORTRAN 77 codes with the Intel compiler for IBM Blade CPUs, type:

ipm  add  intel_fortran  mpich

To compile your code, say, “mpi_source.f”, type:

mpif77  -O  -o  mpi_source.x  mpi_source.f

An executable named “mpi_source.x” is generated after compilation.

Submitting Jobs

Once you have decided what software you need to use, added those packages to your environment using ipm (if needed), and you have successfully compiled your serial or MPI parallel code, you can then submit your jobs to run on Emerald. We use LSF (Load Sharing Facility) software to schedule and manage jobs that are submitted to run on Emerald. Emerald has 4 types of CPUs and many processors, and each processor (or “core”) is known as a job slot in LSF. A job slot is the basic unit of processor allocation in LSF. A serial job uses one job slot; a parallel job requesting N processors would use N job slots. Each user can have up to 60 job slots in use at any one time. If you are already using 60 job slots and you submit a job to run, that job will PEND until job slots are freed as your running jobs finish. Similarly, if all the job slots in the cluster are in use when you submit a job, even if you are not using any job slots yourself, your job will PEND.

To submit a job to run, you will need to use the LSF “bsub” command as shown below. LSF submits jobs to the job queue you specify. So in your “bsub” command, you will need to specify the queue in which the job is to run and the kind of machine/CPU on which it will run. Different queues have different run time limits and, in some cases, limits on the total slots per user. See Table 3 for details.

Table 3. Available Queues on Emerald

Queue Name                            Run time limit  Slot limit            Preemption
now                                   5 minutes       2 per user            Preempts month
int                                   10 hours        2 per user, 25 total  No
week (default queue)                  7 days          32 per user           No
month                                 30 days         4 per user, 32 total  By the now queue
patrons (restricted to patrons only)  unlimited       Depends on group      Preempts idle queue
idle                                  unlimited       unlimited             By the patrons queue
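The run time limits in Table 3 can be turned into a simple rule of thumb for batch jobs. The sketch below maps an estimated run time in minutes to the shortest generally available batch queue that can hold it. The function name pick_queue is an assumption; the sketch deliberately leaves out the interactive “int” queue and the restricted “patrons” queue, and it ignores slot limits and preemption.

```shell
# Suggest a batch queue for an estimated run time given in minutes,
# using the run time limits from Table 3.
pick_queue() {
    pq_minutes=$1
    if [ "$pq_minutes" -le 5 ]; then
        echo now        # limit: 5 minutes
    elif [ "$pq_minutes" -le 10080 ]; then
        echo week       # limit: 7 days = 10080 minutes
    elif [ "$pq_minutes" -le 43200 ]; then
        echo month      # limit: 30 days = 43200 minutes
    else
        echo idle       # no run time limit, but can be preempted
    fi
}

# Example: a job expected to run for about two days.
#   bsub -q "$(pick_queue 2880)" -R blade my_executable
```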

Note
Patrons can run interactive jobs in the “patrons” queue, but the “bsub” option “-Ip” needs to be used:

bsub -Ip -q patrons some_executable

Several kinds of CPUs run Linux in the cluster: Xeon 1.8 GHz (xeon18), Xeon 2.0 GHz (xeon20), Xeon 2.4 GHz (xeon24), Xeon 2.8 GHz (xeon28), and Xeon 3.2 GHz (xeon32). The Xeon CPUs are in IBM Blades (blade) connected with Gigabit Ethernet. Use “-R” to select the kind of CPU your job will run on.

A list of resources defined for a given node can be seen in the last column of output of the following command:

lshosts  |  more

The basic syntax for submitting a serial job is:

bsub  -q  queuename  -R  resources  executable  options_for_job

For example:

bsub  -q  week  -R  RH4  my_executable

Emerald has both 32-bit and 64-bit machines, and some applications must run on one or the other. "RH4" specifies that your job will be submitted to a 32-bit machine. To submit your job to a 64-bit machine, use the resource name "RH5". To keep your job off the large memory resources (IBM P575), use the resource name "blade".

Since the “week” queue is the default queue, it does not have to be specified. The following submission therefore also runs in the “week” queue, here on a blade node:

bsub  -R  blade  my_executable

You can raise the priority of your week queue job by giving an estimate of how long it will actually run. This is most beneficial if you expect your job to take a day or less, or even an hour or less. The “-W” option in effect lets you create your own day queue, hour queue, or any other time frame shorter than a week:

# to run a job with a run time limit of 24 hours and 0 minutes:
bsub -R blade -W 24:0 my_executable
#  to run a job with a run time limit of 30 minutes:
bsub -R blade -W 30 my_executable
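If you track run time estimates in minutes, a small helper can produce the hours:minutes form used with “-W” in the first example above. This is a sketch; the function name wlimit is hypothetical.

```shell
# Convert a run time estimate in minutes to the hours:minutes form
# accepted by the bsub -W option.
wlimit() {
    printf '%d:%02d\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

# Example: a limit of one and a half hours.
#   bsub -R blade -W "$(wlimit 90)" my_executable
```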

Note that jobs submitted to the interactive queue “int” will not run on Emerald login nodes, but run on a compute node instead. You will not be able to read or write to your AFS home directory during an interactive job so “cd” to “/largefs/[onyen]” or “/smallfs/[onyen]” before starting an interactive job. Submitting jobs to the interactive queue requires one additional parameter in the “bsub” command, “-Ip”, as shown below:

bsub -q int -Ip -R blade my_interactive_job

To run a parallel job using four CPUs across IBM BladeCenter nodes:

bsub -q week -n 4 -R blade -a mpichp4 mpirun.lsf my_par_job

or:

bsub -q idle -n 4 -R blade13 -a lammpi mpirun.lsf my_par_job

To run a parallel job on IBM xeon 3.2 GHz machines with, for example, 4 CPUs:

bsub -q patrons -n 4 -R xeon32 -a lammpi mpirun.lsf my_par_job

or:

bsub -q idle -n 4 -R xeon32 -a mpichp4 mpirun.lsf my_par_job

LSF will send email to your email address when the job finishes, whether it completes successfully or not (unless you are running in the interactive queue of course). You can check the status of your submitted LSF jobs with the command “bjobs”. The output of that command will include a Job ID, the status of your job (typically “PEND” or “RUN”), the queue to which you submitted the job, the job name, and other information. Additional details can be obtained with:

bjobs -l [JobID]

If you need to kill/end a running job, use the “bkill” command:

bkill [JobID]

where JobID is the LSF job ID displayed with the “bjobs” command.
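When you have many jobs queued, it can help to summarize the “bjobs” output. The sketch below counts the jobs in a given state, assuming the default “bjobs” layout in which the first line is a header and the third column is the job status. The function name count_jobs_in_state is an assumption.

```shell
# Count jobs in a given state from bjobs output read on standard input.
# Assumes a one-line header and the status (PEND, RUN, ...) in column 3.
#   $1 = state to count
count_jobs_in_state() {
    awk -v state="$1" 'NR > 1 && $3 == state { n++ } END { print n + 0 }'
}

# Example: how many of my jobs are still pending?
#   bjobs | count_jobs_in_state PEND
```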

Note
Jobs running outside the LSF queues will be killed. The logon privileges of users who repeatedly run jobs outside of the LSF queues will be suspended.

High Memory/AIX Resources

You are able to run jobs on any of four P575 servers and the old yatta AIX server. The new servers also run the AIX 5.3 UNIX operating system. Three of the P575 machines have 32 gigabytes of memory and one has 64 gigabytes (yatta has 128 gigabytes). Each of the four IBM P575 servers has sixteen 1.5 GHz POWER5+ processors.

If you have code that was compiled on another server, you will need to recompile it on Emerald because the operating system is different. To run compiled code on any of the AIX servers, compile it in an LSF job running on one of the AIX servers on which you plan to execute your job, because the operating system on the Emerald login node differs from the one on these compute servers. You cannot log in directly to these compute servers.

Table 4. Available compilers on AIX

                Programming Language Command
Compiler        C        C++     Fortran77         Fortran90   Fortran95   Fortran2003
IBM XL C/C++    cc, xlc  xlC     ---               ---         ---         ---
IBM XL Fortran  ---      ---     xlf, f77, fort77  xlf90, f90  xlf95, f95  xlf2003, f2003
(Parallel)      mpcc_r   mpCC_r  mpxlf_r           mpxlf90_r   mpxlf95_r   ---
GNU             gcc      g++     ---               ---         ---         ---

Compiler Reference Manuals:

IBM XL C/C++ V10.1 for AIX

IBM XL Fortran V12.1 for AIX

Jobs that run SAS, Stata, etc. and require large amounts of memory (more than an Emerald compute node, which has up to 3 GB of accessible memory) can be run on these new AIX servers.

You can submit jobs to these servers by using the “bsub -R” resource option like so:

bsub -q week -R p5aix sas -memsize 7G -sortsize 7136M -sysin my_large_memory_job.sas

The "p5aix" is the resource name for a POWER5+ processor server running AIX UNIX. LSF will choose one of the four p575 servers.

The "p5" resource name includes the "yatta" server in addition to the p575 servers.

To run a parallel job on AIX with, for example, 4 CPUs:

bsub -q week -R p5 -n 4 -a poe poejob my_par_job

"poe" stands for the Parallel Operating Environment on AIX.

Machine name  Total Amount of Memory
p575-n00      32 GB
p575-n01      32 GB
p575-n02      32 GB
p575-n03      64 GB
yatta         128 GB

Note
If your job needs 3 to 8 gigabytes of memory, run your job on a 32 gigabyte machine. If your job requires more than 8 gigabytes of memory, submit your job to the high memory machine. Remember that each machine's memory is shared among all of its processors, which other users' jobs may also be using.

If you need to run your job on the machine with 64GB of memory you can submit your job using the “bsub -R” resource option specifying "mem64":

  bsub -Ip -q int -R mem64 stata -m20480  

The above command submits an interactive Stata job using the Stata option “-m” to request 20,480 megabytes of memory which is 20GB.

You can also submit jobs to the machine with 64GB of memory using the “bsub -m” machine name option specifying "p575-n03":

  bsub -Ip -q int -m p575-n03 stata -m10240

The above command submits an interactive Stata job using the Stata option “-m” to request 10,240 megabytes of memory which is 10GB.
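The memory guidance in this section can be summarized as a small lookup, sketched below. It maps a memory requirement in whole gigabytes to one of the resource names used in this document. The function name pick_mem_resource is an assumption, and the thresholds simply restate the note above (up to 3 GB on a Linux compute node, 3 to 8 GB on a P575, more than 8 GB on the 64 GB machine); it does not consider yatta or what other users are running.

```shell
# Suggest a bsub -R resource name for a memory requirement in whole GB.
pick_mem_resource() {
    if [ "$1" -le 3 ]; then
        echo blade      # Linux compute node, up to 3 GB accessible
    elif [ "$1" -le 8 ]; then
        echo p5aix      # one of the four P575 AIX servers
    else
        echo mem64      # the P575 server with 64 GB of memory
    fi
}

# Example: pick_mem_resource 6   prints "p5aix"
```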

The following example shows how to create an LSF script file that you can redirect to the “bsub” command to submit a NWChem job. First create this text file and name the file “nwchem.lsf”:

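#!/bin/sh
# Job name shown by bjobs
#BSUB -J NWChem_Job
# Queue, number of job slots, and AIX POWER5+ resource
#BSUB -q week
#BSUB -n 4
#BSUB -R p5aix
# Write job output to a file named after the LSF job ID
#BSUB -o %J.out
mpirun -np 4 nwchem input.nw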

and then submit your job to LSF like so:

  bsub < nwchem.lsf
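If you submit the same kind of job with different slot counts, generating the script can keep the “-n” and “-np” values in sync. The following is a sketch only: the function name make_nwchem_script is hypothetical, and the script body mirrors the example above.

```shell
# Print an LSF script for an NWChem run on standard output.
#   $1 = number of job slots, $2 = NWChem input file
make_nwchem_script() {
    cat <<EOF
#!/bin/sh
#BSUB -J NWChem_Job
#BSUB -q week
#BSUB -n $1
#BSUB -R p5aix
#BSUB -o %J.out
mpirun -np $1 nwchem $2
EOF
}

# Example: write a script for 8 slots, then submit it as shown above.
#   make_nwchem_script 8 input.nw > nwchem.lsf
```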

Transferring Files

It is likely that you will need to transfer files between your campus computer systems and Emerald. You will need to use the “sftp” or “scp” command to move your files. The command “sftp” works similarly to the popular “ftp” command but is more secure. From your host UNIX/Linux or Mac computer terminal window, type:

sftp [onyen]@emerald.unc.edu

Enter your Onyen password and you will be presented with the sftp prompt. Use the “put” and “get” commands to transfer files, as you would do with standard ftp.

To use the “scp” command, follow this example to get a file, named “temp.txt”, from Emerald and store it in your local computer's “/tmp” directory with the same file name. You will be prompted for your password.

scp [onyen]@emerald.unc.edu:/largefs/[onyen]/temp.txt /tmp/

You can also copy a whole directory. The following command will recursively copy the whole directory “/tmp/temp_dir/” from your local computer to Emerald, and place it in the “/largefs/[onyen]/” directory with the name “temp_dir”:

scp -r /tmp/temp_dir/ [onyen]@emerald.unc.edu:/largefs/[onyen]/

If you need to copy a large file or a large number of files, use the LSF “bsub” command. For example, instead of executing the “cp” command directly from the login node:

cp /netscr/[onyen]/text.txt /ms/home/o/n/[onyen]/

it will be more efficient for you and other users if you submit your job using the LSF “bsub” command:

bsub -R ms cp /netscr/[onyen]/text.txt /ms/home/o/n/[onyen]/

The above “bsub” command, issued with the "-R ms" parameters, will submit your copy or move job to a host with very good connectivity to the mass storage system rather than using the limited resources of the login node you happen to be on.

Additional Help

Emerald Short Course

Research Computing home page


University of North Carolina - Chapel Hill