Getting Started on Topsail
Introduction
- The Topsail cluster is a Linux-based computing system available to researchers across the campus. With more than 4000 computing cores across 520 blade servers and a large scratch disk space, it provides an environment that can accommodate many types of computational problems. The blades are interconnected with a high-speed Infiniband network, making the cluster especially appropriate for large parallel jobs.
System Information
- Login node: topsail.unc.edu, 8 CPUs @ 2.3 GHz Intel EM64T with 2x4M L2 cache (Model E5345/Clovertown), 12 GB memory
- Compute nodes: 520 blade servers, each with 2 quad-core 2.3 GHz Intel EM64T processors, 2x4M L2 cache (Model E5345/Clovertown), and 12 GB memory. Total 4160 processing cores
- Operating system: RHEL 5
- Shared filesystem: (/ifs1) 39 TB IBRIX Parallel File System
- Interconnect: Infiniband 4x SDR
- Resource management is handled by LSF, through which all jobs are submitted for processing
Getting an Account
You may request an account on Topsail by sending an email to research@unc.edu. Please include the following information:
- Onyen
- email.unc.edu OR first_last@unc.edu address
- full name
- campus address
- a phone number where you can be reached
- department
- faculty sponsor's name if you are not a faculty member
- a description of the work you plan to do
You must be a UNC-Chapel Hill faculty member, staff member or a graduate student with a faculty sponsor in order to get an account. Requests for time-limited undergraduate student accounts should come directly from a sponsoring faculty member.
Logging In
Use ssh to connect to topsail.unc.edu and log in with your Onyen. At the time of your first login, ssh-keygen will run. Accept the defaults. If this environment is not established correctly, jobs will fail with "permission denied" messages.
Your home directory will be “/ifs1/home/Onyen” and will have a 15GB limit for backup purposes: any user home directory larger than 15GB will be excluded from backups. Home is not intended to hold more than basic code and environment files. To check how large your home directory is, use the command “du -sh /ifs1/home/Onyen”.
All data and output files should be directed to “/ifs1/scr/Onyen”. This directory will be created for you when your account is created. Note that “/ifs1/scr” is never backed up, and files in “/ifs1/scr” are automatically deleted after 21 days. Each user has a hard quota of 500GB for data stored in “/ifs1” (that is, for the total of your home directory and your scratch directory). Jobs will fail once that quota is reached.
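The size limits above can be checked from a script. The following is a minimal sketch, not an official Topsail tool: the function name exceeds_limit is hypothetical, and it only compares a usage figure in kilobytes (as reported by du -sk) against a limit in gigabytes.

```shell
#!/bin/sh
# Hypothetical helper (not part of Topsail): compare a usage figure in KB
# against a limit given in GB and print "over" or "ok".
exceeds_limit() {
    used_kb=$1
    limit_gb=$2
    limit_kb=$((limit_gb * 1024 * 1024))   # 1 GB = 1024 * 1024 KB
    if [ "$used_kb" -gt "$limit_kb" ]; then
        echo "over"
    else
        echo "ok"
    fi
}

# Example with a literal value: 20,000,000 KB is above the 15 GB threshold.
exceeds_limit 20000000 15

# On Topsail (assuming your Onyen matches $USER), something like:
#   used=$(du -sk "/ifs1/home/$USER" | awk '{print $1}')
#   exceeds_limit "$used" 15     # backup threshold for home
```

The same comparison could be applied to the 500GB quota by passing the combined usage of the home and scratch directories with a limit of 500.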
Even though the Topsail cluster has many compute nodes, you never actually log in to any of them. Instead, you log in to the cluster as described above. A successful login takes you to "login node" resources that have been set aside for user access. On the login node, you will edit and compile your code, and then you will use the LSF job scheduler to submit your code to the compute nodes for processing. Interactive use on the login node must be restricted to compiling and debugging. Other processes running on the login node are subject to immediate termination by the system administrators.
Development and Application environment
The environment on Topsail is presented as modules. The basic module commands are
module [ avail | list | load | initadd | unload | initrm | show ]
When you first log in you should run
module list
And the response should be one and only one of the following:
hpc/mvapich-[intel*|gcc*]
hpc/openmpi-[intel*|gcc*]
Please refer to the Help document on modules for further information.
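The "one and only one" rule above can be checked mechanically. This is a sketch under assumptions: count_hpc_envs is a hypothetical helper, the module name in the example is made up to follow the pattern shown above, and the exact layout of "module list" output may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: read `module list` output on stdin and print how many
# hpc/mvapich-* or hpc/openmpi-* environments appear in it.
count_hpc_envs() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^hpc\/(mvapich|openmpi)-/) n++
    } END { print n + 0 }'
}

# Example with a made-up module name that follows the pattern above.
printf 'Currently Loaded Modulefiles:\n  1) hpc/mvapich-intel-11\n' | count_hpc_envs

# On the cluster (many Environment Modules versions write the list to stderr):
#   module list 2>&1 | count_hpc_envs
```

A result of 1 matches the expected state; 0 or 2 would suggest that the hpc environment needs to be loaded or that a conflicting one should be unloaded.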
Applications available
Applications used by many groups across campus have been compiled and made available on Topsail. Examples of some applications are:
- Amber
- Bioperl
- Fftw
- Gaussian / Gaussview
- Gromacs
- Globus
- netCDF and NCO
- Nwchem
- R
To see the full list of applications currently available run
module avail apps
Software Development Tools
- Intel Compiler Suite
- v. 11 (the default) for Fortran77, Fortran90, C and C++, with the Math Kernel Library; previous versions are also available
- MPI for parallel programming via OFED v.1.3.1
- MVAPICH
- OPENMPI
- Totalview Debugger
- Job scheduler
- LSF v. 7 update 4
To see the full list of the available environments provided run
module avail hpc
Note that one and only one of these environments can be loaded at a given time.
Compiling
In either environment the available MPI compiler commands are:
mpicc
mpif77
mpif90
mpiCC
mpicxx
Once you have the default compile module added to your environment with the command
module initadd hpc; module load hpc
then both your compiles and your job submissions will have the appropriate environment available, including man pages, paths, libraries, include files and any required environment variables.
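As an illustration of how the wrapper commands listed above map to source files, here is a small sketch. The helper name mpi_wrapper_for and the file names are hypothetical, and the extension mapping is a common convention rather than a Topsail rule.

```shell
#!/bin/sh
# Hypothetical helper: pick an MPI compiler wrapper from a source file's
# extension. The extension mapping is a common convention, not a Topsail rule.
mpi_wrapper_for() {
    case "$1" in
        *.c)                  echo mpicc  ;;
        *.f|*.f77|*.F)        echo mpif77 ;;
        *.f90|*.F90)          echo mpif90 ;;
        *.cc|*.cpp|*.cxx|*.C) echo mpicxx ;;
        *) echo "unknown source type: $1" >&2; return 1 ;;
    esac
}

mpi_wrapper_for mycode.c      # prints: mpicc

# After `module load hpc`, a compile could then look like:
#   "$(mpi_wrapper_for mycode.c)" -o mycode mycode.c
```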
To run parallel jobs in this shared environment, we pay close attention to the behavior of multiple jobs running on each node. We have found cases where certain settings controlling CPU affinity, which help in some instances, dramatically reduce efficiency in this environment. If we identify this behavior with your parallel code, we will notify you that an edit has been made to your .cshrc or .bashrc file, depending upon your shell. Further information about CPU affinity on Topsail is available at the Topsail FAQ.
Submitting Parallel Jobs
bsub -n "< number CPUs >" -o out.%J -e err.%J \
[-a openmpi| -a mvapich] mpirun ./mycode
or
#### run_mycode ####
#BSUB -n "< number CPUs >"
#BSUB -e err.%J
#BSUB -o out.%J
mpirun ./mycode
##### end of run_mycode ####
bsub < run_mycode
For more basic LSF commands refer to the Help document on LSF (Load Sharing Facility).
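The script form above can also be generated rather than written by hand. The following is a sketch only: make_job_script is a hypothetical helper, and the "-q" line (which selects a queue in LSF) is an addition not shown in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print an LSF job script for an MPI program.
# Arguments: number of CPUs, queue name, program to run.
make_job_script() {
    ncpus=$1
    queue=$2
    program=$3
    cat <<EOF
#BSUB -n $ncpus
#BSUB -q $queue
#BSUB -o out.%J
#BSUB -e err.%J
mpirun $program
EOF
}

# Print a 32-CPU script for the "day" queue.
make_job_script 32 day ./mycode

# To submit it:
#   make_job_script 32 day ./mycode > run_mycode
#   bsub < run_mycode
```

Writing the script to a file before submitting keeps a record of exactly what was requested for each run.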
Queue Structure
- A short description of the queues available to users in the Topsail cluster can be found below. The list is in order of priority. You can also use the “bqueues” command to list the properties of a specific queue. For example, you could type "bqueues -l debug" to find out more about the debug queue. Additional queues may be added as need dictates.
- All queues share a common fairshare allocation policy that governs what job will be dispatched next based on the recent runtime history of each user or group with jobs queued to run.
- Time limits are measured by wallclock time.
- Note that there is an overall per user CPU limit of 1024 across the cluster.
| Queue name | Job Duration | CPU Range/Job | Default # CPUs | Total # CPUs/User across all jobs in queue |
| --- | --- | --- | --- | --- |
| int | 2 hr. | 1 | 1 | 128 |
| debug | 2 hr. | 1 | 1 | 128 |
| day | 1 day | 4 - 128 | 32 | 1024 |
| week (default queue) | 7 days | 4 - 128 | 128 | 1024 |
| 512cpu (restricted) | 4 days | 32 - 1024 | 256 | 1024 |
| 128cpu | 4 days | 32 - 128 | 128 | 1024 |
| 32cpu | 2 days | 4 - 32 | 32 | 1024 |
| Swingshift | n/a | 1 | 1 | n/a |
| Staff | n/a | n/a | n/a | n/a |
| chunk | 4 days | 1 | 1 | 1024 |
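Before submitting, a CPU request can be checked against the per-job ranges in the table above. This is a sketch with assumptions: queue_allows is a hypothetical helper, the ranges are copied from the table and may go out of date, and the Swingshift and Staff queues are left out. The output of "bqueues -l" remains the authoritative source.

```shell
#!/bin/sh
# Hypothetical helper: check a CPU request against the per-job CPU range of a
# queue. Ranges are copied from the table above and may be out of date.
queue_allows() {
    queue=$1
    ncpus=$2
    case "$queue" in
        int|debug|chunk) min=1;  max=1    ;;
        day|week)        min=4;  max=128  ;;
        512cpu)          min=32; max=1024 ;;
        128cpu)          min=32; max=128  ;;
        32cpu)           min=4;  max=32   ;;
        *) echo "unknown queue: $queue" >&2; return 2 ;;
    esac
    # Succeeds only when the request lies inside the inclusive range.
    [ "$ncpus" -ge "$min" ] && [ "$ncpus" -le "$max" ]
}

if queue_allows day 64; then
    echo "64 CPUs fit the day queue"
fi
```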
Mass Storage
- The Mass Storage system (also known as SAM-FS or “/ms”) is intended for archiving files and storing very large files that cannot be recreated through programming steps.
- It is not intended to be used as a backup location for disk drives, operating systems, software or scratch files that can be recreated by re-running jobs.
- Files that are changed often or directories with many files in them will cause performance problems and consume too many storage resources.
- We monitor the use of mass storage and will inform you if you are using it inappropriately or if you need to purchase tapes to accommodate the volume of data that you need to store.
For further information on using Mass Storage, please see Mass Storage
Scratch/Work Space
- Approximately 39 terabytes of scratch space are available for jobs running on Topsail. Mounted as “/ifs1/scratch”, this space is not backed up. It is therefore NOT intended to be permanent file storage.
- There is a 21-day automated scratch file removal policy. Any file not used or modified in the last 21 days will be deleted.
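To see which scratch files are approaching the 21-day removal policy above, a find command can list files that have been neither accessed nor modified recently. This is a sketch, not the cleanup tool used on Topsail: list_stale_files is a hypothetical helper, and the scratch path in the comment assumes your Onyen matches $USER.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a directory that have been
# neither accessed nor modified for more than the given number of days.
list_stale_files() {
    dir=$1
    days=$2
    find "$dir" -type f -atime +"$days" -mtime +"$days"
}

# On Topsail (path and $USER are assumptions; use your own scratch directory):
#   list_stale_files "/ifs1/scr/$USER" 14
```

Choosing a threshold below 21 days, such as 14, leaves time to copy results elsewhere before they are removed.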
Additional Help
Be sure to check the Research Computing home page for information about other resources available to you.
We encourage you to attend a training session on Getting Started on Topsail and other related topics. Please refer to the Research Computing Training site for further information.
If you have any questions, please feel free to call 962-HELP or submit an Online Web Ticket.


