Lustre Clustered Parallel Filesystem: File and Directory Striping Best Practices

Lustre Filesystem Striping

Lustre filesystem striping has both advantages and disadvantages. Choose a striping configuration based on the workload’s I/O pattern and file size; otherwise, it is best to let the filesystem defaults manage striping.

Two advantages are increased available bandwidth and increased maximum file size. A file can span multiple OSTs (the Lustre filesystem disks), which allows larger files and lets multiple disks perform I/O at the same time.
 
Two disadvantages are increased overhead and increased risk. The overhead comes from the OSTs and OSSs having to coordinate I/O operations on a striped file. The risk is clearest if you consider striping every file across all servers: if any one OSS catches fire, a small part of every file is lost. By comparison, if each file has exactly one stripe, you lose fewer files, but you lose them in their entirety. Most users would rather lose some of their files entirely than all of their files partially.
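The trade-off can be put into rough numbers. The sketch below is a hypothetical helper, not part of Lustre, and it makes a simplifying assumption: a single OST is lost (rather than a whole OSS), and every file has its stripes on distinct OSTs chosen uniformly at random. Under that assumption the share of files that lose data is stripe count divided by OST count. The 30 used in the comments is the OST count quoted for the UNC filesystem in this section.

```shell
# Percentage of files that lose data when ONE OST is lost (result is rounded down).
# Assumption: each file's stripes sit on distinct, uniformly chosen OSTs.
# Usage: percent_files_affected <stripe-count> <total-osts>
percent_files_affected() (
    stripe_count=$1
    total_osts=$2
    # A file cannot have more stripes than there are OSTs.
    if [ "$stripe_count" -gt "$total_osts" ]; then
        stripe_count=$total_osts
    fi
    echo $(( 100 * stripe_count / total_osts ))
)

# percent_files_affected 1 30    prints 3   (few files hit, each lost whole)
# percent_files_affected 30 30   prints 100 (every file loses a piece)
```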
 
A general rule of thumb is to use a stripe count approximately equal to the file size in gigabytes. Use this rule within reason: a 1 TB file should not be given a stripe count of 1024, because there are currently only 30 OSTs and 4 OSS nodes on the UNC Lustre filesystem.
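As a rough illustration of the rule, the following sketch turns a file size into a suggested stripe count and caps it at the number of OSTs. The function name is hypothetical, and the default cap of 30 is simply the OST count quoted above; pass a different value for another filesystem.

```shell
# Suggest a stripe count using the "one stripe per gigabyte" rule of thumb.
# Usage: suggest_stripe_count <size-in-bytes> [max-osts]
# Hypothetical helper; max-osts defaults to 30 (the OST count quoted above).
suggest_stripe_count() (
    size_bytes=$1
    max_osts=${2:-30}
    gib=$((1024 * 1024 * 1024))
    # Round up to whole gigabytes so that small files still get one stripe.
    count=$(( (size_bytes + gib - 1) / gib ))
    if [ "$count" -lt 1 ]; then count=1; fi
    # Never suggest more stripes than there are OSTs.
    if [ "$count" -gt "$max_osts" ]; then count=$max_osts; fi
    echo "$count"
)

# suggest_stripe_count 5368709120       prints 5   (5 GB file)
# suggest_stripe_count 1099511627776    prints 30  (1 TB file, capped)
```

The result could then be passed to lfs setstripe -c when creating the file.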

Additional rules and best practices apply, such as:

  1. Make the stripe count an integral factor of the number of processes writing in parallel, so that the load is balanced across the OSTs. For example, set the stripe count to 16 instead of 15 when 64 processes perform the writes.
  2. A good stripe size for sequential I/O using high-speed networks is between 1 MB and 4 MB. In most situations, stripe sizes larger than 4 MB do not parallelize as effectively because Lustre tries to keep the amount of dirty cached data below 32 MB per server (with the default configuration).
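The first rule above can be sketched as a small search: start at the largest stripe count the filesystem allows and step down until the count divides the number of writer processes evenly. The function name is hypothetical, and the default limit of 30 is the OST count quoted earlier in this section.

```shell
# Largest stripe count that divides the writer count evenly and fits the OST limit.
# Usage: balanced_stripe_count <num-processes> [max-osts]
# Hypothetical helper; max-osts defaults to 30.
balanced_stripe_count() (
    nprocs=$1
    max_osts=${2:-30}
    c=$max_osts
    # There is no point in using more stripes than writers.
    if [ "$c" -gt "$nprocs" ]; then c=$nprocs; fi
    if [ "$c" -lt 1 ]; then c=1; fi
    # Step down until c is a factor of nprocs (1 always is).
    while [ "$c" -gt 1 ] && [ $((nprocs % c)) -ne 0 ]; do
        c=$((c - 1))
    done
    echo "$c"
)

# balanced_stripe_count 64      prints 16
# balanced_stripe_count 64 15   prints 8
```

With a prime number of writers larger than the OST limit, the search falls back to 1, which is a hint that the process count itself may be worth adjusting.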

Set Striping Patterns
 
Files and directories inherit striping patterns from the parent directory. However, you can change them for a single file, multiple files, or a directory using the lfs setstripe command. The lfs setstripe command creates a new file with a specified file layout (stripe pattern) or sets a default striping configuration for files created in a directory. The usage for the command is:

lfs setstripe [--size|-s stripe-size] [--count|-c stripe-count] [--index|-i start-ost] <filename|dirname>

stripe-size:
Stripe size is how much data to write to one OST before moving to the next OST. The default stripe-size is 1 MB, and passing a stripe-size of 0 causes the default stripe size to be used. Otherwise, the stripe-size must be a multiple of 64 KB.

stripe-count:
Stripe count is how many OSTs to use. The default stripe-count is 1. Setting stripe-count to 0 causes the default stripe count to be used. Setting stripe-count to -1 means stripe over all available OSTs (full OSTs are skipped).

start-ost [DO NOT USE THIS OPTION FLAG!!!]:
Start ost is the index of the first OST to which a file is written. The default start-ost is -1, which lets the MDS choose the starting index. Keeping this default is strongly recommended, as it allows the MDS to balance space and load as needed. Any other value makes the file start on the specified OST index; indices begin at zero (0).
 
Note: If you pass a start-ost of 0 and a stripe-count of 1, all files are written to OST #0 until space is exhausted. This is probably not what you meant to do.

If you only want to adjust the stripe-count and keep the other parameters at their default settings, do not specify any of the other parameters:

lfs setstripe -c <stripe-count> <file>
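The constraints described above (stripe size a multiple of 64 KB, stripe count of -1, 0, or a positive number, start-ost left alone) can be checked before a command is issued. The sketch below is a hypothetical helper that only prints the command it would run. It takes the stripe size in bytes and uses the -s spelling from the usage line above; some newer Lustre releases are understood to spell the stripe-size option -S, so check the lfs setstripe man page on your system.

```shell
# Print (do not run) an lfs setstripe command after checking the documented limits.
# Usage: build_setstripe <file-or-dir> <stripe-size-bytes> <stripe-count>
# Hypothetical helper; start-ost is deliberately never set.
build_setstripe() (
    target=$1
    stripe_size=$2     # bytes; 0 means "use the default stripe size"
    stripe_count=$3    # 0 = default count, -1 = all available OSTs
    unit=65536         # 64 KB
    if [ "$stripe_size" -lt 0 ] || [ $((stripe_size % unit)) -ne 0 ]; then
        echo "stripe size $stripe_size is not a multiple of 64 KB" >&2
        exit 1
    fi
    if [ "$stripe_count" -lt -1 ]; then
        echo "stripe count must be -1, 0, or a positive number" >&2
        exit 1
    fi
    echo "lfs setstripe -s $stripe_size -c $stripe_count $target"
)

# build_setstripe file1 2097152 4   prints: lfs setstripe -s 2097152 -c 4 file1
# build_setstripe file1 100000 4    fails: 100000 is not a multiple of 65536
```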

Although lfs setstripe accepts option values by position, it is best to use the explicit options. Positional options are error-prone and often misused. For example, it is best to use the following command:

lfs setstripe $NAME -s 1m -c 16
 
rather than
 
lfs setstripe $NAME 1m -1 16
 
Note that not specifying an option keeps the current value.

Setting the Striping Pattern for a Single File
 
You can specify the striping pattern of a file by using the lfs setstripe command to create it. This lets you tune the file layout for your application. For example, on a filesystem with at least 40 OSTs, the following command creates a new zero-length file named file1 with a stripe size of 2 MB and a stripe count of 40:
 
lfs setstripe file1 -s 2m -c 40
 
Note that you cannot alter the striping pattern of an existing file with the lfs setstripe command. If you try to execute this command on an existing file, it will fail. Instead, you can create a new file with the desired attributes using lfs setstripe and then copy the existing file to the newly created file.
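The copy-based workaround described above might look like the following sketch. The function name and the temporary file suffix are hypothetical; the sketch assumes lfs is on the PATH, that nothing else is writing to the file during the copy, and that there is enough free space for a second copy.

```shell
# Give an existing file a new stripe count by copying it into a freshly striped file.
# Usage: restripe_file <file> <stripe-count>
# Hypothetical helper; the ".restripe.<pid>" suffix is an arbitrary choice.
restripe_file() (
    src=$1
    stripe_count=$2
    tmp="$src.restripe.$$"
    # Create the empty destination with the desired layout first.
    lfs setstripe -c "$stripe_count" "$tmp" || exit 1
    # Copy the data into it; the destination keeps the layout it was created with.
    if ! cp -p "$src" "$tmp"; then
        rm -f "$tmp"
        exit 1
    fi
    # Replace the original only after the copy has succeeded.
    mv "$tmp" "$src"
)

# restripe_file bigfile.dat 8
```

Because the original is replaced only at the end, a failed copy leaves it untouched.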

Setting the Striping Pattern for a Directory
 
Invoking the lfs setstripe command on an existing directory sets a default striping configuration for any new files created in the directory. Existing files in the directory are not affected. The usage is the same as lfs setstripe for creating a file, except that the directory must already exist. For example, to limit the number of OSTs to 2 for all new files to be created in an existing directory dir1 you can use the following command:
 
lfs setstripe dir1 -c 2

Setting the Striping Pattern for Multiple Files
 
You can’t directly alter the stripe patterns of a large number of files with lfs setstripe, but you can do it by taking advantage of the fact that files inherit the directory’s settings. First, create a new directory and set its striping pattern to your desired settings using the lfs setstripe command. Then copy the files to the new directory; the copies inherit the directory settings that you specified.
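The two steps above can be sketched as one small function. The name is hypothetical, and the sketch assumes lfs is on the PATH and that the target directory does not exist yet, so mkdir fails rather than silently reusing a directory with a different layout.

```shell
# Copy files into a new directory whose default layout the copies will inherit.
# Usage: copy_with_layout <stripe-count> <new-dir> <file>...
# Hypothetical helper; the originals are left in place.
copy_with_layout() (
    stripe_count=$1
    new_dir=$2
    shift 2
    # Step 1: create the directory and set its default striping.
    mkdir "$new_dir" || exit 1
    lfs setstripe -c "$stripe_count" "$new_dir" || exit 1
    # Step 2: copy each file; new files take the directory's settings.
    for f in "$@"; do
        cp -p "$f" "$new_dir/" || exit 1
    done
)

# copy_with_layout 4 striped_dir results_*.dat
```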

Using the Non-striped Option
 
There are times when striping will not help your application’s I/O performance. In those cases, it is recommended that you use Lustre’s non-striped option. You can set the non-striped option by using a stripe count of 1 along with the default values for stripe index and stripe size. The lfs setstripe command for the non-striped option is as follows:
 
lfs setstripe dir1 -c 1

Striping across all OSTs
 
You can stripe across all of the OSTs by using a stripe count of -1 along with the default values for stripe index and stripe size. The lfs setstripe command for striping across all OSTs is as follows:
 
lfs setstripe dir1 -c -1