On all of our clusters, users have access to global home, project, and group space (group space is available only to members of a group that has purchased additional storage). This means that files created from compute nodes on one cluster can be used for calculations on any of our other clusters. All storage systems use GPFS as the filesystem, and certain filesystems are backed up nightly.
|Filesystem||Location||Soft Limit||Hard Limit|
|Home|| ||100 GB||1 TB|
|Project|| ||1+ TB||2+ TB|
|Group|| ||no quotas||no quotas|
We also offer groups the option of purchasing their own storage resources to be hosted with us. In doing so you get access to the storage space across all of our clusters (unless otherwise specified), and we take care of supporting the system, replacing parts, and, where possible, tuning the storage resources to fit your data model.
If you are interested in learning more about purchasing additional storage resources please contact us at email@example.com
You also have access to scratch space on the compute nodes while you have a job running on the node.
To prevent individual users from using up all of the available storage space, a quota is enforced on home and project directories. Home directories have a quota of 100 GB, while project directories have a quota of 1 TB or more.
These quotas are technically soft limits. If a running job outputs more data than you expected, you can continue writing to your home or project directories up to the hard limit. The hard limit keeps a job stuck in an infinite loop from crashing the filesystem. However, once you are over your soft limit, you have only a two-week grace period to get back below your quota.
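The interaction between the soft limit, the hard limit, and the grace period can be sketched as a small model. This is an illustration only, not how GPFS enforces quotas internally: the names `Quota` and `can_write` are hypothetical, decimal units (1 GB = 10^9 bytes) are an assumption, and the exact moment the grace period expires is simplified to "more than 14 days over the soft limit".

```python
from dataclasses import dataclass

GB = 10**9   # assumption: decimal units; the filesystem may count differently
TB = 10**12
GRACE_DAYS = 14  # the two-week grace period


@dataclass
class Quota:
    soft: int  # soft limit in bytes
    hard: int  # hard limit in bytes


def can_write(usage: int, new_bytes: int, quota: Quota, days_over_soft: int) -> bool:
    """Return True if a write of new_bytes would be accepted under this model."""
    if usage + new_bytes > quota.hard:
        return False  # writes never go past the hard limit
    if usage > quota.soft and days_over_soft > GRACE_DAYS:
        return False  # over the soft limit and the grace period has expired
    return True


home = Quota(soft=100 * GB, hard=1 * TB)

print(can_write(50 * GB, 10 * GB, home, days_over_soft=0))    # under the soft limit
print(can_write(150 * GB, 10 * GB, home, days_over_soft=3))   # over soft, within grace
print(can_write(150 * GB, 10 * GB, home, days_over_soft=15))  # over soft, grace expired
print(can_write(990 * GB, 20 * GB, home, days_over_soft=0))   # would exceed hard limit
```

In this model the first two writes are accepted and the last two are refused, which mirrors the behavior described above: being over the soft limit is tolerated for a while, but the hard limit and an expired grace period both block writes.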
Once you are over your quota and your grace period has expired, you can no longer write files to your home directory, including the cache file used by SoftEnv. This means that your software environment could become corrupted, preventing you from finding executables you’ve used before. If you unexpectedly see error messages like “command not found” or cryptic error messages upon login, check to make sure you aren’t over your quota with the following command:
If you believe your project requires additional storage space, contact us and tell us why this is the case. At this time we are only accepting requests for project quota increases. If you run out of room in your home directory, you’ll need to either delete some of the data or move it to your project or group directories.
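When deciding what to delete or move out of a full home directory, it helps to know which top-level entries take the most space. The following is a minimal sketch using only the Python standard library; the function names `tree_size`, `largest_entries`, and `report` are hypothetical, and the sizes it reports are apparent file sizes, which may differ from what the quota system counts.

```python
import os
from pathlib import Path


def tree_size(root: Path) -> int:
    """Total size in bytes of regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def largest_entries(root: Path, top: int = 10) -> list[tuple[int, str]]:
    """Return (size_in_bytes, name) for the largest direct children of root."""
    sizes = []
    for entry in root.iterdir():
        if entry.is_symlink():
            continue
        try:
            size = tree_size(entry) if entry.is_dir() else entry.stat().st_size
        except OSError:
            continue  # unreadable entry; leave it out of the report
        sizes.append((size, entry.name))
    return sorted(sizes, reverse=True)[:top]


def report(root: Path) -> None:
    """Print the largest entries under root, biggest first."""
    for size, name in largest_entries(root):
        print(f"{size / 1e9:8.2f} GB  {name}")
```

Calling `report(Path.home())` from a login node lists the ten largest files and directories directly under your home directory. Walking a large directory tree can take a while on a shared filesystem, so it is best run sparingly.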
Home, Project, and Group
Your home, project, and group directories are located on separate GPFS filesystems that are shared by all nodes on the cluster. These filesystems are located on a RAID array and are served by multiple file servers. This provides both a performance increase and protection against the filesystems becoming inaccessible: if one server goes down, the other servers can continue to serve the filesystems.
- Global namespace
- Multi-TB filesystem
- Large file support (> 2GB)
- Backed up
- RAID protection
- Stable hardware
- Native InfiniBand support
- Moderate performance
Local Scratch Disk
If you need a place to put temporary files that don’t need to be accessed by other nodes, we recommend that you put them on the local scratch disk of the nodes during job runs. Every job creates a job-specific directory on local storage, which can be referenced from your job submission script using the variable $TMPDIR. The normal publicly available Blues nodes offer 15 GB of scratch space, while the ‘biggpu’ queue offers 1 TB.
- Fast access
- Large file support (> 2GB)
- Unique to each node; not shared between nodes
- GB filesystem
- Not backed up
- Cleared out at the end of your job
- No RAID protection
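Because local scratch is cleared at the end of the job and is not backed up, a typical pattern is to write intermediate files under `$TMPDIR` and copy anything worth keeping to shared storage before the job finishes. The sketch below shows that pattern from a Python program launched by a job; the function names, file names, and the `results_dir` argument are hypothetical, and the fallback to the system temporary directory is an assumption for running outside a job.

```python
import os
import shutil
import tempfile
from pathlib import Path


def scratch_dir() -> Path:
    """Return the job's local scratch directory, or the system temp dir as a fallback."""
    return Path(os.environ.get("TMPDIR") or tempfile.gettempdir())


def run_with_scratch(results_dir: Path) -> Path:
    """Write intermediate data to local scratch, then copy the result to shared storage."""
    work = Path(tempfile.mkdtemp(prefix="work_", dir=scratch_dir()))
    try:
        intermediate = work / "intermediate.dat"
        intermediate.write_text("temporary data\n")  # stands in for real job output

        results_dir.mkdir(parents=True, exist_ok=True)
        final = results_dir / "result.dat"
        shutil.copy2(intermediate, final)  # copy out before scratch is cleared
        return final
    finally:
        shutil.rmtree(work, ignore_errors=True)  # tidy up our own working directory
```

Here `results_dir` would point somewhere in your home, project, or group space, since those are the locations visible from every node after the job ends.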
Backups and Archives
As previously mentioned, all storage systems use GPFS as their filesystem. Currently, only the home filesystem is backed up, and backups run nightly.
Backups are written to both a disk enclosure and a tape enclosure in LCRC. Our backup policy is to keep ONLY the current version of each file. If you delete a file, we keep the most recent copy of it for 90 days; after 90 days, the file is completely removed from LCRC systems. Please note that if you change a file or it becomes corrupt in some way, the nightly backup overwrites the stored copy and the previous version no longer exists. Likewise, if you delete a file and recreate it within 90 days under the original name and path, the next backup overwrites the copy stored in our backup system, so the old version is lost as well.
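The consequences of this policy are easier to see in a toy model. The sketch below is not the backup software itself; the class `BackupModel`, its fields, and the choice to drop a deleted file exactly 90 days after its deletion is first seen are assumptions made for illustration.

```python
from dataclasses import dataclass, field

RETENTION_DAYS = 90


@dataclass
class BackupModel:
    """Toy model: one stored copy per path, plus the day its deletion was first seen."""
    copies: dict = field(default_factory=dict)      # path -> contents
    deleted_on: dict = field(default_factory=dict)  # path -> day deletion was first seen

    def nightly(self, day: int, live_files: dict) -> None:
        # Files that exist are copied over whatever the backup held before.
        for path, contents in live_files.items():
            self.copies[path] = contents
            self.deleted_on.pop(path, None)
        # Files missing from disk keep their last copy for RETENTION_DAYS.
        for path in list(self.copies):
            if path not in live_files:
                first_seen = self.deleted_on.setdefault(path, day)
                if day - first_seen >= RETENTION_DAYS:
                    del self.copies[path]
                    del self.deleted_on[path]


backup = BackupModel()
backup.nightly(0, {"a.txt": "v1"})
backup.nightly(1, {"a.txt": "v2"})  # "v1" is overwritten; only "v2" remains
backup.nightly(2, {})               # file deleted; "v2" is kept for now
backup.nightly(3, {"a.txt": "v3"})  # recreated; "v2" is overwritten by "v3"
print(backup.copies)
```

The model makes the main point concrete: at no time does the backup hold more than one version of a file, so it protects against accidental deletion but not against unwanted changes. If you need older versions, keep your own copies under different names or use a version control system.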
If you need to restore a lost file, please contact firstname.lastname@example.org and we will make a best effort to restore this for you.
We currently do not offer the ability for users to archive their own files on demand for long term recovery, but we hope to re-implement this in the near future.