Overview of SHPC Condos

All researchers in the ORNL Science and Technology Directorates have access to CADES resources at no initial cost. The CADES Service Suite includes four core services: Cloud Computing, Scalable HPC, Dedicated Storage, and High-Speed Data Transfer.

One set of SHPC condos, intended for open, publishable research, sits in the ORNL Open protection zone (CADES Open); another, intended for sensitive codes and data, sits in the ORNL Moderate protection zone (CADES Mod). The protection zones contain and control both the software base and the data produced on those systems. Most users will join a condo in CADES Open.

All ORNL staff members may have access at no initial cost to 10 nodes of the open SHPC condo, called the birthright condo. Many more nodes have been purchased in both the Open and Moderate condos by specific ORNL divisions or research groups; access to those is described below.

| Condo | What is it | Who May Join | Cost |
| --- | --- | --- | --- |
| Birthright | Access to 36 nodes, 10 of which have 2 GPUs each; sits in the CADES Open enclave | All ORNL staff | No initial cost |
| CADES Open SHPC research condos | Access to more nodes (the node count depends on the condo) that have been purchased by one of ORNL's research divisions; sits in the CADES Open research protection zone | Researchers doing open research who are collaborating with the division or group that purchased the condo's nodes (post-docs, students, ORNL staff members, visiting researchers with cyber access PAS) | No initial cost to join, but access is subject to approval by the condo owner |
| CADES Moderate SHPC condos | Exclusive access to nodes in the Moderate protection zone | Researchers working with sensitive codes or data who are collaborating with a current Moderate condo owner. If you have sensitive codes or data and are not working with one of the current owners, contact cades-help@ornl.gov | No initial cost to join, but access is granted solely through approval by a current condo owner |
| Purchase a new condo | Access to your own set of resources, as defined by you and CADES | Any ORNL research or technical staff member | Contact cades-help@ornl.gov for purchase information |

If you are not sure which condo you should join, write to cades-help@ornl.gov or simply start with the birthright condo.

SHPC Condos Resources in Brief

More detail about the specific types of processors and a list of current SHPC Condos can be found here.

SHPC Condo Hardware Configuration

The SHPC Condos are commodity x86_64 clusters that operate as massively parallel processing (MPP) systems. The hardware differs for each of the 12 condo groups; however, there are some basic similarities. Each unit of the cluster is commonly referred to as a node and has its own CPUs, memory, and I/O subsystems. There are 2 CPUs per node, with between 32 and 128 cores between them. Nodes with GPUs have NVIDIA Tesla K80, P100, or V100 GPUs. Each node has between 128 and 512 GB of RAM and is connected to a condo-wide FDR InfiniBand network.
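If your condo is scheduled with Slurm, you can check the exact hardware available to you from a login node. The sketch below is illustrative only; the partition name `batch` is an assumption, so substitute your condo's partition and node names.

```bash
# Summarize nodes per partition: partition, node count, CPUs per node, memory (MB), and GPUs.
sinfo -p batch -o "%P %D %c %m %G"

# Show the full configuration of one node, including CPU count, RealMemory, and Gres (GPU) entries.
scontrol show node <node-name>
```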

Node Types

The SHPC Condos have two types of nodes: Login and Compute. While these are similar in terms of hardware, they differ considerably in their intended use.

| Node Type | Description |
| --- | --- |
| Login | When you connect to either the Moderate or Open SHPC condos, you are placed on a login node. This is the place to write/edit your code, compile small programs, manage data, submit jobs, etc. You should never run large parallel compilations or jobs on the login nodes. Login nodes are shared resources that are in use by many users simultaneously. |
| Compute | Most of the SHPC condo nodes are compute nodes. These are where your parallel jobs execute, and where you should compile large programs. They are accessed via the `sbatch` or `qsub` command, depending on whether your condo is using Slurm or Moab. |

For guidelines on compiling your code on compute nodes, see the software section.
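As a quick illustration, here is a minimal sketch of a Slurm batch script submitted with `sbatch` from a login node. The partition name, account name, and program are assumptions; substitute the values for your condo. Condos scheduled with Moab would use `qsub` and PBS directives instead.

```bash
#!/bin/bash
#SBATCH --job-name=example            # name shown in the queue
#SBATCH --nodes=2                     # number of compute nodes
#SBATCH --ntasks-per-node=32          # MPI ranks per node
#SBATCH --time=01:00:00               # wall-clock limit (HH:MM:SS)
#SBATCH --partition=batch             # hypothetical partition; use your condo's partition
#SBATCH --account=your-condo-account  # hypothetical account/allocation name

# Launch a hypothetical MPI program across the allocated nodes.
srun ./my_parallel_app
```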

SHPC Condo Storage Configuration

Lustre

Lustre is an on-premises, high-performance, parallel file system that stripes file data across many storage servers so large files can be read and written in parallel. It is provided in the following environments:

Open Lustre:

Your temporary local storage is located at: /lustre/or-scratch/group/username

Replace group with your group name, and username with your XCAMS/UCAMS ID.

Moderate Lustre:

Your temporary local storage is located at: /lustre/hydra/group/username

Replace group with your group name, and username with your XCAMS/UCAMS ID.
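Once your scratch area exists, you can work in it like any other directory. The sketch below uses the Open-enclave path; on the Moderate side, use /lustre/hydra instead. The group and username components are placeholders for your own group name and XCAMS/UCAMS ID.

```bash
# Move into your Open-enclave Lustre scratch area (placeholder path).
cd /lustre/or-scratch/group/username

# Stage job input there and check how full the Lustre file system is.
cp ~/input.deck .
lfs df -h /lustre/or-scratch
```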

Lustre Best Practices

Lustre is for files that you plan to work on immediately, such as input decks and job output; that data should be moved or deleted within days of being generated. Lustre is not for persistent storage of data or software, and it is not for building software or applications (though you may keep your executables in Lustre if they are also backed up elsewhere). If your application generates many small files or log files, compress them when you are not actively using them, and do not plan to store them on Lustre much longer than they are in active use. Lustre is not end-storage of any sort; a short example of compressing and relocating such files follows the list below.

- Lustre is best suited for large files.
- Use Lustre as fast scratch space for compute jobs.
- Delete or move files that are no longer used.
- Avoid building software in Lustre, but it is fine to run executables from Lustre.
- Avoid building or running Conda environments or containers on Lustre.
- Avoid ‘stat’ commands at all costs.
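As a concrete illustration of the compress-and-clean guidance above, the sketch below bundles a directory of small log files into a single archive. The paths and directory names are placeholders.

```bash
# Move into your scratch area (placeholder path) and bundle many small log files
# into one compressed archive; Lustre handles one large file far better than
# thousands of tiny ones.
cd /lustre/or-scratch/group/username
tar -czf run_logs.tar.gz logs/ && rm -rf logs/
```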

Lustre Purge Policy

Files that have not been used in 90 days are continuously purged from Lustre. Delete files that are no longer used and move important files to NFS.
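To see which of your files are approaching the 90-day limit, you can use Lustre's own `lfs find`, which is gentler on the metadata servers than a plain `find`. The path below is a placeholder for your scratch directory.

```bash
# List regular files in your scratch area that have not been accessed in over 90 days.
lfs find /lustre/or-scratch/group/username --type f --atime +90
```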

For well-justified cases, CADES will temporarily exempt a Lustre directory from the purge. The exemption is approved by the CADES RUC.

Use this form to request a purge exemption: https://cades.ornl.gov/special-request-forms/

Lustre is not backed up.

NFS

NFS (Network File System) is a service that allows directories and files to be shared with others over a network. Home, software, and project directories have been set up on NFS and are persistently available over the network.

Open NFS:

Moderate NFS:

📝 Note: If your needs differ from what is listed here, fill out the NFS Quota form: https://cades.ornl.gov/special-request-forms or contact us to discuss options.

Your persistent NFS storage location(s):

| Who | Location | Access |
| --- | --- | --- |
| All users | Home directory | This is where you land when you log in |
| Condo owners and special projects | /nfs/data/project-or-condo-name | `cd /nfs/data/condo_name` |
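To move finished results off Lustre scratch into persistent NFS project space, a copy like the one below is enough. Both paths use placeholder group, user, and project names.

```bash
# Copy results from Lustre scratch to the NFS project area (placeholder paths).
cp -r /lustre/or-scratch/group/username/results /nfs/data/project-or-condo-name/

# Check how much space the project area is using.
du -sh /nfs/data/project-or-condo-name
```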

Getting Started

Other pages in this guide that help you get started are:

SHPC Condo Training

Below are links to tutorials and recordings of SHPC Condo training.