DGX User Policy
Oak Ridge National Laboratory's (ORNL) Compute and Data Environment for Science (CADES) provides eligible customers with DGX resources. These resources are primarily for running GPU workloads in containers. Users should note that the software environment available on the DGXs varies significantly from that found in the Condo, and this is by design. The DGXs are configured with minimal in-OS software support, with the expectation that users will provide their software environments in their containers.
Computers, software, and communications systems provided by DGX are to be used for work associated with, and within the scope of, an approved project. The use of DGX resources for personal or non-work-related activities is strictly prohibited. All computers, networks, email, and storage systems are property of the US Government. Any misuse or unauthorized access is prohibited and is subject to criminal and civil penalties. CADES systems are provided to users without any warranty. CADES will not be held liable in the event of any system failure or data loss or corruption for any reason, including, but not limited to: negligence, malicious action, accidental loss, software errors, hardware failures, network losses, or inadequate configuration of any computing resource or ancillary system.
All CADES DGX users must comply with ORNL security rules and with the following:
- DO NOT share your credentials, passwords, private keys, or certificates with anyone.
- Treat facility staff with respect and courtesy.
- Conduct activities with the highest scientific, professional, and ethical standards.
- Users must not intentionally introduce or use malicious software such as computer viruses, Trojan horses, or worms.
- Users may not deliberately interfere with other users accessing system resources.
- Users are accountable for their actions and may be subject to applicable administrative or legal sanctions.
- Users are prohibited from taking unauthorized actions to intentionally modify or delete information or programs.
- Use DGX resources responsibly, recognizing that both staff and equipment are in high demand.
- Lead efforts to analyze and publish results in a timely manner.
Application for Resources
Access to DGX is available to ORNL research and technical staff, by request, through CADES. The request is made through the ORNL XCAMS portal and requires your UCAMS ID. An activation notice will be sent when your resources are ready for use.
DGX Software Policy
The DGXs are provided to support container workloads requiring GPUs. Users are expected to build all software in support of their containerized applications in NFS project areas or on Lustre. The DGXs will provide:
- 16 physical GPUs per DGX.
- The NVIDIA driver to run on these GPUs.
- Software like NVIDIA GPUDirect and GDRCopy for enhanced GPU operations.
- The container runtime (Singularity) to run container images.
Additionally, some helpful vendor-supplied containers are available in '/containers'. Users should feel free to browse '/containers' to determine whether any prove useful.
Users can build their containers in CADES for use with Singularity in several ways, as indicated in the CADES Container Policy. CADES is not obligated to provide software on demand for DGXs.
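As a rough illustration of running a GPU workload through the Singularity runtime, the following Python sketch assembles a `singularity exec` command line and lists whatever is present under '/containers'. The image name, bind path, and helper function names are hypothetical and chosen only for this example; the `--nv` and `--bind` options are standard Singularity flags, but the exact invocation appropriate for the DGXs may differ.

```python
import shlex
from pathlib import Path


def build_singularity_command(image, command, bind_paths=()):
    """Return the argv list for running `command` inside `image` with GPU access."""
    # --nv asks Singularity to expose the host NVIDIA driver and GPUs in the container.
    argv = ["singularity", "exec", "--nv"]
    for path in bind_paths:
        # --bind makes a host directory (e.g. a project area) visible inside the container.
        argv += ["--bind", str(path)]
    argv.append(str(image))
    argv += list(command)
    return argv


def list_images(container_dir="/containers"):
    """List entry names under the container directory, or [] if it does not exist."""
    root = Path(container_dir)
    return sorted(p.name for p in root.iterdir()) if root.is_dir() else []


if __name__ == "__main__":
    # Hypothetical image and bind path, for illustration only.
    argv = build_singularity_command(
        "/containers/example.sif", ["nvidia-smi"], bind_paths=["/path/to/project"]
    )
    print(shlex.join(argv))
```

The sketch only builds and prints the command; passing the resulting list to `subprocess.run` would actually launch the container.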
DGX Storage Policy
Persistent NFS and Lustre storage is available on the DGXs in a manner consistent with the CADES Open Research Condo. In addition, high-performance, NVMe-based local scratch storage is available for jobs requiring frequent, very fast disk I/O during execution. Users who need this capability should stage their job's data into this area at the beginning of the job and out again, to either Lustre or NFS, at the end of the job. Please note that, as with the rest of the CADES Open Research Condo, local scratch areas provide no guarantee of long-term data persistence. It is solely the responsibility of the user to copy data off to either NFS or Lustre for longer-term storage.
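The stage-in, run, stage-out pattern described above might be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name, its parameters, and the directory layout are hypothetical, and the actual local scratch location on the DGXs is not specified here, so the caller must supply it.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_local_scratch(input_dir, output_dir, scratch_root, job):
    """Stage input into local scratch, run job(staged_in, results), copy results out.

    input_dir / output_dir: persistent locations (e.g. on NFS or Lustre).
    scratch_root: the local scratch directory (assumed; supplied by the caller).
    job: a callable that reads from staged_in and writes into results.
    """
    # Create a private working directory so concurrent jobs do not collide.
    work_dir = Path(tempfile.mkdtemp(prefix="job-", dir=str(scratch_root)))
    staged_in = work_dir / "input"
    results = work_dir / "output"

    shutil.copytree(str(input_dir), str(staged_in))  # stage in
    results.mkdir()

    job(staged_in, results)  # all heavy I/O happens on local scratch

    # Stage out to persistent storage; scratch offers no long-term persistence.
    shutil.copytree(str(results), str(output_dir), dirs_exist_ok=True)

    # Clean up only after the copy succeeded, so a failed run leaves its
    # files in scratch for inspection.
    shutil.rmtree(work_dir)
    return Path(output_dir)
```

The scratch copy is deleted only after the results have been copied out successfully, which trades some scratch space after a failed run for the chance to recover partial output. `dirs_exist_ok` requires Python 3.8 or later.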