AWS S3 Overview

Amazon S3 provides pay-as-you-go, off-site, cloud-based storage with an array of tools for access control and data sharing. Common use cases include backups, hosting datasets and portals, and programmatic access to objects, and S3 can integrate into and improve many other data workflows.

Capability Highlights:

Graphical and CLI clients
Easy to use from CADES environments
Long-term archival storage via Glacier
Data tiering and aging policies
Stored objects can (if desired) be exposed over HTTPS
Host static websites from an S3 bucket
Create privately sharable pre-signed URLs with a configurable expiration time
No cost to retrieve data to AWS VMs within the same region
Support for multipart uploads and custom metadata tags
Supported by Globus Premium Connectors
Traffic is over port 443 (HTTPS)
S3 interactions from ORNL systems do not (usually) require firewall exceptions
The S3 protocol is also used by many providers and vendors of on-premises object storage; it is not limited to AWS.
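
To make the multipart upload item above more concrete, here is a minimal sketch of how a client might choose a part size. It assumes the published AWS limits for multipart uploads (parts of 5 MiB to 5 GiB, except that the last part may be smaller, and at most 10,000 parts per upload). The function name `plan_parts` and the 8 MiB preferred part size are hypothetical choices made for illustration, and other S3-compatible providers may enforce different limits.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Assumed AWS S3 multipart limits; other S3-compatible stores may differ.
MIN_PART_SIZE = 5 * MIB   # smallest allowed part (the last part may be smaller)
MAX_PART_SIZE = 5 * GIB   # largest allowed part
MAX_PARTS = 10_000        # maximum number of parts in one upload


def ceil_div(a, b):
    """Integer division that rounds up, avoiding floating-point error."""
    return -(-a // b)


def plan_parts(object_size, preferred_part_size=8 * MIB):
    """Return (part_size, part_count) for uploading object_size bytes.

    preferred_part_size is an illustrative default, not a required value.
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size when the preferred size would need too many parts.
    part_size = max(
        preferred_part_size,
        MIN_PART_SIZE,
        ceil_div(object_size, MAX_PARTS),
    )
    if part_size > MAX_PART_SIZE:
        raise ValueError("object is too large for a single multipart upload")
    return part_size, ceil_div(object_size, part_size)


if __name__ == "__main__":
    size, count = plan_parts(100 * MIB)
    print(f"100 MiB object: {count} parts of up to {size} bytes")
```

In practice most S3 clients pick the part size automatically, so a calculation like this mainly matters when tuning transfers of very large objects.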

Getting Started with S3

While the official AWS S3 Docs are extensive, the quick start guides below are designed to help ORNL scientific users become familiar with S3. Interacting with object storage differs from working with POSIX-based filesystems, and the guides here introduce S3 data workflows.
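
One difference from a POSIX filesystem is that a bucket has no real directories: each object is stored under a flat key, and "folders" are a convention that clients derive from a key prefix and a delimiter. The sketch below emulates that listing behavior in plain Python, with no AWS calls. The function `list_keys` and the sample keys are hypothetical and only illustrate the idea; they are not the S3 API itself.

```python
def list_keys(keys, prefix="", delimiter="/"):
    """Emulate a prefix/delimiter listing over a flat set of object keys.

    Returns (objects, common_prefixes): keys directly under the prefix, and
    the folder-like groupings found one level below it.
    """
    objects = []
    common_prefixes = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Keys deeper than one level collapse into a single common prefix.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Hypothetical bucket contents: every entry is a full key, not a path on disk.
bucket_keys = [
    "readme.txt",
    "runs/2023/a.csv",
    "runs/2023/b.csv",
    "runs/2024/a.csv",
]

if __name__ == "__main__":
    print(list_keys(bucket_keys))
    print(list_keys(bucket_keys, prefix="runs/"))
    print(list_keys(bucket_keys, prefix="runs/2023/"))
```

Because the hierarchy exists only in the key names, there is no directory to create before writing an object, and "renaming a folder" means copying every object under the old prefix to a new key.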