Harvard T.H. Chan School of Public Health + FASRC

Best Practices for Current Odyssey Resources and the Procurement of Future Resources

The relationship between the Harvard T.H. Chan School of Public Health and FAS Research Computing provides researchers with the high-performance computing resources necessary to achieve their scientific and academic goals. Best practices for this resource include appropriate use of the compute environment to process jobs and communication of resource needs at the onset of project development. Using the correct directory for the size of the project you're working on enables effective and efficient use of the compute environment for all users. Additionally, for FASRC to stay ahead of the increasing demand for compute resources, Harvard Chan researchers must communicate their needs to the FASRC team early and often. Researchers are encouraged to contact James Cuff, Assistant Dean for Research Computing at FAS (with a cc to Krista Coleman, Office of Research Strategy and Development), when developing their grant proposals to discuss specific computing needs. Doing so enables the FASRC team to anticipate and plan for the ever-growing computing environment at Harvard, ensuring continued support for its researchers.

Upon account request approval, each user is provided with 40GB (gigabytes) of home storage. Additionally, labs using Odyssey receive 1TB (terabyte) of storage at no cost. These initial storage allowances are built into the $3,000/FTE per year rate agreement between the Harvard Chan School and FASRC. Please see below for additional information about the compute environment.

Data Security

By default, research computing storage resources are not approved for data above University security Level 2. If you have any confidential data that you need stored, please email us or schedule an appointment. We will be happy to discuss your particular needs and design a storage solution that is compliant with Federal, State, and University regulations. We also bring to your attention the following security pages discussing security levels and policy:


Home Directory Storage (40GB): /n/home*

This is the main location used for everyday data storage, analysis, and data processing. Home storage has good performance for most tasks; however, I/O (input/output)-intensive jobs or large numbers of jobs should not be processed in home directories. Widespread computation against home directories results in poor performance for all users. For these types of tasks, one of the "scratch" file systems is better suited, as in the sketch below.
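
As a minimal sketch of this practice, a job can check for the node-local scratch partition (described later in this document) and fall back to the home directory only for light work. The paths and helper function below are illustrative, not FASRC-provided tooling.

    # Sketch: keep I/O-intensive work out of /n/home* by preferring
    # node-local /scratch when it is available. Paths are illustrative.
    import os

    HOME_DIR = os.path.expanduser("~")   # fine for everyday, light I/O
    SCRATCH_DIR = "/scratch"             # suited to heavy, temporary I/O

    def working_dir(io_intensive: bool) -> str:
        """Pick a directory appropriate to the job's I/O profile."""
        if io_intensive and os.path.isdir(SCRATCH_DIR):
            return SCRATCH_DIR
        return HOME_DIR

    print(working_dir(io_intensive=True))   # "/scratch" on a compute node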

Lab Storage (No size limit)

Each laboratory using the Odyssey cluster is granted an initial storage allocation of 1TB. These conventional disk arrays are mounted via NFS (Network File System) and can be used for a variety of purposes. Laboratories may purchase additional storage and backup space as needed. Please contact rchelp for more details.

Although lab volumes have good performance for most tasks, I/O-intensive jobs or large numbers of jobs should not be processed in these directories. As with home directory storage, widespread computation against these directories would result in poor performance for all users. For these types of tasks, one of the "scratch" file systems is better suited.

Local (per node), Shared Scratch Storage (270GB total per node): /scratch

Each node contains a /scratch disk partition. The /scratch volumes are directly attached (and therefore fast) temporary storage for large files created while an application is running. Unlike home directories, which live on network storage, /scratch volumes are local to the compute nodes. If you can direct your application to use /scratch for temporary files, you can gain significant performance improvements and ensure that large files can be supported.

/scratch should only be used for temporary files written and removed during the running of a process. Although a 'scratch cleaner' does run daily, we ask that at the end of your job you delete the files that you've created.
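
A minimal sketch of this pattern, assuming a writable /scratch partition on the node: Python's tempfile.TemporaryDirectory removes everything it created when the job finishes, so nothing is left behind for the daily scratch cleaner.

    # Sketch: write large intermediates to node-local /scratch and
    # delete them automatically when the job ends.
    import os
    import tempfile

    def run_job() -> None:
        with tempfile.TemporaryDirectory(dir="/scratch") as tmp:
            intermediate = os.path.join(tmp, "intermediate.dat")
            with open(intermediate, "wb") as f:
                f.write(b"\0" * 1024 * 1024)   # stand-in for real output
            # ... process the intermediate file here ...
        # tmp and its contents are removed once the with-block exits

    if __name__ == "__main__":
        run_job()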

Networked, High-performance Shared (Scratch) Storage (1.2PB total): /n/regal

/n/regal is additional short-term, shared space for large data analysis projects that would fill a user's 40GB of home space. Once analysis has been completed, however, any data you wish to retain must be moved elsewhere; the retention policy will remove data from scratch storage after 90 days.
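
For example, once an analysis completes, results can be moved from /n/regal to lab storage before the 90-day retention window closes. Both paths below are hypothetical placeholders for your own group's directories.

    # Sketch: relocate finished results off the /n/regal scratch volume
    # (90-day retention) to longer-term lab storage. Paths are hypothetical.
    import shutil

    SCRATCH_RESULTS = "/n/regal/example_lab/project/results"
    LAB_RESULTS = "/n/example_lab/project/results"

    # shutil.move copies across filesystems and then removes the source,
    # freeing the shared scratch space for other users.
    shutil.move(SCRATCH_RESULTS, LAB_RESULTS)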

Custom Storage (1TB+):

In addition to the storage tiers listed above, Research Computing hosts a number of group-owned, custom storage systems. Storage sizes and specifications are proposed by RC and paid for by specific user groups; the systems are housed and maintained by Research Computing and integrated into the existing infrastructure like any other system.

These systems range from dedicated group storage with backups, to scratch-style systems, to dedicated parallel systems. They are often designed for very specific application or instrument requirements, or for cases where the cost model of our shared storage no longer makes sense for the amount of storage desired. Researchers whose computing needs exceed the baseline storage allowances should contact rchelp to discuss purchasing additional hardware, software, and storage to accommodate special computing needs.

Cost

Basic fixed-price backed-up storage is $900/TB per three years. For quotes on custom storage systems, please contact rchelp.
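
As a worked example at that rate, 5TB of backed-up storage would cost 5 × $900 = $4,500 for the three-year term, or $1,500 per year ($300/TB per year).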

Billing requirements: a 33-digit billing code, unit details, and a description of the storage (for the invoice line item) from the PI or their faculty/financial admin. For more information, visit