Using UVA’s High-Performance Computing Systems
Afton is the University of Virginia’s newest High-Performance Computing system. The Afton supercomputer comprises 300 compute nodes, each with 96 compute cores based on the AMD EPYC 9454 architecture, for a total of 28,800 cores. The increase in core count is accompanied by a significant increase in memory per node compared to Rivanna. Each Afton node has at least 750 gigabytes of memory, with some supporting up to 1.5 terabytes of RAM. The large amount of memory per node allows researchers to work efficiently with the ever-expanding datasets seen across diverse research disciplines. The Afton and Rivanna systems provide access to 55 nodes with NVIDIA general-purpose GPU accelerators (RTX2080, RTX3090, A6000, V100, A40, and A100), including an NVIDIA BasePOD.
Allocations
Time on Rivanna/Afton is allocated as Service Units (SUs). One SU corresponds to one core-hour. Multiple SUs make up what is called an allocation (e.g., a new allocation = 1M SUs). Allocations are managed through Grouper groups (access to Grouper requires a VPN connection), which Principal Investigators (PIs) should create before submitting an allocation request.
Eligibility and Account Creation
All UVA faculty are eligible to serve as PI and request access to RC services (e.g., storage and HPC allocations, virtual machines, microservices) for their research group. Postdocs and staff are encouraged to use an allocation provided by a faculty sponsor, although they may request their own allocation pending departmental or RC approval.
Pricing
Below is a schedule of prices for Research Computing resources.
High Performance Computing Allocations
Type | SU Limits | Cost | SU Expiration
Standard | None | Free | 12 months
Purchased | None | $0.01 | Never
Instructional | 100,000 | Free | 2 weeks after last training session
A service unit (SU) represents usage of a trackable hardware resource for a specified amount of time. In its simplest form, 1 SU = 1 core-hour, but the SU charge rate can vary based on the specific hardware used. Resources like GPUs and memory may incur additional SU charges.
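As a rough illustration of the SU arithmetic, the sketch below estimates the charge for a job in the simple case where 1 SU = 1 core-hour. The function name `service_units` and the `rate` multiplier are hypothetical stand-ins for hardware-specific charge rates, which are not specified here, and the dollar figure assumes the listed $0.01 for purchased allocations is a per-SU price.

```python
def service_units(cores: int, hours: float, rate: float = 1.0) -> float:
    """Estimate the SUs charged for a job.

    `rate` is a hypothetical multiplier for hardware-specific charge
    rates; 1.0 is the simple case of 1 SU = 1 core-hour.
    """
    if cores < 0 or hours < 0 or rate < 0:
        raise ValueError("cores, hours, and rate must be non-negative")
    return cores * hours * rate


# Assumption: the $0.01 cost listed for purchased allocations is per SU.
PRICE_PER_PURCHASED_SU = 0.01

# One full 96-core Afton node for 10 hours.
job_sus = service_units(cores=96, hours=10)
print(f"SUs charged: {job_sus:.0f}")
print(f"Cost if purchased: ${job_sus * PRICE_PER_PURCHASED_SU:.2f}")
```

Actual charges may differ when GPUs or large memory requests are involved, since those can carry additional SU charges.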
Storage
Name | Security | Cost
Research Project | Standard | $70/TB per year
Research Standard | Standard | $45/TB per year (each PI with an RC account will be granted up to 10 TB of Research Standard Storage at no charge)
High-Security Research Standard | High | $45/TB per year
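The storage prices can be turned into a quick annual estimate. The sketch below is only an illustration: the function name `annual_storage_cost` is hypothetical, and it assumes the 10 TB no-charge allotment is simply subtracted from the billed Research Standard capacity and does not apply to the other tiers.

```python
# Prices in dollars per TB per year, taken from the storage schedule.
RATES_PER_TB_YEAR = {
    "Research Project": 70,
    "Research Standard": 45,
    "High-Security Research Standard": 45,
}

# Assumption: the free allotment offsets Research Standard usage only.
FREE_RESEARCH_STANDARD_TB = 10


def annual_storage_cost(tier: str, terabytes: float) -> float:
    """Estimate the yearly cost of leasing `terabytes` in the given tier."""
    if terabytes < 0:
        raise ValueError("terabytes must be non-negative")
    rate = RATES_PER_TB_YEAR[tier]
    billable = terabytes
    if tier == "Research Standard":
        billable = max(0, terabytes - FREE_RESEARCH_STANDARD_TB)
    return billable * rate


print(annual_storage_cost("Research Standard", 25))  # 15 billable TB at $45
print(annual_storage_cost("Research Project", 5))    # 5 TB at $70
```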
ACCORD: Jupyter Lab
Jupyter Lab allows for interactive, notebook-based analysis of data. It is a good choice for getting quick results or refining your code in numerous languages, including Python, R, Julia, and bash.
Learn more about Jupyter Lab
ACCORD: RStudio
RStudio is the standard IDE for research using the R programming language.
Learn more about RStudio
ACCORD: Theia IDE
Theia Python is a rich IDE that allows researchers to manage their files and data, write code with an intelligent editor, and execute code within a terminal session.
Learn more about the Theia Python IDE
FastX Web Portal
Overview
FastX is a commercial solution that enables users to start an X11 desktop environment on a remote system. It is available on the UVA HPC frontends. Using it is equivalent to logging in at the console of the frontend.
Using FastX for the Web
We recommend that most users access FastX through its Web interface. To connect, point a browser to:
https://fastx.hpc.virginia.edu
Off Campus?
Connecting to the Rivanna and Afton HPC systems from off Grounds via Secure Shell Access (SSH) or FastX requires a VPN connection. We recommend using the UVA More Secure Network if available. The UVA Anywhere VPN can be used if the UVA More Secure Network is not available.
Open OnDemand
Overview
Open OnDemand is a graphical user interface that allows access to UVA HPC via a web browser. Within the Open OnDemand environment, users have access to a file explorer; interactive applications like JupyterLab, RStudio Server, and FastX Web; a command line interface; and a job composer and job monitor.
Logging in to UVA HPC
The HPC system is accessible through the Open OnDemand web client at https://ood.hpc.virginia.edu. Your login is your UVA computing ID and your password is your Netbadge password. Some services, such as FastX Web, require the Eservices password. If you do not know your Eservices password, you must change it through ITS by changing your Netbadge password (see instructions).
Open OnDemand: File Explorer
Open OnDemand provides an integrated file explorer to browse and manage small files. Rivanna and Afton have multiple locations to store your files, each with different limits and policies. Specifically, each user has a relatively small amount of permanent storage in their home directory and a large amount of temporary storage (/scratch) where large data sets can be staged for job processing. Researchers can also lease storage that is accessible on Rivanna. Contact Research Computing or visit the storage website for more information.
The file explorer provides these basic functions:
Renaming of files
Viewing of text and small image files
Editing text files
Downloading & uploading small files
To see the storage locations that you have access to from within Open OnDemand, click on the Files menu.
Open OnDemand: Job Composer
Open OnDemand allows you to submit Slurm jobs to the cluster without using shell commands.
The job composer simplifies the process of:
Creating a script
Submitting a job
Downloading results
Submitting Jobs
We will describe creating a job from a template provided by the system.
Open the Job Composer tab from the Open OnDemand Dashboard.
Go to the New Job tab and from the dropdown, select From Template. You can choose the default template or you can select from the list.
Click on Create New Job. You will need to edit the file that pops up, so click the light blue Open Editor button at the bottom.
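The file opened in the editor is a Slurm batch script. As a hedged sketch of what such a script typically contains, the snippet below assembles a minimal one from standard `#SBATCH` options. The helper `render_job_script` is hypothetical, and the job name, allocation name (`my_allocation`), partition name (`standard`), and command are placeholders, not values taken from the system's templates.

```python
def render_job_script(job_name: str, account: str, partition: str,
                      ntasks: int, time_limit: str, command: str) -> str:
    """Return the text of a minimal Slurm batch script."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --account={account}",      # allocation to charge SUs to
        f"#SBATCH --partition={partition}",  # queue to run in
        f"#SBATCH --ntasks={ntasks}",        # number of tasks
        f"#SBATCH --time={time_limit}",      # wall-time limit, HH:MM:SS
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


# All argument values here are placeholders for illustration.
script = render_job_script(
    job_name="hello",
    account="my_allocation",
    partition="standard",
    ntasks=1,
    time_limit="00:10:00",
    command="echo 'Hello from a compute node'",
)
print(script)
```

In the Job Composer you would edit the equivalent lines of the template directly; the point here is only to show which parts of the script usually need changing.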
Slurm Job Manager
Overview
UVA HPC is a multi-user, managed environment. It is divided into login nodes (also called frontends), which are directly accessible by users, and compute nodes, which must be accessed through the resource manager. Users prepare their computational workloads, called jobs, on the login nodes and submit them to the job controller, a component of the resource manager that runs on login nodes and is responsible for scheduling jobs and monitoring the status of the compute nodes.
We use Slurm, an open-source tool that manages jobs for Linux clusters.
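For users who script their submissions, the sketch below shows one way to submit a job from a login node and capture its job ID. It relies on Slurm's `sbatch` command, which by default reports `Submitted batch job <id>` on success. The helper names `parse_job_id` and `submit`, and the script path in the usage comment, are hypothetical.

```python
import re
import shutil
import subprocess


def parse_job_id(sbatch_output: str) -> int:
    """Extract the job ID from sbatch's confirmation message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"unexpected sbatch output: {sbatch_output!r}")
    return int(match.group(1))


def submit(script_path: str) -> int:
    """Submit a batch script with sbatch and return the job ID."""
    if shutil.which("sbatch") is None:
        raise RuntimeError("sbatch not found; run this on a login node")
    result = subprocess.run(
        ["sbatch", script_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if sbatch rejects the job
    )
    return parse_job_id(result.stdout)


# Usage on a login node (path is a placeholder):
# job_id = submit("hello.slurm")
```

Parsing is kept separate from submission so that the parsing step can be checked without access to the cluster.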