Applying for a grant or publishing a paper? The information below about our capabilities and services may be helpful to your endeavor.
If you use UAHPC, CHPC, or UA’s ArcGIS platform in your research or creative efforts, we would appreciate you adding the following citation in your Acknowledgement section:
UAHPC or CHPC
“We would like to thank The University of Alabama and the Office of Information Technology for providing high-performance computing resources and support that have contributed to these research results.”
UA’s ArcGIS Platform
“We would like to thank The University of Alabama and the Office of Information Technology for providing GIS resources and support that have contributed to these research results.”
UA’s Facilities Plan
The University of Alabama’s (UA) Office of Information Technology (OIT) manages two primary high-performance computing (HPC) clusters: UAHPC and CHPC. OIT hosts the clusters at the DC BLOX Tier III data center in Birmingham, Alabama. OIT also supports researchers by providing project data storage, open-source and licensed HPC software, a high-bandwidth science network, connectivity to national research networks, and dedicated support personnel.
High Performance Computing (HPC) Clusters
OIT manages two HPC clusters: UAHPC and CHPC. UAHPC follows a “condo model,” in which individual researchers purchase nodes and share unused cycles with other researchers while retaining priority use of their purchased nodes. Currently, UAHPC has 1664 cores across 54 nodes, providing over 56 TFLOPS of theoretical sustained double-precision performance. The cluster includes GPU nodes and five high-memory nodes. Users are limited to 1500 jobs within a 24-hour period on UAHPC. All UAHPC M-series nodes are connected internally within their Dell M1000e chassis via InfiniBand FDR10 at a throughput of 40 Gbps. C6420 and R-series nodes are directly connected to external InfiniBand. All chassis are interconnected through a pair of external InfiniBand switches at 56 Gbps or 100 Gbps (2:1 oversubscribed). Storage is shared between nodes using NFS over IPoIB.
CHPC, funded through an NSF CC* grant, is available to all researchers on an equal basis. CHPC has 2112 cores across 34 nodes. In addition to 25 compute nodes, there are four high-memory nodes, each with two 24-core AMD EPYC 7352 CPUs and 2 TB of memory, and five Dell PowerEdge R7525 GPU nodes, each with one NVIDIA A100 GPU and room for two additional GPUs. CHPC has 80 TFLOPS of theoretical sustained performance. If used for machine learning applications, the cluster is capable of 3.2 PFLOPS tensor double-precision with sparsity. Twenty percent of unused cycles on CHPC are shared via the Open Science Grid (OSG). The clusters attach independently to a shared Panasas Ultra storage array with 750 TB of usable storage. Technical specifications for each cluster are found in Table 1 and Table 2. All CHPC nodes are connected internally within their Dell chassis via InfiniBand FDR10 at a throughput of 40 Gbps. All nodes are directly connected to external InfiniBand. All chassis are interconnected through a pair of external InfiniBand switches at 56 Gbps or 100 Gbps (2:1 oversubscribed). Storage is shared between nodes using NFS over IPoIB.
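As a sanity check on the quoted CHPC figure, the theoretical peak can be estimated from core count and clock speed. The sketch below assumes 16 double-precision FLOPs per core per cycle (two 256-bit FMA units on AMD EPYC Zen 2) — an assumption not stated in the text.

```python
# Rough estimate of CHPC theoretical peak double-precision performance.
# Assumption (not from the document): AMD EPYC 7532 (Zen 2) retires
# 16 double-precision FLOPs per core per cycle (2 x 256-bit FMA units).
cores = 2112          # total CHPC cores
clock_ghz = 2.4       # base clock of the EPYC 7532
flops_per_cycle = 16  # assumed DP FLOPs per core per cycle

peak_tflops = cores * clock_ghz * flops_per_cycle / 1000
print(f"Estimated peak: {peak_tflops:.1f} TFLOPS")  # prints "Estimated peak: 81.1 TFLOPS"
```

The estimate lands close to the quoted 80 TFLOPS, suggesting the figure is a base-clock peak rather than a measured sustained number.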
Table 1. UAHPC Technical Specifications
| Nodes | Cores/Node | CPU | Memory/Node | QoS (q) / Partition (p) |
| --- | --- | --- | --- | --- |
| 11 | 24 | 2 x 2.6 GHz Intel Xeon Gold 6126 | 96 GB | Heterogeneous Compute |
| 11 | 16 | 2 x 2.6 GHz Intel Xeon E5-2640 v3 | 64 GB | Heterogeneous Compute |
| 4 | 20 | 2 x 2.4 GHz Intel Xeon E5-2640 v4 | 96 GB | Heterogeneous Compute |
| 4 | 16 | 2 x 2.0 GHz Intel Xeon E5-2650 | 192 GB | Heterogeneous Compute |
| 3 | 16 | 2 x 2.0 GHz Intel Xeon E5-2640 v2 | 64 GB | Heterogeneous Compute |
| 3 | 36 | 2 x 2.6 GHz Intel Xeon Gold 6240 | 192 GB | Heterogeneous Compute |
| 3 | 48 | 2 x 2.1 GHz Intel Xeon Platinum 8160 | 192 GB | Heterogeneous Compute |
| 2 | 32 | 2 x 2.5 GHz AMD Opteron 6380 | 128 GB | Heterogeneous Compute |
| 2 | 64 | 4 x 2.4 GHz AMD Opteron 6378 | 256 GB | Heterogeneous Compute |
| 2 | 16 | 2 x 2.0 GHz Intel Xeon E5-2640 v2 | 384 GB | High Memory |
| 2 | 40 | 2 x 3.2 GHz Intel Xeon E7-8891 v2 | 2688 GB | High Memory |
| 2 | 128 | 2 x 2.3 GHz AMD EPYC 7742 | 1024 GB | Heterogeneous Compute |
| 1 | 20 | 2 x 2.4 GHz Intel Xeon E5-2640 v4 | 128 GB | High Memory |
| 1 | 48 | 2 x 2.6 GHz Intel Xeon E7-4860 v2 | 1024 GB | ? |
| 1 | 88 | 2 x 2.2 GHz Intel Xeon E7-4669 v4 | 1024 GB | ? |
| 1 | 24 | 2 x 2.1 GHz Intel Xeon Silver 4116 | 96 GB | GPU node with Tesla V100 |
- Compute Nodes: 1664 cores across 54 Dell PowerEdge M620, M630, M640, C6420, R715, R740xd, R7525, R815, R830, and R920 nodes.
- Compute nodes have from 16 to 128 cores per node, with roughly 4 GB of memory per core on most nodes
- There are five high-memory nodes (384 GB up to 2688 GB)
- Head Node: Compute nodes are controlled by a Dell PowerEdge M830 head node.
- The head node has two 10-core processors and 3 TB of 15,000 RPM 6 Gbps SAS hard drive capacity for sharing applications and home directories across the cluster. These directories will be migrated to the new Panasas Ultra storage array in February 2021.
- Data Storage
- Two dedicated storage nodes are connected via PERC H700 or H810 controllers, offering a total of 100 TB of storage across five Dell PowerVault MD1200 enclosures
- An additional 20 TB of internal disks resides in the second storage node.
- Storage nodes are all Globus endpoints
Table 2. CHPC Technical Specifications
| Nodes | Cores/Node | CPU | Memory/Node | Type |
| --- | --- | --- | --- | --- |
| 25 | 64 | 2 x 2.4 GHz AMD EPYC 7532 | 250 GB | Dell R6525 Compute |
| 5 | 64 | 2 x 2.4 GHz AMD EPYC 7532 | 250 GB | Dell R7525 GPU (w/ 1 NVIDIA A100 each) |
| 4 | 48 | 2 x 2.3 GHz AMD EPYC 7352 | 2048 GB | Dell R7525 High Memory |
- Head Node: Compute nodes are controlled by two Dell PowerEdge R6525 head nodes.
- The head nodes have two 32-core AMD processors and 480 GB SATA SSDs for sharing applications and home directories across the cluster.
- Data Storage
- CHPC nodes have four 8 TB internal hard drives, providing 16 TB of usable storage
For UAHPC and CHPC, the following software is supported: Rocks 7 (deprovisioned), CentOS 7.4 (UAHPC), CentOS 7.9 (CHPC), SLURM 18.08 (UAHPC), SLURM 19.05 (CHPC), a 2-seat license for Intel Parallel Studio XE for Linux, the NVIDIA HPC Software Development Kit (SDK), MATLAB, Anaconda, Apache Spark (CHPC), VASP, Quantum ESPRESSO, and other licensed and open-source software.
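Since both clusters schedule work through SLURM, a minimal batch script illustrates typical usage. The partition, module, and script names below are illustrative assumptions, not the clusters’ actual configuration; the real names are site-specific (check `sinfo` and `module avail` on the login node).

```shell
#!/bin/bash
# Minimal SLURM batch script sketch for UAHPC/CHPC.
# NOTE: partition, module, and script names are placeholders, not
# the clusters' actual configuration.
#SBATCH --job-name=example
#SBATCH --nodes=1
#SBATCH --ntasks=8             # 8 tasks on one node
#SBATCH --mem=32G              # memory for the whole job
#SBATCH --time=04:00:00        # 4-hour wall-clock limit
#SBATCH --partition=main       # placeholder partition name
#SBATCH --output=%x-%j.out     # job name and job ID in the log file name

module load anaconda           # placeholder module name
srun python my_analysis.py     # hypothetical user script
```

The script would be submitted with `sbatch script.sh`; note that UAHPC enforces a limit of 1500 jobs per user within a 24-hour period.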
Research Data – HPC Attached
UAHPC and CHPC share a Panasas Ultra storage array providing 622 TB of usable storage. OIT provisions users 50 GB of storage in their /home/$USER/ directory and 500 GB of longer-term storage in /bighome/$USER/. Researchers can purchase additional storage on the Panasas Ultra at $50/TB/year.
General Research Data Storage
For research data storage requests of 100 TB or less not requiring direct HPC connectivity, researchers can purchase storage on UA’s enterprise storage array at $120/TB/year. With this service, researchers receive a free replicated backup of their data to UA’s off-site DR center in Atlanta. OIT is currently configuring a 700 TB BeeGFS storage array to provide a lower-cost option for commodity storage with no replicated backup. For storage requests greater than 100 TB, OIT can help the researcher engage Microsoft to purchase storage in UA’s Azure cloud environment.
For research projects with CUI (CTI) data requirements, the university offers UAResCloud, a Microsoft Azure GovCloud environment supported through UA’s Office of Grants and Contracts.
OIT hosts the UAHPC and CHPC clusters at the DC BLOX colocation facility in Birmingham, Alabama. DC BLOX is an Uptime Institute Tier III-rated data center offering highly available and secure cyberinfrastructure hosting.
For single nodes or small cluster hosting, OIT manages the university’s primary 5,000 sq. ft. secure data center located at Gordon Palmer Hall on the UA campus (GPDC). GPDC hosts over 400 physical devices and 1500 virtual machines. The data center was refitted in 2022 to provide redundant chillers and power circuits, hot-aisle containment for an energy-efficient design, and a Siemens Sinorix 1230 fire suppression system. Power is distributed through redundant overhead busways and managed by a 250 kW UPS and a backup 300 kW UPS. The data center also has a 1,000-gallon backup generator providing 580 amps of power and 24 hours of runtime. OIT maintains a secondary campus data center for business continuity and disaster recovery in the Ridgecrest parking garage. Power in the Ridgecrest data center is distributed through overhead busways and managed by a 100 kW UPS with generator backup. Enterprise data is replicated to OIT’s storage systems in the Peachtree data center in Atlanta for geo-diversity.
SciNet Research Network
Researchers requiring network bandwidth management for large data transfers can utilize UA’s SciNet. This campus network connects data-intensive science areas and HPC/research storage via dedicated switches providing 32 × 10 Gbps and 4 × 40 Gbps ports in targeted buildings. Specially configured data transfer nodes (DTNs) and high-performance storage are hosted in the research facilities, at DC BLOX, or in the UA primary data center. SciNet includes a robust performance monitoring and problem-resolution component using perfSONAR.
Outside of UA, SciNet traverses the University of Alabama System Regional Optical Network’s (UAS-RON) 100 Gbps backbone to the DC BLOX hosting facility. UAS-RON also connects to the public Internet and Internet2 via Southern Crossroads (SOX) in Atlanta. In addition to on-campus shared resources, researchers have access to resources at the state and national level, such as the Alabama Supercomputer Center (ASC), XSEDE, and the Open Science Grid (OSG).
UA’s Office of Information Technology offers a high-performance computing support team for HPC management, software installation, workflow support, research consultation, documentation and training, and other research community requests. Further information and requests for accounts can be found at https://hpc.ua.edu/research-computing-support/
Other UA Technology Capabilities
The Office of Information Technology provides wired and wireless internet connections to all buildings on the UA campus. The current network bandwidth is 40 Gbps. Eduroam is the official wireless network for UA. Students, faculty, and staff may use their ua.edu ID to connect to Eduroam networks at other Eduroam institutions.
All buildings on the UA campus have a redundant connection to the core network infrastructure. All core network equipment exists in redundant pairs. UA also provides multiple external connections to the internet through different ISPs offering diverse routes. Redundant, physically diverse network connections and appliances on campus and out to the internet reduce the risk of network outage.
UA data stored securely on OIT’s central storage platform is backed up on a customer-defined schedule to UA’s off-site storage platform located in Atlanta. OIT is able to recover customer data from this site. OIT also offers research storage space for data that requires quick access to the high-performance research cluster. A research file share for additional long-term storage is available for purchase by the GB. More information is available on the Research Data Storage webpage.
Students, faculty, and staff can download many software packages at no cost. OIT maintains a software catalog that lists all software available for download. Microsoft Office, Qualtrics, RefWorks and MATLAB are just a few of the software offerings available. OIT is working to increase the number of enterprise-grade software packages that are offered to the campus.
Labs and Classrooms
OIT supports instructional technologies in classrooms and labs across campus. The Center for Instructional Technology within OIT provides training on instructional technologies, and the audio/visual and desktop support teams work to ensure that all classrooms and labs have necessary programs. Many colleges maintain their own labs, with OIT providing the necessary software.
To establish secure connectivity to the network, UA uses a diverse range of security appliances, including multiple redundant pairs of firewalls, packet-shaping devices, and intrusion-detection systems. The OIT security team also offers several tools directly to students, faculty, and staff. On-campus users can download McAfee antivirus software, the Duo two-factor authentication app, and the LastPass password management tool to ensure device and account security.
UA uses Splunk to search, monitor and analyze log data from network, system, and security appliance sources. Splunk provides continual security auditing of critical UA systems. UA also uses a range of other monitoring and alerting software and business processes to provide continual auditing of system and network performance and availability. Continual auditing from all sources reduces the risk of outage and maximizes service availability and security.
OIT’s Director of Compliance and Risk Assessment evaluates research projects and IT purchases for risk and compliance to data classification requirements.
If you have any additional questions about other OIT capabilities for your grant proposal, please contact Deputy CIO Mike Shelton.
UA High-Performance Computing Clusters (UAHPC/CHPC)
Please visit the “HPC details” section of the Research Computing Support page for more information.