This is a repository of event and performance data for scientific code execution jobs submitted to Purdue University's Conte cluster between March 2015 and June 2017, and to the University of Texas at Austin's Stampede 1 cluster between 2013 and 2016.
The Conte cluster comprises 580 nodes totaling 9280 cores, connected by 40 Gbps InfiniBand interconnects. Each node in the cluster has 64 GB of RAM and includes two additional 60-core Xeon Phi accelerators. The repository contains data for 10.8M jobs run on Conte over the 28-month period between March 2015 and June 2017.
The Stampede 1 cluster, at the time of its decommissioning, consisted of 6400 nodes with a total of 522,080 processing cores. The repository contains data for 8.7M jobs run on Stampede 1 during the 2013 to 2016 period.
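As a quick sanity check on the Conte figures above, the following minimal sketch derives cores per node and the average number of jobs per month from the quoted totals. The constant and function names are illustrative and not part of the repository. The job count is the rounded 10.8M figure, so the monthly rate is only approximate.

```python
# Figures quoted in the cluster description; names are illustrative only.
CONTE_NODES = 580
CONTE_CORES = 9280
CONTE_JOBS = 10_800_000   # rounded "10.8M" figure from the text
CONTE_MONTHS = 28         # March 2015 through June 2017


def cores_per_node(total_cores: int, nodes: int) -> float:
    """Average number of cores per node."""
    return total_cores / nodes


def jobs_per_month(total_jobs: int, months: int) -> float:
    """Average number of jobs recorded per month."""
    return total_jobs / months


conte_cores_per_node = cores_per_node(CONTE_CORES, CONTE_NODES)
conte_jobs_per_month = jobs_per_month(CONTE_JOBS, CONTE_MONTHS)

print(f"Conte cores per node: {conte_cores_per_node:.0f}")
print(f"Conte jobs per month (approx.): {conte_jobs_per_month:,.0f}")
```

No equivalent monthly rate is computed for Stampede 1 because the text gives its collection period only to the year.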
Accessing the repository
You can browse and download individual datasets from this repository by visiting the links under the Data Sets menus, or use Globus to download the entire data repository.
Instructions on using Globus can be found in the Documentation.
How to cite this dataset:
Saurabh Bagchi, Todd Evans, Rakesh Kumar, Rajesh Kalyanam, Stephen Harrell, Carolyn Ellis, Carol Song, "FRESCO: Job failure and performance data repository from Purdue University", March 2018. At: https://www.datadepot.rcac.purdue.edu/sbagchi/fresco