Expanding Computing Power to Support Research Needs While Being Environmentally Sensitive and Energy Efficient

Thursday, December 16, 2010 - 7:00pm
James Cuff, Harvard University

In the past four and a half years, Harvard University's research computing resources have grown from 200 to over 12,000 processing cores, putting significant strain on traditional data center resources and the wide-area networking infrastructure available within the Cambridge campus. I will discuss the tactics for building both the organizational and physical infrastructure that now supports over 2,000 researchers in fields as diverse as astrophysical modeling of the early universe, high-speed genomic sequencing whose data output more than doubles each year, the search for the Higgs boson, and advanced economic and financial modeling. This research involves large amounts of data and algorithms that do not always scale well. (Some of the underlying problems are NP-complete.) Economies can be achieved by using a shared physical infrastructure operated by a core team of research computing associates and staff. In this context the research computing group has deployed approximately 2 PB of assorted storage and 40 TF of GPGPU computing to support and complement the traditional 12,000-core, InfiniBand-connected x86_64 systems. I will also explain the now obvious need for Harvard's active involvement in the new multi-institutional Massachusetts Green High Performance Computing Center (MGHPCC).

This is a joint meeting of the GBC/ACM and the Boston Chapter of the IEEE Computer Society.

James Cuff is Director of Research Computing and Chief Technology Architect at Harvard. He was appointed Director of Research Computing for the Faculty of Arts and Sciences in 2007, having previously directed Research Computing for the Life Sciences Division. In 2003 he moved from the UK to the Broad Institute of Harvard and MIT. As Group Leader for Applied Production Systems, he managed high-performance technical computing alongside large-scale storage and relational database systems. Previously he held a position at the Wellcome Trust Sanger Institute as Group Leader for the Informatics Systems Group. There he built the large-scale high-performance computing infrastructure to support the Ensembl genome analysis project. Prior to the position at the Sanger Institute, James worked at Inpharmatica in London and the European Bioinformatics Institute in Cambridge. In those positions he focused on using high-performance computing to study genome sequences and designed ab initio algorithms for protein secondary structure prediction. James holds a D.Phil. in Molecular Biophysics from Oxford University and a B.Sc. (Hons) in Chemistry with Industrial Experience from Manchester University. Within FAS IT, James has overall responsibility for High Performance Technical Computing, Life Sciences and Computational Biology, Social Sciences Research Computing, and Physical Sciences Research Computing.