
The Other HPC: High-Productivity Computing in Polystore Environments


Thursday, March 10, 2016 - 11:00am


Bill Howe


Christophe Giraud-Carrier

11:00am  1170 TMCB

There has been a "Cambrian explosion" of big data systems proposed and evaluated in the last decade, but there is relatively little understanding of how these systems or their capabilities compare and interact. In both enterprise and scientific settings, "one size is unlikely to fit all": we see analytics teams operating multiple systems simultaneously and struggling to manage the heterogeneous ecosystems that result.

The light at the end of the tunnel is that, despite the diversity of systems, there has been some convergence in data models, programming models, and algorithmic techniques. We see an opportunity to provide common interfaces across these polystore environments to make them easier to compare, easier to use together, and easier to assemble into high-performance workflows by exploiting specialization opportunities. In particular, we see a role for these polystore environments in bridging the gap between specialized high-performance computing approaches and the more general high-throughput computing favored in the enterprise, allowing different kinds of platforms and design alternatives to co-exist in a single ecosystem.
I'll describe the Myria system we are building at the University of Washington to achieve these goals. We provide a common programming algebra and compile this intermediate representation to multiple back-end systems, including our own back end called MyriaX, dataflow engines like Spark, and a partitioned global address space (PGAS) model designed for use on high-performance clusters.
I'll also describe a family of algorithms for graph analytics we are exploring for use with Myria as a challenge problem.
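
For readers unfamiliar with the "one algebra, many back ends" idea mentioned in the abstract, the following is a minimal toy sketch in Python. It is not Myria's API or intermediate representation: the operator classes (Scan, Select, Project), the two functions to_sql and run_in_memory, and the edges table are all hypothetical names chosen for illustration. It only shows how a single plan object might be handed either to a compiler that emits SQL text or to a simple in-memory evaluator.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical toy algebra: three relational operators as plain data.
@dataclass(frozen=True)
class Scan:
    relation: str


@dataclass(frozen=True)
class Select:
    child: "Plan"
    column: str
    value: object


@dataclass(frozen=True)
class Project:
    child: "Plan"
    columns: tuple


Plan = Union[Scan, Select, Project]


def to_sql(plan):
    """Illustrative back end 1: compile a plan to a nested SQL string."""
    if isinstance(plan, Scan):
        return f"SELECT * FROM {plan.relation}"
    if isinstance(plan, Select):
        inner = to_sql(plan.child)
        return f"SELECT * FROM ({inner}) AS t WHERE {plan.column} = {plan.value!r}"
    if isinstance(plan, Project):
        cols = ", ".join(plan.columns)
        return f"SELECT {cols} FROM ({to_sql(plan.child)}) AS t"
    raise TypeError(f"unknown operator: {plan!r}")


def run_in_memory(plan, tables):
    """Illustrative back end 2: evaluate the same plan over lists of dicts."""
    if isinstance(plan, Scan):
        return list(tables[plan.relation])
    if isinstance(plan, Select):
        rows = run_in_memory(plan.child, tables)
        return [r for r in rows if r[plan.column] == plan.value]
    if isinstance(plan, Project):
        rows = run_in_memory(plan.child, tables)
        return [{c: r[c] for c in plan.columns} for r in rows]
    raise TypeError(f"unknown operator: {plan!r}")


# One plan, two targets: the neighbors of vertex 1 in a small edge table.
plan = Project(Select(Scan("edges"), "src", 1), ("dst",))
tables = {"edges": [{"src": 1, "dst": 2}, {"src": 2, "dst": 3}, {"src": 1, "dst": 3}]}

print(to_sql(plan))
print(run_in_memory(plan, tables))
```

Because the plan is plain data, adding another target in this sketch would mean writing one more function over the same three operators, which is the property a shared algebra is meant to provide.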


Bill Howe is the Associate Director of the UW eScience Institute and an Affiliate Associate Professor in Computer Science & Engineering. His research interests are in data management, curation, analytics, and visualization in the sciences. Howe has received two Jim Gray Seed Grant awards from Microsoft Research for work on managing environmental data, has had two papers selected for VLDB Journal's "Best of Conference" issues (2004 and 2010), and co-authored what are currently the most-cited papers from both VLDB 2010 and SIGMOD 2012. Howe serves on the program and organizing committees for a number of conferences in the area of databases and scientific data management, and developed a first MOOC on data science that attracted over 200,000 students across two offerings. He has a Ph.D. in Computer Science from Portland State University and a Bachelor's degree in Industrial & Systems Engineering from Georgia Tech.