Spark in the Dark 101

This is an introductory session on Apache Spark, a framework for large-scale data processing. We will introduce high-level concepts around Spark, including how Spark execution works and its relationship to other technologies for working with Big Data. Following this introduction to the theory and background, we will walk workshop participants through hands-on usage of spark-shell, Zeppelin notebooks, and Spark SQL for processing library data. The workshop will wrap up with use cases and demos for leveraging Spark within cultural heritage institutions and information organizations, connecting the building blocks learned to current real-world projects.

Speaker(s)

Audrey Altman

Mark Breedlove

Michael Della Bitta

Erin Fahy

Christina Harlow

Corey Harper

Scott Williams

Location


February 13th

9:30-12:30

West End Neighborhood Library