Storage for Data Platforms in 10 minutes


Kyle Bader and I teamed up to deliver a quick (and hopefully painless) review of what types of storage your Big Data strategy needs to succeed alongside the better-understood (and more traditional) existing approaches to structured data.

Data platform engineers need support from both the Compute and the Storage infrastructure teams to deliver. We look at how the public cloud, and AWS in particular, tackles these challenges, and at what the equivalent technology strategies are in OpenStack and Ceph.


Tradeoffs between I/O latency, available storage capacity, cost, and I/O performance lead storage options to fragment into three broad solution areas: network-backed persistent block storage, application-focused object storage (also network-based), and directly attached, low-latency NVMe storage for the highest-performance scratch and overflow space.
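
To make the three-way split concrete, here is a minimal Python sketch of how a platform team might map a workload onto one of these storage classes. The decision rule, the `Workload` fields, and the `pick_storage` helper are illustrative assumptions of ours, not something presented in the session. The AWS and OpenStack/Ceph pairings listed are commonly cited equivalents rather than a quote from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class StorageClass(Enum):
    NETWORK_BLOCK = "network-backed persistent block"
    OBJECT = "application-focused object storage"
    LOCAL_NVME = "directly attached NVMe scratch"


# Commonly cited equivalents; an assumption for illustration,
# not a mapping taken from the talk itself.
EQUIVALENTS = {
    StorageClass.NETWORK_BLOCK: {"aws": "EBS", "openstack": "Cinder on Ceph RBD"},
    StorageClass.OBJECT: {"aws": "S3", "openstack": "Ceph RGW (S3-compatible API)"},
    StorageClass.LOCAL_NVME: {"aws": "EC2 instance store", "openstack": "Nova ephemeral disk on local NVMe"},
}


@dataclass
class Workload:
    """Hypothetical description of what a data workload needs."""
    must_survive_instance_loss: bool  # data has to outlive the VM
    accessed_as_objects: bool         # the application speaks an object API
    latency_critical: bool            # scratch/overflow that needs the lowest latency


def pick_storage(w: Workload) -> StorageClass:
    """Illustrative rule of thumb, not a definitive sizing method."""
    if w.accessed_as_objects:
        return StorageClass.OBJECT
    if w.latency_critical and not w.must_survive_instance_loss:
        return StorageClass.LOCAL_NVME
    # Default: persistent volumes that follow the instance around.
    return StorageClass.NETWORK_BLOCK


if __name__ == "__main__":
    scratch = Workload(must_survive_instance_loss=False,
                       accessed_as_objects=False,
                       latency_critical=True)
    choice = pick_storage(scratch)
    print(choice.value, EQUIVALENTS[choice])
```

A real design would also weigh cost and capacity, which this sketch deliberately leaves out to keep the three categories visible.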

Ideally, the infrastructure designer would adopt similarly behaving approaches in both the public and private cloud environments, which is what makes OpenStack and Ceph a good fit: scale-out, cloud-native technologies naturally have much more in common with the public cloud than legacy vendors' offerings do. Interested? Listen to our quick survey of the field. The OpenStack Foundation kindly published a recording of our session:

Our slides are available as a PDF download and can be viewed inline below.

