Storage for Data Platforms in 10 minutes

[Image: OpenStack Summit banner]

Kyle Bader and I teamed up to deliver a quick (and hopefully painless) review of the types of storage your Big Data strategy needs to succeed, alongside the better-understood, more traditional approaches to structured data.

Data platform engineers need support from both the Compute and the Storage infrastructure teams to deliver. We look at how the public cloud, and AWS in particular, tackles these challenges, and at the equivalent technology strategies in OpenStack and Ceph.

Tradeoffs among IO latency, available storage capacity, cost, and IO performance cause storage options to fragment into three broad solution areas: network-backed persistent block storage, application-focused object storage (also network-based), and directly attached, low-latency NVMe storage for the highest-performance scratch and overflow space.
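To make the three-way split concrete, here is a minimal Python sketch of how a platform engineer might route a workload to one of the three areas. The `Workload` fields, the `pick_storage` rules, and the product pairings in `EXAMPLES` are my own illustrative assumptions, not something taken from the talk; real decisions also weigh cost and capacity, which this sketch leaves out.

```python
from dataclasses import dataclass
from enum import Enum


class StorageClass(Enum):
    PERSISTENT_BLOCK = "network-backed persistent block"
    OBJECT = "application-focused object storage"
    LOCAL_NVME = "directly attached NVMe scratch"


# Illustrative pairings (an assumption of this sketch, not from the talk):
# one commonly cited public-cloud service and one OpenStack/Ceph counterpart.
EXAMPLES = {
    StorageClass.PERSISTENT_BLOCK: ("Amazon EBS", "Cinder volumes on Ceph RBD"),
    StorageClass.OBJECT: ("Amazon S3", "Ceph Object Gateway (RGW)"),
    StorageClass.LOCAL_NVME: ("EC2 instance store", "Nova ephemeral disk on local NVMe"),
}


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of what a data workload needs."""
    needs_persistence: bool   # must the data outlive the instance?
    needs_block_device: bool  # does the application expect a mounted volume?
    latency_sensitive: bool   # is the lowest possible IO latency the priority?


def pick_storage(w: Workload) -> StorageClass:
    """Deliberately simple rule of thumb, not a sizing tool."""
    # Scratch and overflow data that can be lost goes to local NVMe.
    if not w.needs_persistence and w.latency_sensitive:
        return StorageClass.LOCAL_NVME
    # Applications that expect a volume or filesystem get network block storage.
    if w.needs_block_device:
        return StorageClass.PERSISTENT_BLOCK
    # Everything else is accessed through an object API.
    return StorageClass.OBJECT


if __name__ == "__main__":
    shuffle_space = Workload(needs_persistence=False,
                             needs_block_device=True,
                             latency_sensitive=True)
    choice = pick_storage(shuffle_space)
    print(choice.value, EXAMPLES[choice])
```

The point of the sketch is only that the choice follows from a handful of workload properties, and that each area has a recognizable counterpart on both sides of the public/private divide.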

Ideally, the infrastructure designer would adopt approaches that behave similarly across public and private cloud environments, which is what makes OpenStack and Ceph a good fit: scale-out, cloud-native technologies naturally have much more in common with the public cloud than legacy vendors' offerings do. Interested? Listen to our quick survey of the field: the OpenStack Foundation kindly published a recording of our session.

Our slides are available as a PDF download.