Storage for Data Platforms in 10 minutes

OpenStack Summit Banner

Kyle Bader and I teamed up to deliver a quick (and hopefully painless) review of the types of storage your Big Data strategy needs to succeed, alongside the better-understood (and more traditional) approaches to structured data.

Data platform engineers need support from both the Compute and the Storage infrastructure teams to deliver. We look at how the public cloud, and AWS in particular, tackles these challenges, and at what the equivalent technology strategies are in OpenStack and Ceph.


Tradeoffs between IO latency, availability of storage space, cost and IO performance cause storage options to fragment into three broad solution areas: network-backed persistent block storage, application-focused object storage (also network-based), and directly-attached, low-latency NVMe storage for the highest-performance scratch and overflow space.
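
To make the three-way split concrete, here is a minimal sketch of how a platform team might encode that choice. Everything in it is an assumption for illustration: the `Workload` fields, the `pick_storage` function and the decision order are hypothetical, and the product names in the comments are well-known examples of each class rather than a list taken from the talk.

```python
from dataclasses import dataclass
from enum import Enum


class StorageClass(Enum):
    # Examples: Amazon EBS; Ceph RBD volumes served through OpenStack Cinder.
    NETWORK_BLOCK = "network-backed persistent block"
    # Examples: Amazon S3; the Ceph Object Gateway (RGW).
    OBJECT = "application-focused object storage"
    # Examples: EC2 instance store; local NVMe drives in a compute node.
    LOCAL_NVME = "directly-attached NVMe"


@dataclass
class Workload:
    """Hypothetical description of what a data workload asks of storage."""
    needs_persistence: bool   # must the data outlive the compute instance?
    latency_sensitive: bool   # does it need the lowest possible IO latency?
    accessed_via_api: bool    # does the application read/write whole objects?


def pick_storage(w: Workload) -> StorageClass:
    """Map a workload onto one of the three broad storage areas."""
    # Scratch and overflow space: disposable data that wants the fastest IO.
    if not w.needs_persistence and w.latency_sensitive:
        return StorageClass.LOCAL_NVME
    # Data the application addresses as objects over the network.
    if w.accessed_via_api:
        return StorageClass.OBJECT
    # Everything else gets a persistent, network-backed volume.
    return StorageClass.NETWORK_BLOCK


if __name__ == "__main__":
    shuffle_space = Workload(needs_persistence=False,
                             latency_sensitive=True,
                             accessed_via_api=False)
    print(pick_storage(shuffle_space).value)
```

A real decision would also weigh cost and capacity, which this sketch deliberately leaves out to keep the three categories visible.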

Ideally, the infrastructure designer would adopt similarly-behaving approaches in the public and private cloud environments, which is what makes OpenStack and Ceph a good fit: scale-out, cloud-native technologies naturally have much more in common with the public cloud than legacy vendors' offerings do. Interested? Listen to our quick survey of the field. The OpenStack Foundation kindly published a recording of our session.

Our slides are available as a PDF download and can be viewed inline below.


