Storage for Data Platforms in 10 minutes

OpenStack Summit Banner

Kyle Bader and I teamed up to deliver a quick (and hopefully painless) review of what types of storage your Big Data strategy needs to succeed alongside the better-understood (and more traditional) existing approaches to structured data.

Data platform engineers need to receive support from both the Compute and the Storage infrastructure teams to deliver. We look at how the public cloud, and Amazon AWS in particular, tackle these challenges and what are the equivalent technology strategies in OpenStack and Ceph.

F2 in demo session

Tradeoffs between IO latency, availability of storage space, cost and IO performance lead to storage options fragmenting into three broad solution areas: network-backed persistent block, application-focused object storage (also network based), and directly-attached low-latency NVME storage for highest-performance scratch and overflow space.

Ideally, the infrastructure designer would choose to adopt similarly-behaving approaches to the public and private cloud environments, which is what makes OpenStack and Ceph a good fit: scale-out, cloud-native technologies naturally have much more in common with public cloud than legacy vendors. Interested? Listen to our quick survey of the field, the OpenStack Foundation kindly published a recording of our session:

Our slides are available as a PDF download and can be viewed inline below.

Screen Shot 2018-05-30 at 6.33.26 PM.png

 
12
Kudos
 
12
Kudos

Now read this

ARM 64 Bit: walkthrough of the Mustang

The Dragon Propulsion Laboratory recently acquired an Applied Micro Mustang board thanks to a developer promotion Jon pointed me to, and the arrival of the holidays finally gave me a chance to take the device for a spin. The Mustang, or... Continue →