Storm + Jepsen

Introduction

Wojtek Gawronski (@afronski, afronski.pl) - 2014 © License: CC BY-ND 3.0 PL

Tools

Storm

Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing.

[...] a million tuples processed per second per node

It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.

Use Cases

  • realtime analytics
  • online machine learning
  • continuous computation
  • distributed RPC
  • ETL

Topology

Topology - Visual representation.

Stream

Stream - Visual representation.

Spout

Spout symbol.

Bolt

Bolt symbol.
Intermediate effects of created topology.
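
To make the spout, bolt, and stream vocabulary concrete, here is a toy model in plain Python. It is only a sketch and does not use Storm's API: the names `sentence_spout`, `split_bolt`, `count_bolt`, and `run_topology` are hypothetical, and generators stand in for streams. A spout is a source of tuples, each bolt consumes one stream and emits another, and the topology is how they are wired together.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def sentence_spout() -> Iterator[Tuple[str]]:
    """Spout: a source of tuples. Finite here; a real stream is unbounded."""
    sentences = (
        "storm processes streams",
        "jepsen breaks systems",
        "storm scales",
    )
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str, int]]:
    """Bolt: keeps a running count and emits (word, count) for every input."""
    counts: Counter = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology() -> List[Tuple[str, int]]:
    """Topology: spout -> split bolt -> count bolt, wired as one pipeline."""
    return list(count_bolt(split_bolt(sentence_spout())))


# Every emitted tuple is an intermediate result; the last one per word wins.
results = run_topology()
final_counts = dict(results)
```

The running counts emitted by `count_bolt` play the role of the intermediate effects of the topology: each input word produces an updated count instead of a single result at the end.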

Internals

High Availability and process hierarchy.
Reliability explanation.
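
Storm's documentation describes tracking a tuple tree with a single value: the XOR of every tuple id that has been created or acked. The sketch below illustrates that idea in plain Python under simplifying assumptions. The `AckTracker` class and its methods are hypothetical and are not Storm's API, and real Storm uses random 64-bit ids and dedicated acker tasks.

```python
class AckTracker:
    """Tracks one tuple tree with a single integer (hypothetical helper)."""

    def __init__(self) -> None:
        self.ack_val = 0

    def emit(self, tuple_id: int) -> None:
        # A newly created tuple contributes its id once.
        self.ack_val ^= tuple_id

    def ack(self, tuple_id: int) -> None:
        # Acking XORs the same id again, cancelling the earlier contribution.
        self.ack_val ^= tuple_id

    def fully_processed(self) -> bool:
        # Zero means every emitted id has also been acked.
        return self.ack_val == 0


tracker = AckTracker()
tracker.emit(0x1A)      # root tuple from the spout
tracker.emit(0x2B)      # child tuple anchored to the root
tracker.ack(0x1A)       # root processed, child still pending
pending = not tracker.fully_processed()
tracker.ack(0x2B)       # child processed, tree complete
done = tracker.fully_processed()
```

Because XOR is its own inverse, the tracker needs constant memory per tree regardless of how many tuples the tree contains.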

Jepsen

Breaking distributed systems so you don't have to.

Breaking a distributed system.

Why Jepsen?

Demo

Thanks!

References:

Photo credits: