Why We Love Apache Flink

Latency matters more and more as processing capacity grows. Companies have realized that information is most valuable at the moment the data is created, and therefore need a real-time processing system.

Flink has adopted a continuous-flow, operator-based streaming model. It processes data as true streams with built-in backpressure, i.e., data elements are immediately “pipelined” from one operator to the next. This model also allows flexible window operations.
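To make the idea of pipelining with backpressure concrete, here is a minimal sketch in plain Python. It does not use Flink's API; the names `run_pipeline`, `source` and `sink` are our own, and a bounded `queue.Queue` merely stands in for the buffers between operators. The point is only that a full buffer blocks the upstream operator, so a slow consumer naturally slows down the producer.

```python
import queue
import threading


def run_pipeline(items, capacity=2):
    """Push items through a bounded buffer between two 'operators'.

    Hypothetical illustration only: the bounded queue plays the role of
    the buffer between a source and a downstream operator.
    """
    buffer = queue.Queue(maxsize=capacity)  # bounded buffer between operators
    done = object()                         # end-of-stream marker
    results = []
    max_buffered = 0

    def source():
        for item in items:
            buffer.put(item)  # blocks while the buffer is full (backpressure)
        buffer.put(done)

    def sink():
        nonlocal max_buffered
        while True:
            max_buffered = max(max_buffered, buffer.qsize())
            item = buffer.get()
            if item is done:
                break
            results.append(item * 2)  # each element is processed as it arrives

    threads = [threading.Thread(target=source), threading.Thread(target=sink)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, max_buffered
```

Calling `run_pipeline(range(5))` returns the doubled elements in arrival order, and the number of buffered elements never exceeds `capacity`, however fast the source produces.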

Flink streaming supports event time and out-of-order streams, as well as extremely flexible windowing semantics, out of the box.
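As a rough illustration of what event-time windowing over an out-of-order stream means, the following plain-Python sketch counts events per tumbling window. It is not Flink code: the function name, the `max_out_of_orderness` parameter and the simple watermark rule (largest timestamp seen minus the allowed delay) are assumptions we chose for the example, and late events are simply dropped.

```python
from collections import defaultdict


def tumbling_event_time_counts(timestamps, window_size, max_out_of_orderness):
    """Count events per event-time tumbling window [start, start + window_size).

    timestamps: event times in ARRIVAL order, which may differ from event order.
    A window is emitted once the watermark reaches its end.
    """
    open_windows = defaultdict(int)
    emitted = []
    watermark = float("-inf")

    for event_time in timestamps:
        start = (event_time // window_size) * window_size
        if start + window_size <= watermark:
            continue  # too late: its window was already emitted (dropped here)
        open_windows[start] += 1

        # Assumed watermark rule: largest timestamp seen minus the allowed delay.
        watermark = max(watermark, event_time - max_out_of_orderness)

        # Emit every window whose end the watermark has reached.
        ready = sorted(s for s in open_windows if s + window_size <= watermark)
        for s in ready:
            emitted.append((s, s + window_size, open_windows.pop(s)))

    # End of input: flush whatever is still open.
    for s in sorted(open_windows):
        emitted.append((s, s + window_size, open_windows[s]))
    return emitted
```

With the arrival order `[1, 4, 2, 11, 7, 13, 3, 25, 5]`, a window size of 10 and an allowed delay of 5, the events 7 and 3 still land in window `[0, 10)` even though they arrive after 11, while the final 5 arrives after that window has been emitted and is discarded.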

Flink has built-in fault tolerance and high availability: support for master failover eliminates the single point of failure, while lightweight distributed snapshots enable stronger consistency guarantees.
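The idea behind snapshot-based recovery can be sketched in a few lines of plain Python. This is a single-process toy, not Flink's actual mechanism, which coordinates snapshots across parallel operators; `CountingOperator`, `run_with_recovery`, `checkpoint_every` and `fail_at` are hypothetical names. It shows only the core principle: periodically save the operator state together with the source position, and after a failure restore both and replay from there.

```python
import copy


class CountingOperator:
    """A stateful operator: counts occurrences of each record."""

    def __init__(self, counts=None):
        self.counts = counts if counts is not None else {}

    def process(self, record):
        self.counts[record] = self.counts.get(record, 0) + 1


def run_with_recovery(records, checkpoint_every, fail_at=None):
    """Process records, snapshotting (offset, state) every few records.

    If fail_at is given, a crash is simulated once at that offset: the
    in-memory state is discarded and the last snapshot is restored.
    """
    op = CountingOperator()
    snapshot = (0, {})  # (source offset, operator state)
    offset = 0
    failed = False

    while offset < len(records):
        if fail_at is not None and offset == fail_at and not failed:
            failed = True
            # Simulated crash: restore state and rewind the source together.
            offset, state = snapshot
            op = CountingOperator(copy.deepcopy(state))
            continue

        op.process(records[offset])
        offset += 1

        if offset % checkpoint_every == 0:
            snapshot = (offset, copy.deepcopy(op.counts))

    return op.counts
```

Because state and source offset are restored as a pair, the records replayed after the crash are not counted twice: a run with a failure produces the same counts as a run without one.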

Flink streaming achieves high throughput and low latency. Here is a real-world streaming benchmark conducted by Yahoo! Engineering comparing Storm, Flink and Spark.

Flink is a streaming engine that also supports batch processing. Stream and batch are not fundamentally different: batch processes are streaming processes that work on bounded data as opposed to unbounded data.
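One way to picture "batch is a bounded stream" is to run the same operator over a finite list and over an endless generator. The sketch below is plain Python rather than Flink code, and `running_word_count` and `unbounded_source` are names we made up for the illustration.

```python
import itertools


def running_word_count(stream):
    """Yield (word, count so far) for each element of any iterable."""
    counts = {}
    for word in stream:
        counts[word] = counts.get(word, 0) + 1
        yield word, counts[word]


def unbounded_source():
    """An endless stream of words (hypothetical stand-in for a live feed)."""
    return itertools.cycle(["flink", "storm", "flink"])


# Bounded input: the stream ends, so a final result exists.
batch_result = dict(running_word_count(["flink", "storm", "flink"]))

# Unbounded input: the same operator, but we can only ever observe a prefix.
stream_prefix = list(itertools.islice(running_word_count(unbounded_source()), 4))
```

The operator is identical in both cases. The only difference is that the bounded input terminates and yields a final answer, whereas the unbounded one keeps producing updated results for as long as we keep reading.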

Flink offers a wide ecosystem of libraries with high-level APIs for Relational Data Processing, Machine Learning, Graph Analytics and Complex Event Processing.

Flink is well integrated with many other projects such as Kafka, Hadoop, Cassandra, Alluxio and more.

We at Radicalbit use Flink; we’ve made a few contributions to the code base in the last few months and we are planning to contribute even more.

We believe Flink is a great technology that will definitely find its place in this increasingly crowded market.
