Apache Flink is an open-source, unified stream-processing and batch-processing framework developed by the Apache Software Foundation. The core of Apache Flink is a distributed streaming data-flow engine written in Java and Scala. Flink executes arbitrary dataflow programs in a data-parallel and pipelined (hence task parallel) manner. Flink’s pipelined runtime system enables the execution of bulk/batch and stream processing programs. Furthermore, Flink’s runtime supports the execution of iterative algorithms natively.
Flink provides a high-throughput, low-latency streaming engine as well as support for event-time processing and state management. Flink applications are fault-tolerant in the event of machine failure and support exactly-once semantics. Programs can be written in Java, Scala, Python, and SQL and are automatically compiled and optimized into dataflow programs that are executed in a cluster or cloud environment.
Flink does not provide its own data-storage system, but provides data-source and sink connectors to systems such as Apache Doris, Amazon Kinesis, Apache Kafka, HDFS, Apache Cassandra, and ElasticSearch.