Introducing Official Docker Images for Apache Flink®
Patrick Lucas is a Senior Data Engineer at data Artisans, and Ismaël Mejía (@iemejia) is a software engineer at Talend.
This post originally appeared on the Apache Flink blog. It is reproduced here under the Apache License, Version 2.0.
For some time, the Apache Flink community has provided scripts to build a Docker image to run Flink. Now, starting with version 1.2.1, Flink will have an official Docker image. This image is maintained by the Flink community and curated by the Docker team to ensure it meets the quality standards for container images of the Docker community.
A community-maintained way to run Apache Flink on Docker and other container runtimes and orchestrators is part of the ongoing effort by the Flink community to make Flink a first-class citizen of the container world.
If you want to use the official Docker image today, you can get the latest version by running:
docker pull flink
And to run a local Flink cluster with one TaskManager and the Web UI exposed on port 8081, run:
docker run -t -p 8081:8081 flink local
With this image there are various ways to start a Flink cluster, both locally and in a distributed environment. Take a look at the documentation that shows how to run a Flink cluster with multiple TaskManagers locally using Docker Compose or across multiple machines using Docker Swarm. You can also use the examples as a reference to create configurations for other platforms like Mesos and Kubernetes.
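As a rough illustration of the Docker Compose approach mentioned above, the following sketch writes a minimal Compose file with one JobManager and one TaskManager. It is not copied from the official documentation: the service names, the `jobmanager` and `taskmanager` commands, and the `JOB_MANAGER_RPC_ADDRESS` environment variable are assumptions about how the image is configured, so check them against the image's documentation on Docker Hub before relying on them.

```shell
#!/bin/sh
# Sketch only: write a minimal docker-compose.yml for a small Flink cluster.
# Assumptions (not stated in this post): the image accepts "jobmanager" and
# "taskmanager" as commands and reads JOB_MANAGER_RPC_ADDRESS to locate the
# JobManager. The service names below are illustrative.
cat > docker-compose.yml <<'EOF'
version: "2"
services:
  jobmanager:
    image: flink
    command: jobmanager
    ports:
      - "8081:8081"   # Web UI, same port as in the local example
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
  taskmanager:
    image: flink
    command: taskmanager
    depends_on:
      - jobmanager
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
EOF
echo "Wrote docker-compose.yml"
```

With a file like this in place, `docker-compose up -d` would start both services, and more TaskManagers could be added by scaling the `taskmanager` service (for example with `docker-compose scale taskmanager=2`, if your Compose version supports that subcommand).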
While this announcement is an important milestone, it’s just the first step toward helping users run containerized Flink in production. There are improvements to be made in Flink itself, and we will continue to improve these Docker images and the documentation and examples surrounding them.
This is of course a team effort, so any contribution is welcome. The docker-flink GitHub organization hosts the source files to generate the images and the documentation that is presented alongside the images on Docker Hub.