Last week, more than 1,400 members of the Apache Flink community signed up for the first-ever Flink Forward Virtual conference. If you missed the opportunity to join the conference, you can find all session recordings on the Flink Forward YouTube channel, while presentations from the conference will be uploaded to SlideShare.
In this post, we highlight some of our impressions of the conference and the major trends that emerged from this virtual event.
Even during challenging times like the ones we are facing at the moment, it is great to see the Flink community coming together (this time in a virtual setup) to discuss the current state and future direction of Flink. Flink Forward Virtual 2020 gathered more than 700 participants online and demonstrated how Flink is expanding its original scope beyond stream processing to also power batch processing and stateful, event-driven applications with the same engine. The keynotes revealed how Flink is becoming the centerpiece for the data architectures of many major organizations worldwide and enabling them to accelerate their data transformations.
The keynote by Stephan Ewen gave an overview of the latest Flink developments and presented Flink’s newest API, Stateful Functions. The release of Stateful Functions 2.0 marks a big milestone and brings the first event-driven database that is built directly on Flink. Konstantin Knauf showcased how companies can leverage Ververica Platform Community Edition to get a production-ready Flink environment in less than 5 minutes, while Eric Sammer from Splunk, Marton Balassi and Joe Witt from Cloudera, and Srikanth Satya from DellEMC explained how Flink acts as a backbone of their technology infrastructures and complements their products so that they can be used to their full capacity.
At Flink Forward Virtual 2020, more than 50 speakers presented use cases and technical talks, showcasing how Flink powers some of the biggest and most innovative organizations globally. Some of the sessions we particularly enjoyed were the following:
Justin Cunningham showcased how Netflix expanded the scope of the Keystone pipeline, built on top of Apache Flink, into the company’s Data Mesh for real-time, general-purpose data transportation between different Netflix systems.
Yu Qian explained how Weibo uses Flink for real-time data processing and online machine learning in the company’s recommendation systems.
Jark Wu from Alibaba took us through a Deep-Dive in Flink SQL, showing attendees Flink SQL’s unified architecture, which handles both streaming and batch queries, and explaining how Flink translates queries into relational expressions, leveraging Apache Calcite for optimization, to generate efficient runtime code for execution.
Tzu-Li (Gordon) Tai, in his talk Stateful Functions: Polyglot Event Driven Functions for Stateful Distributed Applications, gave an in-depth overview of how the Stateful Functions runtime works and which building blocks it provides to users for structuring a Stateful Functions application.
Robert Crowe, Reza Rokni and Ahmet Altay discussed distributed processing for Machine Learning production pipelines. They reviewed different frameworks, including TensorFlow Extended (TFX), Apache Flink and Apache Beam, and shared Google’s experience with Machine Learning in production.
Reviewing the talks, online discussions and questions at the virtual event, it became clear that the following topics were major trends in stream processing and Apache Flink:
Flink SQL has evolved into a truly unified batch and streaming API for all kinds of processing workloads. Some of the talks around streaming SQL included a Flink SQL demo, a session on how to write an interactive streaming SQL and materialized view engine using Flink, and a deep-dive session on the latest developments in Flink’s SQL engine.
Stateful Functions continues to generate a lot of attention and excitement in the community, especially with the release of Stateful Functions 2.0 that transforms Flink into an event-driven database for stateful, serverless applications.
Many new users presented their exciting use cases throughout the conference, such as talks by Adobe, Weibo, GoDaddy, Meeshkan and HyperConnect for real-time streaming and machine learning pipelines, real-time identity graphs and streaming analytics with Apache Flink.
Although this year’s virtual Flink Forward was tailored to the US time zone, a consequence of the unfortunate cancelation of the physical conference due to COVID-19, we were amazed by the sheer number of international registrations from countries across Europe, the Middle East and Asia, with attendees eager to find out about all the latest Flink developments and the interesting directions in which Flink is heading. In addition, Flink Forward was followed by a 2-day event on DingTalk to recap the conference for the Chinese Flink community, with more than 10,000 people attending!
We are truly grateful to be part of such a diverse, strong and engaged community that is working on innovative solutions for some of the toughest technological problems out there, even in hard times like the ones we are all facing at the moment. This virtual conference showcased once again how great technology can be developed in open source.
While we’re still digesting all the exciting content from Flink Forward Virtual 2020, we are already preparing future events (either physical or virtual) for Flink Forward Europe (Fall 2020) and China (December 2020)! Be sure to follow @FlinkForward on Twitter to stay informed about all the upcoming details and CFP announcements for our future events.