
Announcing Early Access Program for Flink SQL in Ververica Platform

Written by Konstantin Knauf | 06 July 2020

Been wondering what's next for Ververica Platform? Maybe you've already guessed: Flink SQL is coming to Ververica Platform later this year! Today we are excited to announce our Early Access Program for Flink SQL in Ververica Platform with selected users and partners.

Over the years, we have supported many large-scale Apache Flink adopters such as Netflix, Alibaba and Lyft as they developed in-house Apache Flink-based low- or no-code streaming platforms. A decisive factor in choosing a low- or no-code solution has always been the desire to increase developer efficiency and make streaming data accessible to a larger set of developers and data scientists within their organizations. In this process, Flink SQL has emerged as the de facto standard for low-code data analytics and processing across both bounded and unbounded data. It is highly performant, standard-compliant and feature-rich.

Over the last few months, we have been working to extend Ververica Platform to become an integrated environment for Flink SQL. You will be able to manage your tables, functions and catalogs, develop SQL scripts, and operate and monitor the derived continuous queries directly in the Platform. We would now like to involve the wider Apache Flink community in our development process through our early access program. Interested companies and partners can sign up for this exclusive program and receive access to our product offering around Flink SQL before it becomes generally available in fall 2020. Besides access to this pre-release version of Ververica Platform, selected companies will benefit from direct access to Ververica’s product team and will have a direct impact on the roadmap of the Ververica Platform SQL offering leading up to general availability later this year.

If you want to get a first look at Flink SQL in Ververica Platform, sign up for the early access program. Over the next few weeks, we will go through a selection process and schedule introductory calls with all selected users.

What can you expect from the program?

  • Pre-release access to Ververica Platform SQL including regular product updates

  • Direct access to Ververica’s product team and regular feedback sessions

  • Shared Slack channel for Q&A, feedback and support

Figure 1: The SQL Early Access Program at a glance

Since the initial release of Flink SQL in 2016 (Flink 1.1.0), the Apache Flink community has put countless development hours into extending, evolving and hardening the API. As more language features, connectors and formats have been introduced, the set of use cases for Flink SQL has naturally grown, too. So, let’s have a look at three of the most popular usage scenarios for Flink SQL today.

Materialized View Maintenance

A materialized view is a pre-computed data set derived from a SQL query. As such, materialized views can greatly speed up query processing and reduce load on source systems, such as legacy mainframe systems. On the other hand, they need to be actively maintained by the SQL engine as the underlying data changes, a process that is very closely related to stream processing. With Flink SQL you benefit from our years of experience in stateful stream processing, which have led to a materialized view maintenance engine that goes well beyond what traditional relational databases and data warehouses offer today:

  • very low latency, high resource efficiency & scalability

  • strong end-to-end consistency guarantees and a mechanism to trade off latency & completeness depending on your requirements

  • expressive language features: various streaming joins, user-defined (aggregate/table) functions, stream-table joins

  • easy interoperability (e.g. out-of-the-box support for Debezium & Canal CDC formats) and operational flexibility

Figure 2: Materialized View Maintenance with Flink SQL
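To make the link between view maintenance and stream processing more concrete, here is a minimal sketch in plain Python (not Flink itself). It assumes a hypothetical changelog of insert, update and delete events, loosely modeled on the kind of change events that CDC tools emit, and keeps a per-customer order total up to date incrementally instead of recomputing the whole query after every change. The event layout, the field names and the functions `apply_change` and `maintain_view` are illustrative assumptions, not part of any Flink or Debezium API.

```python
from collections import defaultdict

# Hypothetical change events: each carries the row state before and/or after the change.
# This layout is an illustrative assumption, not the exact Debezium or Canal schema.
changelog = [
    {"op": "insert", "before": None,
     "after": {"order_id": 1, "customer": "alice", "amount": 30}},
    {"op": "insert", "before": None,
     "after": {"order_id": 2, "customer": "bob", "amount": 20}},
    {"op": "update",
     "before": {"order_id": 1, "customer": "alice", "amount": 30},
     "after": {"order_id": 1, "customer": "alice", "amount": 50}},
    {"op": "delete",
     "before": {"order_id": 2, "customer": "bob", "amount": 20},
     "after": None},
]


def apply_change(view, counts, event):
    """Update the view (customer -> total amount) for a single change event."""
    before, after = event["before"], event["after"]
    if before is not None:  # retract the old row's contribution
        view[before["customer"]] -= before["amount"]
        counts[before["customer"]] -= 1
    if after is not None:  # add the new row's contribution
        view[after["customer"]] += after["amount"]
        counts[after["customer"]] += 1
    # Drop groups that no longer contain any rows, as a GROUP BY result would.
    if before is not None and counts[before["customer"]] == 0:
        view.pop(before["customer"], None)
        counts.pop(before["customer"], None)


def maintain_view(events):
    """Fold a changelog into the current state of the view."""
    view, counts = defaultdict(int), defaultdict(int)
    for event in events:
        apply_change(view, counts, event)
    return dict(view)


totals = maintain_view(changelog)
print(totals)
```

Each event touches only the affected group, which is why this style of maintenance can stay cheap and low-latency even when the underlying table is large. A real engine additionally has to handle state backends, fault tolerance and out-of-order data, none of which is modeled here.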

Streaming ETL

Processing and moving real-time data from one place to another (streaming ETL) is a common use case for Apache Flink adopters.

With Flink SQL you will get the best of both worlds: you will benefit from the operational simplicity, low latency and expressiveness of Apache Flink without learning a new language or API. In addition, you can re-use your ETL queries for both bounded data (e.g. during reprocessing or bulk imports) and unbounded data. The results will be consistent across the different processing modes. A particular highlight for this use case is the new Flink SQL file system connector added in Flink 1.11. It allows you to stream the results of your transformations directly into Hive partitions while also consistently updating the Hive Metastore. It has never been easier to integrate your streaming data sources with your existing Hive infrastructure.
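The idea of re-using one transformation for bounded and unbounded input can be illustrated with a small sketch in plain Python (again, not Flink). The function `clean_orders`, the sample records and the generator `arriving_events` are hypothetical; the point is only that the same logic, applied to a finite collection or to records arriving one at a time, produces the same results.

```python
from typing import Dict, Iterable, Iterator


def clean_orders(rows: Iterable[Dict]) -> Iterator[Dict]:
    """A small ETL step: drop invalid rows and normalize amount and currency."""
    for row in rows:
        if row["amount"] <= 0:
            continue  # filter out invalid records
        yield {
            "order_id": row["order_id"],
            "amount_cents": round(row["amount"] * 100),
            "currency": row["currency"].upper(),
        }


raw = [
    {"order_id": 1, "amount": 12.5, "currency": "eur"},
    {"order_id": 2, "amount": -1.0, "currency": "usd"},
    {"order_id": 3, "amount": 7.25, "currency": "usd"},
]

# Bounded input: the whole data set is available up front (e.g. a bulk import).
batch_result = list(clean_orders(raw))


def arriving_events() -> Iterator[Dict]:
    """Stand-in for an unbounded source that hands over one record at a time."""
    for row in raw:
        yield row


# Unbounded-style input: records are transformed as they arrive.
stream_result = [out for out in clean_orders(arriving_events())]

print(batch_result == stream_result)
```

In Flink SQL the unit of re-use is the query rather than a Python function, but the design goal is the same: the transformation is written once and does not need to know whether its input will ever end.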

Complex Event Processing & Pattern Matching

Complex Event Processing (CEP), the detection of patterns across sequences of events, is fast becoming a critical paradigm for staying ahead of the curve. The introduction of row pattern recognition in the SQL standard and the rise of MATCH_RECOGNIZE as a native SQL clause have made Complex Event Processing in Flink more efficient, scalable and accessible. The patterns and rules involved tend to be highly specific and require deep domain knowledge from subject matter experts in the organization. With Flink SQL & MATCH_RECOGNIZE, these rules and patterns can now be defined directly by the product managers, product analysts and data scientists who hold that domain knowledge.
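As a rough illustration of what such a rule looks like, the following plain-Python sketch (not Flink, and not MATCH_RECOGNIZE syntax) searches each key's price history for a "dip": a starting price, one or more falling prices, then a rising price. This is approximately what a row pattern along the lines of START DOWN+ UP expresses. The function `find_dips`, the sample events and the choice to resume matching at the last matched row are assumptions made for the example.

```python
from collections import defaultdict


def find_dips(events):
    """Return (key, start_price, bottom_price, recovery_price) for each dip."""
    by_key = defaultdict(list)
    for key, price in events:  # events are assumed to arrive in time order
        by_key[key].append(price)

    matches = []
    for key, prices in by_key.items():
        i = 0
        while i < len(prices) - 2:  # a match needs at least three rows
            j = i + 1
            # DOWN+: one or more prices, each lower than the previous one
            while j < len(prices) and prices[j] < prices[j - 1]:
                j += 1
            has_down = j > i + 1
            # UP: the next price is higher than the bottom of the dip
            if has_down and j < len(prices) and prices[j] > prices[j - 1]:
                matches.append((key, prices[i], prices[j - 1], prices[j]))
                i = j  # resume matching at the last matched row
            else:
                i += 1
    return matches


events = [
    ("ACME", 12), ("ACME", 10), ("ACME", 9), ("ACME", 11), ("ACME", 13),
    ("XYZ", 5), ("XYZ", 6),
    ("ACME", 12), ("ACME", 14),
]

dips = find_dips(events)
for match in dips:
    print(match)
```

Hand-written matching logic like this quickly becomes hard to read and maintain as rules grow, which is the gap a declarative pattern clause is meant to close for domain experts.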

We would love to learn more about your use cases and areas where Flink SQL with Ververica Platform could be a perfect fit. Make sure to sign up for our early access program today and tell us more about the data processing challenges that you would like to address with Flink SQL and Ververica Platform. We are very excited to bring this new addition to our product offering later this year and want to make you a part of it!