Yesterday, the Uber engineering team introduced AthenaX, its open source, SQL-based streaming analytics platform powered by Apache Flink. There’s a detailed announcement on the Uber engineering blog, full of sample use cases and sample queries that the platform supports. If you haven’t seen it already, be sure to check it out.
The Uber engineering team spoke in depth about AthenaX at Flink Forward San Francisco in April 2017. The keynote from staff software engineer Chinmay Soman and the AthenaX session from senior software engineers Haohui Mai and Bill Liu both provide a lot of background if you’d like to hear directly from some of the people who’ve worked on the project.
We at data Artisans are especially excited by the announcement because the role of AthenaX and streaming SQL at Uber is representative of a trend we’ve seen in the Apache Flink community over the past year: the time has come to deliver the power of stream processing to a more diverse user base via streaming SQL.
In its announcement post, Uber says that a core requirement for its AthenaX platform was that it should be “easily navigable by all users regardless of technical expertise.” The company met this requirement by giving its users a SQL-only interface for describing what to process, while AthenaX compiles these queries and manages their execution. In other words, the end user deals with SQL, and the platform handles the rest.
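To make that division of labor concrete, here is a minimal sketch of what a user might declare and what a platform then has to compute. The query in the comment uses Flink’s group-window syntax, but the table and column names (`trip_events`, `event_time`, `city`) are hypothetical, and the Python function is only an illustration of the query’s tumbling-window semantics over a finite list. It is not how AthenaX or Flink executes queries, and it ignores streaming concerns such as watermarks and late data.

```python
from collections import Counter

# Hypothetical streaming SQL query (table and column names are made up):
#
#   SELECT city, COUNT(*) AS trips
#   FROM trip_events
#   GROUP BY TUMBLE(event_time, INTERVAL '1' MINUTE), city
#
# The function below mimics the result of that query on a finite,
# in-memory list of events. It is an illustration of the semantics only.

WINDOW_SECONDS = 60  # matches INTERVAL '1' MINUTE in the query above


def tumbling_window_counts(events, window_seconds=WINDOW_SECONDS):
    """Count events per (window_start, city).

    `events` is an iterable of (event_time_in_seconds, city) pairs.
    Each event falls into exactly one fixed-size, non-overlapping window.
    """
    counts = Counter()
    for event_time, city in events:
        # Align the timestamp to the start of its tumbling window.
        window_start = event_time - (event_time % window_seconds)
        counts[(window_start, city)] += 1
    return dict(counts)


# Made-up sample data: (seconds since start, city)
events = [(5, "SF"), (42, "SF"), (59, "NYC"), (61, "SF"), (130, "NYC")]
result = tumbling_window_counts(events)
```

The point of the comparison is that the user writes only the few lines of SQL, while windowing logic, state, and execution stay on the platform side.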
Notably, Uber says that 70% of the streaming applications that it’s running in production can be expressed with SQL.
At this September’s Flink Forward Berlin, both Huawei and Alibaba announced streaming SQL platforms also powered by Apache Flink, and the Flink open source community has broken ground on a SQL CLI (complete with a REST API) as a user-friendly supplement to Flink’s streaming SQL API. Flink 1.4 will include the streaming SQL API’s first support for joins, broadly expanding its expressiveness.
On an open source note: our team appreciates that Uber has been giving back to the Apache Flink community throughout the process, with its work on group windows and support for complex data types included in the Flink 1.3 release and a JDBC table sink planned for Flink 1.4.
The progress in the Flink streaming SQL space this year has been astounding, and we look forward to seeing an uptick in adoption as new tooling like AthenaX makes the technology accessible and easily manageable for an even broader range of users. We’d like to offer a big congrats to the Uber team on this milestone!