Ververica Celebrates as Apache Paimon Graduates to Top-Level Project

Written by Kaye Lincoln and Karin Landers | 18 April 2024

Congratulations to the Apache Software Foundation and each individual contributor on the graduation of Apache Paimon from incubation to a Top-Level Project!

Apache Paimon is a data lake format that supports both streaming and batch operations with Apache Flink®. Paimon combines a lake format with a log-structured merge-tree (LSM) structure, bringing real-time streaming updates into the data lake and enabling the construction of a real-time lakehouse architecture. You can learn more about Paimon in the blog Apache Paimon: The Streaming Lakehouse.

Key features and benefits delivered by Paimon include:

  • The ability to process data in both batch and streaming modes.
  • High-speed, large-scale batch and streaming processing with Paimon’s append table.
  • Flexibility when updating records, including first-row updates, partial updates, record aggregation, and more.
  • Simplified analytics with accurate and complete changelog updates for merge engines.
  • File layout optimization and data compaction with z-order sorting.
  • Data skipping to allow faster queries.
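To make the record-update options above more concrete, here is a minimal conceptual sketch in Python of how different merge strategies resolve several records that share the same primary key. It is not Paimon's API or implementation: the function names (`first_row`, `deduplicate`, `partial_update`, `sum_aggregate`, `merge_by_key`) and the sample rows are hypothetical, and the code only illustrates the general idea of merging by key. In Paimon itself this behaviour is selected through table configuration rather than written by hand.

```python
from typing import Any, Callable, Dict, Iterable, Optional, Tuple

Row = Dict[str, Any]
MergeFn = Callable[[Optional[Row], Row], Row]


def first_row(old: Optional[Row], new: Row) -> Row:
    # Keep the first record seen for a key and ignore later ones.
    return new if old is None else old


def deduplicate(old: Optional[Row], new: Row) -> Row:
    # Keep only the latest record for a key.
    return new


def partial_update(old: Optional[Row], new: Row) -> Row:
    # Non-null fields in the new record overwrite earlier values;
    # null fields leave the earlier values untouched.
    merged = dict(old or {})
    for field, value in new.items():
        if value is not None or field not in merged:
            merged[field] = value
    return merged


def sum_aggregate(old: Optional[Row], new: Row) -> Row:
    # Add numeric fields of the new record to the running totals.
    merged = dict(old or {})
    for field, value in new.items():
        merged[field] = merged.get(field, 0) + value
    return merged


def merge_by_key(records: Iterable[Tuple[str, Row]], engine: MergeFn) -> Dict[str, Row]:
    # Apply the chosen merge strategy to records in arrival order.
    table: Dict[str, Row] = {}
    for key, row in records:
        table[key] = engine(table.get(key), row)
    return table


# Hypothetical sample data: two updates for the same key "u1".
updates = [
    ("u1", {"name": "Ann", "city": None}),
    ("u1", {"name": None, "city": "Berlin"}),
]
print(merge_by_key(updates, partial_update))
```

With the sample data, the partial-update strategy combines the two records into a single complete row for `u1`, while `first_row` would keep only the first record and `deduplicate` only the last.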

"As we celebrate Apache Paimon’s graduation, we are reminded of the power of community and open-source innovation," said Vladimir Jandreski, Ververica’s Chief Product Officer. "Having Paimon as a cornerstone of the streaming lakehouse (Streamhouse) support in Ververica Cloud is more than a technical achievement; it's a step towards a future where stream processing on a data lake is accessible, efficient, and indispensable for businesses worldwide. We are excited to continue contributing to and supporting Apache Paimon's journey, delivering cutting-edge solutions to our customers."

Ververica Cloud with Apache Paimon

While Apache Paimon was originally created as a lakehouse table format, it is much more than simply that. Paimon extends Apache Flink’s capabilities on the data lake and allows developers to truly leverage stream processing on the data lake. Read more about this functionality in the blog: Streamhouse: Data Processing Patterns.

The easiest way to get started with Apache Paimon is to use Ververica Cloud, where it is available as both a Sink and a Source Connector. Sign up for free and get $400 in credits.

With Ververica Cloud, businesses can focus on creating valuable data streaming solutions without the additional burden of having to maintain complex tech stacks for their streaming lakehouse projects.

Ververica Cloud with Apache Paimon offers:

  • Enhanced Flexibility: Tailor your data products within Ververica Cloud, adapting swiftly to evolving data needs.
  • Operational Excellence: Benefit from Ververica's expertise in stream processing, ensuring your operations are both efficient and scalable.
  • Community, Knowledge and Support: Get help from, and access to, the original creators of Apache Flink®, as well as Apache Paimon contributors and experts.

Ververica remains committed to advancing real-time data processing technologies, guided by the belief that the best innovations come from collaborative, community-driven efforts. As Apache Paimon enters its next chapter as a top-level project, Ververica looks forward to supporting its growth and integration, while ensuring that Ververica Cloud remains the best-in-class platform for leveraging the power of real-time data.

Additional Resources