VERA Blog Series Part 3: Full Stream Ahead!
VERA: The Cloud Native Engine Revolutionizing Apache Flink® Blog Series
Welcome to the final installment of the three-part blog series that introduces Ververica Runtime Assembly (VERA), the cloud-native, ultra-high-performance engine that powers Ververica’s Streaming Data Platform. In this final blog, let's dive into the features, capabilities and benefits of VERA.
VERA's Capabilities
To recap quickly, in the first blog of our series introducing the VERA engine, we discussed the evolution of VERA and the pressures and limitations of other streaming data solutions (including open source (OS) Apache Flink) that helped drive the creation of this powerful streaming data engine. In the second blog, we dove into the technical capabilities of the Three Core Pillars that form the VERA engine, and what they mean for the user.
In this final piece, let's take a look at VERA’s capabilities, features and benefits, as well as share some of the performance metrics real users are experiencing using VERA today.
Key Features and Benefits
VERA offers a range of key features and benefits. While its capabilities are extensive, here are a few unique aspects that set VERA apart:
- Open Core
VERA is Open Core, designed with additional features that are built in alignment with existing Open Source Apache Flink functionalities. As a result, VERA is 100% compatible with Flink, ensuring there is never vendor lock-in. It is engineered to keep all the advantages of OS Flink, including low latency processing, high throughput, stateful stream processing, and the unified programming model, allowing VERA to seamlessly run Flink applications without any modifications or compatibility issues. In addition, VERA offers multiple deployment options, including both on-prem and cloud options.
- Ultra-high Performance
VERA can process billions of events per second with sub-second latency, and runs up to 2X faster than self-managed open-source Flink on similar hardware, reducing data processing delays to the millisecond level. This performance allows streaming apps to be deployed quickly, scalably, and cost-effectively.
VERA also disaggregates individual components within the network, compute, metadata, and storage layers to further increase performance, and includes optimizations to the SQL engine and state access.
- Infinitely Scalable and Elastic
VERA reduces costs and enhances performance by separating the compute and storage layers. This architecture eliminates concerns about disk capacity planning and job rescaling due to storage limitations. It also resolves latency issues and accommodates large stateful applications, while lowering both storage and operational expenses.
- Always Available
Because VERA is cloud native, your data streaming apps can run in a cloud environment, which lowers your state storage and operational costs, while also allowing for fast rescaling and the ability to hyperscale.
VERA also offers built-in uptime and runtime redundancy, and it is highly reliable, with a 99.99% uptime Service Level Agreement (SLA) for Ververica Cloud deployments. Finally, VERA provides high-availability features such as hot updates to rescale your jobs, and dynamic complex event processing (CEP) to change rules mid-flight without restarting jobs. This allows VERA to perform job updates with minimal to zero downtime.
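To put the 99.99% figure in perspective, here is a minimal arithmetic sketch of the downtime budget that such an availability target implies. It assumes downtime is measured over a calendar period with no exclusions; the actual SLA terms may define the measurement window differently, and the function name is purely illustrative.

```python
# Downtime budget implied by an availability target (illustrative arithmetic only).
# Assumption: downtime is counted over the whole period with no exclusions;
# a real SLA may define the measurement window differently.

def downtime_budget_minutes(availability: float, period_hours: float) -> float:
    """Return the maximum downtime, in minutes, allowed by an availability target."""
    return (1.0 - availability) * period_hours * 60.0

SLA = 0.9999  # the 99.99% uptime figure quoted above

per_year = downtime_budget_minutes(SLA, 365 * 24)   # non-leap year
per_month = downtime_budget_minutes(SLA, 30 * 24)   # 30-day month

print(f"Per year:  {per_year:.2f} minutes")
print(f"Per month: {per_month:.2f} minutes")
```

Under these assumptions, a 99.99% target leaves roughly 53 minutes of downtime per year, or a little over 4 minutes in a 30-day month.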
- Fault Tolerant
VERA has no single point of failure, with built-in easy rollover and recovery. VERA’s fault tolerance features ensure reliable, uninterrupted operation, even when faced with the inevitable failures that occur from time to time.
VERA achieves this fault tolerance, and keeps your deployment functional, robust, and resilient, through the following features:
- Tiered state
- Lazy state loading
- Delayed state pruning
- Faster and more stable checkpoints
- Failure detection with task-local recovery
- Operator isolation
- Dynamic re-scaling
- Efficient recovery mechanisms
- Faster state rebuilding and job startup process
- Secure
VERA prioritizes security, streamlining data access management and reducing maintenance overhead. Leveraging Ververica's security certifications, it ensures appropriate data access while minimizing time spent on updates and patches. This allows teams to focus on building applications and solving use cases rather than constant manual maintenance.
- Multi-tenancy
VERA allows many independent applications to run in a shared environment, with the ability to isolate data at both the namespace and tenant level using configurable access management and role-based access control. Workspaces can be quickly provisioned and released on demand, which reduces cost by meeting operational demands even as those demands change over time.
- Data Governance
VERA is designed to keep data organized, secure, and consistent. Users can move, process, use, and store data while continuously tracking data origin and utilization. This allows for quick discovery and resolution of data issues, and ensures data credibility and accuracy.
Democratizing Stream Processing
This is just a sampling of the capabilities and features that VERA offers as part of Ververica's pursuit of democratizing stream processing. This democratization allows users to access both fresh, current data and historical data to make informed business decisions at the right time, in the right place, with VERA handling both batch and real-time streaming data use cases.
Figure 1: VERA FEATURES, CAPABILITIES, AND BENEFITS
"VERA makes it easier, faster, and more cost effective for businesses to build their streaming apps and get meaningful insights from their data."
Alex Walden, Ververica CEO
Figure 2: FROM SOURCE TO SOLUTION
VERA was built with users in mind, allowing them to maximize performance and productivity, while minimizing resource use with enterprise-grade flexible tools. By removing the complexities of OS Flink, developers can focus on delivering business outcomes, and in turn, businesses are empowered with tools that make managing, monitoring and maintaining their data streaming solution faster, easier, and cheaper.
The goal of Ververica’s Streaming Data Platform is to remove the complexity of Flink entirely, while still letting users benefit from all the advantages and power of the Flink solution. In the future, you will simply choose your deployment method from on-premises and cloud offerings, then choose your sources, tune the solution, and get results, regardless of your use case.
Next, let’s explore the impact VERA has on cost effectiveness and Return on Investment (ROI) when deployed.
ROI and Cost Effectiveness
Thanks to the benefits and features listed above, VERA also reduces operational costs and increases ROI. As we know, open source software itself is free. However, the overhead required to run open source software successfully increases costs significantly. The people, operations, and infrastructure (including servers, observability, governance, security, storage, and networking) required to run OS software are expensive when compared to investing in a managed product. This is particularly true when that managed solution is also faster, higher performing, and more secure.
Ververica’s solution handles continuous updates, feature innovations, and constant monitoring for you, eliminating the need for self-management of your data system. Ververica also ensures rapid rollout of updates with minimal interruptions, providing a more stable Flink experience and continuously improving service without manual effort. This approach significantly reduces operational overhead for your team.
Performance Results
It’s one thing to say what a technology is designed to do, but another to see actual results. The VERA engine has been tested at scale, and current deployments report truly impressive numbers, including:
Figure 3: CURRENT DEPLOYMENT REPORTED METRICS
- Highly scalable compute resource allocation: 2 million+ cores
- Processing ability up to 6.9 billion records per second
- Task size flexibility: one installation reports 35,000+ jobs running on a single cluster
- 10+ petabytes per day ingestion speed
- 10 trillion records ingested per day
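For a sense of scale, the daily ingestion figures above can be converted into sustained per-second averages. This is a rough sketch only: it assumes decimal units (1 PB = 10^15 bytes), treats "10+ petabytes" as a lower bound of exactly 10 PB, and assumes the load is spread evenly across the day, none of which the reported metrics state explicitly.

```python
# Convert the reported daily ingestion figures into sustained per-second averages.
# Assumptions (not stated in the reported metrics): decimal petabytes,
# "10+ PB" taken as a lower bound of 10 PB, and load spread evenly over the day.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

records_per_day = 10 * 10**12   # "10 trillion records ingested per day"
bytes_per_day = 10 * 10**15     # "10+ petabytes per day", lower bound

avg_records_per_sec = records_per_day / SECONDS_PER_DAY
avg_gb_per_sec = bytes_per_day / SECONDS_PER_DAY / 10**9

print(f"Average records/second: {avg_records_per_sec:,.0f}")
print(f"Average GB/second:      {avg_gb_per_sec:,.1f}")
```

Under these assumptions, the daily totals correspond to an average of roughly 116 million records and about 116 GB every second, sustained around the clock.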
Nexmark Benchmark
In addition to exploring current use cases, we’d also recommend checking out the Nexmark Benchmark results, a comparative analysis of query execution times.
The most recent evaluation of VERA highlights the impact of VERA’s optimizations and demonstrates the impressive processing speed VERA reaches when compared to OS Flink. From the test results below, you can see that VERA significantly improves job performance (single-core throughput), by 50% on average, with some stateful workloads improving even more.
Figure 4: JOB PERFORMANCE (LOWER = FASTER)
You can also access the updated Nexmark Benchmark to run your own testing:
https://github.com/nexmark/nexmark
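Since the improvement above is quoted as throughput while Figure 4 plots execution time (lower = faster), it can help to translate between the two. The sketch below assumes a fixed amount of input processed at a constant rate, so that run time is inversely proportional to throughput; whether the benchmark report measures things exactly this way is an assumption, and the function name is illustrative.

```python
# Relate a throughput improvement to the execution time of a fixed workload.
# Assumption: the same input is processed at a constant rate, so run time is
# inversely proportional to throughput. Real benchmark runs may deviate from this.

def relative_execution_time(throughput_gain: float) -> float:
    """Fraction of the baseline run time, given a fractional throughput gain."""
    return 1.0 / (1.0 + throughput_gain)

# 0.5 = the 50% average improvement; 1.0 = the "up to 2X" case.
for gain in (0.5, 1.0):
    fraction = relative_execution_time(gain)
    print(f"+{gain:.0%} throughput -> {fraction:.1%} of baseline time "
          f"({1 - fraction:.1%} less)")
```

Under this simple model, a 50% throughput improvement means a query finishes in about two-thirds of the baseline time, and a 2X improvement halves it.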
VERA: Full Stream Ahead!
Throughout this blog, we’ve shared many of the capabilities, key features and benefits of VERA, the engine powering Ververica’s Streaming Data Platform. To conclude, VERA is built to be fast, easy to use and cost effective, and designed to democratize stream processing and, as a result, revolutionize Open Source Apache Flink and the future of stream processing.
In addition, throughout this three-part blog series, we’ve explored why VERA was created, taken a look at the three Core Pillars that power the engine, and discussed the features and benefits that VERA offers users. Stay tuned for more content, including deep technical dives into the Core Pillars, and further explorations and examples of the myriad of interesting streaming data use cases that the VERA engine helps solve.
Finally, it would be poor form not to provide a shout-out to the many PMC Members, engineers, Ververicans, developers, and other community members who’ve spent long days creating VERA, contributing to the Apache Flink project and repo, and supporting projects including Apache Paimon and Flink CDC. Thank you!
More Resources
- Learn more about VERA.
- Watch Ververica Field CTO Ben Gamble discuss VERA on YouTube.
- Access the VERA docs.
- Ready to get started? Take VERA for a test run by spinning up a Ververica Cloud deployment.
- Have questions? Our team can help! Contact us.
- Join the Apache Flink Community at Flink Forward Berlin 2024, filled with Flink training courses, expert speakers, networking, an entire track dedicated to Flink use cases, and much more.
- Review the Nexmark Benchmark Report.
Did you enjoy this blog series? Consider sharing the links below and visit the VERA page.
VERA: The Cloud Native Engine Revolutionizing Apache Flink® 3-Part Blog Series: