Flink Forward Session Preview: Not So Big - Flink as a true Application Framework
How does an engineering team start thinking of Apache Flink more like an “application framework” instead of a “stream processor”? This will be one of the discussion items in our Flink Forward Europe 2019 talk, scheduled for October 8, 2019. MotaWord’s session is one of 50+ talks in this year’s conference program, ranging from Apache Flink use cases and ecosystem sessions to technical deep-dives and research talks.
And if you’re new to Flink, this year’s training program on October 7 includes 4 different sessions to choose from, ranging from Apache Flink Developer and Operations Training to SQL and Tuning & Troubleshooting training. Register for the event and become part of the Apache Flink community at the beginning of October in Berlin!
Learn more about MotaWord’s talk at Flink Forward Europe 2019 below.
Not So Big – Flink as a true Application Framework
We started MotaWord years ago as a collaborative human translation solution. That meant bringing multiple human translators, proofreaders and managers together on the same project (e.g. to translate a document, presentation, game or website) and giving them a single platform with tools optimized to support and improve collaboration, in an effort to provide visibility into the translation process, automate management tasks and get projects completed faster. This has been the core model of the MotaWord ecosystem ever since, with everything else in a typical translation workflow automated. Our models now allow us to scale our services to the needs of our clients, from startups to large corporations, and to build a historical understanding of the behavior of our system, translators, clients and managers, so that we can create real-time actionable insights to customize and scale automatically.
We have multiple facades continuously creating data (web, mobile, API, plugins, extensions, deep integrations), whether it is a QA warning on a glossary term during translation, a translator responding to our project notifications, or a client submitting a particular kind of project to be completed on a holiday weekend. This flow of data enabled us to imagine analyses such as “lifetime responsiveness trends of our vendors”, “types of projects a specific client is sending within a period”, or “quality trend of a proofreader in the medical field”.
Starting from this need (and imagination), we explored the typical players of the data scene: Apache Flink, Apache Storm, Apache Samza, WSO2, etc. Even though we also looked at some older rule-based approaches, we defined a strong constraint in our search: everything needs to be primarily a stream. We wanted only one primary data behavior in our new analysis system, and that was streaming: no bulk data, no cron-like behavior, no monthly reports. A second, equally crucial criterion was single, central responsibility. We didn’t want any of this stream and analysis component to spill its behavior out to other systems; our analysis had to be done within a single platform and presented to the world from that single platform.
We decided to go with Apache Flink and have not looked back since. It checked every box.
We realized that our desire for a centralized, single platform to consume, generate and present real-time, continuous data analysis was key to our success as a small team. In this talk, I am going to discuss how Apache Flink helps us think about this centralization, what tools it gives us (Queryable State, anyone?), where it excels and where it can be improved.
Unless I receive fundamentally perspective-changing comments during the talk (which I look forward to!), my takeaways from our experience with Apache Flink, and from the talk itself, are going to be:
- Apache Flink is the strongest application framework in the ecosystem. A stream-first mindset is a game changer in how you write and maintain code.
- A single “centralized stream analysis application” requires a complete rewiring of your typical software engineering mindset (a good thing).
- There will be workarounds… some of them ugly, some of them necessary. The ecosystem is still trying to make sense of “stream processing” as the primary data flow in “application framework design”.
- The future is bright. Stay up to date with the Apache Flink team (thumbs up for FLINK-12047).
At MotaWord, we are basically automating the behavior of our platform stakeholders. This is no easy task, and Apache Flink has helped us pave the way and made it more fun than it would otherwise have been. Join me at Flink Forward Europe 2019 for a more technical take on our journey.
We look forward to sharing our experience with Apache Flink with the global Flink community and to learning more about what other key players in this space are doing with stream processing and Apache Flink! Come join us in October!