Introduction to Apache Flink by Ellen Friedman & Kostas Tzoumas
Author: Friedman, Ellen & Tzoumas, Kostas
Language: eng
Format: epub
Publisher: O'Reilly Media
Published: 2016-10-18T16:00:00+00:00
Figure 4-3. Implementing continuous applications using a streaming architecture. The message transport (Kafka, MapR Streams) is shown here as a horizontal cylinder. It supplies streaming data to the stream processor (in our case, Flink), which is used for all data processing and provides results that are both real-time and correct.
The event stream is again served by the message transport and consumed by a single Flink job that produces hourly counts and (optional) early alerts. This approach solves all the previous problems in a straightforward way. Slowdowns in the Flink job or throughput spikes simply cause events to pile up in the message-transport tool. The logic that divides events into timely batches (called windows) is embedded entirely in the application logic of the Flink program. Early alerts are produced by the same program. Out-of-order events are handled transparently by Flink. Grouping by session instead of by a fixed time interval means simply changing the window definition in the Flink program. Additionally, replaying the application with changed code means simply replaying the Kafka topic. By adopting a streaming architecture, we have vastly reduced the number of systems to learn, administer, and write code for. The Flink application code to do this counting is straightforward.
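The book's own Flink listing is not reproduced in this excerpt. To illustrate the windowing idea described above, here is a minimal, library-free sketch in Python. It is not Flink code and not the authors' implementation: the names `window_start`, `hourly_counts`, and `alert_threshold`, the `(key, timestamp_ms)` event shape, and the sample data are all assumptions made for illustration. The sketch assigns each event to a one-hour tumbling window by its event timestamp, so an event that arrives out of order still lands in the correct window, and it records an early alert the moment a window's count reaches a threshold.

```python
from collections import defaultdict

HOUR_MS = 60 * 60 * 1000  # window size: one hour in milliseconds


def window_start(timestamp_ms, size_ms=HOUR_MS):
    """Return the start of the tumbling window that contains the timestamp."""
    return timestamp_ms - (timestamp_ms % size_ms)


def hourly_counts(events, alert_threshold=None):
    """Count events per (key, hourly window), bucketing by event time.

    events: iterable of (key, timestamp_ms) pairs, in any arrival order.
    Returns (counts, alerts). alerts lists the (key, window_start) pairs
    in the order their running count first reached alert_threshold.
    """
    counts = defaultdict(int)
    alerts = []
    for key, ts in events:
        bucket = (key, window_start(ts))
        counts[bucket] += 1
        # Early alert: fire once, as soon as the threshold is reached.
        if alert_threshold is not None and counts[bucket] == alert_threshold:
            alerts.append(bucket)
    return dict(counts), alerts


# Hypothetical sample data; the last event arrives late but carries a
# first-hour timestamp, so it is counted in the first window.
events = [
    ("page_a", 10_000),
    ("page_b", 20_000),
    ("page_a", HOUR_MS + 5_000),
    ("page_a", 30_000),
]
counts, alerts = hourly_counts(events, alert_threshold=2)
```

Unlike this sketch, which processes a finite list in one pass, a stream processor such as Flink works on an unbounded stream and has to decide when a window is complete enough to emit its result. Switching from fixed hourly windows to another grouping would, in this sketch, amount to replacing `window_start` with a different bucketing function.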
