Getting Started with Stream Processing Using Apache Flink

Pluralsight

Course Summary

Flink is a stateful, fault-tolerant, large-scale system with low latency and high throughput. It works with bounded and unbounded datasets using the same underlying stream-first architecture, focusing on streaming (unbounded) data.

Course Description

Apache Flink is a distributed computing engine used to process large-scale data. Flink is built on the concept of a stream-first architecture, where the stream is the source of truth. This course, Getting Started with Stream Processing Using Apache Flink, walks you through exploratory data analysis and data munging with Flink. You'll start off by learning about simple data transformations on streams such as map(), filter(), flatMap(), reduce(), sum(), min(), and max() on simple DataStreams and KeyedStreams. You'll then learn about window transformations in detail using tumbling, sliding, count, and session windows. You'll wrap up the course by exploring operations on multiple streams, such as union and joins. All of this comes with hands-on demos using Flink's Java API, along with a real-world project using Twitter's streaming API. After you've watched this course, you'll have a strong foundation in stream processing concepts using Apache Flink.

Course Syllabus

Course Overview
- 1m 34s

  - Course Overview 1m 34s

Understanding Streaming Data and Stream Processing
- 33m 37s

  - Why Stream Processing? 2m 16s
  - Batch Processing vs. Stream Processing 7m 3s
  - Requirements of Stream Processing Systems 5m 12s
  - Micro-batches for Stream Processing 2m 17s
  - Introducing Apache Flink for Stream Processing 4m 51s
  - Clients, Masters, and Workers 4m 13s
  - Install and Set up Flink 7m 43s

Implementing Basic Operations on Streaming Data
- 41m 35s

  - Data Representation and Transformations on a Stream 4m 11s
  - The Filter Transformation 8m 31s
  - The Map Transformation 4m 0s
  - The FlatMap Transformation 6m 26s
  - Stateless and Stateful Transformations 2m 21s
  - Keyed Streams 2m 18s
  - Transformations on Keyed Streams 6m 17s
  - The Reduce Operation 7m 27s

Windowing Operations on Streams
- 42m 45s

  - Introduction to Window Transformations 3m 56s
  - Tumbling Windows 2m 36s
  - Sliding Windows 2m 22s
  - Count, Session, and Global Windows 4m 46s
  - Event Time, Ingestion Time, and Processing Time 6m 41s
  - Implementing Tumbling and Sliding Windows 5m 23s
  - Implementing the Count Window 4m 43s
  - Implementing the Session Window 3m 5s
  - Getting the Twitter Consumer Keys and Access Tokens 2m 33s
  - Connecting to the Twitter Streaming API 6m 35s

Fault Tolerance with State and Checkpoints
- 33m 35s

  - Categories of State 5m 21s
  - Rich Functions to Store State 3m 26s
  - Making Transformations Stateful: ValueState<T> 6m 22s
  - Making Transformations Stateful: ListState<T> 4m 6s
  - Making Transformations Stateful: ReducingState<T> 3m 37s
  - Fault Tolerance with Checkpoints 6m 12s
  - Restart Strategies 4m 27s

Working with Multiple Stream Sources
- 11m 37s

  - The Union Operation 4m 7s
  - The Join Operation 7m 29s