Big Data Analytics by Unknown

Author: Unknown
Language: eng
Format: epub
Publisher: Packt Publishing


Let's check the lag between the latest offsets in Kafka and the offsets consumed by the Spark Streaming consumer using the following command:

bin/kafka-consumer-offset-checker.sh --zookeeper localhost:2181 --topic test --group spark-streaming-consumer

Group                     Topic  Pid  Offset  logSize  Lag
spark-streaming-consumer  test   0    18      18       0
spark-streaming-consumer  test   1    16      16       0

Note

The lag is 0 for both partitions: the consumer offset equals the log size in each case, so every message in the topic has been consumed.
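
The Lag column is simply the difference between logSize and Offset for each partition. The arithmetic can be sketched in Python as follows. This is an illustration only, not the offset checker's own code; the PartitionStatus type and the lag function are hypothetical names introduced here, and the values are copied from the output above.

```python
from typing import NamedTuple


class PartitionStatus(NamedTuple):
    """One row of the offset checker output (hypothetical helper type)."""
    partition: int
    offset: int    # Offset column: position reached by the consumer group
    log_size: int  # logSize column: latest offset written to the partition


def lag(status: PartitionStatus) -> int:
    """Number of messages the consumer still has to read."""
    return status.log_size - status.offset


# Values taken from the two rows shown above.
statuses = [PartitionStatus(0, 18, 18), PartitionStatus(1, 16, 16)]
lags = {s.partition: lag(s) for s in statuses}
print(lags)
```

A non-zero value would mean that messages have been written to the partition but not yet consumed by the group.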

Direct approach (no receivers)

The direct approach was introduced in Spark 1.3 to provide exactly-once semantics for received data, even in case of failures. Instead of using receivers, it periodically queries Kafka for the latest offsets in each topic and partition and uses them to define the offset ranges to process in each batch, as shown in Figure 5.6.

Figure 5.6: Spark Streaming with Kafka direct approach
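
The way offset ranges are derived for each batch can be illustrated with a small pure-Python sketch. It is not Spark's implementation and does not talk to Kafka: the latest offsets are passed in as a plain dictionary, and plan_batch, advance, and this OffsetRange type are hypothetical names introduced for illustration. Starting an unseen partition at offset 0 is also an assumption of the sketch.

```python
from typing import Dict, List, NamedTuple, Tuple

TopicPartition = Tuple[str, int]  # (topic name, partition id)


class OffsetRange(NamedTuple):
    """Slice of one partition to process in a batch (hypothetical type)."""
    topic: str
    partition: int
    from_offset: int   # inclusive
    until_offset: int  # exclusive


def plan_batch(processed: Dict[TopicPartition, int],
               latest: Dict[TopicPartition, int]) -> List[OffsetRange]:
    """Build one offset range per partition that has new messages."""
    ranges = []
    for (topic, partition), until in sorted(latest.items()):
        # Assumption: a partition never seen before starts at offset 0.
        start = processed.get((topic, partition), 0)
        if until > start:
            ranges.append(OffsetRange(topic, partition, start, until))
    return ranges


def advance(processed: Dict[TopicPartition, int],
            ranges: List[OffsetRange]) -> Dict[TopicPartition, int]:
    """Return the processed offsets after a batch completes."""
    updated = dict(processed)
    for r in ranges:
        updated[(r.topic, r.partition)] = r.until_offset
    return updated


# Offsets already processed, and the latest offsets reported for the topic.
processed = {("test", 0): 18, ("test", 1): 16}
latest = {("test", 0): 21, ("test", 1): 16}

batch = plan_batch(processed, latest)
print(batch)
processed_after = advance(processed, batch)
print(processed_after)
```

Because each batch is described by explicit offset ranges rather than by whatever a receiver happened to buffer, a failed batch can be recomputed from the same ranges, which is the property the exactly-once guarantee relies on.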


