Understanding Azure Data Factory by Sudhir Rawat & Abhishek Narain

Author: Sudhir Rawat & Abhishek Narain
Language: eng
Format: epub
ISBN: 9781484241226
Publisher: Apress


13) Open the file named part-00000 to view the total number of words in an input document.

Spark Activity

Apache Spark provides primitives for in-memory cluster computing. The main difference between Spark and Hadoop MapReduce is that Spark processes data primarily in memory and uses the disk only when needed, whereas Hadoop MapReduce writes intermediate results to disk between processing stages.

Azure Data Factory provides a Spark activity, which can run on an HDInsight cluster, for data transformation. In this example, assume you have received data from all the stores and want to find the average sale for each store. Let’s explore how to leverage an existing HDInsight cluster to build this small solution.

1) Switch to Azure.
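Before setting anything up in Azure, it may help to see the computation itself. The following is a minimal plain-Python sketch of the per-store average described in this section, not the script used with the Spark activity. The record layout (a store ID paired with a sale amount), the sample values, and the function name `average_sale_per_store` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (store_id, sale_amount).
# The real input layout is not specified here; this shape is assumed.
sales = [
    ("store_1", 100.0),
    ("store_2", 50.0),
    ("store_1", 300.0),
    ("store_2", 150.0),
    ("store_3", 80.0),
]


def average_sale_per_store(records):
    """Return a dict mapping each store ID to its mean sale amount."""
    # Keep a running [total, count] per store, similar in spirit to
    # aggregating by key in a distributed job.
    totals = defaultdict(lambda: [0.0, 0])
    for store_id, amount in records:
        totals[store_id][0] += amount
        totals[store_id][1] += 1
    return {store: total / count for store, (total, count) in totals.items()}


averages = average_sale_per_store(sales)
for store, avg in sorted(averages.items()):
    print(store, avg)
```

In a PySpark script, the same grouping would typically be expressed on a DataFrame with `groupBy` followed by `avg`, which lets the cluster distribute the work across nodes.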


