Modern Big Data Architectures by Dominik Ryzko

Author: Dominik Ryzko [Ryżko, Dominik]
Language: eng
Format: epub
ISBN: 9781119597933
Publisher: Wiley
Published: 2020-04-14T00:00:00+00:00


where

Additionally, a function can be provided that performs a parallel tree reduction, combining the results of multiple parallel folds.
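The idea of combining several folds through a tree reduction can be illustrated with a minimal Python sketch. It is not taken from the book: the names `tree_reduce`, `chunked_fold`, `fold_fn`, `combine` and `chunk_size` are hypothetical, the chunks are folded sequentially here (in a real system each chunk could be folded on a separate worker), and `combine` is assumed to be associative.

```python
from functools import reduce


def tree_reduce(values, combine):
    """Combine values pairwise, level by level, until a single value remains."""
    if not values:
        raise ValueError("tree_reduce needs at least one value")
    level = list(values)
    while len(level) > 1:
        # Combine neighbours (0,1), (2,3), ...; each pair is independent work.
        next_level = [combine(level[i], level[i + 1])
                      for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            # An unpaired last element is carried to the next level unchanged.
            next_level.append(level[-1])
        level = next_level
    return level[0]


def chunked_fold(data, fold_fn, initial, combine, chunk_size):
    """Fold each chunk independently, then merge the partial results as a tree."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Assumption: each of these folds could run in parallel; here they run in turn.
    partials = [reduce(fold_fn, chunk, initial) for chunk in chunks]
    if not partials:
        return initial
    return tree_reduce(partials, combine)


total = chunked_fold(list(range(1, 11)),
                     fold_fn=lambda acc, x: acc + x,
                     initial=0,
                     combine=lambda a, b: a + b,
                     chunk_size=3)
```

With addition as both the fold and the combining function, `total` is the sum of the numbers 1 to 10. The combining function has to be associative for the tree-shaped merge to agree with a single sequential fold.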

6.1.3 All-Pairs

All-Pairs is a high-level abstraction for expressing data-intensive workloads that allows jobs submitted by non-experts to be executed efficiently. The simplest implementation of the All-Pairs problem is a nested loop (see Algorithm 2).
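The nested-loop formulation can be sketched in a few lines of Python. This is an illustrative version, not the book's Algorithm 2: the names `all_pairs`, `set_a`, `set_b` and `compare` are assumptions, and the result is taken to be a matrix whose entry (i, j) is the comparison function applied to the i-th element of the first set and the j-th element of the second.

```python
def all_pairs(set_a, set_b, compare):
    """Apply compare to every pair (a, b) and return the results as a matrix."""
    matrix = []
    for a in set_a:              # outer loop over the first set
        row = []
        for b in set_b:          # inner loop over the second set
            row.append(compare(a, b))
        matrix.append(row)
    return matrix


# Hypothetical comparison function: absolute difference of two numbers.
result = all_pairs([1, 2], [10, 20, 30], lambda a, b: abs(a - b))
```

The number of calls to `compare` is the product of the two set sizes, which is why this direct form scales poorly once the sets become large.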

However, running such jobs naively on large data sets usually leads to poor performance. The more sophisticated approach proposed by Moretti et al. [2008] achieves better performance for All-Pairs computations. It distinguishes four phases: modeling the system, distributing the data, dispatching batch jobs, and cleaning up the system.

In the first stage, the system uses a model to estimate the turnaround time, which is calculated as the sum of the data transfer time, the computation time and the dispatch latency. In the end the following equation is proposed:


