The Analytics Revolution: How to Improve Your Business By Making Analytics Operational In The Big Data Era by Bill Franks

Author: Bill Franks
Language: eng
Format: mobi, pdf
ISBN: 9781118976760
Publisher: Wiley
Published: 2014-09-15T14:00:00+00:00


Any Data, Any Format, Any Volume

Hadoop's ability to handle any volume of data in any format makes it an important pillar of a unified analytics environment.

For example, you can't compute a mean with a node- or worker-level process, because each worker computes the mean of only the data it holds and reports that local mean back. As you might remember from your Statistics 101 class, you cannot take the mean of means and expect the right answer; it matches the true mean only when every worker holds the same number of records. Instead, you must collect the overall sum and the overall count, then divide one by the other to get the overall mean. (For an illustration, see Figures 5.5 and 5.6.) The programmer must ensure that code runs at the right level of parallelism for computations to come out correctly within Hadoop. In contrast, a parallel relational environment is constructed so that system-level parallelism is the standard.
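The mean-of-means pitfall can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, not Hadoop code: the `partitions` list, its values, and the function names `mean_of_means` and `combined_mean` are all hypothetical, standing in for data spread unevenly across two workers.

```python
# Hypothetical per-worker partitions; the values are illustrative only.
partitions = [
    [10.0, 20.0, 30.0],  # worker 1 holds three records
    [100.0],             # worker 2 holds one record
]


def mean_of_means(parts):
    """Worker-level approach: average each worker's local mean (incorrect)."""
    local_means = [sum(p) / len(p) for p in parts]
    return sum(local_means) / len(local_means)


def combined_mean(parts):
    """System-level approach: each worker reports (sum, count); combine them."""
    partials = [(sum(p), len(p)) for p in parts]
    total_sum = sum(s for s, _ in partials)
    total_count = sum(c for _, c in partials)
    return total_sum / total_count


print(mean_of_means(partitions))  # 60.0, skewed by the small partition
print(combined_mean(partitions))  # 40.0, the true mean of all four records
```

The single record on the second worker carries as much weight as the three records on the first when local means are averaged, which is why the two results differ. Passing sums and counts rather than means keeps every record equally weighted.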


