Big Data: Principles and best practices of scalable realtime data systems by Nathan Marz & James Warren
Author: Nathan Marz & James Warren
Language: eng
Format: epub, mobi
Publisher: Manning Publications
The next step is to select a single user identifier for each person. This is the most sophisticated part of the workflow, as it involves a fully distributed iterative graph algorithm. Despite that complexity, it takes only a few small pipe diagrams to express. With the appropriate tooling, you can implement it in about 100 lines of code (as will be demonstrated in the next chapter).
User IDs are marked as belonging to the same person via equiv edges. If you were to visualize these edges from a dataset, you’d see numerous independent subgraphs, as shown in figure 8.7.
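To make the idea concrete, here is a minimal single-machine sketch of choosing one identifier per subgraph. It is not the distributed implementation the text refers to. It assumes that equiv edges arrive as pairs of comparable IDs and that the smallest ID in each subgraph is chosen as the representative. The function name `select_user_ids` and that selection rule are illustrative assumptions.

```python
def select_user_ids(equiv_edges):
    """Map every user ID to one representative ID per connected subgraph.

    Assumption: the representative is the smallest ID in the subgraph.
    """
    # Start with every ID acting as its own representative.
    representative = {}
    for a, b in equiv_edges:
        representative.setdefault(a, a)
        representative.setdefault(b, b)

    # Repeatedly push the smaller representative across each equiv edge.
    # Stop at the fixed point, when a full pass changes nothing.
    changed = True
    while changed:
        changed = False
        for a, b in equiv_edges:
            smallest = min(representative[a], representative[b])
            for node in (a, b):
                if representative[node] != smallest:
                    representative[node] = smallest
                    changed = True
    return representative


# Two independent subgraphs: {1, 3, 4, 5} and {11, 12}.
edges = [(3, 4), (4, 1), (5, 3), (11, 12)]
ids = select_user_ids(edges)
```

Each pass over the edges corresponds loosely to one iteration of the iterative algorithm. In a distributed setting, each iteration would be a batch job over the full edge set rather than an in-memory loop.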
