Big Data For Dummies by Judith Hurwitz
Author:Judith Hurwitz [Hurwitz, Judith]
Language: eng
Format: epub
Published: 2013-03-21T16:23:17+00:00
Chapter 12: Defining Big Data Analysis
to turn to Chapter 4 for more details on infrastructure issues. Suffice it to say that if you’re looking for a platform, it needs to achieve the following:
✓ Integrate technologies: The infrastructure needs to integrate new big data technologies with traditional technologies to be able to process all kinds of big data and make it consumable by traditional analytics.
✓ Store large amounts of disparate data: An enterprise-hardened Hadoop system may be needed that can process/store/manage large amounts of data at rest, whether it is structured, semi-structured, or unstructured.
✓ Process data in motion: A stream-computing capability may be needed to process data in motion that is continuously generated by sensors, smart devices, video, audio, and logs to support real-time decision making.
✓ Warehouse data: You may need a solution optimized for operational or deep analytical workloads to store and manage the growing amounts of trusted data.
And of course, you need the capability to integrate the data you already have in place along with the results of the big data analysis.
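To make the "data in motion" item a little more concrete, here is a minimal sketch of the idea: each reading is processed as it arrives rather than after everything has been stored. The function name, the window size, and the sample temperature readings are all assumptions made up for illustration; a real stream-computing platform does this across many machines and far larger volumes.

```python
from collections import deque
from typing import Iterable, Iterator


def rolling_mean(readings: Iterable[float], window: int = 3) -> Iterator[float]:
    """Yield the mean of the most recent `window` readings as each one arrives."""
    recent = deque(maxlen=window)  # the oldest reading drops out automatically
    total = 0.0
    for value in readings:
        if len(recent) == window:
            total -= recent[0]  # this reading is about to be evicted by append
        recent.append(value)
        total += value
        yield total / len(recent)


# Hypothetical sensor feed: values are consumed one at a time, never all at once.
sensor_feed = iter([20.0, 22.0, 24.0, 30.0])
means = list(rolling_mean(sensor_feed))
```

Because the function keeps only the last few readings in memory, it can run indefinitely on a continuous feed, which is the property that separates data in motion from data at rest.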
Studying Big Data Analytics Examples
Big data analytics has many different use cases. We mention examples throughout this book, but we now look at a few more from Internet companies and other organizations.
Orbitz
If you’ve ever looked for deals on travel, you’ve probably been to sites like Orbitz (www.orbitz.com). The company was established in 1999, and its website went live in 2001. Users of Orbitz perform over a million searches a day, and the company collects hundreds of gigabytes of raw data each day from these searches. Orbitz realized that the web log files collected by its web analytics software might hold useful information about how consumers interact with its site. In particular, it wanted to see whether it could identify consumer preferences and so determine the best-performing hotels to display to users, increasing conversions (bookings). It had not used this data in the past because storing all of it was too expensive. To help, it implemented Hadoop and Hive running on commodity hardware. Hadoop provided the distributed file system, and Hive provided an SQL-type interface. It took a series of steps to put the data into Hive. After the data was in Hive, the company used machine learning, a data-driven, data-mining approach (see the sidebar earlier in this chapter) to unearthing patterns in data and helping to analyze it. For more details about Hadoop and Hive, turn to Chapters 9 and 10.
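To give a feel for what an SQL-type interface over web logs makes possible, here is a minimal sketch that ranks hotels by conversion rate (bookings divided by views). It is not Orbitz's actual pipeline: the table name web_log, the columns hotel_id and event, and the sample rows are all hypothetical, and Python's built-in sqlite3 module stands in for Hive so that the example runs on a single machine.

```python
import sqlite3

# Hypothetical log rows: (hotel_id, event), where event is "view" or "booking".
log_rows = [
    ("hotel_a", "view"), ("hotel_a", "view"), ("hotel_a", "booking"),
    ("hotel_b", "view"), ("hotel_b", "view"), ("hotel_b", "view"),
    ("hotel_b", "view"), ("hotel_b", "booking"),
]

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a Hive table
conn.execute("CREATE TABLE web_log (hotel_id TEXT, event TEXT)")
conn.executemany("INSERT INTO web_log VALUES (?, ?)", log_rows)

# Count bookings and views per hotel, then rank by bookings per view.
query = """
    SELECT hotel_id,
           SUM(CASE WHEN event = 'booking' THEN 1 ELSE 0 END) * 1.0
             / SUM(CASE WHEN event = 'view' THEN 1 ELSE 0 END) AS conversion_rate
    FROM web_log
    GROUP BY hotel_id
    ORDER BY conversion_rate DESC
"""
ranking = conn.execute(query).fetchall()
conn.close()
```

In this made-up data, hotel_a converts one of two views and hotel_b one of four, so hotel_a ranks first. A comparable GROUP BY query could be written in Hive's query language, with Hadoop spreading the work across the cluster.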
Nokia
Nokia provides wireless communication devices and services. The company believes that its data is a strategic asset. Its big data analytics service includes a multipetabyte platform that executes tens of thousands of jobs each day, including advanced analytics over terabytes of streaming data. For example, the company wants to understand how people interact with the different applications on its phones. Nokia wants to understand what features customers use, how they use a feature, how they move from feature to feature, and whether they get lost in the application as they are using it. This level
