Big Data Analytics Methods by Peter Ghavami
Author: Peter Ghavami
Language: eng
Format: epub, pdf
Publisher: De Gruyter
Published: 2019-11-18T06:19:23.453000+00:00
Treatment phase:
Data cleansing (data scrubbing) removes invalid data points from a data set. You may delete data that does not fit the data series, pattern, or frequency distribution. De-duplicating data requires a data transformation, but first test the data for matching records so that the duplicates can be identified.
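The match-then-de-duplicate order described above can be sketched in Python. This is a minimal illustration, not the book's method: the function names (normalize, find_duplicates, deduplicate), the choice of key fields, and the trim-and-lower-case matching rule are all assumptions made for the example.

```python
from collections import defaultdict


def normalize(record, key_fields):
    """Build a comparison key from trimmed, lower-cased key-field values."""
    return tuple(str(record.get(f, "")).strip().lower() for f in key_fields)


def find_duplicates(records, key_fields):
    """Test step: return groups of record indices that share a key."""
    groups = defaultdict(list)
    for i, rec in enumerate(records):
        groups[normalize(rec, key_fields)].append(i)
    return [idxs for idxs in groups.values() if len(idxs) > 1]


def deduplicate(records, key_fields):
    """Transformation step: keep the first record of each matching group."""
    seen = set()
    unique = []
    for rec in records:
        key = normalize(rec, key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical sample data: the first two records differ only in case and spacing.
records = [
    {"name": "Ann Lee", "email": "ann@example.com"},
    {"name": " ann lee ", "email": "ANN@example.com"},
    {"name": "Bob Ray", "email": "bob@example.com"},
]
duplicate_groups = find_duplicates(records, ["name", "email"])
cleaned = deduplicate(records, ["name", "email"])
```

Running the matching step separately lets you review the candidate groups before any record is dropped. Exact matching on normalized keys is the simplest possible rule; real data sets may need fuzzier matching.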
Do not summarily remove outliers. An outlier may carry significance, so be careful about deleting it. When the source or the data collection process is not trusted, cleansing (deleting) data can be left to human judgment. Constraint tests can detect inaccurate data (e.g. SSN 999-99-9999), out-of-range values, format mismatches, and foreign key violations. You can also detect issues such as missing data, a “0” where a blank or N/A was expected, and “999” or “9999” used to indicate no data.
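A few of the constraint tests named above can be sketched as a single record check. The field names (ssn, age, dept_id), the age range, and the set of valid department ids are hypothetical, chosen only to illustrate format, placeholder, sentinel, range, and foreign key checks. The check reports issues and deletes nothing, which leaves the removal decision to a reviewer.

```python
import re

SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")  # assumed format NNN-NN-NNNN
SENTINELS = {"999", "9999"}                       # codes used to mean "no data"
MISSING = {"", "n/a", "na"}                       # blank or N/A markers


def check_record(record, valid_dept_ids, age_range=(0, 120)):
    """Return a list of constraint violations found in one record."""
    issues = []

    # Format check, then a check for the well-known placeholder SSN.
    ssn = str(record.get("ssn", "")).strip()
    if not SSN_PATTERN.match(ssn):
        issues.append("ssn: format mismatch")
    elif ssn == "999-99-9999":
        issues.append("ssn: placeholder value")

    # Missing-data, sentinel, format, and range checks on a numeric field.
    age = str(record.get("age", "")).strip()
    if age.lower() in MISSING:
        issues.append("age: missing")
    elif age in SENTINELS:
        issues.append("age: sentinel for no data")
    else:
        try:
            value = int(age)
        except ValueError:
            issues.append("age: format mismatch")
        else:
            if not age_range[0] <= value <= age_range[1]:
                issues.append("age: out of range")

    # Foreign key check against the ids known to the referenced table.
    if record.get("dept_id") not in valid_dept_ids:
        issues.append("dept_id: foreign key not found")

    return issues


# Hypothetical usage: department ids 1 and 2 exist in the referenced table.
bad = check_record({"ssn": "999-99-9999", "age": "999", "dept_id": 7}, {1, 2})
good = check_record({"ssn": "123-45-6789", "age": "34", "dept_id": 1}, {1, 2})
```

Whether a “0” stands for a real measurement or for a blank depends on the field, so that check is left out of the sketch and would need a rule specific to each column.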