
Learning Spark by Holden Karau, Andy Konwinski, Patrick Wendell, and Matei Zaharia

Author: Holden Karau, Andy Konwinski, Patrick Wendell, and Matei Zaharia, Date: March 16, 2018

Language: eng
Format: mobi, pdf
Publisher: O'Reilly Media, Inc.
Published: 2015-01-26T05:00:00+00:00

Storage on the cluster

Spark EC2 clusters come configured with two installations of the Hadoop filesystem that you can use for scratch space. This is handy for saving datasets in a medium that’s faster to access than Amazon S3. The two installations are:

An “ephemeral” HDFS installation using the ephemeral drives on the nodes. Most Amazon instance types come with a substantial amount of local space attached on “ephemeral” drives that go away if you stop the instance. This installation of HDFS uses this space, giving you a significant amount of scratch space, but it loses all data when you stop and restart the EC2 cluster. It is installed in the /root/ephemeral-hdfs directory on the nodes, where you can use the bin/hdfs command to access and list files. You can also view the web UI and HDFS URL for it at http://masternode:50070.

A “persistent” HDFS installation on the root volumes of the nodes. This installation persists data even through cluster restarts, but is generally smaller and slower to access than the ephemeral one. It is good for medium-sized datasets that you do not wish to download multiple times. It is installed in /root/persistent-hdfs, and you can view the web UI and HDFS URL for it at http://masternode:60070.
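The differences between the two installations can be summarized in a small Python sketch. This is only an illustration, not part of the Spark EC2 scripts: the HdfsInstall class, the choose_install helper, and the `dfs -ls` arguments are assumptions introduced here, while the install directories, the bin/hdfs command, and the web UI ports come from the descriptions above. The host name masternode is a placeholder for your cluster's master address.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HdfsInstall:
    """Hypothetical record describing one HDFS installation on the cluster."""
    name: str
    install_dir: str        # directory on the nodes, as given in the text
    web_ui_port: int        # port of the web UI on the master node
    survives_restart: bool  # whether data is kept across a stop/restart

    def web_ui_url(self, master_host: str) -> str:
        # The web UI is also where the text says the HDFS URL is shown.
        return f"http://{master_host}:{self.web_ui_port}"

    def list_command(self, path: str = "/") -> List[str]:
        # Assumption: files are listed with the "dfs -ls" subcommand of bin/hdfs.
        return [f"{self.install_dir}/bin/hdfs", "dfs", "-ls", path]


EPHEMERAL = HdfsInstall("ephemeral", "/root/ephemeral-hdfs", 50070, False)
PERSISTENT = HdfsInstall("persistent", "/root/persistent-hdfs", 60070, True)


def choose_install(must_survive_restart: bool) -> HdfsInstall:
    """Pick persistent HDFS only when the data must outlive a cluster restart."""
    return PERSISTENT if must_survive_restart else EPHEMERAL


if __name__ == "__main__":
    install = choose_install(must_survive_restart=False)
    print(install.web_ui_url("masternode"))
    print(" ".join(install.list_command()))
```

The helper defaults to the ephemeral installation because it is the larger and faster of the two; the persistent one is worth its slower access only when the data must still be there after the cluster is stopped and restarted.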
