Big Data and Analytics by Samiya Khan
Author:Samiya Khan
Language: eng
Format: epub
ISBN: 9798885304887
Publisher: Notion Press
Published: 2021-11-15T00:00:00+00:00
7.3 CLASSIFICATION OF NOSQL DATABASES
What all NoSQL databases have in common is that they do not rely on SQL and the relational model as their primary interface. The four categories of NoSQL databases are given below.
7.3.1 Key-Value Stores
In such databases, all operations are performed using a key. For example, in a session database, data is retrieved from the table via SessionID, which is the key for the table.
Databases:
• Dynamo (Amazon)
Dynamo is Amazon's key-value store. Its managed successor, DynamoDB, is offered in the form of Database-as-a-Service.
• MemBase
• CitrusLeaf
• Voldemort (LinkedIn)
• Riak
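The key-based access pattern described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the API of any of the listed products; the class name `SessionStore`, its methods and the sample session ID are all hypothetical.

```python
from typing import Dict, Optional


class SessionStore:
    """Toy in-memory key-value store keyed by session ID (hypothetical)."""

    def __init__(self) -> None:
        # The whole "database" is a mapping from key to opaque value.
        self._data: Dict[str, dict] = {}

    def put(self, session_id: str, value: dict) -> None:
        # Insert or overwrite the value stored under the key.
        self._data[session_id] = value

    def get(self, session_id: str) -> Optional[dict]:
        # Look up by key only; returns None if the key is absent.
        return self._data.get(session_id)

    def delete(self, session_id: str) -> None:
        # Remove the key if present; a missing key is not an error.
        self._data.pop(session_id, None)


store = SessionStore()
store.put("sess-42", {"user": "alice", "cart": ["book"]})
session = store.get("sess-42")
missing = store.get("sess-99")
```

Note that every operation takes the key as its argument: there is no way to ask the store for "all sessions belonging to alice" without scanning every value, which is the trade-off a key-value store makes for simple, fast lookups.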
7.3.2 BigTable Clones
This class of NoSQL databases is based on Google's whitepaper that introduced the BigTable concept. Databases that fall under this category include:
• HBase
It is an Apache project that is implemented in Java. We will discuss this database in detail later in the chapter.
• BigTable (Google)
This is the original system, described in the whitepaper published by Google.
• Hypertable
It is a C++ implementation of the design described in Google's whitepaper.
• Cassandra
Facebook originally developed this database system. Cassandra's development team included an engineer who had also worked on Amazon's Dynamo. As a result, Cassandra is largely a combination of ideas from BigTable and Dynamo.
7.3.3 Document Databases
This class of databases is extremely useful for storing data in XML or JSON format. It includes the following:
• MongoDB
• CouchOne
• OrientDB
• TerraStore
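To make the idea of storing JSON documents concrete, here is a minimal sketch using only Python's standard `json` module. It does not use the API of any database listed above; the `insert` and `find` helpers, the document IDs and the sample data are hypothetical.

```python
import json

# Hypothetical in-memory "collection": document ID -> serialized JSON text.
documents = {}


def insert(doc_id, doc):
    """Store a document as JSON text under its ID."""
    documents[doc_id] = json.dumps(doc)


def find(field, value):
    """Return every document whose top-level field equals the given value."""
    results = []
    for raw in documents.values():
        doc = json.loads(raw)
        if doc.get(field) == value:
            results.append(doc)
    return results


# Documents in the same collection need not share the same fields.
insert("u1", {"name": "Asha", "city": "Delhi", "tags": ["admin"]})
insert("u2", {"name": "Ravi", "city": "Pune"})

matches = find("city", "Pune")
```

Unlike a key-value store, the query here looks inside the stored value. Real document databases support this with indexes rather than the full scan used in this sketch.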
7.3.4 Graph Databases
These databases are based on the graph concept. The best use case for such databases is a social media network, which can be modelled as a graph with users as vertices and the connections between them as edges. The graph concept is an efficient method for managing social media data. Solutions in this category include:
• FlockDB
• Neo4J
• InfoGrid
• Sones
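The social network example above, with users as vertices and connections as edges, can be sketched with a plain adjacency structure. This is an illustrative in-memory model only, not how any of the listed products is implemented; the `SocialGraph` class and the user names are hypothetical.

```python
from collections import defaultdict


class SocialGraph:
    """Toy undirected graph: users are vertices, connections are edges."""

    def __init__(self):
        # Adjacency sets: user -> set of directly connected users.
        self._adj = defaultdict(set)

    def connect(self, a, b):
        # An undirected edge is stored in both directions.
        self._adj[a].add(b)
        self._adj[b].add(a)

    def friends(self, user):
        # Return a copy so callers cannot mutate the graph.
        return set(self._adj.get(user, set()))

    def friends_of_friends(self, user):
        # Users exactly two edges away: neighbours of neighbours,
        # excluding the user and their direct friends.
        direct = self.friends(user)
        result = set()
        for friend in direct:
            result |= self._adj[friend]
        return result - direct - {user}


graph = SocialGraph()
graph.connect("alice", "bob")
graph.connect("bob", "carol")
graph.connect("alice", "dave")

suggestions = graph.friends_of_friends("alice")
```

A "people you may know" query like `friends_of_friends` is a short traversal in a graph model, whereas in a relational schema it would typically require joining a connections table with itself.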
7.3.5 CAP Theorem
Which database to choose from the categories above depends largely on your performance and system requirements. The CAP theorem is a standard aid for making such decisions. It considers three characteristics, namely consistency, availability and partition tolerance, and states that a distributed database can guarantee at most two of them at the same time. The choice of database should therefore be driven by which characteristics matter most to you. The three characteristics are as follows:
1. Consistency
The commits performed in a database are atomic across the whole database, so every client sees the same data regardless of which node it reads from.
2. Availability
The database is available and accessible at all times, so every request receives a response.
3. Partition Tolerance
The system continues to respond correctly when network failures split it into partitions. Only a total network failure stops it.
The distribution of databases across these characteristics is illustrated in Fig. 7.1.
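The trade-off behind the CAP theorem can be illustrated with a toy simulation of a store with two replicas. This is a deliberately simplified sketch under stated assumptions, not a model of any real database: the `TwoNodeStore` class, its `mode` flag ("CP" or "AP") and the `partitioned` switch are all hypothetical.

```python
class TwoNodeStore:
    """Toy store with two replicas, used to illustrate the CAP trade-off."""

    def __init__(self, mode):
        # mode is "CP" (prefer consistency) or "AP" (prefer availability).
        self.mode = mode
        self.replica_a = None
        self.replica_b = None
        # Set to True to simulate a network partition between the replicas.
        self.partitioned = False

    def write(self, value):
        """Write through replica A; return True if the write was accepted."""
        if not self.partitioned:
            # Healthy network: both replicas are updated together.
            self.replica_a = value
            self.replica_b = value
            return True
        if self.mode == "CP":
            # Refuse the write: the replicas stay identical,
            # but the store is unavailable for writes.
            return False
        # AP: accept the write on the reachable replica only,
        # so the replicas now disagree.
        self.replica_a = value
        return True


cp = TwoNodeStore("CP")
cp.write("v1")
cp.partitioned = True
cp_accepted = cp.write("v2")

ap = TwoNodeStore("AP")
ap.write("v1")
ap.partitioned = True
ap_accepted = ap.write("v2")
```

During the partition, the "CP" store rejects the write and both replicas still hold `"v1"`, while the "AP" store accepts it and its replicas diverge (`"v2"` on one, `"v1"` on the other). Once a partition occurs, the system has to give up either consistency or availability.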
