Real-Time Analytics by Byron Ellis
Author: Byron Ellis
Language: eng
Format: epub, mobi
ISBN: 9781118837917
Published: 2014-07-14T00:00:00+00:00
Like SQL databases, MongoDB offers facilities for grouping and aggregating data in queries. The original facilities for aggregation were the group() and mapReduce() commands, but MongoDB 2.2 and later versions also support an optimized aggregate() command.
Unlike SQL, the aggregate() command uses a pipeline approach to compute its results: it takes an array of filtering and grouping commands that are applied in order to reach a final result. This is easiest to understand in action, so first build a collection with some example data:
> abc = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'];
> db.createCollection("aggtest");
> for(var i=0;i<1000;i++) {
...   db.aggtest.insert({
...     first:abc[Math.floor(Math.random()*abc.length)],
...     second:abc[Math.floor(Math.random()*abc.length)],
...     count:Math.floor(1000*Math.random())
...   });
... }
> db.aggtest.find({})
{ "_id" : ObjectId("53213bc8ae5fcad63d0563e9"), "first" : "S", "second" : "W", "count" : 762 }
{ "_id" : ObjectId("53213bc8ae5fcad63d0563ea"), "first" : "E", "second" : "V", "count" : 381 }
{ "_id" : ObjectId("53213bc8ae5fcad63d0563eb"), "first" : "Q", "second" : "O", "count" : 143 }
{ "_id" : ObjectId("53213bc8ae5fcad63d0563ec"), "first" : "C", "second" : "I", "count" : 601 }
{ "_id" : ObjectId("53213bc8ae5fcad63d0563ed"), "first" : "B", "second" : "C", "count" : 413 }
{ "_id" : ObjectId("53213bc8ae5fcad63d0563ee"), "first" : "M", "second" : "D", "count" : 790 }
{ "_id" : ObjectId("53213bc8ae5fcad63d0563ef"), "first" : "S", "second" : "Q", "count" : 699 }
{ "_id" : ObjectId("53213bc8ae5fcad63d0563f0"), "first" : "A", "second" : "M", "count" : 615 }
... other output omitted
Type "it" for more
The first stage of an aggregation pipeline is usually a filtering step that acts like the WHERE clause of a SQL statement. It is identified by a $match operator, as in this example, which selects all of the documents whose “first” field has the value “A”:
> db.aggtest.aggregate([{$match:{first:"A"}}]);
{
  "result" : [
    { "_id" : ObjectId("53213bc8ae5fcad63d0563f0"), "first" : "A", "second" : "M", "count" : 615 },
    { "_id" : ObjectId("53213bc8ae5fcad63d0563f4"), "first" : "A", "second" : "F", "count" : 806 },
    { "_id" : ObjectId("53213bc8ae5fcad63d056402"), "first" : "A", "second" : "Q", "count" : 377 },
    ...more content omitted...
    { "_id" : ObjectId("53213bc9ae5fcad63d0567c5"), "first" : "A", "second" : "G", "count" : 769 }
  ],
  "ok" : 1
}
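To make the WHERE analogy concrete, here is a minimal sketch in plain JavaScript that mimics an equality-only $match on an in-memory array. It is an illustration of the semantics, not MongoDB's implementation; the matchStage helper and the sample array are hypothetical names introduced here, and the real operator supports far richer query expressions than simple equality.

```javascript
// Plain-JavaScript model of an equality-only $match stage.
// Hypothetical helper for illustration; not part of MongoDB.
function matchStage(docs, criteria) {
  // Keep a document only if every field named in criteria has the given value.
  return docs.filter((doc) =>
    Object.keys(criteria).every((field) => doc[field] === criteria[field])
  );
}

// A few documents shaped like those in aggtest (values taken from the output above).
const sample = [
  { first: "S", second: "W", count: 762 },
  { first: "A", second: "M", count: 615 },
  { first: "A", second: "F", count: 806 },
  { first: "B", second: "C", count: 413 },
];

// Equivalent in spirit to: SELECT * FROM aggtest WHERE first = 'A'
const matched = matchStage(sample, { first: "A" });
console.log(matched);
```

Listing several fields in the criteria object behaves like joining conditions with AND, which is also how multiple fields in a single $match document are treated.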
Other filtering options are $limit and $skip. Used as an initial filter, mostly for testing, $limit restricts the number of documents entering the aggregation, as in this example:
> db.aggtest.aggregate([{$limit:1}]);
{
  "result" : [
    { "_id" : ObjectId("53213bc8ae5fcad63d0563e9"), "first" : "S", "second" : "W", "count" : 762 }
  ],
  "ok" : 1
}
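Because stages run in the order they appear in the array, where $limit sits changes the result. The following sketch models that with ordinary array operations in plain JavaScript; the docs array and the isA predicate are made up for illustration, and slice() and filter() stand in for $limit and $match only loosely.

```javascript
// Four illustrative documents; only the last two have first === "A".
const docs = [
  { first: "S", count: 762 },
  { first: "E", count: 381 },
  { first: "A", count: 615 },
  { first: "A", count: 806 },
];

const isA = (doc) => doc.first === "A";

// Like [{$limit:2},{$match:{first:"A"}}]:
// only the first two documents are ever examined, so no match is found.
const limitThenMatch = docs.slice(0, 2).filter(isA);

// Like [{$match:{first:"A"}},{$limit:2}]:
// every document is filtered first, then the output is capped at two.
const matchThenLimit = docs.filter(isA).slice(0, 2);

console.log(limitThenMatch.length, matchThenLimit.length);
```

This is why an initial $limit is handy for quick tests on a small slice of the data but can silently drop matching documents if left in place.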
The $limit command is more typically used after a grouping and sorting operation to limit the output returned to the user. Similarly, the $skip command ignores a given number of documents entering the stage. Combined with $limit, it is often used after grouping to implement pagination:
> db.aggtest.aggregate([{$skip:10},{$limit:1}]);
{
  "result" : [
    { "_id" : ObjectId("53213bc8ae5fcad63d0563f3"), "first" : "M", "second" : "E", "count" : 437 }
  ],
  "ok" : 1
}
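One way to turn the $skip and $limit pair into pagination is to compute the skip count from a page number. The sketch below assumes 1-based page numbers; pageStages and applyPage are hypothetical helpers introduced here, and applyPage only imitates the two stages on an in-memory array so the arithmetic can be checked without a server.

```javascript
// Hypothetical helper: builds the $skip/$limit stages for a 1-based page number.
function pageStages(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be a positive integer");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }
  return [{ $skip: (page - 1) * pageSize }, { $limit: pageSize }];
}

// Plain-JavaScript stand-in for running those two stages on an array.
function applyPage(docs, stages) {
  const [{ $skip }, { $limit }] = stages;
  return docs.slice($skip, $skip + $limit);
}

// 25 illustrative documents, numbered 0 through 24.
const numbered = Array.from({ length: 25 }, (_, i) => ({ n: i }));

// Page 3 with 10 documents per page skips 20 and returns the remaining 5.
const page3 = applyPage(numbered, pageStages(3, 10));
console.log(page3.map((doc) => doc.n));
```

With a page size of 1, page 11 produces the same two stages as the shell example above. In the shell, the returned array could be passed to db.aggtest.aggregate() directly or appended to a longer pipeline, ideally after a sorting stage so that page boundaries stay stable between calls.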
After filtering commands are applied in the pipeline, group management commands are applied. The most commonly used command is the $group operator, which specifies an identifier field and some number of accumulators. For example, to sum
