Aggregation Examples¶
There are several methods of performing aggregations in MongoDB. These examples cover the new aggregation framework, using map reduce and using the group method.
Setup¶
To start, we’ll insert some example data which we can perform aggregations on:
>>> from pymongo import MongoClient
>>> db = MongoClient().aggregation_example
>>> result = db.things.insert_many(
... [
... {"x": 1, "tags": ["dog", "cat"]},
... {"x": 2, "tags": ["cat"]},
... {"x": 2, "tags": ["mouse", "cat", "dog"]},
... {"x": 3, "tags": []},
... ]
... )
>>> result.inserted_ids
[ObjectId('...'), ObjectId('...'), ObjectId('...'), ObjectId('...')]
Aggregation Framework¶
This example shows how to use the
aggregate()
method to use the aggregation
framework. We’ll perform a simple aggregation to count the number of
occurrences for each tag in the tags
array, across the entire collection.
To achieve this we need to pass in three operations to the pipeline.
First, we need to unwind the tags
array, then group by the tags and
sum them up, finally we sort by count.
Python dictionaries prior to 3.7 don’t maintain order. You should use SON
or collections.OrderedDict
where explicit ordering is required for an older Python version
eg “$sort”:
Note
aggregate requires server version >= 2.1.0.
>>> from bson.son import SON
>>> pipeline = [
... {"$unwind": "$tags"},
... {"$group": {"_id": "$tags", "count": {"$sum": 1}}},
... {"$sort": SON([("count", -1), ("_id", -1)])},
... ]
>>> import pprint
>>> pprint.pprint(list(db.things.aggregate(pipeline)))
[{'_id': 'cat', 'count': 3},
{'_id': 'dog', 'count': 2},
{'_id': 'mouse', 'count': 1}]
To run an explain plan for this aggregation use PyMongoExplain, a companion library for PyMongo. It allows you to explain any CRUD operation by providing a few convenience classes:
>>> from pymongoexplain import ExplainableCollection
>>> ExplainableCollection(collection).aggregate(pipeline)
{'ok': 1.0, 'queryPlanner': [...]}
Or, use the command()
method:
>>> db.command('aggregate', 'things', pipeline=pipeline, explain=True)
{'ok': 1.0, 'stages': [...]}
As well as simple aggregations the aggregation framework provides projection capabilities to reshape the returned data. Using projections and aggregation, you can add computed fields, create new virtual sub-objects, and extract sub-fields into the top-level of results.
See also
The full documentation for MongoDB’s aggregation framework