Type Hints¶
As of version 4.1, PyMongo ships with type hints. With type hints, Python type checkers can easily find bugs before they reveal themselves in your code.
If your IDE is configured to use type hints, it can suggest more appropriate completions and highlight errors in your code. Some examples include PyCharm, Sublime Text, and Visual Studio Code.
You can also use the mypy tool from your command line or in Continuous Integration tests.
All of the public APIs in PyMongo are fully type hinted, and several of them support generic parameters for the type of document object returned when decoding BSON documents.
Due to limitations in mypy, the default
values for generic document types are not yet provided (they will eventually be Dict[str, any]
).
For a larger set of examples that use types, see the PyMongo test_typing module.
If you would like to opt out of using the provided types, add the following to your mypy config:
[mypy-pymongo]
follow_imports = False
Basic Usage¶
Note that a type for MongoClient
must be specified. Here we use the
default, unspecified document type:
>>> from pymongo import MongoClient
>>> client: MongoClient = MongoClient()
>>> collection = client.test.test
>>> inserted = collection.insert_one({"x": 1, "tags": ["dog", "cat"]})
>>> retrieved = collection.find_one({"x": 1})
>>> assert isinstance(retrieved, dict)
For a more accurate typing for document type you can use:
>>> from typing import Any, Dict
>>> from pymongo import MongoClient
>>> client: MongoClient[Dict[str, Any]] = MongoClient()
>>> collection = client.test.test
>>> inserted = collection.insert_one({"x": 1, "tags": ["dog", "cat"]})
>>> retrieved = collection.find_one({"x": 1})
>>> assert isinstance(retrieved, dict)
Typed Client¶
MongoClient
is generic on the document type used to decode BSON documents.
You can specify a RawBSONDocument
document type:
>>> from pymongo import MongoClient
>>> from bson.raw_bson import RawBSONDocument
>>> client = MongoClient(document_class=RawBSONDocument)
>>> collection = client.test.test
>>> inserted = collection.insert_one({"x": 1, "tags": ["dog", "cat"]})
>>> result = collection.find_one({"x": 1})
>>> assert isinstance(result, RawBSONDocument)
Subclasses of collections.abc.Mapping
can also be used, such as SON
:
>>> from bson import SON
>>> from pymongo import MongoClient
>>> client = MongoClient(document_class=SON[str, int])
>>> collection = client.test.test
>>> inserted = collection.insert_one({"x": 1, "y": 2})
>>> result = collection.find_one({"x": 1})
>>> assert result is not None
>>> assert result["x"] == 1
Note that when using SON
, the key and value types must be given, e.g. SON[str, Any]
.
Typed Collection¶
You can use TypedDict
(Python 3.8+) when using a well-defined schema for the data in a
Collection
. Note that all schema validation for inserts and updates is done on the server.
These methods automatically add an “_id” field.
>>> from typing import TypedDict
>>> from pymongo import MongoClient
>>> from pymongo.collection import Collection
>>> class Movie(TypedDict):
... name: str
... year: int
...
>>> client: MongoClient = MongoClient()
>>> collection: Collection[Movie] = client.test.test
>>> inserted = collection.insert_one(Movie(name="Jurassic Park", year=1993))
>>> result = collection.find_one({"name": "Jurassic Park"})
>>> assert result is not None
>>> assert result["year"] == 1993
>>> # This will raise a type-checking error, despite being present, because it is added by PyMongo.
>>> assert result["_id"] # type:ignore[typeddict-item]
This same typing scheme works for all of the insert methods (insert_one()
,
insert_many()
, and bulk_write()
).
For bulk_write
both InsertOne
and ReplaceOne
operators are generic.
>>> from typing import TypedDict
>>> from pymongo import MongoClient
>>> from pymongo.operations import InsertOne
>>> from pymongo.collection import Collection
>>> client: MongoClient = MongoClient()
>>> collection: Collection[Movie] = client.test.test
>>> inserted = collection.bulk_write([InsertOne(Movie(name="Jurassic Park", year=1993))])
>>> result = collection.find_one({"name": "Jurassic Park"})
>>> assert result is not None
>>> assert result["year"] == 1993
>>> # This will raise a type-checking error, despite being present, because it is added by PyMongo.
>>> assert result["_id"] # type:ignore[typeddict-item]
Modeling Document Types with TypedDict¶
You can use TypedDict
(Python 3.8+) to model structured data.
As noted above, PyMongo will automatically add an _id
field if it is not present. This also applies to TypedDict.
There are three approaches to this:
Do not specify
_id
at all. It will be inserted automatically, and can be retrieved at run-time, but will yield a type-checking error unless explicitly ignored.Specify
_id
explicitly. This will mean that every instance of your custom TypedDict class will have to pass a value for_id
.Make use of
NotRequired
. This has the flexibility of option 1, but with the ability to access the_id
field without causing a type-checking error.
Note: to use TypedDict
and NotRequired
in earlier versions of Python (<3.8, <3.11), use the typing_extensions
package.
>>> from typing import TypedDict, NotRequired
>>> from pymongo import MongoClient
>>> from pymongo.collection import Collection
>>> from bson import ObjectId
>>> class Movie(TypedDict):
... name: str
... year: int
...
>>> class ExplicitMovie(TypedDict):
... _id: ObjectId
... name: str
... year: int
...
>>> class NotRequiredMovie(TypedDict):
... _id: NotRequired[ObjectId]
... name: str
... year: int
...
>>> client: MongoClient = MongoClient()
>>> collection: Collection[Movie] = client.test.test
>>> inserted = collection.insert_one(Movie(name="Jurassic Park", year=1993))
>>> result = collection.find_one({"name": "Jurassic Park"})
>>> assert result is not None
>>> # This will yield a type-checking error, despite being present, because it is added by PyMongo.
>>> assert result["_id"] # type:ignore[typeddict-item]
>>> collection: Collection[ExplicitMovie] = client.test.test
>>> # Note that the _id keyword argument must be supplied
>>> inserted = collection.insert_one(
... ExplicitMovie(_id=ObjectId(), name="Jurassic Park", year=1993)
... )
>>> result = collection.find_one({"name": "Jurassic Park"})
>>> assert result is not None
>>> # This will not raise a type-checking error.
>>> assert result["_id"]
>>> collection: Collection[NotRequiredMovie] = client.test.test
>>> # Note the lack of _id, similar to the first example
>>> inserted = collection.insert_one(NotRequiredMovie(name="Jurassic Park", year=1993))
>>> result = collection.find_one({"name": "Jurassic Park"})
>>> assert result is not None
>>> # This will not raise a type-checking error, despite not being provided explicitly.
>>> assert result["_id"]
Typed Database¶
While less common, you could specify that the documents in an entire database
match a well-defined schema using TypedDict
(Python 3.8+).
>>> from typing import TypedDict
>>> from pymongo import MongoClient
>>> from pymongo.database import Database
>>> class Movie(TypedDict):
... name: str
... year: int
...
>>> client: MongoClient = MongoClient()
>>> db: Database[Movie] = client.test
>>> collection = db.test
>>> inserted = collection.insert_one({"name": "Jurassic Park", "year": 1993})
>>> result = collection.find_one({"name": "Jurassic Park"})
>>> assert result is not None
>>> assert result["year"] == 1993
Typed Command¶
When using the command()
, you can specify the document type by providing a custom CodecOptions
:
>>> from pymongo import MongoClient
>>> from bson.raw_bson import RawBSONDocument
>>> from bson import CodecOptions
>>> client: MongoClient = MongoClient()
>>> options = CodecOptions(RawBSONDocument)
>>> result = client.admin.command("ping", codec_options=options)
>>> assert isinstance(result, RawBSONDocument)
Custom collections.abc.Mapping
subclasses and TypedDict
(Python 3.8+) are also supported.
For TypedDict
, use the form: options: CodecOptions[MyTypedDict] = CodecOptions(...)
.
Typed BSON Decoding¶
You can specify the document type returned by bson
decoding functions by providing CodecOptions
:
>>> from typing import Any, Dict
>>> from bson import CodecOptions, encode, decode
>>> class MyDict(Dict[str, Any]):
... def foo(self):
... return "bar"
...
>>> options = CodecOptions(document_class=MyDict)
>>> doc = {"x": 1, "y": 2}
>>> bsonbytes = encode(doc, codec_options=options)
>>> rt_document = decode(bsonbytes, codec_options=options)
>>> assert rt_document.foo() == "bar"
RawBSONDocument
and TypedDict
(Python 3.8+) are also supported.
For TypedDict
, use the form: options: CodecOptions[MyTypedDict] = CodecOptions(...)
.
Troubleshooting¶
Client Type Annotation¶
If you forget to add a type annotation for a MongoClient
object you may get the following mypy
error:
from pymongo import MongoClient
client = MongoClient() # error: Need type annotation for "client"
The solution is to annotate the type as client: MongoClient
or client: MongoClient[Dict[str, Any]]
. See Basic Usage.
Incompatible Types¶
If you use the generic form of MongoClient
you
may encounter a mypy
error like:
from pymongo import MongoClient
client: MongoClient = MongoClient()
client.test.test.insert_many(
{"a": 1}
) # error: Dict entry 0 has incompatible type "str": "int";
# expected "Mapping[str, Any]": "int"
The solution is to use client: MongoClient[Dict[str, Any]]
as used in
Basic Usage .
Actual Type Errors¶
Other times mypy
will catch an actual error, like the following code:
from pymongo import MongoClient
from typing import Mapping
client: MongoClient = MongoClient()
client.test.test.insert_one(
[{}]
) # error: Argument 1 to "insert_one" of "Collection" has
# incompatible type "List[Dict[<nothing>, <nothing>]]";
# expected "Mapping[str, Any]"
In this case the solution is to use insert_one({})
, passing a document instead of a list.
Another example is trying to set a value on a RawBSONDocument
, which is read-only.:
from bson.raw_bson import RawBSONDocument
from pymongo import MongoClient
client = MongoClient(document_class=RawBSONDocument)
coll = client.test.test
doc = {"my": "doc"}
coll.insert_one(doc)
retrieved = coll.find_one({"_id": doc["_id"]})
assert retrieved is not None
assert len(retrieved.raw) > 0
retrieved[
"foo"
] = "bar" # error: Unsupported target for indexed assignment
# ("RawBSONDocument") [index]