Custom Type Example
This example shows how to use a custom type with PyMongo. It is a bit
contrived, but it demonstrates how a
SONManipulator can transform
documents as they are saved to or retrieved from MongoDB. More
specifically, it shows a couple of different mechanisms for working with
custom datatypes in PyMongo.
Setup
We’ll start by getting a clean database to use for the example:
>>> from pymongo.mongo_client import MongoClient
>>> client = MongoClient()
>>> client.drop_database("custom_type_example")
>>> db = client.custom_type_example
Since the purpose of the example is to demonstrate working with custom
types, we’ll need a custom datatype to use. Here we define the aptly
named Custom class, which has a single method, x():
>>> class Custom(object):
...   def __init__(self, x):
...     self.__x = x
...
...   def x(self):
...     return self.__x
...
>>> foo = Custom(10)
>>> foo.x()
10
When we try to save an instance of Custom with PyMongo, we’ll
get an InvalidDocument exception:
>>> db.test.insert({"custom": Custom(5)})
Traceback (most recent call last):
InvalidDocument: cannot convert value of type <class 'Custom'> to bson
Manual Encoding
One way to work around this is to convert our data into something
we can save with PyMongo. To do so we define two functions,
encode_custom() and decode_custom():
>>> def encode_custom(custom):
...   return {"_type": "custom", "x": custom.x()}
...
>>> def decode_custom(document):
...   assert document["_type"] == "custom"
...   return Custom(document["x"])
...
We can now manually encode and decode Custom instances and
use them with PyMongo:
>>> db.test.insert({"custom": encode_custom(Custom(5))})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': {u'x': 5, u'_type': u'custom'}}
>>> decode_custom(db.test.find_one()["custom"])
<Custom object at ...>
>>> decode_custom(db.test.find_one()["custom"]).x()
5
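The round trip can also be checked without a running server. The following is a minimal sketch that repeats the definitions above so it runs on its own; the plain dict named stored is only a stand-in for what the collection would hold and is not a PyMongo call.

```python
class Custom(object):
    def __init__(self, x):
        self.__x = x

    def x(self):
        return self.__x


def encode_custom(custom):
    # Tag the sub-document so the decoder can recognise it later.
    return {"_type": "custom", "x": custom.x()}


def decode_custom(document):
    assert document["_type"] == "custom"
    return Custom(document["x"])


# Stand-in for a stored document; no database is involved here.
stored = {"custom": encode_custom(Custom(5))}
restored = decode_custom(stored["custom"])
```

After this runs, stored contains only plain dicts, strings and integers, and restored is a new Custom whose x() matches the original value.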
Automatic Encoding and Decoding
Needless to say, that was a little unwieldy. Let’s make this a bit
more seamless by creating a new
SONManipulator.
SONManipulator instances allow you
to specify transformations to be applied automatically by PyMongo:
>>> from pymongo.son_manipulator import SONManipulator
>>> class Transform(SONManipulator):
...   def transform_incoming(self, son, collection):
...     for (key, value) in son.items():
...       if isinstance(value, Custom):
...         son[key] = encode_custom(value)
...       elif isinstance(value, dict): # Make sure we recurse into sub-docs
...         son[key] = self.transform_incoming(value, collection)
...     return son
...
...   def transform_outgoing(self, son, collection):
...     for (key, value) in son.items():
...       if isinstance(value, dict):
...         if "_type" in value and value["_type"] == "custom":
...           son[key] = decode_custom(value)
...         else: # Again, make sure to recurse into sub-docs
...           son[key] = self.transform_outgoing(value, collection)
...     return son
...
Now we add our manipulator to the Database:
>>> db.add_son_manipulator(Transform())
After doing so we can save and restore Custom instances seamlessly:
>>> db.test.remove() # remove whatever has already been saved
{...}
>>> db.test.insert({"custom": Custom(5)})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': <Custom object at ...>}
>>> db.test.find_one()["custom"].x()
5
Manipulators are attached to the Database instance they were added to,
so if we get a new Database instance it will not carry the
SONManipulator instance we added:
>>> db = client.custom_type_example
This allows us to see what was actually saved to the database:
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': {u'x': 5, u'_type': u'custom'}}
This is the same format that our
encode_custom() function produces.
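The recursive walk performed by Transform can be tried on its own, without PyMongo. The sketch below applies the same two passes to a nested plain dict. The function names incoming and outgoing are hypothetical free-function versions of transform_incoming and transform_outgoing, with the collection argument dropped; like the methods above, they modify the document in place.

```python
import copy


class Custom(object):
    def __init__(self, x):
        self.__x = x

    def x(self):
        return self.__x


def encode_custom(custom):
    return {"_type": "custom", "x": custom.x()}


def decode_custom(document):
    assert document["_type"] == "custom"
    return Custom(document["x"])


def incoming(son):
    # Hypothetical free-function version of Transform.transform_incoming.
    for key, value in list(son.items()):
        if isinstance(value, Custom):
            son[key] = encode_custom(value)
        elif isinstance(value, dict):  # recurse into sub-documents
            son[key] = incoming(value)
    return son


def outgoing(son):
    # Hypothetical free-function version of Transform.transform_outgoing.
    for key, value in list(son.items()):
        if isinstance(value, dict):
            if value.get("_type") == "custom":
                son[key] = decode_custom(value)
            else:  # recurse into sub-documents
                son[key] = outgoing(value)
    return son


doc = {"name": "example", "nested": {"custom": Custom(5)}}
incoming(doc)                          # doc now contains only plain dicts
encoded_snapshot = copy.deepcopy(doc)  # keep a copy of the encoded form
outgoing(doc)                          # the nested value is a Custom again
```

Like the class it mirrors, this walk only descends into dict values, so a Custom instance stored inside a list would be passed through unchanged.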
Binary Encoding
We can take this one step further by encoding to binary, using a
user-defined subtype. This allows us to identify what to decode without
resorting to tricks like the _type field used above.
We’ll start by defining the functions to_binary() and
from_binary(), which convert Custom instances to and
from Binary instances:
Note
You could just pickle the instance and save that. What we do here is a little more lightweight.
>>> from bson.binary import Binary
>>> def to_binary(custom):
...   return Binary(str(custom.x()), 128)
...
>>> def from_binary(binary):
...   return Custom(int(binary))
...
Next we’ll create another
SONManipulator, this time using the
methods we just defined:
>>> class TransformToBinary(SONManipulator):
...   def transform_incoming(self, son, collection):
...     for (key, value) in son.items():
...       if isinstance(value, Custom):
...         son[key] = to_binary(value)
...       elif isinstance(value, dict):
...         son[key] = self.transform_incoming(value, collection)
...     return son
...
...   def transform_outgoing(self, son, collection):
...     for (key, value) in son.items():
...       if isinstance(value, Binary) and value.subtype == 128:
...         son[key] = from_binary(value)
...       elif isinstance(value, dict):
...         son[key] = self.transform_outgoing(value, collection)
...     return son
...
Now we’ll empty the test collection and add the
new manipulator to the Database:
>>> db.test.remove()
{...}
>>> db.add_son_manipulator(TransformToBinary())
After doing so we can save and restore Custom instances
seamlessly:
>>> db.test.insert({"custom": Custom(5)})
ObjectId('...')
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': <Custom object at ...>}
>>> db.test.find_one()["custom"].x()
5
We can see what’s actually being saved to the database (and verify
that it is using a Binary instance) by
clearing out the manipulators and repeating our
find_one():
>>> db = client.custom_type_example
>>> db.test.find_one()
{u'_id': ObjectId('...'), u'custom': Binary('5', 128)}
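The subtype tagging can be illustrated without the bson package. In the sketch below, TaggedBytes is a hypothetical stand-in for Binary (a bytes subclass that carries a subtype attribute) and CUSTOM_SUBTYPE is an illustrative name for the value 128 used above; neither is part of PyMongo. The payload is encoded to bytes explicitly, on the assumption that the code may run on Python 3, where str and bytes are distinct types.

```python
class TaggedBytes(bytes):
    """Hypothetical stand-in for bson.binary.Binary: raw bytes plus a subtype."""

    def __new__(cls, data, subtype=0):
        self = bytes.__new__(cls, data)
        self.subtype = subtype
        return self


CUSTOM_SUBTYPE = 128  # same user-defined subtype value as in the example


class Custom(object):
    def __init__(self, x):
        self.__x = x

    def x(self):
        return self.__x


def to_binary(custom):
    # Encode explicitly so the payload is bytes on Python 3 as well.
    return TaggedBytes(str(custom.x()).encode("ascii"), CUSTOM_SUBTYPE)


def from_binary(binary):
    return Custom(int(binary.decode("ascii")))


payload = to_binary(Custom(5))

# The subtype alone tells us this value should be decoded as a Custom.
if isinstance(payload, TaggedBytes) and payload.subtype == CUSTOM_SUBTYPE:
    restored = from_binary(payload)
```

The check on subtype plays the role that the _type field played in the dict-based encoding: it marks which stored values belong to the custom type.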