Compare commits

1 Commits

Author SHA1 Message Date
Bastien Gérard
d73ca6f90d Create main.yml 2020-11-15 15:24:40 +01:00
9 changed files with 44 additions and 385 deletions

.github/workflows/main.yml (new file, 33 lines added)

@@ -0,0 +1,33 @@
# This is a basic workflow to help you get started with Actions

name: CI

# Controls when the action will run. Triggers the workflow on push or pull request
# events but only for the master branch
on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@v2

      # Runs a single command using the runner's shell
      - name: Run a one-line script
        run: echo Hello, world!

      # Runs a set of commands using the runner's shell
      - name: Run a multi-line script
        run: |
          echo Add other actions to build,
          echo test, and deploy your project.


@@ -33,7 +33,7 @@ clean:
html:
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
@echo
@echo "Build finished. Check $(BUILDDIR)/html/index.html"
@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
dirhtml:
$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml


@@ -6,23 +6,17 @@ Changelog
Development
===========
- (Fill this out as you fix issues and develop your features).
Changes in 0.21.0
=================
- Bug fix in DynamicDocument, which was not parsing known fields in the constructor the way Document does #2412
- When using pymongo >= 3.7, make use of Collection.count_documents instead of Collection.count
and Cursor.count, which were deprecated in pymongo >= 3.7.
This should have a negative impact on the performance of count; see issue #2219
- Fix a bug that made the queryset drop the read_preference after clone().
- Remove Py3.5 from CI as it reached EOL and add Python 3.9
- Fix some issues related to db_field/field conflict in constructor #2414
- BREAKING CHANGE: Fix the behavior of Doc.objects.limit(0) which should return all documents (similar to mongodb) #2311
- Fix the behavior of Doc.objects.limit(0) which should return all documents (similar to mongodb) #2311
- Bug fix in ListField when updating the first item, it was saving the whole list, instead of
just replacing the first item (as usually done when updating 1 item of the list) #2392
just replacing the first item (as it's usually done) #2392
- Add EnumField: ``mongoengine.fields.EnumField``
- Refactoring - Remove useless code related to Document.__only_fields and Queryset.only_fields
- Fix query transformation regarding special operators #2365
- Bug Fix: Document.save() fails when shard_key is not _id #2154
Changes in 0.20.0
=================


@@ -14,6 +14,5 @@ User Guide
gridfs
signals
text-indexes
migration
logging-monitoring
mongomock


@@ -1,267 +0,0 @@
===================
Documents migration
===================
The structure of your documents and their associated mongoengine schemas are likely
to change over the lifetime of an application. This section provides guidance and
recommendations on how to deal with migrations.
Due to the very flexible nature of mongodb, migrations of models aren't trivial and
for people that know about `alembic` for `sqlalchemy`, there is unfortunately no equivalent
library that will manage the migration in an automatic fashion for mongoengine.
Example 1: Addition of a field
==============================
Let's start with a simple example of a model change and review the different options you
have to deal with the migration.
Let's assume we start with the following schema and save an instance:
.. code-block:: python

    class User(Document):
        name = StringField()

    User(name="John Doe").save()

    # print the objects as they exist in mongodb
    print(User.objects().as_pymongo())    # [{u'_id': ObjectId('5d06b9c3d7c1f18db3e7c874'), u'name': u'John Doe'}]

On the next version of your application, let's now assume that a new field `enabled` gets added to the
existing ``User`` model with a `default=True`. Thus you simply update the ``User`` class to the following:
.. code-block:: python

    class User(Document):
        name = StringField(required=True)
        enabled = BooleanField(default=True)

Without applying any migration, we now reload an object from the database into the ``User`` class
and check its `enabled` attribute:

.. code-block:: python

    assert User.objects.count() == 1
    user = User.objects().first()
    assert user.enabled is True
    assert User.objects(enabled=True).count() == 0    # uh?
    assert User.objects(enabled=False).count() == 0   # uh?

    # this is consistent with what we have in the database
    # in fact, 'enabled' does not exist
    print(User.objects().as_pymongo().first())    # {u'_id': ObjectId('5d06b9c3d7c1f18db3e7c874'), u'name': u'John Doe'}
    assert User.objects(enabled=None).count() == 1

As you can see, even if the document wasn't updated, mongoengine applies the default value seamlessly when it
loads the pymongo dict into a ``User`` instance. At first sight it looks like you don't need to migrate the
existing documents when adding new fields, but this actually leads to inconsistencies when it comes to querying.
In fact, when querying, mongoengine does not account for the default value of the new field, so
if you don't actually migrate the existing documents, you risk that queries and updates
miss relevant records.
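The mismatch between loading and querying can be reproduced without a database. The following is a minimal sketch, assuming a hand-written ``matches`` helper (hypothetical, not part of mongoengine or pymongo) that imitates one MongoDB matching rule: an equality test against ``None`` also matches documents where the field is absent, while an equality test against any other value does not.

```python
# Hypothetical stand-in for MongoDB's equality matching, for illustration only.
_MISSING = object()


def matches(doc, query):
    """Return True if every key/value pair in `query` matches `doc`.

    Imitates one MongoDB rule: querying for None also matches documents
    in which the field is absent.
    """
    for key, expected in query.items():
        actual = doc.get(key, _MISSING)
        if actual is _MISSING:
            if expected is not None:
                return False
        elif actual != expected:
            return False
    return True


# A raw document saved before the `enabled` field was added to the model.
stored = [{"_id": 1, "name": "John Doe"}]

count_true = sum(matches(d, {"enabled": True}) for d in stored)
count_false = sum(matches(d, {"enabled": False}) for d in stored)
count_none = sum(matches(d, {"enabled": None}) for d in stored)
print(count_true, count_false, count_none)
```

The default declared on the model never enters the picture here, which is why only the ``None`` query finds the unmigrated document.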
When adding fields/modifying default values, you can use any of the following to do the migration
as a standalone script:
.. code-block:: python

    # Use mongoengine to set a default value for a given field
    User.objects().update(enabled=True)
    # or use pymongo
    user_coll = User._get_collection()
    user_coll.update_many({}, {'$set': {'enabled': True}})

Example 2: Inheritance change
=============================
Let's consider the following example:
.. code-block:: python

    class Human(Document):
        name = StringField()
        meta = {"allow_inheritance": True}

    class Jedi(Human):
        dark_side = BooleanField()
        light_saber_color = StringField()

    Jedi(name="Darth Vader", dark_side=True, light_saber_color="red").save()
    Jedi(name="Obi Wan Kenobi", dark_side=False, light_saber_color="blue").save()

    assert Human.objects.count() == 2
    assert Jedi.objects.count() == 2

    # Let's check how these documents got stored in mongodb
    print(Jedi.objects.as_pymongo())
    # [
    #   {'_id': ObjectId('5fac4aaaf61d7fb06046e0f9'), '_cls': 'Human.Jedi', 'name': 'Darth Vader', 'dark_side': True, 'light_saber_color': 'red'},
    #   {'_id': ObjectId('5fac4ac4f61d7fb06046e0fa'), '_cls': 'Human.Jedi', 'name': 'Obi Wan Kenobi', 'dark_side': False, 'light_saber_color': 'blue'}
    # ]

As you can observe, when you use inheritance, MongoEngine stores a field named '_cls' behind the scenes to keep
track of the Document class.
Let's now take the scenario where you want to refactor the inheritance schema and:

- have the Jedi documents with dark_side=False/True become GoodJedi/BadSith
- get rid of the 'dark_side' field

and move to the following schemas:
.. code-block:: python

    # unchanged
    class Human(Document):
        name = StringField()
        meta = {"allow_inheritance": True}

    # attribute 'dark_side' removed
    class GoodJedi(Human):
        light_saber_color = StringField()

    # new class
    class BadSith(Human):
        light_saber_color = StringField()

MongoEngine doesn't know about the change or how to map the new classes to the existing data,
so if you don't apply any migration, you will observe strange behavior, as if the collection were suddenly
empty.
.. code-block:: python

    # As a reminder, the documents that we inserted
    # have the _cls field = 'Human.Jedi'

    # The following has no match
    # because the query that is used behind the scenes is
    # filtering on {'_cls': 'Human.GoodJedi'}
    assert GoodJedi.objects().count() == 0

    # The following also has no match
    # because it is filtering on {'_cls': {'$in': ('Human', 'Human.GoodJedi', 'Human.BadSith')}}
    assert Human.objects.count() == 0
    assert Human.objects.first() is None

    # If we bypass MongoEngine and make use of the underlying driver (PyMongo)
    # we can see that the documents are there
    humans_coll = Human._get_collection()
    assert humans_coll.count_documents({}) == 2
    # print first document
    print(humans_coll.find_one())
    # {'_id': ObjectId('5fac4aaaf61d7fb06046e0f9'), '_cls': 'Human.Jedi', 'name': 'Darth Vader', 'dark_side': True, 'light_saber_color': 'red'}

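The ``_cls`` filters quoted in the comments above can be imitated with plain Python classes. This is only a sketch under stated assumptions: ``dotted_name``, ``allowed_cls_values`` and ``visible`` are hypothetical helpers, the classes are ordinary Python classes rather than MongoEngine documents, and single inheritance is assumed. MongoEngine keeps this information in its own class metadata.

```python
class Human:
    """Plain-Python stand-in for the Human document class."""


class GoodJedi(Human):
    pass


class BadSith(Human):
    pass


def dotted_name(cls):
    # Join the class name with its ancestors' names, root first, skipping `object`.
    return ".".join(c.__name__ for c in reversed(cls.__mro__) if c is not object)


def allowed_cls_values(cls):
    """Return the '_cls' values of `cls` itself and all of its subclasses."""
    values = [dotted_name(cls)]
    for sub in cls.__subclasses__():
        values.extend(allowed_cls_values(sub))
    return values


# Raw documents as they were stored by the old `Jedi` class.
stored = [
    {"_cls": "Human.Jedi", "name": "Darth Vader"},
    {"_cls": "Human.Jedi", "name": "Obi Wan Kenobi"},
]


def visible(cls):
    """Return the stored documents whose '_cls' value is allowed for `cls`."""
    allowed = set(allowed_cls_values(cls))
    return [doc for doc in stored if doc["_cls"] in allowed]


print(allowed_cls_values(Human))
print(visible(Human))
```

Because ``'Human.Jedi'`` is no longer among the allowed values, every stored document is filtered out, which matches the "empty collection" symptom described above.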
As you can see, the first obvious problem is that we need to update the '_cls' values based on the existing
'dark_side' values of the documents.
.. code-block:: python

    humans_coll = Human._get_collection()
    old_class = 'Human.Jedi'
    good_jedi_class = 'Human.GoodJedi'
    bad_sith_class = 'Human.BadSith'
    humans_coll.update_many({'_cls': old_class, 'dark_side': False}, {'$set': {'_cls': good_jedi_class}})
    humans_coll.update_many({'_cls': old_class, 'dark_side': True}, {'$set': {'_cls': bad_sith_class}})

Let's now check if querying improved in MongoEngine:
.. code-block:: python

    assert GoodJedi.objects().count() == 1  # Hoorah!
    assert BadSith.objects().count() == 1   # Hoorah!
    assert Human.objects.count() == 2       # Hoorah!

    # let's now check that documents load correctly
    jedi = GoodJedi.objects().first()
    # raises FieldDoesNotExist: The fields "{'dark_side'}" do not exist on the document "Human.GoodJedi"

In fact, we only took care of renaming the _cls values but we haven't removed the 'dark_side' field,
which no longer exists on the GoodJedi and BadSith models.
Let's remove the field from the collections:
.. code-block:: python

    humans_coll = Human._get_collection()
    humans_coll.update_many({}, {'$unset': {'dark_side': 1}})

.. note:: We did this migration in 2 different steps for the sake of example but it could have been combined
    with the migration of the _cls fields: ::

        humans_coll.update_many(
            {'_cls': old_class, 'dark_side': False},
            {
                '$set': {'_cls': good_jedi_class},
                '$unset': {'dark_side': 1}
            }
        )

And verify that the documents now load correctly:
.. code-block:: python

    jedi = GoodJedi.objects().first()
    assert jedi.name == "Obi Wan Kenobi"

    sith = BadSith.objects().first()
    assert sith.name == "Darth Vader"

Another way of dealing with this migration is to iterate over
the documents and update/replace them one by one. This is much slower, but
it is often useful for complex migrations of Document models.
.. code-block:: python

    for doc in humans_coll.find():
        if doc['_cls'] == 'Human.Jedi':
            doc['_cls'] = 'Human.BadSith' if doc['dark_side'] else 'Human.GoodJedi'
            doc.pop('dark_side')
            humans_coll.replace_one({'_id': doc['_id']}, doc)

.. warning:: Be aware of this `flaw <https://groups.google.com/g/mongodb-user/c/AFC1ia7MHzk>`_ if you modify documents while iterating
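For the one-by-one approach, the per-document transformation can be factored into a pure function so that it can be tested without a database. The following is a minimal sketch, not the method used by MongoEngine: ``migrate_human`` is a hypothetical helper, and treating a missing ``dark_side`` value as ``False`` is an assumption made here for illustration.

```python
def migrate_human(doc):
    """Return a migrated copy of a raw 'Human.Jedi' document.

    Documents of any other class are returned unchanged, so applying the
    function twice gives the same result as applying it once.
    """
    if doc.get("_cls") != "Human.Jedi":
        return doc
    migrated = dict(doc)  # shallow copy, the input dict is left untouched
    # Assumption: a missing 'dark_side' value is treated as False.
    dark_side = migrated.pop("dark_side", False)
    migrated["_cls"] = "Human.BadSith" if dark_side else "Human.GoodJedi"
    return migrated


vader = {
    "_id": 1,
    "_cls": "Human.Jedi",
    "name": "Darth Vader",
    "dark_side": True,
    "light_saber_color": "red",
}
migrated_vader = migrate_human(vader)
print(migrated_vader)
```

Inside the loop shown earlier, the result could then be written back with ``humans_coll.replace_one({'_id': doc['_id']}, migrate_human(doc))``.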
Recommendations
===============
- Write migration scripts whenever you make changes to the model schemas
- Using :class:`~mongoengine.DynamicDocument` or ``meta = {"strict": False}`` may help to avoid some migrations or to let two versions of your application coexist.
- Write post-processing checks to verify that the migration scripts worked (see below).
Post-processing checks
======================
The following recipe can be used to sanity check a Document collection after you have applied a migration.
It does not make any assumption about what was migrated: it fetches 1000 objects at random and
runs some quick checks on them to make sure the documents look OK. As it is, it fails
on the first occurrence of an error, but this can be adapted to your needs.
.. code-block:: python

    import logging

    LOG = logging.getLogger(__name__)

    def get_random_oids(collection, sample_size):
        pipeline = [{"$project": {'_id': 1}}, {"$sample": {"size": sample_size}}]
        return [s['_id'] for s in collection.aggregate(pipeline)]

    def get_random_documents(DocCls, sample_size):
        doc_collection = DocCls._get_collection()
        random_oids = get_random_oids(doc_collection, sample_size)
        return DocCls.objects(id__in=random_oids)

    def check_documents(DocCls, sample_size):
        for doc in get_random_documents(DocCls, sample_size):
            # general validation (types and values)
            doc.validate()

            # load all subfields,
            # this may trigger additional queries if you have ReferenceFields
            # so it may be slow
            for field in doc._fields:
                try:
                    getattr(doc, field)
                except Exception:
                    LOG.warning(f"Could not load field {field} in Document {doc.id}")
                    raise

    check_documents(Human, sample_size=1000)

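Since the recipe stops at the first error, one possible adaptation is to collect every failure and report them together. The sketch below is database-free and uses hypothetical names (``collect_failures``, ``require_name``); it only illustrates the control flow, not a MongoEngine API.

```python
def collect_failures(items, check):
    """Run `check(item)` on every item and return a list of (item, error) pairs."""
    failures = []
    for item in items:
        try:
            check(item)
        except Exception as exc:  # deliberately broad: any failure is recorded
            failures.append((item, exc))
    return failures


def require_name(doc):
    # Hypothetical check used only for this demonstration.
    if not doc.get("name"):
        raise ValueError("missing name")


sample = [{"name": "Obi Wan Kenobi"}, {"name": ""}, {}]
failures = collect_failures(sample, require_name)
for item, error in failures:
    print(item, error)
```

With real documents, ``check`` could be something like ``lambda doc: doc.validate()``, applied to the sample returned by ``get_random_documents``.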


@@ -28,7 +28,7 @@ __all__ = (
)
VERSION = (0, 21, 0)
VERSION = (0, 20, 0)
def get_version():


@@ -101,13 +101,12 @@ class BaseDocument:
self._dynamic_fields = SON()
# Assign default values for fields
# not set in the constructor
for field_name in self._fields:
if field_name in values:
# Assign default values to the instance.
for key, field in self._fields.items():
if self._db_field_map.get(key, key) in values:
continue
value = getattr(self, field_name, None)
setattr(self, field_name, value)
value = getattr(self, key, None)
setattr(self, key, value)
if "_cls" not in values:
self._cls = self._class_name
@@ -116,6 +115,7 @@ class BaseDocument:
dynamic_data = {}
FileField = _import_class("FileField")
for key, value in values.items():
key = self._reverse_db_field_map.get(key, key)
field = self._fields.get(key)
if field or key in ("id", "pk", "_cls"):
if __auto_convert and value is not None:
@@ -750,8 +750,7 @@ class BaseDocument:
@classmethod
def _from_son(cls, son, _auto_dereference=True, created=False):
"""Create an instance of a Document (subclass) from a PyMongo SON (dict)
"""
"""Create an instance of a Document (subclass) from a PyMongo SON."""
if son and not isinstance(son, dict):
raise ValueError(
"The source SON object needs to be of type 'dict' but a '%s' was found"
@@ -764,8 +763,6 @@ class BaseDocument:
# Convert SON to a data dict, making sure each key is a string and
# corresponds to the right db field.
# This is needed as _from_son is currently called both from BaseDocument.__init__
# and from EmbeddedDocumentField.to_python
data = {}
for key, value in son.items():
key = str(key)
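The ``_db_field_map`` and ``_reverse_db_field_map`` lookups in the hunks above translate between attribute names and the names stored in the database. The following is a minimal sketch of why translating constructor keyword arguments through the reverse map can be ambiguous. It assumes two fields whose ``db_field`` values are swapped, like the ``z1``/``z2`` fields in the ``DBFieldMappingTest`` hunk further down. The dictionaries and the ``translate_keys`` helper are hand-written stand-ins, not MongoEngine internals.

```python
# attribute name -> name stored in the database (hypothetical values)
db_field_map = {"z1": "z2", "z2": "z1"}
# name stored in the database -> attribute name
reverse_db_field_map = {db: attr for attr, db in db_field_map.items()}


def translate_keys(values, mapping):
    """Rename the keys of `values` through `mapping`, keeping unknown keys as-is."""
    return {mapping.get(key, key): value for key, value in values.items()}


kwargs = {"z1": True, "z2": False}  # what a caller passes, keyed by attribute name

# Treating attribute names as if they were database names swaps the values.
swapped = translate_keys(kwargs, reverse_db_field_map)
print(swapped)

# Translating to database names and back again is lossless.
round_trip = translate_keys(translate_keys(kwargs, db_field_map), reverse_db_field_map)
print(round_trip)
```

The reverse map is only safe to apply to keys that are known to be database names, which is the distinction the constructor tests in this diff exercise.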


@@ -3822,95 +3822,5 @@ class ObjectKeyTestCase(MongoDBTestCase):
        assert book._object_key == {"pk": book.pk, "author__name": "Author"}


class DBFieldMappingTest(MongoDBTestCase):
    def setUp(self):
        class Fields(object):
            w1 = BooleanField(db_field="w2")

            x1 = BooleanField(db_field="x2")
            x2 = BooleanField(db_field="x3")

            y1 = BooleanField(db_field="y0")
            y2 = BooleanField(db_field="y1")

            z1 = BooleanField(db_field="z2")
            z2 = BooleanField(db_field="z1")

        class Doc(Fields, Document):
            pass

        class DynDoc(Fields, DynamicDocument):
            pass

        self.Doc = Doc
        self.DynDoc = DynDoc

    def tearDown(self):
        for collection in list_collection_names(self.db):
            self.db.drop_collection(collection)

    def test_setting_fields_in_constructor_of_strict_doc_uses_model_names(self):
        doc = self.Doc(z1=True, z2=False)
        assert doc.z1 is True
        assert doc.z2 is False

    def test_setting_fields_in_constructor_of_dyn_doc_uses_model_names(self):
        doc = self.DynDoc(z1=True, z2=False)
        assert doc.z1 is True
        assert doc.z2 is False

    def test_setting_unknown_field_in_constructor_of_dyn_doc_does_not_overwrite_model_fields(
        self,
    ):
        doc = self.DynDoc(w2=True)
        assert doc.w1 is None
        assert doc.w2 is True

    def test_unknown_fields_of_strict_doc_do_not_overwrite_dbfields_1(self):
        doc = self.Doc()
        doc.w2 = True
        doc.x3 = True
        doc.y0 = True
        doc.save()
        reloaded = self.Doc.objects.get(id=doc.id)
        assert reloaded.w1 is None
        assert reloaded.x1 is None
        assert reloaded.x2 is None
        assert reloaded.y1 is None
        assert reloaded.y2 is None

    def test_dbfields_are_loaded_to_the_right_modelfield_for_strict_doc_2(self):
        doc = self.Doc()
        doc.x2 = True
        doc.y2 = True
        doc.z2 = True
        doc.save()
        reloaded = self.Doc.objects.get(id=doc.id)
        assert (
            reloaded.x1,
            reloaded.x2,
            reloaded.y1,
            reloaded.y2,
            reloaded.z1,
            reloaded.z2,
        ) == (doc.x1, doc.x2, doc.y1, doc.y2, doc.z1, doc.z2)

    def test_dbfields_are_loaded_to_the_right_modelfield_for_dyn_doc_2(self):
        doc = self.DynDoc()
        doc.x2 = True
        doc.y2 = True
        doc.z2 = True
        doc.save()
        reloaded = self.DynDoc.objects.get(id=doc.id)
        assert (
            reloaded.x1,
            reloaded.x2,
            reloaded.y1,
            reloaded.y2,
            reloaded.z1,
            reloaded.z2,
        ) == (doc.x1, doc.x2, doc.y1, doc.y2, doc.z1, doc.z2)


if __name__ == "__main__":
    unittest.main()


@@ -2272,13 +2272,6 @@ class TestField(MongoDBTestCase):
        with pytest.raises(FieldDoesNotExist):
            Doc(bar="test")

    def test_undefined_field_works_no_confusion_with_db_field(self):
        class Doc(Document):
            foo = StringField(db_field="bar")

        with pytest.raises(FieldDoesNotExist):
            Doc(bar="test")


class TestEmbeddedDocumentListField(MongoDBTestCase):
    def setUp(self):