Commit bec2bab

Added tests, docs, cleanup for adding fields parameter/Django 2.2 support.
1 parent 4f4ea69 commit bec2bab

4 files changed: +72 -39 lines changed
README.md

Lines changed: 40 additions & 34 deletions
@@ -1,13 +1,14 @@
 # django-bulk-sync
+
 Combine bulk create, update, and delete into a single call.

-`django-bulk-sync` is a package for the Django ORM that combines bulk_create, bulk_update, and delete into a single method call to `bulk_sync`.
+`django-bulk-sync` is a package for the Django ORM that combines bulk_create, bulk_update, and delete into a single method call to `bulk_sync`.

 It manages all necessary creates, updates, and deletes with as few database calls as possible to maximize performance.

 ## Installation

-The package is available on pip as [django-bulk-sync][django-bulk-sync]. Run:
+The package is available on pip as [django-bulk-sync][django-bulk-sync]. Run:

 `pip install django-bulk-sync`

@@ -17,17 +18,17 @@ then import via:

 ## A Usage Scenario

-Companies have zero or more Employees. You want to efficiently sync the names of all employees for a single `Company` from an import from that company, but some are added, updated, or removed. The simple approach is inefficient -- read the import line by line, and:
+Companies have zero or more Employees. You want to efficiently sync the names of all employees for a single `Company` from an import from that company, but some are added, updated, or removed. The simple approach is inefficient -- read the import line by line, and:

 For each of N records:

-- SELECT to check for the employee's existence
-- UPDATE if it exists, INSERT if it doesn't
+- SELECT to check for the employee's existence
+- UPDATE if it exists, INSERT if it doesn't
+
+Then figure out some way to identify what was missing and delete it. As is so often the case, the speed of this process is controlled mostly by the number of queries run, and here it is about two queries for every record, and so O(N).

-Then figure out some way to identify what was missing and delete it. As is so often the case, the speed of this process is controlled mostly by the number of queries run, and here it is about two queries for every record, and so O(N).
+Instead, with `bulk_sync`, we can avoid the O(N) number of queries, and simplify the logic we have to write as well.

-Instead, with `bulk_sync`, we can avoid the O(N) number of queries, and simplify the logic we have to write as well.
-
 ## Example Usage

 ```python
@@ -36,16 +37,16 @@ from bulk_sync import bulk_sync

 new_models = []
 for line in company_import_file:
-    # The `.id` (or `.pk`) field should not be set. Instead, `key_fields`
+    # The `.id` (or `.pk`) field should not be set. Instead, `key_fields`
     # tells it how to match.
     e = Employee(name=line['name'], phone_number=line['phone_number'], ...)
     new_models.append(e)

-# `filters` controls the subset of objects considered when deciding to
+# `filters` controls the subset of objects considered when deciding to
 # update or delete.
-filters = Q(company_id=501)
+filters = Q(company_id=501)
 # `key_fields` matches an existing object if all `key_fields` are equal.
-key_fields = ('name', )
+key_fields = ('name', )
 ret = bulk_sync(
         new_models=new_models,
         filters=filters,
@@ -57,34 +58,39 @@ print("Results of bulk_sync: "
       .format(**ret['stats']))
 ```

-Under the hood, it will atomically call `bulk_create`, `bulk_update`, and a single queryset `delete()` call, to correctly and efficiently update all fields of all employees for the filtered Company, using `name` to match properly.
+Under the hood, it will atomically call `bulk_create`, `bulk_update`, and a single queryset `delete()` call, to correctly and efficiently update all fields of all employees for the filtered Company, using `name` to match properly.

 ## Argument Reference

-`def bulk_sync(new_models, key_fields, filters, batch_size=None):`
-Combine bulk create, update, and delete. Make the DB match a set of in-memory objects.
-- `new_models`: An iterable of Django ORM `Model` objects that you want stored in the database. They may or may not have `id` set, but you should not have already called `save()` on them.
-- `key_fields`: Identifying attribute name(s) to match up `new_models` items with database rows. If a foreign key is being used as a key field, be sure to pass the `fieldname_id` rather than the `fieldname`.
-- `filters`: Q() filters specifying the subset of the database to work in.
-- `batch_size`: passes through to Django `bulk_create.batch_size` and `bulk_update.batch_size`, and controls how many objects are created/updated per SQL query.
-- `fields`: a list of fields to update - passed through to Django's built in `buld_update`
+`def bulk_sync(new_models, key_fields, filters, batch_size=None, fields=None):`
+Combine bulk create, update, and delete. Make the DB match a set of in-memory objects.
+
+- `new_models`: An iterable of Django ORM `Model` objects that you want stored in the database. They may or may not have `id` set, but you should not have already called `save()` on them.
+- `key_fields`: Identifying attribute name(s) to match up `new_models` items with database rows. If a foreign key is being used as a key field, be sure to pass the `fieldname_id` rather than the `fieldname`.
+- `filters`: Q() filters specifying the subset of the database to work in. Use `None` or `[]` if you want to sync against the entire table.
+- `batch_size`: passes through to Django `bulk_create.batch_size` and `bulk_update.batch_size`, and controls how many objects are created/updated per SQL query.
+- `fields`: (optional) List of fields to update. If not set, will sync all fields that are editable and not auto-created.
+- Returns a dict:
+
+    {
+        'stats': {
+            "created": number of `new_models` not found in database and so created,
+            "updated": number of `new_models` that were found in database as matched by `key_fields`,
+            "deleted": number of deleted objects - rows in database that matched `filters` but were not present in `new_models`.
+        }
+    }

 `def bulk_compare(old_models, new_models, key_fields, ignore_fields=None):`
 Compare two sets of models by `key_fields`.
-- `old_models`: Iterable of Django ORM objects to compare.
-- `new_models`: Iterable of Django ORM objects to compare.
-- `key_fields`: Identifying attribute name(s) to match up `new_models` items with database rows. If a foreign key
-  is being used as a key field, be sure to pass the `fieldname_id` rather than the `fieldname`.
-- `ignore_fields`: (optional) If set, provide field names that should not be considered when comparing objects.
-- Returns dict of: ```
-    {
-        'added': list of all added objects.
-        'unchanged': list of all unchanged objects.
-        'updated': list of all updated objects.
-        'updated_details': dict of {obj: {field_name: (old_value, new_value)}} for all changed fields in each updated object.
-        'removed': list of all removed objects.
-    } ```
+
+- `old_models`: Iterable of Django ORM objects to compare.
+- `new_models`: Iterable of Django ORM objects to compare.
+- `key_fields`: Identifying attribute name(s) to match up `new_models` items with database rows. If a foreign key
+  is being used as a key field, be sure to pass the `fieldname_id` rather than the `fieldname`.
+- `ignore_fields`: (optional) If set, provide field names that should not be considered when comparing objects.
+- Returns dict of: `{ 'added': list of all added objects. 'unchanged': list of all unchanged objects. 'updated': list of all updated objects. 'updated_details': dict of {obj: {field_name: (old_value, new_value)}} for all changed fields in each updated object. 'removed': list of all removed objects. }`

 ## Frameworks Supported

-This library is tested using Python 3 against Django 2.2.
+This library is tested using Python 3 against Django 2.2+. If you are looking for versions that work with Django < 2.2,
+please use the 1.x releases.
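
Review note: the README describes matching `new_models` to existing rows by `key_fields` and reporting created, updated, and deleted counts. The sketch below shows that matching idea in plain Python, so the cost is one pass over each collection rather than a query per record. It is only an illustration under stated assumptions: `key_of` and `partition_by_key` are hypothetical helpers, not functions in django-bulk-sync, and `SimpleNamespace` objects stand in for `Employee` rows so the code runs without Django.

```python
from types import SimpleNamespace


def key_of(obj, key_fields):
    """Build a hashable tuple of the key field values for one object."""
    return tuple(getattr(obj, name) for name in key_fields)


def partition_by_key(existing, incoming, key_fields):
    """Split objects into (created, updated, deleted) by comparing key tuples.

    Hypothetical helper for illustration only; not part of django-bulk-sync.
    """
    existing_by_key = {key_of(obj, key_fields): obj for obj in existing}
    created, updated, seen = [], [], set()
    for obj in incoming:
        key = key_of(obj, key_fields)
        seen.add(key)
        if key in existing_by_key:
            updated.append(obj)  # matched an existing row
        else:
            created.append(obj)  # no match, so it is new
    # Existing rows that no incoming object matched are the ones to delete.
    deleted = [obj for key, obj in existing_by_key.items() if key not in seen]
    return created, updated, deleted


# Plain objects stand in for Employee rows so the sketch runs without Django.
in_db = [SimpleNamespace(name="Scott", age=40), SimpleNamespace(name="Isaac", age=9)]
imported = [SimpleNamespace(name="Scott", age=41), SimpleNamespace(name="Ada", age=36)]

created, updated, deleted = partition_by_key(in_db, imported, ("name",))
```

In this sketch every matched object counts as updated whether or not its values changed, which mirrors how the README defines the `updated` stat.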

bulk_sync/__init__.py

Lines changed: 3 additions & 3 deletions
@@ -12,15 +12,15 @@ def bulk_sync(new_models, key_fields, filters, batch_size=None, fields=None):
     `new_models`: Django ORM objects that are the desired state. They may or may not have `id` set.
     `key_fields`: Identifying attribute name(s) to match up `new_models` items with database rows. If a foreign key
         is being used as a key field, be sure to pass the `fieldname_id` rather than the `fieldname`.
-    `filters`: Q() filters specifying the subset of the database to work in.
+    `filters`: Q() filters specifying the subset of the database to work in. Use `None` or `[]` if you want to sync against the entire table.
     `batch_size`: passes through to Django `bulk_create.batch_size` and `bulk_update.batch_size`, and controls
         how many objects are created/updated per SQL query.
-    `fields`: a list of fields to update - passed to django's bulk_update
+    `fields`: (optional) list of fields to update. If not set, will sync all fields that are editable and not auto-created.

     """
     db_class = new_models[0].__class__

-    if not fields:
+    if fields is None:
         # Get a list of fields that aren't PKs and aren't editable (e.g. auto_add_now) for bulk_update
         fields = [field.name
                   for field in db_class._meta.fields
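
Review note: the switch from `if not fields:` to `if fields is None:` changes what happens when a caller passes an empty list. The sketch below isolates that difference. The two `resolve_fields_*` functions and the `default_fields` argument are hypothetical stand-ins written for this note; they are not the library's code, and what `bulk_sync` then does with an empty list is not shown in this diff.

```python
def resolve_fields_falsy(fields, default_fields):
    """Old check: any falsy value (None, [], ()) falls back to the defaults."""
    if not fields:
        return list(default_fields)
    return list(fields)


def resolve_fields_none(fields, default_fields):
    """New check: only an omitted value (None) falls back to the defaults."""
    if fields is None:
        return list(default_fields)
    return list(fields)


# Assumed default list, standing in for the model's editable fields.
defaults = ["name", "age", "company"]

old_with_empty = resolve_fields_falsy([], defaults)
new_with_empty = resolve_fields_none([], defaults)
old_with_none = resolve_fields_falsy(None, defaults)
new_with_none = resolve_fields_none(None, defaults)
```

Under the old check an explicit empty list is silently replaced by every default field; under the new check it is passed through unchanged, so only omitting the argument selects the defaults.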

setup.py

Lines changed: 1 addition & 2 deletions
@@ -5,7 +5,7 @@

 setuptools.setup(
     name="django-bulk-sync",
-    version='1.3.1',
+    version='2.0.0',
     description="Combine bulk add, update, and delete into a single call.",
     long_description=long_description,
     long_description_content_type='text/markdown',
@@ -18,7 +18,6 @@
         'License :: OSI Approved :: MIT License',
         'Operating System :: OS Independent',
         'Framework :: Django',
-        'Framework :: Django :: 1.11',
         'Framework :: Django :: 2.2',
     ],
     zip_safe=False,

tests/tests.py

Lines changed: 28 additions & 0 deletions
@@ -75,6 +75,34 @@ def test_pk_set_but_keyfield_changes_ignores_pk(self):

         ret = bulk_sync(new_models=new_objs, filters=Q(company_id=c1.id), key_fields=("name",))

+    def test_fields_parameter(self):
+        c1 = Company.objects.create(name="Foo Products, Ltd.")
+        c2 = Company.objects.create(name="Bar Microcontrollers, Inc.")
+
+        e1 = Employee.objects.create(name="Scott", age=40, company=c1)
+        e2 = Employee.objects.create(name="Isaac", age=9, company=c2)
+
+        # We should update Scott's age, and not touch company.
+        new_objs = [
+            Employee(name="Scott", age=41, company=c1),
+            Employee(name="Isaac", age=9, company=c1),
+        ]
+
+        ret = bulk_sync(new_models=new_objs, filters=None, key_fields=("name",), fields=['age'])
+
+        new_e1 = Employee.objects.get(id=e1.id)
+        self.assertEqual("Scott", new_e1.name)
+        self.assertEqual(41, new_e1.age)
+        self.assertEqual(c1, new_e1.company)
+
+        new_e2 = Employee.objects.get(id=e2.id)
+        self.assertEqual("Isaac", new_e2.name)
+        self.assertEqual(9, new_e2.age)
+        self.assertEqual(c2, new_e2.company)
+
+        self.assertEqual(2, ret["stats"]["updated"])
+        self.assertEqual(0, ret["stats"]["created"])
+        self.assertEqual(0, ret["stats"]["deleted"])

 class BulkCompareTests(TestCase):
     """ Test `bulk_compare` method """
