Combine bulk create, update, and delete into a single call.

`django-bulk-sync` is a package for the Django ORM that combines `bulk_create`, `bulk_update`, and delete into a single method call to `bulk_sync`.

It manages all necessary creates, updates, and deletes with as few database calls as possible to maximize performance.

## Installation

The package is available from PyPI as [django-bulk-sync][django-bulk-sync]. Install it with pip:

`pip install django-bulk-sync`
@@ -17,17 +18,17 @@ then import via:
## A Usage Scenario

Companies have zero or more Employees. Suppose you want to efficiently sync the names of all employees for a single `Company` from an import file supplied by that company, in which some employees have been added, updated, or removed. The simple approach is inefficient: read the import line by line and, for each of N records:

- SELECT to check for the employee's existence
- UPDATE if it exists, INSERT if it doesn't

Then figure out some way to identify what was missing and delete it. As is so often the case, the speed of this process is controlled mostly by the number of queries run. Here that is about two queries for every record, so the query count is O(N).

Instead, with `bulk_sync`, we can avoid the O(N) number of queries and simplify the logic we have to write as well.
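
To make the query count of the simple approach concrete, here is a minimal sketch that runs the per-record loop against an in-memory stand-in for a table. `CountingTable` and `naive_sync` are hypothetical names invented for this illustration; nothing here touches Django or `bulk_sync` itself.

```python
class CountingTable:
    """Hypothetical in-memory stand-in for a database table that counts queries."""

    def __init__(self, rows):
        self.rows = dict(rows)  # maps employee name -> phone number
        self.queries = 0

    def select(self, name):
        self.queries += 1
        return self.rows.get(name)

    def insert(self, name, phone):
        self.queries += 1
        self.rows[name] = phone

    def update(self, name, phone):
        self.queries += 1
        self.rows[name] = phone

    def delete_missing(self, keep_names):
        # One DELETE removing every row whose name is not in the import.
        self.queries += 1
        self.rows = {n: p for n, p in self.rows.items() if n in keep_names}


def naive_sync(table, records):
    """Per-record sync: one SELECT plus one UPDATE or INSERT for each record."""
    for name, phone in records:
        if table.select(name) is None:
            table.insert(name, phone)
        else:
            table.update(name, phone)
    table.delete_missing({name for name, _ in records})


table = CountingTable({"Ann": "555-0100", "Bob": "555-0101"})
records = [("Ann", "555-0199"), ("Cy", "555-0102"), ("Di", "555-0103")]
naive_sync(table, records)
print(table.queries)  # two queries per record, plus one final delete
```

The counter grows by two for every imported record, which is the linear growth described above.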
## Example Usage
```python
from django.db.models import Q
from bulk_sync import bulk_sync

new_models = []
for line in company_import_file:
    # The `.id` (or `.pk`) field should not be set. Instead, `key_fields`
    # tells it how to match.
    e = Employee(name=line['name'], phone_number=line['phone_number'], ...)
    new_models.append(e)

# `filters` controls the subset of objects considered when deciding to
# update or delete.
filters = Q(company_id=501)
# `key_fields` matches an existing object if all `key_fields` are equal.
key_fields = ('name', )
ret = bulk_sync(
    new_models=new_models,
    filters=filters,
    key_fields=key_fields,
)

print("Results of bulk_sync: "
      "{created} created, {updated} updated, {deleted} deleted."
      .format(**ret['stats']))
```
Under the hood, it will atomically call `bulk_create`, `bulk_update`, and a single queryset `delete()` call, to correctly and efficiently update all fields of all employees for the filtered Company, using `name` to match properly.
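
The matching step can be pictured with a small pure-Python sketch. It is an illustration only, not the library's implementation: `Employee` here is a plain dataclass standing in for a Django model, `plan_sync` and `key_of` are hypothetical helpers, and `existing` plays the role of the rows already selected by `filters`.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """Plain stand-in for a Django model instance (hypothetical)."""
    name: str
    phone_number: str
    company_id: int


def key_of(obj, key_fields):
    """Build the matching key from the values of `key_fields`."""
    return tuple(getattr(obj, f) for f in key_fields)


def plan_sync(existing, new_models, key_fields):
    """Split objects into those to create, update, and delete, by key."""
    existing_by_key = {key_of(o, key_fields): o for o in existing}
    new_keys = {key_of(o, key_fields) for o in new_models}
    to_create = [o for o in new_models if key_of(o, key_fields) not in existing_by_key]
    to_update = [o for o in new_models if key_of(o, key_fields) in existing_by_key]
    to_delete = [o for k, o in existing_by_key.items() if k not in new_keys]
    return to_create, to_update, to_delete


existing = [Employee("Ann", "555-0100", 501), Employee("Bob", "555-0101", 501)]
incoming = [Employee("Ann", "555-0199", 501), Employee("Cy", "555-0102", 501)]
to_create, to_update, to_delete = plan_sync(existing, incoming, ("name",))
stats = {
    "created": len(to_create),
    "updated": len(to_update),
    "deleted": len(to_delete),
}
```

Each of the three lists can then be handed to one bulk operation, which is why the number of queries no longer grows with every record.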
Combine bulk create, update, and delete. Make the DB match a set of in-memory objects.

- `new_models`: An iterable of Django ORM `Model` objects that you want stored in the database. They may or may not have `id` set, but you should not have already called `save()` on them.
- `key_fields`: Identifying attribute name(s) to match up `new_models` items with database rows. If a foreign key is being used as a key field, be sure to pass the `fieldname_id` rather than the `fieldname`.
- `filters`: `Q()` filters specifying the subset of the database to work in. Use `None` or `[]` if you want to sync against the entire table.
- `batch_size`: Passed through as the `batch_size` argument of Django's `bulk_create` and `bulk_update`; controls how many objects are created/updated per SQL query.
- `fields`: (optional) List of fields to update. If not set, will sync all fields that are editable and not auto-created.
- Returns a dict:

        {
            'stats': {
                "created": number of `new_models` not found in database and so created,
                "updated": number of `new_models` that were found in database as matched by `key_fields`,
                "deleted": number of deleted objects - rows in database that matched `filters` but were not present in `new_models`.
            }
        }

To compare two sets of model objects, the arguments are:

- `old_models`: Iterable of Django ORM objects to compare.
- `new_models`: Iterable of Django ORM objects to compare.
- `key_fields`: Identifying attribute name(s) to match up `new_models` items with database rows. If a foreign key is being used as a key field, be sure to pass the `fieldname_id` rather than the `fieldname`.
- `ignore_fields`: (optional) If set, provide field names that should not be considered when comparing objects.
- Returns dict of:

  ```
  {
      'added': list of all added objects.
      'unchanged': list of all unchanged objects.
      'updated': list of all updated objects.
      'updated_details': dict of {obj: {field_name: (old_value, new_value)}} for all changed fields in each updated object.
      'removed': list of all removed objects.
  }
  ```
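
The shape of that comparison result can be sketched in plain Python. This is a hedged illustration under stated assumptions, not the library's code: `Row` is a hypothetical frozen dataclass standing in for a model, `compare` is an invented helper, and the choice to report the new object (rather than the old one) under `updated` is an assumption.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Row:
    """Hashable stand-in for a model instance (hypothetical)."""
    name: str
    phone_number: str


def compare(old_models, new_models, key_fields, ignore_fields=()):
    """Match objects by key and report added/unchanged/updated/removed."""
    def key_of(obj):
        return tuple(getattr(obj, f) for f in key_fields)

    old_by_key = {key_of(o): o for o in old_models}
    new_by_key = {key_of(o): o for o in new_models}
    result = {"added": [], "unchanged": [], "updated": [],
              "updated_details": {}, "removed": []}
    for key, new in new_by_key.items():
        old = old_by_key.get(key)
        if old is None:
            result["added"].append(new)
            continue
        # Collect (old_value, new_value) for every differing, non-ignored field.
        changes = {
            f.name: (getattr(old, f.name), getattr(new, f.name))
            for f in fields(new)
            if f.name not in ignore_fields
            and getattr(old, f.name) != getattr(new, f.name)
        }
        if changes:
            result["updated"].append(new)
            result["updated_details"][new] = changes
        else:
            result["unchanged"].append(new)
    result["removed"] = [o for k, o in old_by_key.items() if k not in new_by_key]
    return result


old = [Row("Ann", "555-0100"), Row("Bob", "555-0101"), Row("Eve", "555-0104")]
new = [Row("Ann", "555-0199"), Row("Bob", "555-0101"), Row("Cy", "555-0102")]
diff = compare(old, new, key_fields=("name",))
```

Passing `ignore_fields=("phone_number",)` in this sketch would move "Ann" from `updated` to `unchanged`, since the only differing field is then skipped.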
## Frameworks Supported
This library is tested using Python 3 against Django 2.2+. If you are looking for versions that work with Django < 2.2,