README.md (27 additions, 4 deletions)
@@ -1,6 +1,6 @@
-# Schema Transformer: Migrating RU-based Azure Cosmos DB for MongoDB to vCore-based
+# Schema Transformer: Migrating RU-based Azure Cosmos DB for MongoDB to Azure DocumentDB
 
-Schema Transformer is a Python script designed to analyze Mongo RU Collection schemas and efficiently transform them into a vCore-optimized structure. This ensures seamless compatibility and enhances query performance.
+Schema Transformer is a Python script designed to analyze Mongo RU collection schemas and efficiently transform them into a DocumentDB-optimized structure. This ensures seamless compatibility and enhances query performance.
 
 With this tool, you can generate index and sharding recommendations tailored specifically to your workload, making your migration smoother and more efficient.
 
@@ -9,16 +9,17 @@ With this tool, you can generate index and sharding recommendations tailored specifically to your workload, making your migration smoother and more efficient.
 The tool supports the following versions:
 
 -**Source:** Azure Cosmos DB for MongoDB RU-based (version 4.2 and above)
--**Target:** Azure Cosmos DB for MongoDB vCore (all versions)
+-**Target:** Azure DocumentDB (all versions)
 
 ## How to Run the Script
 
 ### Prerequisites
 
 Before running the assessment, ensure that the client machine meets the following requirements:
 
-- Access to both source and target MongoDB endpoints, either over a private or public network via the specified IP or hostname.
+- Access to both the source MongoDB RU endpoint and the target Azure DocumentDB endpoint, either over a private or public network via the specified IP or hostname.
 - Python (version 3.10 or above) must be installed.
+- PyMongo library must be installed (`pip install pymongo`).
 
 ### Steps to Run the Assessment
 
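Since the updated prerequisites call for PyMongo plus network access to both endpoints, a quick pre-flight connectivity check can save time before running the assessment. The snippet below is only an illustrative sketch, not part of the Schema Transformer itself; the connection strings are placeholders to replace with your own.

```python
# Illustrative pre-flight check; the connection strings below are placeholders.
from pymongo import MongoClient
from pymongo.errors import PyMongoError

ENDPOINTS = {
    "source (Cosmos DB for MongoDB RU)": "<source-connection-string>",
    "target (Azure DocumentDB)": "<target-connection-string>",
}

for name, uri in ENDPOINTS.items():
    try:
        # serverSelectionTimeoutMS keeps the check from hanging on unreachable hosts
        with MongoClient(uri, serverSelectionTimeoutMS=5000) as client:
            client.admin.command("ping")
        print(f"{name}: reachable")
    except PyMongoError as exc:
        print(f"{name}: connection failed ({exc})")
```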
@@ -132,6 +133,27 @@ Before running the assessment, ensure that the client machine meets the following requirements:
 }
 ```
 
+6. To colocate collections with a reference collection:
+
+```json
+{
+    "sections": [
+        {
+            "include": [
+                "db1.coll2",
+                "db1.coll3"
+            ],
+            "migrate_shard_key": "false",
+            "drop_if_exists": "true",
+            "optimize_compound_indexes": "true",
+            "co_locate_with": "coll1"
+        }
+    ]
+}
+```
+
+**Note:** The collection specified in `co_locate_with` must already exist in the same database as the collection being processed. If the reference collection is not found, the script will fail with an error.
+
 4. Run the following command, providing the full path of the JSON file created in the previous step:
 
 ```cmd
@@ -148,3 +170,4 @@ This process will generate a vCore-optimized schema with index and sharding recommendations
 | **migrate_shard_key** | Determines whether the existing shard key definition should be migrated. If set to `True`, the shard key is retained; if `False`, the target collection remains unsharded. Collections that are originally unsharded in the source will remain unsharded in the target, regardless of this setting. **Default:** `False`. |
 | **drop_if_exists** | Specifies whether collections with the same name in the target should be dropped and recreated. If `True`, existing collections are removed before migration; if `False`, they remain unchanged. **Default:** `False`. |
 | **optimize_compound_indexes** | Controls whether compound indexes should be optimized. If `True`, the script identifies redundant indexes and excludes them from migration; if `False`, all indexes are migrated as-is. **Default:** `False`. |
+| **co_locate_with** | Specifies the name of a reference collection from the same database to colocate with. When specified, the target collection will be colocated with the reference collection for improved query performance. The reference collection must exist in the same database before colocation is applied, or an error will be thrown. This option is useful for optimizing queries that join or access related collections together. **Default:** `None`. |
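To complement the `co_locate_with` note added above, the sketch below shows one way to confirm that each reference collection exists before running the script. The config path and connection string are placeholders, and checking the target endpoint (rather than the source) is an assumption, not documented behavior of the tool.

```python
# Sketch of a pre-flight check for co_locate_with; the path and connection string
# are placeholders, and checking the target endpoint is an assumption.
import json
from pymongo import MongoClient

CONFIG_PATH = "schema_transformer_config.json"   # hypothetical path to the JSON file created earlier
TARGET_URI = "<target-connection-string>"        # placeholder

with open(CONFIG_PATH) as f:
    config = json.load(f)

with MongoClient(TARGET_URI) as client:
    for section in config.get("sections", []):
        reference = section.get("co_locate_with")
        if not reference:
            continue
        # Entries in "include" are "<db>.<collection>" namespaces; the reference
        # collection must live in the same database.
        for namespace in section.get("include", []):
            db_name = namespace.split(".", 1)[0]
            if reference not in client[db_name].list_collection_names():
                print(f"Missing reference collection: {db_name}.{reference}")
```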