Commit f1139fd

Merge pull request #188645 from markjbrown/main
fix data model for comments
2 parents 1f866a6 + b38cfea commit f1139fd

1 file changed: +13 -20 lines changed

articles/cosmos-db/sql/modeling-data.md

Lines changed: 13 additions & 20 deletions
@@ -7,7 +7,7 @@ ms.author: mjbrown
 ms.service: cosmos-db
 ms.subservice: cosmosdb-sql
 ms.topic: conceptual
-ms.date: 08/26/2021
+ms.date: 02/15/2022
 
 ---
 # Data modeling in Azure Cosmos DB
@@ -113,7 +113,7 @@ Take this JSON snippet.
 }
 ```
 
-This might be what a post entity with embedded comments would look like if we were modeling a typical blog, or CMS, system. The problem with this example is that the comments array is **unbounded**, meaning that there's no (practical) limit to the number of comments any single post can have. This may become a problem as the size of the item could grow infinitely large.
+This might be what a post entity with embedded comments would look like if we were modeling a typical blog, or CMS, system. The problem with this example is that the comments array is **unbounded**, meaning that there's no (practical) limit to the number of comments any single post can have. This may become a problem as the size of the item could grow infinitely large so is a design you should avoid.
 
 As the size of the item grows the ability to transmit the data over the wire as well as reading and updating the item, at scale, will be impacted.

@@ -133,26 +133,19 @@ Post item:
 }
 
 Comment items:
-{
-"postId": "1"
-"comments": [
-{"id": 4, "author": "anon", "comment": "more goodness"},
-{"id": 5, "author": "bob", "comment": "tails from the field"},
-...
-{"id": 99, "author": "angry", "comment": "blah angry blah angry"}
-]
-},
-{
-"postId": "1"
-"comments": [
-{"id": 100, "author": "anon", "comment": "yet more"},
-...
-{"id": 199, "author": "bored", "comment": "will this ever end?"}
-]
-}
+[
+{"id": 4, "postId": "1", "author": "anon", "comment": "more goodness"},
+{"id": 5, "postId": "1", "author": "bob", "comment": "tails from the field"},
+...
+{"id": 99, "postId": "1", "author": "angry", "comment": "blah angry blah angry"},
+{"id": 100, "postId": "2", "author": "anon", "comment": "yet more"},
+...
+{"id": 199, "postId": "2", "author": "bored", "comment": "will this ever end?"}
+]
 ```
 
-This model has the three most recent comments embedded in the post container, which is an array with a fixed set of attributes. The other comments are grouped in to batches of 100 comments and stored as separate items. The size of the batch was chosen as 100 because our fictitious application allows the user to load 100 comments at a time.
+This model has a document for each comment with a property that contains the post id. This allows posts to contain any number of comments and can grow efficiently. Users wanting to see more
+than the most recent comments would query this container passing the postId which should be the partition key for the comments container.
 
 Another case where embedding data isn't a good idea is when the embedded data is used often across items and will change frequently.
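
Note on the new model: the per-comment documents added in this change, keyed by `postId`, could be read roughly as in the following in-memory sketch. It is illustrative only and makes no calls to Azure Cosmos DB. The helper names `partition_by_post` and `comments_for_post` are hypothetical, the sample documents are a subset of those in the diff, and a plain dictionary stands in for the logical partitions of the comments container.

```python
from collections import defaultdict

# Sample per-comment documents, copied from the added lines of the diff.
comments = [
    {"id": 4, "postId": "1", "author": "anon", "comment": "more goodness"},
    {"id": 5, "postId": "1", "author": "bob", "comment": "tails from the field"},
    {"id": 100, "postId": "2", "author": "anon", "comment": "yet more"},
]


def partition_by_post(items):
    """Group comment documents by postId (a stand-in for logical partitions)."""
    partitions = defaultdict(list)
    for item in items:
        partitions[item["postId"]].append(item)
    return partitions


def comments_for_post(partitions, post_id):
    """Return every comment for one post by reading a single partition."""
    # .get avoids creating an empty partition for an unknown postId.
    return partitions.get(post_id, [])


partitions = partition_by_post(comments)
post_1_ids = [c["id"] for c in comments_for_post(partitions, "1")]
```

Because each comment carries its own `postId`, fetching the comments for a post touches one group only, and a post can accumulate comments without any single document growing.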
