Merge pull request #188645 from markjbrown/main

PRMerger13 · web-flow · commit f1139fd5a16d · 2022-02-16T00:50:36.000+05:30
fix data model for comments
diff --git a/articles/cosmos-db/sql/modeling-data.md b/articles/cosmos-db/sql/modeling-data.md
@@ -7,7 +7,7 @@ ms.author: mjbrown
 ms.service: cosmos-db
 ms.subservice: cosmosdb-sql
 ms.topic: conceptual
-ms.date: 08/26/2021
+ms.date: 02/15/2022
 
 ---
 # Data modeling in Azure Cosmos DB
@@ -113,7 +113,7 @@ Take this JSON snippet.
 }
 ```
 
-This might be what a post entity with embedded comments would look like if we were modeling a typical blog, or CMS, system. The problem with this example is that the comments array is **unbounded**, meaning that there's no (practical) limit to the number of comments any single post can have. This may become a problem as the size of the item could grow infinitely large.
+This might be what a post entity with embedded comments would look like if we were modeling a typical blog, or CMS, system. The problem with this example is that the comments array is **unbounded**, meaning that there's no (practical) limit to the number of comments any single post can have. Because the item could grow without bound, this is a design you should avoid.
 
 As the size of the item grows the ability to transmit the data over the wire as well as reading and updating the item, at scale, will be impacted.
 
@@ -133,26 +133,19 @@ Post item:
 }
 
 Comment items:
-{
-    "postId": "1"
-    "comments": [
-        {"id": 4, "author": "anon", "comment": "more goodness"},
-        {"id": 5, "author": "bob", "comment": "tails from the field"},
-        ...
-        {"id": 99, "author": "angry", "comment": "blah angry blah angry"}
-    ]
-},
-{
-    "postId": "1"
-    "comments": [
-        {"id": 100, "author": "anon", "comment": "yet more"},
-        ...
-        {"id": 199, "author": "bored", "comment": "will this ever end?"}
-    ]
-}
+[
+    {"id": 4, "postId": "1", "author": "anon", "comment": "more goodness"},
+    {"id": 5, "postId": "1", "author": "bob", "comment": "tails from the field"},
+    ...
+    {"id": 99, "postId": "1", "author": "angry", "comment": "blah angry blah angry"},
+    {"id": 100, "postId": "2", "author": "anon", "comment": "yet more"},
+    ...
+    {"id": 199, "postId": "2", "author": "bored", "comment": "will this ever end?"}
+]
 ```
 
-This model has the three most recent comments embedded in the post container, which is an array with a fixed set of attributes. The other comments are grouped in to batches of 100 comments and stored as separate items. The size of the batch was chosen as 100 because our fictitious application allows the user to load 100 comments at a time.  
+This model stores each comment as its own item, with a property that contains the post ID. A post can therefore have any number of comments without the post item itself growing. Users who want to see more
+than the most recent comments query this container by postId, which should be the partition key for the comments container.
 
 Another case where embedding data isn't a good idea is when the embedded data is used often across items and will change frequently.
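
The revised comment model in this change (one item per comment, looked up by postId) can be illustrated with a minimal in-memory sketch. It is not the Azure Cosmos DB SDK: `CommentsContainer`, `upsert`, `query_by_post`, and the `limit` parameter are hypothetical names chosen for illustration, and the sketch assumes postId is the partition key, as the new paragraph recommends.

```python
from collections import defaultdict


class CommentsContainer:
    """Hypothetical in-memory stand-in for a comments container.

    Assumption: postId is the partition key, so every comment for a
    given post lives in the same logical partition.
    """

    def __init__(self):
        # postId -> {comment id -> comment item}
        self._partitions = defaultdict(dict)

    def upsert(self, comment):
        # Each comment is its own small item; the post item never grows.
        self._partitions[comment["postId"]][comment["id"]] = comment

    def query_by_post(self, post_id, limit=100):
        # Only the partition for this postId is read.
        partition = self._partitions.get(post_id, {})
        newest_first = sorted(partition.values(), key=lambda c: c["id"], reverse=True)
        return newest_first[:limit]


container = CommentsContainer()
for item in [
    {"id": 4, "postId": "1", "author": "anon", "comment": "more goodness"},
    {"id": 5, "postId": "1", "author": "bob", "comment": "tails from the field"},
    {"id": 99, "postId": "1", "author": "angry", "comment": "blah angry blah angry"},
    {"id": 100, "postId": "2", "author": "anon", "comment": "yet more"},
    {"id": 199, "postId": "2", "author": "bored", "comment": "will this ever end?"},
]:
    container.upsert(item)

# Comments for post "1" only, newest id first.
for comment in container.query_by_post("1"):
    print(comment["id"], comment["author"])
```

Because each lookup names a single postId, it touches only that post's comments, which is the property the partition key choice is meant to provide. Ordering by `id` is an assumption of the sketch; a real container would more likely sort on a timestamp.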