Home
Welcome to the Dynamoid wiki! This will expand as we grow our knowledgebase.
DynamoDB is a key-value store at heart, so it does not sort results in general; ordering is only available through what is known as the Range key.
Sorting therefore applies only among items that share the same HashKey. For example, we could have this as a Topics table:
| Forum | Topic | PostCount |
|---|---|---|
| General | Hello World | 10 |
| General | For Your Information | 25 |
| General | Vote for New Moderator | 13 |
| Health and Science | The New Cure for Allergies | 10 |
| Health and Science | Gluten, is it good for you? | 3 |
| Technology | The New iPhone X | 142 |
If we have Forum as our HashKey and PostCount as our Range key for the entire table then we can get the "topics" sorted in each forum by post count:
Topics.where(forum: "General").order('post_count DESC')
This returns the topics in the General forum in descending order of post count. In this case we don't need a secondary index because the base table already has PostCount as its Range key.
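To make the per-HashKey ordering concrete, here is a minimal in-memory sketch in plain Ruby. It is not Dynamoid or DynamoDB code: `sample_topics` and `query_topics` are hypothetical helpers that only model how a query is confined to one HashKey value and then ordered by the Range key.

```ruby
# Sample rows from the Topics table above (Forum = HashKey, PostCount = Range key).
def sample_topics
  [
    { forum: "General", topic: "Hello World", post_count: 10 },
    { forum: "General", topic: "For Your Information", post_count: 25 },
    { forum: "General", topic: "Vote for New Moderator", post_count: 13 },
    { forum: "Health and Science", topic: "The New Cure for Allergies", post_count: 10 },
    { forum: "Health and Science", topic: "Gluten, is it good for you?", post_count: 3 },
    { forum: "Technology", topic: "The New iPhone X", post_count: 142 }
  ]
end

# Model of a query: select one HashKey value, then order by the Range key.
def query_topics(rows, forum:, descending: false)
  matches = rows.select { |row| row[:forum] == forum }
  sorted = matches.sort_by { |row| row[:post_count] }
  descending ? sorted.reverse : sorted
end

# Print the General forum's topics, highest post count first.
query_topics(sample_topics, forum: "General", descending: true).each do |row|
  puts "#{row[:post_count]} #{row[:topic]}"
end
```

Note that the sketch never compares topics from different forums with each other, which mirrors the restriction described above.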
Your post example is a bit harder. I'm not sure what your HashKey would be, but in order to sort you need rows that share the same HashKey value.
The general Post example would likely be:
| Id | Content | Votes |
|---|---|---|
| 1 | Top 10 Ruby Gems You Should Use | 10 |
| 2 | The Best Tacos in Town | 3 |
| 3 | Follow These Users For Maximum Success | 13 |
In this case our HashKey might be Id, with Votes as our Range key. However, the Range key is sorted PER HashKey, and every Id is different, which means you can't actually query this table and get your posts sorted by votes with this schema.
Instead you would have to add something like "Site" to your schema, so that each Post belongs to a particular Site.
| Id | Content | Votes | Site |
|---|---|---|---|
| 1 | Top 10 Ruby Gems You Should Use | 10 | My Blog |
| 2 | The Best Tacos in Town | 3 | My Blog |
| 3 | Follow These Users For Maximum Success | 13 | My Blog |
| 4 | What Happens Next Will Surprise You | 10 | Jerry's Blog |
| 5 | Top 10 Conspiracy Theories With Rick | 3 | Jerry's Blog |
With this schema we can get our posts sorted by "votes" by making Site the HashKey and Votes the Range key, and we'd do:
Post.where(site: "My Blog").order("votes DESC")
Again, this HashKey part is important!
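The contrast between the two schemas can be sketched in plain Ruby. This is an in-memory model only, not Dynamoid code; `sample_posts` and `partition_rows` are hypothetical helpers that group rows by a chosen HashKey and sort each group by the Range key.

```ruby
# Rows from the Post table above, including the added Site attribute.
def sample_posts
  [
    { id: 1, content: "Top 10 Ruby Gems You Should Use", votes: 10, site: "My Blog" },
    { id: 2, content: "The Best Tacos in Town", votes: 3, site: "My Blog" },
    { id: 3, content: "Follow These Users For Maximum Success", votes: 13, site: "My Blog" },
    { id: 4, content: "What Happens Next Will Surprise You", votes: 10, site: "Jerry's Blog" },
    { id: 5, content: "Top 10 Conspiracy Theories With Rick", votes: 3, site: "Jerry's Blog" }
  ]
end

# Group rows by the chosen HashKey and sort each group by the Range key.
def partition_rows(rows, hash_key:, range_key:)
  rows.group_by { |row| row[hash_key] }
      .transform_values { |items| items.sort_by { |row| row[range_key] } }
end

by_id = partition_rows(sample_posts, hash_key: :id, range_key: :votes)
by_site = partition_rows(sample_posts, hash_key: :site, range_key: :votes)

# With Id as the HashKey every group holds a single post, so there is nothing to sort.
p by_id.values.map(&:size)

# With Site as the HashKey, all of "My Blog" sits in one group ordered by votes.
p by_site["My Blog"].reverse.map { |row| row[:content] }
```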
Building off this example: our HashKey is Site and our Range key is Votes, and a table can only have one Range key. You can, however, create Local Secondary Indices (which is basically adding more range keys to the table :P), so our schema could be:
| Type | Name | HashKey | RangeKey |
|---|---|---|---|
| Table | posts | Site | Votes |
| Local Secondary Index | content_index | Site | Content |
| Local Secondary Index | id_index | Site | Id |
| Local Secondary Index | created_at_index | Site | CreatedAt |
By doing this we can then do queries like:
Post.where(site: "My Blog").order("content DESC")
Post.where(site: "My Blog").order("id DESC")
Post.where(site: "My Blog").order("created_at DESC")
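One way to picture Local Secondary Indices is as extra sort orders over the same HashKey. The plain-Ruby sketch below is an in-memory model, not Dynamoid code; `PostsModel` and its `query` method are hypothetical. `created_at_index` is left out only because the sample rows above carry no CreatedAt values; it would work the same way.

```ruby
# Plain-Ruby stand-in for the posts table; not a Dynamoid class.
class PostsModel
  HASH_KEY = :site

  # The table's own Range key, plus one Range key per Local Secondary Index.
  RANGE_KEYS = {
    table: :votes,
    content_index: :content,
    id_index: :id
  }.freeze

  def initialize(rows)
    @rows = rows
  end

  # Every lookup is scoped to one Site; the index only changes the ordering.
  def query(site:, index: :table, descending: false)
    range_key = RANGE_KEYS.fetch(index) # raises KeyError for an undeclared index
    items = @rows.select { |row| row[HASH_KEY] == site }
                 .sort_by { |row| row[range_key] }
    descending ? items.reverse : items
  end
end

posts = PostsModel.new([
  { id: 1, content: "Top 10 Ruby Gems You Should Use", votes: 10, site: "My Blog" },
  { id: 2, content: "The Best Tacos in Town", votes: 3, site: "My Blog" },
  { id: 3, content: "Follow These Users For Maximum Success", votes: 13, site: "My Blog" }
])

# Same Site, three different orderings.
p posts.query(site: "My Blog", descending: true).map { |row| row[:votes] }
p posts.query(site: "My Blog", index: :content_index, descending: true).map { |row| row[:content] }
p posts.query(site: "My Blog", index: :id_index, descending: true).map { |row| row[:id] }
```

Asking for an ordering that has no declared index fails in the sketch, which is the point: you can only sort by a Range key that the table or one of its indices defines.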
Again, I must repeat: you MUST have the same HashKey in order to sort. Thus we cannot ask for ALL POSTS sorted by Votes in this scheme without introducing a new attribute or, say, having all Site values be exactly the same across all rows. (Hope that makes sense.)
So the difference is that Local Secondary Indices:
- Must be created when the table is created
- Must share the same HashKey as the table
- Must specify the Range key to use
Whereas a Global Secondary Index may have a different HashKey and a different Range key.
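As a rough illustration of that difference, the plain-Ruby sketch below models a Global Secondary Index as a regrouping of the same rows under a different HashKey. It is an in-memory model only; the `kind` attribute and the `build_index` helper are hypothetical and are not part of the schema above or of Dynamoid. Giving every post the same `kind` is one possible way of "introducing a new attribute" so that all posts can be ordered by votes.

```ruby
# Regroup rows under an index's own HashKey, ordered by its Range key.
def build_index(rows, hash_key:, range_key:)
  rows.group_by { |row| row[hash_key] }
      .transform_values { |items| items.sort_by { |row| row[range_key] } }
end

posts = [
  { id: 1, votes: 10, site: "My Blog" },
  { id: 3, votes: 13, site: "My Blog" },
  { id: 5, votes: 3, site: "Jerry's Blog" }
]

# Hypothetical extra attribute: every post gets the same value.
tagged = posts.map { |row| row.merge(kind: "post") }

# The index uses a HashKey (kind) that differs from the table's HashKey (site).
votes_index = build_index(tagged, hash_key: :kind, range_key: :votes)

# All posts from every site, highest votes first.
p votes_index["post"].reverse.map { |row| row[:id] }
```

Putting every item under one HashKey value funnels all of that index's traffic to a single key, so treat this as a sketch of the idea rather than a recommendation.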
However, be careful: on a table with Local Secondary Indices, no item collection (all the items that share one HashKey value, together with their index entries) can hold more than 10 GB of data (see http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LSI.html#LSI.ItemCollections.SizeLimit). In the example here, all rows with Site as "My Blog" share the same HashKey, so together they must fit within 10 GB of space.