DB normalization/joining #210

@si14

This issue is mostly an invitation to continue the Slack discussion.

The problem

I frequently find myself writing normalization/denormalization code in various places. It would be nice to have an "officially blessed" approach, whether as part of re-frame, as a library, or just as a documented pattern.

What's normalization?

The long explanation can be found in the Om.Next wiki. In short, in this particular context, it's the practice of storing "objects" (or an entity's properties, if you will) in a map, often a top-level one, keyed by their ID. For example, if we have some Snippets belonging to a bunch of Users, we may structure the app DB as follows:

{:user/by-id {:joe-id {:name "Joe"
                       :id :joe-id
                       :snippets [1 3 5]}
              :jane-id {:name "Jane"
                        :id :jane-id
                        :snippets [2 4]}}
 :snippet/by-id {1 {:id 1
                    :creation-date ...
                    :text "..."}
                 2 {...}}}
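
Reading from this shape means resolving IDs back into entities. The following is a minimal sketch, not something re-frame provides: `user-snippets` is a hypothetical helper, and the snippet texts are placeholder values invented so the example can run.

```clojure
;; App DB in the shape shown above. The snippet contents are
;; placeholders made up for this sketch.
(def db
  {:user/by-id    {:joe-id  {:name "Joe"  :id :joe-id  :snippets [1 3 5]}
                   :jane-id {:name "Jane" :id :jane-id :snippets [2 4]}}
   :snippet/by-id {1 {:id 1 :text "a"}
                   2 {:id 2 :text "b"}
                   3 {:id 3 :text "c"}
                   4 {:id 4 :text "d"}
                   5 {:id 5 :text "e"}}})

(defn user-snippets
  "Hypothetical helper: resolves each ID in a user's :snippets vector
  against :snippet/by-id, preserving the order of the ID vector."
  [db user-id]
  (let [ids (get-in db [:user/by-id user-id :snippets])]
    (mapv #(get-in db [:snippet/by-id %]) ids)))

(user-snippets db :jane-id)
;; => [{:id 2, :text "b"} {:id 4, :text "d"}]
```

In re-frame, a function like this would probably live in a subscription, so views never see raw IDs.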

There are two separate concerns resolved by normalization: ordering and modification consistency.

Modification consistency

I'll lift an example from the Om.Next docs. Let's say you have the following structure:

(def init-data
  {:list/one [{:name "John" :points 0}
              {:name "Mary" :points 0}
              {:name "Bob"  :points 0}]
   :list/two [{:name "Mary" :points 0 :age 27}
              {:name "Gwen" :points 0}
              {:name "Jeff" :points 0}]})

You see that "Mary" occurs in both lists? If you want to increment Mary's points, you'll need to ensure that you are incrementing them in both places. On the other hand, if you normalize the structure like this

{:list/one
 [[:person/by-name "John"]
  [:person/by-name "Mary"]
  [:person/by-name "Bob"]],
 :list/two
 [[:person/by-name "Mary"]
  [:person/by-name "Gwen"]
  [:person/by-name "Jeff"]],
 :person/by-name
 {"John" {:name "John", :points 0},
  "Mary" {:name "Mary", :points 0, :age 27},
  "Bob" {:name "Bob", :points 0},
  "Gwen" {:name "Gwen", :points 0}, 
  "Jeff" {:name "Jeff", :points 0}}}

you increment the value in exactly one place and the data stays consistent across your app. Note that Om.Next uses explicit "reference typing" such as [:person/by-name "Mary"], while the other examples in this issue use an "implicit" style with bare IDs. Which one do you think is better?
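
To make the single-update point concrete, here is a rough sketch over the normalized structure above. `add-points` and `resolve-list` are hypothetical names, not Om.Next or re-frame API. It relies on one property of the "explicit" style: a reference like [:person/by-name "Mary"] is also a valid `get-in` path into the DB.

```clojure
;; The normalized structure from above.
(def people-db
  {:list/one [[:person/by-name "John"]
              [:person/by-name "Mary"]
              [:person/by-name "Bob"]]
   :list/two [[:person/by-name "Mary"]
              [:person/by-name "Gwen"]
              [:person/by-name "Jeff"]]
   :person/by-name {"John" {:name "John" :points 0}
                    "Mary" {:name "Mary" :points 0 :age 27}
                    "Bob"  {:name "Bob"  :points 0}
                    "Gwen" {:name "Gwen" :points 0}
                    "Jeff" {:name "Jeff" :points 0}}})

(defn add-points
  "Hypothetical helper: one update, in the only place the person is stored."
  [db person-name n]
  (update-in db [:person/by-name person-name :points] + n))

(defn resolve-list
  "Hypothetical helper: denormalizes a list by using each explicit
  reference as a get-in path into the DB."
  [db list-key]
  (mapv #(get-in db %) (get db list-key)))

(def updated (add-points people-db "Mary" 1))

(resolve-list updated :list/one)
;; => [{:name "John", :points 0} {:name "Mary", :points 1, :age 27} {:name "Bob", :points 0}]
(resolve-list updated :list/two)
;; => [{:name "Mary", :points 1, :age 27} {:name "Gwen", :points 0} {:name "Jeff", :points 0}]
```

With "implicit" IDs, `resolve-list` would additionally need to be told which table the IDs point into.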

Ordering

Sometimes you want to preserve a server-provided ordering (let's say you receive a vector of Snippets) while having a way to access "objects" by IDs. You have at least three alternatives:

  1. Create both a vector and a map of Snippets, hopefully exploiting structural sharing. You then have to keep the two structures consistent by hand on every modification.
  2. Use something like map-indexed to add an index to each Snippet, then use either sorted-map-by or a sort on that index in your views. It's workable but honestly doesn't feel right.
  3. Accumulate all Snippets in a map and use a vector of IDs to preserve the order.
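
The third alternative could look roughly like the sketch below. The key `:snippet/order`, the function names, and the sample response are all assumptions made for illustration.

```clojure
(defn index-snippets
  "Hypothetical helper: splits a server-ordered vector of snippets into
  a lookup map keyed by :id and a vector of IDs recording the order."
  [snippets]
  {:snippet/by-id (into {} (map (juxt :id identity)) snippets)
   :snippet/order (mapv :id snippets)})

(defn ordered-snippets
  "Hypothetical helper: rebuilds the ordered vector by looking up each
  ID from :snippet/order in :snippet/by-id."
  [db]
  (mapv (:snippet/by-id db) (:snippet/order db)))

;; A made-up server response whose order we want to keep.
(def response
  [{:id 7 :text "first"}
   {:id 2 :text "second"}
   {:id 5 :text "third"}])

(def snippet-db (index-snippets response))

;; Modify a snippet by ID; the order vector is untouched.
(-> snippet-db
    (assoc-in [:snippet/by-id 2 :text] "edited")
    ordered-snippets)
;; => [{:id 7, :text "first"} {:id 2, :text "edited"} {:id 5, :text "third"}]
```

Each Snippet is stored once, so a modification touches only the map, and the order vector changes only when Snippets are added, removed, or reordered.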

Questions

  • Do you think there is a problem here?
  • Do you see normalization as a way to structure data in your webapps?
  • Do you think re-frame should facilitate a solution, either through docs, code or a link to a library?
  • Which "reference style" would you prefer: the "typed"/"explicit" one or the "implicit" one?
