Skip to content

Logootish

Nathan Pennie edited this page Nov 2, 2019 · 7 revisions

This assumes that you've read the section in "Home" and are familiar with Logoot.

Terminology

This may be a little bit confusing at first. I'd recommend that you skim the terminology section first and use it for reference when reading the rest of this.

  • Atom, character -- The two terms are used somewhat interchangeably. This is the smallest segment of text that cannot be split into smaller pieces. In this implementation, characters and atoms are the same thing, but in other Logoot implementations, lines could be used as atoms, for example. The downside to this would be that you couldn't insert text in the middle of the line. So for this algorithm, characters are used as atoms
  • Position -- A unique value for the atom that specifies its position
    • These positions must be able to be ordered in a set (I will explain ordering a bit later)
    • The current implementation used for a position is an array of numbers in a class used to provide utility functions. Examples of positions in text format are [0] or [0,0] or [1,2,3]
    • For any two positions, another position must be able to be created between them. If just integers are used, this poses a problem, since there's nothing between 0 and 1, for example. The solution to this is to add another integer to the array to make the position more specific
  • Levels -- The number of elements in the position array. For example, [0] would have 1 level. Be careful though! When getting the variable levels from a LogootPosition, 0 represents one level, not 1 like you would expect from an array length. That way, myposition.level(myposition.levels) will return the lowest level.
  • Vector Clock, rclk -- Imagine that you first insert a single atom at a particular position and then remove it to insert a different letter. If these edits came out of order, how would the client know that the second letter should be used and not removed? The solution is the vector clock, which is named rclk in the code and in events. The vector clock is incremented with every removal
  • Events -- This refers to the Matrix events actually sent in JSON form. Since it would be inefficient to have one event per insertion or removal, any length properties used are simply added on to the lowest level to find the end position. For example, if I insert hi at [0,1,0], the message would have a start position of [0,1,0] and the body of hi. The client would then figure out that the end position, or the position where the next node would be, is [0,1,2]. Thanks to in browser spell check, I now realize that it's insertion instead of insertation. If you see that, I meant to change it. Oops. Not a big deal
  • Node -- Since text is sent in longer chunks, to save on RAM, it is stored in longer chunks as well. A single segment spanning between two positions with (and this is key!) no nodes in the middle of it is a single node. Any nodes placed in the middle must cause the parent node to be split in two. (Though I will explain it in more detail later, this is one of the many jobs of the _mergeNode function) The reason for the necessary split is because the binary search tree must not have overlap to find nodes

Insertions

(Look, I even spelled it right this time!) These are the simplest type of modification. Let's assume that our document looks like 134, where the 1 has the position of [0] and the 3 and 4 have the positions [2] and [3], respectively. If 2 is inserted between the 1 and 3, it's easy to find a position in between their positions. Since 1 has the position [0] and 3 has the position [2], the position [1] is conveniently located right between them. So the new 2 can be given the position [1].

The final document would be represented like this:

[0]     - '1'
[1]     - '2'
[2]-[3] - '34'

Of course, usually it is never that easy. Let's assume that we have the same 134 as before, but this time the positions are [0], [1] and [2], respectively. Now, we try to insert 2 between 1 and 3, as before. This time, we must create a new position between [0] and [1] by adding a new level. Remember that the node with more levels comes first, so the new position would be [1,0]. It seems counter-intuitive to have [1,0] come before [1], but it greatly simplifies the code.

The final document would be represented like this:

[0]     - '1'
[1,0]   - '2'
[1]-[2] - '34'

Removals

...coming soon

Contributing

Clone this wiki locally