diff --git a/README.md b/README.md
index d9ff043..088402d 100644
--- a/README.md
+++ b/README.md
@@ -122,6 +122,64 @@ e.g. to take the existing 1kb JSON paramters, but also support 124-byte keys, us
 If you are deriving a key to look up in-circuit and you do not know the maximum length of the key, all query methods have a version with a `_var` suffix (e.g. `JSON::get_string_var`), which accepts the key as a `BoundedVec`
 
+# Architecture
+### Overview
+The JSON parser uses 5 steps to efficiently parse and index JSON data:
+
+1. **build_transcript** - Convert the raw bytes into a transcript of tokens using a state machine defined by `JSON_CAPTURE_TABLE`, categorizing each character as part of a string, number, literal or grammar symbol
+2. **capture_missing_tokens & keyswap** - Perform a second pass over the tokens to fix missing tokens (e.g. commas after literals) and relabel strings that act as object keys as key tokens
+3. **compute_json_packed** - Pack the JSON bytes into Field elements for efficient substring extraction
+4. **create_json_entries** - Create structured JSON entries with parent-child relationships
+5. **compute_keyhash_and_sort_json_entries** - Sort entries by key hash for efficient lookups
+
+### Key Design Patterns
+- **Table lookups**: many lookup tables replace branching logic, reducing circuit size
+- **Packing data into Field elements**: multiple values that encode different features are combined into a single Field element so they can be compared at once
+
+### Table Generation
+The parser uses several lookup tables generated from `src/_table_generation/`:
+- `TOKEN_FLAGS_TABLE`: State transitions for token processing
+- `JSON_CAPTURE_TABLE`: Character-by-character parsing rules
+- `TOKEN_VALIDATION_TABLE`: JSON grammar validation
+
+### Example walkthrough
+Consider how the raw JSON text `{"name": "Alice", "age": 30}` is parsed.
+First, the parser reads the JSON one character at a time and uses lookup tables to decide what to do with each character. For the simplified input `{"name": "Alice"}`:
+
+Character: { → "Start scanning an object (grammar_capture)"
+Character: " → "Start scanning a string"
+Character: n → "Continue scanning the string"
+Character: a → "Continue scanning the string"
+Character: m → "Continue scanning the string"
+Character: e → "Continue scanning the string"
+Character: " → "End the string"
+Character: : → "Key-value separator"
+Character: " → "Start scanning a string"
+Character: A → "Continue scanning the string"
+Character: l → "Continue scanning the string"
+Character: i → "Continue scanning the string"
+Character: c → "Continue scanning the string"
+Character: e → "Continue scanning the string"
+Character: " → "End the string"
+Character: } → "End the object"
+
+The parser builds a list of "tokens", the basic building blocks of the JSON, which becomes:
+1. BEGIN_OBJECT_TOKEN ({)
+2. STRING_TOKEN ("name")
+3. KEY_SEPARATOR_TOKEN (:)
+4. STRING_TOKEN ("Alice")
+5. END_OBJECT_TOKEN (})
+
+The parser then converts the tokens into structured entries with parent-child relationships. Each entry knows:
+- what type it is (object, string, number, etc.)
+- who its parent is
+- how many children it has
+- where it sits in the original JSON
+
+Finally, the parser sorts entries by their key hashes for fast lookups. For the full example `{"name": "Alice", "age": 30}`, the "name" and "age" entries may be reordered, e.g.:
+
+Original order: [{"name": "Alice"}, {"age": 30}]
+Sorted order: [{"age": 30}, {"name": "Alice"}]
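+
+The flag-driven, branch-free style described under "Key Design Patterns" is what keeps each of these steps cheap: 0/1 flags read from lookup tables are multiplied into arithmetic expressions instead of being branched on. Below is a minimal, self-contained Noir sketch of the idea (illustrative only, not the library's actual code); it mirrors how the per-depth entry counter `num_entries_at_current_depth` is updated in `json.nr`:
+
+```noir
+// If `preserve` (a 0/1 flag from a lookup table) is 1, keep the running count and add
+// `is_value`; if it is 0, the count resets to `is_value`. No `if` statement is needed,
+// so the circuit contains no branch.
+fn conditional_update(counter: Field, preserve: Field, is_value: Field) -> Field {
+    counter * preserve + is_value
+}
+
+fn main() {
+    // a value token inside an object: preserve = 1, is_value = 1 -> count increments
+    assert(conditional_update(2, 1, 1) == 3);
+    // an object/array start token: preserve = 0, is_value = 0 -> count resets
+    assert(conditional_update(2, 0, 0) == 0);
+}
+```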
+
 # Acknowledgements
 
 Many thanks to the authors of the OG noir json library https://github.com/RontoSOFT/noir-json-parser
diff --git a/src/_table_generation/table_generation.md b/src/_table_generation/table_generation.md
new file mode 100644
index 0000000..a3b7753
--- /dev/null
+++ b/src/_table_generation/table_generation.md
@@ -0,0 +1,71 @@
+# Table Generation Documentation
+
+## Overview
+The JSON parser uses lookup tables to avoid branching logic and reduce gate count. These tables are generated from `src/_table_generation/make_tables.nr`.
+
+## Generation Process
+Tables are generated by simulating all possible input combinations from basic hardcoded tables and recording the expected outputs.
+
+## TOKEN_FLAGS_TABLE
+Maps (token, context) pairs to parsing flags:
+- `create_json_entry`: whether to create a JSON entry for this token; true if the token is a literal, number, non-key string, or the end of an array/object
+- `is_end_of_object_or_array`: whether the token ends an object/array
+- `is_start_of_object_or_array`: whether the token starts an object/array
+- `new_context`: which context to switch to (object is 0, array is 1)
+- `is_key_token`: whether the token is a key
+- `is_value_token`: whether the token is a value; true for STRING_TOKEN, NUMERIC_TOKEN and LITERAL_TOKEN
+- `preserve_num_entries`: boolean flag that controls whether the current token preserves the existing count of entries at the current depth or resets/increments it. Set to 1 for tokens like NO_TOKEN, KEY_TOKEN, STRING_TOKEN, NUMERIC_TOKEN, LITERAL_TOKEN, and to 0 for tokens like OBJECT_START_TOKEN, ARRAY_START_TOKEN, OBJECT_END_TOKEN, ARRAY_END_TOKEN
+
+## JSON_CAPTURE_TABLE
+Maps (escape_flag, scan_mode, ascii) triples to scanning actions:
+- `scan_token`: the next capture mode given the current capture mode; one of grammar_capture (`[`, `{`, `,`, `}`, `]`, `:`), string_capture, literal_capture, numeric_capture or error_capture. For example, if we are in string capture and the character is `"`, scan_token is set to grammar_capture because the string has ended and we return to grammar scanning. If we are in numeric capture and the current character is not 0-9, we likewise return to grammar scanning because the number has ended.
+- `push_transcript`: whether to add a token to the transcript. In grammar mode: true for all structural elements `[`, `{`, `,`, `}`, `]`, `:`. In string_capture: true for `"`, which signals the end of the string. In numeric/literal_capture: true for space, \t, \n, \r, `"` and comma. Note that the first scan does not fully capture numerics or literals because we don't know where they end, so we rely on the capture_missing_tokens function afterwards.
+- `increase_length`: whether to extend the current token. Always false in grammar_capture; true for 0-9 in numeric capture, for all characters except `"` in string_capture, and for the letters of true, false and null in literal_capture
+- `is_potential_escape_sequence`: true if the current character is a backslash (`\`) while in string_capture mode
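+
+Before the lookup, the (escape_flag, scan_mode, ascii) triple is flattened into a single table index. The sketch below mirrors the `encoded_ascii` computation in `build_transcript` (a 1-bit escape flag, 2-bit scan mode and 8-bit ASCII value pack into a 2,048-entry table); the concrete scan-mode numbers used in the asserts are assumed purely for illustration:
+
+```noir
+fn encode_ascii(previous_was_escape: bool, scan_mode: Field, ascii: u8) -> Field {
+    // pack (escape, mode, byte) into one index: escape * 1024 + mode * 256 + byte
+    (previous_was_escape as Field) * 1024 + scan_mode * 256 + ascii as Field
+}
+
+fn main() {
+    // the byte '{' (0x7b) while in scan mode 0
+    assert(encode_ascii(false, 0, 0x7b) == 0x7b);
+    // the same byte in scan mode 1 lands in a different row of JSON_CAPTURE_TABLE
+    assert(encode_ascii(false, 1, 0x7b) == 256 + 0x7b);
+}
+```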
+
+## Other tables
+While TOKEN_FLAGS_TABLE and JSON_CAPTURE_TABLE are the more important tables, they are built from foundational hardcoded tables in make_tables_subtables.nr:
+
+GRAMMAR_CAPTURE_TABLE: State transition table for grammar scan mode. Each entry specifies the next scan mode (GRAMMAR_CAPTURE, STRING_CAPTURE, NUMERIC_CAPTURE, LITERAL_CAPTURE or ERROR_CAPTURE) based on the encountered ASCII character. For example, "f" is mapped to LITERAL_CAPTURE because it indicates we have begun to scan the literal false.
+STRING_CAPTURE_TABLE, NUMERIC_CAPTURE_TABLE, LITERAL_CAPTURE_TABLE: the corresponding state transition tables for string, numeric and literal scan modes.
+
+GRAMMAR_CAPTURE_TOKEN: Maps characters seen in grammar mode to token types, converting ASCII characters into the appropriate JSON token types for structural elements, values and literals:
+- Structural characters ({, }, [, ], ,, :) → their respective structural tokens
+- Quote (") → STRING_TOKEN (start of string)
+- Digits (0-9) → NUMERIC_TOKEN (start of number)
+- Literal starters (f, t, n) → LITERAL_TOKEN (start of true/false/null)
+- Invalid characters → NO_TOKEN or error handling
+STRING_CAPTURE_TOKEN, NUMERIC_CAPTURE_TOKEN, LITERAL_CAPTURE_TOKEN: the corresponding character-to-token mappings for string, numeric and literal scan modes.
+
+STRING_CAPTURE_PUSH_TRANSCRIPT: Determines when to add tokens to the transcript while scanning inside a string. Only true for the closing quote ("), which signals the end of the string and triggers token creation. All other characters within the string (letters, numbers, punctuation, spaces) are false because they extend the current string token rather than creating new tokens.
+
+GRAMMAR_CAPTURE_PUSH_TRANSCRIPT: Determines when to add tokens to the transcript while scanning in grammar mode:
+- Comma (,) → true (value separator)
+- Colon (:) → true (key-value separator)
+- All other characters → false (including digits, quotes, and literal starters)
+
+NUMERIC_CAPTURE_PUSH_TRANSCRIPT: Determines when to add the current numeric token to the transcript while scanning a number:
+- Whitespace (space, tab, newline, carriage return) → true (end number)
+- Quote (") → true (end number, followed by string)
+- Comma (,) → true (end number, followed by next value)
+- All other characters → false (extend current number or error)
+
+LITERAL_CAPTURE_PUSH_TRANSCRIPT: Determines when to add the current literal token (true/false/null) to the transcript while scanning a literal. True for any grammar character: , [ ] { } " space tab newline. (This is only used in the first scan; in the second step, capture_missing_tokens, the literal and the following value separator are split into distinct tokens.)
+
+GRAMMAR_CAPTURE_INCREASE_LENGTH: Determines when to extend the current token length while scanning in grammar mode. True for digits (0-9), which start a numeric scan, and for the letters used by literals (f, t, n, r, u, e, a, l, s), which start a literal scan. Structural tokens don't track a length (it is always 1), and string tokens always begin with a " before any letters are seen.
+
+STRING_CAPTURE_INCREASE_LENGTH: Determines when to extend the current string token while scanning inside a string. True for all printable characters except the quote ("), which ends the string.
+NUMERIC_CAPTURE_INCREASE_LENGTH: True for 0-9.
+LITERAL_CAPTURE_INCREASE_LENGTH: True for t, r, u, e, f, a, l, s, n.
+
+GRAMMAR_CAPTURE_ERROR_FLAG, STRING_CAPTURE_ERROR_FLAG, NUMERIC_CAPTURE_ERROR_FLAG, LITERAL_CAPTURE_ERROR_FLAG: flag tables marking characters that are invalid in each scan mode.
+
+PROCESS_RAW_TRANSCRIPT_TABLE: Used to post-process the raw transcript and add missing grammar tokens that were not captured during the initial scan in build_transcript. Input: the encoded_ascii of the last token in each entry (scan mode + ASCII character). Output: `token`, the token type for this entry; `new_grammar`, whether to add a missing grammar token; and `scan_token`, the type of grammar token to add if needed, such as END_OBJECT_TOKEN (}) or VALUE_SEPARATOR_TOKEN (comma).
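+
+All of these per-character subtables boil down to small constant arrays indexed by an ASCII byte (or by an encoded index, as above). As a rough, hypothetical sketch of the shape of such a table (simplified names and layout, not the actual make_tables_subtables.nr code):
+
+```noir
+// A 256-entry per-character table: true only for the bytes '0' (48) to '9' (57),
+// matching the description of NUMERIC_CAPTURE_INCREASE_LENGTH above.
+fn numeric_capture_increase_length() -> [bool; 256] {
+    let mut table: [bool; 256] = [false; 256];
+    for c in 48..58 {
+        table[c] = true;
+    }
+    table
+}
+
+fn main() {
+    let table = numeric_capture_increase_length();
+    assert(table[0x35]);  // '5' extends the current numeric token
+    assert(!table[0x2c]); // ',' does not; it ends the number instead
+}
+```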
\ No newline at end of file diff --git a/src/json.nr b/src/json.nr index a39b7b7..754165e 100644 --- a/src/json.nr +++ b/src/json.nr @@ -77,8 +77,7 @@ impl(); +unconstrained fn __check_entry_ptr_bounds(entry_ptr: u32, max: u32) { // n.b. even though this assert is in an unconstrained function, an out of bounds error will be triggered when writing into self.key_data[entry_ptr] assert(entry_ptr as u32 < max - 1, "create_json_entries: MaxNumValues limit exceeded!"); } @@ -117,6 +116,7 @@ impl `preserve_num_entries = 0` (start new object/array) + // - When `is_end_of_object_or_array = 1` -> `preserve_num_entries = 0` (end object/array) + // - When `preserve_num_entries = 1` -> both flags = 0 (normal token) + // + // 4 gates + { + let old = current_identity_value; + current_identity_value = (next_identity_value * is_start_of_object_or_array); + std::as_witness(current_identity_value); + current_identity_value = current_identity_value + + (previous_stack_entry.current_identity * is_end_of_object_or_array); + std::as_witness(current_identity_value); + current_identity_value = current_identity_value + old * preserve_num_entries; + std::as_witness(current_identity_value); + // If the current token creates an object or array, subsequent entries will be a child of this object + // i.e. we need to assign them a new identifier so increase `next_identity_value` + next_identity_value = next_identity_value + is_start_of_object_or_array; + std::as_witness(next_identity_value); + } + // Update the number of entries in the parent object/array + // Pseudocode: + // if (!preserve_num_entries && is_value_token) { + // num_entries_at_current_depth += 1; + // } else if (is_end_of_object_or_array) { + // num_entries_at_current_depth = previous_stack_entry.num_entries + 1; + // } // 2 gates + // If we ses a value token (string/number/literal), we add 1 to count. If we see , or :, no change. + // If preserve_num_entries is 0 (i.e. start or end of object or array) then we reset variable to 0. num_entries_at_current_depth = num_entries_at_current_depth * preserve_num_entries + is_value_token; std::as_witness(num_entries_at_current_depth); @@ -364,13 +485,45 @@ impl object, context == 1 => array) + // If current token is END_OBJECT_TOKEN or END_ARRAY_TOKEN, set context to the context value in previous_stack_entry + // (i.e. restore the context to whatever the parent of the object/array is) + // Pseudocode: + // if (is_end_of_object_or_array) { + // context = previous_stack_entry.context + // } else { + // context = new_context + // } // 1 gate // if `is_end_of_object_or_array == 1`, `new_context = 0` so we can do something cheaper than a conditional select: + // If is_end_of_object_or_array is 1, then new_context is 0, so set context = previous_stack_entry.context + // If is_end_of_object_or_array is 0, then set context = new_context context = cast_num_to_u32( previous_stack_entry.context * is_end_of_object_or_array + new_context, ); std::as_witness(context as Field); + + // Update data that describes the key for the current token. + // If we are creating a JSON entry, we also populate `self.key_data` with info that describes the current entry's key + // key_data contains 3 members that are packed into a Field: + // * the key index (where in the original JSON blob does the key start?) + // * the key length (length of the key in bytes) + // * current_identity_value (unique identifier for the key's JSON object. 
starts at 0) + // * in the current parent object/array, how many JSON entries deep is the key's associated JSON object? + // TODO: would be much more readable if we have a custom struct `KeyData` that wrapped a Field elemenet with sensible helper methods + // Pseudocode: + // if (create_json_entry) { + // let mut new_key_data; + // if (is_value_token) { + // new_key_data = make_key(current_key_index_and_length, current_identity_value, num_entries_at_current_depth - 1); + // } else if (is_end_of_object_or_array) { + // new_key_data = make_key(previous_stack_entry.current_key_index_and_length, current_identity_value, num_entries_at_current_depth - 1); + // } + // self.key_data[entry_ptr] = new_key_data; + // } // 3 gates + // If context is 0 (object context), then don't take the num_entries_at_current_depth term into account + // because searching for a key only depends of the key name, not position, as opposed to array context where we need to look up by position/index. let common_term = current_identity_value + context as Field * (num_entries_at_current_depth - 1) * 0x1000000000000; std::as_witness(common_term); @@ -382,29 +535,14 @@ impl(self.json) }; + // steps to verify the transcript is correct // 14 gates per iteration, plus fixed cost for initing 2,048 size lookup table (4,096 gates) let mut previous_was_potential_escape_sequence: bool = false; for i in 0..NumBytes { let ascii = self.json[i]; // 1 gate - let encoded_ascii = previous_was_potential_escape_sequence as Field * 1024 + let encoded_ascii: Field = previous_was_potential_escape_sequence as Field * 1024 + scan_mode * 256 + ascii as Field; std::as_witness(encoded_ascii); @@ -459,10 +598,12 @@ impl(); + scan_mode.assert_max_bit_size::<2>(); JSON { json: self.json, @@ -507,7 +648,8 @@ impl Self { let bytes: [u8; 7] = f.to_be_bytes(); let create_json_entry = bytes[0] != 0; @@ -31,17 +51,18 @@ impl TokenFlags { } } + /// Convert a Field element that contains a packed TokenFlags object into a real TokenFlags object pub(crate) fn from_field(f: Field) -> Self { // 10 gates // Safety: check the comments below let r = unsafe { TokenFlags::__from_field(f) }; - // asserts the relation of r and f assert(r.to_field() == f); r } - // 4 gates + /// Pack a TokenFlags object into a Field element + /// 4 gates pub(crate) fn to_field(self) -> Field { (self.preserve_num_entries as Field) + (self.is_value_token as Field) * 0x100