
Count of bytes spent per data type #4

@jridgewell

jridgewell@7b22d70

I made a hacky counter to see where we're spending bytes. A key of 0 means the delta was 0 (though we're really spending 1 byte on it), and a key of 1/2/…/7 means we're spending that many bytes encoding the value.
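For readers who want to reproduce this kind of histogram, here is a minimal sketch of how such a counter could work. It is not the code from the linked commit: `vlqLength` and `record` are hypothetical names, and it assumes every field is a signed base64 VLQ as used in source map mappings (sign in the lowest bit, 5 payload bits per character).

```javascript
// Hypothetical helper: how many base64 characters a signed VLQ needs.
// Assumes the source map convention: sign bit lowest, 5 payload bits per character.
function vlqLength(value) {
  // Fold the sign into the lowest bit, magnitude in the bits above it.
  let n = value < 0 ? -value * 2 + 1 : value * 2;
  let length = 1;
  while (n >= 32) {
    n = Math.floor(n / 32);
    length++;
  }
  return length;
}

// Hypothetical recorder: bucket 0 is reserved for "delta was 0",
// every other bucket is the number of bytes the value took.
function record(histogram, field, delta) {
  const bucket = delta === 0 ? 0 : vlqLength(delta);
  const counts = histogram[field] || (histogram[field] = {});
  counts[bucket] = (counts[bucket] || 0) + 1;
}

const histogram = {};
record(histogram, "startCol", 0);  // lands in bucket 0
record(histogram, "startCol", 40); // 40 needs 2 characters, lands in bucket 2
console.log(histogram);
```

Keeping a separate bucket for zero deltas makes it easy to see which fields are candidates for being omitted entirely.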

Original Scopes

{
  startLine: { "0": 5219, "1": 39327, "2": 79, "3": 2 },
  startCol: { "0": 4278, "1": 22792, "2": 17557 },
  endLine: { "0": 9396, "1": 35123, "2": 107, "3": 1 },
  endCol: { "0": 4347, "1": 33013, "2": 7266, "3": 1 },
  flags: { "1": 44627 },
  names: { "1": 5, "2": 771, "3": 10530 },
  kind: { "0": 30920, "2": 370, "3": 13337 },
  variableLength: { "0": 23138, "1": 21437, "2": 50, "3": 2 },
  variable: { "1": 1320, "2": 971, "3": 62563 }
}

We're doing pretty well. We might be spending too many bytes encoding startCol and endCol (24k 2-byte VLQs). We're spending a lot encoding names (10k 3-byte VLQs, maybe make this a relative delta?). Lots of bytes are spent on kind (13k 3-byte VLQs), but this just means that we didn't insert the kind strings near each other in the names array. We could save about 23kb by skipping the variable length when there are no variables (the 23,138 zero entries). But by far the worst is variable encoding, spending 62k 3-byte VLQs.
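To illustrate why a relative delta could help for names, and why kind gets expensive when its strings are scattered across the names array, the sketch below compares the two encodings on a made-up run of indices. The index values and function names are assumptions for illustration only, and signed base64 VLQs are assumed for both variants.

```javascript
// Hypothetical helper: characters needed for a signed base64 VLQ
// (sign in the lowest bit, 5 payload bits per character).
function vlqLength(value) {
  let n = value < 0 ? -value * 2 + 1 : value * 2;
  let length = 1;
  while (n >= 32) {
    n = Math.floor(n / 32);
    length++;
  }
  return length;
}

// Cost when every index is written as an absolute value.
function absoluteCost(indices) {
  return indices.reduce((sum, index) => sum + vlqLength(index), 0);
}

// Cost when every index is written relative to the previous one.
function relativeCost(indices) {
  let previous = 0;
  let total = 0;
  for (const index of indices) {
    total += vlqLength(index - previous);
    previous = index;
  }
  return total;
}

// Made-up indices into the names array that happen to sit close together.
const nameIndices = [1000, 1001, 1003, 1002];
console.log(absoluteCost(nameIndices)); // 12
console.log(relativeCost(nameIndices)); // 6
```

The relative form only wins when consecutive indices are near each other; if related strings are far apart in the names array, the deltas are as large as the absolute values.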


Generated Ranges

{
  startLine: { "0": 44625, "1": 2 },
  startCol: { "0": 11686, "1": 12103, "2": 20186, "3": 67, "4": 40, "5": 545 },
  endLine: { "0": 44627 },
  endCol: {
    "0": 2989,
    "1": 20129,
    "2": 21421,
    "3": 73,
    "4": 5,
    "5": 9,
    "7": 1
  },
  flags: { "1": 44627 },
  defSourceIdx: { "0": 44627 },
  defScopeIdx: { "0": 1, "1": 8, "2": 248, "3": 7938, "4": 36432 },
  callSourceIdx: {},
  callLine: {},
  callCol: {},
  bindingsLength: { "0": 22341, "1": 22234, "2": 50, "3": 2 },
  binding: { "0": 5, "1": 12459, "2": 5, "3": 53760, "4": 11082 },
  bindingLength: {},
  bindingLine: {},
  bindingCol: {}
}

We might be spending too much on start/end columns (40k 2-byte VLQs). We can save 88kb by omitting the start/end lines. Lots of bytes are spent encoding the definition's scope index (8k 3-byte and 36k 4-byte VLQs). We can save 44kb by omitting the definition's source index. We could save 22kb by omitting the bindings length when it is 0, or 44kb if we just tied the length to the original scope's variable count and didn't output a bindings length at all. But the biggest cost is variable names: 53k 3-byte and 11k 4-byte VLQs.
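The savings estimates above can be checked directly against the histogram. This is a small sketch using only the counts posted in this issue; `bytesSpent` is a hypothetical helper, and it treats bucket 0 as costing 1 byte, as described at the top.

```javascript
// Counts copied from the Generated Ranges output in this issue.
const generatedRanges = {
  startLine: { 0: 44625, 1: 2 },
  endLine: { 0: 44627 },
  defSourceIdx: { 0: 44627 },
  bindingsLength: { 0: 22341, 1: 22234, 2: 50, 3: 2 },
};

// Hypothetical helper: total bytes for one field.
// Bucket 0 (delta was 0) still costs 1 byte on the wire.
function bytesSpent(counts) {
  let total = 0;
  for (const [bucket, count] of Object.entries(counts)) {
    total += Math.max(1, Number(bucket)) * count;
  }
  return total;
}

const lineBytes =
  bytesSpent(generatedRanges.startLine) + bytesSpent(generatedRanges.endLine);
const sourceIdxBytes = bytesSpent(generatedRanges.defSourceIdx);
const emptyBindingsBytes = generatedRanges.bindingsLength[0];
const allBindingsLengthBytes = bytesSpent(generatedRanges.bindingsLength);

console.log({ lineBytes, sourceIdxBytes, emptyBindingsBytes, allBindingsLengthBytes });
```

These are upper bounds on the savings: any flag or marker needed to signal that a field was omitted would eat into them.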
