1-
21# Code objects
32
43A ` CodeObject ` is a builtin Python type that represents a compiled executable,
@@ -43,7 +42,7 @@ so a compact format is very important.
4342Note that traceback objects don't store all this information -- they store the start line
4443number, for backward compatibility, and the "last instruction" value.
4544The rest can be computed from the last instruction (` tb_lasti ` ) with the help of the
46- locations table. For Python code, there is a convenience method
45+ locations table. For Python code, there is a convenience method
4746[`codeobject.co_positions`](https://docs.python.org/dev/reference/datamodel.html#codeobject.co_positions)
4847which returns an iterator of ` ({line}, {endline}, {column}, {endcolumn}) ` tuples,
4948one per instruction.
@@ -75,9 +74,11 @@ returned by the `co_positions()` iterator.
7574> See [`Objects/lnotab_notes.txt`](../Objects/lnotab_notes.txt) for more details.
7675
7776` co_linetable ` consists of a sequence of location entries.
78- Each entry starts with a byte with the most significant bit set, followed by zero or more bytes with the most significant bit unset.
77+ Each entry starts with a byte with the most significant bit set, followed by
78+ zero or more bytes with the most significant bit unset.
7979
8080Each entry contains the following information:
81+
8182* The number of code units covered by this entry (length)
8283* The start line
8384* The end line
@@ -86,54 +87,88 @@ Each entry contains the following information:
8687
8788The first byte has the following format:
8889
89- Bit 7 | Bits 3-6 | Bits 0-2
90- ---- | ---- | ----
91- 1 | Code | Length (in code units) - 1
90+ | Bit 7 | Bits 3-6 | Bits 0-2 |
91+ | ------- | ---------- | ---------------------------- |
92+ | 1 | Code | Length (in code units) - 1 |
9293
9394The codes are enumerated in the ` _PyCodeLocationInfoKind ` enum.
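
The first-byte layout in the table above can be unpacked with two shifts and masks. This is a minimal sketch, not the interpreter's implementation; the helper name `split_first_byte` is hypothetical.

```python
def split_first_byte(first_byte):
    """Split the first byte of a location entry into (code, length)."""
    if not first_byte & 0x80:
        raise ValueError("the first byte of an entry must have bit 7 set")
    code = (first_byte >> 3) & 0x0F   # bits 3-6
    length = (first_byte & 0x07) + 1  # bits 0-2 store the length minus one
    return code, length


print(split_first_byte(0xF0))  # (14, 1)
print(split_first_byte(0xFF))  # (15, 8)
```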
9495
95- ## Variable-length integer encodings
96+ ### Variable-length integer encodings
9697
97- Integers are often encoded using a variable- length integer encoding
98+ Integers are often encoded using a variable-length integer encoding.
9899
99- ### Unsigned integers (` varint ` )
100+ #### Unsigned integers (` varint ` )
100101
101102Unsigned integers are encoded in 6-bit chunks, least significant first.
102103Each chunk but the last has bit 6 set.
103104For example:
104105
105106* 63 is encoded as ` 0x3f `
106- * 200 is encoded as ` 0x48 ` , ` 0x03 `
107+ * 200 is encoded as `0x48`, `0x03` since `200 = (0x03 << 6) | 0x08` and the first chunk has bit 6 set (`0x48 = 0x08 | 0x40`).
108+
109+ The following helper can be used to convert an integer into a ` varint ` :
110+
111+ ``` py
112+ def encode_varint(s):
113+     ret = []
114+     while s >= 64:
115+         ret.append((s & 0x3F) | 0x40)  # low 6 bits, with bit 6 set: more chunks follow
116+         s >>= 6
117+     ret.append(s & 0x3F)  # last chunk, bit 6 clear
118+     return bytes(ret)
119+ ```
120+
121+ To convert a ` varint ` into an unsigned integer:
122+
123+ ``` py
124+ def decode_varint(chunks):
125+     ret = 0
126+     for chunk in reversed(chunks):
127+         ret = (ret << 6) | (chunk & 0x3F)  # mask off the continuation bit
128+     return ret
129+ ```
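
When a `varint` is embedded in a longer byte sequence, its end is marked by the first chunk with bit 6 clear, so a reader does not need to know its length in advance. The sketch below follows that rule; the name `read_varint` and its `(value, next_offset)` return convention are hypothetical.

```python
def read_varint(data, offset=0):
    """Read one varint from data starting at offset.

    Returns (value, next_offset).
    """
    value = 0
    shift = 0
    while True:
        chunk = data[offset]
        offset += 1
        value |= (chunk & 0x3F) << shift  # 6 payload bits, least significant first
        shift += 6
        if not chunk & 0x40:              # bit 6 clear: this was the last chunk
            return value, offset


print(read_varint(bytes([0x48, 0x03])))  # (200, 2)
```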
107130
108- ### Signed integers (` svarint ` )
131+ #### Signed integers (` svarint ` )
109132
110133Signed integers are encoded by converting them to unsigned integers, using the following function:
111- ``` Python
112- def convert (s ):
134+
135+ ``` py
136+ def svarint_to_varint(s):
113137     if s < 0:
114-         return ((-s)<<1) | 1
138+         return ((-s) << 1) | 1
115139     else:
116-         return (s<<1)
140+         return s << 1
141+ ```
142+
143+ To convert a ` varint ` into a signed integer:
144+
145+ ``` py
146+ def varint_to_svarint(uval):
147+     return -(uval >> 1) if uval & 1 else (uval >> 1)
117148```
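
Putting the two encodings together, a signed value first has its sign folded into bit 0 and is then split into 6-bit chunks. The following is a self-contained sketch of that round trip; `encode_svarint` and `decode_svarint` are hypothetical names, not functions from the CPython sources.

```python
def encode_svarint(s):
    # Fold the sign into bit 0, then emit 6-bit chunks, least significant first.
    uval = ((-s) << 1) | 1 if s < 0 else s << 1
    out = []
    while uval >= 64:
        out.append((uval & 0x3F) | 0x40)  # bit 6 set: more chunks follow
        uval >>= 6
    out.append(uval)                      # last chunk, bit 6 clear
    return bytes(out)


def decode_svarint(data):
    uval = 0
    for chunk in reversed(data):
        uval = (uval << 6) | (chunk & 0x3F)
    return -(uval >> 1) if uval & 1 else uval >> 1


print(encode_svarint(-40).hex())  # 5101
print(decode_svarint(bytes([0x51, 0x01])))  # -40
```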
118149
119- * Location entries*
150+ ### Location entries
120151
121152The meanings of the codes and of the bytes that follow them are as follows:
122153
123- Code | Meaning | Start line | End line | Start column | End column
124- ---- | ---- | ---- | ---- | ---- | ----
125- 0-9 | Short form | Δ 0 | Δ 0 | See below | See below
126- 10-12 | One line form | Δ (code - 10) | Δ 0 | unsigned byte | unsigned byte
127- 13 | No column info | Δ svarint | Δ 0 | None | None
128- 14 | Long form | Δ svarint | Δ varint | varint | varint
129- 15 | No location | None | None | None | None
154+ | Code | Meaning | Start line | End line | Start column | End column |
155+ | ------- | ---------------- | --------------- | ---------- | --------------- | --------------- |
156+ | 0-9 | Short form | Δ 0 | Δ 0 | See below | See below |
157+ | 10-12 | One line form | Δ (code - 10) | Δ 0 | unsigned byte | unsigned byte |
158+ | 13 | No column info | Δ svarint | Δ 0 | None | None |
159+ | 14 | Long form | Δ svarint | Δ varint | varint | varint |
160+ | 15 | No location | None | None | None | None |
130161
131162The Δ means the value is encoded as a delta from another value:
163+
132164* Start line: Delta from the previous start line, or ` co_firstlineno ` for the first entry.
133- * End line: Delta from the start line
165+ * End line: Delta from the start line.
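
The delta rules above can be sketched as a small loop. This is an illustration only: `resolve_lines` is a hypothetical helper, it takes already-decoded deltas rather than raw bytes, and it ignores entries that carry no location.

```python
def resolve_lines(first_lineno, deltas):
    """Turn (start_line_delta, end_line_delta) pairs into absolute lines.

    first_lineno plays the role of co_firstlineno.
    """
    resolved = []
    previous_start = first_lineno
    for start_delta, end_delta in deltas:
        start = previous_start + start_delta  # relative to the previous start line
        end = start + end_delta               # relative to this entry's start line
        resolved.append((start, end))
        previous_start = start
    return resolved


print(resolve_lines(10, [(0, 0), (1, 0), (2, 3), (-1, 0)]))
# [(10, 10), (11, 11), (13, 16), (12, 12)]
```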
166+
167+ ### The short forms
134168
135- * The short forms*
169+ Codes 0-9 are the short forms. The short form consists of two bytes,
170+ the second byte holding additional column information. The code is the
171+ start column divided by 8 (and rounded down).
136172
137- Codes 0-9 are the short forms. The short form consists of two bytes, the second byte holding additional column information. The code is the start column divided by 8 (and rounded down).
138173* Start column: ` (code*8) + ((second_byte>>4)&7) `
139174* End column: ` start_column + (second_byte&15) `
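
The two column formulas above translate directly into code. The sketch below decodes a two-byte short-form entry; the helper name `decode_short_form` is hypothetical.

```python
def decode_short_form(first_byte, second_byte):
    """Return (start_column, end_column) for a short-form entry."""
    code = (first_byte >> 3) & 0x0F
    if code > 9:
        raise ValueError("not a short form entry (code must be 0-9)")
    start_column = (code * 8) + ((second_byte >> 4) & 7)
    end_column = start_column + (second_byte & 15)
    return start_column, end_column


# First byte 0x90: bit 7 set, code 2, length 1. Second byte 0x35.
print(decode_short_form(0x90, 0x35))  # (19, 24)
```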