README.rst: 28 additions & 38 deletions
@@ -16,45 +16,38 @@ Introduction
 *decompyle3* translates Python bytecode back into equivalent Python
 source code. It accepts bytecodes from Python version 3.7 on.

-For decompilation of older Python bytecode see uncompyle6_.
+For decompilation of older Python bytecode, see uncompyle6_.

 Why this?
 ---------

 Uncompyle6 is awesome, but it has a fundamental problem in the way
-it handles control flow. In the early days of Python when there was
-little optimization and code was generated in a very template-oriented
-way, figuring out control flow-structures could be done by simply looking at code patterns.
-
-Over the years more code optimization, specifically around handling
-jumps has made it harder to support detecting control flow strictly
-from code patterns. This was noticed as far back as Python 2.4 (2004)
-but since this is a difficult problem, so far it hasn't been tackled
+it handles control flow. In the early days of Python, when there was
+little optimization and code was generated in a very template-oriented way, figuring out control flow structures could be done by simply looking at code patterns.
+
+Over the years, more code optimization, specifically around handling jumps, has made it harder to support detecting control flow strictly
+from code patterns. This was noticed as far back as Python 2.4 (2004), but since this is a difficult problem, so far it hasn't been tackled
 in a satisfactory way.

-The initial attempt to fix to this problem was to add markers in the
-instruction stream, initially this was a `COME_FROM` instruction, and
-then use that in pattern detection.
+The initial attempt to fix this problem was to add markers in the
+instruction stream, initially this was a `COME_FROM` instruction, and then use that in pattern detection.

 Over the years, I've extended that to be more specific, so
 `COME_FROM_LOOP` and `COME_FROM_WITH` were added. And I added checks
-at grammar-reduce time to make try to make sure jumps match with
-supposed `COME_FROM` targets.
+at grammar-reduce time to try to make sure jumps match with the supposed `COME_FROM` targets.

-However all of this is complicated, not robust, has greatly slowed
-down deparsing and is not really tenable.
+However, all of this is complicated, not robust, has greatly slowed down deparsing and is not really tenable.

-So in this project we started rewriting and refactoring the grammar.
+In this project, we began rewriting and refactoring the grammar.

-However it is clear that even this isn't enough. Control flow needs
-to be addressed by using dominators and reverse-dominators which
-the python-control-flow_ project can give.
+However, even this isn't enough. Control flow needs
+to be addressed by using dominators and reverse-dominators, which the python-control-flow_ project can give.

 This I am *finally* slowly doing in yet another non-public project. It
-is a lot of work. Funding in the form of sponsorhip while greatly
-appreciated isn't commensurate with the amount of effort, and
-currently I have a full-time job. So it may take time before it is
-available publicly, if at all.
+is a lot of work. Funding in the form of sponsorship, while greatly
+appreciated, isn't commensurate with the amount of effort, and
+currently, I have a full-time job. So it may take time before it is
+available publicly.

 Requirements
 ------------
@@ -89,7 +82,7 @@ Running Tests

 make check

-A GNU makefile has been added to smooth over setting running the right
+A GNU makefile has been added to smooth over setting up and running the right
 command, and running tests from fastest to slowest.

 If you have remake_ installed, you can see the list of all tasks
@@ -115,20 +108,18 @@ Verification
 ------------

 If you want Python syntax verification of the correctness of the
-decompilation process, add the :code:`--syntax-verify` option. However since
-Python syntax changes, you should use this option if the bytecode is
+decompilation process, add the :code:`--syntax-verify` option. However, since Python syntax changes, you should use this option if the bytecode is
 the right bytecode for the Python interpreter that will be checking
 the syntax.

-You can also crosscompare the results with another python decompiler
+You can also cross-compare the results with another Python decompiler
 like unpyc37_ . Since they work differently, bugs here often aren't in
 that, and vice versa.

 There is an interesting class of these programs that is readily
-available give stronger verification: those programs that when run
-test themselves. Our test suite includes these.
+available to give stronger verification: those programs that, when run, test themselves. Our test suite includes these.

-And Python comes with another a set of programs like this: its test
+And Python comes with another set of programs like this: its test
 suite for the standard library. We have some code in :code:`test/stdlib` to
 facilitate this kind of checking too.

@@ -146,20 +137,19 @@ This program can't decompile Microsoft Windows EXE files created by
 Py2EXE_, although we can probably decompile the code after you extract
 the bytecode properly. `Pydeinstaller <https://github.com/charles-dyfis-net/pydeinstaller>`_ may help with unpacking Pyinstaller bundlers.

-Handling pathologically long lists of expressions or statements is
-slow. We don't handle Cython_ or MicroPython which don't use bytecode.
+Handling pathologically long lists of expressions or statements is slow. We don't handle Cython_ or MicroPython, which don't use bytecode.

 There are numerous bugs in decompilation. And that's true for every
-other CPython decompiler I have encountered, even the ones that
+other CPython decompilers I have encountered, even the ones that
 claimed to be "perfect" on some particular version like 2.4.

-As Python progresses decompilation also gets harder because the
+As Python progresses, decompilation also gets harder because the
 compilation is more sophisticated and the language itself is more
 sophisticated. I suspect that attempts there will be fewer ad-hoc
 attempts like unpyc37_ (which is based on a 3.3 decompiler) simply
 because it is harder to do so. The good news, at least from my
 standpoint, is that I think I understand what's needed to address the
-problems in a more robust way. But right now until such time as
+problems in a more robust way. But right now, until such time as
 project is better funded, I do not intend to make any serious effort
 to support Python versions 3.8 or 3.9, including bugs that might come
 in. I imagine at some point I may be interested in it.
@@ -178,10 +168,10 @@ issues above the queue of other things I might be doing instead.
 See Also
 --------

-* https://github.com/andrew-tavera/unpyc37/ : indirect fork of https://code.google.com/archive/p/unpyc3/ The above projects use a different decompiling technique than what is used here. Instructions are walked. Some instructions use the stack to generate strings, while others don't. Because control flow isn't dealt with directly, it too suffers the same problems as the various `uncompyle` and `decompyle` programs.
+* https://github.com/andrew-tavera/unpyc37/ : indirect fork of https://code.google.com/archive/p/unpyc3/. The above projects use a different decompiling technique than what is used here. Instructions are walked. Some instructions use the stack to generate strings, while others don't. Because control flow isn't dealt with directly, it too suffers the same problems as the various `uncompyle` and `decompyle` programs.
 * https://github.com/rocky/python-xdis : Cross Python version disassembler
 * https://github.com/rocky/python-xasm : Cross Python version assembler
-* https://github.com/rocky/python-decompile3/wiki : Wiki Documents which describe the code and aspects of it in more detail
+* https://github.com/rocky/python-decompile3/wiki : Wiki Documents that describe the code and aspects of it in more detail
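
The Verification text in this diff mentions syntax verification of decompiled output. As a rough illustration of the idea only, here is a minimal Python sketch. It is not decompyle3's implementation of `--syntax-verify`; the helper name `syntax_ok` and the `<decompiled>` filename are hypothetical. It simply hands a source string to the running interpreter's built-in `compile()` and reports whether it was accepted.

```python
def syntax_ok(source: str, filename: str = "<decompiled>") -> bool:
    """Return True if `source` is accepted by the running interpreter.

    Hypothetical helper for illustration; it checks syntax only and
    says nothing about whether the code behaves like the original.
    """
    try:
        # "exec" mode compiles a whole module's worth of statements.
        compile(source, filename, "exec")
    except (SyntaxError, ValueError):
        # SyntaxError: invalid syntax for this interpreter version.
        # ValueError: raised by some versions for source with null bytes.
        return False
    return True


if __name__ == "__main__":
    print(syntax_ok("x = 1\n"))    # well-formed assignment
    print(syntax_ok("x = = 1\n"))  # malformed assignment
```

Because a check like this uses the grammar of whichever interpreter runs it, source recovered from bytecode of a different Python version can be rejected even when the decompilation itself is right, which is the caveat the README text raises.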