Conversation
|
Thanks for working on this! I will try to look at this either tonight or tomorrow after work. |
8b42c5a to
ebfdc5c
Compare
|
@llllllllll ping -- any thing I can do? |
llllllllll
left a comment
There was a problem hiding this comment.
Sorry it took so long, I left some comments. This is pretty awesome, thanks!
| # expressions properly. They are always functions of one argument, | ||
| # so just do the right thing. | ||
| # so just do the right thing. Py3.4 also would fail without this | ||
| # hack, for list comprehensions too. (Haven't checked for other 3.x.) |
byterun/pyvm2.py
Outdated
| f = self.frame | ||
| opoffset = f.f_lasti | ||
| byteCode = byteint(f.f_code.co_code[opoffset]) | ||
| if f.py36_opcodes: |
There was a problem hiding this comment.
why do we need special processing here? the non-3.6 branch looks like it should handle this fine.
There was a problem hiding this comment.
Only until 3.4. I will change the special processing to check for 3.4 onward if that is required.
There was a problem hiding this comment.
So, using the opcodes = dis.get_instructions API is kind of hard in versions before Python 3.6 because on jumps, the VM sets the bytecode counter to the offset to be jumped to. In versions before Python 3.6, there is no direct correspondence to between the opcodes index and the offset to be jumped to -- and has to be computed for each jump. For Python 3.6, and above, there is a direct correspondence because each bytecode is now fixed width, and one can compute the jump index directly by dividing target by 2. Given the significant simplification Python 3.6 has provided, along with the API for bytecode iteration, perhaps we should special case from Python 3.6 onward?
There was a problem hiding this comment.
@vrthra reports:
opcodes = dis.get_instructions API is kind of hard in versions before Python 3.6 because o[f] jumps
While there are more serious problems mentioned below, by using xdis in my fork, the newer and better APIs in later releases can be applied to bytecode on older versions of Python.
the VM sets the bytecode counter to the offset to be jumped to.
In fact, this is wrong at least as far as emulating CPython semantics because this value is stored in the frame's f_lasti.
While this may be tolerable for reading log traces, if you were to try to hook this up with a debugger it would be intolerable: you can't report a frame's f_lasti as the instruction that would be run if the current instruction had finished and not jumped, returned or raised an exception. Instead f_lasti needs to show the current instruction as it does in CPython.
perhaps we should special case from Python 3.6 onward?
Although the specific problems you mention are addressed above, something like this in x-python goes on with opcodes. Here the advantages are
- pedagocial
- cleaner separatation of code, which leads to
- scalability
byterun/pyvm2.py
Outdated
| byteName = dis.opname[byteCode] | ||
| arg = None | ||
| arguments = [] | ||
| if f.py36_opcodes and byteCode == dis.EXTENDED_ARG: |
There was a problem hiding this comment.
I don't think extended_arg is new to 3.6, can we remove the first part of this check?
There was a problem hiding this comment.
As I mentioned previously, this approach only works for Python 3.6 and above, and only if we are using the get_instructions api.
| intArg = byteint(arg[0]) + (byteint(arg[1]) << 8) | ||
| if f.py36_opcodes: | ||
| intArg = currentOp.arg | ||
| else: |
There was a problem hiding this comment.
also here, I am not sure why we are handling 3.6 in such a special way, the code below looks correct in either case.
There was a problem hiding this comment.
(I misunderstood you earlier.) From 3.6 on, there is a subtle difference. All ops have arguments. Which means that we do not need to increment the f_lasti. Given that we have to special case py36 anyway, we might as well rely on the get_instructions API for versions from now on to get the actual argument rather than doing the bit manipulation ourselves.
| elts = self.popn(count) | ||
| self.push(tuple(e for l in elts for e in l)) | ||
|
|
||
| def byte_BUILD_TUPLE_UNPACK(self, count): |
There was a problem hiding this comment.
in the C source, BUILD_TUPLE_UNPACK, BUILD_TUPLE_UNPACK_WITH_CALL, and BUILD_LIST_UNPACK have the same target, should we reflect that by having them share the same target here?
There was a problem hiding this comment.
I will update to reflect that that.
byterun/pyvm2.py
Outdated
|
|
||
| def byte_BUILD_MAP(self, count): | ||
| # Pushes a new dictionary on to stack. | ||
| if not(six.PY3 and sys.version_info.minor >= 5): |
There was a problem hiding this comment.
this might be more clear as sys.version_info[:2] < (3, 5). at first I thought this was flipped
byterun/pyvm2.py
Outdated
| # dictionary holds count entries: {..., TOS3: TOS2, TOS1:TOS} | ||
| # updated in version 3.5 | ||
| kvs = {} | ||
| for i in range(0, count): |
There was a problem hiding this comment.
range(n) is shorthand for range(0, n) and is a more common form
| globs = self.frame.f_globals | ||
| fn = Function(name, code, globs, defaults, None, self) | ||
| if PY3 and sys.version_info.minor >= 6: | ||
| closure = self.pop() if (argc & 0x8) else None |
There was a problem hiding this comment.
is there a reason to use hex-literals for values under 10?
There was a problem hiding this comment.
@llllllllll I took the comparisons from the Python bytecode documentation directly, which uses these hex literals.
| self.frame.f_lineno = lineno | ||
|
|
||
| if PY3: | ||
| def build_class(func, name, *bases, **kwds): |
There was a problem hiding this comment.
did we need to rewrite this function? I think the scope of this project is confined to the interpreter loop and function calling. I may have missed some new reason why this needs to be in Python now though.
There was a problem hiding this comment.
See the comment further down in the function, and my remarks on the pull request #20 -- it might be better to discuss this over there instead. I'm not sure if any of your other comments are about my old submission instead of @vrthra's, since I've forgotten what was in it by now.
I vaguely remember trying alternatives, but couldn't find any better way to get Py3.4 supported.
byterun/pyobj.py
Outdated
| def __init__(self, f_code, f_globals, f_locals, f_back): | ||
| def __init__(self, f_code, f_globals, f_locals, f_closure, f_back): | ||
| self.f_code = f_code | ||
| self.py36_opcodes = list(dis.get_instructions(self.f_code)) \ |
There was a problem hiding this comment.
Is there a reason to prefer this over just reading the f_code? If anything, 3.6 makes this very simple with the fixed width instructions.
There was a problem hiding this comment.
Given that Python is now exposing the API to iterate, and given that the bytecode format is not considered by Python project to be part of their public API, I think we should use their public API rather than read and interpret f_code.
Unfortunately the API is only available from 3.4 so we have to special case this for any code from 3.4 onwards.
Adding python 3.6 support.
Includes the python 3.4 patch from @darius so that the tests pass. If that gets committed, I will remove the last patch from this PR.