Skip to content

Conversation

@mlechu
Copy link
Member

@mlechu mlechu commented Oct 16, 2025

Currently, we have a pile of assertions in expr_to_syntaxtree checking the
validity of the given Expr. This means JuliaLowering can't handle old macros
calling old or new macros with unrecognized Expr forms---in the "old calling
old" case, this is breakage, and in the "old calling new" case it's still a
downgrade to the macro system. expr_to_syntaxtree handles macro input and
quoted expressions, so it can't assume anything about the AST structure.

This PR attempts to define expr_to_syntaxtree on all inputs in a sane way.

  1. If e is one of the known forms with a different representation in
    SyntaxTree, convert it.
  2. Otherwise, if e.head corresponds to a known JuliaSyntax.Kind, insert that
    and recurse on its children.
  3. Otherwise, insert an K"expr_syntax" node with e as its .value. We'll
    still get an error as we did before if we try and evaluate this, but a future
    expansion is expected to clean it up.

The assertions have been removed, but their contents are generally kept for
deciding between cases (1) and (2) above.

@mlechu mlechu force-pushed the ec/total-expr-to-st branch from 337f356 to 2c8a478 Compare October 20, 2025 18:07
Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing the problem of unknown heads seems good and necessary, thanks!

I do find it surprising that this gives us mixtures of SyntaxTree and Expr in the intermediate forms and I'm a little concerned this mixed representation will lead to bugs.

How about we slightly change the representation here such that unknown heads still turn into K"expr_syntax" (or maybe K"unknown_head" or some better name?) but we map the children even when the head is unrecognized. The original head can go into an attribute. Actually this is almost what K"extension" currently is - so there's already a half-formed plan to do this kind of thing.

In addition I think we should put the error message for K"expr_syntax" in desugaring so macro expansion of unknown heads isn't an error.

@mlechu
Copy link
Member Author

mlechu commented Oct 28, 2025

I do find it surprising that this gives us mixtures of SyntaxTree and Expr in the intermediate forms and I'm a little concerned this mixed representation will lead to bugs.

Any potential bugs in mind? I'm actually more worried about case 2 (mapping children) than case 3 (give up and don't convert), since case 2 can have incorrect round-tripping behaviour if an invalid Expr is a valid SyntaxTree with different semantics. Unfortunately the only correct solutions are:

  • (A) Only convert when we know the semantics, otherwise produce something that will round-trip. This essentially requires validation.jl for Expr (or making compat.jl that complete), and only converting to SyntaxTree if we have a known form; otherwise using K"unknown_expr".
  • (B) Only converting to SyntaxTree once macro expansion has finished, so old macros may not expand to new macrocalls.

I decided to inch towards (A), where case 2 is incorrect but not problematic in practice. I left unknown "must round-trip" stuff as Expr since it's probably going to be used by an Expr-macro immediately after, and there isn't any provenance to lose. It would eat some CPU cycles, but wouldn't be incorrect to recursively convert the Expr one-to-one under K"unknown_expr", but I don't know if that's any more useful than the Expr.

@c42f
Copy link
Member

c42f commented Oct 29, 2025

Any potential bugs in mind

Not specifically right now. But having a mixed representation seems more confusing to reason about and feels like fertile ground for such bugs.

case 2 can have incorrect round-tripping behaviour if an invalid Expr is a valid SyntaxTree with different semantics

hmm ok. Do you have an example of this? Maybe we already have this problem in the existing eager conversion code?

(B) ... old macros may not expand to new macrocalls

Hmm. I don't love this tradeoff. Though old macros can't even expand to old macrocalls in the existing system and have things generally work so I guess there's precedent in some sense 😅

@mlechu
Copy link
Member Author

mlechu commented Oct 29, 2025

Do you have an example of this?

Anywhere SyntaxTree uses a representation that is an invalid Expr (Expr(:var, :x), Expr(:try, Expr(:block, :foo), Expr(:catch, :e, Expr(:block)))), the expression will not round-trip. As I mentioned, we could recursively convert one-to-one if you prefer that (not using insert_convert_expr, a much simpler procedure). This would make forms like (unknown_expr (var x)), but I'm not sure there's anything useful to do with the wrapped form other than converting it back to Expr, and it's just another edge case for "tree-scanning" static analysis tools to skip over

@c42f
Copy link
Member

c42f commented Oct 31, 2025

Anywhere SyntaxTree uses a representation that is an invalid Expr(:var, :x) ...

Is the following what you're concerned about:

macro blah()
    v = Expr(:var, :x)
    quote
        @process_var $v
        @process_var x
    end
end

macro process_var(ex)
    if ex isa Expr && ex.head == :var
        "var node"
    else
        "something else"
    end
end

julia> @macroexpand @blah
quote
    #= REPL[11]:4 =#
    "var node"
    #= REPL[11]:5 =#
    "something else"
end

This will indeed produce different results in new macro expansion if we blindly convert between Expr and SyntaxTree without validating both directions of the conversion.

The same thing can happen in reverse if a new macro produces an invalid SyntaxTree which just happens to convert to a valid Expr. This is harder because there's only a limited number of Kinds but I suspect it's possible. It seems less problematic in practice because nobody is writing new-style macros yet.

@mlechu
Copy link
Member Author

mlechu commented Oct 31, 2025

Yup, that is the concern

@c42f
Copy link
Member

c42f commented Nov 10, 2025

Yup, that is the concern

Right, it's a tricky one. I'd prefer to go fully one way or the other in terms of conversions. ie, there's a further option (C) to add to your list
(C) Never convert Expr->SyntaxTree until we are forced to. "Forced to" means reaching the end of macro expansion, or a new style macro consuming the mixed-representation tree, or quote expansion consuming it, or anything else in inside macro_exapansion.jl. When doing on-demand conversion like this we'd need to add whatever provenance information on the fly instead of doing it immediately from expand_macro. (ie, doing the parts of append_sourceref() which apply to old macro expansions)

This should allow old macros to call new ones.

Any potential bugs in mind

Not specifically right now. But having a mixed representation seems more confusing to reason about and feels like fertile ground for such bugs.

I found the following bug:

julia> macro quote_expansion_unknown_expr()
           inner_quote = Expr(:quote, Expr(:unknown_head, Expr(:$, :x)))
           quote
               x = :hi
               $inner_quote
           end
       end

julia> @quote_expansion_unknown_expr
:($(Expr(:unknown_head, :hi)))

julia> using JuliaLowering

julia> JuliaLowering.include_string(Main, "@quote_expansion_unknown_expr")
SyntaxTree with attributes kind,value,name_val,syntax_flags,source,scope_layer
:($(Expr(:unknown_head, :($(Expr(:$, :x)))))) :: expr_syntax │

julia> JuliaLowering.include_string(Main, "@quote_expansion_unknown_expr", expr_compat_mode=true)
:($(Expr(:unknown_head, :($(Expr(:$, :x))))))

In the current scheme, any function we call inside macro_expansion.jl needs to be prepared to accept K"expr_syntax" and to do something sensible with it even if that's just throwing an error. fixup_macro_name would be another example.

recursively convert the Expr one-to-one under K"unknown_expr", but I don't know if that's any more useful than the Expr

I still don't understand why everything inside an unknown head needs to become unknown itself. I know there's a few cases where nesting matters and I agree these could be problematic. But I think they might be rare enough that we could deal with them?

@mlechu
Copy link
Member Author

mlechu commented Nov 12, 2025

That is indeed a bug; thanks for finding it!

I still don't understand why everything inside an unknown head needs to become unknown itself.

I'm operating under the assumption that SyntaxTree is not meant to support arbitrary heads, and supporting this is an old-macro-compatibility specific thing, so this would be the easiest way to produce something that round-trips for the next macro to clean up. You're right that quote would be a problem.

I know there's a few cases where nesting matters and I agree these could be problematic. But I think they might be rare enough that we could deal with them?

I think it would be easy to make mostly correct, but nearly impossible to make fully correct (defining that as "semantics, if any, are preserved, and every Expr is representable as a SyntaxTree that can round-trip back to the same Expr"). Unknown heads are the easy case; the concerning case is heads known to one representation and not the other, or known to both with different expected structure.

"Forced to" means reaching the end of macro expansion, or a new style macro consuming the mixed-representation tree, or quote expansion consuming it, or anything else in inside macro_exapansion.jl.

Doesn't the last part mean what we do now (converting after each expansion so expand_forms_1 can handle it)?

I'm happy with the idea of waiting until after macro expansion to convert so we can just assume everything is valid, though. Can we just have expr_to_syntaxtree not recurse on macrocall arguments or quoted code so we don't have to deal with any of this? I think this is almost what I meant by (B). Some gymnastics would be needed if we still want to prioritize calling new macros from old expansions, but it could be done (or we could decide to prioritize the old macro in this case).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants