Skip to content

Conversation

ARR4N
Copy link
Contributor

@ARR4N ARR4N commented Jun 18, 2024

Why this should be merged

From x/gethclone/README.md:

This is an experimental module for tracking upstream go-ethereum changes. The approach of git mergeing the upstream branch into subnet-evm is brittle as it relies on purely syntactic patching. gethclone is intended to follow a rebase-like pattern, applying a set of semantic patches (e.g. AST modification, Uber's gopatch, etc.) that (a) should be more robust to refactoring; and (b) act as explicit documentation of the diffs.

How this works

A subset of geth packages is specified via the --packages flag, retrieved with Go's native package resolution. The import graph is traversed for all other ethereum/go-ethereum packages, and (as yet undefined) AST patches are applied. The --output_go_mod flag specifies the go.mod file of the module to which the transformed files are to be written.

See examples of intended future use in comment below.

How this was tested

The astpatch package has tests to demonstrate proper AST traversal. As no real patches exist yet, this only records and confirms visited nodes.

The gethclone binary itself is not yet tested. The intention is that the cloned files themselves will have tests to demonstrate proper transformation. All *_test.go files will be cloned and run. Modifications (e.g. addition of struct fields) will have tests defined in the destination module. Ideas for more specific tests are welcome.

How is this documented

READMEs and Go comments.

# 2. Sort the unique results
# #. Print out the difference between the search results and the list of specified allowed package imports from geth.
extra_imports=$(grep -r --include='*.go' --exclude-dir='simulator' '"github.com/ethereum/go-ethereum/.*"' -o -h | sort -u | comm -23 - ./scripts/geth-allowed-packages.txt)
extra_imports=$(grep -Pr --include='*.go' --exclude-dir='simulator' '^\s+"github.com/ethereum/go-ethereum/.*"' -o -h | tr -d '[:blank:]' | sort -u | comm -23 - ./scripts/geth-allowed-packages.txt)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a false positive on const geth = "github.com/ethereum/go-ethereum/" so I updated the check to only match with (possible) whitespace but no other prefix.

@ARR4N ARR4N marked this pull request as ready for review June 19, 2024 12:28
@ARR4N ARR4N requested review from ceyonur and darioush as code owners June 19, 2024 12:28
@marun
Copy link

marun commented Jun 19, 2024

Interesting. For mechanical changes like imports or struct modifications this seems like a very promising approach. Might this approach also be appropriate for more complex fork maintenance (i.e. maintaining our changes at the AST level rather than as git commits)?

@ARR4N
Copy link
Contributor Author

ARR4N commented Jun 19, 2024

Interesting. For mechanical changes like imports or struct modifications this seems like a very promising approach. Might this approach also be appropriate for more complex fork maintenance (i.e. maintaining our changes at the AST level rather than as git commits)?

I think it would form part of a broader strategy. Changes like you describe, as well as deletions and other "simple" modifications can be easily implemented as an AST change. Additions of entire methods/functions should just be done as new files already in the destination module (i.e. regular Go code). More complex code changes will be addressed as we encounter them.

Cloned+transformed files should probably still be checked in so others can import the package, but with pre-merge checks to confirm that they're up to date with gethclone output (e.g. with check-clean-branch.sh).

Advantages of maintaining the changes as AST and in-situ files:

  • Explicit record of the differences compared to geth
  • Robust (hopefully) to refactoring of geth

The primary disadvantage is the complexity of writing an AST patch although I plan on writing some generators; that would be used something like:

patches.AddStructField(geth("core/types"), "StateAccount", &ast.Field{..."IsMultiCoin"...})
patches.RemoveMethod(geth("common"), "Address", "Cmp")

@marun
Copy link

marun commented Jun 19, 2024

No argument that the current approach of merging changes without shared history is not ideal, but how would you compare/contrast the AST approach proposed in this PR with maintenance of a proper fork with shared git history for which sets of persistent and transient carry patches are maintained?

processed map[string]bool
}

const geth = "github.com/ethereum/go-ethereum"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe consider moving this into a flag while retaining this value as the default value ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the end goal in having it configurable? This will likely have flow-on effects (e.g. the _ import would be insufficient to guarantee that packages.Load() can resolve the package).

Copy link
Contributor

@tsachiherman tsachiherman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that I would prefer to have the config struct defined in it's own file and leave the main.go to be the light-weight entry point for the executable.

@ARR4N
Copy link
Contributor Author

ARR4N commented Jun 19, 2024

No argument that the current approach of merging changes without shared history is not ideal, but how would you compare/contrast the AST approach proposed in this PR with maintenance of a proper fork with shared git history for which sets of persistent and transient carry patches are maintained?

Context for others as I didn't know about transient/carry patches: https://github.com/openshift/kubernetes/blob/master/REBASE.openshift.md

As for an answer: I'll get back to you @marun because I don't know yet.

@ARR4N
Copy link
Contributor Author

ARR4N commented Jun 19, 2024

@tsachiherman:

I think that I would prefer to have the config struct defined in it's own file and leave the main.go to be the light-weight entry point for the executable.

Moved into gethclone.go

alarso16 pushed a commit that referenced this pull request Sep 30, 2025
github-merge-queue bot pushed a commit that referenced this pull request Oct 1, 2025
ceyonur added a commit that referenced this pull request Oct 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants