estimate: Use the default GOPATH to cache downloads #304
rhansen wants to merge 1 commit into Debian:master
The code looks alright at first glance, but I am not sure whether this is a good idea. The advantage is that subsequent runs will be faster, but it does not clean up after itself.
Also, I am not sure how go get handles de-duplication when another version of a library is already downloaded.
I'm afraid this would only speed things up when running estimate for the same package multiple times, at the cost of disk space.
In what scenario does the re-downloading that this optimizes happen?
This avoids re-downloading the same modules (potentially hundreds of
megabytes for moderately sized binaries) over and over. The downside
is wasted disk space that the user must manually clear (`go clean
-modcache`).
Why keeping the downloaded modules is believed to be more useful than
automatically deleting them:
* It makes offline use possible.
* Packaging a complex module with lots of unpackaged dependencies
  means a long, slow process of chipping away at the dependencies.
  It would be nice to be able to re-run `dh-make-golang estimate`
  several times to track progress and plan next steps without
  incurring a huge download cost.
* Developing `dh-make-golang estimate` itself involves re-running it
  over and over to test changes. The interesting modules to test
  are the complex ones, and those take a while to download.
* I believe that `golang-*-dev` package maintainers are already
  accustomed to running `go clean -modcache`, an occasional
  necessity with serious Go development.
* The modules don't actually need to be fully downloaded and
  unpacked; only their `go.mod` files are necessary to construct a
  dependency graph. Changing `dh-make-golang estimate` in a future
  commit to download only `go.mod` would address disk usage
  concerns. It would also mean that re-downloading over and over
  would be less of a problem, but keeping the `go.mod` files avoids
  abusing proxy.golang.org and sum.golang.org.
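
The last point could be sketched as follows. This is not code from the PR: it assumes the documented GOPROXY protocol, in which a proxy serves a module's `go.mod` at `<proxy>/<module>/@v/<version>.mod`, and the helper names `escapePath` and `goModURL` as well as the module `example.com/Foo/bar` are hypothetical. It only builds the URL; fetching and parsing are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath applies the proxy's case encoding: each upper-case ASCII
// letter becomes '!' followed by its lower-case form.
// Hypothetical helper; golang.org/x/mod/module.EscapePath is the
// validating implementation one would normally use.
func escapePath(path string) string {
	var b strings.Builder
	for _, r := range path {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// goModURL returns the URL at which a GOPROXY-compatible proxy serves
// the go.mod file of module@version. The version is assumed to be
// canonical already (for example v1.2.3).
func goModURL(proxy, module, version string) string {
	return fmt.Sprintf("%s/%s/@v/%s.mod",
		strings.TrimSuffix(proxy, "/"), escapePath(module), escapePath(version))
}

func main() {
	// example.com/Foo/bar is a made-up module path for illustration.
	fmt.Println(goModURL("https://proxy.golang.org", "example.com/Foo/bar", "v1.2.3"))
}
```

Fetching only these small files would keep both disk usage and proxy traffic low, at the cost of resolving the dependency graph without `go get`.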
This change makes `progressSize` not useful, so `go get`'s stdout and
stderr are passed through, and `-x` is passed to `go get` to get some
progress indication.
Two scenarios that affect me:
and maybe a third scenario on occasion:
The modules don't actually need to be fully downloaded and unpacked, only their `go.mod` files. (I'll update the commit message to include the above rationale.)
This is a problem with Go in general. I expect most Debian Go maintainers are also Go developers, and are already used to running `go clean -modcache`.
Both versions are saved and unpacked in the module cache (no deduplication).
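
To illustrate why there is no deduplication: the go command unpacks each module version into its own `<module>@<version>` directory under the module cache (`go env GOMODCACHE`). The sketch below only computes those paths; `cacheDir`, the cache root and `example.com/lib` are hypothetical, and the `!`-escaping of upper-case letters is ignored.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// cacheDir returns the directory in which module@version is unpacked
// inside the module cache rooted at modcache.
// Hypothetical helper: it skips the '!' escaping of upper-case letters.
func cacheDir(modcache, module, version string) string {
	return filepath.Join(modcache, filepath.FromSlash(module)+"@"+version)
}

func main() {
	// Assumed cache root; the real one is printed by `go env GOMODCACHE`.
	root := filepath.Join("/home", "user", "go", "pkg", "mod")
	for _, v := range []string{"v1.0.0", "v1.1.0"} {
		// Each version gets a separate directory tree.
		fmt.Println(cacheDir(root, "example.com/lib", v))
	}
}
```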
1accda0 to 14e2060
After thinking about it a bit more, I think your current approach is alright. Maybe it simply needs a note in the man page and/or the --help message of the estimate command explaining that one has to run `go clean -modcache`.
In fact, after testing it a bit, I find the And keep the As the cache is reused, we can expect the downloads to be usually faster anyway. So it would look something like this:

diff --git a/estimate.go b/estimate.go
index 81cc655..cf0ce74 100644
--- a/estimate.go
+++ b/estimate.go
@@ -70,12 +70,16 @@ func get(repodir, repo, rev string) error {
 	if rev != "" {
 		packages += "@" + rev
 	}
-	cmd := exec.Command("go", "get", "-x", "-t", packages)
+	cmd := exec.Command("go", "get", "-t", packages)
+	out := bytes.Buffer{}
 	cmd.Dir = repodir
-	cmd.Stderr = os.Stderr
-	cmd.Stdout = os.Stdout
-	return cmd.Run()
+	cmd.Stderr = &out
+	err := cmd.Run()
+	if err != nil {
+		fmt.Fprint(os.Stderr, out.String())
+	}
+	return err
 }

 // getModuleDir returns the path of the directory containing a module for the given repository dir.
@@ -204,6 +208,8 @@ func estimate(importpath, revision string) error {
 		return fmt.Errorf("create dummymod: %w", err)
 	}

+	log.Println(`Downloading modules... (run "go clean -modcache" to clear downloaded modules)`)
+
 	if err := get(repodir, importpath, revision); err != nil {
 		return fmt.Errorf("go get: %w", err)
 	}
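
The pattern in the suggested diff (buffer the command's stderr and show it only on failure) can be tried in isolation with a sketch like the one below. It is not the code from `estimate.go`: `runQuiet` is a hypothetical helper, and `go version` merely stands in for the much slower `go get -t <packages>` call.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
)

// runQuiet runs the command and returns whatever it wrote to stderr
// instead of streaming it to the terminal. (Hypothetical helper.)
func runQuiet(name string, args ...string) (string, error) {
	var out bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stderr = &out
	err := cmd.Run()
	return out.String(), err
}

func main() {
	// "go version" is a cheap stand-in for "go get -t <packages>".
	out, err := runQuiet("go", "version")
	if err != nil {
		// Surface the captured output only when the command failed.
		fmt.Fprint(os.Stderr, out)
		fmt.Fprintln(os.Stderr, "command failed:", err)
		return
	}
	fmt.Println("done")
}
```

The trade-off is that a long download produces no output until it finishes, which is why the diff also logs a one-line message before calling `get`.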