VCS references have non-existing URLs #630

@sithmein

The URL of an external VCS reference sometimes points to a page that does not exist. For example, the URL in

      "purl": "pkg:golang/github.com/aws/aws-sdk-go-v2/internal/ini@v1.8.0?type=module\u0026goos=linux\u0026goarch=amd64",
      "externalReferences": [
        {
          "url": "https://github.com/aws/aws-sdk-go-v2/internal/ini",
          "type": "vcs"
        }

does not exist while https://github.com/aws/aws-sdk-go-v2/tree/main/internal/ini or https://github.com/aws/aws-sdk-go-v2 does.

More detail from a discussion in Slack:

Yeah that looks like a bug in the VCS URL resolution:

    var (
        // By convention, modules with a major version equal to or above v2
        // have it as suffix in their module path.
        vcsUrlMajorVersionSuffixRegex = regexp.MustCompile(`(/v[\d]+)$`)
        // gopkg.in with user segment
        // Example: gopkg.in/user/pkg.v3 -> github.com/user/pkg
        vcsUrlGoPkgInRegexWithUser = regexp.MustCompile(`^gopkg\.in/([^/]+)/([^.]+)\..*$`)
        // gopkg.in without user segment
        // Example: gopkg.in/pkg.v3 -> github.com/go-pkg/pkg
        vcsUrlGoPkgInRegexWithoutUser = regexp.MustCompile(`^gopkg\.in/([^.]+)\..*$`)
    )

    const vcsHttpsPrefix = "https://"

    func resolveVCSURL(modulePath string) string {
        switch {
        case strings.HasPrefix(modulePath, "github.com/"):
            return vcsHttpsPrefix + vcsUrlMajorVersionSuffixRegex.ReplaceAllString(modulePath, "")
        case vcsUrlGoPkgInRegexWithUser.MatchString(modulePath):
            return vcsHttpsPrefix + vcsUrlGoPkgInRegexWithUser.ReplaceAllString(modulePath, "github.com/$1/$2")
        case vcsUrlGoPkgInRegexWithoutUser.MatchString(modulePath):
            return vcsHttpsPrefix + vcsUrlGoPkgInRegexWithoutUser.ReplaceAllString(modulePath, "github.com/go-$1/$1")
        }
        return ""
    }

and

You probably want to truncate the link to https://github.com/aws/aws-sdk-go-v2 instead of inserting tree/main.

  1. The tree/main version is browsable, but it is not usable with Git.
  2. Inserting tree/main assumes that the default branch is main (tree/HEAD fixes this).
  3. Inserting tree/anything assumes that the code is on the default branch, but the code may not exist there.
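
The truncation suggested above could look roughly like the following. This is only a sketch, not the project's actual fix: repoRootURL is a hypothetical name, and it assumes that every github.com module path has the shape github.com/owner/repo[/subdir...], so everything after the third segment can be dropped.

```go
package main

import "strings"

// repoRootURL is a hypothetical replacement for the github.com branch of
// resolveVCSURL. It keeps only host/owner/repo and drops whatever follows,
// which covers both subdirectories and major-version suffixes such as /v2.
func repoRootURL(modulePath string) string {
	if !strings.HasPrefix(modulePath, "github.com/") {
		return "" // other hosts are out of scope for this sketch
	}
	// Split into at most four parts: host, owner, repo, and the remainder.
	parts := strings.SplitN(modulePath, "/", 4)
	if len(parts) < 3 || parts[1] == "" || parts[2] == "" {
		return "" // not a complete github.com/owner/repo path
	}
	return "https://" + strings.Join(parts[:3], "/")
}
```

With this, github.com/aws/aws-sdk-go-v2/internal/ini would resolve to https://github.com/aws/aws-sdk-go-v2, which is a URL Git can clone.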

There are other problems if the host is not GitHub, but this code only tries to support GitHub. Because of the way Go (theoretically) works, there is a known algorithm to turn a module ID like v.io into a VCS URL like https://github.com/vanadium/core, and even to know that this module is in Git and not Fossil. (How is CycloneDX supposed to represent the type of VCS?)
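
For context on the algorithm mentioned above: for hosts it does not know, the go command requests https://<import-path>?go-get=1 and looks for a `<meta name="go-import" content="import-prefix vcs repo-root">` tag in the response. A minimal sketch of parsing that content attribute follows. The type and function names are hypothetical, the HTTP fetch and HTML parsing are left out, and the v.io example string is an illustration built from the values in this issue, not a captured response.

```go
package main

import (
	"fmt"
	"strings"
)

// goImport holds the three fields of a go-import meta tag's content attribute.
type goImport struct {
	Prefix   string // import path prefix the tag applies to
	VCS      string // version control system, e.g. "git" or "fossil"
	RepoRoot string // repository URL
}

// parseGoImport splits a content attribute such as
// "v.io git https://github.com/vanadium/core" (illustrative input)
// into its prefix, VCS type, and repository root.
func parseGoImport(content string) (goImport, error) {
	fields := strings.Fields(content)
	if len(fields) != 3 {
		return goImport{}, fmt.Errorf("go-import: want 3 fields, got %d", len(fields))
	}
	return goImport{Prefix: fields[0], VCS: fields[1], RepoRoot: fields[2]}, nil
}
```

The VCS field is what distinguishes Git from Fossil here; how that should map onto a CycloneDX external reference is the open question raised above.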

Labels: bug (Something isn't working)
