-
Notifications
You must be signed in to change notification settings - Fork 57
chore: add validation for module source URLs #406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 5 commits
72d7ee4
5b6d878
3b6b1ba
7e94395
5419676
9c00c57
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,7 +3,6 @@ on: | |
push: | ||
branches: | ||
- main | ||
- master | ||
pull_request: | ||
|
||
permissions: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,11 +3,81 @@ package main | |
import ( | ||
"bufio" | ||
"context" | ||
"path/filepath" | ||
"regexp" | ||
"strings" | ||
|
||
"golang.org/x/xerrors" | ||
) | ||
|
||
var ( | ||
terraformSourceRe = regexp.MustCompile(`^\s*source\s*=\s*"([^"]+)"`) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you have a good idea for what source fields typically look like? I feel like this regex is too broad (the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm seeing all of these on the Terraform docs site: I think I'd prefer to have more explicit patterns for each source type example listed there There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, also, could we add a comment saying what general pattern we expect this to match? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One last update: now that I've looked at the rest of this file, if we're always going to use |
||
) | ||
|
||
func extractNamespaceAndModuleFromPath(filePath string) (namespace string, moduleName string, err error) { | ||
matifali marked this conversation as resolved.
Show resolved
Hide resolved
|
||
// Expected path format: registry/<namespace>/modules/<module-name>/README.md. | ||
parts := strings.Split(filepath.Clean(filePath), string(filepath.Separator)) | ||
if len(parts) < 5 || parts[0] != "registry" || parts[2] != "modules" || parts[4] != "README.md" { | ||
return "", "", xerrors.Errorf("invalid module path format: %s", filePath) | ||
} | ||
namespace = parts[1] | ||
moduleName = parts[3] | ||
return namespace, moduleName, nil | ||
} | ||
|
||
func validateModuleSourceURL(body string, filePath string) []error { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wish I knew more about Coder templates. Do templates have a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. no its not applicable to templates |
||
var errs []error | ||
|
||
namespace, moduleName, err := extractNamespaceAndModuleFromPath(filePath) | ||
if err != nil { | ||
return []error{err} | ||
} | ||
|
||
expectedSource := "registry.coder.com/" + namespace + "/" + moduleName + "/coder" | ||
|
||
trimmed := strings.TrimSpace(body) | ||
foundCorrectSource := false | ||
isInsideTerraform := false | ||
firstTerraformBlock := true | ||
|
||
lineScanner := bufio.NewScanner(strings.NewReader(trimmed)) | ||
for lineScanner.Scan() { | ||
nextLine := lineScanner.Text() | ||
|
||
if strings.HasPrefix(nextLine, "```") { | ||
if strings.HasPrefix(nextLine, "```tf") && firstTerraformBlock { | ||
isInsideTerraform = true | ||
firstTerraformBlock = false | ||
} else if isInsideTerraform { | ||
// End of first terraform block. | ||
break | ||
} | ||
continue | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this can be rewritten as: if strings.HasPrefix(nextLine, "```") {
if strings.HasPrefix(nextLine, "```tf") {
isInsideTerraform = true
continue
}
if isInsideTerraform {
break
}
} I don't see the point of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The goal is to check all tf blocks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe I'm missing something obvious, then. What validation requirement(s) are we trying to fulfill? I was under the impression that we wanted the first Terraform block to contain a source field, but that other sources could be valid for the other blocks. Are we trying to keep all sources fully "boxed in" for Coder templates? Not saying that the current code is wrong, but I'm not fully sure what we're trying to protect against, so I just want to make sure I'm giving good feedback There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We want to make sure all sources for all codeblock in a module README.md are in correct format. This should not apply to templates if it does its probably a bug. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, I meant "modules" instead of "templates" But if the goal is to check all Terraform blocks, then we'll need to remove the With how the code is currently set up, once we hit the end of the first block, we always hit the |
||
|
||
if isInsideTerraform { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can reduce the nesting if we do this: if !isInsideTerraform {
continue
} |
||
// Check for any source line in the first terraform block. | ||
if matches := terraformSourceRe.FindStringSubmatch(nextLine); matches != nil { | ||
actualSource := matches[1] | ||
if actualSource == expectedSource { | ||
foundCorrectSource = true | ||
break | ||
} else if strings.HasPrefix(actualSource, "registry.coder.com/") && strings.Contains(actualSource, "/"+moduleName+"/coder") { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But also, do we want to extract the |
||
// Found source for this module but with wrong namespace/format. | ||
errs = append(errs, xerrors.Errorf("incorrect source URL format: found %q, expected %q", actualSource, expectedSource)) | ||
return errs | ||
} | ||
} | ||
} | ||
} | ||
|
||
if !foundCorrectSource { | ||
errs = append(errs, xerrors.Errorf("did not find correct source URL %q in first Terraform code block", expectedSource)) | ||
} | ||
|
||
return errs | ||
} | ||
|
||
func validateCoderModuleReadmeBody(body string) []error { | ||
var errs []error | ||
|
||
|
@@ -94,6 +164,9 @@ func validateCoderModuleReadme(rm coderResourceReadme) []error { | |
for _, err := range validateCoderModuleReadmeBody(rm.body) { | ||
errs = append(errs, addFilePathToError(rm.filePath, err)) | ||
} | ||
for _, err := range validateModuleSourceURL(rm.body, rm.filePath) { | ||
errs = append(errs, addFilePathToError(rm.filePath, err)) | ||
} | ||
for _, err := range validateResourceGfmAlerts(rm.body) { | ||
errs = append(errs, addFilePathToError(rm.filePath, err)) | ||
} | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tests generated by gemini CLI |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,3 +20,86 @@ func TestValidateCoderResourceReadmeBody(t *testing.T) { | |
} | ||
}) | ||
} | ||
|
||
func TestValidateModuleSourceURL(t *testing.T) { | ||
t.Parallel() | ||
|
||
t.Run("Valid source URL format", func(t *testing.T) { | ||
t.Parallel() | ||
|
||
body := "# Test Module\n\n```tf\nmodule \"test-module\" {\n source = \"registry.coder.com/test-namespace/test-module/coder\"\n version = \"1.0.0\"\n agent_id = coder_agent.example.id\n}\n```\n" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: I feel like reading through these bodies is going to be a pain if the validation logic ever breaks, and we need to debug things. A couple of things come to mind:
|
||
filePath := "registry/test-namespace/modules/test-module/README.md" | ||
errs := validateModuleSourceURL(body, filePath) | ||
if len(errs) != 0 { | ||
t.Errorf("Expected no errors, got: %v", errs) | ||
} | ||
}) | ||
|
||
t.Run("Invalid source URL format - wrong namespace", func(t *testing.T) { | ||
t.Parallel() | ||
|
||
body := "# Test Module\n\n```tf\nmodule \"test-module\" {\n source = \"registry.coder.com/wrong-namespace/test-module/coder\"\n version = \"1.0.0\"\n agent_id = coder_agent.example.id\n}\n```\n" | ||
filePath := "registry/test-namespace/modules/test-module/README.md" | ||
errs := validateModuleSourceURL(body, filePath) | ||
if len(errs) != 1 { | ||
t.Errorf("Expected 1 error, got %d: %v", len(errs), errs) | ||
} | ||
if len(errs) > 0 && !contains(errs[0].Error(), "incorrect source URL format") { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This |
||
t.Errorf("Expected source URL format error, got: %s", errs[0].Error()) | ||
} | ||
}) | ||
|
||
t.Run("Missing source URL", func(t *testing.T) { | ||
t.Parallel() | ||
|
||
body := "# Test Module\n\n```tf\nmodule \"other-module\" {\n source = \"registry.coder.com/other/other-module/coder\"\n version = \"1.0.0\"\n agent_id = coder_agent.example.id\n}\n```\n" | ||
filePath := "registry/test-namespace/modules/test-module/README.md" | ||
errs := validateModuleSourceURL(body, filePath) | ||
if len(errs) != 1 { | ||
t.Errorf("Expected 1 error, got %d: %v", len(errs), errs) | ||
} | ||
if len(errs) > 0 && !contains(errs[0].Error(), "did not find correct source URL") { | ||
matifali marked this conversation as resolved.
Show resolved
Hide resolved
|
||
t.Errorf("Expected missing source URL error, got: %s", errs[0].Error()) | ||
} | ||
}) | ||
|
||
t.Run("Module name with hyphens vs underscores", func(t *testing.T) { | ||
t.Parallel() | ||
|
||
body := "# Test Module\n\n```tf\nmodule \"test_module\" {\n source = \"registry.coder.com/test-namespace/test-module/coder\"\n version = \"1.0.0\"\n agent_id = coder_agent.example.id\n}\n```\n" | ||
filePath := "registry/test-namespace/modules/test-module/README.md" | ||
errs := validateModuleSourceURL(body, filePath) | ||
if len(errs) != 0 { | ||
t.Errorf("Expected no errors for hyphen/underscore variation, got: %v", errs) | ||
} | ||
}) | ||
matifali marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
t.Run("Invalid file path format", func(t *testing.T) { | ||
t.Parallel() | ||
|
||
body := "# Test Module" | ||
matifali marked this conversation as resolved.
Show resolved
Hide resolved
|
||
filePath := "invalid/path/format" | ||
errs := validateModuleSourceURL(body, filePath) | ||
if len(errs) != 1 { | ||
t.Errorf("Expected 1 error, got %d: %v", len(errs), errs) | ||
} | ||
if len(errs) > 0 && !contains(errs[0].Error(), "invalid module path format") { | ||
t.Errorf("Expected path format error, got: %s", errs[0].Error()) | ||
} | ||
}) | ||
} | ||
|
||
func contains(s, substr string) bool { | ||
return len(s) >= len(substr) && (s == substr || (len(s) > len(substr) && | ||
(s[:len(substr)] == substr || s[len(s)-len(substr):] == substr || | ||
indexOfSubstring(s, substr) >= 0))) | ||
} | ||
matifali marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
func indexOfSubstring(s, substr string) int { | ||
for i := 0; i <= len(s)-len(substr); i++ { | ||
if s[i:i+len(substr)] == substr { | ||
return i | ||
} | ||
matifali marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
return -1 | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -25,7 +25,7 @@ var ( | |
terraformVersionRe = regexp.MustCompile(`^\s*\bversion\s+=`) | ||
|
||
// Matches the format "> [!INFO]". Deliberately using a broad pattern to catch formatting issues that can mess up | ||
// the renderer for the Registry website | ||
// the renderer for the Registry website. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. All these There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, yeah, I feel like there should be a way to enforce this during the CI step, but this was my fault for not linting before opening one of the recent PRs |
||
gfmAlertRegex = regexp.MustCompile(`^>(\s*)\[!(\w+)\](\s*)(.*)`) | ||
) | ||
|
||
|
@@ -39,7 +39,7 @@ type coderResourceFrontmatter struct { | |
} | ||
|
||
// A slice version of the struct tags from coderResourceFrontmatter. Might be worth using reflection to generate this | ||
// list at runtime in the future, but this should be okay for now | ||
// list at runtime in the future, but this should be okay for now. | ||
var supportedCoderResourceStructKeys = []string{ | ||
"description", "icon", "display_name", "verified", "tags", "supported_os", | ||
// TODO: This is an old, officially deprecated key from the archived coder/modules repo. We can remove this once we | ||
|
@@ -315,15 +315,15 @@ func validateResourceGfmAlerts(readmeBody string) []error { | |
} | ||
|
||
// Nested GFM alerts is such a weird mistake that it's probably not really safe to keep trying to process the | ||
// rest of the content, so this will prevent any other validations from happening for the given line | ||
// rest of the content, so this will prevent any other validations from happening for the given line. | ||
if isInsideGfmQuotes { | ||
errs = append(errs, errors.New("registry does not support nested GFM alerts")) | ||
errs = append(errs, xerrors.New("registry does not support nested GFM alerts")) | ||
continue | ||
} | ||
|
||
leadingWhitespace := currentMatch[1] | ||
if len(leadingWhitespace) != 1 { | ||
errs = append(errs, errors.New("GFM alerts must have one space between the '>' and the start of the GFM brackets")) | ||
errs = append(errs, xerrors.New("GFM alerts must have one space between the '>' and the start of the GFM brackets")) | ||
Comment on lines
+334
to
+340
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These changes are also done by |
||
} | ||
isInsideGfmQuotes = true | ||
|
||
|
@@ -347,7 +347,7 @@ func validateResourceGfmAlerts(readmeBody string) []error { | |
} | ||
} | ||
|
||
if gfmAlertRegex.Match([]byte(sourceLine)) { | ||
if gfmAlertRegex.MatchString(sourceLine) { | ||
errs = append(errs, xerrors.Errorf("README has an incomplete GFM alert at the end of the file")) | ||
} | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
module coder.com/coder-registry | ||
|
||
go 1.23.2 | ||
go 1.24 | ||
|
||
require ( | ||
cdr.dev/slog v1.6.1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was having weird issues with running Golang CI locally, and had to match the Go version.