-
Notifications
You must be signed in to change notification settings - Fork 1
fix: Add field name validation to prevent SQL injection #57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit addresses issue #21 by implementing comprehensive field name validation at multiple points in the CEL-to-SQL conversion pipeline. Changes: - Enhanced validateFieldName() in utils.go with proper PostgreSQL limits: * Maximum identifier length: 63 characters (PostgreSQL NAMEDATALEN-1) * Format validation: must start with letter/underscore, alphanumeric+underscore only * Reserved keyword checking: 60+ SQL keywords now rejected * Empty string validation - Added validation calls in cel2sql.go: * visitSelect(): validates field names in select expressions * visitIdent(): validates identifier names to prevent reserved keywords in SQL - Comprehensive test coverage: * utils_test.go: 43 unit tests for validateFieldName() * field_validation_test.go: Integration tests for full CEL-to-SQL pipeline * Tests cover: valid names, SQL injection attempts, reserved keywords, format violations, length limits, and edge cases Security Impact: - Prevents SQL injection through malicious field names - Blocks use of reserved SQL keywords as unquoted identifiers - Enforces PostgreSQL identifier naming conventions - Defense-in-depth approach with validation at multiple pipeline stages All tests passing with 56.5% coverage in main package. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds comprehensive field name validation to prevent SQL injection vulnerabilities in the CEL-to-SQL conversion pipeline. The changes implement defense-in-depth security by validating field names and identifiers at multiple points in the conversion process.
Key changes:
- Added
validateFieldName()function with PostgreSQL identifier rules, length limits, and reserved keyword checking - Integrated validation into
visitSelect()andvisitIdent()conversion methods - Added 43 unit tests and 7 integration tests covering validation rules, SQL injection patterns, and edge cases
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| utils.go | Implements field name validation with PostgreSQL identifier rules, 63-character limit enforcement, and reserved keyword checking |
| cel2sql.go | Integrates validation into identifier and select expression conversion to block invalid names early |
| utils_test.go | Comprehensive unit tests covering valid names, SQL injection attempts, format violations, length limits, and reserved keywords |
| field_validation_test.go | Integration tests verifying validation works through the full CEL-to-SQL conversion pipeline |
|
|
||
| var ( | ||
| // fieldNameRegexp validates PostgreSQL identifier format | ||
| fieldNameRegexp = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`) |
Copilot
AI
Oct 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The regexp is compiled at package initialization time but could be replaced with a more efficient character-by-character validation. For a security-critical validation function called frequently, consider replacing regex with explicit character validation loops to avoid regex engine overhead and improve performance.
| fieldName := sel.GetField() | ||
| if err := validateFieldName(fieldName); err != nil { | ||
| return fmt.Errorf("invalid field name in select expression: %w", err) | ||
| } |
Copilot
AI
Oct 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field name is retrieved and validated, but sel.GetField() is still called multiple times later in the function (lines 1119, 1133). Consider reusing the fieldName variable throughout to avoid redundant method calls and improve code clarity.
| con.str.WriteString("->>'") | ||
| con.str.WriteString(fieldName) | ||
| con.str.WriteString("'") |
Copilot
AI
Oct 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Field names are written directly into SQL strings without escaping single quotes. While validateFieldName() prevents most injection patterns, if a field name containing a single quote somehow passes validation, it could break the SQL syntax. Consider adding explicit quote escaping or verifying the validation regex prevents single quotes.
richardwooding
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed review! I'll address each point:
Comment 1 (Regex Performance):
Good suggestion for future optimization. The current regex is compiled once at package initialization and the pattern is simple (^[a-zA-Z_][a-zA-Z0-9_]*$), so the overhead is minimal. Since this is a security-critical fix being backported, I'd prefer to keep the clear, maintainable regex implementation. If profiling shows this is a bottleneck, we can optimize to character-by-character validation in a follow-up PR.
Comment 2 (Reuse fieldName variable):
Already done! The fieldName variable extracted at line 1081 is reused throughout the function at lines 1093, 1104, 1119, 1124, 1126, and 1133. The code doesn't call sel.GetField() again after the initial extraction and validation.
Comment 3 (Single quote escaping):
The validation regex ^[a-zA-Z_][a-zA-Z0-9_]*$ only allows letters, digits, and underscores, so single quotes are already prevented by the validation. The regex explicitly rejects any character that isn't alphanumeric or underscore, making quote injection impossible. This is by design - PostgreSQL identifiers that need special characters (including quotes) must be quoted with double quotes, which we intentionally don't support here to maintain security.
Summary
Fixes #21 - Adds comprehensive field name validation to prevent SQL injection vulnerabilities in the CEL-to-SQL conversion pipeline.
Changes Made
1. Enhanced Field Name Validation (
utils.go)2. Added Validation to Conversion Pipeline (
cel2sql.go)visitSelect(): Validates field names in select expressions (e.g.,obj.fieldname)visitIdent(): Validates identifier names to prevent reserved keywords in SQL output3. Comprehensive Test Coverage
utils_test.go: 43 unit tests covering all validation rulesfield_validation_test.go: Integration tests through full CEL-to-SQL pipelineSecurity Impact
This fix provides defense-in-depth protection against SQL injection:
SELECT,DROP,UNIONcannot be used as unquoted identifiersTest Results
Examples
Before (vulnerable):
After (protected):
Testing Checklist
go test ./...)🤖 Generated with Claude Code