feat(wren-launcher): Expanded dbt<>Wren MDL conversions, added BigQuery support #1859
Conversation
Walkthrough

Adds an --include-staging-models flag and interactive prompt; threads IncludeStagingModels through CLI → DbtConvertProject; extends the converter to parse an optional semantic_manifest.json (enums, metrics, primary keys, relationships); adds BigQuery data source support and tests; extends Wren MDL with EnumDefinitions and Metrics. A duplicate prompt helper is present.
Sequence Diagram(s)

sequenceDiagram
autonumber
participant User as CLI User
participant CLI as DbtAutoConvert (CLI)
participant Conv as DbtConvertProject / Converter
participant DS as DataSource Builder
participant MDL as WrenMDLManifest
User->>CLI: run convert (flag or prompt)
CLI->>Conv: invoke conversion (IncludeStagingModels)
Conv->>DS: parse profiles → build DataSource (BigQuery path if detected)
Conv->>Conv: read manifest.json & catalog.json
alt semantic_manifest present
Conv->>Conv: read semantic_manifest.json → extract enums, metrics, PKs
end
Conv->>Conv: apply staging filter (based on IncludeStagingModels)
Conv->>Conv: generate relationships & metrics
Conv->>MDL: assemble manifest (models, enums, metrics, relationships, datasources)
MDL-->>CLI: return ConvertResult
CLI-->>User: write/output MDL
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks: ❌ 1 failed check (1 warning), ✅ 2 passed
Actionable comments posted: 3
🧹 Nitpick comments (4)
wren-launcher/commands/dbt/data_source.go (1)

119-137: Consider adding validation for keyfile_json format.

The function correctly extracts keyfile_json from the Additional map, but there is no validation of its format. Consider adding basic JSON validation to catch malformed service account keys early.
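For illustration, a minimal sketch of such a check, with a hypothetical helper name and an assumed list of expected service-account fields (not taken from the PR):

```go
package dbt

import (
	"encoding/json"
	"fmt"
)

// validateServiceAccountKey is a hypothetical helper: it only checks that the
// value parses as JSON and carries the fields a service-account key normally has.
func validateServiceAccountKey(keyfileJSON string) error {
	var key map[string]interface{}
	if err := json.Unmarshal([]byte(keyfileJSON), &key); err != nil {
		return fmt.Errorf("keyfile_json is not valid JSON: %w", err)
	}
	for _, field := range []string{"type", "project_id", "private_key"} {
		if _, ok := key[field]; !ok {
			return fmt.Errorf("keyfile_json is missing expected field %q", field)
		}
	}
	return nil
}
```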
wren-launcher/commands/dbt/converter.go (3)

573-587: Consider sanitizing enum names for special characters.

The enum name generation uses model and column names directly, which might contain special characters that are invalid in enum names.
Consider adding more robust sanitization:
 enumName = fmt.Sprintf("%s_%s_Enum", modelName, columnName)
 // Sanitize enum name
-enumName = strings.ReplaceAll(enumName, ".", "_")
+// Replace any non-alphanumeric characters with underscores
+re := regexp.MustCompile(`[^a-zA-Z0-9_]`)
+enumName = re.ReplaceAllString(enumName, "_")
+// Ensure it doesn't start with a number
+if len(enumName) > 0 && enumName[0] >= '0' && enumName[0] <= '9' {
+	enumName = "_" + enumName
+}
684-701: Add validation for measure references.

The code searches for input measures in the measure lookup but doesn't handle cases where measures might not be found.
Consider adding validation and logging:
 // Find the model this metric is based on
 for model, measures := range measureLookup {
 	for _, inputMeasure := range inputMeasuresList {
 		if imMap, ok := inputMeasure.(map[string]interface{}); ok {
 			imName := getStringFromMap(imMap, "name", "")
 			if _, exists := measures[imName]; exists {
 				baseModel = model
 				break
+			} else if imName != "" {
+				pterm.Warning.Printf("Measure '%s' not found in model '%s'\n", imName, model)
 			}
 		}
 	}
 	if baseModel != "" {
 		break
 	}
 }
+if baseModel == "" && len(inputMeasuresList) > 0 {
+	pterm.Warning.Printf("Could not find model for metric '%s'\n", metricName)
+}
918-925: Handle edge cases in ref parsing.

The parseRef function assumes standard ref() syntax but might encounter variations. Consider handling more ref() variations:
 func parseRef(refStr string) string {
-	re := regexp.MustCompile(`ref\(['"]([^'"]+)['"]\)`)
+	// Handle ref() with optional spaces and both quote types
+	re := regexp.MustCompile(`ref\s*\(\s*['"]([^'"]+)['"]\s*\)`)
 	matches := re.FindStringSubmatch(refStr)
 	if len(matches) > 1 {
 		return matches[1]
 	}
 	return ""
 }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- wren-launcher/commands/dbt.go (3 hunks)
- wren-launcher/commands/dbt/converter.go (14 hunks)
- wren-launcher/commands/dbt/data_source.go (3 hunks)
- wren-launcher/commands/dbt/data_source_test.go (1 hunks)
- wren-launcher/commands/dbt/profiles.go (2 hunks)
- wren-launcher/commands/dbt/profiles_analyzer.go (2 hunks)
- wren-launcher/commands/dbt/wren_mdl.go (3 hunks)
- wren-launcher/commands/launch.go (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: in wren-launcher/commands/dbt/data_source.go, when encountering unsupported database types in conver...
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:50-53
Timestamp: 2025-07-09T02:43:20.433Z
Learning: In wren-launcher/commands/dbt/data_source.go, when encountering unsupported database types in convertConnectionToDataSource function, the user prefers to log a warning and continue processing instead of failing. The function should return nil, nil to allow processing to continue.
Applied to files:
wren-launcher/commands/dbt/profiles.go
wren-launcher/commands/dbt/profiles_analyzer.go
wren-launcher/commands/dbt.go
wren-launcher/commands/launch.go
wren-launcher/commands/dbt/data_source_test.go
wren-launcher/commands/dbt/data_source.go
wren-launcher/commands/dbt/converter.go
📚 Learning: in the wren-ai-service codebase, when adding new fields like "alias" to the output of functions that...
Learnt from: cyyeh
PR: Canner/WrenAI#1763
File: wren-ai-service/src/pipelines/generation/semantics_description.py:31-33
Timestamp: 2025-06-20T02:37:21.292Z
Learning: In the wren-ai-service codebase, when adding new fields like "alias" to the output of functions that use Pydantic models for validation, the user prefers not to update the corresponding Pydantic model definitions to include these new fields.
Applied to files:
wren-launcher/commands/dbt/wren_mdl.go
📚 Learning: in the wrenai codebase, the maptype methods for data sources should return the original type name fo...
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:154-156
Timestamp: 2025-07-09T02:43:07.560Z
Learning: In the WrenAI codebase, the MapType methods for data sources should return the original type name for unknown types, rather than a default fallback like "string". This maintains consistency across all data source implementations.
Applied to files:
wren-launcher/commands/dbt/data_source.go
wren-launcher/commands/dbt/converter.go
🧬 Code Graph Analysis (3)
wren-launcher/commands/dbt.go (1)
wren-launcher/commands/dbt/converter.go (3)
- ConvertResult (30-34)
- ConvertOptions (19-27)
- ConvertDbtProjectCore (38-259)

wren-launcher/commands/launch.go (1)
wren-launcher/commands/dbt.go (1)
- DbtConvertProject (63-75)

wren-launcher/commands/dbt/converter.go (3)
wren-launcher/commands/dbt/utils.go (1)
- FileExists (30-36)
wren-launcher/commands/dbt/data_source.go (2)
- WrenBigQueryDataSource (218-224)
- DataSource (12-16)
wren-launcher/commands/dbt/wren_mdl.go (7)
- WrenMDLManifest (4-13)
- EnumDefinition (16-19)
- WrenModel (22-30)
- Relationship (52-58)
- Metric (61-68)
- View (71-75)
- WrenColumn (40-49)
🔇 Additional comments (10)
wren-launcher/commands/dbt/profiles_analyzer.go (1)

136-136: LGTM! Clean implementation of method field extraction.

The addition of the Method field extraction follows the established pattern and correctly includes it in the knownFields map to prevent it from being stored in the Additional map. This properly supports BigQuery authentication methods as outlined in the PR objectives.

Also applies to: 149-149
wren-launcher/commands/launch.go (1)

518-518: LGTM! Correctly implements the new IncludeStagingModels parameter.

The addition of the false parameter for IncludeStagingModels is appropriate for the launch flow, where staging models are typically not needed. This aligns with the function signature update and maintains backward compatibility.

wren-launcher/commands/dbt/profiles.go (2)
28-28: LGTM! Well-implemented Method field for BigQuery support.

The Method field is properly added with appropriate YAML/JSON tags and an omitempty annotation, following the established pattern for optional database-specific fields. This correctly supports BigQuery authentication methods as mentioned in the PR objectives.
11-11: LGTM! Improved formatting consistency.

The formatting adjustments to the Target and SSLMode fields improve code consistency without affecting functionality.

Also applies to: 35-35
wren-launcher/commands/dbt.go (2)
27-27: LGTM! Well-implemented command-line flag for staging model inclusion.

The --include-staging-models flag is properly added with a clear description that explains its purpose. This gives users control over whether staging models are included during conversion.
50-50: LGTM! Consistent implementation of the IncludeStagingModels parameter.

The parameter is properly threaded through both the DbtAutoConvert command and the DbtConvertProject function, with clean formatting and consistent assignment to the ConvertOptions struct. The function signature update maintains clarity and follows Go conventions.

Also applies to: 63-72
wren-launcher/commands/dbt/data_source_test.go (2)
377-470: LGTM! Comprehensive BigQuery conversion tests.

The test cases properly cover both supported BigQuery authentication methods ("service-account" and "service-account-json") and validate correct field extraction and assignment. The test structure follows established patterns and provides good coverage for the new BigQuery functionality.
472-544: LGTM! Thorough BigQuery validation test coverage.

The validation tests comprehensively cover both valid and invalid scenarios, including:

- Both supported authentication methods
- Missing required fields (project, dataset)
- Unsupported oauth method
- Method-specific requirements (keyfile for service-account)

This ensures the BigQuery data source validation behaves correctly across all expected use cases (a sketch of one such table-driven case follows).
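For illustration only, a rough sketch of one such table-driven validation case; the WrenBigQueryDataSource fields and Validate() method follow the names quoted elsewhere in this review, and the actual fixtures in the PR may differ:

```go
package dbt

import "testing"

// Sketch of a table-driven case for WrenBigQueryDataSource.Validate.
func TestWrenBigQueryDataSourceValidateSketch(t *testing.T) {
	cases := []struct {
		name    string
		ds      WrenBigQueryDataSource
		wantErr bool
	}{
		{"valid", WrenBigQueryDataSource{Project: "p", Dataset: "d", Credentials: "Zm9v"}, false},
		{"missing project", WrenBigQueryDataSource{Dataset: "d", Credentials: "Zm9v"}, true},
		{"missing dataset", WrenBigQueryDataSource{Project: "p", Credentials: "Zm9v"}, true},
		{"missing credentials", WrenBigQueryDataSource{Project: "p", Dataset: "d"}, true},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if err := tc.ds.Validate(); (err != nil) != tc.wantErr {
				t.Fatalf("Validate() error = %v, wantErr = %v", err, tc.wantErr)
			}
		})
	}
}
```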
wren-launcher/commands/dbt/wren_mdl.go (1)
7-11: LGTM! Clean struct additions for enhanced metadata support.

The new EnumDefinition and Metric structs are well-designed, and the extensions to WrenMDLManifest and WrenColumn are properly implemented with appropriate JSON tags and omitempty markers.

Also applies to: 15-19, 43-43, 60-68
wren-launcher/commands/dbt/converter.go (1)

431-433: LGTM! Staging model filtering is implemented correctly.

The filtering logic properly excludes models with .stg_ or .staging_ in their names when IncludeStagingModels is false.
Thanks @cougrimes for the contribution. I'll take a look 👍
Thanks @cougrimes for working on this. It's a great start for BigQuery and dbt semantic layer integration! 👍
I've left some comments regarding the connection info and launching steps.
For the dbt semantic layer integration, could you provide a sample semantic_manifest.json? I'm not very familiar with the dbt semantic layer, so a sample would help us confirm whether the conversion logic is reasonable.
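For reference, a heavily trimmed, illustrative shape of a semantic_manifest.json; the field names are based on the dbt semantic layer artifacts and on the fields this converter reads (semantic_models, node_relation, entities, measures, metrics), so an actual file from a real project will be larger and may differ in detail:

```json
{
  "semantic_models": [
    {
      "name": "orders",
      "node_relation": { "alias": "orders", "schema_name": "analytics" },
      "entities": [{ "name": "order_id", "type": "primary" }],
      "dimensions": [{ "name": "ordered_at", "type": "time" }],
      "measures": [{ "name": "order_total", "agg": "sum", "expr": "amount" }]
    }
  ],
  "metrics": [
    {
      "name": "revenue",
      "type": "simple",
      "type_params": { "measure": { "name": "order_total" } }
    }
  ]
}
```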
wren-launcher/commands/dbt.go (outdated)

	OutputDir            string
	ProfileName          string
	Target               string
	IncludeStagingModels bool
nice improvement 👍
Hi @cougrimes, after #1877, we added some formatter and lint tools for
@goldmedal Happy to make the changes requested, but I actually wound up hitting an unexpected issue when processing the MDL on my end: despite declaring BigQuery, I keep getting the MDL trying to parse as Trino rather than BQ. Do you have any other documentation on the MDL schema? I've been trying to work backward to figure out why this is happening to no avail.
What did you mean by the MDL being parsed as Trino? Which step does that? 🤔
You can check the doc for WrenMDL or the JSON schema.
Most of the issues I had been running into around Trino-esque errors are addressed in Canner/wren-engine#1290; the issues with keys and auth are resolved.
Actionable comments posted: 2
🔭 Outside diff range comments (1)
wren-launcher/commands/dbt/data_source.go (1)

4-11: Import os to support reading keyfile paths for service-account auth.

We'll need os.ReadFile when handling dbt BigQuery profiles that use method: service-account with a keyfile path (see below).

 import (
 	"encoding/base64"
 	"encoding/json"
 	"fmt"
+	"os"
 	"path/filepath"
 	"strings"
 	"github.com/pterm/pterm"
 )
♻️ Duplicate comments (1)
wren-launcher/commands/launch.go (1)

525-525: IncludeStagingModels is hardcoded to false; there's no way to enable it in the Launch flow.

This permanently disables staging models in this path. Per the PR objective and the previous reviewer note, expose a toggle here (prompt or config/flag) instead of hardcoding.

Apply this minimal change to pass a variable instead of a literal:

-	result, err := DbtConvertProject(dbtProjectPath, targetDir, profileName, target, true, false)
+	includeStagingModels := false // TODO: wire from config flag or prompt
+	result, err := DbtConvertProject(dbtProjectPath, targetDir, profileName, target, true, includeStagingModels)

If you want a quick interactive solution, add a prompt:

// place near other ask* helpers
func askForIncludeStagingModels() (bool, error) {
	prompt := promptui.Select{
		Label: "Include staging models (stg_*, staging_*)?",
		Items: []string{"No", "Yes"},
	}
	_, result, err := prompt.Run()
	if err != nil {
		return false, err
	}
	return result == "Yes", nil
}

Then in processDbtProject(), before calling DbtConvertProject:

includeStagingModels, _ := askForIncludeStagingModels() // default to No on error
🧹 Nitpick comments (4)
wren-launcher/commands/dbt/data_source.go (1)
272-283: Tighten validation by trimming whitespace.

Avoid accepting whitespace-only values for required fields.

 func (ds *WrenBigQueryDataSource) Validate() error {
-	if ds.Project == "" {
+	if strings.TrimSpace(ds.Project) == "" {
 		return fmt.Errorf("project_id cannot be empty")
 	}
-	if ds.Dataset == "" {
+	if strings.TrimSpace(ds.Dataset) == "" {
 		return fmt.Errorf("dataset_id cannot be empty")
 	}
-	if ds.Credentials == "" {
+	if strings.TrimSpace(ds.Credentials) == "" {
 		return fmt.Errorf("credentials cannot be empty")
 	}
 	return nil
 }

wren-launcher/commands/dbt/converter.go (3)
433-437: Reduce false positives when excluding staging models.

Checking substrings on nodeKey can skip non-staging models. Prefer checking the model name prefix.

-	if !includeStagingModels && (strings.Contains(nodeKey, ".stg_") || strings.Contains(nodeKey, ".staging_")) {
-		continue
-	}
+	if !includeStagingModels {
+		mn := getModelNameFromNodeKey(nodeKey)
+		if strings.HasPrefix(mn, "stg_") || strings.HasPrefix(mn, "staging_") {
+			continue
+		}
+	}
458-510: Possible duplicate relationships; add a cheap dedup by key.

If the same relationship is captured via both embedded and compiled tests, duplicates can be produced. Dedup on name+type+condition to keep the output clean.

 func generateRelationships(manifestData map[string]interface{}) []Relationship {
 	var relationships []Relationship
 	if nodes, ok := manifestData["nodes"].(map[string]interface{}); ok {
 		for nodeKey, nodeValue := range nodes {
 			nodeMap, ok := nodeValue.(map[string]interface{})
 			if !ok {
 				continue
 			}
@@
 		}
 	}
-	return relationships
+	// Dedup
+	seen := make(map[string]struct{}, len(relationships))
+	var unique []Relationship
+	for _, r := range relationships {
+		key := r.Name + "|" + r.JoinType + "|" + r.Condition
+		if _, ok := seen[key]; ok {
+			continue
+		}
+		seen[key] = struct{}{}
+		unique = append(unique, r)
+	}
+	return unique
 }
938-945: Nit: precompile the ref() regex once.

Compiling on each call is unnecessary. Cache it at package scope for micro-efficiency and readability.

You can move the regex to a package-level var:

var refRegex = regexp.MustCompile(`ref\s*\(\s*['"]([^'"]+)['"]\s*\)`)

Then in parseRef:

-	re := regexp.MustCompile(`ref\s*\(\s*['"]([^'"]+)['"]\s*\)`)
-	matches := re.FindStringSubmatch(refStr)
+	matches := refRegex.FindStringSubmatch(refStr)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
- wren-launcher/commands/dbt/converter.go (14 hunks)
- wren-launcher/commands/dbt/data_source.go (5 hunks)
- wren-launcher/commands/dbt/data_source_test.go (1 hunks)
- wren-launcher/commands/dbt/profiles_analyzer.go (2 hunks)
- wren-launcher/commands/launch.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- wren-launcher/commands/dbt/profiles_analyzer.go
- wren-launcher/commands/dbt/data_source_test.go
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-09T02:43:20.433Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:50-53
Timestamp: 2025-07-09T02:43:20.433Z
Learning: In wren-launcher/commands/dbt/data_source.go, when encountering unsupported database types in convertConnectionToDataSource function, the user prefers to log a warning and continue processing instead of failing. The function should return nil, nil to allow processing to continue.
Applied to files:
wren-launcher/commands/dbt/data_source.go
wren-launcher/commands/dbt/converter.go
📚 Learning: 2025-08-13T05:17:30.180Z
Learnt from: goldmedal
PR: Canner/WrenAI#1887
File: wren-launcher/commands/dbt/data_source.go:273-276
Timestamp: 2025-08-13T05:17:30.180Z
Learning: In the Wren AI codebase, there's an intentional distinction between dbt connection types and Wren AI data source types: dbt uses "sqlserver" as the connection type, while Wren AI expects "mssql" as the data source type. The conversion layer in convertConnectionToDataSource() correctly matches "sqlserver" from dbt profiles and the WrenMSSQLDataSource.GetType() method correctly returns "mssql" for Wren AI compatibility.
Applied to files:
wren-launcher/commands/dbt/data_source.go
📚 Learning: 2025-07-09T02:43:07.560Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:154-156
Timestamp: 2025-07-09T02:43:07.560Z
Learning: In the WrenAI codebase, the MapType methods for data sources should return the original type name for unknown types, rather than a default fallback like "string". This maintains consistency across all data source implementations.
Applied to files:
wren-launcher/commands/dbt/data_source.go
🧬 Code Graph Analysis (3)
wren-launcher/commands/launch.go (1)
wren-launcher/commands/dbt.go (1)
DbtConvertProject
(63-75)
wren-launcher/commands/dbt/data_source.go (1)
wren-launcher/commands/dbt/profiles.go (1)
DbtConnection
(16-40)
wren-launcher/commands/dbt/converter.go (3)
wren-launcher/commands/dbt/utils.go (1)
FileExists
(30-36)wren-launcher/commands/dbt/data_source.go (2)
WrenBigQueryDataSource
(260-264)DataSource
(43-47)wren-launcher/commands/dbt/wren_mdl.go (7)
WrenMDLManifest
(4-13)EnumDefinition
(16-19)WrenModel
(22-30)Relationship
(53-59)Metric
(62-69)View
(72-76)WrenColumn
(40-50)
🔇 Additional comments (5)
wren-launcher/commands/dbt/data_source.go (2)
77-89: BigQuery type dispatch looks good.

Routing "bigquery" to a dedicated converter aligns with the DataSource pattern and the prior guidance to warn-and-continue on unsupported types.

286-309: BigQuery type mapping: LGTM.

Covers the common BigQuery types and correctly falls back to the original type for unknowns, per codebase convention.
wren-launcher/commands/dbt/converter.go (3)
172-180: BigQuery datasource output structure matches engine expectations and avoids credential leakage.

Emits project_id, dataset_id, and credentials (base64) only, and writes with 0600. This aligns with the engine schema and prior security guidance.
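A rough sketch of the flow being approved here; the property names (project_id, dataset_id, credentials) come from this review, while the surrounding JSON structure and function name are assumptions:

```go
package dbt

import (
	"encoding/base64"
	"encoding/json"
	"os"
)

// writeBigQueryDataSource is illustrative only: it base64-encodes the service
// account key and writes the datasource file with owner-only permissions.
func writeBigQueryDataSource(path, project, dataset string, keyfileJSON []byte) error {
	payload := map[string]interface{}{
		"type": "bigquery",
		"properties": map[string]string{
			"project_id":  project,
			"dataset_id":  dataset,
			"credentials": base64.StdEncoding.EncodeToString(keyfileJSON),
		},
	}
	data, err := json.MarshalIndent(payload, "", "  ")
	if err != nil {
		return err
	}
	// 0600 keeps the credential-bearing file readable by the owner only.
	return os.WriteFile(path, data, 0o600)
}
```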
839-847: DisplayName mapping from dbt meta.label: LGTM.

Correctly reads meta.label into WrenColumn.DisplayName as per the PR objective.
328-336: Verify MDL dataSources field usage.

I wasn't able to locate any MDL schema or engine code in this repo that defines how the dataSources property is interpreted. If the engine actually expects a path (for example "wren-datasource.json") rather than the connector type string, we should either:

- Populate manifest.DataSources with the relative JSON filename, or
- Leave it empty and let the co-located wren-datasource.json be picked up.

Would you like to wire ConvertDbtProjectCore to pass the generated datasource path into ConvertDbtCatalogToWrenMDL so this field is populated correctly?
Actionable comments posted: 2
♻️ Duplicate comments (3)
wren-launcher/commands/launch.go (1)
190-201: Nice addition: interactive staging-models prompt.

This addresses the earlier feedback to expose IncludeStagingModels via an extra step. The prompt UX is clear and returns a sane bool.
wren-launcher/commands/dbt/converter.go (2)
861-869: DisplayName mapping from meta.label: LGTM.

Column.DisplayName is now sourced from meta.label per the PR objectives. Thanks for addressing the earlier gap.

655-784: Metric conversion covers simple, ratio, and derived correctly.

The ratio handling now composes expressions using each measure's agg and expr metadata, avoiding bare-name division. Solid implementation.
🧹 Nitpick comments (2)
wren-launcher/commands/launch.go (1)
537-546: Wire-up looks correct; minor defensive nit.

The call now passes the new IncludeStagingModels argument correctly. If the prompt errors, you already warn and fall back to the default (false). To make the fallback explicit and clearer to readers, consider setting includeStagingModels = false in the error branch.

 includeStagingModels, err := askForIncludeStagingModels()
 if err != nil {
 	pterm.Warning.Println("Could not get staging model preference, defaulting to 'No'.")
+	includeStagingModels = false
 }
wren-launcher/commands/dbt/data_source.go (1)
151-231: BigQuery auth handling is robust; add small trims for resilience.

Good coverage of service-account-json (inline), service-account (file path), the fallback when method is empty, and the explicit skipping of oauth per the PR scope. To harden against whitespace in profiles, trim keyfile_json and the keyfile path before use.
case "service-account-json": // Extract inline JSON from Additional["keyfile_json"] var keyfileJSON string if kfj, exists := conn.Additional["keyfile_json"]; exists { if kfjStr, ok := kfj.(string); ok { - keyfileJSON = kfjStr + keyfileJSON = strings.TrimSpace(kfjStr) } } @@ case "service-account", "": // Prefer structured field; fall back to Additional["keyfile"] - keyfilePath := strings.TrimSpace(conn.Keyfile) + keyfilePath := strings.TrimSpace(conn.Keyfile) if keyfilePath == "" { if kf, ok := conn.Additional["keyfile"]; ok { if kfStr, ok := kf.(string); ok { - keyfilePath = strings.TrimSpace(kfStr) + keyfilePath = strings.TrimSpace(kfStr) } } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- wren-launcher/commands/dbt/converter.go (14 hunks)
- wren-launcher/commands/dbt/data_source.go (5 hunks)
- wren-launcher/commands/launch.go (3 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-09T02:43:20.433Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:50-53
Timestamp: 2025-07-09T02:43:20.433Z
Learning: In wren-launcher/commands/dbt/data_source.go, when encountering unsupported database types in convertConnectionToDataSource function, the user prefers to log a warning and continue processing instead of failing. The function should return nil, nil to allow processing to continue.
Applied to files:
wren-launcher/commands/dbt/data_source.go
wren-launcher/commands/dbt/converter.go
📚 Learning: 2025-08-13T05:17:30.180Z
Learnt from: goldmedal
PR: Canner/WrenAI#1887
File: wren-launcher/commands/dbt/data_source.go:273-276
Timestamp: 2025-08-13T05:17:30.180Z
Learning: In the Wren AI codebase, there's an intentional distinction between dbt connection types and Wren AI data source types: dbt uses "sqlserver" as the connection type, while Wren AI expects "mssql" as the data source type. The conversion layer in convertConnectionToDataSource() correctly matches "sqlserver" from dbt profiles and the WrenMSSQLDataSource.GetType() method correctly returns "mssql" for Wren AI compatibility.
Applied to files:
wren-launcher/commands/dbt/data_source.go
📚 Learning: 2025-07-09T02:43:07.560Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:154-156
Timestamp: 2025-07-09T02:43:07.560Z
Learning: In the WrenAI codebase, the MapType methods for data sources should return the original type name for unknown types, rather than a default fallback like "string". This maintains consistency across all data source implementations.
Applied to files:
wren-launcher/commands/dbt/data_source.go
🧬 Code Graph Analysis (3)
wren-launcher/commands/dbt/data_source.go (1)
wren-launcher/commands/dbt/profiles.go (1)
DbtConnection
(16-40)
wren-launcher/commands/launch.go (1)
wren-launcher/commands/dbt.go (1)
DbtConvertProject
(63-75)
wren-launcher/commands/dbt/converter.go (3)
wren-launcher/commands/dbt/utils.go (1)
FileExists
(30-36)wren-launcher/commands/dbt/data_source.go (2)
WrenBigQueryDataSource
(312-316)DataSource
(44-48)wren-launcher/commands/dbt/wren_mdl.go (7)
WrenMDLManifest
(4-13)EnumDefinition
(16-19)WrenModel
(22-30)Relationship
(53-59)Metric
(62-69)View
(72-76)WrenColumn
(40-50)
🔇 Additional comments (5)
wren-launcher/commands/launch.go (1)
190-202: No duplicate definitions of askForIncludeStagingModels found.

Verified with rg -nC2 --type go -P '\bfunc\s+askForIncludeStagingModels\s*\('; only one occurrence exists at wren-launcher/commands/launch.go:191. No redeclaration issues detected.

wren-launcher/commands/dbt/data_source.go (1)

83-85: BigQuery dispatch added correctly.

convertConnectionToDataSource now routes "bigquery" to the dedicated converter, aligning with the new data source support.
wren-launcher/commands/dbt/converter.go (3)
59-64: Optional files handled gracefully.

Optional manifest.json and semantic_manifest.json are discovered and logged with clear fallbacks. This makes the conversion resilient to missing artifacts.

Also applies to: 85-92

172-180: BigQuery data source output matches engine requirements.

Properties use project_id, dataset_id, and base64-encoded credentials as requested in the engine docs and maintainer feedback. File permissions for the output are secure elsewhere (0600).

433-437: Staging model exclusion implemented as specified.

The includeStagingModels filter correctly skips stg_* and staging_* when disabled. Simple and effective.
Actionable comments posted: 0
♻️ Duplicate comments (1)
wren-launcher/commands/launch.go (1)
14-14: Fix invalid import path (breaks build).

The import has a typo ("github.comcom"). This will prevent compilation.

Apply this diff:

-	"github.comcom/Canner/WrenAI/wren-launcher/config"
+	"github.com/Canner/WrenAI/wren-launcher/config"
🧹 Nitpick comments (1)
wren-launcher/commands/launch.go (1)
536-544: Make the "No" fallback explicit on prompt failure; consider a non-interactive override.

Currently, if the prompt fails, includeStagingModels is already false due to the function's (false, err) return, but making that explicit improves readability and future-proofing. Also consider honoring a non-interactive flag/env (e.g., via config) before prompting so CI/automation can bypass TTY interaction.

Minimal explicit fallback:

 // Ask the user whether to include staging models
 includeStagingModels, err := askForIncludeStagingModels()
 if err != nil {
 	pterm.Warning.Println("Could not get staging model preference, defaulting to 'No'.")
+	includeStagingModels = false
 }
If you’d like, I can wire this to a config/env-driven value (e.g., config.GetIncludeStagingModels()) with the prompt as a fallback.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- wren-launcher/commands/launch.go (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
wren-launcher/commands/launch.go (1)
wren-launcher/commands/dbt.go (1)
DbtConvertProject
(63-75)
🔇 Additional comments (1)
wren-launcher/commands/launch.go (1)
190-201: All good: no duplicate definitions found.

I've verified that askForIncludeStagingModels is defined only once in wren-launcher/commands/launch.go, so there's no risk of a redefinition compile error. Approving these changes.
Actionable comments posted: 0
♻️ Duplicate comments (3)
wren-launcher/commands/dbt/converter.go (3)
447-451: Relationship generation and de-dup: the fix looks correct.

The de-dup key concatenation is now valid Go and the stray brace is gone. Relationship extraction from both column-embedded tests and compiled test nodes is comprehensive.

Also applies to: 460-524

654-783: Metrics: correct use of measure agg/expr in ratio/simple metrics.

This addresses the earlier concern about bare names; building expressions as AGG(expr) and using proper parentheses for ratios is correct.
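A small sketch of the expression composition described above, i.e. building (AGG(expr))/(AGG(expr)) rather than dividing bare measure names; the type and helper are illustrative, not the PR's actual code:

```go
package dbt

import (
	"fmt"
	"strings"
)

// measureInfo carries a dbt measure's aggregation and expression (illustrative).
type measureInfo struct {
	Agg  string // e.g. "sum", "count"
	Expr string // e.g. "amount"
}

// ratioExpression parenthesizes each side so operator precedence cannot change
// the result of the ratio metric.
func ratioExpression(numerator, denominator measureInfo) string {
	return fmt.Sprintf("(%s(%s)) / (%s(%s))",
		strings.ToUpper(numerator.Agg), numerator.Expr,
		strings.ToUpper(denominator.Agg), denominator.Expr)
}
```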
860-868: Column DisplayName mapping from meta.label: LGTM.

This fulfills the PR objective and matches the WrenColumn API.
🧹 Nitpick comments (3)
wren-launcher/commands/launch.go (1)
536-541: Graceful fallback on prompt error is good; consider a non-interactive override.

Defaulting to "No" on prompt failure is sensible. For CI/non-interactive scenarios, consider allowing an override via a flag/env/config (e.g., WREN_INCLUDE_STAGING_MODELS=true) to skip the prompt entirely when set.

Would you like me to wire an env-based override and document it?

wren-launcher/commands/dbt/converter.go (2)
wren-launcher/commands/dbt/converter.go (2)
433-438: Staging model filter: consider making the prefixes configurable.

The stg_/staging_ prefixes are a good default. Consider making the prefix list configurable (via opts or config/env) to support different naming conventions.

936-945: Guard getMapFromMap against nil parent maps.

Minor robustness tweak: if m is nil, return defaultValue to mirror getStringFromMap.
Apply this diff:
 func getMapFromMap(m map[string]interface{}, key string, defaultValue map[string]interface{}) map[string]interface{} {
+	if m == nil {
+		return defaultValue
+	}
 	if value, exists := m[key]; exists {
 		if str, ok := value.(map[string]interface{}); ok {
 			return str
 		}
 	}
 	return defaultValue
 }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- wren-launcher/commands/dbt/converter.go (14 hunks)
- wren-launcher/commands/launch.go (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-09T02:43:20.433Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:50-53
Timestamp: 2025-07-09T02:43:20.433Z
Learning: In wren-launcher/commands/dbt/data_source.go, when encountering unsupported database types in convertConnectionToDataSource function, the user prefers to log a warning and continue processing instead of failing. The function should return nil, nil to allow processing to continue.
Applied to files:
wren-launcher/commands/dbt/converter.go
🧬 Code Graph Analysis (2)
wren-launcher/commands/launch.go (1)
wren-launcher/commands/dbt.go (1)
DbtConvertProject
(63-75)
wren-launcher/commands/dbt/converter.go (3)
wren-launcher/commands/dbt/utils.go (1)
FileExists
(30-36)wren-launcher/commands/dbt/data_source.go (2)
WrenBigQueryDataSource
(312-316)DataSource
(44-48)wren-launcher/commands/dbt/wren_mdl.go (7)
WrenMDLManifest
(4-13)EnumDefinition
(16-19)WrenModel
(22-30)Relationship
(53-59)Metric
(62-69)View
(72-76)WrenColumn
(40-50)
🔇 Additional comments (9)
wren-launcher/commands/launch.go (2)
190-200: Include-staging prompt: LGTM; no duplicate definition detected.

The prompt implementation is clean and returns a boolean as expected. Also, despite an earlier summary claiming a duplicate, this file now contains a single definition of askForIncludeStagingModels.

544-545: All DbtConvertProject call sites updated.

I ran the grep across the Go code and found only the function definition and the single invocation in wren-launcher/commands/launch.go, both using the new six-argument signature. No remaining calls with the old arity exist.

wren-launcher/commands/dbt/converter.go (7)
20-27: New ConvertOptions.IncludeStagingModels: LGTM.

Adding this to the options keeps the flow explicit and testable.

85-92: Optional semantic manifest handling: LGTM.

Robust to missing/malformed semantic_manifest.json with clear warnings. Good trade-off for resilience.

213-237: Helpful conversion summary: LGTM.

The added counts for relationships, metrics, and enums improve UX and debuggability.

326-336: Manifest initialization extended with Enums/Metrics/DataSources: LGTM.

Types align with wren_mdl.go and keep the output self-descriptive.

344-395: Pre-processing for not_null/accepted_values on struct fields: LGTM.

Solid coverage of both top-level and embedded field tests with enum de-dup. Nice.

907-911: PK mapping via semantic_models alias: verify the alias aligns with the catalog model name.

Semantic model node_relation.alias may differ from the dbt model name derived from the catalog key. If you see sparse PKs in the output, consider a fallback using the dbt model name or a map from alias to model name (a sketch follows).
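A minimal sketch of the suggested fallback (hypothetical helper; the real mapping would be built from the catalog):

```go
package dbt

// resolveModelForAlias prefers an explicit alias-to-model lookup built from the
// catalog and otherwise assumes the semantic model's alias equals the dbt model name.
func resolveModelForAlias(alias string, aliasToModel map[string]string) string {
	if model, ok := aliasToModel[alias]; ok {
		return model
	}
	return alias
}
```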
172-180: Add a cautionary warning for BigQuery credentials.

Writing base64-encoded credentials is necessary but sensitive; even with 0600 perms, it's worth flagging for end users. After the data source file is written, print a clear warning and double-check the property names against your MDL schema.

- After:

pterm.Success.Printf("✓ WrenDataSource saved to: %s\n", dataSourcePath)

insert:

if ds != nil && ds.GetType() == "bigquery" {
	pterm.Warning.Println("Note: wren-datasource.json contains BigQuery credentials. Treat it as sensitive and restrict access.")
}

- Ensure the JSON keys "project_id", "dataset_id", and "credentials" exactly match what your engine's MDL/schema expects for BigQuery. The schema isn't in this repo; please verify against your external specs or documentation.
Actionable comments posted: 0
🧹 Nitpick comments (3)
wren-launcher/commands/dbt/data_source.go (3)
183-207: Minor: clarify the error when method is omitted.

When method is empty you treat it like service-account (good), but the error message only mentions service-account. A slight wording tweak improves UX.

-	return nil, fmt.Errorf("bigquery: method 'service-account' requires 'keyfile' path")
+	return nil, fmt.Errorf("bigquery: auth method 'service-account' (or omitted) requires 'keyfile' path")
347-370: Nit: reuse the shared type constants for consistency.

You already declare integerType, doubleType, varcharType, booleanType, dateType, and timestampType at the top. Returning those improves consistency and avoids typos.
func (ds *WrenBigQueryDataSource) MapType(sourceType string) string { switch strings.ToUpper(sourceType) { case "INT64", "INTEGER": - return "integer" + return integerType case "FLOAT64", "FLOAT": - return "double" + return doubleType case "STRING": - return "varchar" + return varcharType case "BOOL", "BOOLEAN": - return "boolean" + return booleanType case "DATE": - return "date" + return dateType case "TIMESTAMP", "DATETIME": - return "timestamp" + return timestampType case "NUMERIC", "DECIMAL", "BIGNUMERIC": - return "double" + return doubleType case "BYTES": - return "varchar" + return varcharType case "JSON": - return "varchar" + return varcharType default: return strings.ToLower(sourceType) } }
209-214: Optional: expand environment variables and "~" in keyfile paths.

To make keyfile resolution more robust when dbt profiles specify paths using "~" or environment variables, consider updating lines 209-214 in wren-launcher/commands/dbt/data_source.go as follows:

-	// If keyfile path is not absolute, join it
-	// with the dbt project's home directory path.
-	resolvedKeyfilePath := keyfilePath
-	if !filepath.IsAbs(keyfilePath) && dbtHomePath != "" {
-		resolvedKeyfilePath = filepath.Join(dbtHomePath, keyfilePath)
-	}
+	// Expand env vars and ~ in keyfile path, then resolve relative to dbtHomePath if provided.
+	expanded := os.ExpandEnv(keyfilePath)
+	if strings.HasPrefix(expanded, "~") {
+		if home, err := os.UserHomeDir(); err == nil {
+			expanded = filepath.Join(home, strings.TrimPrefix(expanded, "~"))
+		}
+	}
+	resolvedKeyfilePath := expanded
+	if !filepath.IsAbs(resolvedKeyfilePath) && dbtHomePath != "" {
+		resolvedKeyfilePath = filepath.Join(dbtHomePath, resolvedKeyfilePath)
+	}

We verified that in production code (converter.go), profile conversion goes through GetActiveDataSources, which correctly passes dbtHomePath. This change will therefore handle both relative and env-var/tilde-based keyfile paths in the main conversion flow.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- wren-launcher/commands/dbt/data_source.go (4 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-09T02:43:20.433Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:50-53
Timestamp: 2025-07-09T02:43:20.433Z
Learning: In wren-launcher/commands/dbt/data_source.go, when encountering unsupported database types in convertConnectionToDataSource function, the user prefers to log a warning and continue processing instead of failing. The function should return nil, nil to allow processing to continue.
Applied to files:
wren-launcher/commands/dbt/data_source.go
📚 Learning: 2025-08-13T05:17:30.180Z
Learnt from: goldmedal
PR: Canner/WrenAI#1887
File: wren-launcher/commands/dbt/data_source.go:273-276
Timestamp: 2025-08-13T05:17:30.180Z
Learning: In the Wren AI codebase, there's an intentional distinction between dbt connection types and Wren AI data source types: dbt uses "sqlserver" as the connection type, while Wren AI expects "mssql" as the data source type. The conversion layer in convertConnectionToDataSource() correctly matches "sqlserver" from dbt profiles and the WrenMSSQLDataSource.GetType() method correctly returns "mssql" for Wren AI compatibility.
Applied to files:
wren-launcher/commands/dbt/data_source.go
📚 Learning: 2025-07-09T02:43:07.560Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:154-156
Timestamp: 2025-07-09T02:43:07.560Z
Learning: In the WrenAI codebase, the MapType methods for data sources should return the original type name for unknown types, rather than a default fallback like "string". This maintains consistency across all data source implementations.
Applied to files:
wren-launcher/commands/dbt/data_source.go
🧬 Code Graph Analysis (1)
wren-launcher/commands/dbt/data_source.go (1)
wren-launcher/commands/dbt/profiles.go (1)
DbtConnection
(16-40)
🔇 Additional comments (7)
wren-launcher/commands/dbt/data_source.go (7)
4-7: Imports for base64/json/os are appropriate.

All three are used by the BigQuery converter. No issues.

83-85: Correct dispatch to the BigQuery converter with dbtHomePath threading.

Routing "bigquery" here and passing dbtHomePath enables proper resolution of relative keyfile paths downstream. Looks good.

157-165: Good: validate service account JSON before base64 encoding.

The encodeJSON helper prevents encoding malformed credentials and keeps the original bytes for encoding. Solid approach.

166-183: Service-account-json flow is correct.

Pulls from Additional["keyfile_json"], validates, and encodes. Clear error when missing. LGTM.

226-232: Graceful handling of unsupported oauth methods.

Warning + skip aligns with the project preference to continue processing on unsupported configurations.

234-240: BigQuery data source construction aligns with the Wren engine schema.

Fields and JSON tags ("project_id", "dataset_id", "credentials") match the documented shape.

333-344: Validation covers required fields.

Trimming and checking project_id, dataset_id, and credentials ensures correctness at emit time.
hi @cougrimes
Hi @cougrimes, are you still working on this? I'd be happy to help resolve the conflicts and get all the tests passing.
Actionable comments posted: 0
♻️ Duplicate comments (1)
wren-launcher/commands/dbt.go (1)
15-20: IncludeStagingModels flag plumbing in opts: LGTM.

The new field is well-scoped and consistent with the downstream ConvertOptions.
🧹 Nitpick comments (2)
wren-launcher/commands/dbt.go (2)
31-41: Usage text omits new and existing flags; print generated defaults instead.

Hard-coded usage examples can drift (e.g., missing --include-staging-models, --profile, --target). Print the flag defaults so the help stays accurate.
Apply this diff:
if opts.ProjectPath == "" { pterm.Error.Println("Error: --path parameter is required") - pterm.Info.Println("Usage: wren-launcher dbt-auto-convert --path /path/to/dbt/project --output /path/to/output") + pterm.Info.Println("Usage: wren-launcher dbt-auto-convert [flags]") + flag.PrintDefaults() os.Exit(1) } if opts.OutputDir == "" { pterm.Error.Println("Error: --output parameter is required") - pterm.Info.Println("Usage: wren-launcher dbt-auto-convert --path /path/to/dbt/project --output /path/to/output") + pterm.Info.Println("Usage: wren-launcher dbt-auto-convert [flags]") + flag.PrintDefaults() os.Exit(1) }
61-72: Prefer an options-object API; rename IncludeStagingModels → includeStagingModels and fix the docs.

Rename the trailing bool param to lowerCamelCase and remove the stale reference to processDbtProject; the name-only change is safe and requires no caller edits (1 internal caller found at wren-launcher/commands/launch.go:544). Optional: migrate the trailing bools to a dbt.ConvertOptions-based API (e.g., add DbtConvertProjectWithOptions) to future-proof; if you do, update the caller(s) above. A sketch of such a wrapper follows.
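A sketch of the optional options-based wrapper; the field names are illustrative (the meaning of the existing fifth boolean is not shown in this thread, so it gets a placeholder name), and the return types assume the current DbtConvertProject signature:

```go
// DbtConvertProjectOptions is a sketch of an options struct that keeps the
// exported surface stable as flags accumulate; field names are illustrative.
type DbtConvertProjectOptions struct {
	ProjectPath          string
	OutputDir            string
	ProfileName          string
	Target               string
	LaunchedFromUI       bool // placeholder name for the existing fifth boolean
	IncludeStagingModels bool
}

// DbtConvertProjectWithOptions delegates to the existing six-argument function
// so current callers can migrate gradually.
func DbtConvertProjectWithOptions(o DbtConvertProjectOptions) (*dbt.ConvertResult, error) {
	return DbtConvertProject(o.ProjectPath, o.OutputDir, o.ProfileName, o.Target,
		o.LaunchedFromUI, o.IncludeStagingModels)
}
```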
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- wren-ui/yarn.lock is excluded by !**/yarn.lock, !**/*.lock
📒 Files selected for processing (2)
- wren-launcher/commands/dbt.go (3 hunks)
- wren-launcher/commands/launch.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- wren-launcher/commands/launch.go
🧰 Additional context used
🧬 Code graph analysis (1)
wren-launcher/commands/dbt.go (1)
wren-launcher/commands/dbt/converter.go (2)
ConvertResult
(30-34)ConvertOptions
(19-27)
🔇 Additional comments (2)
wren-launcher/commands/dbt.go (2)
27-27: CLI flag exposure: LGTM.

Flag name and description are clear and consistent with the behavior.

45-51: Approve: IncludeStagingModels propagated and honored.

Confirmed: opts.IncludeStagingModels is passed to ConvertDbtCatalogToWrenMDL and used to skip models with names starting with "stg_" or "staging_". See wren-launcher/commands/dbt/converter.go:213 and 433-436.
Actionable comments posted: 0
♻️ Duplicate comments (3)
wren-launcher/commands/dbt/data_source.go (1)
347-371: BigQuery type mappings: LGTM.

Mappings cover the core BigQuery types; unknowns fall back to the original as expected.

wren-launcher/commands/dbt/converter.go (2)

846-864: Ratio metric aggregation fix: LGTM.

Now builds (AGG(expr))/(AGG(expr)); correct precedence and fidelity.

917-923: DisplayName mapping from meta.label: LGTM.

Matches the PR objective and struct schema.
🧹 Nitpick comments (6)
wren-launcher/commands/dbt/data_source.go (2)
235-241: Validate the BigQuery DS before returning to surface config errors early.

Return a clear error instead of deferring to later validation.
Apply this diff:
ds := &WrenBigQueryDataSource{ Project: conn.Project, Dataset: conn.Dataset, Credentials: credentials, } - return ds, nil + if err := ds.Validate(); err != nil { + return nil, fmt.Errorf("bigquery: invalid connection: %w", err) + } + return ds, nil
471-487: Keep DATE as date (don't upcast to timestamp) in DefaultDataSource.

Preserve semantic fidelity for generic conversions.
Apply this diff:
case "timestamp", "datetime", "date": - return "timestamp" + if strings.ToLower(sourceType) == "date" { + return "date" + } + return "timestamp"wren-launcher/commands/dbt/data_source_test.go (2)
417-451: Silence secret scanners on test fixtures (base64 "credentials").

Gitleaks flags these constants; annotate them as test fixtures.
Apply this diff:
ds: &WrenBigQueryDataSource{ Project: "test-project", Dataset: "test-dataset", - Credentials: "dGVzdC1jcmVkZW50aWFscw==", // "test-credentials" + Credentials: "dGVzdC1jcmVkZW50aWFscw==", // "test-credentials" #nosec G101 gitleaks:allow (test fixture) },Repeat the inline comment on the other base64 credential in this table to avoid future false positives.
254-407: Add a negative test for the oauth method (should skip/return none).

Ensures unsupported auth is gracefully ignored.
Apply this diff to append a subtest:
@@ t.Run("service-account-with-relative-keyfile-path", func(t *testing.T) { ... }) + + t.Run("oauth (unsupported) is skipped", func(t *testing.T) { + profiles := &DbtProfiles{ + Profiles: map[string]DbtProfile{ + "test_profile": { + Target: "dev", + Outputs: map[string]DbtConnection{ + "dev": { + Type: "bigquery", + Method: "oauth", + Project: "test-project", + Dataset: "test-dataset", + }, + }, + }, + }, + } + dataSources, err := GetActiveDataSources(profiles, "", "test_profile", "dev") + if err != nil { + t.Fatalf("GetActiveDataSources failed: %v", err) + } + if len(dataSources) != 0 { + t.Fatalf("Expected 0 data sources for oauth, got %d", len(dataSources)) + } + })wren-launcher/commands/dbt/converter.go (2)
189-201: Validate the DS before writing wren-datasource.json.

Prevents emitting unusable configs.
Apply this diff:
- // Write WrenDataSource JSON + // Validate before writing WrenDataSource JSON + if err := ds.Validate(); err != nil { + return nil, fmt.Errorf("invalid data source: %w", err) + } + // Write WrenDataSource JSON
277-306: Annotate os.ReadFile calls to quiet gosec/G304 on controlled paths.

Inputs are app-controlled; add explicit nosec comments.
Apply this diff:
- catalogBytes, err := os.ReadFile(filepath.Clean(catalogPath)) + catalogBytes, err := os.ReadFile(filepath.Clean(catalogPath)) // #nosec G304 @@ - manifestBytes, err := os.ReadFile(filepath.Clean(manifestPath)) + manifestBytes, err := os.ReadFile(filepath.Clean(manifestPath)) // #nosec G304 @@ - semanticBytes, err := os.ReadFile(filepath.Clean(semanticManifestPath)) + semanticBytes, err := os.ReadFile(filepath.Clean(semanticManifestPath)) // #nosec G304
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- wren-launcher/commands/dbt/converter.go (7 hunks)
- wren-launcher/commands/dbt/data_source.go (9 hunks)
- wren-launcher/commands/dbt/data_source_test.go (6 hunks)
- wren-launcher/commands/dbt/profiles.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- wren-launcher/commands/dbt/profiles.go
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-09T02:43:20.433Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:50-53
Timestamp: 2025-07-09T02:43:20.433Z
Learning: In wren-launcher/commands/dbt/data_source.go, when encountering unsupported database types in convertConnectionToDataSource function, the user prefers to log a warning and continue processing instead of failing. The function should return nil, nil to allow processing to continue.
Applied to files:
wren-launcher/commands/dbt/converter.go
wren-launcher/commands/dbt/data_source.go
📚 Learning: 2025-08-13T05:17:30.180Z
Learnt from: goldmedal
PR: Canner/WrenAI#1887
File: wren-launcher/commands/dbt/data_source.go:273-276
Timestamp: 2025-08-13T05:17:30.180Z
Learning: In the Wren AI codebase, there's an intentional distinction between dbt connection types and Wren AI data source types: dbt uses "sqlserver" as the connection type, while Wren AI expects "mssql" as the data source type. The conversion layer in convertConnectionToDataSource() correctly matches "sqlserver" from dbt profiles and the WrenMSSQLDataSource.GetType() method correctly returns "mssql" for Wren AI compatibility.
Applied to files:
wren-launcher/commands/dbt/data_source.go
📚 Learning: 2025-07-09T02:43:07.560Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:154-156
Timestamp: 2025-07-09T02:43:07.560Z
Learning: In the WrenAI codebase, the MapType methods for data sources should return the original type name for unknown types, rather than a default fallback like "string". This maintains consistency across all data source implementations.
Applied to files:
wren-launcher/commands/dbt/data_source.go
🧬 Code graph analysis (3)
wren-launcher/commands/dbt/converter.go (3)
wren-launcher/commands/dbt/utils.go (1)
FileExists
(30-36)wren-launcher/commands/dbt/data_source.go (2)
WrenBigQueryDataSource
(322-326)DataSource
(44-48)wren-launcher/commands/dbt/wren_mdl.go (8)
WrenMDLManifest
(4-13)EnumDefinition
(16-19)WrenModel
(22-30)Relationship
(53-59)Metric
(62-69)View
(72-76)WrenColumn
(40-50)TableReference
(33-37)
wren-launcher/commands/dbt/data_source.go (1)
wren-launcher/commands/dbt/profiles.go (1)
DbtConnection
(16-43)
wren-launcher/commands/dbt/data_source_test.go (2)
wren-launcher/commands/dbt/profiles.go (3)
DbtProfiles
(4-7)DbtProfile
(10-13)DbtConnection
(16-43)wren-launcher/commands/dbt/data_source.go (7)
ValidateAllDataSources
(434-456)GetActiveDataSources
(376-414)WrenBigQueryDataSource
(322-326)DataSource
(44-48)WrenLocalFileDataSource
(243-246)DefaultDataSource
(459-459)WrenPostgresDataSource
(287-293)
🪛 Gitleaks (8.27.2)
wren-launcher/commands/dbt/data_source_test.go
[high] 420-420: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 429-429: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 438-438: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🔇 Additional comments (8)
wren-launcher/commands/dbt/data_source.go (2)
104-107: Postgres default port fallback looks good.

Using 5432 when unspecified is correct and validated later.

209-226: BigQuery keyfile path resolution (relative → project root) is correct.

Joining non-absolute keyfile paths with dbtHomePath prevents cwd surprises.
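The resolution rule being approved is essentially the following (illustrative helper name):

```go
package dbt

import "path/filepath"

// resolveKeyfilePath joins a relative keyfile path with the dbt project root so the
// result does not depend on the launcher's working directory.
func resolveKeyfilePath(keyfilePath, dbtHomePath string) string {
	if keyfilePath == "" || filepath.IsAbs(keyfilePath) || dbtHomePath == "" {
		return keyfilePath
	}
	return filepath.Join(dbtHomePath, keyfilePath)
}
```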
wren-launcher/commands/dbt/data_source_test.go (3)
265-313: BigQuery service-account-json test is solid.

Covers the inline JSON path and validates credentials encoding.

314-357: BigQuery absolute keyfile test is solid.

Good fixture handling and base64 verification.

358-406: BigQuery relative keyfile resolution test is solid.

Exercises the dbtHomePath join correctly.

wren-launcher/commands/dbt/converter.go (3)

172-179: BigQuery datasource output matches the engine schema.

Uses project_id, dataset_id, and credentials as required.

359-363: Staging model gating is correct.

The prefix check matches the stg_/staging_ requirement.

527-537: Relationship de-duplication is correct.

Keying on Name|JoinType|Condition prevents duplicates efficiently.
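The pattern being approved keys each relationship on Name|JoinType|Condition; a minimal sketch (the Relationship fields follow the references in this review):

```go
package dbt

// dedupeRelationships keeps the first occurrence of each Name|JoinType|Condition
// combination and drops the rest.
func dedupeRelationships(rels []Relationship) []Relationship {
	seen := make(map[string]struct{}, len(rels))
	out := make([]Relationship, 0, len(rels))
	for _, r := range rels {
		key := r.Name + "|" + r.JoinType + "|" + r.Condition
		if _, ok := seen[key]; ok {
			continue
		}
		seen[key] = struct{}{}
		out = append(out, r)
	}
	return out
}
```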
Actionable comments posted: 0
🧹 Nitpick comments (10)
wren-launcher/commands/dbt/data_source.go (5)
83-85: BigQuery branch: avoid failures when dbtHomePath is unknown (enumeration paths).

FromDbtProfiles calls convertConnectionToDataSource with an empty dbtHomePath; relative BigQuery keyfile paths will fail to read and abort enumeration. Prefer logging and skipping (nil, nil) in that specific case to align with the "warn and continue" convention.
@@ case "service-account", "": - if keyfilePath == "" { + if keyfilePath == "" { ... - } else { + } else { + if !filepath.IsAbs(keyfilePath) && dbtHomePath == "" { + pterm.Warning.Printf("bigquery: relative keyfile '%s' with unknown dbt home; skipping data source\n", keyfilePath) + return nil, nil + } // If keyfile path is not absolute, join it // with the dbt project's home directory path. resolvedKeyfilePath := keyfilePath if !filepath.IsAbs(keyfilePath) && dbtHomePath != "" { resolvedKeyfilePath = filepath.Join(dbtHomePath, keyfilePath) }
112-150: Local file URL resolves to a directory (Dir(path)); confirm the engine expectation.

Is returning the directory (not the file) intentional? If the engine expects the file path (e.g., a DuckDB .duckdb file), this may misconfigure the source.
152-241: BigQuery credentials path and inline JSON: solid; add early required-field validation.

After the credentials are built, fail early if project or dataset is empty to avoid writing an unusable wren-datasource.json.
@@ switch method { ... } - ds := &WrenBigQueryDataSource{ + if strings.TrimSpace(conn.Project) == "" || strings.TrimSpace(conn.Dataset) == "" { + return nil, fmt.Errorf("bigquery: project and dataset are required") + } + ds := &WrenBigQueryDataSource{ Project: conn.Project, Dataset: conn.Dataset, Credentials: credentials, }
347-371: BigQuery MapType: consider TIME and GEOGRAPHY mappings.

Optional: handle "TIME" and "GEOGRAPHY" explicitly (the fallback currently lowercases them). If Wren lacks native types, map them to varchar.
case "BYTES": return varcharType case "JSON": return varcharType +case "TIME": + return varcharType +case "GEOGRAPHY": + return varcharType
471-487: DefaultDataSource type mapping: preserve DATE fidelity.

Map "date" to date rather than collapsing it into timestamp.
- case "timestamp", "datetime", "date": - return "timestamp" + case "timestamp", "datetime": + return "timestamp" + case "date": + return "date"wren-launcher/commands/dbt/data_source_test.go (3)
208-251: Validation suite: good coverage for Postgres; consider adding BigQuery invalids.

Add cases for missing project/dataset/credentials to mirror the Validate() checks.
254-407: Add an oauth-skip subtest and annotate test fixtures to silence secret scanners.
- Verify method: "oauth" returns zero data sources without error.
- Gitleaks flags the inline JSON; add “gitleaks:allow” comments near test fixtures.
@@ - t.Run("service-account-json", func(t *testing.T) { + // gitleaks:allow test fixture; non-sensitive mock JSON + t.Run("service-account-json", func(t *testing.T) { @@ - t.Run("service-account-with-absolute-keyfile-path", func(t *testing.T) { + // gitleaks:allow test fixture; non-sensitive mock JSON + t.Run("service-account-with-absolute-keyfile-path", func(t *testing.T) { @@ - t.Run("service-account-with-relative-keyfile-path", func(t *testing.T) { + // gitleaks:allow test fixture; non-sensitive mock JSON + t.Run("service-account-with-relative-keyfile-path", func(t *testing.T) {@@ t.Run("service-account-json", func(t *testing.T) { ... }) + + t.Run("oauth-unsupported-skips", func(t *testing.T) { + profiles := &DbtProfiles{ + Profiles: map[string]DbtProfile{ + "test_profile": { + Target: "dev", + Outputs: map[string]DbtConnection{ + "dev": { + Type: "bigquery", + Method: "oauth", + Project: "p", + Dataset: "d", + }, + }, + }, + }, + } + dataSources, err := GetActiveDataSources(profiles, "", "test_profile", "dev") + if err != nil { + t.Fatalf("GetActiveDataSources failed: %v", err) + } + if len(dataSources) != 0 { + t.Fatalf("Expected 0 data sources for oauth, got %d", len(dataSources)) + } + })
586-639
: MapType table tests: LGTM; add TIME/GEOGRAPHY cases if mapped.wren-launcher/commands/dbt/converter.go (2)
189-201
: Sensitive output file: file mode is 0600; optionally warn users.Consider printing a one-line hint that the file contains credentials/passwords and should be handled securely.
359-363
: Staging model filter: consider case-insensitive check.Optional: use strings.EqualFold on prefixes to avoid case surprises.
- if !includeStagingModels && (strings.HasPrefix(modelName, "stg_") || strings.HasPrefix(modelName, "staging_")) { + lower := strings.ToLower(modelName) + if !includeStagingModels && (strings.HasPrefix(lower, "stg_") || strings.HasPrefix(lower, "staging_")) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- wren-launcher/commands/dbt/converter.go (7 hunks)
- wren-launcher/commands/dbt/data_source.go (8 hunks)
- wren-launcher/commands/dbt/data_source_test.go (7 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-09T02:43:20.433Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:50-53
Timestamp: 2025-07-09T02:43:20.433Z
Learning: In wren-launcher/commands/dbt/data_source.go, when encountering unsupported database types in convertConnectionToDataSource function, the user prefers to log a warning and continue processing instead of failing. The function should return nil, nil to allow processing to continue.
Applied to files:
wren-launcher/commands/dbt/converter.go
wren-launcher/commands/dbt/data_source.go
📚 Learning: 2025-08-13T05:17:30.180Z
Learnt from: goldmedal
PR: Canner/WrenAI#1887
File: wren-launcher/commands/dbt/data_source.go:273-276
Timestamp: 2025-08-13T05:17:30.180Z
Learning: In the Wren AI codebase, there's an intentional distinction between dbt connection types and Wren AI data source types: dbt uses "sqlserver" as the connection type, while Wren AI expects "mssql" as the data source type. The conversion layer in convertConnectionToDataSource() correctly matches "sqlserver" from dbt profiles and the WrenMSSQLDataSource.GetType() method correctly returns "mssql" for Wren AI compatibility.
Applied to files:
wren-launcher/commands/dbt/data_source.go
📚 Learning: 2025-07-09T02:43:07.560Z
Learnt from: goldmedal
PR: Canner/WrenAI#1827
File: wren-launcher/commands/dbt/data_source.go:154-156
Timestamp: 2025-07-09T02:43:07.560Z
Learning: In the WrenAI codebase, the MapType methods for data sources should return the original type name for unknown types, rather than a default fallback like "string". This maintains consistency across all data source implementations.
Applied to files:
wren-launcher/commands/dbt/data_source.go
🧬 Code graph analysis (3)
wren-launcher/commands/dbt/converter.go (3)
- wren-launcher/commands/dbt/utils.go (1): FileExists (30-36)
- wren-launcher/commands/dbt/data_source.go (2): WrenBigQueryDataSource (322-326), DataSource (44-48)
- wren-launcher/commands/dbt/wren_mdl.go (8): WrenMDLManifest (4-13), EnumDefinition (16-19), WrenModel (22-30), Relationship (53-59), Metric (62-69), View (72-76), WrenColumn (40-50), TableReference (33-37)
wren-launcher/commands/dbt/data_source.go (1)
- wren-launcher/commands/dbt/profiles.go (1): DbtConnection (16-43)
wren-launcher/commands/dbt/data_source_test.go (2)
- wren-launcher/commands/dbt/profiles.go (3): DbtProfiles (4-7), DbtProfile (10-13), DbtConnection (16-43)
- wren-launcher/commands/dbt/data_source.go (7): ValidateAllDataSources (434-456), GetActiveDataSources (376-414), WrenBigQueryDataSource (322-326), DataSource (44-48), WrenLocalFileDataSource (243-246), DefaultDataSource (459-459), WrenPostgresDataSource
🪛 Gitleaks (8.27.2)
wren-launcher/commands/dbt/data_source_test.go
[high] 420-420: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 429-429: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 438-438: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🔇 Additional comments (14)
wren-launcher/commands/dbt/data_source_test.go (3)
87-123: Postgres default port test: LGTM.
155-156: Path normalization for local file: LGTM.
409-461: BigQuery Validate() tests: LGTM.
wren-launcher/commands/dbt/converter.go (10)
20-27: IncludeStagingModels option: LGTM.
59-64: semantic_manifest.json discovery: LGTM; non-fatal skip is appropriate.
75-84: Optional manifest handling: concise and user-informative.
172-179: BigQuery datasource output matches the engine schema (project_id, dataset_id, credentials). Good alignment and no plaintext JSON leakage.
213-217: New ConvertDbtCatalogToWrenMDL signature usage: LGTM.
310-319: Manifest.DataSources as a string: confirm the consumer contract. The property name suggests plural, but the code sets a single type string (e.g., "bigquery"). Confirm the Wren MDL consumer expects a string here.
475-538: Relationship generation and de-dup: LGTM. The unique key uses Name|JoinType|Condition; sufficient for idempotency within a run.
590-619: Enum de-dup and sanitization: LGTM; stable names across identical value sets.
911-944: Column DisplayName from meta.label: LGTM. Also good: NotNull propagation and per-column descriptions in Properties.
831-870: Metric aggregation builder: robust handling for simple/ratio/derived. Good use of measure lookup and parentheses for ratios.
wren-launcher/commands/dbt/data_source.go (1)
94-109: Approve: Postgres port type verified; no string usages found. convertToPostgresDataSource keeps Port as int (default 5432). Downstream code and tests use numeric ports: wren-launcher/commands/dbt/profiles_analyzer.go (getInt → connection.Port), wren-launcher/commands/dbt/converter.go (properties "port": typedDS.Port), wren-launcher/commands/dbt/data_source_test.go (asserts int). No occurrences treating port as a string were found.
@cougrimes
This PR directly builds off #1827 to provide key updates and expansions around parsing dbt information into Wren MDL, as well as establishing bigquery-dbt support.
Initial Features & BigQuery Support
1. BigQuery Integration
- Parses profiles.yml for BigQuery-specific connection details, including project, dataset, keyfile, and method.
- Checks that the supported authentication methods (service-account, service-account-json) are correctly configured; oauth-related methods are screened out with the assumption live testing may come in a later iteration. A rough sketch of the credential handling follows this list.
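To make the keyfile handling concrete, here is a minimal sketch of how service-account credentials could be resolved and base64-encoded. The helper name encodeBigQueryCredentials and its parameters are illustrative assumptions, not the converter's actual API.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"os"
	"path/filepath"
)

// encodeBigQueryCredentials is a hypothetical helper: it accepts either an
// inline service-account JSON string (service-account-json method) or a
// keyfile path (service-account method), resolves relative paths against the
// dbt project directory, and returns the base64-encoded credentials.
func encodeBigQueryCredentials(inlineJSON, keyfilePath, dbtHome string) (string, error) {
	var raw []byte
	switch {
	case inlineJSON != "":
		raw = []byte(inlineJSON)
	case keyfilePath != "":
		resolved := keyfilePath
		if !filepath.IsAbs(keyfilePath) && dbtHome != "" {
			resolved = filepath.Join(dbtHome, keyfilePath)
		}
		var err error
		raw, err = os.ReadFile(resolved)
		if err != nil {
			return "", fmt.Errorf("read keyfile: %w", err)
		}
	default:
		return "", fmt.Errorf("no credentials provided")
	}
	return base64.StdEncoding.EncodeToString(raw), nil
}

func main() {
	creds, err := encodeBigQueryCredentials(`{"type":"service_account"}`, "", "")
	if err != nil {
		panic(err)
	}
	fmt.Println(creds)
}
```

Relative keyfile paths are joined with the dbt project directory, mirroring the resolution behavior described above.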
2. Metadata Mapping
- Takes the meta.label field from the dbt project and maps it to the displayName property in the Wren MDL.
3. Configurable Staging Model Inclusion
- Adds an --include-staging-models command-line flag to the dbt-auto-convert command.
- When the flag is set, models with stg_ or staging_ in their names are included; otherwise, they are skipped by default (see the sketch after this list).
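A minimal sketch of the opt-in staging filter, assuming a boolean carried in from the flag; the function name shouldSkipModel is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldSkipModel reports whether a dbt model should be excluded from the
// conversion: staging models (stg_/staging_ prefixes) are skipped unless the
// --include-staging-models flag was passed.
func shouldSkipModel(modelName string, includeStagingModels bool) bool {
	if includeStagingModels {
		return false
	}
	return strings.HasPrefix(modelName, "stg_") || strings.HasPrefix(modelName, "staging_")
}

func main() {
	for _, name := range []string{"stg_orders", "staging_users", "dim_customers"} {
		fmt.Printf("%s skipped=%v\n", name, shouldSkipModel(name, false))
	}
}
```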
Semantic Layer & Data Integrity Features
1. Relationship Generation from dbt Tests
- Generates Wren Relationship objects from dbt relationships tests (a simplified mapping is sketched below).
- Handles these tests in both places they appear in manifest.json: embedded directly on struct fields and as top-level, compiled test nodes for simple columns.
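As an illustration only, the sketch below shows how a single relationships test could be reduced to a named join condition. The MANY_TO_ONE join type and the field names are assumptions, not the converter's exact output.

```go
package main

import "fmt"

// relationshipTest holds the pieces a dbt relationships test provides:
// the model/column it is defined on and the referenced model/field.
type relationshipTest struct {
	Model, Column, RefModel, RefField string
}

// toRelationship builds a simplified relationship entry with a
// MANY_TO_ONE join from the child column to the referenced field.
func toRelationship(t relationshipTest) (name, joinType, condition string) {
	name = fmt.Sprintf("%s_%s_to_%s", t.Model, t.Column, t.RefModel)
	joinType = "MANY_TO_ONE"
	condition = fmt.Sprintf("%s.%s = %s.%s", t.Model, t.Column, t.RefModel, t.RefField)
	return
}

func main() {
	n, j, c := toRelationship(relationshipTest{"orders", "customer_id", "customers", "id"})
	fmt.Println(n, j, c)
}
```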
2. Metric Conversion from dbt Semantic Layer
- Reads semantic_manifest.json to translate dbt metrics and their underlying semantic_models into Wren Metric objects.
- Supports simple, ratio, and derived metric types (an aggregation sketch follows this list).
- Conversion is skipped gracefully when semantic_manifest.json is not found.
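A hedged sketch of how simple and ratio metrics might be reduced to aggregation expressions from a measure lookup; the expression strings and function name are assumptions for illustration, and derived metrics are only noted in a comment.

```go
package main

import "fmt"

// measureExprs maps measure names (from semantic_models) to SQL aggregation
// expressions, e.g. "revenue" -> "sum(amount)".
type measureExprs map[string]string

// buildMetricExpr assembles an aggregation expression for a simple or ratio
// metric; derived metrics would reuse the expressions of other metrics.
func buildMetricExpr(metricType, measure, numerator, denominator string, lookup measureExprs) (string, error) {
	switch metricType {
	case "simple":
		expr, ok := lookup[measure]
		if !ok {
			return "", fmt.Errorf("unknown measure %q", measure)
		}
		return expr, nil
	case "ratio":
		num, okN := lookup[numerator]
		den, okD := lookup[denominator]
		if !okN || !okD {
			return "", fmt.Errorf("missing ratio inputs")
		}
		return fmt.Sprintf("(%s) / (%s)", num, den), nil
	default:
		return "", fmt.Errorf("unsupported metric type %q", metricType)
	}
}

func main() {
	lookup := measureExprs{"revenue": "sum(amount)", "orders": "count(order_id)"}
	expr, _ := buildMetricExpr("ratio", "", "revenue", "orders", lookup)
	fmt.Println(expr) // (sum(amount)) / (count(order_id))
}
```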
3. Enum Definition Generation from accepted_values Tests
- Creates Wren EnumDefinition objects from dbt accepted_values tests.
- When multiple columns share an identical set of accepted values, a single EnumDefinition is created, and all relevant columns are linked to it (the de-duplication idea is sketched below).
- accepted_values tests attached to struct fields are handled as well.
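A minimal sketch of value-set de-duplication, keyed on the sorted value list so identical sets share one enum; the registry type and the naming scheme (model_column_Enum with dots replaced by underscores) are simplified stand-ins.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// enumRegistry deduplicates accepted_values sets: identical value sets map to
// a single enum name, so multiple columns can reference the same definition.
type enumRegistry struct {
	byValues map[string]string // canonical value key -> enum name
}

func (r *enumRegistry) register(model, column string, values []string) string {
	sorted := append([]string(nil), values...)
	sort.Strings(sorted)
	key := strings.Join(sorted, "|")
	if name, ok := r.byValues[key]; ok {
		return name // reuse the existing enum for an identical value set
	}
	name := strings.ReplaceAll(fmt.Sprintf("%s_%s_Enum", model, column), ".", "_")
	r.byValues[key] = name
	return name
}

func main() {
	r := &enumRegistry{byValues: map[string]string{}}
	fmt.Println(r.register("orders", "status", []string{"open", "closed"}))
	fmt.Println(r.register("returns", "state", []string{"closed", "open"})) // reuses the orders_status_Enum name
}
```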
4. Primary Key Identification from Semantic Models
- Parses the entities list within dbt semantic_models to identify the primary entity (type: "primary"); a minimal lookup is sketched below.
- Uses node_relation.alias to correctly map the identified primary key to the corresponding dbt model and populate the primaryKey field in the Wren model.
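Assuming a decoded semantic-model entry with entities and node_relation.alias, the primary-key lookup could look roughly like this; the struct shapes and the fallback from expr to the entity name are assumptions.

```go
package main

import "fmt"

// entity and semanticModel mirror only the fields needed here from a decoded
// semantic_manifest.json entry.
type entity struct {
	Name string
	Type string
	Expr string
}

type semanticModel struct {
	Entities     []entity
	NodeRelation struct{ Alias string }
}

// primaryKeyFor returns the dbt model alias and the column to use as the
// Wren model's primaryKey, or ok=false if no primary entity is declared.
func primaryKeyFor(sm semanticModel) (modelAlias, column string, ok bool) {
	for _, e := range sm.Entities {
		if e.Type == "primary" {
			col := e.Expr
			if col == "" {
				col = e.Name // fall back to the entity name when no expr is given
			}
			return sm.NodeRelation.Alias, col, true
		}
	}
	return "", "", false
}

func main() {
	sm := semanticModel{Entities: []entity{{Name: "order_id", Type: "primary"}}}
	sm.NodeRelation.Alias = "orders"
	fmt.Println(primaryKeyFor(sm))
}
```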
5. Not Null Constraint Conversion
- Collects not_null tests from all possible locations in the manifest.
- When a not_null test is found, the NotNull field on the corresponding Wren WrenColumn is set to true.
Miscellaneous notes
Made Semantic Layer Parsing Optional & Robust
- Parsing of semantic_manifest.json was wrapped in checks to ensure that the converter runs successfully without errors even if the file is missing or malformed, simply skipping the dependent features (illustrated below).
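A simplified stand-in for that behavior: missing or malformed files yield an empty result and the caller just skips the semantic-layer features; the function name and return shape are assumptions, not the converter's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// loadSemanticManifest returns the decoded semantic_manifest.json as a generic
// map, or (nil, false) when the file is absent or malformed so callers can
// simply skip the semantic-layer features.
func loadSemanticManifest(targetDir string) (map[string]interface{}, bool) {
	path := filepath.Join(targetDir, "semantic_manifest.json")
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, false // missing file: skip metrics, enums, primary keys
	}
	var manifest map[string]interface{}
	if err := json.Unmarshal(data, &manifest); err != nil {
		return nil, false // malformed file: skip rather than fail the conversion
	}
	return manifest, true
}

func main() {
	if m, ok := loadSemanticManifest("target"); ok {
		fmt.Printf("loaded %d top-level keys\n", len(m))
	} else {
		fmt.Println("semantic_manifest.json not found or unreadable; skipping semantic-layer features")
	}
}
```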
Corrected Struct Definitions
- The wren_mdl.go file was updated to include the necessary structs and fields for Metric and EnumDefinition to support the new features.
Summary by CodeRabbit
New Features
Tests
Chores