account for the issue coderabbit found with exactly one column errors. update the ai plan documentation for historic documentation. adjust the readme to add a new design philosophies section to further help illustrate, document, and guide.
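
For reviewers, the exactly-one-column rule that this commit adjusts in `allocateColumns` can be pictured roughly as follows. This is a minimal sketch, not the code in `column.go`: `pickBasicColumn` is a hypothetical helper, columns are reduced to plain name strings, and the case-insensitive match against the `->`-joined ancestor names is an assumption standing in for whatever `getColumnNameCandidates` actually produces.

```go
package main

import (
	"fmt"
	"strings"
)

// pickBasicColumn is a hypothetical, simplified stand-in for the basic-mapper
// branch of allocateColumns. It returns the single column that should feed a
// slice of a basic type, or an error when the choice would be ambiguous.
func pickBasicColumn(columns []string, ancestors []string) (string, error) {
	if len(ancestors) == 0 {
		// Top-level basic slice (e.g. []string): exactly one column overall.
		if len(columns) != 1 {
			return "", fmt.Errorf("query must return exactly one column (got %d)", len(columns))
		}
		return columns[0], nil
	}

	// Nested basic slice (e.g. a Tags []string field): exactly one column
	// must match the ancestor-qualified name. Case-insensitive matching is
	// an assumption made for this sketch only.
	want := strings.ToLower(strings.Join(ancestors, "->"))
	var matched []string
	for _, c := range columns {
		if strings.ToLower(c) == want {
			matched = append(matched, c)
		}
	}
	if len(matched) != 1 {
		return "", fmt.Errorf("expected exactly one column matching %v, got %d matches", ancestors, len(matched))
	}
	return matched[0], nil
}

func main() {
	// Top-level: one column is fine, two columns is an error.
	fmt.Println(pickBasicColumn([]string{"name"}, nil))
	fmt.Println(pickBasicColumn([]string{"name", "email"}, nil))

	// Nested: other parent columns may be present, but only one may match.
	fmt.Println(pickBasicColumn([]string{"id", "Tags"}, []string{"Tags"}))
	fmt.Println(pickBasicColumn([]string{"id", "tags", "Tags"}, []string{"Tags"}))
}
```

The point of the split is that a nested basic slice no longer fails just because the parent struct's own columns (such as `id`) are in the result set, while zero or multiple matches still return an error instead of a guess.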

tmaiaroto · tmaiaroto · commit 62bb4de460bb · 2025-08-16T10:57:10.000-07:00
diff --git a/DESIGN_PHILOSOPHIES.md b/DESIGN_PHILOSOPHIES.md
@@ -0,0 +1,47 @@
+# Design Philosophies
+
+This document explains the design philosophies and decisions behind this package and illustrates scenarios you may encounter when using it.
+
+## Approach
+Carta adopts the "database mapping" approach (described in Martin Fowler's [book](https://books.google.com/books?id=FyWZt5DdvFkC&lpg=PA1&dq=Patterns%20of%20Enterprise%20Application%20Architecture%20by%20Martin%20Fowler&pg=PT187#v=onepage&q=active%20record&f=false)) which is useful among organizations with strict code review processes.
+
+## Comparison to Related Projects
+
+#### GORM
+Carta is NOT an object-relational mapper (ORM).
+
+#### sqlx
+Sqlx does not track has-many relationships when mapping SQL data. This works fine when all your relationships are at most has-one (Blog has one Author), i.e., each SQL row corresponds to one struct. However, handling has-many relationships (Blog has many Posts) requires running many queries or manually post-processing the result. Carta handles these complexities automatically.
+
+## Protection vs. Graceful Handling
+
+A core design principle of the `carta` mapper is to prioritize **user protection and clarity** over attempting a "graceful" but potentially incorrect guess. The library's guiding philosophy is to only proceed if the user's intent is perfectly clear. If there is any ambiguity in the mapping operation, `carta` will **fail fast** by returning an error, forcing the developer to be more explicit.
+
+Making a guess might seem helpful, but it can hide serious, silent bugs. The following scenarios illustrate the balance between failing on ambiguous operations (Protection) and handling well-defined transformations (Graceful Handling).
+
+---
+
+### Scenario 1: Multi-column Query to a Basic Slice (Protection)
+
+-   **Query:** `SELECT name, email FROM users`
+-   **Destination:** `var data []string`
+-   **Behavior:** `carta.Map` **returns an error immediately**: `carta: when mapping to a slice of a basic type, the query must return exactly one column (got 2)`.
+-   **Why this is Protection:** The library has no way of knowing if the user intended to map the `name` or the `email` column. A "graceful" solution might be to arbitrarily pick the first column, but this could lead to the wrong data being silently loaded into the slice. By failing fast, `carta` forces the developer to write an unambiguous query (e.g., `SELECT name FROM users`), ensuring the result is guaranteed to be correct.
+
+---
+
+### Scenario 2: SQL `NULL` to a Non-nullable Go Field (Protection)
+
+-   **Query:** `SELECT id, NULL AS name FROM users`
+-   **Destination:** `var users []User` (where `User.Name` is a `string`)
+-   **Behavior:** `carta.Map` **returns an error during scanning**: `carta: cannot load null value to type string for column name`.
+-   **Why this is Protection:** A standard Go `string` cannot represent a `NULL` value. A "graceful" but incorrect solution would be to use the zero value (`""`), which is valid data and semantically different from "no data". This can cause subtle bugs in application logic. By failing, `carta` forces the developer to explicitly handle nullability in their Go struct by using a pointer (`*string`) or a nullable type (`sql.NullString`), making the code more robust and correct.
+
+---
+
+### Scenario 3: Merging `JOIN`ed Rows into Structs (Graceful Handling)
+
+-   **Query:** `SELECT b.id, p.id FROM blogs b JOIN posts p ON b.id = p.blog_id`
+-   **Destination:** `var blogs []BlogWithPosts`
+-   **Behavior:** `carta` **gracefully handles** the fact that the same blog ID appears in multiple rows. It creates one `Blog` object and appends each unique `Post` to its `Posts` slice.
+-   **Why this is Graceful:** This is the core purpose of the library. There is no ambiguity. The library uses the unique ID of the `Blog` (the `b.id` column) to understand that these rows all describe the same parent entity. This is a well-defined transformation, not a guess.
diff --git a/README.md b/README.md
@@ -1,9 +1,9 @@
 # Carta
 [![codecov](https://codecov.io/github/hackafterdark/carta/graph/badge.svg?token=TYvbPGGlcL)](https://codecov.io/github/hackafterdark/carta)
 
-Dead simple SQL data mapper for complex Go structs. 
+A simple SQL data mapper for complex Go structs. Load SQL data onto Go structs while keeping track of has-one and has-many relationships.
 
-Load SQL data onto Go structs while keeping track of has-one and has-many relationships
+Carta is not an object-relational mapper (ORM). With large and complex datasets, ORMs become restrictive and reduce performance when working with complex queries. [Read more about the design philosophy.](#design-philosophy)
 
 ## Examples 
 Using carta is very simple. All you need to do is: 
@@ -93,15 +93,6 @@ blogs:
 }]
 ```
 
-
-## Comparison to Related Projects
-
-#### GORM
-Carta is NOT an an object-relational mapper(ORM). Read more in [Approach](#Approach)
-
-#### sqlx
-Sqlx does not track has-many relationships when mapping SQL data. This works fine when all your relationships are at most has-one (Blog has one Author) ie, each SQL row corresponds to one struct. However, handling has-many relationships (Blog has many Posts), requires  running many queries or running manual post-processing of the result. Carta handles these complexities automatically.
-
 ## Guide
 
 ### Column and Field Names
@@ -233,19 +224,14 @@ Other types, such as TIME, will will be converted from plain text in future vers
 go get -u github.com/hackafterdark/carta
 ```
 
+## Design Philosophy
 
-## Important Notes 
+The `carta` package follows a "fail-fast" philosophy to ensure that mapping operations are unambiguous and to protect users from silent bugs. For a detailed explanation of the error handling approach and the balance between user protection and graceful handling, please see the [Design Philosophies](./DESIGN_PHILOSOPHIES.md) document.
+
+## Important Notes
 
 When mapping to **slices of structs**, Carta removes duplicate entities. This is a side effect of the data mapping process, which merges rows that identify the same entity (e.g., a `Blog` with the same ID appearing in multiple rows due to a `JOIN`). To ensure correct mapping, you should always include uniquely identifiable columns (like a primary key) in your query for each struct entity.
 
 When mapping to **slices of basic types** (e.g., `[]string`, `[]int`), every row from the query is treated as a unique element, and **no de-duplication occurs**.
  
-To prevent relatively expensive reflect operations, carta caches the structure of your struct using the column mames of your query response as well as the type of your struct. 
-
-## Approach
-Carta adopts the "database mapping" approach (described in Martin Fowler's [book](https://books.google.com/books?id=FyWZt5DdvFkC&lpg=PA1&dq=Patterns%20of%20Enterprise%20Application%20Architecture%20by%20Martin%20Fowler&pg=PT187#v=onepage&q=active%20record&f=false)) which is useful among organizations with strict code review processes.
-
-Carta is not an object-relational mapper(ORM). With large and complex datasets, using ORMs becomes restrictive and reduces performance when working with complex queries. 
-
-### License
-Apache License
+To prevent relatively expensive reflect operations, carta caches the structure of your struct using the column names of your query response as well as the type of your struct.
diff --git a/ai_plans/FIX_DUPLICATE_ROWS.md b/ai_plans/FIX_DUPLICATE_ROWS.md
@@ -1,37 +1,50 @@
-# Problem: Incorrect De-duplication When Mapping to Basic Slices
-
-## Summary
-The `carta` library is designed to de-duplicate entities when mapping SQL rows to slices of structs (e.g., `[]User`). This is achieved by generating a unique ID for each entity based on the content of its primary key columns. This behavior is correct for handling `JOIN`s where a single entity might appear across multiple rows.
-
-However, this same logic is incorrectly applied when the mapping destination is a slice of a basic type (e.g., `[]string`, `[]int`). In this scenario, rows with duplicate values are treated as the same entity and are de-duplicated, which is incorrect. The desired behavior is to preserve every row from the result set, including duplicates.
-
-This issue is the root cause for the following problems:
-1.  The `if m.IsBasic` code path in `load.go` lacks test coverage because no tests exist for mapping to basic slices.
-2.  Attempts to write such tests lead to infinite loops and incorrect behavior because the column allocation and unique ID generation logic are not designed to handle this case.
-
-## Proposed Solution
-The solution is to create a distinct execution path for "basic mappers" (`m.IsBasic == true`) that ensures every row is treated as a unique element.
-
-This will be accomplished in two main steps:
-
-### 1. Fix Column Allocation (`allocateColumns`)
-The logic will be modified to enforce a clear rule for basic slices: the source SQL query must return **exactly one column**.
-
--   If `m.IsBasic` is true, the function will bypass the existing name-matching logic.
--   It will validate that only one column is present in the query result.
--   This single column will be assigned as the `PresentColumn` for the mapper.
--   If more than one column is found, the function will return an error to prevent ambiguity.
-
-### 2. Fix Unique ID Generation (`loadRow`)
-The logic will be modified to generate a unique ID based on the row's position rather than its content.
-
--   If `m.IsBasic` is true, the call to `getUniqueId(row, m)` will be bypassed.
--   A new, position-based unique ID will be generated for each row (e.g., using a simple counter that increments with each row processed).
--   This ensures that every row, regardless of its content, is treated as a distinct element to be added to the destination slice.
-
-This approach preserves the existing, correct behavior for struct mapping while introducing a new, robust path for handling basic slices correctly.
-
-## Plan
-1.  **Modify `column.go`**: Update the `allocateColumns` function to implement the single-column rule for basic mappers.
-2.  **Modify `load.go`**: Update the `loadRow` function to use a position-based counter for unique ID generation when `m.IsBasic` is true.
-3.  **Add Tests**: Create a new test case in `mapper_test.go` that maps a query result to a slice of a basic type (e.g., `[]string`) to validate the fix and provide coverage for the `m.IsBasic` code path.
+# Plan: Fix Incorrect De-duplication for Basic Slices
+
+## 1. Problem Summary
+The `carta` library was incorrectly de-duplicating rows when mapping to a slice of a basic type (e.g., `[]string`). The logic, designed to merge `JOIN`ed rows for slices of structs, was misapplied, causing data loss. This also meant the `m.IsBasic` code path was entirely untested.
+
+The goal was to modify the library to correctly preserve all rows, including duplicates, when mapping to a basic slice, and to add the necessary test coverage.
+
+## 2. Evolution of the Solution
+
+The final solution was reached through an iterative process of implementation and refinement based on code review feedback.
+
+### Initial Implementation
+The first version of the fix introduced two key changes:
+1.  **Position-Based Unique IDs:** In `load.go`, the `loadRow` function was modified. When `m.IsBasic` is true, it now generates a unique ID based on the row's position in the result set (e.g., "row-0", "row-1") instead of its content. This ensures every row is treated as a unique element.
+2.  **Single-Column Rule:** In `column.go`, the `allocateColumns` function was updated to enforce a strict rule: if the destination is a basic slice, the SQL query must return **exactly one column**. This prevents ambiguity.
+
+### Refinements from Code Review
+Feedback from a code review (via Coderabbit) prompted several improvements:
+-   **Performance:** In `load.go`, `fmt.Sprintf` was replaced with the more performant `strconv.Itoa` for generating the position-based unique ID.
+-   **Idiomatic Go:** Error creation was changed from `errors.New(fmt.Sprintf(...))` to the more idiomatic `fmt.Errorf`.
+-   **Clearer Errors:** The error message for the single-column rule was improved to include the actual number of columns found, aiding debugging.
+-   **Test Coverage:** A negative test case was added to `mapper_test.go` to ensure the single-column rule correctly returns an error.
+
+### Final Fix: Handling Nested Basic Mappers
+The most critical refinement came from identifying a flaw in the single-column rule: it did not correctly handle **nested** basic slices (e.g., a struct field like `Tags []string`). The initial logic would have incorrectly failed if other columns for the parent struct were present.
+
+The final patch corrected this by making the logic in `allocateColumns` more nuanced:
+-   **For top-level basic slices** (`len(m.AncestorNames) == 0`), the query must still contain exactly one column overall.
+-   **For nested basic slices**, the function now searches the remaining columns for exactly one that matches the ancestor-qualified name (e.g., `tags`). It returns an error if zero or more than one match is found.
+
+This final change ensures the logic is robust for both top-level and nested use cases.
+
+## 3. Summary of Changes Executed
+1.  **Modified `load.go`**:
+    -   Updated `loadRow` to accept a `rowCount` parameter.
+    -   Implemented logic to generate a unique ID from `rowCount` when `m.IsBasic` is true.
+    -   Refactored error handling and string formatting based on code review feedback.
+2.  **Modified `column.go`**:
+    -   Updated `allocateColumns` to differentiate between top-level and nested basic mappers, enforcing the correct single-column matching rule for each.
+    -   Improved the error message to be more descriptive.
+3.  **Modified `mapper.go`**:
+    -   Corrected the logic in `determineFieldsNames` to properly handle casing in `carta` tags, ensuring ancestor names are generated correctly.
+4.  **Added Tests to `mapper_test.go`**:
+    -   Added a test for a top-level basic slice (`[]string`) to verify that duplicates are preserved.
+    -   Added a negative test to ensure an error is returned for a multi-column query to a top-level basic slice.
+    -   Added a test for a nested basic slice (`PostWithTags.Tags []string`) to verify correct mapping.
+    -   Added negative tests to ensure errors are returned for nested basic slices with zero or multiple matching columns.
+5.  **Updated Documentation**:
+    -   Updated `README.md` to clarify the difference in de-duplication behavior.
+    -   Created `DESIGN_PHILOSOPHIES.md` to document the "fail-fast" error handling approach.
diff --git a/column.go b/column.go
@@ -18,17 +18,48 @@ type column struct {
 func allocateColumns(m *Mapper, columns map[string]column) error {
 	presentColumns := map[string]column{}
 	if m.IsBasic {
-		if len(columns) != 1 {
-			return fmt.Errorf("carta: when mapping to a slice of a basic type, the query must return exactly one column (got %d)", len(columns))
-		}
-		for cName, c := range columns {
+		if len(m.AncestorNames) == 0 {
+			// Top-level basic mapper: must map exactly one column overall
+			if len(columns) != 1 {
+				return fmt.Errorf(
+					"carta: when mapping to a slice of a basic type, "+
+						"the query must return exactly one column (got %d)",
+					len(columns),
+				)
+			}
+			for cName, c := range columns {
+				presentColumns[cName] = column{
+					typ:         c.typ,
+					name:        cName,
+					columnIndex: c.columnIndex,
+				}
+				delete(columns, cName)
+				break
+			}
+		} else {
+			// Nested basic mapper: pick exactly one matching ancestor-qualified column
+			candidates := getColumnNameCandidates("", m.AncestorNames, m.Delimiter)
+			var matched []string
+			for cName := range columns {
+				if candidates[cName] {
+					matched = append(matched, cName)
+				}
+			}
+			if len(matched) != 1 {
+				return fmt.Errorf(
+					"carta: basic sub-mapper for %v expected exactly one matching column "+
+						"(ancestors %v), got %d matches",
+					m.Typ, m.AncestorNames, len(matched),
+				)
+			}
+			cName := matched[0]
+			c := columns[cName]
 			presentColumns[cName] = column{
 				typ:         c.typ,
 				name:        cName,
 				columnIndex: c.columnIndex,
 			}
 			delete(columns, cName)
-			break
 		}
 	} else {
 		for i, field := range m.Fields {
diff --git a/mapper.go b/mapper.go
@@ -244,7 +244,7 @@ func determineFieldsNames(m *Mapper) error {
 				if tag := nameFromTag(field.Tag, CartaTagKey); tag != "" {
 					subMap.Delimiter = "->"
 					parts := strings.Split(tag, ",")
-					name = parts[0]
+					name = strings.TrimSpace(parts[0])
 					if len(parts) > 1 {
 						for _, part := range parts[1:] {
 							option := strings.Split(part, "=")
diff --git a/mapper_test.go b/mapper_test.go
@@ -571,3 +571,98 @@ func TestMapToBasicSlice_MultipleColumnsError(t *testing.T) {
 		t.Fatalf("expected error when mapping to []string with multiple columns, got nil")
 	}
 }
+
+type PostWithTags struct {
+	ID   int      `db:"id"`
+	Tags []string `carta:"Tags"`
+}
+
+func TestNestedBasicSliceMap(t *testing.T) {
+	db, mock, err := sqlmock.New()
+	if err != nil {
+		t.Fatalf("an error '%s' was not expected when opening a stub database connection", err)
+	}
+	defer db.Close()
+
+	rows := sqlmock.NewRows([]string{"id", "Tags"}).
+		AddRow(1, "tag1").
+		AddRow(1, "tag2")
+
+	mock.ExpectQuery("SELECT (.+) FROM posts").WillReturnRows(rows)
+
+	sqlRows, err := db.Query("SELECT * FROM posts")
+	if err != nil {
+		t.Fatalf("error '%s' was not expected when querying rows", err)
+	}
+
+	var posts []PostWithTags
+	err = Map(sqlRows, &posts)
+	if err != nil {
+		t.Errorf("error was not expected while mapping rows: %s", err)
+	}
+
+	if len(posts) != 1 {
+		t.Fatalf("expected 1 post, got %d", len(posts))
+	}
+
+	if len(posts[0].Tags) != 2 {
+		t.Fatalf("expected 2 tags, got %d", len(posts[0].Tags))
+	}
+
+	expectedTags := []string{"tag1", "tag2"}
+	if !reflect.DeepEqual(posts[0].Tags, expectedTags) {
+		t.Errorf("expected tags to be %+v, but got %+v", expectedTags, posts[0].Tags)
+	}
+
+	if err := mock.ExpectationsWereMet(); err != nil {
+		t.Errorf("there were unfulfilled expectations: %s", err)
+	}
+}
+
+func TestNestedBasicSliceMap_NoMatchingColumnsError(t *testing.T) {
+	db, mock, err := sqlmock.New()
+	if err != nil {
+		t.Fatalf("an error '%s' was not expected when opening a stub database connection", err)
+	}
+	defer db.Close()
+
+	rows := sqlmock.NewRows([]string{"id", "other_column"}).
+		AddRow(1, "value")
+
+	mock.ExpectQuery("SELECT (.+) FROM posts").WillReturnRows(rows)
+
+	sqlRows, err := db.Query("SELECT * FROM posts")
+	if err != nil {
+		t.Fatalf("error '%s' was not expected when querying rows", err)
+	}
+
+	var posts []PostWithTags
+	err = Map(sqlRows, &posts)
+	if err == nil {
+		t.Errorf("expected an error when mapping a nested basic slice with no matching columns, but got nil")
+	}
+}
+
+func TestNestedBasicSliceMap_MultipleMatchingColumnsError(t *testing.T) {
+	db, mock, err := sqlmock.New()
+	if err != nil {
+		t.Fatalf("an error '%s' was not expected when opening a stub database connection", err)
+	}
+	defer db.Close()
+
+	rows := sqlmock.NewRows([]string{"id", "tags", "Tags"}).
+		AddRow(1, "tag1", "tag2")
+
+	mock.ExpectQuery("SELECT (.+) FROM posts").WillReturnRows(rows)
+
+	sqlRows, err := db.Query("SELECT * FROM posts")
+	if err != nil {
+		t.Fatalf("error '%s' was not expected when querying rows", err)
+	}
+
+	var posts []PostWithTags
+	err = Map(sqlRows, &posts)
+	if err == nil {
+		t.Errorf("expected an error when mapping a nested basic slice with multiple matching columns, but got nil")
+	}
+}