
Commit 3765670

committed
Add CSV as input format for convert, count, schema, head, tail with --has-headers option
Made-with: Cursor
1 parent 32c2cc5 commit 3765670

23 files changed: +571 −115 lines changed

README.md

Lines changed: 43 additions & 15 deletions
@@ -26,8 +26,8 @@ cargo install --git https://github.com/aisrael/datu
 | Parquet (`.parquet`, `.parq`) ||||
 | Avro (`.avro`) ||||
 | ORC (`.orc`) ||||
+| CSV (`.csv`) ||||
 | XLSX (`.xlsx`) ||||
-| CSV (`.csv`) ||||
 | JSON (`.json`) ||||
 | JSON (pretty) ||||
 | YAML ||||

@@ -36,6 +36,8 @@ cargo install --git https://github.com/aisrael/datu
 - **Write** — Output file formats for `convert`.
 - **Display** — Output format when printing to stdout (`schema`, `head`, `tail` via `--output`: csv, json, json-pretty, yaml).
 
+**CSV options:** When reading CSV files, the `--has-headers` option controls whether the first row is treated as column names. Omitting the option or passing bare `--has-headers` means true (header present); pass `--has-headers=false` for headerless CSV. Applies to `convert`, `count`, `schema`, `head`, and `tail`.
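The `--has-headers` semantics described in this added paragraph can be sketched in a few lines. This is an illustration only (datu itself is Rust, and the positional `column_0`-style names for headerless input are an assumption, not necessarily datu's actual naming):

```python
import csv
import io

def read_csv(text: str, has_headers: bool = True):
    """Sketch of what a --has-headers flag controls when reading CSV.

    Hypothetical illustration: with headers, the first row supplies column
    names; without, positional names are synthesized.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if has_headers:
        return rows[0], rows[1:]
    # No header row: synthesize positional column names (assumed scheme).
    names = [f"column_{i}" for i in range(len(rows[0]))] if rows else []
    return names, rows

data = "one,two\nfoo,1\nbar,2\n"
cols, records = read_csv(data)                       # header present (default)
cols2, records2 = read_csv(data, has_headers=False)  # first row is data
```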
+
 Usage
 =====

@@ -60,9 +62,9 @@ Perform the same conversion and column filtering.
 
 ### `schema`
 
-Display the schema of a Parquet, Avro, or ORC file (column names, types, and nullability). Useful for inspecting file structure without reading data.
+Display the schema of a Parquet, Avro, CSV, or ORC file (column names, types, and nullability). Useful for inspecting file structure without reading data. CSV schema uses type inference from the data.
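The type inference mentioned here could work roughly as below. This is an illustrative guess at the approach, not datu's actual inference rules (real inference typically also handles nulls, booleans, and dates):

```python
def infer_type(values):
    """Pick the narrowest type that fits every non-empty value in a column.

    Simplified int -> float -> string ladder, as a sketch of CSV schema
    inference.
    """
    def fits(v, cast):
        try:
            cast(v)
            return True
        except ValueError:
            return False

    non_empty = [v for v in values if v != ""]
    if non_empty and all(fits(v, int) for v in non_empty):
        return "int64"
    if non_empty and all(fits(v, float) for v in non_empty):
        return "float64"
    return "string"
```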
-**Supported input formats:** Parquet (`.parquet`, `.parq`), Avro (`.avro`), ORC (`.orc`).
+**Supported input formats:** Parquet (`.parquet`, `.parq`), Avro (`.avro`), CSV (`.csv`), ORC (`.orc`).
 
 **Usage:**

@@ -75,6 +77,7 @@ datu schema <FILE> [OPTIONS]
 | Option | Description |
 |--------|-------------|
 | `--output <FORMAT>` | Output format: `csv`, `json`, `json-pretty`, or `yaml`. Case insensitive. Default: `csv`. |
+| `--has-headers [BOOL]` | For CSV input: whether the first row is a header. Default: true when omitted. Use `--has-headers=false` for headerless CSV. |
 
 **Output formats:**
@@ -104,25 +107,35 @@ datu schema events.avro -o YAML
 
 ### `count`
 
-Return the number of rows in a Parquet, Avro, or ORC file.
+Return the number of rows in a Parquet, Avro, CSV, or ORC file.
 
-**Supported input formats:** Parquet (`.parquet`, `.parq`), Avro (`.avro`), ORC (`.orc`).
+**Supported input formats:** Parquet (`.parquet`, `.parq`), Avro (`.avro`), CSV (`.csv`), ORC (`.orc`).
 
 **Usage:**
 
 ```sh
-datu count <FILE>
+datu count <FILE> [OPTIONS]
 ```
 
+**Options:**
+
+| Option | Description |
+|--------|-------------|
+| `--has-headers [BOOL]` | For CSV input: whether the first row is a header. Default: true when omitted. Use `--has-headers=false` for headerless CSV. |
+
 **Examples:**
 
 ```sh
 # Count rows in a Parquet file
 datu count data.parquet
 
-# Count rows in an Avro or ORC file
+# Count rows in an Avro, CSV, or ORC file
 datu count events.avro
+datu count data.csv
 datu count data.orc
+
+# Count rows in a headerless CSV file
+datu count data.csv --has-headers=false
 ```
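One subtlety behind a CSV `count`: quoted fields may contain embedded newlines, so counting lines over-counts. A sketch of counting parsed records instead (illustrative only, not datu's implementation):

```python
import csv
import io

def count_csv_rows(text: str, has_headers: bool = True) -> int:
    """Count records the way a CSV parser sees them.

    Iterates parsed records rather than raw lines, so multi-line quoted
    fields count as one record; subtracts the header row when present.
    """
    n = sum(1 for _ in csv.reader(io.StringIO(text)))
    return max(n - 1, 0) if has_headers else n
```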
 
 ---
@@ -131,7 +144,7 @@ datu count data.orc
 
 Convert data between supported formats. Input and output formats are inferred from file extensions.
 
-**Supported input formats:** Parquet (`.parquet`, `.parq`), Avro (`.avro`), ORC (`.orc`).
+**Supported input formats:** Parquet (`.parquet`, `.parq`), Avro (`.avro`), CSV (`.csv`), ORC (`.orc`).
 
 **Supported output formats:** CSV (`.csv`), JSON (`.json`), Parquet (`.parquet`, `.parq`), Avro (`.avro`), ORC (`.orc`), XLSX (`.xlsx`).

@@ -149,23 +162,30 @@ datu convert <INPUT> <OUTPUT> [OPTIONS]
 | `--limit <N>` | Maximum number of records to read from the input. |
 | `--sparse` | For JSON/YAML: omit keys with null/missing values. Default: true. Use `--sparse=false` to include default values (e.g. empty string). |
 | `--json-pretty` | When converting to JSON, format output with indentation and newlines. Ignored for other output formats. |
+| `--has-headers [BOOL]` | For CSV input: whether the first row is a header. Default: true when omitted. Use `--has-headers=false` for headerless CSV. |
 
 **Examples:**
 
 ```sh
 # Parquet to CSV (all columns)
 datu convert data.parquet data.csv
 
+# CSV to Parquet (with automatic type inference)
+datu convert data.csv data.parquet
+
 # Parquet to Avro (first 1000 rows)
 datu convert data.parquet data.avro --limit 1000
 
 # Avro to CSV, only specific columns
 datu convert events.avro events.csv --select id,timestamp,user_id
 
+# CSV to JSON with headerless input
+datu convert data.csv output.json --has-headers=false
+
 # Parquet to Parquet with column subset
 datu convert input.parq output.parquet --select one,two,three
 
-# Parquet, Avro, or ORC to Excel (.xlsx)
+# Parquet, Avro, CSV, or ORC to Excel (.xlsx)
 datu convert data.parquet report.xlsx
 
 # Parquet or Avro to ORC
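The CSV-to-JSON direction added by this commit can be approximated with the standard library. A sketch under the same assumptions as above (the `column_0`-style names for headerless input are a hypothetical convention, not datu's documented output):

```python
import csv
import io
import json

def csv_to_json(text: str, has_headers: bool = True) -> str:
    """Convert CSV text to a JSON array of objects (illustrative sketch).

    All values stay strings here; a real converter would apply the
    inferred column types first.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return "[]"
    if has_headers:
        names, data = rows[0], rows[1:]
    else:
        names, data = [f"column_{i}" for i in range(len(rows[0]))], rows
    return json.dumps([dict(zip(names, r)) for r in data])
```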
@@ -179,9 +199,9 @@ datu convert data.parquet data.json
 
 ### `head`
 
-Print the first N rows of a Parquet, Avro, or ORC file to stdout (default CSV; use `--output` for other formats).
+Print the first N rows of a Parquet, Avro, CSV, or ORC file to stdout (default CSV; use `--output` for other formats).
 
-**Supported input formats:** Parquet (`.parquet`, `.parq`), Avro (`.avro`), ORC (`.orc`).
+**Supported input formats:** Parquet (`.parquet`, `.parq`), Avro (`.avro`), CSV (`.csv`), ORC (`.orc`).
 
 **Usage:**

@@ -197,6 +217,7 @@ datu head <INPUT> [OPTIONS]
 | `--output <FORMAT>` | Output format: `csv`, `json`, `json-pretty`, or `yaml`. Case insensitive. Default: `csv`. |
 | `--sparse` | For JSON/YAML: omit keys with null/missing values. Default: true. Use `--sparse=false` to include default values. |
 | `--select <COLUMNS>...` | Columns to include. If not specified, all columns are printed. Same format as `convert --select`. |
+| `--has-headers [BOOL]` | For CSV input: whether the first row is a header. Default: true when omitted. Use `--has-headers=false` for headerless CSV. |
 
 **Examples:**

@@ -207,21 +228,25 @@ datu head data.parquet
 # First 100 rows
 datu head data.parquet -n 100
 datu head data.avro --number 100
+datu head data.csv -n 100
 datu head data.orc --number 100
 
 # First 20 rows, specific columns
 datu head data.parquet -n 20 --select id,name,email
+
+# Head from a headerless CSV file
+datu head data.csv --has-headers=false
 ```
 
 ---
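For streaming formats like CSV, `head` only ever needs the first N records, so it can stop reading early. A minimal sketch of that idea, assuming a header row (illustration, not datu's code):

```python
import csv
import io
from itertools import islice

def head_csv(text: str, n: int):
    """First n data rows of a CSV, streaming.

    islice stops consuming the reader after n records, so the rest of
    the input is never parsed.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    return header, list(islice(reader, n))
```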
 
 ### `tail`
 
-Print the last N rows of a Parquet, Avro, or ORC file to stdout (default CSV; use `--output` for other formats).
+Print the last N rows of a Parquet, Avro, CSV, or ORC file to stdout (default CSV; use `--output` for other formats).
 
-**Supported input formats:** Parquet (`.parquet`, `.parq`), Avro (`.avro`), ORC (`.orc`).
+**Supported input formats:** Parquet (`.parquet`, `.parq`), Avro (`.avro`), CSV (`.csv`), ORC (`.orc`).
 
-> **Note:** For Avro files, `tail` requires a full file scan since Avro does not support random access to the end of the file.
+> **Note:** For Avro and CSV files, `tail` requires a full file scan since these formats do not support random access to the end of the file.
 
 **Usage:**

@@ -237,6 +262,7 @@ datu tail <INPUT> [OPTIONS]
 | `--output <FORMAT>` | Output format: `csv`, `json`, `json-pretty`, or `yaml`. Case insensitive. Default: `csv`. |
 | `--sparse` | For JSON/YAML: omit keys with null/missing values. Default: true. Use `--sparse=false` to include default values. |
 | `--select <COLUMNS>...` | Columns to include. If not specified, all columns are printed. Same format as `convert --select`. |
+| `--has-headers [BOOL]` | For CSV input: whether the first row is a header. Default: true when omitted. Use `--has-headers=false` for headerless CSV. |
 
 **Examples:**

@@ -247,6 +273,7 @@ datu tail data.parquet
 # Last 50 rows
 datu tail data.parquet -n 50
 datu tail data.avro --number 50
+datu tail data.csv -n 50
 datu tail data.orc --number 50
 
 # Last 20 rows, specific columns
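The full-scan note above does not imply full-file memory: a single forward pass can keep only the most recent N records. A sketch of one common way to do this (a bounded deque; an illustration, not datu's actual implementation):

```python
import csv
import io
from collections import deque

def tail_csv(text: str, n: int):
    """Last n data rows of a CSV via one forward scan in O(n) memory.

    deque(maxlen=n) discards older records as new ones arrive, leaving
    exactly the final n when the scan completes.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    return header, list(deque(reader, maxlen=n))
```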
@@ -285,10 +312,11 @@ read("input") |> ... |> write("output")
 
 #### `read(path)`
 
-Read a data file. Supported formats: Parquet (`.parquet`, `.parq`), Avro (`.avro`), ORC (`.orc`).
+Read a data file. Supported formats: Parquet (`.parquet`, `.parq`), Avro (`.avro`), CSV (`.csv`), ORC (`.orc`). CSV files are assumed to have a header row by default.
 
 ```text
 > read("data.parquet") |> write("data.csv")
+> read("data.csv") |> write("data.parquet")
 ```
 
 #### `write(path)`

features/cli/convert.feature

Lines changed: 21 additions & 0 deletions
@@ -41,6 +41,27 @@ Feature: Convert
     And the first line of that file should contain "one,two"
     And that file should have 4 lines
 
+  Scenario: CSV to Parquet
+    When I run `datu convert fixtures/table.csv $TEMPDIR/table_from_csv.parquet`
+    Then the command should succeed
+    And the output should contain "Converting fixtures/table.csv to $TEMPDIR/table_from_csv.parquet"
+    And the file "$TEMPDIR/table_from_csv.parquet" should exist
+
+  Scenario: CSV to JSON
+    When I run `datu convert fixtures/table.csv $TEMPDIR/table_from_csv.json`
+    Then the command should succeed
+    And the output should contain "Converting fixtures/table.csv to $TEMPDIR/table_from_csv.json"
+    And the file "$TEMPDIR/table_from_csv.json" should exist
+    And the file "$TEMPDIR/table_from_csv.json" should be valid JSON
+    And the file "$TEMPDIR/table_from_csv.json" should contain "one"
+    And the file "$TEMPDIR/table_from_csv.json" should contain "two"
+
+  Scenario: CSV to Parquet with --has-headers=false
+    When I run `datu convert fixtures/no_header.csv $TEMPDIR/no_header.parquet --has-headers=false`
+    Then the command should succeed
+    And the output should contain "Converting fixtures/no_header.csv to $TEMPDIR/no_header.parquet"
+    And the file "$TEMPDIR/no_header.parquet" should exist
+
   Scenario: Avro to CSV
     When I run `datu convert fixtures/userdata5.avro $TEMPDIR/userdata5.csv`
     Then the command should succeed

features/cli/count.feature

Lines changed: 6 additions & 1 deletion
@@ -1,5 +1,5 @@
 Feature: Count
-  Return the number of rows in a Parquet, Avro, or ORC file.
+  Return the number of rows in a Parquet, Avro, CSV, or ORC file.
 
   Scenario: Count Parquet
     When I run `datu count fixtures/table.parquet`

@@ -17,3 +17,8 @@ Feature: Count
     When I run `datu count $TEMPDIR/userdata5.orc`
     Then the command should succeed
     And the output should contain "10"
+
+  Scenario: Count CSV
+    When I run `datu count fixtures/table.csv`
+    Then the command should succeed
+    And the output should contain "3"

features/cli/head.feature

Lines changed: 20 additions & 1 deletion
@@ -1,5 +1,5 @@
 Feature: Head
-  Print the first N rows of a Parquet, Avro, or ORC file as CSV.
+  Print the first N rows of a Parquet, Avro, CSV, or ORC file as CSV.
 
   Scenario: Head Parquet default (10 lines)
     When I run `datu head fixtures/userdata.parquet`

@@ -35,6 +35,25 @@ Feature: Head
     And the output should have a header and 2 lines
     And the first line of the output should be: id,email
 
+  Scenario: Head CSV default (10 lines)
+    When I run `datu head fixtures/table.csv`
+    Then the command should succeed
+    And the output should have a header and 3 lines
+    And the first line of the output should contain "one"
+    And the first line of the output should contain "two"
+
+  Scenario: Head CSV with -n 2
+    When I run `datu head fixtures/table.csv -n 2`
+    Then the command should succeed
+    And the output should have a header and 2 lines
+    And the first line of the output should contain "one,two"
+
+  Scenario: Head CSV with --select
+    When I run `datu head fixtures/table.csv -n 2 --select two,four`
+    Then the command should succeed
+    And the output should have a header and 2 lines
+    And the first line of the output should be: two,four
+
   Scenario: Head ORC default (10 lines)
     When I run `datu convert fixtures/userdata5.avro $TEMPDIR/userdata5.orc --select id,first_name --limit 10`
     Then the command should succeed

features/cli/schema.feature

Lines changed: 7 additions & 1 deletion
@@ -1,5 +1,5 @@
 Feature: Schema
-  Display the schema of a Parquet, Avro, or ORC file.
+  Display the schema of a Parquet, Avro, CSV, or ORC file.
 
   Scenario: Schema Parquet default (csv output)
     When I run `datu schema fixtures/table.parquet`

@@ -64,3 +64,9 @@ Feature: Schema
     Then the command should succeed
     And the output should contain "id"
     And the output should contain "first_name"
+
+  Scenario: Schema CSV default (csv output)
+    When I run `datu schema fixtures/table.csv`
+    Then the command should succeed
+    And the output should contain "one"
+    And the output should contain "two"

features/cli/tail.feature

Lines changed: 13 additions & 1 deletion
@@ -1,5 +1,5 @@
 Feature: Tail
-  Print the last N rows of a Parquet, Avro, or ORC file as CSV.
+  Print the last N rows of a Parquet, Avro, CSV, or ORC file as CSV.
 
   Scenario: Tail Parquet default (10 lines)
     When I run `datu tail fixtures/table.parquet`

@@ -12,6 +12,18 @@ Feature: Tail
     And the first line of the output should contain "one"
     And the first line of the output should contain "two"
 
+  Scenario: Tail CSV default
+    When I run `datu tail fixtures/table.csv`
+    Then the command should succeed
+    And the first line of the output should contain "one"
+    And the first line of the output should contain "two"
+
+  Scenario: Tail CSV with -n 2
+    When I run `datu tail fixtures/table.csv -n 2`
+    Then the command should succeed
+    And the first line of the output should contain "one"
+    And the output should contain "baz"
+
   Scenario: Tail Avro default (10 lines)
     When I run `datu tail fixtures/userdata5.avro`
     Then the command should succeed

features/repl/conversion.feature

Lines changed: 28 additions & 0 deletions
@@ -97,6 +97,34 @@ Feature: Conversion
     Then the file "$TEMPDIR/userdata5.xlsx" should exist
     And that file should be valid XLSX
 
+  Scenario: CSV to Parquet
+    When the REPL is ran and the user types:
+      ```
+      read("fixtures/table.csv") |> write("$TEMPDIR/table_from_csv.parquet")
+      ```
+    Then the file "$TEMPDIR/table_from_csv.parquet" should exist
+    And that file should be valid Parquet
+
+  Scenario: CSV to JSON
+    When the REPL is ran and the user types:
+      ```
+      read("fixtures/table.csv") |> write("$TEMPDIR/table_from_csv.json")
+      ```
+    Then the file "$TEMPDIR/table_from_csv.json" should exist
+    And that file should be valid JSON
+    And that file should contain "one"
+    And that file should contain "two"
+
+  Scenario: CSV to CSV with select
+    When the REPL is ran and the user types:
+      ```
+      read("fixtures/table.csv") |> select(:two, :four) |> write("$TEMPDIR/table_csv_select.csv")
+      ```
+    Then the file "$TEMPDIR/table_csv_select.csv" should exist
+    And that file should be a CSV file
+    And the first line of that file should be: "two,four"
+    And that file should have 4 lines
+
   Scenario: ORC to CSV
     When the REPL is ran and the user types:
       ```
features/repl/head.feature

Lines changed: 12 additions & 0 deletions
@@ -42,6 +42,18 @@ Feature: Head
     And the first line of that file should contain "first_name"
     And that file should have 6 lines
 
+  Scenario: Head from CSV
+    When the REPL is ran and the user types:
+      ```
+      read("fixtures/table.csv") |> head(3) |> write("$TEMPDIR/head_csv.csv")
+      ```
+    Then the file "$TEMPDIR/head_csv.csv" should exist
+    And that file should be a CSV file
+    And the first line of that file should be: "one,two,three,four,five,__index_level_0__"
+    And that file should have 4 lines
+    And that file should contain "foo"
+    And that file should contain "bar"
+
   Scenario: Head from ORC
     When the REPL is ran and the user types:
       ```

features/repl/tail.feature

Lines changed: 12 additions & 0 deletions
@@ -42,6 +42,18 @@ Feature: Tail
     And the first line of that file should contain "first_name"
     And that file should have 6 lines
 
+  Scenario: Tail from CSV
+    When the REPL is ran and the user types:
+      ```
+      read("fixtures/table.csv") |> tail(2) |> write("$TEMPDIR/tail_csv.csv")
+      ```
+    Then the file "$TEMPDIR/tail_csv.csv" should exist
+    And that file should be a CSV file
+    And the first line of that file should be: "one,two,three,four,five,__index_level_0__"
+    And that file should have 3 lines
+    And that file should contain "bar"
+    And that file should contain "baz"
+
   Scenario: Tail from ORC
     When the REPL is ran and the user types:
      ```
