Commit ba62fab

Convert to code: the first move, which is dplyr (#933)
* Claude's first crack at writing CLAUDE.md
* First move away from inlining `suggest_code_syntax()` and `convert_to_code()` in the handler
* Initial move to dedicated file
* Inform Claude about generated comms
* Start to write actual code
* First crack
* Mark feature as supported
* This was always just a placeholder
* Refactor the design and deal with sorting
* Add tests

  This is really all Claude, with the instruction to take inspiration from the convert-to-code tests on the Python side.
* Refine text search and its tests
* Get the right object name in more cases

  Specifically, this was motivated by a problem when manually testing with a built-in dataset, like `penguins`
* Take notes on current WIP in CLAUDE.md
* MVP of a test re: the output of executing the code we generate
* Update notes
* Rationalize the "is true" and "is false" filters
* Relocate `suggest_code_syntax()` and other housekeeping
* Surround non-syntactic column names with backticks
* Tighten up the escaping for string values used in a filter
* Some more assertions
1 parent 91abcdd commit ba62fab

8 files changed, with 1301 additions and 13 deletions.

.github/workflows/test-linux.yml

Lines changed: 1 addition & 0 deletions
@@ -61,6 +61,7 @@ jobs:
   with:
     packages:
       data.table
+      dplyr
       rstudioapi
       tibble
       haven

.github/workflows/test-macos.yml

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ jobs:
   with:
     packages:
       data.table
+      dplyr
       rstudioapi
       tibble
       haven

.github/workflows/test-windows.yml

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ jobs:
   with:
     packages:
       data.table
+      dplyr
       rstudioapi
       tibble
       haven

CLAUDE.md

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Ark is an R kernel for Jupyter applications, primarily created to serve as the interface between R and the Positron IDE. It is compatible with all frontends implementing the Jupyter protocol.

The project includes:
- A Jupyter kernel for structured interaction between R and a frontend
- An LSP server for intellisense features (completions, jump-to-definition, diagnostics)
- A DAP server for step-debugging of R functions

## Repository Structure

The codebase is organized as a Rust workspace containing multiple crates:

- **ark**: The main R Kernel implementation
- **harp**: Rust wrappers for R objects and interfaces
- **libr**: Bindings to R (dynamically loaded using `dlopen`/`LoadLibrary`)
- **amalthea**: A Rust framework for building Jupyter and Positron kernels
- **echo**: A toy kernel for testing the kernel framework
- **stdext**: Extensions to Rust's standard library used by the other projects

## Common Development Commands

### Building the Project

```bash
# Build the entire project
cargo build

# Build in release mode
cargo build --release

# On Windows: If Positron is running with a debug build of ark,
# Windows file locking prevents overwriting ark.exe. For interim "progress"
# checks during development, just check the specific crate you're working on
# instead:
cargo check --package ark
# You'll have to quit Positron to do `cargo build`, though, on Windows.
```

### Running Tests

```bash
# Run all tests with nextest (recommended for CI)
cargo nextest run

# Run specific tests
cargo test <test_name>

# Run tests for a specific crate
cargo test -p ark
```

### Required R Packages for Testing

The following R packages are required for tests:
- data.table
- dplyr
- rstudioapi
- tibble
- haven
- R6

### Installation

After building, you can install the Jupyter kernel specification with:

```bash
./target/debug/ark --install
# or in release mode
./target/release/ark --install
```

## Generated code

Some of the files below `crates/amalthea/src/comm/` are automatically generated from comms specified in the Positron front end.
Such files always have `// @generated` at the top and SHOULD NEVER be edited "by hand".
If changes are needed in these files, that must happen in the separate Positron source repository and the comms for R and Python must be regenerated.
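
As a rough illustration of the rule above, a tool could refuse to touch a file when it sees the marker. The sketch below is hypothetical and not part of this repository: the function name `is_generated` and the choice to scan only the first five lines are assumptions made for the example.

```rust
/// Hypothetical helper (not part of ark): reports whether file contents
/// carry a `// @generated` marker near the top.
///
/// Assumption: scanning the first five lines is enough to find the marker.
pub fn is_generated(contents: &str) -> bool {
    contents
        .lines()
        .take(5)
        .any(|line| line.trim_start().starts_with("// @generated"))
}
```

A caller would read the file into a string first (for example with `std::fs::read_to_string`) and skip any edit when `is_generated` returns `true`.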

## Current Work in Progress

### Convert to Code Feature for Data Explorer

**Feature**: Backend implementation of "convert to code" for Positron's data explorer in R, allowing users to generate R code (dplyr syntax) that replicates their UI-based data manipulations (filters, sorting).

**Status**:
- ✅ R implementation has been created with awareness of the Python implementation
- ✅ Core feature is implemented and working with dplyr syntax
- ✅ Unit tests exist for string output validation
- ✅ An MVP exists of a test that validates the result of executing generated code

**Key files in R implementation**:
- `crates/ark/src/data_explorer/convert_to_code.rs` - Core conversion logic with traits and handlers + tests
- `crates/ark/src/data_explorer/r_data_explorer.rs` - Data explorer integration
- `crates/ark/tests/data_explorer.rs` - Integration tests for data explorer

**Key files in Python implementation** (for reference):
- `../positron/extensions/positron-python/python_files/posit/positron/convert.py` - Core conversion logic
- `../positron/extensions/positron-python/python_files/posit/positron/data_explorer.py` - Main data explorer (see `convert_to_code` methods around lines 1408, 2297)
- `../positron/extensions/positron-python/python_files/posit/positron/tests/test_convert.py` - Execution validation tests

**Architecture comparison**:
- Both R and Python use similar trait/abstract class patterns for extensibility
- R uses `PipeBuilder` for clean pipe chain generation; Python uses `MethodChainBuilder`
- Both have comprehensive filter/sort handlers with type-aware value formatting
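
To make the pipe-chain idea concrete, here is a minimal sketch of what a builder of this kind might look like. It is not the `PipeBuilder` in `convert_to_code.rs`: the fields, the method names `step` and `build`, the use of R's native `|>` pipe, and the two-space indentation are all assumptions chosen for illustration.

```rust
/// Illustrative pipe-chain builder. This is a sketch under stated
/// assumptions, not the actual `PipeBuilder` used by ark.
pub struct PipeBuilder {
    /// Name of the R object the chain starts from, e.g. `penguins`.
    source: String,
    /// One already-formatted R call per step, e.g. `arrange(island)`.
    steps: Vec<String>,
}

impl PipeBuilder {
    pub fn new(source: &str) -> Self {
        Self {
            source: source.to_string(),
            steps: Vec::new(),
        }
    }

    /// Appends one call to the chain and returns the builder for chaining.
    pub fn step(mut self, call: &str) -> Self {
        self.steps.push(call.to_string());
        self
    }

    /// Joins the source and all steps with the pipe, one step per line.
    /// Assumption: the native pipe `|>` is used rather than `%>%`.
    pub fn build(&self) -> String {
        let mut code = self.source.clone();
        for step in &self.steps {
            code.push_str(" |>\n  ");
            code.push_str(step);
        }
        code
    }
}
```

With this sketch, `PipeBuilder::new("penguins").step("filter(species == \"Adelie\")").step("arrange(island)").build()` produces three lines of R: `penguins |>`, an indented `filter(species == "Adelie") |>`, and an indented `arrange(island)`. A builder with no steps returns just the object name, which matches the case where the user has applied no filters or sorts.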

**Possible next steps**:
1. Add more execution tests, e.g. for sorting, or combined filtering and sorting
1. Consider a "tidyverse" syntax instead of or in addition to "dplyr", where
   we would use stringr functions for text search filters
1. Dig into non-syntactic column names
1. Dig into filtering for date and datetime columns
1. Handle "base" and "data.table" syntaxes
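
For the non-syntactic column name item, the sketch below shows one way the backtick rule could be expressed. It is an illustration only, not the code in this commit: `is_syntactic` and `quote_column` are hypothetical names, the check is ASCII-only (R also accepts non-ASCII letters depending on locale), and special names such as `...` and `..1` are not handled.

```rust
/// R reserved words, which cannot be used as bare names.
const RESERVED: &[&str] = &[
    "if", "else", "repeat", "while", "function", "for", "next", "break", "in",
    "TRUE", "FALSE", "NULL", "Inf", "NaN", "NA", "NA_integer_", "NA_real_",
    "NA_character_", "NA_complex_",
];

/// Hypothetical check for whether `name` can appear unquoted in R code.
/// Simplification: ASCII only; `...` and `..1` style names are not handled.
pub fn is_syntactic(name: &str) -> bool {
    if RESERVED.contains(&name) {
        return false;
    }
    // Only letters, digits, dots and underscores are allowed.
    if !name
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || c == '.' || c == '_')
    {
        return false;
    }
    // Must start with a letter, or with a dot that is not followed by a digit.
    let mut chars = name.chars();
    match (chars.next(), chars.next()) {
        (Some(first), _) if first.is_ascii_alphabetic() => true,
        (Some('.'), second) => !second.map_or(false, |c| c.is_ascii_digit()),
        _ => false,
    }
}

/// Wraps non-syntactic names in backticks, escaping backslashes and backticks.
pub fn quote_column(name: &str) -> String {
    if is_syntactic(name) {
        name.to_string()
    } else {
        format!("`{}`", name.replace('\\', "\\\\").replace('`', "\\`"))
    }
}
```

Under these assumptions, `quote_column("species")` is returned unchanged, while `quote_column("body mass")`, `quote_column("if")` and `quote_column("2020")` each come back wrapped in backticks so they can be used safely inside `filter()` or `arrange()`.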
