Commit 32e5040

chore: add CLAUDE.md (#32334)
1 parent a9e52ca commit 32e5040

1 file changed: +325 -0 lines changed

CLAUDE.md

Lines changed: 325 additions & 0 deletions
@@ -0,0 +1,325 @@
# Global Development Guidelines for LangChain Projects

## Core Development Principles

### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL

**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**

**Bad - Breaking Change:**

```python
def get_user(id, verbose=False):  # Changed from `user_id`
    pass
```

**Good - Stable Interface:**

```python
def get_user(user_id: str, verbose: bool = False) -> User:
    """Retrieve user by ID with optional verbose output."""
    pass
```

**Before making ANY changes to public APIs:**

- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring warnings (using reStructuredText, like `.. warning::`)

🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"

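As a minimal sketch of the keyword-only guideline above, the following extends `get_user` with a new parameter without disturbing existing positional callers. The `include_inactive` parameter, the `User` dataclass, and the in-memory `_USERS` table are hypothetical and exist only for illustration. The docstring also shows one way an experimental option might be flagged with a `.. warning::` directive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    """Hypothetical user record used only for this illustration."""

    user_id: str
    active: bool = True


# Hypothetical in-memory store standing in for a real lookup.
_USERS = {"u1": User("u1"), "u2": User("u2", active=False)}


def get_user(
    user_id: str,
    verbose: bool = False,
    *,
    include_inactive: bool = False,
) -> Optional[User]:
    """Retrieve user by ID with optional verbose output.

    .. warning::
        ``include_inactive`` is experimental and may change.

    Args:
        user_id: Identifier of the user to retrieve.
        verbose: Whether to print the lookup result.
        include_inactive: Whether inactive users may be returned.

    Returns:
        The matching user, or None if no eligible user exists.
    """
    user = _USERS.get(user_id)
    # Hide inactive users unless the caller explicitly opts in.
    if user is not None and not user.active and not include_inactive:
        user = None
    if verbose:
        print(f"get_user({user_id!r}) -> {user}")
    return user
```

Existing calls such as `get_user("u1")` and `get_user("u1", True)` keep working, while `get_user("u2", False, True)` raises `TypeError` because the new parameter cannot be passed positionally.
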
### 2. Code Quality Standards

**All Python code MUST include type hints and return types.**

**Bad:**

```python
def p(u, d):
    return [x for x in u if x not in d]
```

**Good:**

```python
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
    """Return the users that are not in the known users set.

    Args:
        users: List of user identifiers to filter.
        known_users: Set of known/valid user identifiers.

    Returns:
        List of users that are not in the known_users set.
    """
    return [user for user in users if user not in known_users]
```

**Style Requirements:**

- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
- Avoid unnecessary abstraction or premature optimization
- Follow existing patterns in the codebase you're modifying

### 3. Testing Requirements

**Every new feature or bugfix MUST be covered by unit tests.**

**Test Organization:**

- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- Use `pytest` as the testing framework

**Test Quality Checklist:**

- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)

Checklist questions:

- [ ] Does the test suite fail if your new logic is broken?
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)?
- [ ] Do tests use fixtures or mocks where needed?

```python
def test_filter_unknown_users() -> None:
    """Test filtering unknown users from a list."""
    users = ["alice", "bob", "charlie"]
    known_users = {"alice", "bob"}

    result = filter_unknown_users(users, known_users)

    assert result == ["charlie"]
    assert len(result) == 1
```

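To illustrate the "no network calls" rule and the fixtures/mocks item in the checklist, here is a hedged sketch of a unit test that replaces an external dependency with a `unittest.mock.Mock`. The `ProfileClient` protocol, the `fetch_display_name` function, and the `get_profile` method are hypothetical names invented for this example.

```python
from typing import Protocol
from unittest.mock import Mock


class ProfileClient(Protocol):
    """Hypothetical client whose real implementation would call a remote API."""

    def get_profile(self, user_id: str) -> dict[str, str]: ...


def fetch_display_name(client: ProfileClient, user_id: str) -> str:
    """Return a user's display name, falling back to the user ID."""
    profile = client.get_profile(user_id)  # would hit the network in production
    return profile.get("display_name") or user_id


def test_fetch_display_name_uses_profile() -> None:
    """Happy path: the display name comes from the profile."""
    client = Mock()
    client.get_profile.return_value = {"display_name": "Alice"}

    assert fetch_display_name(client, "alice") == "Alice"
    client.get_profile.assert_called_once_with("alice")


def test_fetch_display_name_falls_back_to_id() -> None:
    """Edge case: a profile without a display name falls back to the ID."""
    client = Mock()
    client.get_profile.return_value = {}

    assert fetch_display_name(client, "bob") == "bob"
```

Because the client is passed in as an argument, the tests stay deterministic and never open a connection.
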
### 4. Security and Risk Assessment

**Security Checklist:**

- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Handle exceptions properly (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Check for race conditions and resource leaks (file handles, sockets, threads)
- Ensure proper resource cleanup (file handles, connections)

**Bad:**

```python
def load_config(path):
    with open(path) as f:
        return eval(f.read())  # ⚠️ Never eval config
```

**Good:**

```python
import json

def load_config(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```

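The exception-handling and cleanup items in the checklist might look like the following in practice. This is a sketch only: `ConfigError` and `read_port` are hypothetical names, and the JSON layout with a top-level `port` key is assumed for illustration.

```python
import json


class ConfigError(Exception):
    """Raised when a configuration file cannot be used (hypothetical)."""


def read_port(path: str) -> int:
    """Read the ``port`` setting from a JSON config file.

    Args:
        path: Location of the JSON configuration file.

    Returns:
        The configured port number.

    Raises:
        ConfigError: If the file is missing, malformed, or lacks an integer port.
    """
    try:
        # The context manager closes the file handle even if parsing fails.
        with open(path, encoding="utf-8") as f:
            config = json.load(f)
    except FileNotFoundError as e:
        msg = f"Config file not found: {path}"
        raise ConfigError(msg) from e
    except json.JSONDecodeError as e:
        msg = f"Config file is not valid JSON: {path}"
        raise ConfigError(msg) from e

    port = config.get("port") if isinstance(config, dict) else None
    if not isinstance(port, int):
        msg = f"Config file has no integer 'port' setting: {path}"
        raise ConfigError(msg)
    return port
```

Each handler names the specific exception it expects, builds the message in a `msg` variable, and chains the original error with `from e` so the root cause is not silently lost.
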
### 5. Documentation Standards

**Use Google-style docstrings with Args section for all public functions.**

**Insufficient Documentation:**

```python
def send_email(to, msg):
    """Send an email to a recipient."""
```

**Complete Documentation:**

```python
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
    """
    Send an email to a recipient with specified priority.

    Args:
        to: The email address of the recipient.
        msg: The message body to send.
        priority: Email priority level (``'low'``, ``'normal'``, ``'high'``).

    Returns:
        True if email was sent successfully, False otherwise.

    Raises:
        InvalidEmailError: If the email address format is invalid.
        SMTPConnectionError: If unable to connect to email server.
    """
```

**Documentation Guidelines:**

- Types go in function signatures, NOT in docstrings
- Focus on "why" rather than "what" in descriptions
- Document all parameters and exceptions, plus return values when they are not obvious
- Use reStructuredText markup inside docstrings to enable rich formatting

📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.

### 6. Architectural Improvements

**When you encounter code that could be improved, suggest better designs:**

**Poor Design:**

```python
def process_data(data, db_conn, email_client, logger):
    # Function doing too many things
    validated = validate_data(data)
    result = db_conn.save(validated)
    email_client.send_notification(result)
    logger.log(f"Processed {len(data)} items")
    return result
```

**Better Design:**

```python
# `Database` and `EmailClient` are placeholder types; the future import keeps
# their annotations from being evaluated when the class is defined.
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ProcessingResult:
    """Result of data processing operation."""
    items_processed: int
    success: bool
    errors: list[str] = field(default_factory=list)


class DataProcessor:
    """Handles data validation, storage, and notification."""

    def __init__(self, db_conn: Database, email_client: EmailClient) -> None:
        self.db = db_conn
        self.email = email_client

    def process(self, data: list[dict]) -> ProcessingResult:
        """Process and store data with notifications."""
        validated = self._validate_data(data)  # private helper, omitted here
        self.db.save(validated)
        # Return the declared result type rather than the raw save() value.
        result = ProcessingResult(items_processed=len(validated), success=True)
        self._notify_completion(result)  # private helper, omitted here
        return result
```

**Design Improvement Areas:**

If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:

- Reduce code duplication through shared utilities
- Improve separation of concerns (single responsibility)
- Make unit testing easier through dependency injection
- Add clarity without adding complexity
- Prefer dataclasses for structured data

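As a rough illustration of how dependency injection can make unit testing easier, the sketch below passes a notifier into a service instead of constructing it internally, so a test can substitute an in-memory fake. `Notifier`, `RecordingNotifier`, and `SignupService` are hypothetical names and are not part of any LangChain package.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Notifier(Protocol):
    """Anything that can deliver a message to a recipient."""

    def send(self, recipient: str, message: str) -> None: ...


@dataclass
class RecordingNotifier:
    """Test double that records messages instead of sending them."""

    sent: list[tuple[str, str]] = field(default_factory=list)

    def send(self, recipient: str, message: str) -> None:
        self.sent.append((recipient, message))


class SignupService:
    """Registers users and notifies them through an injected notifier."""

    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier
        self._users: set[str] = set()

    def register(self, email: str) -> bool:
        """Register an email address, returning False if already registered."""
        if email in self._users:
            return False
        self._users.add(email)
        self._notifier.send(email, "Welcome!")
        return True
```

A test can then construct `SignupService(RecordingNotifier())` and assert on the recorded messages, with no mail server or patching involved.
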
## Development Tools & Commands

### Package Management

```bash
# Add package
uv add package-name

# Sync project dependencies
uv sync
uv lock
```

### Testing

```bash
# Run unit tests (no network)
make test

# Don't run integration tests, as API keys must be set

# Run specific test file
uv run --group test pytest tests/unit_tests/test_specific.py
```

### Code Quality

```bash
# Lint code
make lint

# Format code
make format

# Type checking
uv run --group lint mypy .
```

### Dependency Management Patterns

**Local Development Dependencies:**

```toml
[tool.uv.sources]
langchain-core = { path = "../core", editable = true }
langchain-tests = { path = "../standard-tests", editable = true }
```

**For tools, use the `@tool` decorator from `langchain_core.tools`:**

```python
from langchain_core.tools import tool


@tool
def search_database(query: str) -> str:
    """Search the database for relevant information.

    Args:
        query: The search query string.
    """
    # Implementation here; this placeholder stands in for a real lookup.
    results = f"No results found for: {query}"
    return results
```

## Commit Standards

**Use Conventional Commits format for PR titles:**

- `feat(core): add multi-tenant support`
- `fix(cli): resolve flag parsing error`
- `docs: update API usage examples`
- `docs(openai): update API usage examples`

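A PR title in this format could be checked mechanically. The following is only a sketch: the list of allowed types is an assumption drawn from common Conventional Commits usage rather than from this repository's configuration, and `is_conventional_title` is a hypothetical helper.

```python
import re

# Assumed set of types; adjust to whatever the repository actually enforces.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "chore", "refactor",
    "test", "perf", "ci", "build", "style", "revert",
)

_TITLE_PATTERN = re.compile(
    rf"^(?P<type>{'|'.join(ALLOWED_TYPES)})"  # commit type
    r"(\((?P<scope>[a-z0-9_-]+)\))?"  # optional scope in parentheses
    r"!?"  # optional breaking-change marker
    r": (?P<description>\S.*)$"  # colon, space, non-empty description
)


def is_conventional_title(title: str) -> bool:
    """Return whether a PR title matches the assumed Conventional Commits pattern."""
    return _TITLE_PATTERN.match(title) is not None
```

Under these assumptions the four titles listed above all pass, while a title such as `Update docs` is rejected because it has no type prefix.
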
## Framework-Specific Guidelines

- Follow the existing patterns in `langchain-core` for base abstractions
- Use `langchain_core.callbacks` for execution tracking
- Implement proper streaming support where applicable
- Avoid deprecated components like legacy `LLMChain`

### Partner Integrations

- Follow the established patterns in existing partner libraries
- Implement standard interfaces (`BaseChatModel`, `Embeddings`, etc.)
- Include comprehensive integration tests
- Document API key requirements and authentication

---

## Quick Reference Checklist

Before submitting code changes:

- [ ] **Breaking Changes**: Verified no public API changes
- [ ] **Type Hints**: All functions have complete type annotations
- [ ] **Tests**: New functionality is fully tested
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
- [ ] **Documentation**: Google-style docstrings for public functions
- [ ] **Code Quality**: `make lint` and `make format` pass
- [ ] **Architecture**: Suggested improvements where applicable
- [ ] **Commit Message**: Follows Conventional Commits format
