| 1 | +# Global Development Guidelines for LangChain Projects |
| 2 | + |
| 3 | +## Core Development Principles |
| 4 | + |
| 5 | +### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL |
| 6 | + |
| 7 | +**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.** |
| 8 | + |
| 9 | +❌ **Bad - Breaking Change:** |
| 10 | + |
| 11 | +```python |
| 12 | +def get_user(id, verbose=False):  # Changed from `user_id` |
| 13 | +    pass |
| 14 | +``` |
| 15 | + |
| 16 | +✅ **Good - Stable Interface:** |
| 17 | + |
| 18 | +```python |
| 19 | +def get_user(user_id: str, verbose: bool = False) -> User: |
| 20 | +    """Retrieve user by ID with optional verbose output.""" |
| 21 | +    pass |
| 22 | +``` |
| 23 | + |
| 24 | +**Before making ANY changes to public APIs:** |
| 25 | + |
| 26 | +- Check if the function/class is exported in `__init__.py` |
| 27 | +- Look for existing usage patterns in tests and examples |
| 28 | +- Use keyword-only arguments for new parameters: `*, new_param: str = "default"` |
| 29 | +- Mark experimental features clearly with docstring warnings (using reStructuredText, like `.. warning::`) |
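The last two points can be sketched together. This is only an illustration under stated assumptions: the `include_inactive` flag, the in-memory `_USERS` store, and the `dict` return type are hypothetical and not part of any real API.

```python
from typing import Any, Optional

# Hypothetical in-memory store, used only to keep the sketch runnable.
_USERS: dict[str, dict[str, Any]] = {
    "u1": {"name": "alice", "active": True},
    "u2": {"name": "bob", "active": False},
}


def get_user(
    user_id: str,
    verbose: bool = False,
    *,  # Everything after this marker must be passed by keyword.
    include_inactive: bool = False,
) -> Optional[dict[str, Any]]:
    """Retrieve a user by ID.

    .. warning::
        ``include_inactive`` is experimental and may change without notice.

    Args:
        user_id: Identifier of the user to look up.
        verbose: Whether to include the ID in the returned record.
        include_inactive: Whether to return users marked as inactive.

    Returns:
        The user record, or ``None`` if no matching user is visible.
    """
    user = _USERS.get(user_id)
    if user is None or (not user["active"] and not include_inactive):
        return None
    return {"id": user_id, **user} if verbose else dict(user)
```

Existing positional calls such as `get_user("u1", True)` keep working, while the new flag can only be supplied by name, so it cannot collide with arguments that callers already pass positionally.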
| 30 | + |
| 31 | +🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?" |
| 32 | + |
| 33 | +### 2. Code Quality Standards |
| 34 | + |
| 35 | +**All Python code MUST include type hints and return types.** |
| 36 | + |
| 37 | +❌ **Bad:** |
| 38 | + |
| 39 | +```python |
| 40 | +def p(u, d): |
| 41 | +    return [x for x in u if x not in d] |
| 42 | +``` |
| 43 | + |
| 44 | +✅ **Good:** |
| 45 | + |
| 46 | +```python |
| 47 | +def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]: |
| 48 | +    """Return the users that are not in the known users set. |
| 49 | + |
| 50 | +    Args: |
| 51 | +        users: List of user identifiers to filter. |
| 52 | +        known_users: Set of known/valid user identifiers. |
| 53 | + |
| 54 | +    Returns: |
| 55 | +        List of users that are not in the known_users set. |
| 56 | +    """ |
| 57 | +    return [user for user in users if user not in known_users] |
| 58 | +``` |
| 59 | + |
| 60 | +**Style Requirements:** |
| 61 | + |
| 62 | +- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers. |
| 63 | +- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense |
| 64 | +- Avoid unnecessary abstraction or premature optimization |
| 65 | +- Follow existing patterns in the codebase you're modifying |
| 66 | + |
| 67 | +### 3. Testing Requirements |
| 68 | + |
| 69 | +**Every new feature or bugfix MUST be covered by unit tests.** |
| 70 | + |
| 71 | +**Test Organization:** |
| 72 | + |
| 73 | +- Unit tests: `tests/unit_tests/` (no network calls allowed) |
| 74 | +- Integration tests: `tests/integration_tests/` (network calls permitted) |
| 75 | +- Use `pytest` as the testing framework |
| 76 | + |
| 77 | +**Test Quality Checklist:** |
| 78 | + |
| 79 | +- [ ] Tests fail when your new logic is broken |
| 80 | +- [ ] Happy path is covered |
| 81 | +- [ ] Edge cases and error conditions are tested |
| 82 | +- [ ] Use fixtures/mocks for external dependencies |
| 83 | +- [ ] Tests are deterministic (no flaky tests) |
| 84 | + |
| 85 | +Checklist questions: |
| 86 | + |
| 87 | +- [ ] Does the test suite fail if your new logic is broken? |
| 88 | +- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)? |
| 89 | +- [ ] Do tests use fixtures or mocks where needed? |
| 90 | + |
| 91 | +```python |
| 92 | +def test_filter_unknown_users() -> None: |
| 93 | +    """Test filtering unknown users from a list.""" |
| 94 | +    users = ["alice", "bob", "charlie"] |
| 95 | +    known_users = {"alice", "bob"} |
| 96 | + |
| 97 | +    result = filter_unknown_users(users, known_users) |
| 98 | + |
| 99 | +    assert result == ["charlie"] |
| 100 | +    assert len(result) == 1 |
| 101 | +``` |
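The checklist item about fixtures and mocks could look roughly like the sketch below. It assumes a hypothetical `UserDirectory` client with a `list_known_users()` method standing in for a real network dependency; in a pytest suite the mock would typically be provided by a fixture rather than built inline.

```python
from typing import Protocol
from unittest.mock import Mock


class UserDirectory(Protocol):
    """Hypothetical client interface standing in for a network service."""

    def list_known_users(self) -> list[str]: ...


def fetch_unknown_users(client: UserDirectory, users: list[str]) -> list[str]:
    """Return the users that the remote directory does not know about.

    Args:
        client: Directory client used to look up known users.
        users: User identifiers to check.
    """
    known_users = set(client.list_known_users())
    return [user for user in users if user not in known_users]


def test_fetch_unknown_users_uses_mocked_client() -> None:
    """Happy path: the client is mocked, so no network call is made."""
    client = Mock()
    client.list_known_users.return_value = ["alice", "bob"]

    result = fetch_unknown_users(client, ["alice", "bob", "charlie"])

    assert result == ["charlie"]
    client.list_known_users.assert_called_once_with()


def test_fetch_unknown_users_with_empty_input() -> None:
    """Edge case: an empty input list yields an empty result."""
    client = Mock()
    client.list_known_users.return_value = ["alice"]

    assert fetch_unknown_users(client, []) == []
```

Because the mock returns a fixed list, both tests are deterministic and can live under `tests/unit_tests/`.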
| 102 | + |
| 103 | +### 4. Security and Risk Assessment |
| 104 | + |
| 105 | +**Security Checklist:** |
| 106 | + |
| 107 | +- No `eval()`, `exec()`, or `pickle` on user-controlled input |
| 108 | +- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages |
| 109 | +- Remove unreachable/commented-out code before committing |
| 110 | +- Check for race conditions and resource leaks (file handles, sockets, threads) |
| 111 | +- Ensure proper resource cleanup (file handles, connections) |
| 112 | + |
| 113 | +❌ **Bad:** |
| 114 | + |
| 115 | +```python |
| 116 | +def load_config(path): |
| 117 | +    with open(path) as f: |
| 118 | +        return eval(f.read())  # ⚠️ Never eval config |
| 119 | +``` |
| 120 | + |
| 121 | +✅ **Good:** |
| 122 | + |
| 123 | +```python |
| 124 | +import json |
| 125 | + |
| 126 | +def load_config(path: str) -> dict: |
| 127 | +    with open(path) as f: |
| 128 | +        return json.load(f) |
| 129 | +``` |
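The exception-handling items from the checklist can be layered onto the same loader. The sketch below is one possible shape, not a prescribed one: the choice of `ValueError` and the exact messages are assumptions made for illustration.

```python
import json


def load_config(path: str) -> dict:
    """Load a JSON config file, raising a descriptive error on failure.

    Args:
        path: Location of the JSON config file.

    Raises:
        ValueError: If the file is missing, is not valid JSON, or does not
            contain a JSON object.
    """
    try:
        # The context manager closes the file handle even if parsing fails.
        with open(path, encoding="utf-8") as f:
            config = json.load(f)
    except FileNotFoundError as e:
        # Specific exception types instead of a bare `except:`.
        msg = f"Config file not found: {path}"
        raise ValueError(msg) from e
    except json.JSONDecodeError as e:
        msg = f"Config file is not valid JSON: {path}"
        raise ValueError(msg) from e
    if not isinstance(config, dict):
        msg = f"Config file must contain a JSON object: {path}"
        raise ValueError(msg)
    return config
```

Assigning the text to `msg` before raising keeps the message out of the `raise` statement itself, which keeps tracebacks readable.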
| 130 | + |
| 131 | +### 5. Documentation Standards |
| 132 | + |
| 133 | +**Use Google-style docstrings with Args section for all public functions.** |
| 134 | + |
| 135 | +❌ **Insufficient Documentation:** |
| 136 | + |
| 137 | +```python |
| 138 | +def send_email(to, msg): |
| 139 | +    """Send an email to a recipient.""" |
| 140 | +``` |
| 141 | + |
| 142 | +✅ **Complete Documentation:** |
| 143 | + |
| 144 | +```python |
| 145 | +def send_email(to: str, msg: str, *, priority: str = "normal") -> bool: |
| 146 | +    """ |
| 147 | +    Send an email to a recipient with specified priority. |
| 148 | + |
| 149 | +    Args: |
| 150 | +        to: The email address of the recipient. |
| 151 | +        msg: The message body to send. |
| 152 | +        priority: Email priority level (``'low'``, ``'normal'``, ``'high'``). |
| 153 | + |
| 154 | +    Returns: |
| 155 | +        True if email was sent successfully, False otherwise. |
| 156 | + |
| 157 | +    Raises: |
| 158 | +        InvalidEmailError: If the email address format is invalid. |
| 159 | +        SMTPConnectionError: If unable to connect to email server. |
| 160 | +    """ |
| 161 | +``` |
| 162 | + |
| 163 | +**Documentation Guidelines:** |
| 164 | + |
| 165 | +- Types go in function signatures, NOT in docstrings |
| 166 | +- Focus on "why" rather than "what" in descriptions |
| 167 | +- Document all parameters and raised exceptions; document return values when they are non-obvious |
| 168 | +- Keep descriptions concise but clear |
| 169 | +- Use reStructuredText markup inside docstrings (such as ``literal`` text or `.. warning::`) to enable rich formatting |
| 170 | + |
| 171 | +📌 *Tip:* A return value that is obvious from the function name and signature does not need its own description. |
| 172 | + |
| 173 | +### 6. Architectural Improvements |
| 174 | + |
| 175 | +**When you encounter code that could be improved, suggest better designs:** |
| 176 | + |
| 177 | +❌ **Poor Design:** |
| 178 | + |
| 179 | +```python |
| 180 | +def process_data(data, db_conn, email_client, logger): |
| 181 | +    # Function doing too many things |
| 182 | +    validated = validate_data(data) |
| 183 | +    result = db_conn.save(validated) |
| 184 | +    email_client.send_notification(result) |
| 185 | +    logger.log(f"Processed {len(data)} items") |
| 186 | +    return result |
| 187 | +``` |
| 188 | + |
| 189 | +✅ **Better Design:** |
| 190 | + |
| 191 | +```python |
| 192 | +@dataclass |
| 193 | +class ProcessingResult: |
| 194 | +    """Result of data processing operation.""" |
| 195 | +    items_processed: int |
| 196 | +    success: bool |
| 197 | +    errors: list[str] = field(default_factory=list) |
| 198 | + |
| 199 | +class DataProcessor: |
| 200 | +    """Handles data validation, storage, and notification.""" |
| 201 | + |
| 202 | +    def __init__(self, db_conn: Database, email_client: EmailClient) -> None: |
| 203 | +        self.db = db_conn |
| 204 | +        self.email = email_client |
| 205 | + |
| 206 | +    def process(self, data: list[dict]) -> ProcessingResult: |
| 207 | +        """Process and store data with notifications.""" |
| 208 | +        validated = self._validate_data(data) |
| 209 | +        result = self.db.save(validated)  # Assumes save() returns a ProcessingResult |
| 210 | +        self._notify_completion(result) |
| 211 | +        return result |
| 212 | +``` |
| 213 | + |
| 214 | +**Design Improvement Areas:** |
| 215 | + |
| 216 | +If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would: |
| 217 | + |
| 218 | +- Reduce code duplication through shared utilities |
| 220 | +- Improve separation of concerns (single responsibility) |
| 221 | +- Make unit testing easier through dependency injection |
| 222 | +- Add clarity without adding complexity |
| 223 | +- Use dataclasses for structured data |
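As a rough illustration of how dependency injection and dataclasses make testing easier, the sketch below passes a notifier into the processing function instead of constructing one inside it. The `Notifier` protocol, `RecordingNotifier`, and `process_items` are hypothetical names, and the validation rule (every item needs an `"id"` key) is assumed purely for the example.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Notifier(Protocol):
    """Anything that can deliver a notification message (hypothetical)."""

    def send(self, message: str) -> None: ...


@dataclass
class RecordingNotifier:
    """Test double that records messages instead of sending them."""

    messages: list[str] = field(default_factory=list)

    def send(self, message: str) -> None:
        self.messages.append(message)


@dataclass
class ProcessingResult:
    """Result of a processing run."""

    items_processed: int
    success: bool
    errors: list[str] = field(default_factory=list)


def process_items(items: list[dict], notifier: Notifier) -> ProcessingResult:
    """Validate items and report the outcome through the injected notifier."""
    errors = [
        f"item {index} is missing 'id'"
        for index, item in enumerate(items)
        if "id" not in item
    ]
    result = ProcessingResult(
        items_processed=len(items) - len(errors),
        success=not errors,
        errors=errors,
    )
    notifier.send(f"Processed {result.items_processed} of {len(items)} items")
    return result
```

A unit test can pass `RecordingNotifier()` and assert on its `messages` list, so no email server or mocking library is required.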
| 224 | + |
| 225 | +## Development Tools & Commands |
| 226 | + |
| 227 | +### Package Management |
| 228 | + |
| 229 | +```bash |
| 230 | +# Add package |
| 231 | +uv add package-name |
| 232 | + |
| 233 | +# Sync project dependencies, then update the lockfile |
| 234 | +uv sync |
| 235 | +uv lock |
| 236 | +``` |
| 237 | + |
| 238 | +### Testing |
| 239 | + |
| 240 | +```bash |
| 241 | +# Run unit tests (no network) |
| 242 | +make test |
| 243 | + |
| 244 | +# Don't run integration tests, as API keys must be set |
| 245 | + |
| 246 | +# Run specific test file |
| 247 | +uv run --group test pytest tests/unit_tests/test_specific.py |
| 248 | +``` |
| 249 | + |
| 250 | +### Code Quality |
| 251 | + |
| 252 | +```bash |
| 253 | +# Lint code |
| 254 | +make lint |
| 255 | + |
| 256 | +# Format code |
| 257 | +make format |
| 258 | + |
| 259 | +# Type checking |
| 260 | +uv run --group lint mypy . |
| 261 | +``` |
| 262 | + |
| 263 | +### Dependency Management Patterns |
| 264 | + |
| 265 | +**Local Development Dependencies:** |
| 266 | + |
| 267 | +```toml |
| 268 | +[tool.uv.sources] |
| 269 | +langchain-core = { path = "../core", editable = true } |
| 270 | +langchain-tests = { path = "../standard-tests", editable = true } |
| 271 | +``` |
| 272 | + |
| 273 | +**For tools, use the `@tool` decorator from `langchain_core.tools`:** |
| 274 | + |
| 275 | +```python |
| 276 | +from langchain_core.tools import tool |
| 277 | + |
| 278 | +@tool |
| 279 | +def search_database(query: str) -> str: |
| 280 | +    """Search the database for relevant information. |
| 281 | + |
| 282 | +    Args: |
| 283 | +        query: The search query string. |
| 284 | +    """ |
| 285 | +    results = f"Results for: {query}"  # Placeholder; real implementation goes here |
| 286 | +    return results |
| 287 | +``` |
| 288 | + |
| 289 | +## Commit Standards |
| 290 | + |
| 291 | +**Use Conventional Commits format for PR titles:** |
| 292 | + |
| 293 | +- `feat(core): add multi-tenant support` |
| 294 | +- `fix(cli): resolve flag parsing error` |
| 295 | +- `docs: update API usage examples` |
| 296 | +- `docs(openai): update API usage examples` |
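A PR title in this shape can be checked mechanically. The sketch below is a rough approximation only: the list of allowed types and the characters permitted in a scope are assumptions, not rules taken from this project.

```python
import re

# Assumed set of commit types; adjust to whatever the project actually allows.
_ALLOWED_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")

_TITLE_PATTERN = re.compile(
    rf"^(?:{'|'.join(_ALLOWED_TYPES)})"  # type, e.g. "feat"
    r"(?:\([a-z0-9_-]+\))?"  # optional scope, e.g. "(core)"
    r": \S.*$"  # colon, space, non-empty description
)


def is_conventional_title(title: str) -> bool:
    """Return whether a PR title matches the `type(scope): description` shape.

    Args:
        title: The PR title to check.
    """
    return _TITLE_PATTERN.match(title) is not None
```

All four titles listed above pass this check, while a title such as `Update docs` does not.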
| 297 | + |
| 298 | +## Framework-Specific Guidelines |
| 299 | + |
| 300 | +- Follow the existing patterns in `langchain-core` for base abstractions |
| 301 | +- Use `langchain_core.callbacks` for execution tracking |
| 302 | +- Implement proper streaming support where applicable |
| 303 | +- Avoid deprecated components like legacy `LLMChain` |
| 304 | + |
| 305 | +### Partner Integrations |
| 306 | + |
| 307 | +- Follow the established patterns in existing partner libraries |
| 308 | +- Implement standard interfaces (`BaseChatModel`, `Embeddings`, etc.) |
| 309 | +- Include comprehensive integration tests |
| 310 | +- Document API key requirements and authentication |
| 311 | + |
| 312 | +--- |
| 313 | + |
| 314 | +## Quick Reference Checklist |
| 315 | + |
| 316 | +Before submitting code changes: |
| 317 | + |
| 318 | +- [ ] **Breaking Changes**: Verified no public API changes |
| 319 | +- [ ] **Type Hints**: All functions have complete type annotations |
| 320 | +- [ ] **Tests**: New functionality is fully tested |
| 321 | +- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.) |
| 322 | +- [ ] **Documentation**: Google-style docstrings for public functions |
| 323 | +- [ ] **Code Quality**: `make lint` and `make format` pass |
| 324 | +- [ ] **Architecture**: Suggested improvements where applicable |
| 325 | +- [ ] **Commit Message**: Follows Conventional Commits format |