Commit 7a94d4c

author
Justine Wezenaar
committed
updated AGENTS.md based on feedback from copilot
1 parent 4137929 commit 7a94d4c

1 file changed

+29
-67
lines changed

.github/AGENTS.md

Lines changed: 29 additions & 67 deletions
@@ -1,71 +1,33 @@
1-
# pandas Copilot Instructions
1+
# pandas Agent Instructions (Copilot etc)
2 2

3 3
## Project Overview
4 4
`pandas` is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
5 5

6-
<!-- TODO: Add Sections for ## Root Folders and ## Core Architecture (pandas/ dir ) -->
7-
8-
## Type Hints
9-
10-
pandas strongly encourages the use of PEP 484 style type hints. New development should contain type hints and pull requests to annotate existing code are accepted as well!
11-
12-
### Style Guidelines
13-
14-
Type imports should follow the `from typing import ...` convention. Your code may be automatically rewritten to use some modern constructs (e.g. using the built-in `list` instead of `typing.List`) by the pre-commit checks.
15-
16-
In some cases, classes in the code base may define class variables that shadow builtins. This causes an issue as described in [mypy #1775](https://github.com/python/mypy/issues/1775#issuecomment-310969854). The defensive solution here is to create an unambiguous alias of the builtin and use that alias in your annotation. For example, if you come across a definition like
17-
18-
```
19-
class SomeClass1:
20-
    str = None
21-
```
22-
23-
The appropriate way to annotate this would be as follows:
24-
25-
```
26-
str_type = str
27-
28-
class SomeClass2:
29-
    str: str_type = None
30-
```
31-
In some cases you may be tempted to use `cast` from the `typing` module when you know better than the analyzer. This occurs particularly when using custom inference functions. For example:
32-
33-
```
34-
from typing import Union, cast
35-
36-
from pandas.core.dtypes.common import is_number
37-
38-
def cannot_infer_bad(obj: Union[str, int, float]):
39-
40-
    if is_number(obj):
41-
        ...
42-
    else:  # Reasonably only str objects would reach this but...
43-
        obj = cast(str, obj)  # Mypy complains without this!
44-
        return obj.upper()
45-
```
46-
The limitation here is that while a human can reasonably understand that `is_number` would catch the `int` and `float` types, mypy cannot make that same inference just yet (see [mypy #5206](https://github.com/python/mypy/issues/5206)). While the above works, the use of `cast` is **strongly discouraged**. Where applicable, a refactor of the code to appease static analysis is preferable.
47-
48-
```
49-
def cannot_infer_good(obj: Union[str, int, float]):
50-
51-
    if isinstance(obj, str):
52-
        return obj.upper()
53-
    else:
54-
        ...
55-
```
56-
With custom types and inference this is not always possible so exceptions are made, but every effort should be exhausted to avoid `cast` before going down such paths.
57-
58-
### pandas-specific types
59-
60-
Commonly used types specific to pandas appear in `pandas._typing`, and you should use these where applicable. This module is private for now, but ultimately it should be exposed to third-party libraries that want to implement type checking against pandas.
61-
62-
For example, quite a few functions in pandas accept a `dtype` argument. This can be expressed as a string like `"object"`, a `numpy.dtype` like `np.int64` or even a pandas `ExtensionDtype` like `pd.CategoricalDtype`. Rather than burden the user with having to constantly annotate all of those options, this can simply be imported and reused from the `pandas._typing` module:
63-
64-
```
65-
from pandas._typing import Dtype
66-
67-
def as_type(dtype: Dtype) -> ...:
68-
    ...
69-
```
70-
71-
This module will ultimately house types for repeatedly used concepts like “path-like”, “array-like”, “numeric”, etc., and can also hold aliases for commonly appearing parameters like `axis`. Development of this module is active, so be sure to refer to the source for the most up-to-date list of available types.
6+
## Purpose
7+
- Assist contributors by suggesting code changes, tests, and documentation edits for the pandas repository while preserving stability and compatibility.
8+
9+
## Persona & Tone
10+
- Concise, neutral, code-focused. Prioritize correctness, readability, and tests.
11+
12+
## Files to open first (recommended preload)
13+
If you cannot load any of these files, ask the user to grant you access to them so that your suggestions align better with the contribution guidelines.
14+
- doc/source/development/contributing_codebase.rst
15+
- doc/source/development/contributing_docstring.rst
16+
- doc/source/development/contributing_documentation.rst
17+
- doc/source/development/contributing.rst
18+
19+
## Decision heuristics
20+
- Favor small, backward-compatible changes with tests.
21+
- If a change would be breaking, propose it behind a deprecation path and document the rationale.
22+
- Prefer readability over micro-optimizations unless benchmarks are requested.
23+
- Add tests for behavioral changes; update docs only after the code change is final.
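
The "deprecation path" heuristic above can be illustrated with a minimal sketch. The function `resample_values` and its `how`/`method` keywords are hypothetical and exist only for this example; the warning class and message wording are assumptions, and the conventions actually used in pandas are described in doc/source/development/contributing_codebase.rst.

```python
import warnings


def resample_values(values, how=None, method="mean"):
    """Hypothetical function: `how` is the old keyword, `method` the new one."""
    if how is not None:
        # Keep old call sites working, but tell callers the keyword is going away.
        warnings.warn(
            "'how' is deprecated and will be removed in a future version; "
            "use 'method' instead.",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        method = how
    if method == "mean":
        return sum(values) / len(values)
    if method == "max":
        return max(values)
    raise ValueError(f"unknown method: {method!r}")
```

A test for such a change would typically assert both the new behavior and that the old keyword still works while emitting the warning.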
24+
25+
## Type hints guidance (summary)
26+
- Prefer PEP 484 style type hints, and use the types in `pandas._typing` when appropriate.
27+
- Avoid unnecessary use of typing.cast; prefer refactors that convey types to type-checkers.
28+
- Use builtin generics (list, dict) when possible.
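
As a rough illustration of the type hints guidance above, the sketch below uses builtin generics and an `isinstance` check instead of `typing.cast`. The helper `upper_strings` is hypothetical and not part of pandas; builtin generics such as `list[str]` assume Python 3.9 or later.

```python
from typing import Union


def upper_strings(items: list[Union[str, int, float]]) -> list[str]:
    """Hypothetical helper: upper-case the strings and skip the numbers."""
    result: list[str] = []
    for item in items:
        # isinstance narrows `item` to `str` for the type checker, so no
        # typing.cast is needed before calling a str method.
        if isinstance(item, str):
            result.append(item.upper())
    return result
```

Because the narrowing happens through `isinstance`, a type checker can verify the `.upper()` call without extra annotations.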
29+
30+
## Docstring guidance (summary)
31+
- Follow NumPy / numpydoc conventions used across the repo: short summary, extended summary, Parameters, Returns/Yields, See Also, Notes, Examples.
32+
- Ensure examples are deterministic, import numpy/pandas as documented, and pass doctest rules used by docs validation.
33+
- Preserve formatting rules: triple double-quotes, no blank line before/after docstring, parameter formatting ("name : type, default ..."), types and examples conventions.
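
The docstring guidance above might look like the following sketch in practice. The function `clip_to_range` is hypothetical and used only to show the section order and parameter formatting; doc/source/development/contributing_docstring.rst remains the authoritative reference.

```python
def clip_to_range(value, lower=0.0, upper=1.0):
    """
    Limit a number to a closed interval.

    Values below ``lower`` are raised to ``lower`` and values above
    ``upper`` are lowered to ``upper``.

    Parameters
    ----------
    value : float
        Number to clip.
    lower : float, default 0.0
        Smallest value that can be returned.
    upper : float, default 1.0
        Largest value that can be returned.

    Returns
    -------
    float
        The clipped value.

    See Also
    --------
    numpy.clip : Clip the values of an array.

    Examples
    --------
    >>> clip_to_range(1.5)
    1.0
    >>> clip_to_range(-2.0, lower=-1.0)
    -1.0
    """
    return max(lower, min(upper, value))
```

The examples are deterministic and need no imports, so they can be checked with the standard `doctest` module.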
