Commit 6e7c80d

Merge pull request #93 from OoriData/feature/kb-rearch
OgbujiPT 0.10.0 Phase 1: Foundation & Rearchitecture
2 parents 03c88ba + 707756c commit 6e7c80d

80 files changed: +6331 -1493 lines changed

.github/workflows/main.yml

Lines changed: 17 additions & 18 deletions
@@ -1,13 +1,13 @@
-name: Python package
+name: OgbujiPT CI

-on: [push]
+on: [push, pull_request]

 jobs:
-  build:
+  test:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ["3.10", "3.11", "3.12"]
+        python-version: ["3.12", "3.13"]
     services:
       mockdb:
         image: pgvector/pgvector:pg16
@@ -20,28 +20,27 @@ jobs:
           - 5432:5432

     steps:
-    - uses: actions/checkout@v3
+    - uses: actions/checkout@v4
     - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v4
+      uses: actions/setup-python@v5
       with:
         python-version: ${{ matrix.python-version }}
+
     - name: Install dependencies
       run: |
         python -m pip install --upgrade pip
-        pip install ruff pytest pytest-mock pytest-asyncio respx # pytest-httpx # Now using respx instead
-        pip install pgvector asyncpg pytest-asyncio # Added by Kai, Osi
-
+        # pip install ruff pytest pytest-mock pytest-asyncio respx
         if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
-        # Install OgbujiPT itself, as checked out
-        pip install -U $GITHUB_WORKSPACE
+        # Install OgbujiPT itself, plus dependencies needed to run the full CI, with test suite
+        pip install -U ".[testall]"
+
     - name: Lint with ruff
       run: |
-        # stop the build if there are Python syntax errors or undefined names
-        # ruff --format=github --select=E9,F63,F7,F82 --target-version=py311 . # No longer works
-        ruff check --select=E9,F63,F7,F82 --target-version=py311 --exclude **/*.ipynb .
-        # default set of ruff rules with GitHub Annotations
-        # ruff --format=github --target-version=py311 . # No longer works
-        ruff check --target-version=py311 --exclude **/*.ipynb .
+        # Stop the build if there are Python syntax errors or undefined names
+        ruff check --select=E9,F63,F7,F82 --target-version=py312 --exclude **/*.ipynb .
+        # Run default ruff checks
+        ruff check --target-version=py312 --exclude **/*.ipynb .
+
     - name: Test with pytest
       env:
         PG_HOST: 0.0.0.0
@@ -50,4 +49,4 @@ jobs:
         PG_PASSWORD: mock_password
         PG_PORT: 5432
       run: |
-        pytest
+        pytest test/ -v
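
Note: the pytest step passes the database location to the test suite through `PG_*` environment variables. As a minimal sketch of how test code might pick these up, assuming only the three variables visible in this diff (`PG_HOST`, `PG_PORT`, `PG_PASSWORD`), the helper below collects them into a dict. The function name `pg_settings` and the fallback defaults are hypothetical and are not taken from the OgbujiPT test suite.

```python
import os


def pg_settings(environ=None):
    '''Collect PostgreSQL connection settings from PG_* environment variables (hypothetical helper)'''
    env = os.environ if environ is None else environ
    return {
        'host': env.get('PG_HOST', 'localhost'),  # Default is an assumption, not from the repo
        'port': int(env.get('PG_PORT', '5432')),  # Env values are strings, so convert the port
        'password': env.get('PG_PASSWORD'),  # None when the variable is unset
    }


# Values as set in the workflow's env block
ci_env = {'PG_HOST': '0.0.0.0', 'PG_PORT': '5432', 'PG_PASSWORD': 'mock_password'}
print(pg_settings(ci_env))
```

Taking the environment as an optional mapping keeps the helper easy to exercise in tests without touching the real process environment.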

.github/workflows/publish.yml

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+name: Publish to PyPI # https://pypi.org/p/OgbujiPT
+
+on:
+  release:
+    types: [published]
+
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+    environment: pypi # GitHub environment for additional protection
+    permissions:
+      id-token: write # Required for trusted publishing
+
+    steps:
+    - uses: actions/checkout@v4
+
+    - name: Set up Python
+      uses: actions/setup-python@v5
+      with:
+        python-version: '3.12'
+
+    - name: Install build tools
+      run: |
+        python -m pip install --upgrade pip
+        pip install build twine
+
+    - name: Build package
+      run: python -m build
+
+    - name: Publish to PyPI
+      uses: pypa/gh-action-pypi-publish@release/v1
+      # with:
+        # Uses OIDC trusted publishing - no token needed if configured
+        # If you need to use a token instead, uncomment below:
+        # password: ${{ secrets.PYPI_API_TOKEN }}

.github/workflows/python-publish.yml

Lines changed: 0 additions & 48 deletions
This file was deleted.

AICONTEXT.md

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+Additional context on this repository for AI tools & coding agents
+
+- Python 3.12+ code, unless otherwise specified
+- Python code uses single outer quotes, including triple single quotes for e.g. docstrings
+- prefer absolute imports to relative imports
+- Use a decent amount of comments
+  - not *too* many, just enough that anybody familiar with the code can use them as a reference point. Not meant to teach somebody new every intricacy of the code, just help keep the reader oriented.
+  - if it saves a line, put a comment after a line rather than above it
+  - use the standard two spaces before the comment character, e.g. `CODE  # COMMENT`
+- Try to stick to 120 characters per line
+  - if one of those comments would break this guideline, just put that comment above the line instead, as is standard convention
+- If there is a pyproject.toml in place, use it as a reference for builds, installs, etc. The basic packaging and dev preference, including if you have to supply your own pyproject.toml, is as follows:
+  - Use pyproject.toml with hatchling, not e.g. setup.py
+  - Reusable Python code modules are developed in the `pylib` folder, and installed using e.g. `uv pip install -U .`, which includes proper mapping to Python library package namespace via `tool.hatch.build.sources`. The `__init__.py` and other modules in the top-level package go directly in `pylib`, though submodules can use subdirectories, e.g. `pylib/a/b` becomes `installed_library_name.a.b`. Ultimately this will mean the installed package is importable as `from installed_library_name.etc import …`
+  - Yes this means editable and "dev mode" environments are NOT desirable, nor are shenanigans adding pylib to `sys.path`. Layer-efficient dockerization is an option if that's needed.
+  - The ethos is to always develop keeping things properly installable. No dev mode shortcuts
+  - Prefer hatchling build system over setuptools, poetry, etc. Avoid setuptools as much as possible. Use `[tool.hatch.build.sources]` to map source directories to package namespaces (e.g., `"pylib" = "installed_library_name"`).
+  - Use `[tool.hatch.build.targets.wheel]` with `only-include = ["pylib"]` to ensure the pylib directory structure gets included properly in the wheel, avoiding the duplication issue that can occur with sources mapping
+  - **Debugging package issues**: When modules aren't importing correctly after installation, check:
+    - That you are in the correct virtualenv (you may have to ask the developer)
+    - Package structure in site-packages (e.g., `ls -la /path/to/site-packages/package_name/`)
+- Use uv, but pay attention to the above
+  - Again always use `uv pip install -U .` for full installation, never editable installs (`pip install -e`). This ensures proper testing of the actual distribution.
+- Use async (e.g. asyncio) wherever it makes sense. Avoid multithreading, though multiprocessing is OK. Multiprocess for CPU-bound concurrency, and asyncio for I/O bound, cooperative etc.
+- Be pythonic. Avoid e.g. complex abstract class hierarchies for the sake of them, though classes are also fine in many usage patterns. We love dictionaries, dynamic dispatch, etc.
+  - I don't consider Pydantic very Pythonic, so we can tolerate it if need be (e.g. we're using a toolkit that strictly works with Pydantic), but otherwise, simple dataclasses are better.
+- Type hints are OK in moderation, but avoid absolutely littering the code with them.
+  - No excess imports & symbols, e.g. Use type | None rather than Optional[type]
+- use iterator patterns as much as practical. Also functional programming approaches, including partials (currying) and decorators
+- Preferred tools:
+  - Logging: structlog
+  - Retries on failure: tenacity
+  - CLI argument processing: fire; avoid argparse except for truly trivial usage
+  - CLI formatting: rich
+  - HTTP client: httpx (async)
+  - HTML/XML parsing: selectolax (though for now we're using html5-modern as the base implementation for our html5 features)
+  - Browser-like Web crawling/scraping: Python playwright (with playwright_stealth if needed)
+  - pytest, as well as pytest-mock, pytest-httpx, pytest-asyncio
+  - rapidfuzz for fuzzy text matching
+- AVOID the following unless explicitly requested or otherwise unavoidable:
+  - langchain
+
+- Once again PREFER SINGLE QUOTES
