Open sourcing ether0 rewards, prompts, data utilities #2
Merged
whitead
reviewed
Jun 4, 2025
Pull Request Overview
This PR open-sources the ether0 package and its remotes extension, adds comprehensive unit tests, and ensures type and lint checks pass via mypy and CI updates.
- Introduce fixtures and unit tests covering clients, rewards, and core utilities.
- Add retrying dataset loader, text‐validation helpers, prompt templates, data/models, chat formatting, and client/server implementations.
- Update `pyproject.toml` and CI workflows to include the new `ether0.remotes` package, extras, and environment setup.
Reviewed Changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/conftest.py | Add ether0_test fixture to load the Hugging Face dataset. |
| src/ether0/utils.py | New retry loader, invalid‐character/language checks. |
| src/ether0/problem_prompts.py | Lots of new prompt template lists for problems. |
| src/ether0/models.py | Data models, enums, and filtering logic added. |
| src/ether0/model_prompts.py | XML‐based answer extraction and prompt enums. |
| src/ether0/data.py | SMILES parsing, drawing helpers, and ring/fingerprint checks. |
| src/ether0/clients.py | HTTPX‐based remote clients with retry logic. |
| src/ether0/chat.py | Chat conversation formatting for SFT/RL. |
| pyproject.toml | Pin dependencies, add ether0.remotes, mypy overrides. |
| packages/remotes/... | New ether0.remotes server, client tests, docs, and packaging. |
| docs/updated_mistral_chat_template.jinja | Jinja template for chat rendering with new tokens. |
| docs/adding_tokens.ipynb | Notebook demonstrating how to add reasoning tokens. |
Comments suppressed due to low confidence (4)
src/ether0/utils.py:48
- Add a docstring to `load_dataset_retrying` summarizing the retry behavior, exceptions handled, and retry parameters, so future readers immediately understand why and how retries are configured.
def load_dataset_retrying(
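A minimal sketch of the kind of docstring this comment asks for, written against a generic retry helper. This is not the PR's `load_dataset_retrying`: the name `call_with_retries`, its parameters, the exception types, and the exponential backoff are all assumptions chosen for illustration.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


def call_with_retries(
    loader: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: tuple[type[Exception], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `loader`, retrying on transient failures (hypothetical helper).

    Args:
        loader: Zero-argument callable that performs the load.
        max_attempts: Total number of attempts, including the first one.
        base_delay: Seconds to wait after the first failure; the wait
            doubles after each subsequent failure.
        retry_on: Exception types treated as transient. Anything else
            propagates immediately.
        sleep: Injectable sleep function, so tests need not actually wait.

    Returns:
        Whatever `loader` returns on its first successful call.

    Raises:
        The last transient exception if every attempt fails.
        ValueError: If `max_attempts` is less than 1.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return loader()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the final error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Stating the retried exception types and the backoff schedule in the docstring is what lets a reader see at a glance which failures are swallowed and how long a call may block.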
src/ether0/data.py:72
- The comment notes that counterion-containing SMILES currently fail. Add a unit test exercising that pattern (e.g., `[Cl-]` or multi-fragment SMILES) to catch regressions or document the limitation explicitly.
SMILES_PATTERN = re.compile(
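One way such a regression test could look, as a sketch only. The real `SMILES_PATTERN` is not shown in this conversation, so the regex below is a stand-in that simply lacks the `.` fragment separator; `HYPOTHETICAL_PATTERN`, `matches`, and the test cases are all assumptions, and a real test would import the pattern from `ether0.data` instead.

```python
import re

# Stand-in for illustration only; NOT the SMILES_PATTERN from src/ether0/data.py.
# It accepts common SMILES characters but omits "." (the fragment separator).
HYPOTHETICAL_PATTERN = re.compile(r"[A-Za-z0-9@+\-\[\]\(\)\\/%=#$]+")

# Multi-fragment SMILES with counterions, the case the review comment flags.
COUNTERION_CASES = ["[Na+].[Cl-]", "CC(=O)[O-].[Na+]"]


def matches(smiles: str) -> bool:
    """Return True if the whole string matches the stand-in pattern."""
    return HYPOTHETICAL_PATTERN.fullmatch(smiles) is not None


def test_counterion_smiles_are_a_known_limitation() -> None:
    # Pins the current behavior: if the pattern is later extended to
    # support counterions, this test fails and should be inverted.
    for smiles in COUNTERION_CASES:
        assert not matches(smiles)
```

Pinning the limitation in a test documents it where a future change to the pattern will surface it.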
src/ether0/clients.py:17
- This global `Counter` tracks errors across calls and may not be thread-safe. Consider moving it into a thread-local or function-scoped context, or use an atomic structure if concurrency is expected.
SERVER_ERRORS_COUNTER = Counter({
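If concurrency does turn out to matter, one option is to guard the counter with a lock. The sketch below is an assumption about how that might look, not the code in `clients.py`; the class name `LockedCounter` and the `"timeout"` key are hypothetical.

```python
import threading
from collections import Counter


class LockedCounter:
    """A Counter wrapper whose updates are serialized by a lock."""

    def __init__(self) -> None:
        self._counts: Counter[str] = Counter()
        self._lock = threading.Lock()

    def increment(self, key: str, amount: int = 1) -> None:
        # The read-modify-write on the Counter happens under the lock,
        # so concurrent increments cannot interleave and lose updates.
        with self._lock:
            self._counts[key] += amount

    def snapshot(self) -> dict[str, int]:
        # Return a copy so callers never observe a half-updated mapping.
        with self._lock:
            return dict(self._counts)
```

A module-level instance of such a class keeps the global-counter ergonomics while making the locking explicit. A thread-local counter, the other option the comment mentions, would instead give each thread its own independent tally.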
docs/updated_mistral_chat_template.jinja:20
- The `raise_exception` call isn’t a built-in Jinja directive. Replace it with the supported `{% raise "message" %}` directive or register a custom function so unmatched roles actually trigger an error at render time.
{{- raise_exception("Only user, system and assistant roles are supported!") }}
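The "register a custom function" option can be sketched with plain Jinja2 as follows. The template string is a simplified stand-in, not the contents of `docs/updated_mistral_chat_template.jinja`, and the `render` helper is hypothetical. Chat-template renderers such as the one in Hugging Face transformers are understood to register a helper of this name themselves, so whether any change is needed depends on how this template is actually rendered, which this conversation does not show.

```python
import jinja2


def raise_exception(message: str) -> None:
    """Abort rendering by raising a Jinja template error."""
    raise jinja2.TemplateError(message)


env = jinja2.Environment()
# Expose the helper to templates under the name the template calls.
env.globals["raise_exception"] = raise_exception

# Simplified stand-in template: emit content for known roles, fail otherwise.
TEMPLATE = (
    "{%- for message in messages -%}"
    "{%- if message['role'] in ['user', 'system', 'assistant'] -%}"
    "{{- message['content'] }}"
    "{%- else -%}"
    "{{- raise_exception('Only user, system and assistant roles are supported!') }}"
    "{%- endif -%}"
    "{%- endfor -%}"
)


def render(messages: list[dict[str, str]]) -> str:
    """Render the stand-in template for a list of chat messages."""
    return env.from_string(TEMPLATE).render(messages=messages)
```

With the global registered, a message with an unsupported role raises `jinja2.TemplateError` at render time. Without it, a default Jinja2 environment would fail on the call to an undefined name rather than with the intended message.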
maykcaldas
reviewed
Jun 4, 2025
db4fa0b to 55da377
maykcaldas
reviewed
Jun 5, 2025
26298a8 to 008bda9
dff6d07 to 7029c60
Co-authored-by: James Braza <jamesbraza@gmail.com>
Also includes:
mypy🥳