Merged
Changes from 40 commits
41 commits
6af889b
perf: implement lazy loading for 30x faster imports
devjerry0 Dec 12, 2025
e2700ed
refactor: use early returns instead of if-elif-else
devjerry0 Dec 12, 2025
41ee86f
refactor: use itertools.chain and dict comprehensions
devjerry0 Dec 12, 2025
77adc40
refactor: make code read like poetry with descriptive names
devjerry0 Dec 12, 2025
d6fb246
refactor: move month abbreviations from code to data file
devjerry0 Dec 12, 2025
9c7180b
refactor: move safety_keys from code to data file
devjerry0 Dec 12, 2025
7bfc868
refactor: replace nested comprehension with explicit loop
devjerry0 Dec 12, 2025
a9a6919
refactor: extract apostrophe variant generation into method
devjerry0 Dec 12, 2025
df524af
refactor: extract helper methods for step-down readability
devjerry0 Dec 12, 2025
826ec5f
refactor: separate data loading from core matching logic
devjerry0 Dec 12, 2025
60f13b4
refactor: remove magic numbers from intersperse
devjerry0 Dec 12, 2025
783d985
perf: skip apostrophe normalization for strings without apostrophes
devjerry0 Dec 12, 2025
8c5c2d2
refactor: optimize user-facing API with caching and validation
devjerry0 Dec 12, 2025
de04eeb
test: add missing tests to achieve 100% coverage
devjerry0 Dec 12, 2025
1617cde
refactor: eliminate global statements using state class
devjerry0 Dec 12, 2025
f722c7c
refactor: move State class to separate file
devjerry0 Dec 12, 2025
817587a
refactor: proper single responsibility separation
devjerry0 Dec 12, 2025
9466c58
refactor: eliminate all type ignores with proper typing
devjerry0 Dec 12, 2025
17ad886
refactor: eliminate cryptic abbreviations from naming
devjerry0 Dec 12, 2025
6413f6b
refactor: rename 'flank' to 'context_chars' for clarity
devjerry0 Dec 12, 2025
a89b27a
refactor: eliminate code and string duplication
devjerry0 Dec 12, 2025
9fb5863
feat: comprehensive validation and production safety
devjerry0 Dec 12, 2025
847ca3c
refactor: extract viewing window logic into clear helper method
devjerry0 Dec 12, 2025
22b2e95
refactor: rename files for clarity - no more vague names
devjerry0 Dec 12, 2025
c48bae7
feat: create processor module for text processing
devjerry0 Dec 12, 2025
019ea08
feat: create extensions module for user customizations
devjerry0 Dec 12, 2025
a9cc1d7
refactor: convert api.py to pure facade pattern
devjerry0 Dec 12, 2025
eafd7f5
feat: rename fix() to expand() with deprecation
devjerry0 Dec 12, 2025
deb849b
refactor: rename json_loader.py → file_loader.py
devjerry0 Dec 12, 2025
649ddbe
refactor: rename dict_builders.py → transformers.py
devjerry0 Dec 12, 2025
8d3f86c
refactor: rename loaders.py → bootstrap.py
devjerry0 Dec 12, 2025
ba3b22a
refactor: update imports in matchers.py
devjerry0 Dec 12, 2025
aa8e801
feat: add validate_file_contains_dict() to validation
devjerry0 Dec 12, 2025
a87f73b
test: update tests for expand() and new API
devjerry0 Dec 12, 2025
020508b
test: rename test_data_io.py → test_file_loader.py
devjerry0 Dec 12, 2025
1a81d47
docs: update README for expand() and new API
devjerry0 Dec 12, 2025
f995b8c
chore: remove obsolete files
devjerry0 Dec 12, 2025
740b00b
chore: remove unused modules and tests
devjerry0 Dec 12, 2025
d3312cb
fix: restrict publish workflow to version tags only
devjerry0 Dec 12, 2025
97ca413
chore: remove old renamed files to restore 100% coverage
devjerry0 Dec 12, 2025
8a73b25
chore: bump version to 0.3.0
devjerry0 Dec 12, 2025
2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
@@ -3,7 +3,7 @@ name: Publish to PyPI
 on:
   push:
     tags:
-      - 'v*'
+      - 'v*.*.*'

 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}
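For reviewers who want to see what the narrower tag filter excludes, here is a rough sketch. It uses Python's `fnmatch` as an approximation of GitHub Actions filter patterns; the two syntaxes are similar but not identical (for example, `*` in GitHub's syntax does not match `/`), and the sample tag names are made up for illustration.

```python
from fnmatch import fnmatchcase

OLD_PATTERN = "v*"      # pattern before this change
NEW_PATTERN = "v*.*.*"  # pattern after this change

# Hypothetical tag names, chosen only to illustrate the difference.
tags = ["v0.3.0", "v1.0", "vnext", "v1.2.3-rc1"]

# fnmatchcase is only an approximation of GitHub's matcher.
old_matches = [tag for tag in tags if fnmatchcase(tag, OLD_PATTERN)]
new_matches = [tag for tag in tags if fnmatchcase(tag, NEW_PATTERN)]

print(old_matches)  # every tag starting with "v"
print(new_matches)  # only tags with at least two dots after the "v"
```

Under this approximation the new pattern still accepts suffixed tags such as `v1.2.3-rc1`, so it narrows the trigger without enforcing strict semantic versioning.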
42 changes: 23 additions & 19 deletions README.md
@@ -1,7 +1,7 @@
# sane-contractions

[![Tests](https://github.com/devjerry0/sane-contractions/actions/workflows/commit.yml/badge.svg)](https://github.com/devjerry0/sane-contractions/actions/workflows/commit.yml)
-[![Coverage](https://img.shields.io/badge/coverage-100%25-brightgreen.svg)](https://github.com/devjerry0/sane-contractions)
+[![codecov](https://codecov.io/gh/devjerry0/sane-contractions/branch/main/graph/badge.svg)](https://codecov.io/gh/devjerry0/sane-contractions)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

@@ -31,18 +31,22 @@ pip install sane-contractions
uv pip install sane-contractions
```

-[uv](https://github.com/astral-sh/uv) is 10-100x faster than pip and is a drop-in replacement.
+[uv](https://github.com/astral-sh/uv)

## Quick Start

```python
import contractions

-contractions.fix("you're happy now")
+contractions.expand("you're happy now")
 # "you are happy now"

-contractions.fix("I'm sure you'll love it!")
+contractions.expand("I'm sure you'll love it!")
 # "I am sure you will love it!"
+
+# Shorthand aliases
+contractions.e("you're") # "you are"
+contractions.p("you're", 5) # preview with context
```

## Usage
@@ -53,21 +57,21 @@ contractions.fix("I'm sure you'll love it!")
 import contractions

 text = "I'm sure you're going to love what we've done"
-expanded = contractions.fix(text)
+expanded = contractions.expand(text)
print(expanded)
# "I am sure you are going to love what we have done"
```

### Controlling Slang Expansion

```python
-contractions.fix("yall're gonna love this", slang=True)
+contractions.expand("yall're gonna love this", slang=True)
 # "you all are going to love this"

-contractions.fix("yall're gonna love this", slang=False)
+contractions.expand("yall're gonna love this", slang=False)
 # "yall are going to love this"

-contractions.fix("yall're gonna love this", leftovers=False)
+contractions.expand("yall're gonna love this", leftovers=False)
 # "yall are gonna love this"
# "yall are gonna love this"
```

@@ -76,9 +80,9 @@ contractions.fix("yall're gonna love this", leftovers=False)
The library intelligently preserves the case pattern of the original contraction:

```python
-contractions.fix("you're happy") # "you are happy"
-contractions.fix("You're happy") # "You are happy"
-contractions.fix("YOU'RE HAPPY") # "YOU ARE HAPPY"
+contractions.expand("you're happy") # "you are happy"
+contractions.expand("You're happy") # "You are happy"
+contractions.expand("YOU'RE HAPPY") # "YOU ARE HAPPY"
```

### Adding Custom Contractions
@@ -87,7 +91,7 @@ Add a single contraction:

```python
contractions.add('myword', 'my word')
-contractions.fix('myword is great')
+contractions.expand('myword is great')
# "my word is great"
```

@@ -102,17 +106,17 @@ custom_contractions = {
 }
 contractions.add_dict(custom_contractions)

-contractions.fix("ain't gonna happen")
+contractions.expand("ain't gonna happen")
# "are not going to happen"
```

Load contractions from a JSON file:

```python
# custom_contractions.json contains: {"myterm": "my expansion", "another": "another word"}
-contractions.load_json("custom_contractions.json")
+contractions.load_file("custom_contractions.json")

-contractions.fix("myterm is great")
+contractions.expand("myterm is great")
# "my expansion is great"
```

@@ -137,7 +141,7 @@ for item in preview:

## API Reference

-### `fix(text, leftovers=True, slang=True)`
+### `expand(text, leftovers=True, slang=True)`

Expands contractions in the given text.

@@ -163,7 +167,7 @@ Adds multiple custom contractions at once.
 **Parameters:**
 - `dictionary` (dict): Dictionary mapping contractions to their expansions

-### `load_json(filepath)`
+### `load_file(filepath)`

Loads custom contractions from a JSON file.

@@ -180,7 +184,7 @@ Preview contractions in text before expanding.

 **Parameters:**
 - `text` (str): The text to analyze
-- `flank` (int): Number of characters to show before/after each match
+- `context_chars` (int): Number of characters to show before/after each match

**Returns:** `list[dict]` - List of matches with context information

@@ -280,7 +284,7 @@ This fork includes several enhancements over the original `contractions` library

 ### 🆕 New Features
 - **`add_dict()`** - Bulk add custom contractions from a dictionary
-- **`load_json()`** - Load contractions from JSON files
+- **`load_file()`** - Load contractions from JSON files
 - **Type hints** - Full type coverage with mypy validation
 - **Better structure** - Modular code organization (core, api modules)

129 changes: 0 additions & 129 deletions README.rst

This file was deleted.

17 changes: 14 additions & 3 deletions contractions/__init__.py
@@ -1,5 +1,16 @@
+import warnings
+
 from ._version import __version__
-from .api import add, add_dict, load_json, preview
-from .core import fix
+from .api import add, add_dict, e, expand, load_file, p, preview


+def fix(*args, **kwargs):
+    warnings.warn(
+        "fix() is deprecated and will be removed in v1.0.0. Use expand() instead.",
+        DeprecationWarning,
+        stacklevel=2
+    )
+    return expand(*args, **kwargs)


-__all__ = ["fix", "add", "add_dict", "load_json", "preview", "__version__"]
+__all__ = ["expand", "fix", "add", "add_dict", "load_file", "preview", "e", "p", "__version__"]
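A minimal, self-contained sketch of how the deprecation shim above is expected to behave from a caller's point of view. The `expand()` here is a hypothetical stand-in that handles a single contraction, not the library's implementation; only the shape of `fix()` is taken from the diff.

```python
import warnings


def expand(text: str) -> str:
    # Hypothetical stand-in for contractions.expand(); handles one case only.
    return text.replace("you're", "you are")


def fix(*args, **kwargs):
    # Same shape as the shim in the diff: warn, then delegate to expand().
    warnings.warn(
        "fix() is deprecated and will be removed in v1.0.0. Use expand() instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to fix() itself
    )
    return expand(*args, **kwargs)


with warnings.catch_warnings(record=True) as caught:
    # DeprecationWarning is often filtered out by default, so enable it here.
    warnings.simplefilter("always")
    result = fix("you're happy now")

print(result)
print(caught[0].category.__name__, "-", caught[0].message)
```

Existing callers of `fix()` keep getting the same return value; the only observable difference is the warning, which test suites can assert on or promote to an error.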
51 changes: 16 additions & 35 deletions contractions/api.py
@@ -1,47 +1,28 @@
-import json
+from .extensions import add_custom_contraction, add_custom_dict, load_custom_from_file
+from .processor import expand as _expand
+from .processor import preview as _preview

-from .core import replacers, ts_view_window

+def expand(text: str, leftovers: bool = True, slang: bool = True) -> str:
+    return _expand(text, leftovers, slang)

-def add(key, value):
-    for ts in replacers.values():
-        ts.add(key, value)
-    ts_view_window.add([key])

+def preview(text: str, context_chars: int) -> list[dict[str, str | int]]:
+    return _preview(text, context_chars)

-def add_dict(dictionary):
-    for ts in replacers.values():
-        ts.add(dictionary)
-    ts_view_window.add(list(dictionary.keys()))

+def add(contraction: str, expansion: str) -> None:
+    return add_custom_contraction(contraction, expansion)

-def load_json(filepath):
-    with open(filepath, encoding="utf-8") as f:
-        data = json.load(f)
-    add_dict(data)

+def add_dict(contractions_dict: dict[str, str]) -> None:
+    return add_custom_dict(contractions_dict)

-def preview(text, flank):
-    """
-    Return all contractions and their location before fix for manual check. Also provide a viewing window to quickly
-    preview the contractions in the text.
-    :param text: texture.
-    :param flank: int number, control the size of the preview window. The window would be "flank-contraction-flank".
-    :return: preview_items, a list includes all matched contractions and their locations.
-    """
-    if not isinstance(flank, int):
-        raise TypeError("Argument flank must be integer!")

-    results = ts_view_window.findall(text)
-    text_len = len(text)
+def load_file(filepath: str) -> None:
+    return load_custom_from_file(filepath)

-    return [
-        {
-            "match": result.match,
-            "start": result.start,
-            "end": result.end,
-            "viewing_window": text[max(0, result.start - flank):min(text_len, result.end + flank)]
-        }
-        for result in results
-    ]

+e = expand
+p = preview
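
The removed `preview()` body computed its viewing window with a clamped slice, and that logic now lives behind `processor.preview`, which is not shown in this diff. The sketch below restates only the slicing step under that assumption; `viewing_window` is a hypothetical helper name, and the match position is found with `str.index` rather than the library's matcher.

```python
def viewing_window(text: str, start: int, end: int, context_chars: int) -> str:
    """Return the match plus up to `context_chars` characters on each side."""
    # Clamp both bounds so a match near either end of the text does not
    # produce a negative start or an end past len(text).
    window_start = max(0, start - context_chars)
    window_end = min(len(text), end + context_chars)
    return text[window_start:window_end]


text = "I think you're right"
start = text.index("you're")   # stand-in for the matcher's result.start
end = start + len("you're")    # stand-in for the matcher's result.end

narrow = viewing_window(text, start, end, 3)
wide = viewing_window(text, start, end, 100)

print(repr(narrow))  # 'nk you\'re ri'
print(repr(wide))    # the whole text, because both bounds are clamped
```

The clamping is what makes a large `context_chars` safe: it degrades to returning the full text instead of raising or wrapping around.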

17 changes: 17 additions & 0 deletions contractions/bootstrap.py
@@ -0,0 +1,17 @@
+from .file_loader import load_dict_data, load_list_data
+from .transformers import build_apostrophe_variants, normalize_apostrophes
+
+
+def load_all_contractions() -> tuple[dict[str, str], dict[str, str], dict[str, str]]:
+    contractions_dict = load_dict_data("contractions_dict.json")
+    leftovers_dict = load_dict_data("leftovers_dict.json")
+    slang_dict = load_dict_data("slang_dict.json")
+    safety_keys = frozenset(load_list_data("safety_keys.json"))
+
+    contractions_dict |= normalize_apostrophes(contractions_dict)
+    leftovers_dict |= normalize_apostrophes(leftovers_dict)
+
+    unsafe_dict = build_apostrophe_variants(contractions_dict, safety_keys)
+    slang_dict.update(unsafe_dict)
+
+    return contractions_dict, leftovers_dict, slang_dict
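
The `|=` lines in `load_all_contractions()` merge extra keys into each dictionary in place. `transformers.py` is not part of this excerpt, so the `normalize_apostrophes` below is a guess at its behavior (adding a typographic-apostrophe variant for each key that contains a straight apostrophe) and should be read as a hypothetical stand-in, not the real function.

```python
def normalize_apostrophes(contractions: dict[str, str]) -> dict[str, str]:
    # Hypothetical stand-in: for every key with a straight apostrophe, build
    # a variant that uses the typographic apostrophe (U+2019) instead.
    return {
        key.replace("'", "\u2019"): expansion
        for key, expansion in contractions.items()
        if "'" in key
    }


contractions_dict = {"you're": "you are", "gonna": "going to"}

# The right-hand side is fully evaluated before the in-place merge, so the
# dictionary is never mutated while it is being iterated.
contractions_dict |= normalize_apostrophes(contractions_dict)

print(sorted(contractions_dict))
```

The in-place dict union operator requires Python 3.9 or later, which is consistent with the Python 3.10+ badge in the README.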