Commit e7f6c69

Create tests and set up CI to run them (GH-18)
2 parents acf5edf + 97f733b commit e7f6c69

14 files changed

+204
-92
lines changed

.github/workflows/tests.yml

Lines changed: 41 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,41 @@
1+
name: Tests
2+
3+
on:
4+
push:
5+
branches: [ master ]
6+
pull_request:
7+
branches: [ master, tests ]
8+
9+
jobs:
10+
test:
11+
runs-on: ${{ matrix.os || 'ubuntu-latest' }}
12+
strategy:
13+
matrix:
14+
include:
15+
- python-version: "3.6"
16+
env: python3.6
17+
os: ubuntu-20.04 # 3.6 is not available on ubuntu-latest
18+
- python-version: "3.7"
19+
env: python3.7
20+
- python-version: "3.8"
21+
env: python3.8
22+
- python-version: "3.9"
23+
env: python3.9
24+
- python-version: "3.10"
25+
env: python3.10
26+
- python-version: "3.11"
27+
env: python3.11
28+
29+
steps:
30+
- uses: actions/checkout@v2
31+
- name: Set up Python ${{ matrix.python-version }}
32+
uses: actions/setup-python@v2
33+
with:
34+
python-version: ${{ matrix.python-version }}
35+
- name: Install dependencies
36+
run: |
37+
pip install --upgrade pip
38+
python -m pip install -e .
39+
pip install tox tox-gh-actions
40+
- name: Run tests on different Python versions
41+
run: tox -e ${{ matrix.env }}

.gitignore

Lines changed: 5 additions & 1 deletion
@@ -1 +1,5 @@
1-
.idea
1+
.idea
2+
.pytest_cache
3+
.tox
4+
*.egg-info
5+
__pycache__

README.md

Lines changed: 32 additions & 36 deletions
@@ -1,64 +1,60 @@
1-
# FuzzyMap <img src="https://github.com/pysnippet.png" align="right" height="64" />
1+
# Fuzzy Map <img src="https://github.com/pysnippet.png" align="right" height="64" />
22

33
[![PyPI](https://img.shields.io/pypi/v/fuzzymap.svg)](https://pypi.org/project/fuzzymap/)
4-
[![License](https://img.shields.io/pypi/l/fuzzymap.svg)](https://github.com/pysnippet/fuzzymap/blob/master/LICENSE)
4+
[![License](https://img.shields.io/pypi/l/fuzzymap.svg?color=blue)](https://github.com/pysnippet/fuzzymap/blob/master/LICENSE)
55
[![FOSSA Status](https://app.fossa.com/api/projects/git%2Bgithub.com%2Fpysnippet%2Ffuzzymap.svg?type=shield)](https://app.fossa.com/projects/git%2Bgithub.com%2Fpysnippet%2Ffuzzymap?ref=badge_shield)
6+
[![Tests](https://github.com/pysnippet/fuzzymap/actions/workflows/tests.yml/badge.svg)](https://github.com/pysnippet/fuzzymap/actions/workflows/tests.yml)
67

7-
## What is FuzzyMap?
8+
## What is the Fuzzy Map?
89

9-
`FuzzyMap` is a polymorph Python dictionary. This kind of dictionary returns the value of the exact key if there is such
10-
a key. Otherwise, it will return the value of the most similar key satisfying the given ratio. The same mechanism works
11-
when setting a new or replacing an old key in the dictionary. If the key is not found and does not match any of the keys
12-
by the given ratio, it returns `None`.
10+
The Fuzzy Map is a polymorph Python dictionary that always returns the value of the closest similar key. This kind of
11+
dictionary returns the value of the exact key if one exists. Otherwise, it returns the value of the most
12+
similar key satisfying the given ratio. The same mechanism applies when setting a new key or replacing an old one in the
13+
dictionary. If the key is not found and no existing key matches it by the given ratio, the lookup returns `None`.
1314

14-
## How does it work?
15+
## Usage with a real-world example
1516

16-
Suppose you have scraped data from multiple sources that do not have a unique identifier, and you want to compare the
17-
values of the items having the same identifiers. Sure there will be found a field that mostly has an equivalent value
18-
at each source. And you can use that field to identify the corresponding items of other sources' data.
19-
20-
## Let's look at the following example
21-
22-
There is a live data parser that collects the coefficients of football matches from different bookmakers at once, then
23-
calculates and logs the existing forks. Many bookmakers change the name of the teams to be incomparable with names on
24-
other sites.
17+
A live data parser collects the coefficients of sports games from different bookmakers at once, and then an analyzer
18+
tries to find the existing forks. Different bookmakers use different names for the same games. Some use the full
19+
names, and others use partially abbreviated names, which makes it harder for the analyzer to find and compare the
20+
coefficients of the same game. `FuzzyMap` makes this easier because it can find a game using the name that appears
21+
in one of the sources.
2522

2623
```python
2724
from fuzzymap import FuzzyMap
2825

29-
src1 = {
26+
source_1 = {
3027
'Rapid Wien - First Vienna': {'w1': 1.93, 'x': 2.32, 'w2': 7.44},
3128
'Al Bourj - Al Nejmeh': {'w1': 26, 'x': 11.5, 'w2': 1.05},
32-
# hundreds of other teams' data
29+
# hundreds of other games' data
3330
}
3431

35-
src2 = FuzzyMap({
32+
source_2 = FuzzyMap({
3633
'Bourj FC - Nejmeh SC Beirut': {'w1': 32, 'x': 12, 'w2': 1.05},
3734
'SK Rapid Wien - First Vienna FC': {'w1': 1.97, 'x': 2.3, 'w2': 8.2},
38-
# hundreds of other teams' data
35+
# hundreds of other games' data
3936
})
4037

41-
for team, coefs1 in src1.items():
42-
coefs2 = src2[team]
38+
for game, odds1 in source_1.items():
39+
odds2 = source_2[game]
4340

44-
# coefs1 = {"w1": 1.93, "x": 2.32, "w2": 7.44}
45-
# coefs2 = {"w1": 1.97, "x": 2.3, "w2": 8.2}
46-
handle_fork(coefs1, coefs2)
41+
# odds1 = {"w1": 1.93, "x": 2.32, "w2": 7.44}
42+
# odds2 = {"w1": 1.97, "x": 2.3, "w2": 8.2}
43+
handle_fork(odds1, odds2)
4744
```
4845

49-
With a human brain, it is not difficult to identify that "Rapid Wien - First Vienna" and "SK Rapid Wien - First Vienna
50-
FC" matches are the same. In the above example, the `src2` is defined as `FuzzyMap`, it makes its keys fuzzy-matchable,
51-
and we can get an item corresponding to the key of `src1`. See the below graph demonstrating the associations of
52-
`FuzzyMap` keys.
46+
In this code example, `source_1` and `source_2` are dictionaries that map games to coefficients parsed from
47+
different sources. Converting the `source_2` dictionary to a `FuzzyMap` lets it find the
48+
corresponding game using the key from the `source_1` dictionary.
5349

5450
```mermaid
5551
graph LR
56-
src1team1[Rapid Wien - First Vienna]-->src1coefs1["{'w1': 1.93, 'x': 2.32, 'w2': 7.44}"]
57-
src1team2[Al Bourj - Al Nejmeh]-->src1coefs2["{'w1': 26, 'x': 11.5, 'w2': 1.05}"]
58-
src2team1[SK Rapid Wien - First Vienna FC]-->src2coefs1["{'w1': 1.97, 'x': 2.3, 'w2': 8.2}"]
59-
src2team2[Bourj FC - Nejmeh SC Beirut]-->src2coefs2["{'w1': 32, 'x': 12, 'w2': 1.05}"]
60-
src1team1-->src2coefs1
61-
src1team2-->src2coefs2
52+
src1team1[Rapid Wien - First Vienna] --> src1coefs1["{'w1': 1.93, 'x': 2.32, 'w2': 7.44}"]
53+
src1team2[Al Bourj - Al Nejmeh] --> src1coefs2["{'w1': 26, 'x': 11.5, 'w2': 1.05}"]
54+
src2team1[SK Rapid Wien - First Vienna FC] --> src2coefs1["{'w1': 1.97, 'x': 2.3, 'w2': 8.2}"]
55+
src2team2[Bourj FC - Nejmeh SC Beirut] --> src2coefs2["{'w1': 32, 'x': 12, 'w2': 1.05}"]
56+
src1team1 --> src2coefs1
57+
src1team2 --> src2coefs2
6258
```
6359

6460
## License

fuzzymap/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -17,3 +17,4 @@
1717
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
1818

1919
__all__ = ('FuzzyMap',)
20+
__version__ = '1.1.0'

fuzzymap/fuzzymap.py

Lines changed: 3 additions & 3 deletions
@@ -18,7 +18,7 @@ class FuzzyMap(dict):
1818
# diff between the compared keys
1919
ratio = 60
2020

21-
def closest_key(self, key):
21+
def _closest_key(self, key):
2222
"""Returns the closest key matched by the given ratio"""
2323

2424
if len(self):
@@ -35,7 +35,7 @@ def get(self, key, default=None):
3535
return self[key] or default
3636

3737
def __missing__(self, key):
38-
return super().get(self.closest_key(key))
38+
return super().get(self._closest_key(key))
3939

4040
def __setitem__(self, key, value):
41-
super().__setitem__(self.closest_key(key), value)
41+
super().__setitem__(self._closest_key(key), value)
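
Reviewer note: the hunks above show only fragments of `fuzzymap/fuzzymap.py`, so the following is a minimal, self-contained sketch of how the renamed `_closest_key` helper could interact with `__missing__`, `__setitem__`, and `get`. It is not the library's implementation. The class name `FuzzyDictSketch` is hypothetical, the similarity score comes from the standard library's `difflib.SequenceMatcher` rather than `fuzzywuzzy` (so scores will differ from the real package), and the fallback of returning the key itself when nothing matches is an assumption.

```python
from difflib import SequenceMatcher


class FuzzyDictSketch(dict):
    """Illustrative stand-in for FuzzyMap (hypothetical, not the library's code)."""

    # Minimum similarity (0-100) for two keys to be treated as the same key.
    ratio = 60

    def _closest_key(self, key):
        # Assumption: return the most similar stored key if it reaches the
        # ratio, otherwise return the given key unchanged.
        best_key, best_score = key, 0.0
        for candidate in self.keys():
            score = 100 * SequenceMatcher(None, str(key), str(candidate)).ratio()
            if score > best_score:
                best_key, best_score = candidate, score
        return best_key if best_score >= self.ratio else key

    def get(self, key, default=None):
        # Mirrors the context line in the diff: falsy values fall back to default.
        return self[key] or default

    def __missing__(self, key):
        # dict.__getitem__ calls this only when the exact key is absent.
        return super().get(self._closest_key(key))

    def __setitem__(self, key, value):
        # Writing to a near-match overwrites the existing entry instead of adding one.
        super().__setitem__(self._closest_key(key), value)


odds = FuzzyDictSketch({
    'SK Rapid Wien - First Vienna FC': {'w1': 1.97, 'x': 2.3, 'w2': 8.2},
})

match = odds['Rapid Wien - First Vienna']   # fuzzy hit on the stored key
miss = odds['Unrelated']                    # no key is similar enough
fallback = odds.get('Unrelated', 'n/a')     # default is used when nothing matches

odds['Rapid Wien - First Vienna'] = {'w1': 2.0, 'x': 2.2, 'w2': 8.0}
keys_after_write = list(odds)               # still a single, original key
```

Because `_closest_key` is now prefixed with an underscore, callers are expected to go through item access, `get`, and assignment as in the usage lines above rather than calling the helper directly.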

pyproject.toml

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
[build-system]
2+
requires = ["setuptools>=42.0", "wheel"]
3+
build-backend = "setuptools.build_meta"
4+
5+
[tool.pytest.ini_options]
6+
testpaths = ["tests"]
7+
filterwarnings = ["ignore::DeprecationWarning"]

requirements.txt

Lines changed: 0 additions & 2 deletions
This file was deleted.

setup.cfg

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
1+
[metadata]
2+
name = fuzzymap
3+
version = attr: fuzzymap.__version__
4+
author = Artyom Vancyan
5+
author_email = [email protected]
6+
description = The Fuzzy Map is a polymorph Python dictionary that always returns the value of the closest similar key.
7+
long_description = file: README.md
8+
long_description_content_type = text/markdown
9+
url = https://github.com/pysnippet/fuzzymap
10+
keywords =
11+
python
12+
map
13+
dict
14+
match
15+
fuzzy
16+
matching
17+
dictionary
18+
fuzzywuzzy
19+
license = GPLv2
20+
license_files = LICENSE
21+
platforms = unix, linux, osx, win32
22+
classifiers =
23+
License :: OSI Approved :: GNU General Public License v2 (GPLv2)
24+
Topic :: Software Development :: Libraries :: Python Modules
25+
Operating System :: OS Independent
26+
Programming Language :: Python :: 3.6
27+
Programming Language :: Python :: 3.7
28+
Programming Language :: Python :: 3.8
29+
Programming Language :: Python :: 3.9
30+
Programming Language :: Python :: 3.10
31+
Programming Language :: Python :: 3.11
32+
Programming Language :: Python :: 3.12
33+
34+
[options]
35+
packages =
36+
fuzzymap
37+
install_requires =
38+
fuzzywuzzy>=0.3.0
39+
python-Levenshtein>=0.12.1
40+
python_requires = >=3.6
41+
zip_safe = no
42+
43+
[options.extras_require]
44+
testing =
45+
pytest>=6.0
46+
tox>=3.24

setup.py

Lines changed: 3 additions & 50 deletions
@@ -1,54 +1,7 @@
11
# Copyright (C) 2022 Artyom Vancyan
22
# See full copyright notice at __init__.py
3-
import subprocess
43

5-
import setuptools
4+
from setuptools import setup
65

7-
version = (
8-
subprocess.run(["git", "describe", "--tags"], stdout=subprocess.PIPE)
9-
.stdout.decode("utf-8")
10-
.strip()
11-
)
12-
13-
if "-" in version:
14-
# when not on tag, git describe outputs: "v1.0.0-4-g24a8f40"
15-
# pip has gotten strict with version numbers
16-
# so change it to: "1.0.0+4.git.g24a8f40"
17-
# See: https://peps.python.org/pep-0440/#local-version-segments
18-
v, i, s = version.split("-")
19-
version = v + "+" + i + ".git." + s
20-
21-
assert "-" not in version
22-
assert "." in version
23-
24-
with open("README.md", "r", encoding="utf-8") as fp:
25-
long_description = fp.read()
26-
27-
setuptools.setup(
28-
name="fuzzymap",
29-
version=version,
30-
author="Artyom Vancyan",
31-
author_email="[email protected]",
32-
description="Python dictionary with a FUZZY key-matching opportunity",
33-
long_description=long_description,
34-
long_description_content_type="text/markdown",
35-
url="https://github.com/pysnippet/fuzzymap",
36-
packages=setuptools.find_packages(),
37-
classifiers=[
38-
"License :: OSI Approved :: GNU General Public License v2 (GPLv2)",
39-
"Operating System :: OS Independent",
40-
'Programming Language :: Python :: 3.6',
41-
'Programming Language :: Python :: 3.7',
42-
'Programming Language :: Python :: 3.8',
43-
'Programming Language :: Python :: 3.9',
44-
'Programming Language :: Python :: 3.10',
45-
'Programming Language :: Python :: 3.11',
46-
'Programming Language :: Python :: 3.12',
47-
'Topic :: Software Development :: Libraries :: Python Modules',
48-
],
49-
python_requires=">=3.6",
50-
install_requires=[
51-
"fuzzywuzzy>=0.3.0",
52-
"python-Levenshtein>=0.12.1",
53-
],
54-
)
6+
if __name__ == "__main__":
7+
setup()

tests/__init__.py

Whitespace-only changes.
