Commit 151be76

Update README.md
1 parent 52fe411 commit 151be76

1 file changed

+7
-15
lines changed

README.md

Lines changed: 7 additions & 15 deletions
@@ -1,8 +1,8 @@
 # [Kex](https://pypi.org/project/kex/)
-*Kex* is a python library for unsurpervised keyword extractions:
-- [Easy interface for keyword extraction with a variety of algorithms](#extract-keywords-with-kex)
-- [Quick benchmarking over 15 English public datasets](#benchmark-on-15-public-datasets)
-- [Custom keyword extractor implementation support](#implement-custom-extractor-with-kex)
+*Kex* is a python library for unsurpervised keyword extractions, supporting the following features:
+- [Easy interface for keyword extraction with a variety of algorithms](https://github.com/asahi417/kex#extract-keywords-with-kex)
+- [Quick benchmarking over 15 English public datasets](https://github.com/asahi417/kex#benchmark-on-15-public-datasets)
+- [Custom keyword extractor implementation support]((https://github.com/asahi417/kex#implement-custom-extractor-with-kex)
 
 ## Get Started
 Install via pip
@@ -58,16 +58,15 @@ of-the-art and recent unsupervised keyphrase extraction methods.
 
 ### Compute a prior
 Algorithms such as `TF`, `TFIDF`, `TFIDFRank`, `LexSpec`, `LexRank`, `TopicalPageRank`, and `SingleTPR` need to compute
-a prior distribution beforehand:
-
+a prior distribution beforehand by
 ```python
 >>> import kex
 >>> model = kex.SingleTPR()
 >>> test_sentences = ['documentA', 'documentB', 'documentC']
 >>> model.train(test_sentences, export_directory='./tmp')
 ```
 
-Priors are cached and can be loaded on the fly:
+Priors are cached and can be loaded on the fly as
 ```python
 >>> import kex
 >>> model = kex.SingleTPR()
@@ -90,15 +89,8 @@ Users can fetch 15 public keyword extraction datasets via [`kex.get_benchmark_da
 'id': '1053.txt'
 }
 ```
-
-High level statistics of each dataset can be found [here](./benchmark/data_statistics.csv), and the benchmark results below:
-- [***Precision at top 5***](./benchmark/result.5.precision.fixed.csv)
-- [***Precision at top 10***](./benchmark/result.10.precision.fixed.csv)
-- [***MRR***](./benchmark/result.mrr.csv)
-- [***Complexity (process time)***](./benchmark/complexity.csv)
 
-A prior distributions are computed within each dataset, and complexity is an average over 100 trial on Inspec dataset.
-To reproduce the above benchmark results, please take a look an [example script](./examples/benchmark.py).
+Please take a look an [example script](https://github.com/asahi417/kex/blob/master/examples/benchmark_custom_model.py) to run a benchmark on those datasets.
 
 ## Implement Custom Extractor with `kex`
 We provide an API to run a basic pipeline for preprocessing, by which one can implement a custom keyword extractor.
