Commit e628bec

committed
Refactor prose
1 parent 0d365e7 commit e628bec

1 file changed, +50 -26 lines changed

readme.md

Lines changed: 50 additions & 26 deletions
@@ -1,21 +1,29 @@
-# retext-keywords [![Build][build-badge]][build] [![Coverage][coverage-badge]][coverage] [![Downloads][downloads-badge]][downloads] [![Chat][chat-badge]][chat]
+# retext-keywords
 
-Keyword extraction with [**retext**][retext].
+[![Build][build-badge]][build]
+[![Coverage][coverage-badge]][coverage]
+[![Downloads][downloads-badge]][downloads]
+[![Size][size-badge]][size]
+[![Sponsors][sponsors-badge]][collective]
+[![Backers][backers-badge]][collective]
+[![Chat][chat-badge]][chat]
 
-## Installation
+[**retext**][retext] plugin to extract keywords and key-phrases.
+
+## Install
 
 [npm][]:
 
-```bash
+```sh
 npm install retext-keywords
 ```
 
-## Usage
+## Use
 
 Say we have the following file, `example.txt`, with the first four paragraphs
 on [Term Extraction][term-extraction] from Wikipedia:
 
-```text
+```txt
 Terminology mining, term extraction, term recognition, or glossary extraction, is a subtask of information extraction. The goal of terminology extraction is to automatically extract relevant terms from a given corpus.
 
 In the semantic web era, a growing number of communities and networked enterprises started to access and interoperate through the internet. Modeling these communities and their information needs is important for several web applications, like topic-driven web crawlers, web services, recommender systems, etc. The development of terminology extraction is essential to the language industry.
@@ -25,9 +33,9 @@ One of the first steps to model the knowledge domain of a virtual community is t
 Typically, approaches to automatic term extraction make use of linguistic processors (part of speech tagging, phrase chunking) to extract terminological candidates, i.e. syntactically plausible terminological noun phrases, NPs (e.g. compounds "credit card", adjective-NPs "local tourist information office", and prepositional-NPs "board of directors" - in English, the first two constructs are the most frequent). Terminological entries are then filtered from the candidate list using statistical and machine learning methods. Once filtered, because of their low ambiguity and high specificity, these terms are particularly useful for conceptualizing a knowledge domain or for supporting the creation of a domain ontology. Furthermore, terminology extraction is a very useful starting point for semantic similarity, knowledge management, human translation and machine translation, etc.
 ```
 
-And our script, `example.js`, looks as follows:
+…and our script, `example.js`, looks as follows:
 
-```javascript
+```js
 var vfile = require('to-vfile')
 var retext = require('retext')
 var keywords = require('retext-keywords')
@@ -58,7 +66,7 @@ function done(err, file) {
 
 Now, running `node example` yields:
 
-```text
+```txt
 Keywords:
 term
 extraction
@@ -80,8 +88,9 @@ communities
 
 Extract keywords and key-phrases from the document.
 
-The results are stored on `file.data`: keywords at `file.data.keywords`
-and key-phrases at `file.data.keyphrases`. Both are lists.
+The results are stored on `file.data`: keywords at `file.data.keywords` and
+key-phrases at `file.data.keyphrases`.
+Both are lists.
 
 A single keyword looks as follows:
 
@@ -97,7 +106,7 @@ A single keyword looks as follows:
 }
 ```
 
-...and a key-phrase:
+and a key-phrase:
 
 ```js
 {
@@ -112,22 +121,23 @@ A single keyword looks as follows:
 }
 ```
 
-###### `options`
+###### `options.maximum`
 
-* `maximum` (default: `5`) — Try to detect `words` and `phrases`
-words;
+Try to detect at most `maximum` `words` and `phrases` (`number`, default: `5`).
 
-Note that actual counts may differ. For example, when two words
-have the same score, both will be returned. Or when too few words
-exist, less will be returned. the same goes for phrases.
+Note that actual counts may differ.
+For example, when two words have the same score, both will be returned.
+Or when too few words exist, less will be returned. the same goes for phrases.
 
 ## Contribute
 
-See [`contributing.md` in `retextjs/retext`][contributing] for ways to get
-started.
+See [`contributing.md`][contributing] in [`retextjs/.github`][health] for ways
+to get started.
+See [`support.md`][support] for ways to get help.
 
-This organisation has a [Code of Conduct][coc]. By interacting with this
-repository, organisation, or community you agree to abide by its terms.
+This project has a [Code of Conduct][coc].
+By interacting with this repository, organisation, or community you agree to
+abide by its terms.
 
 ## License
 
@@ -147,20 +157,34 @@ repository, organisation, or community you agree to abide by its terms.
 
 [downloads]: https://www.npmjs.com/package/retext-keywords
 
+[size-badge]: https://img.shields.io/bundlephobia/minzip/retext-keywords.svg
+
+[size]: https://bundlephobia.com/result?p=retext-keywords
+
+[sponsors-badge]: https://opencollective.com/unified/sponsors/badge.svg
+
+[backers-badge]: https://opencollective.com/unified/backers/badge.svg
+
+[collective]: https://opencollective.com/unified
+
 [chat-badge]: https://img.shields.io/badge/join%20the%20community-on%20spectrum-7b16ff.svg
 
 [chat]: https://spectrum.chat/unified/retext
 
 [npm]: https://docs.npmjs.com/cli/install
 
+[health]: https://github.com/retextjs/.github
+
+[contributing]: https://github.com/retextjs/.github/blob/master/contributing.md
+
+[support]: https://github.com/retextjs/.github/blob/master/support.md
+
+[coc]: https://github.com/retextjs/.github/blob/master/code-of-conduct.md
+
 [license]: license
 
 [author]: https://wooorm.com
 
 [retext]: https://github.com/retextjs/retext
 
 [term-extraction]: https://en.wikipedia.org/wiki/Terminology_extraction
-
-[contributing]: https://github.com/retextjs/retext/blob/master/contributing.md
-
-[coc]: https://github.com/retextjs/retext/blob/master/code-of-conduct.md
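
Note on the `options.maximum` wording in this diff: the README says that at most `maximum` results are attempted, that ties are all returned, and that fewer come back when too few exist. A minimal sketch of that selection rule follows. It is only an illustration and not the plugin's actual implementation: `takeTop` and the `{stem, score}` entry shape used in the example call are hypothetical.

```js
// Hypothetical helper, not part of retext-keywords.
// Keeps the `maximum` highest-scoring entries, plus any entries whose score
// ties with the last one kept. Returns fewer when too few entries exist.
function takeTop(entries, maximum) {
  // Copy before sorting so the caller's list is left untouched.
  var sorted = entries.slice().sort(function (a, b) {
    return b.score - a.score
  })

  if (maximum <= 0) return []
  if (sorted.length <= maximum) return sorted

  // Score of the last entry that fits within `maximum`.
  var cutoff = sorted[maximum - 1].score

  return sorted.filter(function (entry) {
    return entry.score >= cutoff
  })
}
```

Under these assumptions, calling `takeTop` with a `maximum` of `2` on entries scored `1`, `0.5`, `0.5`, and `0.2` keeps three entries, because the second and third are tied.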
