Commit 98002bf

Update README.md
1 parent fd443c7 commit 98002bf

1 file changed

Lines changed: 22 additions & 4 deletions

README.md

@@ -1,8 +1,7 @@
11
# Discovering spelling variants on Urban Dictionary
2+
Source code of the paper [How to Evaluate Word Representations of Informal Domain?](https://arxiv.org/abs/1911.04669)
23

3-
4-
Scraping Urban Dict through website and API :bamboo:
5-
-------------
4+
## Scraping data from [Urban Dictionary](https://www.urbandictionary.com/) :bamboo:
65

76
* Scraping data from webpage:
87
```diff
@@ -13,5 +12,24 @@ Scraping Urban Dict through website and API :bamboo:
1312
```diff
1413
+ scrapy crawl UD_API
1514
```
15+
## Bootstrapping algorithms
16+
`UD_Extractor/`
17+
18+
## Self-training based CRF tagging
19+
`SeqLabeling/`
20+
21+
## Embedding pretraining with Tweets
22+
Train Word2Vec, FastText, and GloVe embeddings on tweet data.
23+
`trainEmbedding/`
24+
25+
## Twitter hashtag prediction task using pretrained embeddings
26+
Use Twitter hashtag prediction as the downstream task for extrinsic evaluation of the informal word vectors pretrained above.
27+
`HashtagPrediction/`
28+
29+
## Analysis
30+
Use Mean Average Precision (MAP) as the intrinsic evaluation metric on the word analogy task. Compare the correlations between the intrinsic and extrinsic tasks.
31+
`calcSim`
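
For readers unfamiliar with MAP, the following is a minimal sketch of the standard definition, not the code in `calcSim`. The function names and the toy word pairs are hypothetical and chosen only for illustration; it assumes each query word has a ranked list of candidate words and a gold set of correct answers.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked candidate list against a gold set."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            # Precision at this rank, counted only where a gold item appears.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]], gold: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over queries with gold answers."""
    queries = [q for q in gold if gold[q]]
    if not queries:
        return 0.0
    total = sum(average_precision(rankings.get(q, []), gold[q]) for q in queries)
    return total / len(queries)


if __name__ == "__main__":
    # Hypothetical toy data: informal query word -> ranked candidates.
    rankings = {"u": ["you", "ewe"], "2nite": ["today", "tonight"]}
    gold = {"u": {"you"}, "2nite": {"tonight"}}
    print(mean_average_precision(rankings, gold))
```

Queries without gold answers are skipped here rather than scored as zero; that is a design choice of this sketch and may differ from the repository's evaluation.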
32+
33+
## Web interface
34+
Informal word pair search tool, written in Flask: `demo/`
1635

17-
Source code of the paper [How to Evaluate Word Representations of Informal Domain?](https://arxiv.org/abs/1911.04669)
