This repository was archived by the owner on Nov 8, 2022. It is now read-only.
examples/np_semantic_segmentation/README.md
3 additions & 4 deletions
@@ -19,24 +19,23 @@ The expected dataset is a CSV file with 2 columns. the first column contains the
If you wish to use an existing dataset for training the model, you can download the Tratz et al. 2011 dataset [1,2] from the following link:
[Tratz 2011 Dataset](https://vered1986.github.io/papers/Tratz2011_Dataset.tar.gz). It is also available [here](https://www.isi.edu/publications/licensed-sw/fanseparser/index.html).
- (The terms and conditions of the data set license apply. Intel does not grant any rights to the data files or database. see relevant [license agreement](http://www.apache.org/licenses/LICENSE-2.0))
+ (The terms and conditions of the data set license apply. Intel does not grant any rights to the data files or database.
After downloading and unzipping the dataset, run `preprocess_tratz2011.py` in order to construct the labeled data and save it in a CSV file (as expected for the model).
The script reads 2 .tsv files ('tratz2011_coarse_grained_random/train.tsv' and 'tratz2011_coarse_grained_random/val.tsv') and outputs 2 .csv files accordingly.
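
As a rough illustration of this kind of TSV-to-CSV conversion, here is a minimal sketch. It is not the code of `preprocess_tratz2011.py`: the function name `tsv_to_csv`, the assumed input layout (two words followed by a label, tab-separated), and the sample rows with placeholder labels are all hypothetical and only serve to show the shape of the step.

```python
import csv
import io


def tsv_to_csv(tsv_file, csv_file):
    """Copy rows from a tab-separated stream into a two-column CSV stream.

    Assumption (hypothetical layout): each input row is
    word1 <TAB> word2 <TAB> label. The real dataset layout may differ.
    """
    reader = csv.reader(tsv_file, delimiter="\t")
    writer = csv.writer(csv_file, lineterminator="\n")
    rows_written = 0
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        phrase = f"{row[0]} {row[1]}"  # join the two words into one noun phrase
        writer.writerow([phrase, row[2]])
        rows_written += 1
    return rows_written


if __name__ == "__main__":
    # Made-up rows with placeholder labels, not taken from the dataset.
    sample = io.StringIO("olive\toil\tLABEL_A\nmorning\tmeeting\tLABEL_B\n")
    out = io.StringIO()
    tsv_to_csv(sample, out)
    print(out.getvalue())
```

The function works on file-like objects, so the same code can be applied to real files opened with `open(path, newline="")`.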
A feature vector is extracted from each Noun-Phrase string using the command `python data.py`. The features are:
* Word2Vec word embedding (a 300-dimensional vector for each word in the Noun-Phrase).
* The pre-trained Google News Word2vec model can be downloaded [here](https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing)
- * The terms and conditions of the data set license apply. Intel does not grant any rights to the data files or database. see relevant [license agreement](http://www.apache.org/licenses/LICENSE-2.0)
+ * The terms and conditions of the data set license apply. Intel does not grant any rights to the data files or database.
* Cosine distance between 2 words in the Noun-Phrase.
* NLTKCollocations score (NPMI and UCI scores).
* A binary feature indicating whether the Noun-Phrase has an existing entity in Wikidata.
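
To make two of the features above more concrete, the following sketch computes a cosine distance between two word vectors and an NPMI score from co-occurrence probabilities. It is not the code of `data.py`: the function names are hypothetical, cosine distance is assumed here to mean one minus cosine similarity, and NPMI uses the common definition (PMI divided by the negative log of the joint probability), which may differ in detail from what the NLTK-based implementation computes.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def npmi(p_xy, p_x, p_y):
    """Normalized pointwise mutual information, in the range [-1, 1].

    p_xy is the joint probability of the two words co-occurring;
    p_x and p_y are the individual word probabilities.
    """
    if not 0.0 < p_xy < 1.0:
        raise ValueError("p_xy must be strictly between 0 and 1")
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


if __name__ == "__main__":
    # Toy 2-dimensional vectors stand in for 300-dimensional embeddings.
    print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
    print(npmi(0.25, 0.5, 0.5))
```

Under these definitions, orthogonal vectors are at distance 1, and two words that co-occur exactly as often as chance predicts have an NPMI of 0.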