Commit f3e4e5d

all examples docs
1 parent da8887a commit f3e4e5d

3 files changed

+183
-24
lines changed

docs/source/data_transformations.rst

Lines changed: 3 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -14,8 +14,9 @@ create the respective ``tsv`` file.
1414
Sample transform functions
1515
^^^^^^^^^^^^^^^^^^^^^^^^^^
1616
.. automodule:: utils.tranform_functions
17-
:members: snips_intent_ner_to_tsv, snli_entailment_to_tsv,create_fragment_detection_tsv,
18-
msmarco_answerability_detection_to_tsv, msmarco_query_type_to_tsv, bio_ner_to_tsv, msmarco_query_type_to_tsv, qqp_query_similarity_to_tsv
17+
:members: snips_intent_ner_to_tsv, snli_entailment_to_tsv, create_fragment_detection_tsv,
18+
msmarco_answerability_detection_to_tsv, msmarco_query_type_to_tsv, bio_ner_to_tsv, coNLL_ner_pos_to_tsv, qqp_query_similarity_to_tsv,
19+
query_correctness_to_tsv, imdb_sentiment_data_to_tsv
1920

2021
Your own transform function
2122
^^^^^^^^^^^^^^^^^^^^^^^^^^^

docs/source/examples.rst

Lines changed: 179 additions & 21 deletions
@@ -7,7 +7,7 @@ Example-1 Intent detection, NER, Fragment detection
77

88
**Tasks Description**
99

10-
``Intent Detection`` :- This can be modeled as a single sentence classification task where an `intent` specifies which class the query belongs to.
10+
``Intent Detection`` :- This is a single sentence classification task where an `intent` specifies which class the data sample belongs to.
1111

1212
``NER`` :- This is a Named Entity Recognition/ Sequence Labelling/ Slot filling task where individual words of the sentence are tagged with an entity label it belongs to. The words which don't belong to any entity label are simply labeled as "O".
1313

@@ -17,54 +17,212 @@ Example-1 Intent detection, NER, Fragment detection
1717

1818
NER helps in extracting values for required entities (eg. location, date-time) from query.
1919

20-
Fragment detection is a very useful piece in conversational system as knowing if a query/sentence is incomplete can aid in recognising ill-formed , incomplete user queries and apprpriate expanded options can be provided to the user .
20+
Fragment detection is a useful component of a conversational system: knowing that a query/sentence is incomplete helps in discarding bad queries beforehand.
2121

22-
**Data** :- In this example, we are using the `SNIPS <https://snips-nlu.readthedocs.io/en/latest/dataset.html>`_ data for intent and entity detection. For the sake of simplicity, we provide
23-
the data in simpler form under ``snips_data`` directory taken from `here <https://github.com/LeePleased/StackPropagation-SLU/tree/master/data/snips>`_.
22+
**Intent Detection**
2423

25-
**Transform file** :- `transform_file_snips <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/intent_ner_fragment/transform_file_snips.yml>`_
24+
Query: I need a reservation for a bar in bangladesh on feb the 11th 2032
25+
26+
Intent: BookRestaurant
2627

27-
**Tasks file** :- `tasks_file_snips <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/intent_ner_fragment/tasks_file_snips.yml>`_
28+
**NER**
29+
30+
31+
Query: ['book', 'a', 'spot', 'for', 'ten', 'at', 'a', 'top-rated', 'caucasian', 'restaurant', 'not', 'far', 'from', 'selmer']
32+
33+
NER tags: ['O', 'O', 'O', 'O', 'B-party_size_number', 'O', 'O', 'B-sort', 'B-cuisine', 'B-restaurant_type', 'B-spatial_relation', 'I-spatial_relation', 'O', 'B-city']
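The tags in this sample follow the common B-/I-/O convention, so slot values can be recovered by grouping each ``B-`` token with the ``I-`` tokens that follow it. The sketch below is only an illustration of that grouping; the function name ``extract_slots`` is hypothetical and is not part of the library.

```python
def extract_slots(tokens, tags):
    """Group BIO-tagged tokens into (slot_name, text) pairs.

    Hypothetical helper for illustration; not a function of the library.
    """
    slots = []
    current_name, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new slot, closing any open one.
            if current_name is not None:
                slots.append((current_name, " ".join(current_tokens)))
            current_name, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_name == tag[2:]:
            # An I- tag of the same type continues the open slot.
            current_tokens.append(token)
        else:
            # "O" (or a stray I- tag) closes any open slot.
            if current_name is not None:
                slots.append((current_name, " ".join(current_tokens)))
            current_name, current_tokens = None, []
    if current_name is not None:
        slots.append((current_name, " ".join(current_tokens)))
    return slots


# The sample query and tags shown above.
tokens = ['book', 'a', 'spot', 'for', 'ten', 'at', 'a', 'top-rated',
          'caucasian', 'restaurant', 'not', 'far', 'from', 'selmer']
tags = ['O', 'O', 'O', 'O', 'B-party_size_number', 'O', 'O', 'B-sort',
        'B-cuisine', 'B-restaurant_type', 'B-spatial_relation',
        'I-spatial_relation', 'O', 'B-city']

slots = extract_slots(tokens, tags)
print(slots)
```

Multi-word values such as ``not far`` come out as a single ``spatial_relation`` slot, which is the form a downstream dialogue component would typically consume.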
34+
35+
36+
**Fragment Detection**
37+
38+
39+
Query: a reservation for
40+
41+
Label: fragment
42+
2843

2944
**Notebook** :- `intent_ner_fragment <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/intent_ner_fragment/intent_ner_fragment.ipynb>`_
3045

46+
**Transform file** :- `transform_file_snips <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/intent_ner_fragment/transform_file_snips.yml>`_
47+
48+
**Tasks file** :- `tasks_file_snips <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/intent_ner_fragment/tasks_file_snips.yml>`_
49+
3150
Example-2 Recognising Textual Entailment
3251
----------------------------------------
3352

3453
**Tasks Description**
3554

36-
``Entailment`` :- This is a sentence pair classification task . Given two text instances ‘premise’ and ‘Hypothesis’, Textual Entailment Recognition is the task of determining whether the hypothesis is entailed (can be inferred) from the premise or not.
55+
``Entailment`` :- This is a sentence pair classification task which determines whether the second sentence in a sample can be inferred from the first.
3756

38-
**Conversational Utility** :- it can be used for evaluating pairwise similarity between queries , also to determine if a query is out-of-context or adversarial in conversational AI usecases like FAQ question answering.
57+
**Conversational Utility** :- In a conversational AI context, this task can be seen as determining whether the second sentence is similar to the first. The predicted probability can also be used as a similarity score between the sentences.
58+
59+
Query1: An old man with a package poses in front of an advertisement.
3960

40-
**Data** :- In this example, we are using the `SNLI <https://nlp.stanford.edu/projects/snli>`_ data which is having sentence pairs and labels.
61+
Query2: A man poses in front of an ad.
62+
63+
Label: entailment
64+
65+
Query1: An old man with a package poses in front of an advertisement.
66+
67+
Query2: A man poses in front of an ad for beer.
68+
69+
Label: non-entailment
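To make the idea of using the probability as a similarity score concrete, here is a minimal sketch that turns a pair of classifier logits into an entailment probability with a softmax. The logit values and the class order (entailment first) are assumptions for illustration; the actual model output format is not shown in this document.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entailment_score(logits, entailment_index=0):
    """Return the probability assigned to the entailment class.

    The class order is an assumption; adjust entailment_index to match
    the label mapping actually used for training.
    """
    return softmax(logits)[entailment_index]


# Hypothetical logits in [entailment, non-entailment] order.
score = entailment_score([2.0, 0.0])
print(round(score, 4))
```

A score near 1 would indicate that the second sentence closely follows from the first, while a score near 0 would indicate the opposite.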
70+
71+
72+
73+
**Notebook** :- `entailment_snli <https://github.com/hellohaptik/multi-task-NLP/tree/master/examples/entailment_detection/entailment_snli.ipynb>`_
4174

4275
**Transform file** :- `transform_file_snli <https://github.com/hellohaptik/multi-task-NLP/tree/master/examples/entailment_detection/transform_file_snli.yml>`_
4376

4477
**Tasks file** :- `tasks_file_snli <https://github.com/hellohaptik/multi-task-NLP/tree/master/examples/entailment_detection/tasks_file_snli.yml>`_
4578

46-
**Notebook** :- `entailment_snli <https://github.com/hellohaptik/multi-task-NLP/tree/master/examples/entailment_detection/entailment_snli.ipynb>`_
79+
4780

4881
Example-3 Answerability detection
4982
---------------------------------
50-
5183
**Tasks Description**
5284

53-
``answerability`` :- This is modeled as a sentence pair classification task where the first sentence is a query and second sentence is a context passage.
54-
The objective of this task is to determine whether the query can be answered from the context passage or not.
85+
``answerability`` :- This is modeled as a sentence pair classification task where the first sentence is a query and second sentence is a context passage. The objective of this task is to determine whether the query can be answered from the context passage or not.
5586

56-
**Conversational Utility** :- This can be a useful component for building a question-answering/ machine comprehension based system.
57-
In such cases, it becomes very important to determine whether the given query can be answered with given context passage or not before extracting/abstracting an answer from it.
58-
Performing question-answering for a query which is not answerable from the context, could lead to incorrect answer extraction.
87+
**Conversational Utility** :- This can be a useful component for building a question-answering or machine-comprehension based system. In such cases, it is important to determine whether the given query can be answered from the given context passage before extracting or abstracting an answer from it. Performing question-answering for a query that is not answerable from the context could lead to incorrect answer extraction.
88+
89+
Query: how much money did evander holyfield make
5990

60-
**Data** :- In this example, we are using the `MSMARCO_triples <https://msmarco.blob.core.windows.net/msmarcoranking/triples.train.small.tar.gz">`_ data which is having sentence pairs and labels.
61-
The data contains triplets where the first entry is the query, second one is the context passage from which the query can be answered (positive passage) , while the third entry is a context
62-
passage from which the query cannot be answered (negative passage).
91+
Context: Evander Holyfield Net Worth. How much is Evander Holyfield Worth? Evander Holyfield Net Worth: Evander Holyfield is a retired American professional boxer who has a net worth of $500 thousand. A professional boxer, Evander Holyfield has fought at the Heavyweight, Cruiserweight, and Light-Heavyweight Divisions, and won a Bronze medal a the 1984 Olympic Games.
6392

64-
Data is transformed into sentence pair classification format, with query-positive context pair labeled as 1 (answerable) and query-negative context pair labeled as 0 (non-answerable)
93+
Label: answerable
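The earlier data description in this hunk explains that each MSMARCO triple holds a query, a positive passage and a negative passage, and that the query-positive pair is labeled 1 (answerable) while the query-negative pair is labeled 0 (non-answerable). A minimal sketch of that conversion follows; ``triples_to_pairs`` is a hypothetical name, not the library's transform function, and the negative passage is invented for illustration.

```python
def triples_to_pairs(triples):
    """Expand (query, positive, negative) triples into labeled pairs.

    Hypothetical helper: each triple yields one answerable row (label 1)
    and one non-answerable row (label 0).
    """
    rows = []
    for query, positive, negative in triples:
        rows.append((query, positive, 1))
        rows.append((query, negative, 0))
    return rows


triples = [
    (
        "how much money did evander holyfield make",
        "Evander Holyfield is a retired American professional boxer "
        "who has a net worth of $500 thousand.",
        # Made-up negative passage, used only for this sketch.
        "Destin is a city in northwest Florida.",
    )
]

for query, passage, label in triples_to_pairs(triples):
    print(label, query, passage, sep="\t")
```

Each input triple produces exactly two rows, so the resulting classification data is balanced between the two labels.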
94+
95+
**Notebook** :- `answerability_detection_msmarco <https://github.com/hellohaptik/multi-task-NLP/tree/master/examples/answerability_detection/answerability_detection_msmarco.ipynb>`_
6596

6697
**Transform file** :- `transform_file_answerability <https://github.com/hellohaptik/multi-task-NLP/tree/master/examples/answerability_detection/transform_file_answerability.yml>`_
6798

6899
**Tasks file** :- `tasks_file_answerability <https://github.com/hellohaptik/multi-task-NLP/tree/master/examples/answerability_detection/tasks_file_answerability.yml>`_
69100

70-
**Notebook** :- `answerability_detection_msmarco <https://github.com/hellohaptik/multi-task-NLP/tree/master/examples/answerability_detection/answerability_detection_msmarco.ipynb>`_
101+
Example-4 Query type detection
102+
------------------------------
103+
104+
**Tasks Description**
105+
106+
``querytype`` :- This is a single sentence classification task to determine what type (category) of answer is expected for the given query. The queries are divided into 5 major classes according to the answer expected for them.
107+
108+
**Conversational Utility** :- While returning a response for a query, knowing what kind of answer is expected for the query can help in both curating and cross-verifying an answer according to the type.
109+
110+
Query: what's the distance between destin florida and birmingham alabama?
111+
112+
Label: NUMERIC
113+
114+
Query: who is suing scott wolter
115+
116+
Label: PERSON
117+
118+
119+
120+
**Notebook** :- `query_type_detection <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/query_type_detection/query_type_detection.ipynb>`_
121+
122+
**Transform file** :- `transform_file_querytype <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/query_type_detection/transform_file_querytype.yml>`_
123+
124+
**Tasks file** :- `tasks_file_querytype <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/query_type_detection/tasks_file_querytype.yml>`_
125+
126+
Example-5 POS tagging, NER tagging
127+
----------------------------------
128+
129+
**Tasks Description**
130+
131+
``NER`` :- This is a Named Entity Recognition task where individual words of the sentence are tagged with the entity label they belong to. Words which don't belong to any entity label are simply labeled as "O".
132+
133+
``POS`` :- This is a Part of Speech tagging task. A part of speech is a category of words that have similar grammatical properties. Each word of the sentence is tagged with the part of speech label it belongs to. The words which don't belong to any part of speech label are simply labeled as "O".
134+
135+
**Conversational Utility** :- In a conversational AI context, determining the syntactic parts of the sentence can help in extracting noun phrases or important keyphrases from the sentence.
136+
137+
Query: ['Despite', 'winning', 'the', 'Asian', 'Games', 'title', 'two', 'years', 'ago', ',', 'Uzbekistan', 'are', 'in', 'the', 'finals', 'as', 'outsiders', '.']
138+
139+
NER tags: ['O', 'O', 'O', 'I-MISC', 'I-MISC', 'O', 'O', 'O', 'O', 'O', 'I-LOC', 'O', 'O', 'O', 'O', 'O', 'O', 'O']
140+
141+
POS tags: ['I-PP', 'I-VP', 'I-NP', 'I-NP', 'I-NP', 'I-NP', 'B-NP', 'I-NP', 'I-ADVP', 'O', 'I-NP', 'I-VP', 'I-PP', 'I-NP', 'I-NP', 'I-SBAR', 'I-NP', 'O']
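The tags listed as POS tags in this sample use chunk-style labels such as ``I-NP`` and ``B-NP``. Assuming they follow the IOB1 convention, where ``B-`` is used only when a chunk directly follows another chunk of the same type, noun phrases can be pulled out as sketched below. The function name ``extract_chunks`` is hypothetical and not part of the library.

```python
def extract_chunks(tokens, tags, chunk_type="NP"):
    """Collect phrases of one chunk type from IOB1-style tags.

    Hypothetical helper; assumes IOB1, where a chunk may start with I-
    and B- only separates two adjacent chunks of the same type.
    """
    chunks, current = [], []
    for token, tag in zip(tokens, tags):
        prefix, _, tag_type = tag.partition("-")
        if tag_type == chunk_type:
            # B- closes the running chunk and starts a new one.
            if prefix == "B" and current:
                chunks.append(" ".join(current))
                current = []
            current.append(token)
        elif current:
            # Any other tag (including "O") ends the running chunk.
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks


# The sample sentence and tags shown above.
tokens = ['Despite', 'winning', 'the', 'Asian', 'Games', 'title', 'two',
          'years', 'ago', ',', 'Uzbekistan', 'are', 'in', 'the', 'finals',
          'as', 'outsiders', '.']
pos_tags = ['I-PP', 'I-VP', 'I-NP', 'I-NP', 'I-NP', 'I-NP', 'B-NP', 'I-NP',
            'I-ADVP', 'O', 'I-NP', 'I-VP', 'I-PP', 'I-NP', 'I-NP', 'I-SBAR',
            'I-NP', 'O']

noun_phrases = extract_chunks(tokens, pos_tags)
print(noun_phrases)
```

Note how the ``B-NP`` on "two" splits "the Asian Games title" from "two years", which would otherwise merge into one phrase.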
142+
143+
144+
145+
**Notebook** :- `ner_pos_tagging_conll <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/ner_pos_tagging/ner_pos_tagging_conll.ipynb>`_
146+
147+
**Transform file** :- `transform_file_conll <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/ner_pos_tagging/transform_file_conll.yml>`_
148+
149+
**Tasks file** :- `tasks_file_conll <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/ner_pos_tagging/tasks_file_conll.yml>`_
150+
151+
Example-6 Query correctness
152+
---------------------------
153+
154+
**Tasks Description**
155+
156+
``querycorrectness`` :- This is modeled as a single sentence classification task that identifies whether or not a query is structurally well-formed.
157+
158+
**Conversational Utility** :- Determining how well structured a query is helps enhance query understanding and improves the reliability of tasks which depend on query structure to extract information.
159+
160+
Query: What places have the oligarchy government ?
161+
162+
Label: well-formed
163+
164+
Query: What day of Diwali in 1980 ?
165+
166+
Label: not well-formed
167+
168+
169+
170+
**Notebook** :- `query_correctness <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/query_correctness/query_correctness.ipynb>`_
171+
172+
**Transform file** :- `transform_file_query_correctness <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/query_correctness/transform_file_query_correctness.yml>`_
173+
174+
**Tasks file** :- `tasks_file_query_correctness <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/query_correctness/tasks_file_query_correctness.yml>`_
175+
176+
177+
Example-7 Query similarity
178+
--------------------------
179+
180+
**Tasks Description**
181+
182+
``Query similarity`` :- This is a sentence pair classification task which determines whether the two queries in a sample are similar in meaning.
183+
184+
**Conversational Utility** :- In a conversational AI context, this task can be seen as determining whether the second sentence is similar to the first. The predicted probability can also be used as a similarity score between the sentences.
185+
186+
187+
Query1: What is the most used word in Malayalam?
188+
189+
Query2: What is meaning of the Malayalam word "thumbatthu"?
190+
191+
Label: not similar
192+
193+
Query1: Which is the best compliment you have ever received?
194+
195+
Query2: What's the best compliment you've got?
196+
197+
Label: similar
198+
199+
200+
**Notebook** :- `query_similarity <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/query_pair_similarity/query_similarity_qqp.ipynb>`_
201+
202+
**Transform file** :- `transform_file_qqp <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/query_pair_similarity/transform_file_qqp.yml>`_
203+
204+
**Tasks file** :- `tasks_file_qqp <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/query_pair_similarity/tasks_file_query_qqp.yml>`_
205+
206+
Example-8 Sentiment Analysis
207+
----------------------------
208+
209+
**Tasks Description**
210+
211+
``sentiment`` :- This is modeled as a single sentence classification task to determine whether a piece of text conveys a positive or negative sentiment.
212+
213+
**Conversational Utility** :- To determine whether a review is positive or negative.
214+
215+
Review: What I enjoyed most in this film was the scenery of Corfu, being Greek I adore my country and I liked the flattering director's point of view. Based on a true story during the years when Greece was struggling to stand on her own two feet through war, Nazis and hardship.
216+
An Italian soldier and a Greek girl fall in love but the times are hard and they have a lot of sacrifices to make. Nicholas Cage looking great in a uniform gives a passionate account of this unfulfilled (in the beginning) love. I adored Christian Bale playing Mandras
217+
the heroine's husband-to-be, he looks very very good as a Greek, his personality matched the one of the Greek patriot! A true fighter in there, or what! One of the movies I would like to buy and keep it in my collection...for ever!
218+
219+
Label: positive
220+
221+
222+
223+
**Notebook** :- `IMDb_sentiment_analysis <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/sentiment_analysis/IMDb_sentiment_analysis.ipynb>`_
224+
225+
**Transform file** :- `transform_file_imdb <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/sentiment_analysis/transform_file_imdb.yml>`_
226+
227+
**Tasks file** :- `tasks_file_imdb <https://github.com/hellohaptik/multi-task-NLP/blob/master/examples/sentiment_analysis/tasks_file_query_imdb.yml>`_
228+

examples/query_pair_similarity/query_similarity_qqp.ipynb

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
1010
"\n",
1111
"**Tasks Description**\n",
1212
"\n",
13-
"``Entailment`` :- This is a sentence pair classification task which determines whether the second sentence in a sample can be inferred from the first.\n",
13+
"``Query similarity`` :- This is a sentence pair classification task which determines whether the two queries in a sample are similar in meaning.\n",
1414
"\n",
1515
"**Conversational Utility** :- In conversational AI context, this task can be seen as determining whether the second sentence is similar to first or not. Additionally, the probability score can also be used as a similarity score between the sentences. \n",
1616
"\n",
