@@ -7,7 +7,7 @@ Example-1 Intent detection, NER, Fragment detection
7
7
8
8
**Tasks Description**
9
9
10
-
``Intent Detection`` :- This can be modeled as a single sentence classification task where an `intent` specifies which class the query belongs to.
10
+
``Intent Detection`` :- This is a single sentence classification task where an `intent` specifies which class the data sample belongs to.
11
11
12
12
``NER`` :- This is a Named Entity Recognition/ Sequence Labelling/ Slot filling task where individual words of the sentence are tagged with an entity label it belongs to. The words which don't belong to any entity label are simply labeled as "O".
13
13
@@ -17,54 +17,212 @@ Example-1 Intent detection, NER, Fragment detection
17
17
18
18
NER helps in extracting values for required entities (e.g. location, date-time) from the query.
19
19
20
-
Fragment detection is a very useful piece in a conversational system, as knowing whether a query/sentence is incomplete can aid in recognising ill-formed, incomplete user queries, so that appropriate expanded options can be provided to the user.
20
+
Fragment detection is a very useful piece in a conversational system, as knowing whether a query/sentence is incomplete can aid in discarding bad queries beforehand.
21
21
22
-
**Data** :- In this example, we are using the `SNIPS <https://snips-nlu.readthedocs.io/en/latest/dataset.html>`_ data for intent and entity detection. For the sake of simplicity, we provide
23
-
the data in simpler form under ``snips_data`` directory taken from `here <https://github.com/LeePleased/StackPropagation-SLU/tree/master/data/snips>`_.
``Entailment`` :- This is a sentence pair classification task . Given two text instances ‘premise’ and ‘Hypothesis’, Textual Entailment Recognition is the task of determining whether the hypothesis is entailed (can be inferred) from the premise or not.
55
+
``Entailment`` :- This is a sentence pair classification task which determines whether the second sentence in a sample can be inferred from the first.
37
56
38
-
**Conversational Utility** :- it can be used for evaluating pairwise similarity between queries , also to determine if a query is out-of-context or adversarial in conversational AI usecases like FAQ question answering.
57
+
**Conversational Utility** :- In a conversational AI context, this task can be seen as determining whether the second sentence is similar to the first or not. Additionally, the probability score can also be used as a similarity score between the sentences.
58
+
59
+
Query1: An old man with a package poses in front of an advertisement.
39
60
40
-
**Data** :- In this example, we are using the `SNLI <https://nlp.stanford.edu/projects/snli>`_ data which is having sentence pairs and labels.
61
+
Query2: A man poses in front of an ad.
62
+
63
+
Label: entailment
64
+
65
+
Query1: An old man with a package poses in front of an advertisement.
``answerability`` :- This is modeled as a sentence pair classification task where the first sentence is a query and second sentence is a context passage.
54
-
The objective of this task is to determine whether the query can be answered from the context passage or not.
85
+
``answerability`` :- This is modeled as a sentence pair classification task where the first sentence is a query and second sentence is a context passage. The objective of this task is to determine whether the query can be answered from the context passage or not.
55
86
56
-
**Conversational Utility** :- This can be a useful component for building a question-answering/ machine comprehension based system.
57
-
In such cases, it becomes very important to determine whether the given query can be answered with given context passage or not before extracting/abstracting an answer from it.
58
-
Performing question-answering for a query which is not answerable from the context, could lead to incorrect answer extraction.
87
+
**Conversational Utility** :- This can be a useful component for building a question-answering/ machine comprehension based system. In such cases, it becomes very important to determine whether the given query can be answered from the given context passage before extracting/abstracting an answer from it. Performing question-answering for a query which is not answerable from the context could lead to incorrect answer extraction.
88
+
89
+
Query: how much money did evander holyfield make
59
90
60
-
**Data** :- In this example, we are using the `MSMARCO_triples <https://msmarco.blob.core.windows.net/msmarcoranking/triples.train.small.tar.gz">`_ data which is having sentence pairs and labels.
61
-
The data contains triplets where the first entry is the query, second one is the context passage from which the query can be answered (positive passage) , while the third entry is a context
62
-
passage from which the query cannot be answered (negative passage).
91
+
Context: Evander Holyfield Net Worth. How much is Evander Holyfield Worth? Evander Holyfield Net Worth: Evander Holyfield is a retired American professional boxer who has a net worth of $500 thousand. A professional boxer, Evander Holyfield has fought at the Heavyweight, Cruiserweight, and Light-Heavyweight Divisions, and won a Bronze medal a the 1984 Olympic Games.
63
92
64
-
Data is transformed into sentence pair classification format, with query-positive context pair labeled as 1 (answerable) and query-negative context pair labeled as 0 (non-answerable)
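As a minimal sketch of the transformation described above, the following Python snippet expands (query, positive passage, negative passage) triplets into labeled sentence pairs. The function name `triples_to_pairs` and the sample passages are illustrative assumptions, not the repository's own code or data.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]      # (query, positive passage, negative passage)
LabeledPair = Tuple[str, str, int]  # (query, passage, label)


def triples_to_pairs(triples: Iterable[Triple]) -> List[LabeledPair]:
    """Turn each triplet into two sentence pairs.

    The query-positive pair is labeled 1 (answerable) and the
    query-negative pair is labeled 0 (non-answerable).
    """
    pairs: List[LabeledPair] = []
    for query, positive, negative in triples:
        pairs.append((query, positive, 1))
        pairs.append((query, negative, 0))
    return pairs


if __name__ == "__main__":
    # Made-up placeholder passages, used only to show the output shape.
    sample = [("example query", "passage that answers it", "unrelated passage")]
    for query, passage, label in triples_to_pairs(sample):
        print(label, query, "|", passage)
```

Each triplet yields exactly two rows, so the resulting dataset is balanced between the two classes by construction.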
``querytype`` :- This is a single sentence classification task to determine what type (category) of answer is expected for the given query. The queries are divided into 5 major classes according to the answer expected for them.
107
+
108
+
**Conversational Utility** :- While returning a response for a query, knowing what kind of answer is expected for the query can help in both curating and cross-verifying an answer according to the type.
109
+
110
+
Query: what's the distance between destin florida and birmingham alabama?
``NER`` :-This is a Named Entity Recognition task where individual words of the sentence are tagged with an entity label it belongs to. The words which don't belong to any entity label are simply labeled as "O".
132
+
133
+
``POS`` :- This is a Part of Speech tagging task. A part of speech is a category of words that have similar grammatical properties. Each word of the sentence is tagged with the part of speech label it belongs to. The words which don't belong to any part of speech label are simply labeled as "O".
134
+
135
+
**Conversational Utility** :- In conversational AI context, determining the syntactic parts of the sentence can help in extracting noun-phrases or important keyphrases from the sentence.
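The token-level labelling used by the NER and POS tasks above can be illustrated with a small sketch: every word gets exactly one tag, and words without a label default to "O". The helper name `tag_tokens`, the example sentence, and the label chosen here are hypothetical and only show the data shape, not the library's actual input format.

```python
from typing import Dict, List


def tag_tokens(tokens: List[str], labels_by_index: Dict[int, str]) -> List[str]:
    """Return one tag per token; tokens without a label get "O"."""
    return [labels_by_index.get(i, "O") for i in range(len(tokens))]


if __name__ == "__main__":
    # Hypothetical example sentence and label.
    tokens = "book a table in london".split()
    tags = tag_tokens(tokens, {4: "location"})
    for token, tag in zip(tokens, tags):
        print(f"{token}\t{tag}")
```

Keeping the tag list the same length as the token list is what makes the task a sequence labelling problem rather than a sentence-level classification.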
``querycorrectness`` :- This is modeled as a single sentence classification task identifying whether or not a query is structurally well formed, which can enhance query understanding.
157
+
158
+
**Conversational Utility** :- Determining how well structured a query is helps enhance query understanding and improves the reliability of tasks which depend on query structure to extract information.
159
+
160
+
Query: What places have the oligarchy government ?
``Query similarity`` :- This is a sentence pair classification task which determines whether the second sentence in a sample can be inferred from the first.
183
+
184
+
**Conversational Utility** :- In a conversational AI context, this task can be seen as determining whether the second sentence is similar to the first or not. Additionally, the probability score can also be used as a similarity score between the sentences.
185
+
186
+
187
+
Query1: What is the most used word in Malayalam?
188
+
189
+
Query2: What is meaning of the Malayalam word ""thumbatthu""?
190
+
191
+
Label: not similar
192
+
193
+
Query1: Which is the best compliment you have ever received?
``sentiment`` :- This is modeled as a single sentence classification task to determine whether a piece of text conveys a positive or negative sentiment.
212
+
213
+
**Conversational Utility** :- To determine whether a review is positive or negative.
214
+
215
+
Review: What I enjoyed most in this film was the scenery of Corfu, being Greek I adore my country and I liked the flattering director's point of view. Based on a true story during the years when Greece was struggling to stand on her own two feet through war, Nazis and hardship.
216
+
An Italian soldier and a Greek girl fall in love but the times are hard and they have a lot of sacrifices to make. Nicholas Cage looking great in a uniform gives a passionate account of this unfulfilled (in the beginning) love. I adored Christian Bale playing Mandras
217
+
the heroine's husband-to-be, he looks very very good as a Greek, his personality matched the one of the Greek patriot! A true fighter in there, or what! One of the movies I would like to buy and keep it in my collection...for ever!
examples/query_pair_similarity/query_similarity_qqp.ipynb
+1 -1 (1 addition & 1 deletion)
@@ -10,7 +10,7 @@
10
10
"\n",
11
11
"**Tasks Description**\n",
12
12
"\n",
13
-
"``Entailment`` :- This is a sentence pair classification task which determines whether the second sentence in a sample can be inferred from the first.\n",
13
+
"``Query similarity`` :- This is a sentence pair classification task which determines whether the second sentence in a sample can be inferred from the first.\n",
14
14
"\n",
15
15
"**Conversational Utility** :- In conversational AI context, this task can be seen as determining whether the second sentence is similar to first or not. Additionally, the probability score can also be used as a similarity score between the sentences. \n",