Import all the CrossFit tasks #7

@dirkgr

CrossFit uses a fairly uniform format for its tasks. We could use it to import a large number of tasks with very little code.

Here is a list of patterns that @ibeltagy found in CrossFit:

classification
- plain input / output: https://github.com/INK-USC/CrossFit/blob/master/tasks/ade_classification.py
- title: .... [SEP] content: ... https://github.com/INK-USC/CrossFit/blob/master/tasks/amazon_polarity.py
- premise: ... [SEP] hypothesis: .... https://github.com/INK-USC/CrossFit/blob/master/tasks/anli.py
- observation1: ... [SEP] observation2: ... [SEP] hypothesis1: .... ..... https://github.com/INK-USC/CrossFit/blob/master/tasks/art.py
- question: .... [SEP] context: ....  https://github.com/INK-USC/CrossFit/blob/master/tasks/boolq.py
- ... [SEP] .... https://github.com/INK-USC/CrossFit/blob/master/tasks/scicite.py
- ... and many more that follow the patterns above with different field names
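
The classification patterns above could be read back into named fields with something like the following. This is only a rough sketch, not CrossFit code: `parse_fields` and its `known_fields` parameter are hypothetical names, and it assumes fields are separated by a literal `[SEP]` and optionally prefixed with `name:`. Unlabeled segments are keyed by position.

```python
import re

# Matches an optional "name: value" prefix at the start of a segment.
_FIELD = re.compile(r"^\s*(\w+):\s*(.*)$", re.DOTALL)


def parse_fields(text, known_fields=None):
    """Split a CrossFit-style input on [SEP] and return {field name: value}.

    known_fields (hypothetical parameter): if given, only these names are
    treated as labels, so free text that happens to start with "Word:" is
    not mistaken for a field.
    """
    fields = {}
    for i, segment in enumerate(text.split("[SEP]")):
        match = _FIELD.match(segment)
        if match and (known_fields is None or match.group(1) in known_fields):
            fields[match.group(1)] = match.group(2).strip()
        else:
            # No recognizable label: fall back to a positional key.
            fields[f"field{i}"] = segment.strip()
    return fields


example = "premise: A man sleeps. [SEP] hypothesis: He is awake."
print(parse_fields(example))
```

Passing `known_fields` per task (for example `{"premise", "hypothesis"}`) is the safer choice for the unlabeled `... [SEP] ....` pattern, where the text itself may contain colons.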

text to text
- summarize: .....
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/gigaword.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/multi_news.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/reddit_tifu.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/samsum.py
- question: ... context: ... 
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/adversarial_qa.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/ropes.py
	- (Most follow this template)
- question: ... [SEP] category: ... 
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/jeopardy.py
	- very few follow this template
- ... [SEP] .... 
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/ade_effect.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/definite_pronoun_resolution.py
- ..<question string>.. [SEP] ..<context string>.. [SEP] ..<choices>... https://github.com/INK-USC/CrossFit/blob/master/tasks/cosmos_qa.py
- <question string>. <choices>. 
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/ai2_arc.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/hellaswag.py
	- these should have been converted to classification
	- (multiple-choice datasets are a huge mess)
- question: ... https://github.com/INK-USC/CrossFit/blob/master/tasks/break.py

sequence tagging: 
- ... [SEP] acronym: .... https://github.com/INK-USC/CrossFit/blob/master/tasks/acronym_identification.py
- <string>
	- input: <string>
	- output: <entity> [SEP]  <entity> .... 
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/limit.py
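
For the second sequence tagging pattern, where the output is a list of entities joined by `[SEP]`, the conversion to and from a Python list might look like this. It is a minimal sketch under the assumption that `[SEP]` never occurs inside an entity; `split_entities` and `join_entities` are hypothetical helper names, not part of CrossFit.

```python
def split_entities(output, sep="[SEP]"):
    """Turn '<entity> [SEP] <entity> ...' into a list, dropping empty pieces."""
    return [part.strip() for part in output.split(sep) if part.strip()]


def join_entities(entities, sep=" [SEP] "):
    """Inverse of split_entities: join entities back into one output string."""
    return sep.join(entities)


print(split_entities("dog [SEP]  cat"))
print(join_entities(["dog", "cat"]))
```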

regression
- review: ... https://github.com/INK-USC/CrossFit/blob/master/tasks/app_reviews.py
- https://github.com/INK-USC/CrossFit/blob/master/tasks/google_wellformed_query.py
- question: .... [SEP] context: ... https://github.com/INK-USC/CrossFit/blob/master/tasks/mocha.py

Other:
- https://github.com/INK-USC/CrossFit/blob/master/tasks/numer_sense.py
