Commit 02ce9fd

Authored and committed by The TensorFlow Datasets Authors
Automated documentation update.
PiperOrigin-RevId: 787922173
1 parent d5d3ab1 commit 02ce9fd

3 files changed (+175, -0 lines)

docs/catalog/_toc.yaml

Lines changed: 21 additions & 0 deletions
@@ -116,6 +116,9 @@ toc:
   title: ai2_arc_with_ir
 - path: /datasets/catalog/arc
   title: arc
+- path: /datasets/catalog/covr
+  status: nightly
+  title: covr
 - path: /datasets/catalog/natural_questions
   title: natural_questions
 - path: /datasets/catalog/openbookqa
@@ -274,6 +277,9 @@ toc:
   title: clic
 - path: /datasets/catalog/coil100
   title: coil100
+- path: /datasets/catalog/covr
+  status: nightly
+  title: covr
 - path: /datasets/catalog/div2k
   title: div2k
 - path: /datasets/catalog/downsampled_imagenet
@@ -684,6 +690,9 @@ toc:
   title: bool_q
 - path: /datasets/catalog/bot_adversarial_dialogue
   title: bot_adversarial_dialogue
+- path: /datasets/catalog/covr
+  status: nightly
+  title: covr
 - path: /datasets/catalog/dices
   title: dices
 - path: /datasets/catalog/dolma
@@ -768,6 +777,9 @@ toc:
 - section:
   - path: /datasets/catalog/anli
     title: anli
+  - path: /datasets/catalog/covr
+    status: nightly
+    title: covr
   - path: /datasets/catalog/dices
     title: dices
   - path: /datasets/catalog/paws_wiki
@@ -788,6 +800,9 @@ toc:
   title: bool_q
 - path: /datasets/catalog/clevr
   title: clevr
+- path: /datasets/catalog/covr
+  status: nightly
+  title: covr
 - path: /datasets/catalog/databricks_dolly
   title: databricks_dolly
 - path: /datasets/catalog/dices
@@ -852,6 +867,9 @@ toc:
   title: coco
 - path: /datasets/catalog/coco_captions
   title: coco_captions
+- path: /datasets/catalog/covr
+  status: nightly
+  title: covr
 - path: /datasets/catalog/flic
   title: flic
 - path: /datasets/catalog/kitti
@@ -1380,6 +1398,9 @@ toc:
   title: corr2cause
 - path: /datasets/catalog/cos_e
   title: cos_e
+- path: /datasets/catalog/covr
+  status: nightly
+  title: covr
 - path: /datasets/catalog/databricks_dolly
   title: databricks_dolly
 - path: /datasets/catalog/definite_pronoun_resolution

docs/catalog/covr.md

Lines changed: 140 additions & 0 deletions
@@ -0,0 +1,140 @@
<div itemscope itemtype="http://schema.org/Dataset">
<div itemscope itemprop="includedInDataCatalog" itemtype="http://schema.org/DataCatalog">
<meta itemprop="name" content="TensorFlow Datasets" />
</div>
<meta itemprop="name" content="covr" />
<meta itemprop="description" content="[COVR](https://covr-dataset.github.io/) dataset with [imSitu](https://github.com/my89/imSitu) and [Visual Genome](https://homes.cs.washington.edu/~ranjay/visualgenome/index.html) images.&#10;&#10;To use this dataset:&#10;&#10;```python&#10;import tensorflow_datasets as tfds&#10;&#10;ds = tfds.load(&#x27;covr&#x27;, split=&#x27;train&#x27;)&#10;for ex in ds.take(4):&#10; print(ex)&#10;```&#10;&#10;See [the guide](https://www.tensorflow.org/datasets/overview) for more&#10;information on [tensorflow_datasets](https://www.tensorflow.org/datasets).&#10;&#10;" />
<meta itemprop="url" content="https://www.tensorflow.org/datasets/catalog/covr" />
<meta itemprop="sameAs" content="https://covr-dataset.github.io/" />
<meta itemprop="citation" content="@inproceedings{bogin-etal-2021-covr,&#10; title = &quot;{COVR}: A Test-Bed for Visually Grounded Compositional Generalization with Real Images&quot;,&#10; author = &quot;Bogin, Ben and&#10; Gupta, Shivanshu and&#10; Gardner, Matt and&#10; Berant, Jonathan&quot;,&#10; editor = &quot;Moens, Marie-Francine and&#10; Huang, Xuanjing and&#10; Specia, Lucia and&#10; Yih, Scott Wen-tau&quot;,&#10; booktitle = &quot;Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing&quot;,&#10; month = nov,&#10; year = &quot;2021&quot;,&#10; address = &quot;Online and Punta Cana, Dominican Republic&quot;,&#10; publisher = &quot;Association for Computational Linguistics&quot;,&#10; url = &quot;https://aclanthology.org/2021.emnlp-main.774/&quot;,&#10; doi = &quot;10.18653/v1/2021.emnlp-main.774&quot;,&#10; pages = &quot;9824--9846&quot;,&#10; abstract = &quot;While interest in models that generalize at test time to new compositions has risen in recent years, benchmarks in the visually-grounded domain have thus far been restricted to synthetic images. In this work, we propose COVR, a new test-bed for visually-grounded compositional generalization with real images. To create COVR, we use real images annotated with scene graphs, and propose an almost fully automatic procedure for generating question-answer pairs along with a set of context images. COVR focuses on questions that require complex reasoning, including higher-order operations such as quantification and aggregation. Due to the automatic generation process, COVR facilitates the creation of compositional splits, where models at test time need to generalize to new concepts and compositions in a zero- or few-shot setting. We construct compositional splits using COVR and demonstrate a myriad of cases where state-of-the-art pre-trained language-and-vision models struggle to compositionally generalize.&quot;&#10;}&#10;&#10;@inproceedings{yatskar2016,&#10; title={Situation Recognition: Visual Semantic Role Labeling for Image Understanding},&#10; author={Yatskar, Mark and Zettlemoyer, Luke and Farhadi, Ali},&#10; booktitle={Conference on Computer Vision and Pattern Recognition},&#10; year={2016}&#10;}&#10;&#10;@article{cite-key,&#10; abstract = {Despite progress in perceptual tasks such as image classification, computers still perform poorly on cognitive tasks such as image description and question answering. Cognition is core to tasks that involve not just recognizing, but reasoning about our visual world. However, models used to tackle the rich content in images for cognitive tasks are still being trained using the same datasets designed for perceptual tasks. To achieve success at cognitive tasks, models need to understand the interactions and relationships between objects in an image. When asked ``What vehicle is the person riding?&#x27;&#x27;, computers will need to identify the objects in an image as well as the relationships riding(man, carriage) and pulling(horse, carriage) to answer correctly that ``the person is riding a horse-drawn carriage.&#x27;&#x27;In this paper, we present the Visual Genome dataset to enable the modeling of such relationships. We collect dense annotations of objects, attributes, and relationships within each image to learn these models. Specifically, our dataset contains over 108K images where each image has an average of {\$}{\$}35{\$}{\$}objects, {\$}{\$}26{\$}{\$}attributes, and {\$}{\$}21{\$}{\$}pairwise relationships between objects. 
We canonicalize the objects, attributes, relationships, and noun phrases in region descriptions and questions answer pairs to WordNet synsets. Together, these annotations represent the densest and largest dataset of image descriptions, objects, attributes, relationships, and question answer pairs.},&#10; author = {Krishna, Ranjay and Zhu, Yuke and Groth, Oliver and Johnson, Justin and Hata, Kenji and Kravitz, Joshua and Chen, Stephanie and Kalantidis, Yannis and Li, Li-Jia and Shamma, David A. and Bernstein, Michael S. and Fei-Fei, Li},&#10; date = {2017/05/01},&#10; date-added = {2025-07-10 08:32:03 -0700},&#10; date-modified = {2025-07-10 08:32:03 -0700},&#10; doi = {10.1007/s11263-016-0981-7},&#10; id = {Krishna2017},&#10; isbn = {1573-1405},&#10; journal = {International Journal of Computer Vision},&#10; number = {1},&#10; pages = {32--73},&#10; title = {Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations},&#10; url = {https://doi.org/10.1007/s11263-016-0981-7},&#10; volume = {123},&#10; year = {2017},&#10; bdsk-url-1 = {https://doi.org/10.1007/s11263-016-0981-7}}" />
</div>

# `covr`

Note: This dataset was added recently and is only available in our
`tfds-nightly` package
<span class="material-icons" title="Available only in the tfds-nightly package">nights_stay</span>.

* **Description**:

[COVR](https://covr-dataset.github.io/) dataset with
[imSitu](https://github.com/my89/imSitu) and
[Visual Genome](https://homes.cs.washington.edu/~ranjay/visualgenome/index.html)
images.

* **Homepage**:
  [https://covr-dataset.github.io/](https://covr-dataset.github.io/)

* **Source code**:
  [`tfds.datasets.covr.Builder`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/datasets/covr/covr_dataset_builder.py)

* **Versions**:

  * **`1.0.0`** (default): Initial release.

* **Download size**: `48.35 GiB`

* **Dataset size**: `173.96 GiB`

* **Auto-cached**
  ([documentation](https://www.tensorflow.org/datasets/performances#auto-caching)):
  No

* **Splits**:

Split          | Examples
:------------- | -------:
`'test'`       | 7,024
`'train'`      | 248,154
`'validation'` | 6,891

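For illustration, a minimal loading sketch (not part of the generated catalog page): it assumes the `tfds-nightly` package is installed and the roughly 174 GiB dataset has already been downloaded and prepared; the `data_dir` value is a hypothetical placeholder.

```python
# Minimal loading sketch -- assumes `tfds-nightly` is installed
# (pip install tfds-nightly) and that the dataset is already prepared locally.
import tensorflow_datasets as tfds

ds = tfds.load(
    'covr',
    split='validation',                # also: 'train', 'test'
    data_dir='~/tensorflow_datasets',  # hypothetical location of the prepared data
    shuffle_files=False,
)

# Peek at a couple of examples.
for ex in ds.take(2):
    print(ex['utterance'].numpy().decode('utf-8'))
```
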
* **Feature structure**:

```python
FeaturesDict({
    'images': Sequence(Image(shape=(None, None, 3), dtype=uint8)),
    'label': Text(shape=(), dtype=string),
    'pattern_name': Text(shape=(), dtype=string),
    'program': Text(shape=(), dtype=string),
    'properties': Sequence(Text(shape=(), dtype=string)),
    'scenes': Sequence(Text(shape=(), dtype=string)),
    'utterance': Text(shape=(), dtype=string),
})
```

* **Feature documentation**:

Feature      | Class           | Shape                 | Dtype  | Description
:----------- | :-------------- | :-------------------- | :----- | :----------
             | FeaturesDict    |                       |        |
images       | Sequence(Image) | (None, None, None, 3) | uint8  |
label        | Text            |                       | string |
pattern_name | Text            |                       | string |
program      | Text            |                       | string |
properties   | Sequence(Text)  | (None,)               | string |
scenes       | Sequence(Text)  | (None,)               | string |
utterance    | Text            |                       | string |

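As a rough sketch of how these features might be read back (illustrative only): since the supervised keys below are `None`, `as_supervised=True` is not available and each element arrives as the full feature dictionary shown above.

```python
# Sketch of consuming COVR examples -- illustrative, not generated catalog content.
import tensorflow_datasets as tfds

ds = tfds.load('covr', split='validation')

for ex in ds.take(1):
    # Scalar text features arrive as string tensors.
    print('utterance   :', ex['utterance'].numpy().decode('utf-8'))
    print('label       :', ex['label'].numpy().decode('utf-8'))
    print('pattern_name:', ex['pattern_name'].numpy().decode('utf-8'))
    # 'scenes' and 'properties' are variable-length sequences of strings;
    # 'images' holds the example's context images.
    print('num scenes  :', int(ex['scenes'].shape[0]))
```
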
* **Supervised keys** (See
  [`as_supervised` doc](https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args)):
  `None`

* **Figure**
  ([tfds.show_examples](https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples)):
  Not supported.

* **Examples**
  ([tfds.as_dataframe](https://www.tensorflow.org/datasets/api_docs/python/tfds/as_dataframe)):
  Missing.

* **Citation**:

```
@inproceedings{bogin-etal-2021-covr,
title = "{COVR}: A Test-Bed for Visually Grounded Compositional Generalization with Real Images",
author = "Bogin, Ben and
Gupta, Shivanshu and
Gardner, Matt and
Berant, Jonathan",
editor = "Moens, Marie-Francine and
Huang, Xuanjing and
Specia, Lucia and
Yih, Scott Wen-tau",
booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2021",
address = "Online and Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.emnlp-main.774/",
doi = "10.18653/v1/2021.emnlp-main.774",
pages = "9824--9846",
abstract = "While interest in models that generalize at test time to new compositions has risen in recent years, benchmarks in the visually-grounded domain have thus far been restricted to synthetic images. In this work, we propose COVR, a new test-bed for visually-grounded compositional generalization with real images. To create COVR, we use real images annotated with scene graphs, and propose an almost fully automatic procedure for generating question-answer pairs along with a set of context images. COVR focuses on questions that require complex reasoning, including higher-order operations such as quantification and aggregation. Due to the automatic generation process, COVR facilitates the creation of compositional splits, where models at test time need to generalize to new concepts and compositions in a zero- or few-shot setting. We construct compositional splits using COVR and demonstrate a myriad of cases where state-of-the-art pre-trained language-and-vision models struggle to compositionally generalize."
}

@inproceedings{yatskar2016,
title={Situation Recognition: Visual Semantic Role Labeling for Image Understanding},
author={Yatskar, Mark and Zettlemoyer, Luke and Farhadi, Ali},
booktitle={Conference on Computer Vision and Pattern Recognition},
year={2016}
}

@article{cite-key,
abstract = {Despite progress in perceptual tasks such as image classification, computers still perform poorly on cognitive tasks such as image description and question answering. Cognition is core to tasks that involve not just recognizing, but reasoning about our visual world. However, models used to tackle the rich content in images for cognitive tasks are still being trained using the same datasets designed for perceptual tasks. To achieve success at cognitive tasks, models need to understand the interactions and relationships between objects in an image. When asked ``What vehicle is the person riding?'', computers will need to identify the objects in an image as well as the relationships riding(man, carriage) and pulling(horse, carriage) to answer correctly that ``the person is riding a horse-drawn carriage.''In this paper, we present the Visual Genome dataset to enable the modeling of such relationships. We collect dense annotations of objects, attributes, and relationships within each image to learn these models. Specifically, our dataset contains over 108K images where each image has an average of {\$}{\$}35{\$}{\$}objects, {\$}{\$}26{\$}{\$}attributes, and {\$}{\$}21{\$}{\$}pairwise relationships between objects. We canonicalize the objects, attributes, relationships, and noun phrases in region descriptions and questions answer pairs to WordNet synsets. Together, these annotations represent the densest and largest dataset of image descriptions, objects, attributes, relationships, and question answer pairs.},
author = {Krishna, Ranjay and Zhu, Yuke and Groth, Oliver and Johnson, Justin and Hata, Kenji and Kravitz, Joshua and Chen, Stephanie and Kalantidis, Yannis and Li, Li-Jia and Shamma, David A. and Bernstein, Michael S. and Fei-Fei, Li},
date = {2017/05/01},
date-added = {2025-07-10 08:32:03 -0700},
date-modified = {2025-07-10 08:32:03 -0700},
doi = {10.1007/s11263-016-0981-7},
id = {Krishna2017},
isbn = {1573-1405},
journal = {International Journal of Computer Vision},
number = {1},
pages = {32--73},
title = {Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations},
url = {https://doi.org/10.1007/s11263-016-0981-7},
volume = {123},
year = {2017},
bdsk-url-1 = {https://doi.org/10.1007/s11263-016-0981-7}}
```

docs/catalog/overview.md

Lines changed: 14 additions & 0 deletions
@@ -99,6 +99,8 @@ for ex in tfds.load('cifar10', split='train'):
 
 * [`ai2_arc_with_ir`](ai2_arc_with_ir.md)
 * [`arc`](arc.md)
+* [`covr`](covr.md)
+  <span class="material-icons" title="Available only in the tfds-nightly package">nights_stay</span>
 * [`natural_questions`](natural_questions.md)
 * [`openbookqa`](openbookqa.md)
 
@@ -210,6 +212,8 @@ for ex in tfds.load('cifar10', split='train'):
 * [`clevr`](clevr.md)
 * [`clic`](clic.md)
 * [`coil100`](coil100.md)
+* [`covr`](covr.md)
+  <span class="material-icons" title="Available only in the tfds-nightly package">nights_stay</span>
 * [`div2k`](div2k.md)
 * [`downsampled_imagenet`](downsampled_imagenet.md)
 * [`dsprites`](dsprites.md)
@@ -439,6 +443,8 @@ for ex in tfds.load('cifar10', split='train'):
 * [`booksum`](booksum.md)
 * [`bool_q`](bool_q.md)
 * [`bot_adversarial_dialogue`](bot_adversarial_dialogue.md)
+* [`covr`](covr.md)
+  <span class="material-icons" title="Available only in the tfds-nightly package">nights_stay</span>
 * [`dices`](dices.md)
 * [`dolma`](dolma.md)
 * [`e2e_cleaned`](e2e_cleaned.md)
@@ -487,6 +493,8 @@ for ex in tfds.load('cifar10', split='train'):
 ### `Natural language inference`
 
 * [`anli`](anli.md)
+* [`covr`](covr.md)
+  <span class="material-icons" title="Available only in the tfds-nightly package">nights_stay</span>
 * [`dices`](dices.md)
 * [`paws_wiki`](paws_wiki.md)
 * [`sci_tail`](sci_tail.md)
@@ -499,6 +507,8 @@ for ex in tfds.load('cifar10', split='train'):
 * [`beir`](beir.md)
 * [`bool_q`](bool_q.md)
 * [`clevr`](clevr.md)
+* [`covr`](covr.md)
+  <span class="material-icons" title="Available only in the tfds-nightly package">nights_stay</span>
 * [`databricks_dolly`](databricks_dolly.md)
 * [`dices`](dices.md)
 * [`imdb_reviews`](imdb_reviews.md)
@@ -535,6 +545,8 @@ for ex in tfds.load('cifar10', split='train'):
 
 * [`coco`](coco.md)
 * [`coco_captions`](coco_captions.md)
+* [`covr`](covr.md)
+  <span class="material-icons" title="Available only in the tfds-nightly package">nights_stay</span>
 * [`flic`](flic.md)
 * [`kitti`](kitti.md)
 * [`lvis`](lvis.md)
@@ -849,6 +861,8 @@ for ex in tfds.load('cifar10', split='train'):
 * [`conll2003`](conll2003.md)
 * [`corr2cause`](corr2cause.md)
 * [`cos_e`](cos_e.md)
+* [`covr`](covr.md)
+  <span class="material-icons" title="Available only in the tfds-nightly package">nights_stay</span>
 * [`databricks_dolly`](databricks_dolly.md)
 * [`definite_pronoun_resolution`](definite_pronoun_resolution.md)
 * [`dices`](dices.md)
