3 changes: 3 additions & 0 deletions README.md
@@ -111,6 +111,8 @@ Currently, it contains the following demos:
* X-CLIP ([paper](https://arxiv.org/abs/2208.02816)):
- performing zero-shot video classification with X-CLIP [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/X-CLIP/Video_text_matching_with_X_CLIP.ipynb)
- zero-shot classifying a YouTube video with X-CLIP [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/X-CLIP/Zero_shot_classify_a_YouTube_video_with_X_CLIP.ipynb)
* Table Transformer ([paper](https://arxiv.org/abs/2110.00061)):
- detecting tables and recognizing table structure in images with Table Transformer [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NielsRogge/Transformers-Tutorials/blob/master/Table%20Transformer/Using_Table_Transformer_for_table_detection_and_table_structure_recognition.ipynb)

... more to come! 🤗

@@ -142,6 +144,7 @@ Btw, I was also the main contributor to add the following algorithms to the libr
- VideoMAE by Multimedia Computing Group, Nanjing University
- X-CLIP by Microsoft Research
- MarkupLM by Microsoft Research
- Table Transformer by Microsoft Research

All of them were an incredible learning experience. I recommend that anyone contribute an AI algorithm to the library!

5 changes: 5 additions & 0 deletions Table Transformer/README.md
@@ -7,3 +7,8 @@ can be done as shown in the notebooks found in [this folder](https://github.com/

The only difference is that the Table Transformer applies a "normalize before" operation, which means that layer normalization is applied before,
rather than after, the MLP and attention sub-layers.
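
To make the "normalize before" ordering concrete, here is a minimal pure-Python sketch that contrasts the two placements of layer normalization around a residual sub-layer. It is an illustration only, not the library's implementation: `toy_sublayer` is a hypothetical stand-in for an attention or MLP sub-layer, and `layer_norm` omits the learned scale and shift.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def toy_sublayer(x):
    """Hypothetical stand-in for an attention or MLP sub-layer."""
    return [2.0 * v + 1.0 for v in x]


def post_norm_block(x, sublayer):
    """Post-norm: add the residual first, then normalize the sum."""
    summed = [a + b for a, b in zip(x, sublayer(x))]
    return layer_norm(summed)


def pre_norm_block(x, sublayer):
    """Pre-norm ("normalize before"): normalize only the sub-layer input."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]


x = [1.0, 2.0, 4.0]
post = post_norm_block(x, toy_sublayer)
pre = pre_norm_block(x, toy_sublayer)
```

In the pre-norm variant the residual path carries `x` through unnormalized, so the block output is not itself normalized; in the post-norm variant the output always has zero mean.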

To automatically parse a table and turn it into a CSV file, check out [this demo](https://huggingface.co/spaces/SalML/TableTransformer2CSV) on Hugging Face Spaces, which combines the Table Transformer with OCR.
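
As a rough idea of how structure recognition and OCR output could be combined into a CSV, the sketch below assigns each OCR word to the row and column whose box contains the word's center. This is an assumption-laden illustration, not how the linked demo works: `table_to_csv` is a hypothetical helper, and the row boxes, column boxes, and words are made-up inputs in `(x0, y0, x1, y1)` pixel format rather than real model or OCR output.

```python
import csv
import io


def center(box):
    """Return the (x, y) center of an (x0, y0, x1, y1) box."""
    x0, y0, x1, y1 = box
    return (x0 + x1) / 2, (y0 + y1) / 2


def find_index(value, spans):
    """Index of the first (start, end) span containing value, else None."""
    for i, (start, end) in enumerate(spans):
        if start <= value <= end:
            return i
    return None


def table_to_csv(row_boxes, col_boxes, words):
    """Place each (text, box) word into a row/column grid and emit CSV text."""
    rows = sorted((b[1], b[3]) for b in row_boxes)  # vertical extents, top to bottom
    cols = sorted((b[0], b[2]) for b in col_boxes)  # horizontal extents, left to right
    grid = [[[] for _ in cols] for _ in rows]
    for text, box in words:
        cx, cy = center(box)
        r, c = find_index(cy, rows), find_index(cx, cols)
        if r is not None and c is not None:  # words outside the table are dropped
            grid[r][c].append((box[0], text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in grid:
        # Within a cell, order words left to right by their x0 coordinate.
        writer.writerow([" ".join(t for _, t in sorted(cell)) for cell in row])
    return out.getvalue()


# Hypothetical 2x2 table: two row boxes, two column boxes, four OCR words.
row_boxes = [(0, 0, 100, 20), (0, 20, 100, 40)]
col_boxes = [(0, 0, 50, 40), (50, 0, 100, 40)]
words = [
    ("Name", (5, 5, 30, 15)),
    ("Score", (55, 5, 90, 15)),
    ("Ada", (5, 25, 25, 35)),
    ("95", (55, 25, 70, 35)),
]
csv_text = table_to_csv(row_boxes, col_boxes, words)
```

A real pipeline would also need to handle spanning cells, multi-line cell text, and overlapping boxes, which this sketch ignores.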


![432d09f05f9178c0929729ae27b2928e](https://user-images.githubusercontent.com/31631107/197332016-de9314bc-2159-44bb-9428-ef07c6a96850.png)