Commit e590b32

Updating supported file types for the Unstructured UI/API (#629)
1 parent 66d7bf3 commit e590b32

2 files changed: +16 -30 lines changed


api-reference/supported-file-types.mdx

Lines changed: 1 addition & 1 deletion
@@ -2,6 +2,6 @@
 title: Supported file types
 ---
 
-import SupportedFileTypes from '/snippets/general-shared-text/supported-file-types.mdx';
+import SupportedFileTypes from '/snippets/general-shared-text/supported-file-types-platform.mdx';
 
 <SupportedFileTypes />
Lines changed: 15 additions & 29 deletions
@@ -1,4 +1,4 @@
-Unstructured supports processing of the following file types:
+The Unstructured user interface (UI) and Unstructured API support processing of the following file types:
 
 By file extension:
 
@@ -8,19 +8,14 @@ By file extension:
 | `.bmp` |
 | `.csv` |
 | `.cwk` |
-| `.dbf` |
-| `.dif` |
+| `.dif`[*](#notes) |
 | `.doc` |
-| `.docm` |
 | `.docx` |
 | `.dot` |
-| `.dotm` |
 | `.eml` |
 | `.epub` |
 | `.et` |
 | `.eth` |
-| `.fods` |
-| `.gif` |
 | `.heic` |
 | `.htm` |
 | `.html` |
@@ -29,66 +24,57 @@ By file extension:
 | `.jpg` |
 | `.md` |
 | `.mcw` |
+| `.msg` |
 | `.mw` |
-| `.odt` |
 | `.org` |
 | `.p7s` |
-| `.pages` |
 | `.pbd` |
 | `.pdf` |
 | `.png` |
 | `.pot` |
-| `.potm` |
 | `.ppt` |
 | `.pptm` |
 | `.pptx` |
 | `.prn` |
 | `.rst` |
 | `.rtf` |
 | `.sdp` |
-| `.sgl` |
 | `.svg` |
 | `.sxg` |
 | `.tiff` |
 | `.txt` |
 | `.tsv` |
-| `.uof` |
-| `.uos1` |
-| `.uos2` |
-| `.web` |
-| `.webp` |
-| `.wk2` |
 | `.xls` |
-| `.xlsb` |
 | `.xlsm` |
 | `.xlsx` |
-| `.xlw` |
 | `.xml` |
 | `.zabw` |
 
 By file type:
 
 | Category | File types |
 | --- | --- |
-| Apple | `.cwk`, `.mcw`, `.pages`
+| Apple | `.cwk`, `.mcw`
 | CSV | `.csv` |
-| Data interchange | `.dif` |
-| dBase | `.dbf` |
-| E-mail | `.eml`, `.p7s` |
+| E-mail | `.eml`, `.msg`, `.p7s` |
 | EPUB | `.epub` |
 | HTML | `.htm`, `.html` |
-| Image | `.bmp`, `.gif`, `.heic`, `.jpeg`, `.jpg`, `.png`, `.prn`, `.svg`, `.tiff`, `.webp` |
+| Image | `.bmp`, `.heic`, `.jpeg`, `.jpg`, `.png`, `.prn`, `.svg`, `.tiff` |
 | Markdown | `.md` |
 | Org Mode | `.org` |
-| Open Office | `.odt`, `.sgl` |
-| Other | `.eth`, `.mw`, `.pbd`, `.sdp`, `.uof`, `.web` |
+| Other | `.dif`[*](#notes), `.eth`, `.mw`, `.pbd`, `.sdp` |
 | PDF | `.pdf` |
 | Plain text | `.txt` |
-| PowerPoint | `.pot`, `.potm`, `.ppt`, `.pptm`, `.pptx` |
+| PowerPoint | `.pot`, `.ppt`, `.pptm`, `.pptx` |
 | reStructured Text | `.rst` |
 | Rich Text | `.rtf` |
-| Spreadsheet | `.et`, `.fods`, `.uos1`, `.uos2`, `.wk2`, `.xls`, `.xlsb`, `.xlsm`, `.xlsx`, `.xlw` |
+| Spreadsheet | `.et`, `.xls`, `.xlsm`, `.xlsx` |
 | StarOffice | `.sxg` |
 | TSV | `.tsv` |
-| Word processing | `.abw`, `.doc`, `.docm`, `.docx`, `.dot`, `.dotm`, `.hwp`, `.zabw` |
+| Word processing | `.abw`, `.doc`, `.docx`, `.dot`, `.hwp`, `.zabw` |
 | XML | `.xml` |
+
+## Notes
+
+* For `.dif`, `\n` characters in `.dif` files are supported, but `\r\n` characters will raise the error
+`UnsupportedFileFormatError: Partitioning is not supported for the FileType.UNK file type`.
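
The `.dif` note added in this commit suggests one possible workaround, sketched here in Python: rewrite `\r\n` line endings to `\n` before submitting the file. This is an assumption, not something the commit documents as a supported fix, and the helper name `normalize_dif_newlines` is hypothetical. It uses only the Python standard library.

```python
from pathlib import Path


def normalize_dif_newlines(src: str, dst: str) -> int:
    """Copy src to dst with every CRLF pair replaced by LF.

    Hypothetical helper (not part of any Unstructured library).
    Returns the number of CRLF pairs that were replaced.
    """
    # Work on raw bytes so no text decoding or newline translation interferes.
    data = Path(src).read_bytes()
    crlf_count = data.count(b"\r\n")
    Path(dst).write_bytes(data.replace(b"\r\n", b"\n"))
    return crlf_count
```

A return value of 0 means the file contained no `\r\n` sequences, so any `FileType.UNK` error for that file would need a different explanation.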
