 |`attached_to_filename`| MSG | The name of the file that the attached file is attached to. |
+|`bcc_recipient`| EML | The related [email](#email) BCC recipient. |
+|`cc_recipient`| EML | The related [email](#email) CC recipient. |
+|`email_message_id`| EML | The related [email](#email) message ID. |
+|`header_footer_type`| Word Doc | The pages that a header or footer applies to in a [Word document](#microsoft-word-files): `primary`, `even_only`, and `first_page`. |
+|`link_urls`| HTML | The URL that is associated with a link in a document. |
+|`link_texts`| HTML | The text that is associated with a link in a document. |
+|`page_name`| XLSX | The related sheet's name in an [Excel file](#microsoft-excel-files). |
+|`page_number`| DOCX, PDF, PPT, XLSX | The related file's page number. |
+|`section`| EPUB | The book section title corresponding to a table of contents. |
+|`sent_from`| EML | The related [email](#email) sender. |
+|`sent_to`| EML | The related [email](#email) recipient. |
+|`signature`| EML | The related [email](#email) signature. |
+|`subject`| EML | The related [email](#email) subject. |

 Notes on additional metadata by document type:

 #### Email

-Emails will include `sent_from`, `sent_to`, and `subject` metadata. `sent_from` is a list of strings because
-the [RFC 822](https://www.rfc-editor.org/rfc/rfc822) spec for emails allows for multiple sent from email addresses.
+For emails, metadata will contain the following fields, where available:
+
+- `bcc_recipient`
+- `cc_recipient`
+- `email_message_id`
+- `sent_from`
+- `sent_to`
+- `signature`
+- `subject`
+
+`sent_from` is a list of strings because the [RFC 822](https://www.rfc-editor.org/rfc/rfc822) spec for emails allows for multiple sent from email addresses.
-|`page_number`| DOCX, PDF, PPT, XLSX | The related file's page number. |
-|`page_name`| XLSX | The related sheet's name in an [Excel file](#microsoft-excel-files). |
-|`sent_from`| EML | The related [email](#email) sender. |
-|`sent_to`| EML | The related [email](#email) recipient. |
-|`subject`| EML | The related [email](#email) subject. |
 |`attached_to_filename`| MSG | The name of the file that the attached file is attached to. |
+|`bcc_recipient`| EML | The related [email](#email) BCC recipient. |
+|`cc_recipient`| EML | The related [email](#email) CC recipient. |
+|`email_message_id`| EML | The related [email](#email) message ID. |
 |`header_footer_type`| Word Doc | The pages that a header or footer applies to in a [Word document](#microsoft-word-files): `primary`, `even_only`, and `first_page`. |
 |`link_urls`| HTML | The URL that is associated with a link in a document. |
 |`link_texts`| HTML | The text that is associated with a link in a document. |
+|`page_name`| XLSX | The related sheet's name in an [Excel file](#microsoft-excel-files). |
+|`page_number`| DOCX, PDF, PPT, XLSX | The related file's page number. |
 |`section`| EPUB | The book section title corresponding to a table of contents. |
+|`sent_from`| EML | The related [email](#email) sender. |
+|`sent_to`| EML | The related [email](#email) recipient. |
+|`signature`| EML | The related [email](#email) signature. |
+|`subject`| EML | The related [email](#email) subject. |

 Here are some notes on additional metadata fields by file type:

 #### Email

-Emails will include `sent_from`, `sent_to`, and `subject` metadata. `sent_from` is a list of strings because
-the [RFC 822](https://www.rfc-editor.org/rfc/rfc822) spec for emails allows for multiple sent from email addresses.
+For emails, metadata will contain the following fields, where available:
+
+- `bcc_recipient`
+- `cc_recipient`
+- `email_message_id`
+- `sent_from`
+- `sent_to`
+- `signature`
+- `subject`
+
+`sent_from` is a list of strings because the [RFC 822](https://www.rfc-editor.org/rfc/rfc822) spec for emails allows for multiple sent from email addresses.
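
The note about `sent_from` being a list can be illustrated with a short sketch. It uses only Python's standard `email` package and is not this project's implementation; the helper name `sender_list` and the sample message are hypothetical, made up for illustration.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical sample message: the From header carries two mailboxes.
RAW_EMAIL = """\
From: Alice <alice@example.com>, Bob <bob@example.com>
To: carol@example.com
Subject: Quarterly report

Body text.
"""


def sender_list(raw: str) -> list[str]:
    """Return every address found in the From header as a list of strings."""
    msg = message_from_string(raw)
    # get_all returns every From header value, or the default if none exist.
    from_headers = msg.get_all("From", [])
    # getaddresses splits comma-separated mailboxes into (name, address) pairs.
    return [addr for _name, addr in getaddresses(from_headers)]


senders = sender_list(RAW_EMAIL)
print(senders)  # ['alice@example.com', 'bob@example.com']
```

Because a single From header may name several mailboxes, a list is the natural shape for the field even when most messages have exactly one sender.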