README.md: 28 additions and 39 deletions
@@ -2,13 +2,7 @@

 RTF Parser. So far it can only de-encapsulate HTML content from an RTF, but it properly parses the RTF structure and allows you to write your own custom RTF renderers. The HTML de-encapsulator provided with `rtfparse` is just one such custom renderer which liberates the HTML content from its RTF encapsulation and saves it in a given html file.

-# Dependencies
-
-```
-argcomplete
-extract-msg
-compressed_rtf
-```
+rtfparse can also decompress RTF from MS Outlook `.msg` files and parse that.

 # Installation

@@ -18,65 +12,60 @@ Install rtfparse from your local repository with pip:

 Installation creates an executable file `rtfparse` in your python scripts folder which should be in your `$PATH`.

-# First Run
+# Usage From Command Line

-When you run `rtfparse` for the first time it will start a configuration wizard which will guide you through the process of creating a default configuration file and specifying the location of its folders. (These folders serve as locations for saving extracted rtf or html files.)
+Use the `rtfparse` executable from the command line. See `rtfparse --help` for the command reference.

-In the configuration wizard you can press `A` for care-free automatic configuration, which would look something like this:
+rtfparse writes its logs to these files in `~/rtfparse/`:

 ```
-$ rtfparse
-Config file missing, creating new default config file
◊ email_rtf (C:\Users\nagidal\rtfparse\email_rtf) does not exist!
+## Example: De-encapsulate HTML from MS Outlook email file

-(A) Automatically configure this and all remaining rtfparse settings
-(C) Create this path automatically
-(M) Manually input correct path to use or to create
-(Q) Quit and edit `email_rtf` in rtfparse_configuration.ini
+This relies on [extract_msg](https://github.com/TeamMsgExtractor/msg-extractor) and [compressed_rtf](https://github.com/delimitry/compressed_rtf), which rtfparse uses internally:

-Created directory C:\Users\nagidal\rtfparse
-Created directory C:\Users\nagidal\rtfparse\email_rtf
`rtfparse` also creates the folder `.rtfparse` (beginning with a dot) in your home directory where it saves its default configuration and its log files.
+## Example: Only decompress the RTF from MS Outlook email file
Use the `rtfparse` executable from the command line. For example, if you want to de-encapsulate the HTML from an RTF file, do it like this:
+## Example: De-encapsulate HTML from MS Outlook email file and save (and later embed) the attachments

-rtfparse -f "path/to/rtf_file.rtf" -d
+When extracting the RTF from the `.msg` file, you can save the attachments (which include images embedded in the email text) in a directory:

-Or you can de-encapsulate the HTML from an MS Outlook message, thanks to [extract_msg](https://github.com/TeamMsgExtractor/msg-extractor) and [compressed_rtf](https://github.com/delimitry/compressed_rtf):
In `rtfparse` version 1.x you will be able to embed these images in the de-encapsulated HTML. This functionality will be provided by the package [embedimg](https://github.com/fleetingbytes/embedimg).

-The resulting html file will be saved to the `html` folder you set in the `rtfparse_configuration.ini`. Command reference is in `rtfparse --help`.
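
For readers unfamiliar with the term used throughout this README, "encapsulated HTML" is an RTF document, as produced by MS Outlook, whose header carries the `\fromhtml1` control word. The sketch below only checks for that marker. It is an illustration of the concept and not how `rtfparse` itself works; the function name and the `probe` parameter are hypothetical.

```python
import re

# Assumption: Outlook-style encapsulation marks the RTF header with the
# control word \fromhtml1. The lookahead avoids matching e.g. \fromhtml10.
_FROMHTML = re.compile(rb"\\fromhtml1(?![0-9])")


def looks_like_encapsulated_html(rtf: bytes, probe: int = 1024) -> bool:
    """Return True if the first `probe` bytes contain the \\fromhtml1 control word.

    Hypothetical helper for illustration only; not part of rtfparse.
    """
    return _FROMHTML.search(rtf[:probe]) is not None


# Two tiny hand-written RTF snippets: one with the marker, one without.
encapsulated = rb"{\rtf1\ansi\ansicpg1252\fromhtml1 \deff0{\fonttbl}}"
plain = rb"{\rtf1\ansi\deff0 Hello}"

print(looks_like_encapsulated_html(encapsulated))
print(looks_like_encapsulated_html(plain))
```

A check like this could serve as a cheap guard before attempting de-encapsulation, since an RTF file without the marker contains no HTML to recover.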