CHANGELOG.md (12 additions & 0 deletions)
@@ -2,6 +2,18 @@
 
 <!-- towncrier release notes start -->
 
+## 0.9.0 (2024-03-11)
+
+
+### Bugfixes
+
+- Recognize control words where the parameter's digit sequence is delimited by any character other than an ASCII digit [#18](https://github.com/fleetingbytes/rtfparse/issues/18)
+
+
+### Development Details
+
+- Renamed a few things, improved readme [#17](https://github.com/fleetingbytes/rtfparse/issues/17)
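
The 0.9.0 bugfix entry concerns where a control word's numeric parameter ends. As a rough illustration (not rtfparse's actual implementation; the regular expression and the `read_control_word` helper are assumptions made for this sketch), a control word can be read as a backslash, a run of ASCII letters, an optional signed digit sequence, and at most one trailing space. Any character other than an ASCII digit ends the parameter:

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for this sketch: backslash, ASCII letters,
# optional signed decimal parameter, and at most one trailing space
# that is consumed as part of the control word.
CONTROL_WORD = re.compile(rb"\\([A-Za-z]+)(-?[0-9]+)?[ ]?")


def read_control_word(data: bytes, pos: int = 0) -> Optional[Tuple[str, Optional[int], int]]:
    """Return (name, parameter, end_position), or None if no control word starts at pos."""
    match = CONTROL_WORD.match(data, pos)
    if match is None:
        return None
    name = match.group(1).decode("ascii")
    # The digit run stops at the first byte that is not an ASCII digit.
    parameter = int(match.group(2)) if match.group(2) is not None else None
    return name, parameter, match.end()
```

For instance, `read_control_word(b"\\fs24x")` yields the name `fs` and the parameter `24`, and leaves the `x` unconsumed because it is not a digit.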
-RTF Parser. So far it can only de-encapsulate HTML content from an RTF, but it properly parses the RTF structure and allows you to write your own custom RTF renderers. The HTML de-encapsulator provided with `rtfparse` is just one such custom renderer which liberates the HTML content from its RTF encapsulation and saves it in a given html file.
+Parses Microsoft's Rich Text Format (RTF) documents. It creates an in-memory object which represents the tree structure of the RTF document. This object can in turn be rendered by using one of the renderers.
+So far, rtfparse provides only one renderer (`Decapsulate_HTML`), which liberates the HTML code encapsulated in RTF. This will come in handy, for example, if you ever need to extract the HTML from an HTML-formatted email message saved by Microsoft Outlook.
+
+MS Outlook also tends to use RTF compression, so the CLI of rtfparse can optionally decompress it, too.
+
+You can of course write your own renderers of parsed RTF documents and consider contributing them to this project.
 
-rtfparse can also decompress RTF from MS Outlook `.msg` files and parse that.
 
 # Installation
 
 Install rtfparse from your local repository with pip:
 
 pip install rtfparse
 
-Installation creates an executable file `rtfparse` in your python scripts folder which should be in your `$PATH`.
+Installation creates an executable file `rtfparse` in your python scripts folder which should be in your `$PATH`.
 
 # Usage From Command Line
 
@@ -24,49 +28,48 @@ rtfparse.info.log
 rtfparse.errors.log
 ```
 
-## Example: De-encapsulate HTML from an uncompressed RTF file
+## Example: Decapsulate HTML from an uncompressed RTF file
-## Example: De-encapsulate HTML from MS Outlook email file
+## Example: Decapsulate HTML from MS Outlook email file
 
-Thanks to [extract_msg](https://github.com/TeamMsgExtractor/msg-extractor) and [compressed_rtf](https://github.com/delimitry/compressed_rtf), rtfparse internally uses them:
+For this, the CLI of rtfparse uses [extract_msg](https://github.com/TeamMsgExtractor/msg-extractor) and [compressed_rtf](https://github.com/delimitry/compressed_rtf).
-In `rtfparse` version 1.x you will be able to embed these images in the de-encapsulated HTML. This functionality will be provided by the package [embedimg](https://github.com/fleetingbytes/embedimg).
+In `rtfparse` version 1.x you will be able to embed these images in the decapsulated HTML. This functionality will be provided by the package [embedimg](https://github.com/fleetingbytes/embedimg).
 with open(target_path, mode="w", encoding="utf-8") as html_file:
     renderer.render(parsed, html_file)
@@ -76,6 +79,5 @@ with open(target_path, mode="w", encoding="utf-8") as html_file:
 
 If you find a working official Microsoft link to the RTF specification and add it here, you'll be remembered fondly.
 
-* [Swissmains Link to RTF Spec 1.9.1](https://manuals.swissmains.com/pages/viewpage.action?pageId=1376332&preview=%2F1376332%2F10620104%2FWord2007RTFSpec9.pdf)
 * [Webarchive Link to RTF Spec 1.9.1](https://web.archive.org/web/20190708132914/http://www.kleinlercher.at/tools/Windows_Protocols/Word2007RTFSpec9.pdf)
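
The README text in this diff describes the parser's result as an in-memory tree that a renderer turns into output through `renderer.render(parsed, html_file)`. A minimal sketch of that pattern follows. The node classes (`Group`, `ControlWord`, `PlainText`) and `TextOnlyRenderer` are hypothetical stand-ins invented for this illustration and are not rtfparse's real classes; only the `render(parsed, file)` call shape is taken from the README.

```python
import io
from dataclasses import dataclass, field
from typing import List, Optional, TextIO


# Hypothetical node types standing in for a parsed RTF tree.
@dataclass
class ControlWord:
    name: str
    parameter: Optional[int] = None


@dataclass
class PlainText:
    text: str


@dataclass
class Group:
    children: List[object] = field(default_factory=list)


class TextOnlyRenderer:
    """Hypothetical custom renderer: writes plain text, maps \\par to a newline."""

    def render(self, node: object, file: TextIO) -> None:
        if isinstance(node, PlainText):
            file.write(node.text)
        elif isinstance(node, ControlWord):
            # Every control word except the paragraph break is ignored here.
            if node.name == "par":
                file.write("\n")
        elif isinstance(node, Group):
            # Depth-first walk keeps the document order of the children.
            for child in node.children:
                self.render(child, file)


def render_to_string(tree: Group) -> str:
    """Convenience wrapper that renders into an in-memory text buffer."""
    buffer = io.StringIO()
    TextOnlyRenderer().render(tree, buffer)
    return buffer.getvalue()
```

Because the renderer only needs a writable text file object, the same class could write to a file opened with `open(target_path, mode="w", encoding="utf-8")`, as in the README snippet, or to an `io.StringIO` buffer in tests.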