README.md
Lines changed: 4 additions & 2 deletions
@@ -12,7 +12,9 @@ via Text Density (CETD) algorithm described in the paper by
 
 ## What Problem Does This Solve?
 
-Web pages often contain a lot of peripheral content like navigation menus, advertisements, footers, and sidebars. This makes it challenging to extract just the main content programmatically. This library helps solve this problem by:
+Web pages often contain a lot of peripheral content like navigation menus,
+advertisements, footers, and sidebars. This makes it challenging to extract just
+the main content programmatically. This library helps solve this problem by:
 
 - Analyzing the text density patterns in HTML documents
 - Identifying content-rich sections versus navigational/peripheral elements
@@ -47,7 +49,7 @@ This ensures accurate content extraction from web pages in any language, with pr
 
 ## Usage
 
-Due to "LazyLock" MSRV is 1.80
+MSRV is 1.85 due to 2024 edition. Living on the edge!
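
For context on the two version numbers in this hunk: `std::sync::LazyLock` was stabilized in Rust 1.80, and the 2024 edition in Rust 1.85, so either one sets a floor on the compiler version. The sketch below shows the kind of `LazyLock` usage that would impose the 1.80 floor. It is illustrative only: the static `PERIPHERAL_TAGS` and the function `is_peripheral` are hypothetical names and are not taken from this crate.

```rust
use std::collections::HashSet;
use std::sync::LazyLock;

// Hypothetical example, not this crate's code: a set of tag names that is
// built once, on first access, and then shared for the life of the program.
// `LazyLock` requires Rust 1.80 or newer.
static PERIPHERAL_TAGS: LazyLock<HashSet<&'static str>> =
    LazyLock::new(|| HashSet::from(["nav", "footer", "aside"]));

// Returns true if `tag` is in the lazily initialized set above.
pub fn is_peripheral(tag: &str) -> bool {
    PERIPHERAL_TAGS.contains(tag)
}
```

Calling `is_peripheral("nav")` triggers the one-time initialization and every later call reuses the same set. A crate that builds on the 2024 edition needs Rust 1.85 regardless of whether it uses `LazyLock`, which matches the new wording in the added line.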