Commit 8df957d

Add CHANGELOG.md entry for Browser processor
1 parent 1aa6b82 commit 8df957d

1 file changed: +11 -4 lines changed

CHANGELOG.md

Lines changed: 11 additions & 4 deletions
@@ -6,9 +6,15 @@
 
 #### New features
 
+- **Browser processor:** Loads fetched pages in a local browser (Firefox/ChromeDriver), records all browser requests,
+and runs pluggable behaviors (e.g. scrolling, link extraction). [#653](https://github.com/internetarchive/heritrix3/pull/653)
+- Uses the [WebDriver BiDi protocol](https://www.w3.org/TR/webdriver-bidi/) for browser automation.
+- The recording proxy is built on Jetty's ProxyHandler and the FetchHTTP2 module.
+- **Status:** Working for small crawls but needs more robust error handling (browser crashes, resource limits).
+
 - **Basic web auth:** You can now switch the web interface from Digest authentication to Basic authentication
 with the `--web-auth basic` command-line option. This is useful when running Heritrix behind a reverse proxy that
-adds external authentication.
+adds external authentication. [#654](https://github.com/internetarchive/heritrix3/pull/654)
 
 - **Robots.txt wildcards:** The `*` and `$` wildcard rules from RFC 9309 are now supported.
 [#656](https://github.com/internetarchive/heritrix3/pull/656)
@@ -17,24 +23,25 @@
 
 - **Code editor:** The configuration editor and script console were upgraded to CodeMirror 6. This resolves some browser
 incompatibilities, allowing CodeMirror’s own find function to be re-enabled for reliable text search of content far
-outside the viewport.
+outside the viewport. [#651](https://github.com/internetarchive/heritrix3/pull/651)
 
 #### Removals
 
 - **Removed Apache HttpClient 3**: If you have custom Heritrix modules you may need to update the following
-class references in your code:
+class references in your code:
 
 | Removed | Replacement |
 |-----------------------------------------------------------|--------------------------------------|
 | `org.apache.commons.httpclient.URIException` | `org.archive.url.URIException` |
 | `org.apache.commons.httpclient.Header` | `org.archive.format.http.HttpHeader` |
 
 Note that Apache HttpClient 4 (`org.apache.http`) was not removed.
+[#652](https://github.com/internetarchive/heritrix3/pull/652)
 
 #### Dependency Upgrades
 
 - **codemirror**: 2.23 → 6
-- **easmock**: 5.5.0 → removed
+- **easymock**: 5.5.0 → removed
 - **junit**: 5.12.2 → 5.13.1
 - **spring**: 6.2.6 → 6.2.7
 - **webarchive-commons**: 1.3.0 → 2.0.1
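
For readers unfamiliar with the robots.txt wildcard rules mentioned in the changelog entry above, the following is a minimal sketch of how RFC 9309 path patterns behave: `*` matches any run of characters, and a trailing `$` anchors the pattern to the end of the path. It is not Heritrix's implementation; the class name `RobotsWildcardSketch` and its methods are hypothetical, and the sketch ignores percent-encoding normalization and longest-match precedence between rules.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of RFC 9309 wildcard semantics; not taken from Heritrix.
final class RobotsWildcardSketch {

    /** Translates a robots.txt path pattern into an equivalent regular expression. */
    static Pattern compile(String rule) {
        // A trailing '$' means the pattern must match up to the end of the path.
        boolean anchoredAtEnd = rule.endsWith("$");
        String body = anchoredAtEnd ? rule.substring(0, rule.length() - 1) : rule;

        // Split on '*' (keeping empty pieces) and quote each literal piece,
        // so characters such as '.' or '?' are matched literally.
        String[] pieces = body.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < pieces.length; i++) {
            if (i > 0) {
                regex.append(".*"); // each '*' matches any sequence of characters
            }
            if (!pieces[i].isEmpty()) {
                regex.append(Pattern.quote(pieces[i]));
            }
        }
        if (!anchoredAtEnd) {
            regex.append(".*"); // without '$', a rule is a prefix match
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns true if the URL path (including any query string) is covered by the rule. */
    static boolean matches(String rule, String path) {
        return compile(rule).matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("/private/*", "/private/a/b"));  // true
        System.out.println(matches("/*.php$", "/index.php"));       // true
        System.out.println(matches("/*.php$", "/index.php?x=1"));   // false
        System.out.println(matches("/fish", "/fishing"));           // true
    }
}
```

Translating to a regular expression keeps the sketch short; a production matcher would more likely scan the pattern directly and would also apply the RFC's rule that the most specific (longest) matching rule wins.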