Have you reconsidered adding WARC support? #1418
Replies: 3 comments
The fundamental problem is that SingleFile does not inspect network exchanges, whereas the WARC format was designed to do just that. That's why I haven't delved into the subject.
I still think there is a pathway to supporting the WARC format, and a potential benefit to doing so. The entire bundled single file can be treated as the representation of the primary URL it was captured from and placed in a single WARC record. The WARC record would allow adding more metadata in its headers, and the WARC file could then be ingested into archival playback systems to serve these captures alongside other, traditional WARCs.
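To make the single-record idea above concrete, here is a minimal sketch in Python using only the standard library. It is not anything SingleFile implements. The function name `build_warc_resource_record`, the choice of the `resource` record type (used here because no HTTP exchange was captured), and the example URL are all assumptions made for illustration.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_warc_resource_record(
    target_uri: str,
    html: bytes,
    captured_at: Optional[datetime] = None,
) -> bytes:
    """Wrap one saved HTML page in a single WARC record (hypothetical helper).

    Assumption: a 'resource' record is used, since there are no HTTP
    request/response headers to store alongside the page.
    """
    captured_at = captured_at or datetime.now(timezone.utc)
    # WARC-Date is an ISO 8601 timestamp in UTC.
    warc_date = captured_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    header_lines = [
        "WARC/1.1",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {warc_date}",
        f"WARC-Target-URI: {target_uri}",
        "Content-Type: text/html",
        f"Content-Length: {len(html)}",
    ]
    # Header lines end with CRLF, a blank line separates them from the
    # block, and two CRLFs terminate the record.
    head = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    return head + html + b"\r\n\r\n"


if __name__ == "__main__":
    record = build_warc_resource_record("https://example.com/", b"<html></html>")
    print(record.decode("utf-8"))
```

Extra metadata would go in additional header lines; which fields a given playback system expects is something to check against that system, not something this sketch settles.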
Why use WARC over HTML? Sounds like you sacrifice a good amount of compatibility (HTML can be opened on anything with a browser) for very modest space savings (since it seems WARC contents are individually compressed)? It also seems you lose the faithfulness aspect of SingleFile with WARC. I can't recall ever viewing a Wayback Machine-archived copy of a page that wasn't garbled in some way.
I saw the issue for it posted all the way back in 2019, and I think it's a really good time to look at supporting the WARC format.
.warc and .warc.gz) can easily be concatenated, with Unix cat on the command line, for instance.
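As an illustration of the concatenation point, here is a small Python sketch that does the equivalent of `cat a.warc.gz b.warc.gz > merged.warc.gz`. It relies on the fact that a sequence of gzip members is itself a valid gzip stream. The helper names `concat_files` and `demo`, the file names, and the placeholder record contents are hypothetical; the payloads are stand-ins, not real WARC records.

```python
import gzip
import os
import shutil
import tempfile


def concat_files(paths, out_path):
    """Byte-for-byte concatenation, like Unix `cat paths... > out_path`."""
    with open(out_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def demo() -> bytes:
    """Write two gzip files, concatenate them, and read the result back."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "a.warc.gz")
        second = os.path.join(tmp, "b.warc.gz")
        merged = os.path.join(tmp, "merged.warc.gz")
        # Placeholder payloads standing in for real WARC records.
        with open(first, "wb") as f:
            f.write(gzip.compress(b"record-1\r\n\r\n"))
        with open(second, "wb") as f:
            f.write(gzip.compress(b"record-2\r\n\r\n"))
        concat_files([first, second], merged)
        # gzip.open reads through every member of the merged stream.
        with gzip.open(merged, "rb") as f:
            return f.read()


if __name__ == "__main__":
    print(demo())
```

The same `concat_files` helper works for uncompressed `.warc` files, since records there are simply written one after another.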