Commit 500a9f5

Make EncodingTester usable in testing parsed state
This change updates EncodingTester so that it can test results in cases where the expected character encoding cannot be determined from the first 1024 bytes of the input stream alone. Without this change, EncodingTester is only useful for testing the output of the meta prescan.

This change also allows EncodingTester to be given a directory name rather than a list of files (or a pathname with a shell wildcard). When given a directory name, it recurses into the directory looking for *.dat files and runs the tests from those files. Without that change, we can’t easily run EncodingTester from AntRun in Maven, because we can’t use shell wildcards in the “arg” value for the Ant “java” task, and any list of files we otherwise construct within Maven ends up in the java arg value as a single string (a single argument), including the spaces between filenames.
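The directory handling described above could look roughly like the following. This is a minimal sketch, not the code from this commit: the class name `DatFileCollector` and the method `collectDatFiles` are hypothetical, and it assumes that a non-directory argument is treated as a single test file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DatFileCollector {

    // Hypothetical helper, not part of EncodingTester: if the argument is a
    // directory, walk it recursively and return every regular file whose
    // name ends in ".dat"; otherwise return the argument itself.
    public static List<Path> collectDatFiles(Path arg) throws IOException {
        if (!Files.isDirectory(arg)) {
            return Collections.singletonList(arg);
        }
        try (Stream<Path> paths = Files.walk(arg)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".dat"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Each command-line argument may be a file or a directory.
        for (String arg : args) {
            for (Path p : collectDatFiles(Paths.get(arg))) {
                System.out.println(p);
            }
        }
    }
}
```

With this shape, a build tool only needs to pass one directory path as a single argument, which avoids both shell wildcards and space-separated file lists.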
1 parent 8d31a0b commit 500a9f5

1 file changed, +3 -1 lines changed

test-src/nu/validator/htmlparser/test/EncodingTester.java

Lines changed: 3 additions & 1 deletion
@@ -38,6 +38,8 @@ public class EncodingTester {
 
     private static int exitStatus = 0;
 
+    protected static int SNIFFING_LIMIT = 16384;
+
     private final InputStream aggregateStream;
 
     private final StringBuilder builder = new StringBuilder();
@@ -61,7 +63,7 @@ private boolean runTest() throws IOException, SAXException {
         }
         UntilHashInputStream stream = new UntilHashInputStream(aggregateStream);
         HtmlInputStreamReader reader = new HtmlInputStreamReader(stream, null,
-                null, null, Heuristics.NONE);
+                null, null, Heuristics.NONE, SNIFFING_LIMIT);
         Charset charset = reader.getCharset();
         stream.close();
         if (skipLabel()) {
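
To illustrate why the larger sniffing limit matters, here is a small self-contained sketch. It does not use HtmlInputStreamReader and does not implement the real meta prescan; `SniffingLimitDemo` and its methods are hypothetical, and the "scan" is a plain substring search over the first `limit` bytes. It only shows that an encoding declaration placed after the first 1024 bytes is invisible to a 1024-byte scan but visible to a 16384-byte one.

```java
import java.nio.charset.StandardCharsets;

public class SniffingLimitDemo {

    // Simplified stand-in for a prescan (an assumption, not the parser's
    // algorithm): reports whether the ASCII marker "charset=" lies entirely
    // within the first `limit` bytes of `input`.
    public static boolean declarationWithinLimit(byte[] input, int limit) {
        int end = Math.min(input.length, limit);
        String head = new String(input, 0, end, StandardCharsets.ISO_8859_1);
        return head.contains("charset=");
    }

    // Builds a document whose <meta charset> follows `padding` bytes of
    // comment filler, pushing the declaration away from the start.
    public static byte[] documentWithLateMeta(int padding) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE html><!--");
        for (int i = 0; i < padding; i++) {
            sb.append('x');
        }
        sb.append("--><meta charset=\"windows-1251\">");
        return sb.toString().getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] doc = documentWithLateMeta(2000);
        System.out.println(declarationWithinLimit(doc, 1024));  // prints "false"
        System.out.println(declarationWithinLimit(doc, 16384)); // prints "true"
    }
}
```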
