Commit 64412c0

Update READMEs to have new formatHTML documentation, and to document AdvancedHTMLMiniFormatter, and expand formatting section a bit
1 parent 4afefa1 commit 64412c0

File tree

2 files changed

+75
-8
lines changed

README.md

Lines changed: 34 additions & 4 deletions
@@ -384,6 +384,8 @@ Each of the get\* functions above takes an additional "useIndex" function, which
384384
AdvancedHTMLFormatter and formatHTML
385385
------------------------------------
386386

387+
**AdvancedHTMLFormatter**
388+
387389
The AdvancedHTMLFormatter formats HTML into a pretty layout. It can handle elements like pre, core, script, style, etc to keep their contents preserved, but does not understand CSS rules.
388390

389391
The methods are:
@@ -397,18 +399,46 @@ The methods are:
397399
getRoot - Gets the "root" node (on a valid document this should be <html>). For arbitrary HTML, you should use getRootNodes, as there may be several nodes at the same outermost level
398400

399401

402+
You can access this same formatting off an AdvancedHTMLParser.AdvancedHTMLParser (or IndexedAdvancedHTMLParser) by calling .getFormattedHTML()
403+
404+
405+
**AdvancedHTMLMiniFormatter**
406+
407+
The AdvancedHTMLMiniFormatter will strip all non-functional whitespace (that is, whitespace which neither adds a visible space to the document nor is required for xhtml) and provide no indentation.
408+
409+
Use this when pretty-printing doesn't matter and you'd like to save space.
410+
411+
412+
You can access this same formatting off an AdvancedHTMLParser.AdvancedHTMLParser (or IndexedAdvancedHTMLParser) by calling .getMiniHTML()
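
To make "non-functional whitespace" concrete, here is a minimal sketch built only on the standard library's html.parser. It is not the AdvancedHTMLMiniFormatter implementation; the names MiniSketch, PRESERVE and mini_html are hypothetical, and the rules are deliberately simplified: whitespace-only text between tags is dropped (a real formatter must be more careful around inline elements), other whitespace runs collapse to one space, and the contents of pre, script, style and textarea are left untouched.

```python
import re
from html.parser import HTMLParser

# Elements whose text must be kept byte-for-byte (assumed list for this sketch).
PRESERVE = {"pre", "script", "style", "textarea"}


class MiniSketch(HTMLParser):
    """Illustrative whitespace stripper; NOT the library's formatter."""

    def __init__(self):
        # Keep entities such as &amp; unconverted so they can be re-emitted as-is.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.preserve_depth = 0

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())
        if tag in PRESERVE:
            self.preserve_depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tag such as <br/>: emit verbatim, no depth change.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in PRESERVE and self.preserve_depth:
            self.preserve_depth -= 1
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if self.preserve_depth:
            self.out.append(data)                       # keep verbatim
        elif data.strip():
            self.out.append(re.sub(r"\s+", " ", data))  # collapse runs
        # Whitespace-only text outside preserved elements is dropped.

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def mini_html(html):
    parser = MiniSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = "<div>\n    <p>Hello   world</p>\n    <pre>  keep\n  this </pre>\n</div>\n"
print(mini_html(sample))
```

The indentation and line breaks between the tags disappear and the run of spaces inside the paragraph collapses, while the text inside pre comes through unchanged.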
413+
414+
415+
**formatHTML script**
416+
400417

401418
A script, formatHTML, comes with this package. It formats an input file and writes the result to a file or to stdout:
402419

403-
Usage: formatHTML (Optional: [/path/to/in.html]) (optional: [/path/to/output.html])
420+
Usage: formatHTML (Optional Arguments) (optional: /path/to/in.html) (optional: [/path/to/output.html])
421+
Formats HTML on input and writes to output.
422+
423+
Optional Arguments:
424+
-------------------
425+
426+
-e [encoding] - Specify an encoding to use. Default is utf-8
427+
428+
-m or --mini - Output "mini" HTML (only retain functional whitespace,
429+
strip the rest and no indentation)
404430

405-
Formats HTML on input and writes to output file, or stdout if output file is omitted.
431+
-p or --pretty - Output "pretty" HTML [This is the default mode]
406432

407433

408-
If output filename is not specified or is empty string, output will be to stdout.
434+
--indent=' ' - Use the provided string [default 4-spaces] to represent each
435+
level of nesting. Use --indent=" " for 1 tab instead, for example.
436+
Affects pretty printing mode only
409437

410-
If input filename is not specified or is empty string, input will be from stdin
411438

439+
If output filename is not specified or is empty string, output will be to stdout.
440+
If input filename is not specified or is empty string, input will be from stdin
441+
If -e is provided, will use that as the encoding. Defaults to utf-8
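
As a reading aid, the option surface described above can be modelled with argparse. This is only a sketch of the documented interface, not the formatHTML script's source, and it is an assumption that argparse matches how the script parses its arguments. The names build_parser, mode, infile and outfile are hypothetical. The flags and defaults are taken from the usage text.

```python
import argparse


def build_parser():
    """Model of the documented formatHTML options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="formatHTML",
        description="Formats HTML on input and writes to output.",
    )
    parser.add_argument("-e", dest="encoding", default="utf-8",
                        metavar="encoding",
                        help="Encoding to use. Default is utf-8")
    # Assumption: if both -m and -p are given, the last one wins.
    parser.add_argument("-m", "--mini", dest="mode", action="store_const",
                        const="mini", default="pretty",
                        help="Output mini HTML (functional whitespace only)")
    parser.add_argument("-p", "--pretty", dest="mode", action="store_const",
                        const="pretty", default="pretty",
                        help="Output pretty HTML (the default mode)")
    parser.add_argument("--indent", default="    ",
                        help="String used per nesting level (pretty mode only)")
    # An empty string stands for stdin / stdout, as in the usage notes.
    parser.add_argument("infile", nargs="?", default="")
    parser.add_argument("outfile", nargs="?", default="")
    return parser


args = build_parser().parse_args(["--indent=\t", "in.html", "out.html"])
print(args.mode, repr(args.indent), args.infile, args.outfile)
```

With no positional arguments both infile and outfile stay empty, which corresponds to reading stdin and writing stdout.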
412442

413443

414444
Notes

README.rst

Lines changed: 41 additions & 4 deletions
@@ -400,6 +400,8 @@ Each of the get\* functions above takes an additional "useIndex" function, which
400400
AdvancedHTMLFormatter and formatHTML
401401
------------------------------------
402402

403+
**AdvancedHTMLFormatter**
404+
403405
The AdvancedHTMLFormatter formats HTML into a pretty layout. It can handle elements like pre, core, script, style, etc to keep their contents preserved, but does not understand CSS rules.
404406

405407
The methods are:
@@ -415,18 +417,53 @@ The methods are:
415417
getRoot \- Gets the "root" node (on a valid document this should be <html>). For arbitrary HTML, you should use getRootNodes, as there may be several nodes at the same outermost level
416418

417419

420+
You can access this same formatting off an AdvancedHTMLParser.AdvancedHTMLParser (or IndexedAdvancedHTMLParser) by calling .getFormattedHTML()
421+
422+
423+
**AdvancedHTMLMiniFormatter**
424+
425+
The AdvancedHTMLMiniFormatter will strip all non-functional whitespace (that is, whitespace which neither adds a visible space to the document nor is required for xhtml) and provide no indentation.
426+
427+
Use this when pretty-printing doesn't matter and you'd like to save space.
428+
429+
430+
You can access this same formatting off an AdvancedHTMLParser.AdvancedHTMLParser (or IndexedAdvancedHTMLParser) by calling .getMiniHTML()
431+
432+
433+
**formatHTML script**
434+
418435

419436
A script, formatHTML, comes with this package. It formats an input file and writes the result to a file or to stdout:
420437

421-
Usage: formatHTML (Optional: [/path/to/in.html]) (optional: [/path/to/output.html])
438+
Usage: formatHTML (Optional Arguments) (optional: /path/to/in.html) (optional: [/path/to/output.html])
439+
440+
Formats HTML on input and writes to output.
441+
442+
Optional Arguments:
443+
444+
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
445+
446+
\-e [encoding] \- Specify an encoding to use. Default is utf\-8
447+
448+
\-m or \-\-mini \- Output "mini" HTML (only retain functional whitespace,
449+
450+
strip the rest and no indentation)
451+
452+
\-p or \-\-pretty \- Output "pretty" HTML [This is the default mode]
453+
454+
455+
\-\-indent=' ' \- Use the provided string [default 4\-spaces] to represent each
456+
457+
level of nesting. Use \-\-indent=" " for 1 tab instead, for example.
422458

423-
Formats HTML on input and writes to output file, or stdout if output file is omitted.
459+
Affects pretty printing mode only
424460

425461

426-
If output filename is not specified or is empty string, output will be to stdout.
462+
If output filename is not specified or is empty string, output will be to stdout.
427463

428-
If input filename is not specified or is empty string, input will be from stdin
464+
If input filename is not specified or is empty string, input will be from stdin
429465

466+
If \-e is provided, will use that as the encoding. Defaults to utf\-8
430467

431468

432469
Notes
