Commit 135b1d7

Add AdvancedHTMLParser.AdvancedHTMLParser.setDoctype, which sets the doctype (instead of assigning parser.doctype = "DOCTYPE html" directly). A doctype found while parsing a page is still detected and used as before. Passing None to this function clears the doctype, so AdvancedHTMLParser.getHTML outputs no doctype tag.
1 parent 9d49f4d commit 135b1d7

1 file changed: +17 -0 lines changed

AdvancedHTMLParser/Parser.py

Lines changed: 17 additions & 0 deletions
@@ -287,6 +287,22 @@ def setRoot(self, root):
         '''
         self.root = root
 
+
+    def setDoctype(self, newDoctype):
+        '''
+            setDoctype - Set the doctype for this document, or clear it.
+
+                @param newDoctype <str/None> -
+
+                    If None, will clear the doctype and not return one with #getHTML
+
+                    Otherwise, a string of the full doctype tag.
+
+                      For example, the HTML5 doctype would be "DOCTYPE html"
+        '''
+        self.doctype = newDoctype
+
+
     def getElementsByTagName(self, tagName, root='root'):
         '''
             getElementsByTagName - Searches and returns all elements with a specific tag name.
@@ -1120,6 +1136,7 @@ def setRoot(self, root):
     # Public
     ##########################################################
 
+
     # This should be called if you modify a parsed tree at an element level, then search it.
     def reindex(self, newIndexIDs=None, newIndexNames=None, newIndexClassNames=None, newIndexTagNames=None):
         '''
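
The behavior described in the commit message can be sketched with a small stand-in class. This is a minimal illustration only: `MiniParser` and its `body` attribute are hypothetical and are not part of AdvancedHTMLParser. The `setDoctype` body matches the one added in this commit, but the way `getHTML` renders the doctype (wrapping it as `<!...>` ahead of the markup) is an assumption, since this diff does not show that method.

```python
class MiniParser(object):
    '''
        Hypothetical stand-in for illustration, NOT the real AdvancedHTMLParser class.
    '''

    def __init__(self):
        # No doctype until one is parsed or set explicitly.
        self.doctype = None
        # Placeholder markup standing in for a parsed tree.
        self.body = '<html><body></body></html>'

    def setDoctype(self, newDoctype):
        # Same one-line body as the method added in this commit.
        self.doctype = newDoctype

    def getHTML(self):
        # Assumption: a set doctype is emitted as "<!...>" before the markup,
        # and None means no doctype tag is emitted at all.
        if self.doctype is None:
            return self.body
        return '<!%s>\n%s' % (self.doctype, self.body)


parser = MiniParser()

# Set the HTML5 doctype using the string form given in the docstring.
parser.setDoctype('DOCTYPE html')
with_doctype = parser.getHTML()

# Clear the doctype again by passing None.
parser.setDoctype(None)
without_doctype = parser.getHTML()
```

Going through a setter rather than assigning `parser.doctype` directly gives callers one documented entry point for both setting and clearing the value.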
