[222_34] Enhance HTML format detection #2655
Pull request overview
This PR enhances HTML format detection in TeXmacs by implementing multiple statistical and heuristic methods to identify HTML content. The changes add sophisticated detection capabilities beyond simple tag matching.
Changes:
- Implemented density-based detection algorithms (angle brackets, HTML tags, and attributes)
- Added line-by-line HTML feature detection with configurable thresholds
- Added div tag balance checking and short text detection logic
- Extended the `html-recognizes-at?` function with comprehensive tag checks and statistical fallbacks
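The overview mentions div tag balance checking, but this page does not show how the PR implements it. As a rough illustration only, the sketch below assumes the rule "at least one `<div` and equally many `</div`". The names `count-substring` and `div-tags-balanced?` are hypothetical and are not taken from the PR.

```scheme
;; Hypothetical helper: count occurrences of pat in s (overlaps allowed).
(define (count-substring s pat)
  (let ((slen (string-length s))
        (plen (string-length pat)))
    (let loop ((i 0) (n 0))
      (if (> (+ i plen) slen)
          n
          (loop (+ i 1)
                (if (string=? (substring s i (+ i plen)) pat)
                    (+ n 1)
                    n))))))

;; Assumed rule (not confirmed by the PR): the text has at least one
;; "<div" and exactly as many "</div" as "<div".
;; Expects an already lowercased string.
(define (div-tags-balanced? lc-s)
  (let ((opens (count-substring lc-s "<div"))
        (closes (count-substring lc-s "</div")))
    (and (> opens 0) (= opens closes))))

(display (div-tags-balanced? "<div><div>x</div></div>")) ; balanced
(newline)
(display (div-tags-balanced? "<div>unclosed"))           ; not balanced
(newline)
```

Because `"<div"` never matches inside `"</div"` (the `/` sits between `<` and `d`), the two counts stay independent.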
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| devel/222_34.md | Documentation describing the enhanced HTML detection features and testing instructions |
| TeXmacs/tests/222_34.scm | Comprehensive test suite with 15 HTML test cases and 16 non-HTML test cases covering various edge cases |
| TeXmacs/plugins/html/progs/data/html.scm | Implementation of enhanced HTML detection algorithms including density calculations and the updated recognition function |
| (/ (+ (charactor-from-string substr #\<) | ||
| (charactor-from-string substr #\>)) | ||
| len)))) |
Copilot
AI
Jan 23, 2026
Logic error in density calculation. The function sums the densities returned by charactor-from-string (which are already ratios count/len) and then divides by len again. This double division produces incorrect results. The correct approach is to sum the character counts and then divide once by the total length, or use the already-computed density values without further division.
| (/ (+ (charactor-from-string substr #\<) | |
| (charactor-from-string substr #\>)) | |
| len)))) | |
| (+ (charactor-from-string substr #\<) | |
| (charactor-from-string substr #\>)))) |
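To make the double division concrete, here is a small worked sketch. It assumes, as the comment above states, that `charactor-from-string` already returns a ratio; since that function's body is not shown here, the hypothetical `char-density` stands in for it, and `count-char`, `bracket-density` and `bracket-density-double-divided` are likewise illustrative names.

```scheme
;; Hypothetical helper: number of occurrences of character c in s.
(define (count-char s c)
  (let loop ((i 0) (n 0))
    (if (>= i (string-length s))
        n
        (loop (+ i 1)
              (if (char=? (string-ref s i) c) (+ n 1) n)))))

;; Stand-in for charactor-from-string, assumed to return count/len.
(define (char-density s c)
  (if (= (string-length s) 0)
      0
      (/ (count-char s c) (string-length s))))

;; Shape of the flagged code: sums two ratios, then divides by len again.
;; Only meaningful for non-empty strings.
(define (bracket-density-double-divided s)
  (/ (+ (char-density s #\<) (char-density s #\>))
     (string-length s)))

;; One possible fix: sum the raw counts and divide once.
(define (bracket-density s)
  (if (= (string-length s) 0)
      0
      (/ (+ (count-char s #\<) (count-char s #\>))
         (string-length s))))

(define sample "<p>hi</p>")   ; 9 characters, two #\< and two #\>
(display (bracket-density sample))                 ; 4/9
(newline)
(display (bracket-density-double-divided sample))  ; 4/81
(newline)
```

For this 9-character sample the double-divided value is smaller by a factor of 9, so any threshold tuned against it would shift with input length.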
| (substr (substring s 0 limit))) | ||
| (/ (+ (charactor-from-string substr #\=) | ||
| (charactor-from-string substr #\")) |
Copilot
AI
Jan 23, 2026
Logic error in density calculation. The function sums the densities returned by charactor-from-string (which are already ratios count/len) and then divides by len again. This double division produces incorrect results. The correct approach is to sum the character counts and then divide once by the total length, or use the already-computed density values without further division.
| (determine-short-html-string s) | ||
| #f)) | ||
|
|
||
| (define (is-html-string? s) |
Copilot
AI
Jan 23, 2026
Inconsistent indentation. This line has a leading space before the opening parenthesis, while all other function definitions in the file start at column 1. Remove the leading space for consistency.
| (define (is-html-string? s) | |
| (define (is-html-string? s) |
| (let ((count (+ (html-string-count-substring lc-substr "<div") | ||
| (html-string-count-substring lc-substr "<span") | ||
| (html-string-count-substring lc-substr "<p") | ||
| (html-string-count-substring lc-substr "<a") | ||
| (html-string-count-substring lc-substr "<img") | ||
| (html-string-count-substring lc-substr "<ul") | ||
| (html-string-count-substring lc-substr "<ol") | ||
| (html-string-count-substring lc-substr "<li") | ||
| (html-string-count-substring lc-substr "<table") | ||
| (html-string-count-substring lc-substr "<tr") | ||
| (html-string-count-substring lc-substr "<td") | ||
| (html-string-count-substring lc-substr "<th") | ||
| (html-string-count-substring lc-substr "<h1") | ||
| (html-string-count-substring lc-substr "<h2") | ||
| (html-string-count-substring lc-substr "<h3") | ||
| (html-string-count-substring lc-substr "<h4") | ||
| (html-string-count-substring lc-substr "<h5") | ||
| (html-string-count-substring lc-substr "<h6") | ||
| (html-string-count-substring lc-substr "<form") | ||
| (html-string-count-substring lc-substr "<input") | ||
| (html-string-count-substring lc-substr "<button") | ||
| (html-string-count-substring lc-substr "<textarea") | ||
| (html-string-count-substring lc-substr "<select") | ||
| (html-string-count-substring lc-substr "<option") | ||
| (html-string-count-substring lc-substr "<style") | ||
| (html-string-count-substring lc-substr "<script") | ||
| (html-string-count-substring lc-substr "<meta") | ||
| (html-string-count-substring lc-substr "<link") | ||
| (html-string-count-substring lc-substr "</div") | ||
| (html-string-count-substring lc-substr "</ul") | ||
| (html-string-count-substring lc-substr "</ol") | ||
| (html-string-count-substring lc-substr "</table") | ||
| (html-string-count-substring lc-substr "</tr") | ||
| (html-string-count-substring lc-substr "</form") | ||
| (html-string-count-substring lc-substr "</style") | ||
| (html-string-count-substring lc-substr "</script")))) | ||
| (/ count len))))) |
Copilot
AI
Jan 23, 2026
Performance concern: This function performs multiple linear scans of the same string, calling html-string-count-substring 34 times. Each call scans the entire substring. For better performance, consider combining these checks into a single pass through the string, using a state machine or regex pattern matching to identify all tag types in one scan.
| (let ((count (+ (html-string-count-substring lc-substr "<div") | |
| (html-string-count-substring lc-substr "<span") | |
| (html-string-count-substring lc-substr "<p") | |
| (html-string-count-substring lc-substr "<a") | |
| (html-string-count-substring lc-substr "<img") | |
| (html-string-count-substring lc-substr "<ul") | |
| (html-string-count-substring lc-substr "<ol") | |
| (html-string-count-substring lc-substr "<li") | |
| (html-string-count-substring lc-substr "<table") | |
| (html-string-count-substring lc-substr "<tr") | |
| (html-string-count-substring lc-substr "<td") | |
| (html-string-count-substring lc-substr "<th") | |
| (html-string-count-substring lc-substr "<h1") | |
| (html-string-count-substring lc-substr "<h2") | |
| (html-string-count-substring lc-substr "<h3") | |
| (html-string-count-substring lc-substr "<h4") | |
| (html-string-count-substring lc-substr "<h5") | |
| (html-string-count-substring lc-substr "<h6") | |
| (html-string-count-substring lc-substr "<form") | |
| (html-string-count-substring lc-substr "<input") | |
| (html-string-count-substring lc-substr "<button") | |
| (html-string-count-substring lc-substr "<textarea") | |
| (html-string-count-substring lc-substr "<select") | |
| (html-string-count-substring lc-substr "<option") | |
| (html-string-count-substring lc-substr "<style") | |
| (html-string-count-substring lc-substr "<script") | |
| (html-string-count-substring lc-substr "<meta") | |
| (html-string-count-substring lc-substr "<link") | |
| (html-string-count-substring lc-substr "</div") | |
| (html-string-count-substring lc-substr "</ul") | |
| (html-string-count-substring lc-substr "</ol") | |
| (html-string-count-substring lc-substr "</table") | |
| (html-string-count-substring lc-substr "</tr") | |
| (html-string-count-substring lc-substr "</form") | |
| (html-string-count-substring lc-substr "</style") | |
| (html-string-count-substring lc-substr "</script")))) | |
| (/ count len))))) | |
| (letrec* ((string-prefix-at? | |
| (lambda (s prefix idx) | |
| (let* ((s-len (string-length s)) | |
| (p-len (string-length prefix))) | |
| (if (> (+ idx p-len) s-len) | |
| #f | |
| (let loop ((j 0)) | |
| (if (= j p-len) | |
| #t | |
| (if (char=? (string-ref s (+ idx j)) | |
| (string-ref prefix j)) | |
| (loop (+ j 1)) | |
| #f))))))) | |
| (tags '("<div" | |
| "<span" | |
| "<p" | |
| "<a" | |
| "<img" | |
| "<ul" | |
| "<ol" | |
| "<li" | |
| "<table" | |
| "<tr" | |
| "<td" | |
| "<th" | |
| "<h1" | |
| "<h2" | |
| "<h3" | |
| "<h4" | |
| "<h5" | |
| "<h6" | |
| "<form" | |
| "<input" | |
| "<button" | |
| "<textarea" | |
| "<select" | |
| "<option" | |
| "<style" | |
| "<script" | |
| "<meta" | |
| "<link" | |
| "</div" | |
| "</ul" | |
| "</ol" | |
| "</table" | |
| "</tr" | |
| "</form" | |
| "</style" | |
| "</script"))) | |
| (substr-len (string-length lc-substr))) | |
| (let loop ((i 0) (count 0)) | |
| (if (>= i substr-len) | |
| (/ count len) | |
| (let ((new-count | |
| (let tag-loop ((ts tags) (c count)) | |
| (if (null? ts) | |
| c | |
| (if (string-prefix-at? lc-substr (car ts) i) | |
| (tag-loop (cdr ts) (+ c 1)) | |
| (tag-loop (cdr ts) c)))))) | |
| (loop (+ i 1) new-count))))))) |
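One way to gain confidence in a single-pass rewrite like the suggestion above is to compare its total against the repeated-scan baseline on a small input. The sketch below does that with three prefixes from the list. All function names are hypothetical, and the single-pass version only tests positions holding `#\<`, which assumes every prefix starts with `<`.

```scheme
;; Does s contain prefix starting at index start?
(define (prefix-at? s prefix start)
  (let ((plen (string-length prefix)))
    (and (<= (+ start plen) (string-length s))
         (string=? (substring s start (+ start plen)) prefix))))

;; Baseline: one full scan of s per prefix, totals added up.
(define (count-by-repeated-scans s prefixes)
  (let outer ((ps prefixes) (total 0))
    (if (null? ps)
        total
        (outer (cdr ps)
               (+ total
                  (let scan ((i 0) (n 0))
                    (if (>= i (string-length s))
                        n
                        (scan (+ i 1)
                              (if (prefix-at? s (car ps) i) (+ n 1) n)))))))))

;; Single pass: walk s once and try the prefixes only where s has #\<.
;; Assumes every prefix begins with #\<.
(define (count-in-one-pass s prefixes)
  (let loop ((i 0) (total 0))
    (if (>= i (string-length s))
        total
        (loop (+ i 1)
              (if (char=? (string-ref s i) #\<)
                  (let scan-prefixes ((ps prefixes) (n total))
                    (cond ((null? ps) n)
                          ((prefix-at? s (car ps) i)
                           (scan-prefixes (cdr ps) (+ n 1)))
                          (else (scan-prefixes (cdr ps) n))))
                  total)))))

(define sample "<ul><li>a</li></ul>")
(define prefixes '("<ul" "<li" "</ul"))
(display (count-by-repeated-scans sample prefixes)) ; 3
(newline)
(display (count-in-one-pass sample prefixes))       ; 3
(newline)
```

Both versions count every matching prefix at a position, so a tag such as `<link` would still be counted under both `"<li"` and `"<link"`, exactly as in the original 34-call sum.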
| (check (html-recognizes-at? non-html-text15 0) => #f) | ||
| (check (html-recognizes-at? non-html-text16 0) => #f)) | ||
|
|
||
| (tm-define (test_222_33) |
Copilot
AI
Jan 23, 2026
The function name is inconsistent with the test ID. The function is named "test_222_33" but the file and module are for "222_34". This should be "test_222_34" to match the correct test identifier.
| (tm-define (test_222_33) | |
| (tm-define (test_222_34) |
| (/ (+ (charactor-from-string substr #\<) | ||
| (charactor-from-string substr #\>)) | ||
| len)))) |
Copilot
AI
Jan 23, 2026
Logic error in density calculation. The function sums the densities returned by charactor-from-string (which are already ratios count/len) and then divides by len again. This double division produces incorrect results. The correct approach is to sum the character counts and then divide once by the total length, or use the already-computed density values without further division.
| (/ (+ (charactor-from-string substr #\<) | |
| (charactor-from-string substr #\>)) | |
| len)))) | |
| (+ (charactor-from-string substr #\<) | |
| (charactor-from-string substr #\>))))) |
| (/ (+ (charactor-from-string substr #\=) | ||
| (charactor-from-string substr #\")) | ||
| len)))) |
Copilot
AI
Jan 23, 2026
Logic error in density calculation. The function sums the densities returned by charactor-from-string (which are already ratios count/len) and then divides by len again. This double division produces incorrect results. The correct approach is to sum the character counts and then divide once by the total length, or use the already-computed density values without further division.
| (define (html-tag-density s) | ||
| (if (string-null? s) | ||
| 0 | ||
| (let* ((len (string-length s)) | ||
| (limit (if (>= len 1000) 1000 len)) | ||
| (substr (substring s 0 limit)) | ||
| (lc-substr (string-downcase substr))) | ||
| (let ((count (+ (html-string-count-substring lc-substr "<div") | ||
| (html-string-count-substring lc-substr "<span") | ||
| (html-string-count-substring lc-substr "<p") | ||
| (html-string-count-substring lc-substr "<a") | ||
| (html-string-count-substring lc-substr "<img") | ||
| (html-string-count-substring lc-substr "<ul") | ||
| (html-string-count-substring lc-substr "<ol") | ||
| (html-string-count-substring lc-substr "<li") | ||
| (html-string-count-substring lc-substr "<table") | ||
| (html-string-count-substring lc-substr "<tr") | ||
| (html-string-count-substring lc-substr "<td") | ||
| (html-string-count-substring lc-substr "<th") | ||
| (html-string-count-substring lc-substr "<h1") | ||
| (html-string-count-substring lc-substr "<h2") | ||
| (html-string-count-substring lc-substr "<h3") | ||
| (html-string-count-substring lc-substr "<h4") | ||
| (html-string-count-substring lc-substr "<h5") | ||
| (html-string-count-substring lc-substr "<h6") | ||
| (html-string-count-substring lc-substr "<form") | ||
| (html-string-count-substring lc-substr "<input") | ||
| (html-string-count-substring lc-substr "<button") | ||
| (html-string-count-substring lc-substr "<textarea") | ||
| (html-string-count-substring lc-substr "<select") | ||
| (html-string-count-substring lc-substr "<option") | ||
| (html-string-count-substring lc-substr "<style") | ||
| (html-string-count-substring lc-substr "<script") | ||
| (html-string-count-substring lc-substr "<meta") | ||
| (html-string-count-substring lc-substr "<link") | ||
| (html-string-count-substring lc-substr "</div") | ||
| (html-string-count-substring lc-substr "</ul") | ||
| (html-string-count-substring lc-substr "</ol") | ||
| (html-string-count-substring lc-substr "</table") | ||
| (html-string-count-substring lc-substr "</tr") | ||
| (html-string-count-substring lc-substr "</form") | ||
| (html-string-count-substring lc-substr "</style") | ||
| (html-string-count-substring lc-substr "</script")))) | ||
| (/ count len))))) |
Copilot
AI
Jan 23, 2026
Performance concern: This function performs multiple linear scans of the same string, calling html-string-count-substring 34 times. Each call scans the entire substring. For better performance, consider combining these checks into a single pass through the string, using a state machine or regex pattern matching to identify all tag types in one scan.
| (define (html-tag-density s) | |
| (if (string-null? s) | |
| 0 | |
| (let* ((len (string-length s)) | |
| (limit (if (>= len 1000) 1000 len)) | |
| (substr (substring s 0 limit)) | |
| (lc-substr (string-downcase substr))) | |
| (let ((count (+ (html-string-count-substring lc-substr "<div") | |
| (html-string-count-substring lc-substr "<span") | |
| (html-string-count-substring lc-substr "<p") | |
| (html-string-count-substring lc-substr "<a") | |
| (html-string-count-substring lc-substr "<img") | |
| (html-string-count-substring lc-substr "<ul") | |
| (html-string-count-substring lc-substr "<ol") | |
| (html-string-count-substring lc-substr "<li") | |
| (html-string-count-substring lc-substr "<table") | |
| (html-string-count-substring lc-substr "<tr") | |
| (html-string-count-substring lc-substr "<td") | |
| (html-string-count-substring lc-substr "<th") | |
| (html-string-count-substring lc-substr "<h1") | |
| (html-string-count-substring lc-substr "<h2") | |
| (html-string-count-substring lc-substr "<h3") | |
| (html-string-count-substring lc-substr "<h4") | |
| (html-string-count-substring lc-substr "<h5") | |
| (html-string-count-substring lc-substr "<h6") | |
| (html-string-count-substring lc-substr "<form") | |
| (html-string-count-substring lc-substr "<input") | |
| (html-string-count-substring lc-substr "<button") | |
| (html-string-count-substring lc-substr "<textarea") | |
| (html-string-count-substring lc-substr "<select") | |
| (html-string-count-substring lc-substr "<option") | |
| (html-string-count-substring lc-substr "<style") | |
| (html-string-count-substring lc-substr "<script") | |
| (html-string-count-substring lc-substr "<meta") | |
| (html-string-count-substring lc-substr "<link") | |
| (html-string-count-substring lc-substr "</div") | |
| (html-string-count-substring lc-substr "</ul") | |
| (html-string-count-substring lc-substr "</ol") | |
| (html-string-count-substring lc-substr "</table") | |
| (html-string-count-substring lc-substr "</tr") | |
| (html-string-count-substring lc-substr "</form") | |
| (html-string-count-substring lc-substr "</style") | |
| (html-string-count-substring lc-substr "</script")))) | |
| (/ count len))))) | |
| ;; Helper: check whether string `s` has prefix `prefix` starting at index `start`. | |
| (define (html-string-prefix-at? s prefix start) | |
| (let* ((slen (string-length s)) | |
| (plen (string-length prefix))) | |
| (and (<= (+ start plen) slen) | |
| (let loop ((i 0)) | |
| (if (= i plen) | |
| #t | |
| (and (char=? (string-ref s (+ start i)) | |
| (string-ref prefix i)) | |
| (loop (+ i 1)))))))) | |
| ;; Helper: count all occurrences of any tag prefix in a single pass over `s`. | |
| (define (html-count-tag-prefixes s prefixes) | |
| (let ((len (string-length s))) | |
| (let loop ((i 0) (count 0)) | |
| (if (>= i len) | |
| count | |
| (let ((new-count | |
| (if (char=? (string-ref s i) #\<) | |
| (+ count | |
| (let scan-prefixes ((ps prefixes) (c 0)) | |
| (if (null? ps) | |
| c | |
| (let ((p (car ps))) | |
| (if (html-string-prefix-at? s p i) | |
| (scan-prefixes (cdr ps) (+ c 1)) | |
| (scan-prefixes (cdr ps) c)))))) | |
| count))) | |
| (loop (+ i 1) new-count))))) | |
| (define (html-tag-density s) | |
| (if (string-null? s) | |
| 0 | |
| (let* ((len (string-length s)) | |
| (limit (if (>= len 1000) 1000 len)) | |
| (substr (substring s 0 limit)) | |
| (lc-substr (string-downcase substr)) | |
| (tag-prefixes '("<div" | |
| "<span" | |
| "<p" | |
| "<a" | |
| "<img" | |
| "<ul" | |
| "<ol" | |
| "<li" | |
| "<table" | |
| "<tr" | |
| "<td" | |
| "<th" | |
| "<h1" | |
| "<h2" | |
| "<h3" | |
| "<h4" | |
| "<h5" | |
| "<h6" | |
| "<form" | |
| "<input" | |
| "<button" | |
| "<textarea" | |
| "<select" | |
| "<option" | |
| "<style" | |
| "<script" | |
| "<meta" | |
| "<link" | |
| "</div" | |
| "</ul" | |
| "</ol" | |
| "</table" | |
| "</tr" | |
| "</form" | |
| "</style" | |
| "</script"))) | |
| (let ((count (html-count-tag-prefixes lc-substr tag-prefixes))) | |
| (/ count len))))) |
| ;; Does this line of text contain HTML tags? | ||
| (define (html-line-contains-features? line) | ||
| (let ((lc-line (string-downcase line))) | ||
| (or | ||
| (> (html-string-count-substring lc-line "<div") 0) | ||
| (> (html-string-count-substring lc-line "<span") 0) | ||
| (> (html-string-count-substring lc-line "<p") 0) | ||
| (> (html-string-count-substring lc-line "<a") 0) | ||
| (> (html-string-count-substring lc-line "<img") 0) | ||
| (> (html-string-count-substring lc-line "<ul") 0) | ||
| (> (html-string-count-substring lc-line "<ol") 0) | ||
| (> (html-string-count-substring lc-line "<li") 0) | ||
| (> (html-string-count-substring lc-line "<table") 0) | ||
| (> (html-string-count-substring lc-line "<tr") 0) | ||
| (> (html-string-count-substring lc-line "<td") 0) | ||
| (> (html-string-count-substring lc-line "<th") 0) | ||
| (> (html-string-count-substring lc-line "<h1") 0) | ||
| (> (html-string-count-substring lc-line "<h2") 0) | ||
| (> (html-string-count-substring lc-line "<h3") 0) | ||
| (> (html-string-count-substring lc-line "<h4") 0) | ||
| (> (html-string-count-substring lc-line "<h5") 0) | ||
| (> (html-string-count-substring lc-line "<h6") 0) | ||
| (> (html-string-count-substring lc-line "</div") 0) | ||
| (> (html-string-count-substring lc-line "</span") 0) | ||
| (> (html-string-count-substring lc-line "</p") 0) | ||
| (> (html-string-count-substring lc-line "</a") 0) | ||
| (> (html-string-count-substring lc-line "/>") 0) | ||
| (> (html-string-count-substring lc-line "<!doctype") 0) | ||
| (> (html-string-count-substring lc-line "<?xml") 0)))) |
Copilot
AI
Jan 23, 2026
Performance concern: This function performs multiple linear scans of the same string, calling html-string-count-substring 25 times. Each call scans the entire line. For better performance, consider combining these checks into a single pass through the line, or using a more efficient pattern matching approach.
| ;; Does this line of text contain HTML tags? | |
| (define (html-line-contains-features? line) | |
| (let ((lc-line (string-downcase line))) | |
| (or | |
| (> (html-string-count-substring lc-line "<div") 0) | |
| (> (html-string-count-substring lc-line "<span") 0) | |
| (> (html-string-count-substring lc-line "<p") 0) | |
| (> (html-string-count-substring lc-line "<a") 0) | |
| (> (html-string-count-substring lc-line "<img") 0) | |
| (> (html-string-count-substring lc-line "<ul") 0) | |
| (> (html-string-count-substring lc-line "<ol") 0) | |
| (> (html-string-count-substring lc-line "<li") 0) | |
| (> (html-string-count-substring lc-line "<table") 0) | |
| (> (html-string-count-substring lc-line "<tr") 0) | |
| (> (html-string-count-substring lc-line "<td") 0) | |
| (> (html-string-count-substring lc-line "<th") 0) | |
| (> (html-string-count-substring lc-line "<h1") 0) | |
| (> (html-string-count-substring lc-line "<h2") 0) | |
| (> (html-string-count-substring lc-line "<h3") 0) | |
| (> (html-string-count-substring lc-line "<h4") 0) | |
| (> (html-string-count-substring lc-line "<h5") 0) | |
| (> (html-string-count-substring lc-line "<h6") 0) | |
| (> (html-string-count-substring lc-line "</div") 0) | |
| (> (html-string-count-substring lc-line "</span") 0) | |
| (> (html-string-count-substring lc-line "</p") 0) | |
| (> (html-string-count-substring lc-line "</a") 0) | |
| (> (html-string-count-substring lc-line "/>") 0) | |
| (> (html-string-count-substring lc-line "<!doctype") 0) | |
| (> (html-string-count-substring lc-line "<?xml") 0)))) | |
| ;; List of HTML feature prefixes (all lowercase, to match string-downcase) | |
| (define html-feature-prefixes | |
| '("<div" "<span" "<p" "<a" "<img" "<ul" "<ol" "<li" | |
| "<table" "<tr" "<td" "<th" | |
| "<h1" "<h2" "<h3" "<h4" "<h5" "<h6" | |
| "</div" "</span" "</p" "</a" | |
| "/>" "<!doctype" "<?xml")) | |
| ;; Check whether string s starts with prefix at position idx | |
| (define (string-starts-with-at? s idx prefix) | |
| (let* ((s-len (string-length s)) | |
| (p-len (string-length prefix))) | |
| (if (> (+ idx p-len) s-len) | |
| #f | |
| (let loop ((j 0)) | |
| (cond | |
| ((= j p-len) #t) | |
| ((char=? (string-ref s (+ idx j)) | |
| (string-ref prefix j)) | |
| (loop (+ j 1))) | |
| (else #f)))))) | |
| ;; Does the string contain any of the given prefixes? | |
| (define (string-contains-any-prefix? s prefixes) | |
| (let ((s-len (string-length s))) | |
| (let loop ((i 0)) | |
| (if (>= i s-len) | |
| #f | |
| (let check-prefixes ((ps prefixes)) | |
| (cond | |
| ((null? ps) (loop (+ i 1))) | |
| ((string-starts-with-at? s i (car ps)) #t) | |
| (else (check-prefixes (cdr ps))))))))) | |
| ;; Does this line of text contain HTML tags? | |
| (define (html-line-contains-features? line) | |
| (let ((lc-line (string-downcase line))) | |
| (string-contains-any-prefix? lc-line html-feature-prefixes))) |
No description provided.