Commit cc98fe1
SimpleChatTC:WebTools:UrlText:HtmlParser: tag drops - refine
Update the initial skeleton wrt the tag drops logic
* had forgotten to convert object to json string at the client end
* had confused js with python and tried accessing the dict
elements using . notation rather than [] notation in python.
* if the id filtered tag to be dropped is found, from then on
track all other tags of the same type (independent of id),
so that start and end tags can be matched. because the end tag
call won't have attributes, all other tags of the same type need
to be tracked, for proper winding and unwinding to try to find
the matching end tag
* remember to reset the tracked drop tag type to None once matching
end tag at same depth is found. should avoid some unnecessary
unwinding.
* set/fix the type wrt tagDrops explicitly to needed depth and
ensure the dummy one and any explicitly got one are of the right type.
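
The points above can be illustrated with a minimal sketch. It is not the code from this commit: the names DropTagTextParser, extract_text, tag_drops, drop_type and drop_depth are hypothetical, and the tag drops are assumed to arrive as a json string holding a list of {"tag", "id"} objects. Only Python's standard html.parser and json modules are used.

```python
import html.parser
import json


class DropTagTextParser(html.parser.HTMLParser):
    """Collect page text, skipping subtrees rooted at (tag, id) filtered tags."""

    def __init__(self, tag_drops):
        super().__init__()
        self.tag_drops = tag_drops  # assumed shape: [{"tag": "div", "id": "header"}]
        self.drop_type = None       # tag type currently being tracked, else None
        self.drop_depth = 0         # open tags of drop_type seen since the match
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.drop_type is not None:
            # Already dropping: wind up on every tag of the same type,
            # independent of id, because end tags carry no attributes.
            if tag == self.drop_type:
                self.drop_depth += 1
            return
        attr_id = dict(attrs).get("id")
        for drop in self.tag_drops:
            # Python dicts need [] access; drop.tag would raise AttributeError.
            if drop["tag"] == tag and drop["id"] == attr_id:
                self.drop_type = tag
                self.drop_depth = 1
                return

    def handle_endtag(self, tag):
        if self.drop_type is None or tag != self.drop_type:
            return
        self.drop_depth -= 1  # unwind
        if self.drop_depth == 0:
            # Matching end tag found at the same depth: stop tracking.
            self.drop_type = None

    def handle_data(self, data):
        if self.drop_type is None and data.strip():
            self.chunks.append(data.strip())


def extract_text(html_text, tag_drops_json):
    # The client is assumed to send the drop list as a json string.
    parser = DropTagTextParser(json.loads(tag_drops_json))
    parser.feed(html_text)
    parser.close()
    return "\n".join(parser.chunks)
```

With the page `<div id="header"><div>menu</div>skip</div><div id="main">keep</div>` and the drop list `[{"tag": "div", "id": "header"}]`, the nested div raises the depth to 2, so only the second closing div ends the dropped region and "keep" is the only text returned.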
Tested with duckduckgo search engine and now the div based unneeded
header is avoided in returned search result.
1 parent a36de21, commit cc98fe1
File tree
3 files changed (+31, -8 lines)
- tools/server/public_simplechat
- local.tools