
BUG: Clipper uses block latex even when inline latex would be appropriate #674

@KraXen72

Description

IMPORTANT: If your issue is related to missing content on page, please review the Troubleshooting instructions and open your issue on the Defuddle repo.
https://help.obsidian.md/web-clipper/troubleshoot

Version (please complete the following information):

  • OS: Fedora 43
  • Browser: Vivaldi
  • Web Clipper version: 0.12.0
  • Obsidian version: N/A as I'm copying the clipper output to clipboard

Describe the bug

The clipper often extracts inline latex/katex/math content as if it was block latex

Expected behavior

The clipper should extract inline formulas as inline latex.
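
To make the expectation concrete, here is a minimal sketch of the delimiter choice I'd expect, assuming the extractor knows whether a formula was inline or block on the page. `wrapMath` and `MathDisplay` are hypothetical names for illustration, not actual Web Clipper or Defuddle APIs:

```typescript
// Hypothetical helper for illustration only; these names are assumptions,
// not functions from Web Clipper or Defuddle.
type MathDisplay = "inline" | "block";

function wrapMath(latex: string, display: MathDisplay): string {
  const body = latex.trim();
  if (display === "block") {
    // Block math goes on its own lines between `$$` delimiters.
    return "$$\n" + body + "\n$$";
  }
  // Inline math stays inside the sentence between single `$` delimiters.
  return "$" + body + "$";
}

console.log(wrapMath("p_{i}", "inline"));
console.log(wrapMath("1 \\leq i \\leq n", "block"));
```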

URLs where the bug occurs
The original site is login-gated, so I've made a reproduction by copying out the HTML:
https://kraxen72.github.io/cdn/clipper-issue-repro.html

This is what the clipper currently outputs:

# Document
[view on web - source](http://0.0.0.0:8000/clipper-issue-repro.html)
  
### Description

A fish salesman has determined there are two lucrative spots $$
P
$$
 $P$  and $$
Q
$$
 $Q$  where he can set up his stand. He has (perfectly) predicted the profits to be had during a period of n days on each spot, call them $$
pi
$$
 $p_{i}$  and $$
qi
$$
 $q_{i}$  for $$
1≤i≤n
$$
 $1 \leq i \leq n$ . The salesman obviously wants to maximize his profit, but he cannot be in both spots on one day, so he will have to decide where he is going to be on each day. Breaking up his stand and setting it up again in the other spot is a difficult job, however, which takes a whole day, on which there will be no profits.

As an example consider the following instance:

\`\`\`
P  Q
80 90
30 60
30 60
70 50
80 20
\`\`\`

We expect `300` as output here, representing that we set up shop at location Q on days 1 and 2 and on location P on days 4 and 5 (with day 3 being the switch day).

Give an iterative dynamic programming solution to find the maximum profit the salesman can earn.

#### Test Data is Available

You can access the following files:

| `example.out` [Download](https://example.com/downloadData/152126/2956eed8-92f8-4e9b-864f-c1b0dcbf318d) |
| --- |
| `example.in` [Download](https://example.com/downloadData/152126/4c39506b-6bf4-4ca8-ab83-23b3342aa122) |

(codeblock escaped by me)

You can see that many of the inline formulas are extracted as block latex, each followed by its inline form, when inline latex alone would've been more appropriate.
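
As a stopgap until this is fixed, the duplicated shape in the output above (a `$$` block immediately followed by the same formula inline) can be cleaned up after clipping. This is only a rough sketch based on the output shown here: `collapseDuplicatedMath` is a hypothetical helper, not part of Web Clipper, and it assumes the block is always directly followed by an inline formula on the next line:

```typescript
// Hypothetical post-processing helper; not part of Web Clipper or Defuddle.
// Drops a `$$ ... $$` block when it is immediately followed by an inline
// `$...$` formula, which is the duplicated shape seen in the clipped output.
function collapseDuplicatedMath(markdown: string): string {
  // `$$` + newline, a block body without `$`, newline + `$$` + newline,
  // optional spaces, then a lookahead for an inline `$...$` formula.
  const duplicatedBlock = /\$\$\n[^$]*\n\$\$\n[ \t]*(?=\$[^$\n]+\$)/g;
  return markdown.replace(duplicatedBlock, "");
}

const clipped = "two spots $$\nP\n$$\n $P$  and $$\nQ\n$$\n $Q$  where";
console.log(collapseDuplicatedMath(clipped));
```

A genuine block formula followed by a blank line is left alone, but one that is directly followed by a line starting with an inline formula would also be dropped, so this is not a real fix.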

To reproduce

Make sure to select all text before extracting, otherwise it won't pick up all the formulas (I always select the section of the page I want to clip anyway).

Your template file

If you are using a custom template (i.e. not the Default template). Go to Web Clipper settings and click More → Copy as JSON. Paste the JSON code below.

{
	"schemaVersion": "0.1.0",
	"name": "Default Article",
	"behavior": "create",
	"noteContentFormat": "# {{title}}\n[view on web - source]({{url}})\n  \n{{content}}",
	"properties": [
		{
			"name": "title",
			"value": "{{title}}",
			"type": "text"
		},
		{
			"name": "source",
			"value": "{{url}}",
			"type": "text"
		},
		{
			"name": "author",
			"value": "{{author|split:\\\", \\\"|join}}",
			"type": "text"
		},
		{
			"name": "published",
			"value": "{{published}}",
			"type": "date"
		},
		{
			"name": "created",
			"value": "{{date}}",
			"type": "date"
		},
		{
			"name": "description",
			"value": "{{description}}",
			"type": "text"
		},
		{
			"name": "tags",
			"value": "clippings",
			"type": "multitext"
		}
	],
	"triggers": [],
	"noteNameFormat": "{{title}}",
	"path": "clippings"
}

Note that the Defuddle playground still renders it normally, and if I clip the result of the Defuddle playground, it also works fine:

[two screenshots attached]

Labels: bug