@emansih emansih commented Jun 15, 2019

The script to convert html files to md files is https://github.com/emansih/TaskerDocumentation/blob/ec316935bd02e3412a0b6d7aba63ad304da3d7ae/converter.sh

All files under the en directory are OK. However, the userguide_summary.md file is borked.

@goldfndr
Contributor

Many of the output files (e.g. ah_copy_file.md, ah_delete_file.md) have backslashes before newlines and quotes, which seems unnecessary (and less readable). GitHub's parser doesn't require them; does Tasker's?

@CennoxX

CennoxX commented Jun 16, 2019

Yes, there are some issues. Pandoc seems to have trouble with <br>: it writes the backslashes @goldfndr noticed at those positions. Also, the Tasker documentation uses void anchors like <A NAME="g"/> or even incorrect ones like

<A NAME="diff"/A>
<H4>Differences Between Widgets and Shortcuts</H4>

Although that works in HTML, it isn't good style and doesn't translate well to Markdown:

[]{#diff}
#### Differences Between Widgets and Shortcuts

It would be better to change the HTML to <H4 id="diff">Differences Between Widgets and Shortcuts</H4>, or to write #### Differences Between Widgets and Shortcuts{#diff} in Markdown.
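
The HTML change suggested above could be automated before the conversion is rerun. The following is only a minimal sketch, assuming the anchors always sit directly before an attribute-free heading tag, as in the example; the function name `move_anchor_into_heading` is made up for illustration and is not part of converter.sh.

```python
import re

# Matches a void or malformed anchor such as <A NAME="diff"/> or
# <A NAME="diff"/A>, optional whitespace, then a bare heading tag (H1-H6).
ANCHOR_BEFORE_HEADING = re.compile(
    r'<a\s+name="([^"]+)"\s*/(?:a)?>\s*<(h[1-6])>',
    re.IGNORECASE,
)


def move_anchor_into_heading(html: str) -> str:
    """Fold an anchor name into the following heading as an id attribute."""
    return ANCHOR_BEFORE_HEADING.sub(
        lambda m: f'<{m.group(2)} id="{m.group(1)}">', html
    )


if __name__ == "__main__":
    sample = (
        '<A NAME="diff"/A>\n'
        "<H4>Differences Between Widgets and Shortcuts</H4>"
    )
    print(move_anchor_into_heading(sample))
```

Anchors that are not followed by a heading (for example a stray <A NAME="g"/> in running text) are left untouched, so those would still need manual attention.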

@emansih
Author

emansih commented Jun 17, 2019

Well, we can either manually edit those Markdown files, or edit the HTML files and then run the conversion again.

@goldfndr
Contributor

I'd expect a sed script to handle stripping the superfluous backslashes. It's just a matter of creating a blacklist or whitelist and applying it; sed could help there too.

Has a bug been filed for Pandoc?
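
The backslash cleanup mentioned above could look roughly like this. It is a naive sketch in Python rather than sed, and the function name `strip_superfluous_backslashes` is hypothetical. One caveat: in both Pandoc Markdown and CommonMark a backslash at the end of a line marks a hard line break, so removing it turns that break into an ordinary soft wrap. The sketch also does not skip code blocks or distinguish an escaped backslash (`\\`) at the end of a line.

```python
import re


def strip_superfluous_backslashes(markdown: str) -> str:
    """Remove backslashes before straight quotes and before newlines."""
    # \" -> "  and  \' -> '
    text = re.sub(r'\\(["\'])', r"\1", markdown)
    # Drop a backslash that is the last character on a line.
    text = re.sub(r"\\$", "", text, flags=re.MULTILINE)
    return text


if __name__ == "__main__":
    sample = "Copies a file.\\\n" + r"Say \"hello\"."
    print(strip_superfluous_backslashes(sample))
```

If the line breaks should survive, the second substitution could instead replace the backslash with two trailing spaces, which both parsers also treat as a hard break.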

@emansih
Author

emansih commented Jun 18, 2019

No bug has been filed with Pandoc, since I am unsure whether it's a bug in Pandoc or bad code in the HTML.

@git-core

The backslashes in e.g. ah_copy_file.md seem to come from the <br/> tags present in the source, so it's more likely bad code in the HTML.
I guess Pent used some tool that produced XHTML; look at the head of en/index.html, for example. The <br/> tag is part of XHTML, not HTML.
Perhaps Pandoc can be instructed to handle <br/> tags, but the userguide has numerous mistakes in its HTML formatting anyway; en/variables.html is a prominent example.
I'd recommend feeding all those pages to an HTML checker first and fixing all reported errors before converting the sources to Markdown.
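
As a rough illustration of such a pre-check, here is a small tag-balance checker built on Python's standard `html.parser`. It is only a sketch and no substitute for a real validator: the names `TagBalanceChecker` and `check_html` are invented for this example, and the check is stricter than HTML itself, which lets some closing tags (such as </li> or </p>) be omitted.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Record closing tags that do not match the most recent open tag."""

    def __init__(self):
        super().__init__()
        self.stack = []      # (tag, line) for every tag still open
        self.problems = []   # human-readable findings

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
        else:
            line = self.getpos()[0]
            self.problems.append(f"line {line}: unexpected </{tag}>")


def check_html(source: str) -> list:
    """Return a list of findings; an empty list means the tags balance."""
    checker = TagBalanceChecker()
    checker.feed(source)
    checker.close()
    unclosed = [f"line {line}: <{tag}> never closed"
                for tag, line in checker.stack]
    return checker.problems + unclosed


if __name__ == "__main__":
    for finding in check_html("<p><b>bold text</p>"):
        print(finding)
```

Running it over each file under en before the Pandoc step would give a first list of places to fix by hand.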

@goldfndr
Contributor

Yes, Pent probably did have a tool, but for different reasons (en/index.html is hand-crafted, e.g. some list elements have closing tags and some don't). You can see the XML source for the actions and events and states. The XML's actions include 5.0's Take Screenshot and Set App Shortcuts, so it was definitely in use pre-João. I would expect that the tool was included with what Pent provided, but I don't see it here. The tool probably reads the source (res/values/*.xml), as the A-Z files and individual files have names that the XML doesn't (e.g. "Clear Key"), and the A-Z file is obviously alphabetically sorted (the XML seems to be randomized). Some of the entries (e.g. action_help_clear_encryption and action_help_airplane_radios) do include HTML (italic and bold respectively) so that's allowed.

It's also possible that a tool could convert Markdown files into XML and we can come full circle.
