@kattni
Contributor

kattni commented May 22, 2025

Converts the website from rST to Markdown.

Updates include:

  • Converts content from rST to Markdown.
  • Updates datamodels to match.
  • Adds Farsi to the language dropdown.
  • Makes the following changes to the .lektorproject file:
    • Adds Farsi.
    • Adds lektor-markdown-admonition plugin.
    • Removes lektor-rst plugin.
  • Applies CSS to the following:
    • Notes
    • Blockquotes

This PR is the long-awaited continuation and completion of #472.

PR Checklist:

  • [Sort of] All new features have been tested
  • [Mostly?] All new features have been documented
  • I have read the CONTRIBUTING.md file
  • I will abide by the code of conduct

Conversion script

The following is the script used to convert the content.

```python
from pathlib import Path
from urllib.parse import quote
from lektor.project import Project
from lektor.editor import make_editor_session
from lektor.constants import PRIMARY_ALT
import pandoc

project = Project.discover()
env = project.make_env()
pad = env.new_pad()


def get_rst_field_names(datamodel):
    for field in datamodel.fields:
        if field.type.name == "rst":
            yield field.name


def convert_rst_to_md(rst):
    # The RST treated the main headers in the content as H2; `shift-heading-level-by` does the same.
    document = pandoc.read(rst, format="rst", options=["--shift-heading-level-by", 1])
    return pandoc.write(document, format="markdown_strict")


for path in Path(".").glob("**/*.lr"):
    # First remove "content" prefix
    path = path.relative_to("content")

    # Find language alt
    if "+" in path.stem:
        alt = path.stem.split("+")[-1]
    else:
        alt = PRIMARY_ALT

    # Remove the file name from the path
    path = path.parent
    path = str(path)
    if path == ".":
        path = "/"
    else:
        path = "/" + quote(path)

    print(path)
    page = make_editor_session(pad, path, alt=alt)
    changed = False
    for f in get_rst_field_names(page.datamodel):
        if f not in page:
            continue
        page[f] = convert_rst_to_md(page[f])
        changed = True

    if changed:
        print(f"Saving page {page}")
        page.commit()
```

@freakboy3742 freakboy3742 added the preview Approved for an automated preview label May 22, 2025
@github-actions

github-actions bot commented May 22, 2025

Visit the preview URL for this PR (updated for commit cc13d9b):

https://beeware-org--pr629-markdown-dhic7esr.web.app

(expires Mon, 02 Jun 2025 01:49:26 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

Sign: b0da44bc067e7d9a4255c77cb2c5fce572218cec

@freakboy3742
Member

Reviewing this on a diff basis is going to be almost impossible, so a visual inspection of the output seems the best review approach. To that end:

  1. What (if any) manual modifications have been made to content other than what the conversion script generated?
  2. Are there any pages in particular that caused particularly complex conversion problems that you think are worth a deep review?

@kattni
Contributor Author

kattni commented May 22, 2025

Reviewing this on a diff basis is going to be almost impossible, so a visual inspection of the output seems the best review approach.

My strategy was to go through an arbitrary list of pages and compare. This turned out to be coincidentally effective, as we caught some issues with the initial conversion that would have otherwise been missed. Obviously, I did not go through the entire site.

  1. What (if any) manual modifications have been made to content other than what the conversion script generated?
  • There were four "Note"s throughout the site that needed to be manually updated.
  • The .lektorproject file, the beeware.css file, and all of the .ini files were manually updated.
  2. Are there any pages in particular that caused particularly complex conversion problems that you think are worth a deep review?

The homepage required manual updates, but everything else played quite nicely. I would suggest a deep comparison of the converted homepage against the current one. Otherwise, as I did, perhaps grab an arbitrary sample of rendered pages to compare.
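
The spot-check approach described above could be scripted roughly as follows. This is only a sketch: `sample_page_pairs` is a hypothetical helper, the page paths are ones mentioned in this thread, and the two base URLs are the live site and the preview URL posted by the bot.

```python
import random


def sample_page_pairs(paths, live_base, preview_base, k=5, seed=None):
    """Pick up to k page paths at random and pair their live and preview URLs."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(paths), min(k, len(paths)))
    return [
        (live_base.rstrip("/") + path, preview_base.rstrip("/") + path)
        for path in chosen
    ]


# Paths taken from this conversation; a real run would list every page.
pages = ["/contributing/", "/community/", "/contributing/first-time/github/"]
for live, preview in sample_page_pairs(
    pages,
    "https://beeware.org",
    "https://beeware-org--pr629-markdown-dhic7esr.web.app",
    k=2,
    seed=0,
):
    print(live, "<->", preview)
```

Passing a fixed `seed` makes the sample reproducible, so a second reviewer can open the same pairs.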

Member

@freakboy3742 freakboy3742 left a comment

For the most part, this looks great. All the English text content seems to have done what we expect, and the handful of pages with images look like they've come across clean as well.

Four issues that I've noticed in my audit pass:

  1. The Sprints subheading in the contribution page gutter (/contributing) looks like it has converted badly; I can't find any other examples of this bug, though.
  2. It looks like in the case of translations, a non-existent body in the translation file has been turned into an empty body in the translation file, which results in the page rendering as empty, rather than "fallback". Maybe this could be an "if field" vs "if field is None" issue in the conversion?
  3. Some of the translations have duplicated gutter content. I noticed this on /community; Turkish and Spanish are duplicating the English content under the translation. This problem looks like it might be pre-existing, though; I'm happy to punt this to a follow up issue if you want.
  4. There are some stray "reStructuredText" references in the model and flowblock definitions. They're not surfacing anywhere (other than the admin), but we might as well clean them up.
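
The distinction suggested in item 2 can be illustrated with plain dictionaries. This is a sketch only, not a diagnosis of what the conversion script actually did: `convert_fields` is a hypothetical helper, a dict stands in for the Lektor editor session, and `str.upper` stands in for the real rST-to-Markdown conversion.

```python
def convert_fields(page, field_names, convert):
    """Convert only fields that hold real content; return the names changed.

    Fields that are absent, None, or whitespace-only are left untouched, so a
    missing translated body is not replaced by an empty one.
    """
    changed = []
    for name in field_names:
        value = page.get(name)
        if value is None or not value.strip():
            continue
        page[name] = convert(value)
        changed.append(name)
    return changed


translation = {"title": "Translated title"}  # no "body" key at all
print(convert_fields(translation, ["body"], str.upper))  # []
print("body" in translation)  # False
```

Because the missing field is never assigned, nothing is written back for it and the page remains free to fall back.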

Also - can you point me at an example of an admonition/note in use? I can see the style changes, but my grep-fu is failing me in finding one in the actual text...

@kattni
Contributor Author

kattni commented May 22, 2025

  1. The Sprints subheading in the contribution page gutter (/contributing) looks like it has converted badly; I can't find any other examples of this bug, though.

On it.

  2. It looks like in the case of translations, a non-existent body in the translation file has been turned into an empty body in the translation file, which results in the page rendering as empty, rather than "fallback". Maybe this could be an "if field" vs "if field is None" issue in the conversion?

Hmm. Can you point me to an example of this so I have some idea what I'm looking for? I can't find what exactly you're referring to.

  3. Some of the translations have duplicated gutter content. I noticed this on /community; Turkish and Spanish are duplicating the English content under the translation. This problem looks like it might be pre-existing, though; I'm happy to punt this to a follow up issue if you want.

I'll see whether this is easily sorted, and if not, we'll punt it.

  4. There are some stray "reStructuredText" references in the model and flowblock definitions. They're not surfacing anywhere (other than the admin), but we might as well clean them up.

On it.
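
A search like the following sketch could list the leftover references. It assumes the model and flowblock definitions are the project's `.ini` files; `find_stray_references` is a hypothetical helper, not something in the repository.

```python
from pathlib import Path


def find_stray_references(root, needle="reStructuredText", pattern="*.ini"):
    """Yield (path, line number, line) for each line under root containing needle."""
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                yield path, number, line.strip()
```

Calling `list(find_stray_references("."))` from the project root would return one tuple per matching line, which can then be cleaned up by hand.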

Also - can you point me at an example of an admonition/note in use? I can see the style changes, but my grep-fu is failing me in finding one in the actual text...

There's a note on this page: https://beeware.org/contributing/first-time/github/

@freakboy3742
Member

  2. It looks like in the case of translations, a non-existent body in the translation file has been turned into an empty body in the translation file, which results in the page rendering as empty, rather than "fallback". Maybe this could be an "if field" vs "if field is None" issue in the conversion?

Hmm. Can you point me to an example of this so I have some idea what I'm looking for? I can't find what exactly you're referring to.

The one I noticed was /contributing; the Turkish, German and Spanish translation pages are all empty. Danish is fully populated - but it's all English.

Also - can you point me at an example of an admonition/note in use? I can see the style changes, but my grep-fu is failing me in finding one in the actual text...

There's a note on this page: https://beeware.org/contributing/first-time/github/

Awesome - looks great; although I did notice that the bullet points on the gutter of that page have been "quoted" - that's usually an indicator of too much indentation in the markdown. It's not clear if that might be a problem with the original content - the indentation is definitely off, but it's not rendering as a block quote.

@kattni
Contributor Author

kattni commented May 22, 2025

The one I noticed was /contributing; the Turkish, German and Spanish translation pages are all empty. Danish is fully populated - but it's all English.

So, there is no Danish translation of /contributing so I think it renders it in English (https://beeware.org/da_DK/contributing/). All of the pages you listed are empty on the current site as well. I fixed the conversion adding empty space to the body element in those files, but the pages were, and still are, rendering empty. I also removed the additional gutter content; I have no idea how it was added.

Unless I'm missing something!

Awesome - looks great; although I did notice that the bullet points on the gutter of that page have been "quoted" - that's usually an indicator of too much indentation in the markdown. It's not clear if that might be a problem with the original content - the indentation is definitely off, but it's not rendering as a block quote.

This was an issue with the lists in the rST source, which began with a space. pandoc read that indentation as a block quote. Rather than try to script a fix, I did it manually.
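
For reference, a scripted version of that manual fix might look like the sketch below. `dedent_indented_lists` is a hypothetical helper, not part of the conversion script. It assumes each affected list is a blank-line-separated block whose first line starts with whitespace followed by a bullet or number marker, and it would also flatten any list that was intentionally block-quoted.

```python
import re
import textwrap

# Leading whitespace, then a bullet (*, +, -) or "1."-style marker, then a space.
LIST_MARKER = re.compile(r"^\s+([*+-]|\d+\.)\s")


def dedent_indented_lists(rst):
    """Strip the shared indentation from list blocks that start with whitespace."""
    fixed = []
    for block in rst.split("\n\n"):
        if LIST_MARKER.match(block):
            block = textwrap.dedent(block)
        fixed.append(block)
    return "\n\n".join(fixed)


print(dedent_indented_lists("Intro\n\n * one\n * two\n\nOutro"))
```

The function would be applied to the rST text before handing it to pandoc, so the list is no longer parsed as an indented block.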

@freakboy3742
Member

So, there is no Danish translation of /contributing so I think it renders it in English (https://beeware.org/da_DK/contributing/). All of the pages you listed are empty on the current site as well.

Huh... I swear I checked and those were working on the main site...

Regardless - I wonder if it might be better to just delete the translations, rather than display an empty page. The only translated content we'd be losing is the titles, which aren't that hard to resurrect...

Or, we can punt this one as well, and treat it as a separate bug.

Awesome - looks great; although I did notice that the bullet points on the gutter of that page have been "quoted" - that's usually an indicator of too much indentation in the markdown. It's not clear if that might be a problem with the original content - the indentation is definitely off, but it's not rendering as a block quote.

This was an issue with the lists in the rST source, which began with a space. pandoc read that indentation as a block quote. Rather than try to script a fix, I did it manually.

👍

@kattni
Contributor Author

kattni commented May 25, 2025

  3. Some of the translations have duplicated gutter content. I noticed this on /community; Turkish and Spanish are duplicating the English content under the translation. This problem looks like it might be pre-existing, though; I'm happy to punt this to a follow up issue if you want.

I went through every instance of gutter: and it appears I removed all of the duplicated content. It seems it was limited to community.

Regardless - I wonder if it might be better to just delete the translations, rather than display an empty page. The only translated content we'd be losing is the titles, which aren't that hard to resurrect...

There were summaries as well, but I couldn't figure out where those were even rendered. I deleted the empty translations under contributing. There are other seemingly empty translations as well, but it will take more comparison to sort which of them are empty displayed pages, and which of them are "empty" body elements because LektorMagic is adding other content in the background. So I think anything further than this should be punted.
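
As a starting point for that comparison, a sketch like the following could flag translation files whose body field is present but blank. The parser is a simplified assumption about the `.lr` layout (fields separated by `---` lines, each starting with `name:`) and ignores details such as escaped separator lines. Both function names are hypothetical. It only finds candidates; it cannot tell whether Lektor renders other content for the page.

```python
def parse_lr_fields(text):
    """Parse the text of a .lr file into a {field name: value} dict (simplified)."""
    fields = {}
    chunk = []
    # A trailing separator flushes the final field.
    for line in text.splitlines() + ["---"]:
        if line.rstrip() == "---":
            if chunk:
                name, _, first = chunk[0].partition(":")
                fields[name.strip()] = "\n".join([first] + chunk[1:]).strip()
            chunk = []
        elif chunk or line.strip():
            # Skip blank lines that appear before a field's first line.
            chunk.append(line)
    return fields


def empty_fields(text, names=("body",)):
    """Return the listed field names that are present but blank."""
    fields = parse_lr_fields(text)
    return [name for name in names if name in fields and not fields[name]]


sample = "title: Contributing\n---\nbody:\n\n---\ngutter:\n\nSome text\n"
print(empty_fields(sample, names=("body", "gutter")))  # ['body']
```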

That should be everything at this point!

@freakboy3742 freakboy3742 added preview Approved for an automated preview and removed preview Approved for an automated preview labels May 25, 2025
Sprint Guide counts as a contribution to BeeWare!
### Improving this guide

If you've got any suggestions on how to improve this sprint guide, let us know.
Contributor Author

Hah! I had this on my todo list for my next PR. I figured it should be separate. Done now!

@freakboy3742 freakboy3742 added preview Approved for an automated preview and removed preview Approved for an automated preview labels May 26, 2025
Member

@freakboy3742 freakboy3742 left a comment

I'm going to call this done. It's almost certainly not perfect, but then the original site had a bunch of pre-existing bugs (as we've found in reviewing this PR), and we can resolve those issues as we find them.

@freakboy3742 freakboy3742 merged commit 34031d8 into beeware:lektor May 26, 2025
@freakboy3742
Member

Yay! We're marking down (pun intended) RST for being harder to maintain, I guess.

@johnzhou721 Please keep in mind that while dropping into a PR to make a joke like this might seem like "harmless fun", it's adding a message to the inbox of everyone who was involved in the discussion. I've indicated to you previously that you need to post fewer, more considered comments - and avoiding off-topic jokes is an extension of this. If this was a one off, I wouldn't be concerned; but you've made a habit of dropping into threads where you're not involved just to make a pun - and that's a distraction we can do without.
