Syntax tree not recognising headings, separating full stops, and not separating lists #109

@uuykay

Hello!

I like this project because I want to build a syntax tree from Markdown (in order to do my own JSX parsing). However, after running the example in the README, the syntax tree seems to have problems even in basic cases. For example, this is my Node script:

// index.js
const fs = require("fs")
const SimpleMarkdown = require("simple-markdown")
const mdParse = SimpleMarkdown.defaultBlockParse

const testMarkdown = fs.readFileSync(__dirname + "/test.md", "utf-8")

const syntaxTree = mdParse(testMarkdown)

fs.writeFileSync(__dirname + "/temp.json", JSON.stringify(syntaxTree, null, 4))

And this was my test.md file:

# This is a h1 title
## This is a h2 subtitle
This is a paragraph 1.

This is a paragraph 2. **This text should be bold.** *And this text should be italic.*

> This should be a blockquote

1. Ordered list 1
2. Ordered list 2
3. Ordered list 3

- Unordered list 1
- Unordered list 2

And finally, after running the script, this was the full output:

[
  {
    "content": [
      {
        "content": "# This is a h1 title\n",
        "type": "text"
      },
      {
        "content": "#",
        "type": "text"
      },
      {
        "content": "# This is a h2 subtitle\nThis is a paragraph 1",
        "type": "text"
      },
      {
        "content": ".",
        "type": "text"
      }
    ],
    "type": "paragraph"
  },
  {
    "content": [
      {
        "content": "This is a paragraph 2",
        "type": "text"
      },
      {
        "content": ". ",
        "type": "text"
      },
      {
        "content": [
          {
            "content": "This text should be bold",
            "type": "text"
          },
          {
            "content": ".",
            "type": "text"
          }
        ],
        "type": "strong"
      },
      {
        "content": " ",
        "type": "text"
      },
      {
        "content": [
          {
            "content": "And this text should be italic",
            "type": "text"
          },
          {
            "content": ".",
            "type": "text"
          }
        ],
        "type": "em"
      }
    ],
    "type": "paragraph"
  },
  {
    "content": [
      {
        "content": [
          {
            "content": "This should be a blockquote",
            "type": "text"
          }
        ],
        "type": "paragraph"
      }
    ],
    "type": "blockQuote"
  },
  {
    "ordered": true,
    "start": 1,
    "items": [
      [
        {
          "content": "Ordered list 1",
          "type": "text"
        }
      ],
      [
        {
          "content": "Ordered list 2",
          "type": "text"
        }
      ],
      [
        {
          "content": [
            {
              "content": "Ordered list 3",
              "type": "text"
            }
          ],
          "type": "paragraph"
        }
      ],
      [
        {
          "content": "Unordered list 1",
          "type": "text"
        }
      ],
      [
        {
          "content": "Unordered list 2",
          "type": "text"
        }
      ]
    ],
    "type": "list"
  }
]

Some things to highlight:

  • Headings are not picked up at all
  • Full stops are split into their own text nodes, even though they should stay with the surrounding text and share its formatting, as in the bold and italic tests.
  • There is no separation between the ordered list and unordered list I created.
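
For the second point, one possible workaround is a post-processing pass over the tree that merges adjacent text nodes. The following is only a minimal sketch: it assumes the node shape shown in the output above (`type`, `content`, and `items` for lists), and `mergeTextNodes` is a hypothetical helper of my own, not part of simple-markdown. It does nothing for the heading or list problems.

```javascript
// Hypothetical helper, not part of simple-markdown.
// Recursively merges runs of adjacent { type: "text" } nodes into one node.
// Assumes the node shape shown in the output above.
function mergeTextNodes(nodes) {
  const merged = [];
  for (const node of nodes) {
    const prev = merged[merged.length - 1];
    if (node.type === "text" && prev && prev.type === "text") {
      // Extend the previous text node instead of pushing a new one.
      prev.content += node.content;
      continue;
    }
    // Shallow-copy so the original tree is left untouched.
    const copy = { ...node };
    if (Array.isArray(copy.content)) {
      copy.content = mergeTextNodes(copy.content);
    }
    if (Array.isArray(copy.items)) {
      // List items are arrays of nodes, as in the "list" node above.
      copy.items = copy.items.map(mergeTextNodes);
    }
    merged.push(copy);
  }
  return merged;
}

// Example input taken from the "strong" node in the output above.
const sample = [
  {
    type: "strong",
    content: [
      { type: "text", content: "This text should be bold" },
      { type: "text", content: "." },
    ],
  },
];
console.log(JSON.stringify(mergeTextNodes(sample), null, 4));
```

In the script above this would be applied as `mergeTextNodes(syntaxTree)` before writing temp.json.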

Is there an easy way to extend the parser so that it handles the cases I mentioned above?
