Towards a new Parallels JSON structure

At the request of Bhante @sujato, I would like to propose a new, more readable JSON structure for the Parallels.json file.

<h2>Methodology</h2>

There are four types of parallels, which are specified here: https://suttacentral.net/methodology?lang=en

In addition to these, there are parallels that cover only part of a text. For instance, the first part of a sutta (A) may be a full parallel to the first part of another sutta (B), while the rest of the two suttas does not match. These two suttas can then be represented as "resembling parallels", or their first parts alone can be represented as a "full parallel", using line numbers or paragraph ids to indicate the range.

_A is a resembling parallel to B and B is a resembling parallel to A_
or
_A#sc1-#sc10 is a full parallel to B#sc1-#sc10_

Currently, the choice between these is sometimes clear, but sometimes a bit arbitrary.

<h2>Current JSON structure</h2>

The current JSON structure is designed to be concise, which makes it hard to read. It is documented in a wiki: https://github.com/suttacentral/suttacentral/wiki/Parallels-information

As an example, I take MN10: https://suttacentral.net/mn10?view=normal&lang=en

The first part of the parallels in this list is currently represented by:

```
    {
        "parallels": [
            "dn22",
            "ea12.1",
            "ma98",
            "mn10",
            "sht-sutta11",
            "~ma31",
            "~mn119",
            "~t32",
            "~ma81"
        ]
    },
```
The `~` represents a "resembling parallel".

Each of the resembling parallels is then also listed in a separate object, because it might be that, for instance, MA31 is a resembling parallel of MN10 but not of MN119, etc. This has to be determined case by case.

```
    {
        "parallels": [
            "mn141",
            "ea27.1",
            "ma31",
            "t32",
            "~dn22#18.1",
            "~ea12.1",
            "~ma98",
            "~mn10",
            "~sht-sutta11"
        ]
    },
    {
        "parallels": [
            "mn119",
            "ma81",
            "~dn22",
            "~ea12.1",
            "~ma98",
            "~mn10",
            "~sht-sutta11"
        ]
    },
.... Etc. for each of the resembling parallels.
```
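To make the reading of this format concrete, here is a minimal Python sketch of how the parallels of one sutta could be collected from these objects. It is not the actual SuttaCentral loader. It assumes that a sutta is only the "subject" of a group in which it appears without `~`, that the other unprefixed entries are its full parallels, and that the `~` entries are its resembling parallels. The function name `parallels_for` is made up for illustration.

```python
def parallels_for(uid, groups):
    """Collect (full, resembling) parallels of `uid` from old-style groups.

    Assumption: only groups that list `uid` without "~" are read
    from the point of view of `uid`.
    """
    full, resembling = [], []
    for group in groups:
        entries = group.get("parallels", [])
        if uid not in entries:
            continue  # uid is absent or only appears as "~uid"
        for entry in entries:
            if entry == uid:
                continue
            if entry.startswith("~"):
                resembling.append(entry[1:])  # strip the "~" marker
            else:
                full.append(entry)
    return full, resembling


# The first two objects from the examples above.
groups = [
    {"parallels": ["dn22", "ea12.1", "ma98", "mn10", "sht-sutta11",
                   "~ma31", "~mn119", "~t32", "~ma81"]},
    {"parallels": ["mn141", "ea27.1", "ma31", "t32", "~dn22#18.1",
                   "~ea12.1", "~ma98", "~mn10", "~sht-sutta11"]},
]

full, resembling = parallels_for("mn10", groups)
print(full)        # ['dn22', 'ea12.1', 'ma98', 'sht-sutta11']
print(resembling)  # ['ma31', 'mn119', 't32', 'ma81']
```

Under these assumptions the second object contributes nothing for MN10, because MN10 only appears there as `~mn10`. That is why every resembling parallel needs an object of its own.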

Then there are the parallels of parts of the text. For instance for MN 10, paragraph 10.1:

```
    {
        "parallels": [
            "dn22#5.1",
            "mn10#10.1"
        ]
    },
    {
        "mentions": [
            "dn22#5.1",
            "ne17#22.1",
            "vb7#2.1"
        ]
    },
    {
        "mentions": [
            "mn10#10.1",
            "ne17#22.1",
            "vb7#2.1"
        ]
    },
```

In this case, we have a full parallel with dn22#5.1 and two mentions of each of those in ne17#22.1 and vb7#2.1.

And the same for the other paragraphs that are mentioned.
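The lookup for a single paragraph can be sketched in Python as follows. This is only an illustration under stated assumptions: every other entry in a group that contains the reference is taken to be related to it, `~` entries are not treated specially, and the helper names `split_ref` and `relations_for` are hypothetical.

```python
def split_ref(ref):
    """Split "mn10#10.1" into ("mn10", "10.1"); whole suttas give (uid, None)."""
    uid, sep, section = ref.partition("#")
    return uid, (section if sep else None)


def relations_for(ref, groups):
    """Map each relation key ("parallels", "mentions") to the entries
    that share a group with `ref`."""
    found = {}
    for group in groups:
        for relation, entries in group.items():
            if ref not in entries:
                continue
            bucket = found.setdefault(relation, [])
            for entry in entries:
                if entry != ref and entry not in bucket:
                    bucket.append(entry)
    return found


# The three objects shown above for MN 10, paragraph 10.1.
groups = [
    {"parallels": ["dn22#5.1", "mn10#10.1"]},
    {"mentions": ["dn22#5.1", "ne17#22.1", "vb7#2.1"]},
    {"mentions": ["mn10#10.1", "ne17#22.1", "vb7#2.1"]},
]

print(split_ref("mn10#10.1"))
# ('mn10', '10.1')
print(relations_for("mn10#10.1", groups))
# {'parallels': ['dn22#5.1'], 'mentions': ['ne17#22.1', 'vb7#2.1']}
```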

<h2>New JSON structure proposal</h2>

Now I propose a structure for the JSON that is radically different, namely a structure based on each sutta separately in the form of:

```
{
  "suttanr": {
    "full": [],
    "resembling": [],
    "mentions": [],
    "retelling": [],
    "sections": {
      "suttanr#id-#id": {
        "full": [],
        "resembling": [],
        "mentions": [],
        "retelling": []
      }
    }
  }
}
```
Where "sections" represents the part-sutta parallels.
So for MN10 (in full), this would become:

```
{
  "mn10": {
    "full": [
      "dn22",
      "ea12.1",
      "ma98",
      "sht-sutta11"
    ],
    "resembling": [
      "ma31",
      "mn119",
      "t32",
      "ma81"
    ],
    "mentions": [],
    "retelling": [],
    "sections": {
      "mn10#10.1": {
        "full": ["dn22#5.1"],
        "resembling": [],
        "mentions": [
          "ne17#22.1",
          "vb7#2.1"
        ],
        "retelling": []
      },
      "mn10#44.1": {
        "full": [
          "dn22#17.1",
          "mn9#14-18.1"
        ],
        "resembling": [],
        "mentions": [],
        "retelling": []
      },
      "mn10#47.1": {
        "full": [
          "dn22#22.24",
          "sn47.1#2.1"
        ],
        "resembling": [],
        "mentions": ["kv1.9#10.1"],
        "retelling": []
      }
    }
  }
}
```
We could of course remove the empty fields so it becomes:

```
{
  "mn10": {
    "full": [
      "dn22",
      "ea12.1",
      "ma98",
      "sht-sutta11"
    ],
    "resembling": [
      "ma31",
      "mn119",
      "t32",
      "ma81"
    ],
    "sections": {
      "mn10#10.1": {
        "full": ["dn22#5.1"],
        "mentions": [
          "ne17#22.1",
          "vb7#2.1"
        ]
      },
      "mn10#44.1": {
        "full": [
          "dn22#17.1",
          "mn9#14-18.1"
        ]
      },
      "mn10#47.1": {
        "full": [
          "dn22#22.24",
          "sn47.1#2.1"
        ],
        "mentions": ["kv1.9#10.1"]
      }
    }
  }
}
```
The same is done for each and every sutta. So in this case, "dn22" would get its own entry in the same way as well.
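A conversion from the current file to such per-sutta entries could look roughly like the Python sketch below. It rests on several assumptions that would need checking against the real data: only the `parallels` and `mentions` keys are handled, each unprefixed entry of a group becomes a subject, `~` entries become "resembling", and the result is a JSON object keyed by sutta id with "sections" as a nested object keyed by reference. The function name `convert` is hypothetical, and empty fields are left out.

```python
import json


def convert(groups):
    """Turn old-style groups into one entry per sutta (sketch only)."""
    new = {}
    for group in groups:
        for old_key, entries in group.items():
            if old_key not in ("parallels", "mentions"):
                continue  # other keys are out of scope for this sketch
            subjects = [e for e in entries if not e.startswith("~")]
            for subject in subjects:
                uid = subject.partition("#")[0]
                entry = new.setdefault(uid, {})
                # Whole-sutta relations go on the entry itself,
                # part-sutta relations under "sections".
                if "#" in subject:
                    target = entry.setdefault("sections", {}).setdefault(subject, {})
                else:
                    target = entry
                for other in entries:
                    if other == subject:
                        continue
                    if other.startswith("~"):
                        relation, value = "resembling", other[1:]
                    elif old_key == "parallels":
                        relation, value = "full", other
                    else:
                        relation, value = "mentions", other
                    bucket = target.setdefault(relation, [])
                    if value not in bucket:
                        bucket.append(value)
    return new


old = [
    {"parallels": ["dn22", "ea12.1", "ma98", "mn10", "sht-sutta11",
                   "~ma31", "~mn119", "~t32", "~ma81"]},
    {"parallels": ["dn22#5.1", "mn10#10.1"]},
    {"mentions": ["mn10#10.1", "ne17#22.1", "vb7#2.1"]},
]

print(json.dumps(convert(old)["mn10"], indent=2))
```

For these three objects, the `mn10` entry comes out with the same "full", "resembling", and `mn10#10.1` section content as in the example above.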

<h2>Pros and Cons</h2>

The proposed structure is more readable and intuitive, as it closely mirrors how the parallels are shown on the website. The old structure is much more concise.

However, it is also easier for someone adding a parallel to make the mistake of adding it in only one place. For instance, if a full parallel X is found for mn10, you would need to add it to the entries for "dn22", "ea12.1", "ma98", "mn10", and "sht-sutta11", and check whether it also has to be added to "ma31", "mn119", "t32", and "ma81". X also needs to get its own entry if it did not have one before.

This had to be taken into account in the old structure as well, but there, adding X to the full parallels list `"dn22", "ea12.1", "ma98", "mn10", "sht-sutta11"` automatically links it with all of those, without the need to do this five times and to create an extra entry for X.
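One way to soften this drawback would be an automated check that reports full parallels that were added in only one place. The Python sketch below is a suggestion, not existing SuttaCentral code. It assumes the new structure is loaded as a dict keyed by sutta id, and that everything in a "full" list, together with the sutta itself, should list all the others as full parallels. The name `missing_full_links` and the id `"x"` (standing in for the new parallel X) are hypothetical.

```python
def missing_full_links(data):
    """Return sorted (entry, missing_parallel) pairs for "full" lists
    that are not mirrored across the whole group."""
    missing = set()
    for uid, entry in data.items():
        group = set(entry.get("full", [])) | {uid}
        for member in group:
            have = set(data.get(member, {}).get("full", []))
            for expected in group - {member} - have:
                missing.add((member, expected))
    return sorted(missing)


# "x" was added to mn10 only: dn22 does not list it yet,
# and "x" has no entry of its own.
data = {
    "mn10": {"full": ["dn22", "x"]},
    "dn22": {"full": ["mn10"]},
}

for entry_id, parallel in missing_full_links(data):
    print(f'"{parallel}" is missing from the "full" list of "{entry_id}"')
# "x" is missing from the "full" list of "dn22"
# "dn22" is missing from the "full" list of "x"
# "mn10" is missing from the "full" list of "x"
```

Resembling parallels are deliberately left out of the check, since whether they apply in both directions has to be decided case by case.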

<h2>Integration into SuttaCentral</h2>

A new JSON structure would require the Python loading code in the SuttaCentral backend to be updated.

So please let me know your ideas and thoughts about this proposal.



