
Flood-counting #1451

@linas

Description

One of the "deep" conceptual problems with LG parsing is that there are usually either not enough parses (i.e. null words) or too many (count overflow), and it's hard to strike a middle ground. This particularly affects linkage generation: there (currently) isn't any way to know the cost of a linkage without actually computing it.

This issue is meant to be a place to daydream about possible solutions to this.

I have only one very vague and not-really-workable suggestion for right now. During counting, instead of adding 1 for each link, accumulate a weighted count of $2^{-cost}$ for each link.

Counting in this way replaces the grand total by a weighted count, hinting at how many of the potential linkages are low-cost vs. how many are high-cost. We could even make it smell more like simulated annealing by using $\exp(-\beta \times cost)$ instead of $2^{-cost}$, where $\beta$ is a user-supplied "inverse temperature" parameter $\beta = 1/kT$.
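For concreteness, here is a minimal sketch (in C, since that is what link-grammar is written in) of what such a weighted count might look like, assuming per-link costs are available during counting. The names `link_weight()` and `weighted_count()` are purely illustrative, not the actual counting API:

```c
/* A minimal sketch of the weighted-count idea, assuming per-link costs
 * are available at counting time.  link_weight() and weighted_count()
 * are hypothetical names, not the actual link-grammar API. */
#include <math.h>

/* Boltzmann weight of a single link: exp(-beta * cost).
 * Setting beta = ln(2) recovers the 2^{-cost} weighting. */
static double link_weight(double cost, double beta)
{
    return exp(-beta * cost);
}

/* Instead of adding 1 per link, accumulate the weighted total.
 * A given total made of many high-cost links then looks different
 * from the same total made of a few low-cost ones. */
static double weighted_count(const double *link_costs, int n_links,
                             double beta)
{
    double total = 0.0;
    for (int i = 0; i < n_links; i++)
        total += link_weight(link_costs[i], beta);
    return total;
}
```

As $\beta \to 0$ this degenerates back to the plain count (every link weighs 1), and as $\beta \to \infty$ only zero-cost links contribute, so the parameter interpolates between the current behavior and a strict low-cost count.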

What is not yet clear to me is how to use this to replace the random sampling of linkages (when there is a count overflow) with a weighted random sampling that would make lower-cost linkages more likely to be sampled.
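One standard trick that might apply here, assuming the weighted sub-counts above were stored at each branch point of linkage extraction: draw each alternative with probability proportional to its weight (roulette-wheel selection) instead of uniformly. A hedged sketch, with all names hypothetical:

```c
/* Sketch of weighted (roulette-wheel) sampling: draw index i with
 * probability w[i] / sum(w).  If w[i] is the weighted sub-count of
 * alternative i at a branch point, repeating this at every branch
 * biases the sampled linkages toward low cost.  Purely illustrative. */
#include <stdlib.h>

static int sample_weighted(const double *w, int n)
{
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += w[i];

    /* Uniform point in [0, total), then find the bin containing it. */
    double r = ((double) rand() / ((double) RAND_MAX + 1.0)) * total;
    double acc = 0.0;
    for (int i = 0; i < n; i++) {
        acc += w[i];
        if (r < acc)
            return i;
    }
    return n - 1;  /* guard against floating-point round-off */
}
```

The appeal of doing it per branch point is that no enumeration of the (overflowed) set of linkages is ever needed; whether the sub-counts can be kept at the right granularity in the existing counting code is exactly the open question.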
