| 1 | +#  |
| 2 | + |
| 3 | +Natural language for human and machine. |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +> Note: Several projects use this document. Do not make changes without consulting with [TextOM](https://github.com/wooorm/textom), [parse-latin](https://github.com/wooorm/parse-latin), and [retext](https://github.com/wooorm/retext). |
| 8 | + |
| 9 | +## CST |
| 10 | + |
| 11 | +### Node |
| 12 | + |
| 13 | +Node represents any unit in the NLCST hierarchy. |
| 14 | + |
| 15 | +``` |
| 16 | +interface Node { |
| 17 | + type: string; |
| 18 | + data: Data | null; |
| 19 | +} |
| 20 | +``` |
| 21 | + |
| 22 | +### Data |
| 23 | + |
| 24 | +Data represents data associated with any node. Data is a scope in which plug-ins can store any information. Its only limitation is that each property should be stringifiable: it must not throw when passed to `JSON.stringify()`. |
| 25 | + |
| 26 | +``` |
| 27 | +interface Data { } |
| 28 | +``` |
| 29 | + |
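The stringifiability constraint on Data can be checked mechanically. The sketch below is illustrative only: `NodeData` and `isStringifiable` are hypothetical names that do not appear in NLCST, and the check assumes that "stringifiable" means `JSON.stringify()` does not throw for each property value.

```typescript
// Hypothetical helper, not part of NLCST: reports whether every property of
// a node's data can be passed to JSON.stringify() without throwing.
type NodeData = Record<string, unknown>;

function isStringifiable(data: NodeData | null): boolean {
  if (data === null) return true; // `data` may be null per the Node interface
  return Object.keys(data).every((key) => {
    try {
      JSON.stringify(data[key]);
      return true;
    } catch {
      return false;
    }
  });
}

// Circular references are a common reason for JSON.stringify() to throw.
const circular: NodeData = {};
circular.self = circular;

console.log(isStringifiable({ partOfSpeech: "noun" })); // true
console.log(isStringifiable(circular)); // false
```

A plug-in could run such a check before attaching information to a node, so that a tree stays serializable as JSON.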
| 30 | +### Parent |
| 31 | + |
| 32 | +Parent ([Node](#node)) represents a unit in the NLCST hierarchy which can have zero or more children. |
| 33 | + |
| 34 | +``` |
| 35 | +interface Parent < Node { |
| 36 | + children: []; |
| 37 | +} |
| 38 | +``` |
| 39 | + |
| 40 | +### Text |
| 41 | + |
| 42 | +Text ([Node](#node)) represents a unit in the NLCST hierarchy which has a value. |
| 43 | + |
| 44 | +``` |
| 45 | +interface Text < Node { |
| 46 | + value: string; |
| 47 | +} |
| 48 | +``` |
| 49 | + |
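As a rough illustration, the three abstract interfaces above could be mirrored in TypeScript as follows. The `Nlcst` prefixes, the two type guards, and `describe` are hypothetical names chosen for this sketch (the prefix avoids clashing with the DOM's `Node` and `Text`). Typing `children` as a list of nodes is an assumption, since the definition above leaves the element type open.

```typescript
// Sketch of the abstract interfaces; names are illustrative, not normative.
interface NlcstNode {
  type: string;
  data: Record<string, unknown> | null;
}

interface NlcstParent extends NlcstNode {
  children: NlcstNode[]; // assumption: children are themselves nodes
}

interface NlcstText extends NlcstNode {
  value: string;
}

// A node is treated as a Parent when it carries a `children` array.
function isParent(node: NlcstNode): node is NlcstParent {
  return Array.isArray((node as NlcstParent).children);
}

// A node is treated as a Text when it carries a string `value`.
function isText(node: NlcstNode): node is NlcstText {
  return typeof (node as NlcstText).value === "string";
}

function describe(node: NlcstNode): string {
  if (isText(node)) return `${node.type} with value "${node.value}"`;
  if (isParent(node)) return `${node.type} with ${node.children.length} child(ren)`;
  return node.type;
}

const leaf: NlcstText = { type: "TextNode", data: null, value: "cat" };
const word: NlcstParent = { type: "WordNode", data: null, children: [leaf] };

console.log(describe(leaf)); // TextNode with value "cat"
console.log(describe(word)); // WordNode with 1 child(ren)
```

Checking for the presence of `children` or `value`, rather than comparing `type` strings, keeps the guards valid for node types that plug-ins may add later.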
| 50 | +### RootNode |
| 51 | + |
| 52 | +Root ([Parent](#parent)) represents a document. |
| 53 | + |
| 54 | +``` |
| 55 | +interface RootNode < Parent { |
| 56 | + type: "RootNode"; |
| 57 | +} |
| 58 | +``` |
| 59 | + |
| 60 | +### ParagraphNode |
| 61 | + |
| 62 | +Paragraph ([Parent](#parent)) represents a self-contained unit of discourse in writing dealing with a particular point or idea. |
| 63 | + |
| 64 | +``` |
| 65 | +interface ParagraphNode < Parent { |
| 66 | + type: "ParagraphNode"; |
| 67 | +} |
| 68 | +``` |
| 69 | + |
| 70 | +### SentenceNode |
| 71 | + |
| 72 | +Sentence ([Parent](#parent)) represents a grouping of grammatically linked words that, in principle, expresses a complete thought, although it may make little sense when taken out of context. |
| 73 | + |
| 74 | +``` |
| 75 | +interface SentenceNode < Parent { |
| 76 | + type: "SentenceNode"; |
| 77 | +} |
| 78 | +``` |
| 79 | + |
| 80 | +### WordNode |
| 81 | + |
| 82 | +Word ([Parent](#parent)) represents the smallest element that may be uttered in isolation with semantic or pragmatic content. |
| 83 | + |
| 84 | +``` |
| 85 | +interface WordNode < Parent { |
| 86 | + type: "WordNode"; |
| 87 | +} |
| 88 | +``` |
| 89 | + |
| 90 | +### PunctuationNode |
| 91 | + |
| 92 | +Punctuation ([Parent](#parent)) represents typographical devices which aid understanding and correct reading of other grammatical units. |
| 93 | + |
| 94 | +``` |
| 95 | +interface PunctuationNode < Parent { |
| 96 | + type: "PunctuationNode"; |
| 97 | +} |
| 98 | +``` |
| 99 | + |
| 100 | +### WhiteSpaceNode |
| 101 | + |
| 102 | +White Space ([PunctuationNode](#punctuationnode)) represents typographical devices devoid of content, separating other grammatical units. |
| 103 | + |
| 104 | +``` |
| 105 | +interface WhiteSpaceNode < PunctuationNode { |
| 106 | + type: "WhiteSpaceNode"; |
| 107 | +} |
| 108 | +``` |
| 109 | + |
| 110 | +### SourceNode |
| 111 | + |
| 112 | +Source ([Text](#text)) represents an external (ungrammatical) value embedded into a grammatical unit: a hyperlink, an emoticon, and such. |
| 113 | + |
| 114 | +``` |
| 115 | +interface SourceNode < Text { |
| 116 | + type: "SourceNode"; |
| 117 | +} |
| 118 | +``` |
| 119 | + |
| 120 | +### TextNode |
| 121 | + |
| 122 | +Text ([Text](#text)) represents actual content in an NLCST document: one or more characters. |
| 123 | + |
| 124 | +``` |
| 125 | +interface TextNode < Text { |
| 126 | + type: "TextNode"; |
| 127 | +} |
| 128 | +``` |
| 129 | + |
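To see how the concrete node types fit together, the following sketch builds a small tree for the text `Hello, world.` and serializes it back to a string. It is an illustration under stated assumptions, not a reference implementation: the `text`, `parent`, and `toText` helpers and the `Nlcst`-prefixed interfaces are hypothetical, and the exact tree a real parser such as parse-latin produces may differ.

```typescript
interface NlcstNode {
  type: string;
  data: Record<string, unknown> | null;
}
interface NlcstParent extends NlcstNode {
  children: NlcstNode[]; // assumption: children are themselves nodes
}
interface NlcstText extends NlcstNode {
  value: string;
}

// Hypothetical builders that keep the example tree readable.
const text = (value: string): NlcstText => ({ type: "TextNode", data: null, value });
const parent = (type: string, children: NlcstNode[]): NlcstParent => ({
  type,
  data: null,
  children,
});

// RootNode > ParagraphNode > SentenceNode > words, punctuation, white space.
const tree = parent("RootNode", [
  parent("ParagraphNode", [
    parent("SentenceNode", [
      parent("WordNode", [text("Hello")]),
      parent("PunctuationNode", [text(",")]),
      parent("WhiteSpaceNode", [text(" ")]),
      parent("WordNode", [text("world")]),
      parent("PunctuationNode", [text(".")]),
    ]),
  ]),
]);

// Concatenates the values of all Text descendants, in document order.
function toText(node: NlcstNode): string {
  const candidate = node as Partial<NlcstText> & Partial<NlcstParent>;
  if (typeof candidate.value === "string") return candidate.value;
  if (Array.isArray(candidate.children)) return candidate.children.map(toText).join("");
  return "";
}

console.log(toText(tree)); // Hello, world.
```

Because every character of the input, including white space and punctuation, is held by some Text descendant, concatenating the values reproduces the original text.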
| 130 | +## Related |
| 131 | + |
| 132 | +- [retext](https://github.com/wooorm/retext) - Analyses and manipulates natural language, with 20+ plug-ins; |
| 133 | +- [parse-latin](https://github.com/wooorm/parse-latin) - Transforms Latin-script natural language into a CST; |
| 134 | +- [TextOM](https://github.com/wooorm/textom) - Provides an object-oriented manipulation interface to NLCST; |
| 135 | +- [nlcst-to-string](https://github.com/wooorm/nlcst-to-string) - Transforms a CST into a string. |
| 136 | + |
| 137 | +## License |
| 138 | + |
| 139 | +MIT © Titus Wormer |