Merge pull request #30 from lpmi-13/typofix

Patrick Thomson · web-flow · commit 31fc1c08b454 · 2019-06-03T08:58:30.000-04:00
Fix simple typos and standardize formatting in places
diff --git a/docs/coding-style.md b/docs/coding-style.md
@@ -76,7 +76,7 @@ foo = Heap.lookup thing
 ```
 
 Unlike many Haskell projects, we rely in places on variable shadowing (especially in open-recursive functions).
-Avoid variable shadowing if possible, as it can lead to unintuitive error messages; you are free to disable shadowing on in a per-file basis with `{-# OPTIONS_GHC -Wshadow #-}`
+Avoid variable shadowing if possible, as it can lead to unintuitive error messages; you are free to disable shadowing warnings on a per-file basis with `{-# OPTIONS_GHC -Wno-name-shadowing #-}`
 
 # Functions
 
diff --git a/docs/grammar-development-guide.md b/docs/grammar-development-guide.md
@@ -105,7 +105,7 @@ Here are some things that might help:
 
 - **Inline:** Adding a rule to the `inline` array strips out the whole node as though it has been removed from the grammar. Each occurrence of this rule in the grammar is replaced with a copy of its definition. Similar to making something hidden, this makes your AST more compact. Inlining doesn't create these nodes at runtime, whereas  making something hidden acknowledges the node at runtime but hides it from the AST.
 
--  **Make `seq` visible and `choice` hidden** Sequences typically have meaning. Choices are just containers that point to other things.
+-  **Make `seq` visible and `choice` hidden:** Sequences typically have meaning. Choices are just containers that point to other things.
 
 - **Making things hidden:** Preceding a rule with an underscore (`_rule`) allows you to omit displaying a rule in the AST. This allows you to make a tree more compact.
 
@@ -117,7 +117,7 @@ Here are some guidelines to determine what approach to take when removing superf
 
 2. **Add it to the inline array.** If the rule is used more than once and its definition is not simple, make it `inline`. If this does not cause parsing problems, this is the best approach, because it will avoid intermediate node allocations and parsing operations at runtime. One possible side-effect of `inline` is that it sometimes makes the parser much larger in terms of number of states. To evaluate whether this has happened, it’s worth looking at the `STATE_COUNT` in `parser.c` before and after. If the state count goes way up, it may not be worth adding the rule to `inline` since more states mean more one-time memory footprint for the parser. If it goes up a few percent (or goes down), it’s fine to add.
 
-3. **Mark it hidden**. If `inline` causes conflicts or drastically increases the size of the parse table, it's better to mark it as hidden. This is often useful when two two nodes can not exist without one another. For example, `class_body_declaration` was a child of `class_body` and occurred together 100% of the time. Similarly, `type_arguments` can not exist independent of its child node, `type_argument`. In both cases, it makes sense to hide the former.
+3. **Mark it hidden.** If `inline` causes conflicts or drastically increases the size of the parse table, it's better to mark it as hidden. This is often useful when two nodes can not exist without one another. For example, `class_body_declaration` was a child of `class_body` and occurred together 100% of the time. Similarly, `type_arguments` can not exist independent of its child node, `type_argument`. In both cases, it makes sense to hide the former.
     ```diff
         (generic_type
           (type_identifier)
@@ -134,7 +134,7 @@ Once you have developed a significant portion of the grammar, find a file from a
 Using [a script like this](https://github.com/tree-sitter/tree-sitter-java/blob/master/script/parse-examples.rb) is one way to mass test a large repo quickly.
 
 ### Sequence your work
-Most languages have a long-tail of features that are not frequently utilized in the wild. When supporting a language, our aim is always to be able to parse 100% of a language (or ideally more, since the intent is to support multiple versions). However, this does necessarily all in one go. A good way to do this is to develop the structure and documentation necessary to support open source contribution. 
+Most languages have a long-tail of features that are not frequently utilized in the wild. When supporting a language, our aim is always to be able to parse 100% of a language (or ideally more, since the intent is to support multiple versions). However, this doesn't necessarily happen all in one go. A good way to do this is to develop the structure and documentation necessary to support open source contribution.
 
 ### Handling conflicts
 
@@ -148,23 +148,23 @@ Conflicts may arise due to ambiguities in the grammar. This is when the parser c
   - `commaSep1` - creates a repeating sequence of 1 or more tokens separated by a comma
   - `sep1`- creates a repeating sequence of 0 or more tokens separated by the specified delimiter
 
-- **Specify associativity and/or precedence.** Another way of resolving a conflict is through associativity and precedence. Specifying precedence allows us to prioritize productions in the grammar. If there are two or more ways to proceed, the production with the higher precedence will get preference. Left and right associativity can also be used to reflect how to proceed. For instance, a left-associative evaluation is `(a Q b) Q c` vs. a right-associative evaluation would render `a Q (b Q c)`. In this way, associativity changes the meaning of the expression. Resolving conflicts this way is a compile time solution as opposed to the "Add a conflict" section below which means the parser will try deal with the ambiguity at runtime.
+- **Specify associativity and/or precedence.** Another way of resolving a conflict is through associativity and precedence. Specifying precedence allows us to prioritize productions in the grammar. If there are two or more ways to proceed, the production with the higher precedence will get preference. Left and right associativity can also be used to reflect how to proceed. For instance, a left-associative evaluation is `(a Q b) Q c` vs. a right-associative evaluation would render `a Q (b Q c)`. In this way, associativity changes the meaning of the expression. Resolving conflicts this way is a compile time solution as opposed to the "Add a conflict" section below which means the parser will try to deal with the ambiguity at runtime.
 
 - **Add a conflict.** Adding conflicts allows the parser to pursue multiple paths in parallel, and decide which one to proceed with further along the process. Adding a conflict for one rule prevents the parser from recursively descending.
 
 _Workflow:_
-1. Add a conflict to the `conflicts` if there are 2 rules conflicting (to test that the conflict is the problem and gets the right parse output)
-2. Try `prec.left` or `prec.right` based on the options (if that’s not clear, then try both `prec.left` and `prec.right` and compare their outputs)
+1. Add a conflict to the `conflicts` if there are 2 rules conflicting (to test that the conflict is the problem and gets the right parse output).
+2. Try `prec.left` or `prec.right` based on the options (if that’s not clear, then try both `prec.left` and `prec.right` and compare their outputs).
 3. Look at adding a precedence number, usually `1` or `+1`, based on the rule you want to succeed first.
 4. Make sure there aren’t duplicate paths to get to the same rule from sibling rules (like having `_literal` in both `_statement` and `_expression`).
-And then once things are working in the tree output looks good, remove the conflict rule and try to solve it with associativity or precedence only. This helps confirm the solution before expending too much time adjusting precedence.
+And then once things are working and the tree output looks good, remove the conflict rule and try to solve it with associativity or precedence only. This helps confirm the solution before expending too much time adjusting precedence.
 
 ### Debugging errors
 
-Tree-sitter's error-handling is great, but sometimes works too well and hides helpful info that help to understand why errors are happening. The following tips can help detect where errors are occurring.
+Tree-sitter's error-handling is great, but sometimes works too well and hides helpful info that helps to understand why errors are happening. The following tips can help detect where errors are occurring.
 
 - **Narrow down your problem space.** Triangulate the error by starting with a simple example and progressively adding complexity to better understand where the parser is having trouble.
 - **Consult the spec.** Eliminate the possibility of typos or oversights in your logic by looking at the definition of your rule in the spec.
-- **Run your code.** Execute your test code to see verify it is valid. Use errors (if any) to get additional information about where the problem may lie.
+- **Run your code.** Execute your test code to verify it is valid. Use errors (if any) to get additional information about where the problem may lie.
 - **Use visual debug output.** Analyze the forks and look at individual production rules to hone in on the problem.
-- **Test all permutations of a particular language construct** This will help you find the edges of your language and ensure your grammar supports them.
+- **Test all permutations of a particular language construct.** This will help you find the edges of your language and ensure your grammar supports them.
diff --git a/docs/program-analysis.md b/docs/program-analysis.md
@@ -26,7 +26,7 @@ The following is a brief guide to working with the definitional interpreters and
 
 _Helpers:_
 - `parseFile`: parses one file.
-- `evaluateLanguageProject` takes a list of files and evaluates them usually under concrete semantics.
+- `evaluateLanguageProject`: takes a list of files and evaluates them usually under concrete semantics.
 - `callGraphLanguageProject`: uses the same mechanism for evaluating, but uses abstract semantics.
 - `typeCheckLanguageFile`: allows us to evaluate under type checking semantics.
 
diff --git a/docs/why-haskell.md b/docs/why-haskell.md
@@ -43,7 +43,7 @@ Haskell is a pleasure to work in everyday. It's both productive and eye-opening.
 - *Editor tooling* is sub-par (especially compared to language communities like Java and C#) and finicky - we often end up just compiling in a separate terminal.
 - *Edges of the type system*. We often find ourselves working at the edges of Haskell's incredible type system, wishing for dependent types or reaching for complex workarounds like the [Advanced Overlap][] techniques designed by Oleg Kiselyov & Simon Peyton Jones.
 - *Infra glue*. Haskell is very competent at standard infrastructure functionality like running a webserver, but it isn't the focus of the language community so you're often left writing your own libraries and components when you need to plug in to modern infrastructure.
-- *Lazy evaluation* isn't always want you want and can have performance problems and make some debugging activities incredibly frustrating. We use the `StrictData` language extension to combat some of these difficulties.
+- *Lazy evaluation* isn't always what you want and can have performance problems and make some debugging activities incredibly frustrating. We use the `StrictData` language extension to combat some of these difficulties.
 - *Haskell has a reputation for being difficult to learn.* Some of that is well deserved, but half of it has more to do with how many of us first learned imperative programming and the switch to a functional paradigm takes some patience. Haskell also leverages a much more mathematically rigorous set of abstractions which likely aren't as familiar to web developers. We have, however, had very good luck on-boarding new team members with a wide range of previous experience and the quality of learning Haskell resources has really improved.
 
 At this point, we are pretty firmly attached to Haskell's language features to enable many of the objectives of this project: abstract interpretation, graph analysis, effect analysis, code writing, AST matching, etc. Could you implement Semantic in another programming language? Certainly. An early prototype of the semantic diff portion of the project was done in Swift, but it quickly became unwieldy and even the first rough Haskell prototype was considerably more performant. Since adopting Haskell, we've had no trouble plugging into the rest of GitHub's infrastructure: running as a command line tool, a web server (HTTP/JSON), and now a Twirp RPC server. We've been an early adopter of Kubernetes and Moda and now ~~gRPC~~ Twirp at GitHub, often shipping our application on these new infrastructure components well ahead of other teams. We've managed our own build systems, quickly adopted new technologies like Docker, shipped in Enterprise, and much, much more in the short lifespan of the project. We've yet to be constrained by our language choice. If anything, we are amazed daily at Semantic's ability to abstract and represent the syntax and evaluation semantics of half a dozen (and counting) programming languages while keeping all the benefits of a strong static type system. If we'd chosen a more "popular" language it's likely we'd be mired in hundreds of thousands of lines of code and complaining about our tech debt, application performance, and the burden of adding any more languages. As it stands today, we've got 20k lines of Haskell code and some incredible program analysis capabilities at our disposal with little fear of adding more languages or supporting the changing needs of GitHub.