Merged
Changes from 4 commits
3 changes: 3 additions & 0 deletions .gitmodules
@@ -1413,6 +1413,9 @@
[submodule "vendor/grammars/vscode-slice"]
path = vendor/grammars/vscode-slice
url = https://github.com/zeroc-ice/vscode-slice
[submodule "vendor/grammars/vscode-tree-sitter-query"]
path = vendor/grammars/vscode-tree-sitter-query
url = https://github.com/jrieken/vscode-tree-sitter-query.git
[submodule "vendor/grammars/vscode-vba"]
path = vendor/grammars/vscode-vba
url = https://github.com/serkonda7/vscode-vba.git
2 changes: 2 additions & 0 deletions grammars.yml
@@ -1267,6 +1267,8 @@ vendor/grammars/vscode-singularity:
vendor/grammars/vscode-slice:
- source.ice
- source.slice
vendor/grammars/vscode-tree-sitter-query:
- source.scm
vendor/grammars/vscode-vba:
- source.vba
- source.wwb
18 changes: 18 additions & 0 deletions lib/linguist/heuristics.yml
@@ -714,6 +714,24 @@ disambiguations:
  - language: Markdown
    # Markdown syntax for scdoc
    pattern: '^#+\s+(NAME|SYNOPSIS|DESCRIPTION)'
- extensions: ['.scm']
  rules:
  - language: Scheme
    pattern:
    - '(?:''[\(\w\*]|[\w\-]+->[\w\-]+|\b\.\.\.\b|\([+\-#:<>\/=~\)]|~>)'
    - '^#:\w+'
    - '#\w*\('
    - '^\s*\((?i:define|import|library|let)'
Member

This regex is vulnerable to a ReDoS attack. Please fix it, and make sure the replacement runs in linear time and is an RE2-style regex. See #7242 for other regexes we're slowly cleaning up.
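
For illustration only, here is a sketch of what an RE2-friendlier rewrite might look like. It assumes the `[\w\-]+->[\w\-]+` alternative is one of the costly parts (the review does not say which sub-expression was flagged). In a backtracking engine, the unbounded runs on both sides of `->` can be rescanned from every start position. The `simplified_arrow` form is hypothetical and not taken from the PR. It relies on the fact that a yes/no heuristic only needs one character on each side of the arrow.

```ruby
# Sub-expression copied from the first Scheme pattern in this diff.
original_arrow = Regexp.new('[\w\-]+->[\w\-]+')

# Hypothetical simplification (an assumption, not from the PR): for a
# boolean match, one character on each side of `->` is sufficient, so the
# unbounded repetitions disappear.
simplified_arrow = Regexp.new('[\w\-]->[\w\-]')

samples = [
  "(string->symbol name)",
  "(list->vector '(1 2 3))",
  "(identifier) @constant",
  "a" * 50,
  "->",
  "x->",
]

# Check that both expressions agree on whether each sample matches.
agreement = samples.all? do |s|
  original_arrow.match?(s) == simplified_arrow.match?(s)
end
puts agreement
```

The two forms accept the same lines because any match of the original contains a match of the simplified form, and vice versa.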

Contributor Author

Are all the pattern entries joined by a | or evaluated separately? Does pattern: ['a*', 'b*'] become pattern: '(a*|b*)'?

Member

You can find the magic here.

My finding was based purely on testing the first regex in isolation 😉
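
A minimal sketch of the joining question, assuming the loader ORs the entries of an array pattern together the way Ruby's `Regexp.union` does. The `to_regex` helper here is a hypothetical stand-in written for this example, not a quote of Linguist's code. The two array entries are taken from the Scheme rule in this diff.

```ruby
# Hypothetical helper (an assumption about the loader's behaviour):
# a single string becomes one Regexp, an array becomes the union
# (logical OR) of its entries.
def to_regex(pattern)
  if pattern.is_a?(Array)
    Regexp.union(pattern.map { |p| Regexp.new(p) })
  else
    Regexp.new(pattern)
  end
end

# The example from the question above: each entry is wrapped in its own
# non-capturing group and the groups are joined with `|`.
combined = to_regex(['a*', 'b*'])
puts combined.source # (?-mix:a*)|(?-mix:b*)

# Two entries from the Scheme rule, evaluated as one alternation.
scheme_extra = to_regex(['^#:\w+', '#\w*\('])
puts scheme_extra.match?("#:optional")              # true, via the first entry
puts scheme_extra.match?("(vector-ref #(1 2 3) 0)") # true, via the second entry
puts scheme_extra.match?("(identifier) @type")      # false
```

Under that assumption, testing one entry in isolation is still meaningful, because a slow alternative stays slow inside the union.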

This comment was marked as resolved.

    negative_pattern:
    - '(["_\]\)] (@\w|\?)|'
    - '\(#[\w-]+[!\?]'
  - language: Tree-sitter Query
    pattern:
    - '\(#\w[\w-]+[!\?]'
    - '[\)\]"_]\s*[\*\+\?@]'
    negative_pattern:
    - '\(([\w-]+!|[^\w"])'
    - '\([\d\w\.\*\?\/><=+-:]+(\s+[\d\w\.\*\?\/><=+-:]+)+\)'
- extensions: ['.sol']
  rules:
  - language: Solidity
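
As a rough illustration of how the positive patterns above separate the two languages, the sketch below runs them against single lines taken from the sample files in this PR. It is deliberately simplified: it ignores the `negative_pattern` entries and uses only one of the Scheme patterns, and the `guess` lambda is a hypothetical helper, not part of Linguist.

```ruby
# Positive patterns copied from the '.scm' rules in this diff.
scheme_rule = Regexp.new('^\s*\((?i:define|import|library|let)')
query_rule = Regexp.union(
  Regexp.new('\(#\w[\w-]+[!\?]'),
  Regexp.new('[\)\]"_]\s*[\*\+\?@]')
)

# Hypothetical classifier: rules are tried in the order they are listed,
# Scheme first. Returns nil when neither rule matches the line.
guess = lambda do |line|
  if scheme_rule.match?(line)
    "Scheme"
  elsif query_rule.match?(line)
    "Tree-sitter Query"
  end
end

lines = [
  "(define f 10)",
  "(type_identifier) @type",
  '(#set! injection.language "rust")',
  "(set! f (lambda (n) (+ n 12)))",
]

lines.each { |line| puts "#{line} => #{guess.call(line).inspect}" }
```

The last sample matches neither rule on its own, which is why whole-file content and the negative patterns matter in practice.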
8 changes: 8 additions & 0 deletions lib/linguist/languages.yml
@@ -7561,6 +7561,14 @@ Toit:
  tm_scope: source.toit
  ace_mode: text
  language_id: 356554395
Tree-sitter Query:
  type: programming
  color: "#8ea64c"
  extensions:
  - ".scm"
  tm_scope: source.scm
  ace_mode: text
  language_id: 436081647
Turing:
  type: programming
  color: "#cf142b"
23 changes: 23 additions & 0 deletions samples/Scheme/namespace.scm
@@ -0,0 +1,23 @@
;; Variable bound to a number:
(define f 10)
f
===> 10
;; Mutation (altering the bound value)
(set! f (+ f f 6))
f
===> 26
;; Assigning a procedure to the same variable:
(set! f (lambda (n) (+ n 12)))
(f 6)
===> 18
;; Assigning the result of an expression to the same variable:
(set! f (f 1))
f
===> 13
;; functional programming:
(apply + '(1 2 3 4 5 6))
===> 21
(set! f (lambda (n) (+ n 100)))
(map f '(1 2 3))
===> (101 102 103)

161 changes: 161 additions & 0 deletions samples/Tree-sitter Query/highlights.scm
@@ -0,0 +1,161 @@
; Identifiers

(type_identifier) @type
(primitive_type) @type.builtin
(field_identifier) @property

; Identifier conventions

; Assume all-caps names are constants
((identifier) @constant
(#match? @constant "^[A-Z][A-Z\\d_]+$'"))

; Assume uppercase names are enum constructors
((identifier) @constructor
(#match? @constructor "^[A-Z]"))

; Assume that uppercase names in paths are types
((scoped_identifier
path: (identifier) @type)
(#match? @type "^[A-Z]"))
((scoped_identifier
path: (scoped_identifier
name: (identifier) @type))
(#match? @type "^[A-Z]"))
((scoped_type_identifier
path: (identifier) @type)
(#match? @type "^[A-Z]"))
((scoped_type_identifier
path: (scoped_identifier
name: (identifier) @type))
(#match? @type "^[A-Z]"))

; Assume all qualified names in struct patterns are enum constructors. (They're
; either that, or struct names; highlighting both as constructors seems to be
; the less glaring choice of error, visually.)
(struct_pattern
type: (scoped_type_identifier
name: (type_identifier) @constructor))

; Function calls

(call_expression
function: (identifier) @function)
(call_expression
function: (field_expression
field: (field_identifier) @function.method))
(call_expression
function: (scoped_identifier
"::"
name: (identifier) @function))

(generic_function
function: (identifier) @function)
(generic_function
function: (scoped_identifier
name: (identifier) @function))
(generic_function
function: (field_expression
field: (field_identifier) @function.method))

(macro_invocation
macro: (identifier) @function.macro
"!" @function.macro)

; Function definitions

(function_item (identifier) @function)
(function_signature_item (identifier) @function)

(line_comment) @comment
(block_comment) @comment

(line_comment (doc_comment)) @comment.documentation
(block_comment (doc_comment)) @comment.documentation

"(" @punctuation.bracket
")" @punctuation.bracket
"[" @punctuation.bracket
"]" @punctuation.bracket
"{" @punctuation.bracket
"}" @punctuation.bracket

(type_arguments
"<" @punctuation.bracket
">" @punctuation.bracket)
(type_parameters
"<" @punctuation.bracket
">" @punctuation.bracket)

"::" @punctuation.delimiter
":" @punctuation.delimiter
"." @punctuation.delimiter
"," @punctuation.delimiter
";" @punctuation.delimiter

(parameter (identifier) @variable.parameter)

(lifetime (identifier) @label)

"as" @keyword
"async" @keyword
"await" @keyword
"break" @keyword
"const" @keyword
"continue" @keyword
"default" @keyword
"dyn" @keyword
"else" @keyword
"enum" @keyword
"extern" @keyword
"fn" @keyword
"for" @keyword
"gen" @keyword
"if" @keyword
"impl" @keyword
"in" @keyword
"let" @keyword
"loop" @keyword
"macro_rules!" @keyword
"match" @keyword
"mod" @keyword
"move" @keyword
"pub" @keyword
"raw" @keyword
"ref" @keyword
"return" @keyword
"static" @keyword
"struct" @keyword
"trait" @keyword
"type" @keyword
"union" @keyword
"unsafe" @keyword
"use" @keyword
"where" @keyword
"while" @keyword
"yield" @keyword
(crate) @keyword
(mutable_specifier) @keyword
(use_list (self) @keyword)
(scoped_use_list (self) @keyword)
(scoped_identifier (self) @keyword)
(super) @keyword

(self) @variable.builtin

(char_literal) @string
(string_literal) @string
(raw_string_literal) @string

(boolean_literal) @constant.builtin
(integer_literal) @constant.builtin
(float_literal) @constant.builtin

(escape_sequence) @escape

(attribute_item) @attribute
(inner_attribute_item) @attribute

"*" @operator
"&" @operator
"'" @operator
9 changes: 9 additions & 0 deletions samples/Tree-sitter Query/injections.scm
@@ -0,0 +1,9 @@
((macro_invocation
(token_tree) @injection.content)
(#set! injection.language "rust")
(#set! injection.include-children))

((macro_rule
(token_tree) @injection.content)
(#set! injection.language "rust")
(#set! injection.include-children))
60 changes: 60 additions & 0 deletions samples/Tree-sitter Query/tags.scm
@@ -0,0 +1,60 @@
; ADT definitions

(struct_item
name: (type_identifier) @name) @definition.class

(enum_item
name: (type_identifier) @name) @definition.class

(union_item
name: (type_identifier) @name) @definition.class

; type aliases

(type_item
name: (type_identifier) @name) @definition.class

; method definitions

(declaration_list
(function_item
name: (identifier) @name) @definition.method)

; function definitions

(function_item
name: (identifier) @name) @definition.function

; trait definitions
(trait_item
name: (type_identifier) @name) @definition.interface

; module definitions
(mod_item
name: (identifier) @name) @definition.module

; macro definitions

(macro_definition
name: (identifier) @name) @definition.macro

; references

(call_expression
function: (identifier) @name) @reference.call

(call_expression
function: (field_expression
field: (field_identifier) @name)) @reference.call

(macro_invocation
macro: (identifier) @name) @reference.call

; implementations

(impl_item
trait: (type_identifier) @name) @reference.implementation

(impl_item
type: (type_identifier) @name
!trait) @reference.implementation
7 changes: 7 additions & 0 deletions test/test_heuristics.rb
@@ -948,6 +948,13 @@ def test_scd_by_heuristics
    }, alt_name="test.scd")
  end

  def test_scm_by_heuristics
    assert_heuristics({
      "Scheme" => all_fixtures("Scheme", "*.scm"),
      "Tree-sitter Query" => all_fixtures("Tree-sitter Query", "*.scm")
    })
  end

  def test_sol_by_heuristics
    assert_heuristics({
      "Gerber Image" => Dir.glob("#{fixtures_path}/Generic/sol/Gerber Image/*"),
1 change: 1 addition & 0 deletions test/test_language.rb
@@ -54,6 +54,7 @@ def test_find_by_alias
    assert_equal Language['Ruby'], Language.find_by_alias('ruby')
    assert_equal Language['R'], Language.find_by_alias('r')
    assert_equal Language['Scheme'], Language.find_by_alias('scheme')
    assert_equal Language['Tree-sitter Query'], Language.find_by_alias('tree-sitter-query')
    assert_equal Language['Shell'], Language.find_by_alias('bash')
    assert_equal Language['Shell'], Language.find_by_alias('sh')
    assert_equal Language['Shell'], Language.find_by_alias('shell')
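
The new assertion looks up `'tree-sitter-query'` even though the `languages.yml` entry declares no explicit alias. A minimal sketch of why that can work, assuming the default alias is derived by downcasing the language name and replacing whitespace with hyphens. The `default_alias` lambda is a hypothetical stand-in, not Linguist's own method.

```ruby
# Hypothetical stand-in for the default alias derivation (an assumption):
# downcase the language name and replace each whitespace character with "-".
default_alias = ->(name) { name.downcase.gsub(/\s/, "-") }

puts default_alias.call("Tree-sitter Query") # tree-sitter-query
puts default_alias.call("Scheme")            # scheme
```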
1 change: 1 addition & 0 deletions vendor/README.md
@@ -596,6 +596,7 @@ This is a list of grammars that Linguist selects to provide syntax highlighting
- **TextMate Properties:** [textmate/textmate.tmbundle](https://github.com/textmate/textmate.tmbundle)
- **Thrift:** [textmate/thrift.tmbundle](https://github.com/textmate/thrift.tmbundle)
- **Toit:** [toitware/ide-tools](https://github.com/toitware/ide-tools)
- **Tree-sitter Query:** [jrieken/vscode-tree-sitter-query](https://github.com/jrieken/vscode-tree-sitter-query)
- **Turing:** [Alhadis/language-turing](https://github.com/Alhadis/language-turing)
- **Turtle:** [peta/turtle.tmbundle](https://github.com/peta/turtle.tmbundle)
- **Twig:** [Anomareh/PHP-Twig.tmbundle](https://github.com/Anomareh/PHP-Twig.tmbundle)
1 change: 1 addition & 0 deletions vendor/grammars/vscode-tree-sitter-query
@@ -0,0 +1,8 @@
---
name: vscode-tree-sitter-query
version: d350239cb02d76c4e66720e7b1acb215ec636a18
type: git_submodule
homepage: https://github.com/jrieken/vscode-tree-sitter-query.git
license: none
licenses: []
notices: []