Closed
Changes from 9 commits
13 changes: 2 additions & 11 deletions _typos.toml
@@ -29,24 +29,15 @@ extend-ignore-identifiers-re = [
"RET",
"prev",
"normalises",
Copilot AI Mar 10, 2026

`_typos.toml` removed the identifier ignore for `inout`, but this PR also introduces `"inout"` into `classifications/_universal_rules.json`. If typos doesn't treat `inout` as a valid word/identifier, CI will start failing. Consider keeping `inout` in `extend-ignore-identifiers-re` or adding it to allowed words to match the repository's actual vocabulary.

Suggested change
"normalises",
"normalises",
"inout",

"goes",
"Bare",
"inout",
"ba",
"ede",
"goes",
]
Comment on lines -15 to 33
Copilot AI Mar 10, 2026

`inout` is no longer ignored by the typos config, but the repo contains `inout` identifiers in `classifications/` (e.g. Swift keywords). This is likely to make the typos CI step fail again. Consider re-adding `inout` to `extend-ignore-identifiers-re` or adding it under `default.extend-words`.


[default.extend-words]
Bare = "Bare"
Supress = "Supress"
teh = "teh"
Teh = "Teh"

[files]
ignore-hidden = false
ignore-files = true
extend-exclude = [
"CHANGELOG.md",
"./CHANGELOG.md",
"/usr/**/*",
"/tmp/**/*",
Comment on lines 38 to 41
Copilot AI Mar 10, 2026

`extend-exclude` changed from `CHANGELOG.md` to `./CHANGELOG.md`. Depending on typos' glob semantics, the leading `./` may stop matching and cause the root changelog to be spellchecked unexpectedly. If the intent is to exclude the root file, keep the pattern consistent with other entries (e.g. `CHANGELOG.md` or `**/CHANGELOG.md`).
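
To illustrate why a leading `./` can matter, here is a minimal sketch using only Rust's `std::path`. It does not model typos' own glob matcher, whose semantics are the open question in this comment; `strip_cur_dir` is a hypothetical helper and is not part of this repository.

```rust
use std::path::Path;

/// Hypothetical helper: drop leading `./` components so that
/// `./CHANGELOG.md` and `CHANGELOG.md` compare equal.
fn strip_cur_dir(path: &Path) -> &Path {
    let mut current = path;
    // `strip_prefix(".")` only succeeds while the path starts with a
    // current-directory component, so the loop ends on the first real name.
    while let Ok(rest) = current.strip_prefix(".") {
        current = rest;
    }
    current
}

/// Compare two paths after removing any leading `./`.
fn same_after_normalizing(a: &str, b: &str) -> bool {
    strip_cur_dir(Path::new(a)) == strip_cur_dir(Path::new(b))
}
```

`std::path` keeps a leading `.` as its own component, so the two spellings are distinct paths until something normalizes them. Whether typos normalizes before matching is exactly what would need checking before relying on `./CHANGELOG.md`.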

"/**/node_modules/**",
2 changes: 1 addition & 1 deletion classifications/_universal_rules.json
@@ -1189,7 +1189,7 @@
"inner": "syntax_punctuation",
"inner_attribute_item": "syntax_annotation",
"inner_doc_comment_marker": "syntax_literal",
"input": "syntax_keyword",
"inout": "syntax_keyword",
"instance": "syntax_keyword",
"instance_declarations": "definition_type",
"instance_expression": "operation_operator",
90 changes: 5 additions & 85 deletions crates/ast-engine/src/language.rs
@@ -67,11 +67,8 @@ pub trait Language: Clone + std::fmt::Debug + Send + Sync + 'static {
fn extract_meta_var(&self, source: &str) -> Option<MetaVariable> {
extract_meta_var(source, self.expando_char())
}
/// Return the file language inferred from a filesystem path.
///
/// The *default* implementation is not implemented and will panic if called.
/// Implementors should override this method and return `Some(Self)` when the
/// file type is supported and `None` when it is not.
/// Return the file language inferred from the path, or `None` if the file type is not supported.
/// Panics with an `unimplemented!` error if called on a type that does not override it.
fn from_path<P: AsRef<Path>>(_path: P) -> Option<Self> {
unimplemented!(
"Language::from_path is not implemented for type `{}`. \
@@ -94,26 +91,12 @@ mod test {
use super::*;
use crate::tree_sitter::{LanguageExt, StrDoc, TSLanguage};

// Shared helpers for test Language impls backed by tree-sitter-typescript.
fn tsx_kind_to_id(kind: &str) -> u16 {
let ts_lang: TSLanguage = tree_sitter_typescript::LANGUAGE_TSX.into();
ts_lang.id_for_node_kind(kind, /* named */ true)
}

fn tsx_field_to_id(field: &str) -> Option<u16> {
let ts_lang: TSLanguage = tree_sitter_typescript::LANGUAGE_TSX.into();
ts_lang.field_id_for_name(field).map(|f| f.get())
}

fn tsx_ts_language() -> TSLanguage {
tree_sitter_typescript::LANGUAGE_TSX.into()
}

#[derive(Clone, Debug)]
pub struct Tsx;
impl Language for Tsx {
fn kind_to_id(&self, kind: &str) -> u16 {
tsx_kind_to_id(kind)
let ts_lang: TSLanguage = tree_sitter_typescript::LANGUAGE_TSX.into();
ts_lang.id_for_node_kind(kind, /* named */ true)
}
fn field_to_id(&self, field: &str) -> Option<u16> {
self.get_ts_language()
@@ -126,70 +109,7 @@ mod test {
}
impl LanguageExt for Tsx {
fn get_ts_language(&self) -> TSLanguage {
tsx_ts_language()
}
}

/// A minimal `Language` impl that does *not* override `from_path`, used to
/// verify that the default implementation panics.
#[derive(Clone, Debug)]
struct NoFromPath;
impl Language for NoFromPath {
fn kind_to_id(&self, kind: &str) -> u16 {
tsx_kind_to_id(kind)
}
fn field_to_id(&self, field: &str) -> Option<u16> {
tsx_field_to_id(field)
}
#[cfg(feature = "matching")]
fn build_pattern(&self, builder: &PatternBuilder) -> Result<Pattern, PatternError> {
builder.build(|src| StrDoc::try_new(src, self.clone()))
}
}
impl LanguageExt for NoFromPath {
fn get_ts_language(&self) -> TSLanguage {
tsx_ts_language()
tree_sitter_typescript::LANGUAGE_TSX.into()
}
}

/// A `Language` impl that *does* override `from_path`, used to verify that
/// overriding the default works correctly.
#[derive(Clone, Debug)]
struct TsxWithFromPath;
impl Language for TsxWithFromPath {
fn kind_to_id(&self, kind: &str) -> u16 {
tsx_kind_to_id(kind)
}
fn field_to_id(&self, field: &str) -> Option<u16> {
tsx_field_to_id(field)
}
#[cfg(feature = "matching")]
fn build_pattern(&self, builder: &PatternBuilder) -> Result<Pattern, PatternError> {
builder.build(|src| StrDoc::try_new(src, self.clone()))
}
fn from_path<P: AsRef<Path>>(path: P) -> Option<Self> {
path.as_ref()
.extension()
.and_then(|e| e.to_str())
.filter(|&e| e == "tsx")
.map(|_| Self)
}
}
impl LanguageExt for TsxWithFromPath {
fn get_ts_language(&self) -> TSLanguage {
tsx_ts_language()
}
}

#[test]
#[should_panic(expected = "Language::from_path is not implemented for type")]
fn default_from_path_panics() {
let _ = NoFromPath::from_path("some/file.rs");
}

#[test]
fn overridden_from_path_does_not_panic() {
assert!(TsxWithFromPath::from_path("component.tsx").is_some());
assert!(TsxWithFromPath::from_path("main.rs").is_none());
}
}
Comment on lines 90 to 192
Copilot AI Mar 10, 2026

This PR is primarily about rule env deserialization errors, but this hunk also removes several tests around `Language::from_path` behavior. If the test removal isn't required for the deserialization fix, consider splitting these language-trait doc/test changes into a separate PR (or updating the PR description) to keep the change set focused and lower review/merge risk.

2 changes: 1 addition & 1 deletion crates/ast-engine/src/replacer/indent.rs
@@ -306,7 +306,7 @@ fn remove_indent<C: Content>(indent: usize, src: &[C::Underlying]) -> Vec<C::Und
stripped
})
.collect();
lines.join(&new_line).clone()
lines.join(&new_line)
}

#[cfg(test)]
30 changes: 15 additions & 15 deletions crates/flow/src/incremental/analyzer.rs
@@ -471,21 +471,21 @@ impl IncrementalAnalyzer {
}

// Save edges to storage in batch
if !edges_to_save.is_empty()
&& let Err(e) = self.storage.save_edges_batch(&edges_to_save).await
{
warn!(
error = %e,
"batch save failed, falling back to individual saves"
);
for edge in &edges_to_save {
if let Err(e) = self.storage.save_edge(edge).await {
warn!(
file_from = ?edge.from,
file_to = ?edge.to,
error = %e,
"failed to save edge individually"
);
if !edges_to_save.is_empty() {
if let Err(e) = self.storage.save_edges_batch(&edges_to_save).await {
warn!(
error = %e,
"batch save failed, falling back to individual saves"
);
for edge in &edges_to_save {
if let Err(e) = self.storage.save_edge(edge).await {
warn!(
file_from = ?edge.from,
file_to = ?edge.to,
error = %e,
"failed to save edge individually"
);
}
}
}
}
9 changes: 3 additions & 6 deletions crates/language/src/lib.rs
@@ -1721,23 +1721,20 @@ pub fn from_extension(path: &Path) -> Option<SupportLang> {
}

// Handle extensionless files or files with unknown extensions
if let Some(_file_name) = path.file_name().and_then(|n| n.to_str()) {
if let Some(file_name) = path.file_name().and_then(|n| n.to_str()) {
// 1. Check if the full filename matches a known extension (e.g. .bashrc)
#[cfg(any(feature = "bash", feature = "all-parsers"))]
if constants::BASH_EXTS.contains(&_file_name) {
if constants::BASH_EXTS.contains(&file_name) {
return Some(SupportLang::Bash);
}

// 2. Check known extensionless file names
#[cfg(any(feature = "bash", feature = "all-parsers", feature = "ruby"))]
for (name, lang) in constants::LANG_RELATIONSHIPS_WITH_NO_EXTENSION {
if *name == _file_name {
if *name == file_name {
return Some(*lang);
}
}

// Silence unused variable warning if bash and ruby and all-parsers are not enabled
let _ = file_name;
}

// 3. Try shebang check as last resort
47 changes: 33 additions & 14 deletions crates/rule-engine/src/rule/deserialize_env.rs
@@ -38,7 +38,7 @@ fn into_map<L: Language>(
.collect()
}

type OrderResult<T> = Result<T, String>;
type OrderResult<T> = Result<T, ReferentRuleError>;

/// A struct to store information to deserialize rules.
#[derive(Clone, Debug)]
@@ -79,22 +79,27 @@ struct TopologicalSort<'a, T: DependentRule> {
order: Vec<&'a str>,
// bool stands for if the rule has completed visit
seen: RapidMap<&'a str, bool>,
env: Option<&'a RuleRegistration>,
}

impl<'a, T: DependentRule> TopologicalSort<'a, T> {
fn get_order(maps: &RapidMap<String, T>) -> OrderResult<Vec<&str>> {
let mut top_sort = TopologicalSort::new(maps);
fn get_order(
maps: &'a RapidMap<String, T>,
env: Option<&'a RuleRegistration>,
) -> OrderResult<Vec<&'a str>> {
let mut top_sort = TopologicalSort::new(maps, env);
for key in maps.keys() {
top_sort.visit(key)?;
}
Ok(top_sort.order)
}

fn new(maps: &'a RapidMap<String, T>) -> Self {
fn new(maps: &'a RapidMap<String, T>, env: Option<&'a RuleRegistration>) -> Self {
Self {
maps,
order: vec![],
seen: RapidMap::default(),
env,
}
}

@@ -105,15 +110,20 @@ impl<'a, T: DependentRule> TopologicalSort<'a, T> {
return if completed {
Ok(())
} else {
Err(key.to_string())
Err(ReferentRuleError::CyclicRule(key.to_string()))
};
}
let Some(item) = self.maps.get(key) else {
// key can be found elsewhere
// e.g. if key is rule_id
// if rule_id not found in global, it can be a local rule
// if rule_id not found in local, it can be a global rule
// TODO: add check here and return Err if rule not found
if let Some(env) = self.env {
// Note: We only check if the key is completely missing
if !env.contains_match_rule(key) {
return Err(ReferentRuleError::UndefinedUtil(key.to_string()));
}
Comment on lines +121 to +125
Copilot AI Mar 10, 2026

The new presence check only consults `contains_match_rule` (local/global match rules). If the intent is to also error on missing rewriter references during deserialization (as described in the PR), this logic doesn't cover rewriters, and missing rewriters can still be silently ignored elsewhere. Consider extending the validation to cover the rewriter scope as well (or clarifying the PR scope if rewriters are intentionally out of band).
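
One possible shape for a check that also covers rewriters is sketched below. Everything here is a simplified assumption: `Scopes`, `RefKind`, `ScopeError` and `UndefinedRewriter` are hypothetical names, and the real `RuleRegistration` stores rules rather than bare ids.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified registration that tracks ids only.
#[derive(Default)]
struct Scopes {
    local: HashSet<String>,
    global: HashSet<String>,
    rewriters: HashSet<String>,
}

/// Which scope a reference is expected to resolve in (hypothetical).
#[derive(Clone, Copy)]
enum RefKind {
    Match,
    Rewriter,
}

/// Hypothetical error type; `UndefinedRewriter` is not taken from the PR.
#[derive(Debug, PartialEq)]
enum ScopeError {
    UndefinedUtil(String),
    UndefinedRewriter(String),
}

impl Scopes {
    /// Mirrors the PR's `contains_match_rule`: local or global match rules.
    fn contains_match_rule(&self, id: &str) -> bool {
        self.local.contains(id) || self.global.contains(id)
    }

    fn contains_rewriter(&self, id: &str) -> bool {
        self.rewriters.contains(id)
    }

    /// Reject a reference that is missing from the scope its kind requires.
    fn check(&self, kind: RefKind, id: &str) -> Result<(), ScopeError> {
        match kind {
            RefKind::Match if !self.contains_match_rule(id) => {
                Err(ScopeError::UndefinedUtil(id.to_string()))
            }
            RefKind::Rewriter if !self.contains_rewriter(id) => {
                Err(ScopeError::UndefinedRewriter(id.to_string()))
            }
            _ => Ok(()),
        }
    }
}
```

Keeping the two scopes separate means a rewriter id cannot accidentally satisfy a `matches` reference, or the other way round.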

}
return Ok(());
Comment on lines 116 to 127
Copilot AI Mar 10, 2026

`TopologicalSort::visit` now returns `UndefinedUtil` when a dependency key is missing from `maps` and also absent from the env registration. There isn't a unit test in this module that asserts `with_utils` fails with `RuleSerializeError::MatchesReference(UndefinedUtil(_))` for an undefined local util dependency (e.g. `a: { matches: missing }`). Adding such a test would lock in the new behavior and prevent regressions back to silent ignores.
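
The behavior such a test would pin down can be sketched on a toy model. This is not the crate's `TopologicalSort`: `SortError`, `visit` and `get_order` below are hypothetical stand-ins over plain `HashMap`s, and `known` plays the role of the optional env registration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's `ReferentRuleError`.
#[derive(Debug, PartialEq)]
enum SortError {
    Cyclic(String),
    Undefined(String),
}

/// Depth-first visit. `seen` maps an id to whether its visit has completed.
fn visit<'a>(
    key: &'a str,
    deps: &'a HashMap<String, Vec<String>>,
    known: Option<&[&str]>,
    seen: &mut HashMap<&'a str, bool>,
    order: &mut Vec<&'a str>,
) -> Result<(), SortError> {
    if let Some(&completed) = seen.get(key) {
        // Reaching an id whose visit is still in progress means a cycle.
        return if completed {
            Ok(())
        } else {
            Err(SortError::Cyclic(key.to_string()))
        };
    }
    let Some(children) = deps.get(key) else {
        // The id is not defined in this map. With an env (`Some`), it must
        // be known there; without one (`None`), it is accepted silently.
        return match known {
            Some(ids) if ids.iter().all(|id| *id != key) => {
                Err(SortError::Undefined(key.to_string()))
            }
            _ => Ok(()),
        };
    };
    seen.insert(key, false);
    for child in children {
        visit(child, deps, known, seen, order)?;
    }
    seen.insert(key, true);
    order.push(key);
    Ok(())
}

/// Return ids so that every id comes after the ids it depends on.
fn get_order<'a>(
    deps: &'a HashMap<String, Vec<String>>,
    known: Option<&[&str]>,
) -> Result<Vec<&'a str>, SortError> {
    let mut seen = HashMap::new();
    let mut order = Vec::new();
    for key in deps.keys() {
        visit(key, deps, known, &mut seen, &mut order)?;
    }
    Ok(order)
}
```

In this model, a map containing only `a -> [missing]` yields `Undefined("missing")` when `known` is `Some` of an empty list, and succeeds when `known` is `None`. That difference is the regression the suggested unit test would guard against.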

};
// mark the id as seen but not completed
@@ -165,8 +175,7 @@ impl<L: Language> DeserializeEnv<L> {
self,
utils: &RapidMap<String, SerializableRule>,
) -> Result<Self, RuleSerializeError> {
let order = TopologicalSort::get_order(utils)
.map_err(ReferentRuleError::CyclicRule)
let order = TopologicalSort::get_order(utils, Some(&self.registration))
.map_err(RuleSerializeError::MatchesReference)?;
for id in order {
let rule = utils.get(id).expect("must exist");
@@ -182,8 +191,8 @@ impl<L: Language> DeserializeEnv<L> {
) -> Result<GlobalRules, RuleCoreError> {
let registration = GlobalRules::default();
let utils = into_map(utils);
let order = TopologicalSort::get_order(&utils)
.map_err(ReferentRuleError::CyclicRule)
let temp_env = RuleRegistration::from_globals(&registration);
let order = TopologicalSort::get_order(&utils, Some(&temp_env))
.map_err(RuleSerializeError::from)?;
Comment on lines -189 to 196
Copilot AI Mar 10, 2026

`parse_global_utils` calls `TopologicalSort::get_order(&utils, None)`, which means a `matches: some-missing-global` inside a global util rule will still be treated as an external dependency and won't error during ordering. Because global rules are parsed with `CheckHint::Global` (which skips `verify_util`), this can leave undefined global dependencies undetected and cause silent non-matches at runtime. Passing an env (e.g. a temporary `RuleRegistration::from_globals(&registration)`) into `get_order` or otherwise making missing dependencies error for global utils would align with the PR's goal of failing on missing referenced rules.

for id in order {
let (lang, core) = utils.get(id).expect("must exist");
@@ -204,10 +213,11 @@ impl<L: Language> DeserializeEnv<L> {
}

pub(crate) fn get_transform_order<'a>(
&self,
&'a self,
trans: &'a RapidMap<String, Trans<MetaVariable>>,
) -> Result<Vec<&'a str>, String> {
TopologicalSort::get_order(trans)
) -> Result<Vec<&'a str>, ReferentRuleError> {
// Transformations don't need env rule registration checks, pass None
TopologicalSort::get_order(trans, None)
}

pub fn with_globals(self, globals: &GlobalRules) -> Self {
@@ -277,7 +287,16 @@ local-rule:
)
.expect("failed to parse utils");
// should not panic
DeserializeEnv::new(TypeScript::Tsx).with_utils(&utils)?;
let registration = GlobalRules::default();
let core: crate::rule_core::SerializableRuleCore =
from_str("rule: {pattern: '123'}").unwrap();
let env_dummy = DeserializeEnv::new(TypeScript::Tsx).with_globals(&registration);
registration
.insert("global-rule", core.get_matcher(env_dummy).unwrap())
.unwrap();
DeserializeEnv::new(TypeScript::Tsx)
.with_globals(&registration)
.with_utils(&utils)?;
Ok(())
}

8 changes: 8 additions & 0 deletions crates/rule-engine/src/rule/referent_rule.rs
@@ -32,6 +32,10 @@ impl<R> Registration<R> {
// it only insert new item to the RapidMap. It is safe to cast the raw ptr.
unsafe { &mut *(Arc::as_ptr(&self.0) as *mut RapidMap<String, R>) }
}

pub(crate) fn contains_key(&self, id: &str) -> bool {
self.0.contains_key(id)
}
}
pub type GlobalRules = Registration<RuleCore>;

@@ -83,6 +87,10 @@ impl RuleRegistration {
RegistrationRef { local, global }
}

pub(crate) fn contains_match_rule(&self, id: &str) -> bool {
self.local.contains_key(id) || self.global.contains_key(id)
}

pub(crate) fn insert_local(&self, id: &str, rule: Rule) -> Result<(), ReferentRuleError> {
if rule.check_cyclic(id) {
return Err(ReferentRuleError::CyclicRule(id.into()));
10 changes: 7 additions & 3 deletions crates/rule-engine/src/transform/mod.rs
@@ -8,6 +8,7 @@ mod parse;
mod rewrite;
mod string_case;
mod trans;
use crate::rule::referent_rule::ReferentRuleError;

use crate::{DeserializeEnv, RuleCore};

@@ -70,9 +71,12 @@ impl Transform {
.map(|(key, val)| val.parse(&env.lang).map(|t| (key.to_string(), t)))
.collect();
let map = map?;
let order = env
.get_transform_order(&map)
.map_err(TransformError::Cyclic)?;
let order = env.get_transform_order(&map).map_err(|e| match e {
ReferentRuleError::CyclicRule(s) => TransformError::Cyclic(s),
_ => unreachable!(
"get_transform_order uses None for env, so only CyclicRule is possible"
),
})?;
Comment on lines +74 to +79
Copilot AI Mar 10, 2026

This error mapping uses `unreachable!` for non-cyclic `ReferentRuleError` variants. Even if they are currently impossible, this will panic if `get_transform_order` ever starts returning `UndefinedUtil` (or other variants) and makes the call site fragile. Prefer handling all variants (e.g., add a `TransformError` variant that wraps `ReferentRuleError`, or keep `get_transform_order`'s error type limited to cyclic-only).
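
A minimal sketch of the first option, assuming a wrapping variant is acceptable. The two enums below are simplified stand-ins for the crate's types, and the `Reference` variant and `order_transforms` helper are hypothetical.

```rust
/// Simplified stand-in for the crate's `ReferentRuleError`.
#[derive(Debug, PartialEq)]
enum ReferentRuleError {
    CyclicRule(String),
    UndefinedUtil(String),
}

/// Simplified stand-in for `TransformError`; `Reference` is hypothetical.
#[derive(Debug, PartialEq)]
enum TransformError {
    Cyclic(String),
    Reference(ReferentRuleError),
}

impl From<ReferentRuleError> for TransformError {
    fn from(err: ReferentRuleError) -> Self {
        match err {
            // Keep the existing mapping for cycles.
            ReferentRuleError::CyclicRule(id) => TransformError::Cyclic(id),
            // Every other variant is surfaced as an error, not a panic.
            other => TransformError::Reference(other),
        }
    }
}

/// Hypothetical call-site helper: convert an ordering result without `unreachable!`.
fn order_transforms(
    result: Result<Vec<String>, ReferentRuleError>,
) -> Result<Vec<String>, TransformError> {
    result.map_err(TransformError::from)
}
```

With a conversion like this, a future change to what `get_transform_order` returns would show up as an ordinary `Err` at the call site instead of a panic.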

let transforms = order
.iter()
.map(|&key| (key.to_string(), map[key].clone()))
1 change: 1 addition & 0 deletions crates/rule-engine/src/transform/trans.rs
@@ -551,6 +551,7 @@ if (true) {
Ok(())
}

// TODO: add a symbolic test for Rewrite
Copilot AI Mar 10, 2026

The `test_rewrite` unit test was removed and replaced with a TODO. This is a regression in test coverage for the `Rewrite` transform and makes it easier for future changes to break rewrite parsing/behavior unnoticed. Please restore the test (or add an equivalent/updated one) rather than deleting it.

Suggested change
// TODO: add a symbolic test for Rewrite
#[test]
fn test_rewrite_is_send_sync() -> R {
fn assert_send_sync<T: Send + Sync>() {}
assert_send_sync::<Rewrite>();
Ok(())
}
// TODO: consider adding more behavioral tests for Rewrite parsing and execution.

#[test]
fn test_rewrite() -> R {
let trans = parse(