Skip to content

Trojan Source mitigation in the lexer is incomplete #16500

@revofusion

Description

@revofusion

Description

Trojan Source (CVE-2021-42574 class) mitigation in the lexer is incomplete.

liblangutil/Scanner.cpp uses validateBiDiMarkup() to reject unbalanced bidirectional override markers in comments and string literals, but:

  • NatSpec doc comments (/// and /** ... */) are not validated, so unbalanced markers like U+202E (RLO) inside NatSpec are accepted.
  • The validator does not include the isolate markers U+2066..U+2069 (LRI/RLI/FSI/PDI), so unbalanced isolates are accepted even in regular comments/strings.

Impact: source can be rendered deceptively in BiDi-aware viewers (e.g., code visually appears inside a comment or tokens appear reordered) while the compiler compiles the real bytes, undermining human review and tooling assumptions.

Environment

  • Compiler version: 0.8.35-develop.2026.2.21+commit.30f20186 (local build from develop)
  • Compilation pipeline (legacy, IR, EOF): legacy (default)
  • Target EVM version (as per compiler settings): default
  • Framework/IDE: N/A (CLI)
  • EVM execution environment / backend / blockchain client: N/A (parsing stage)
  • Operating system: macOS (Darwin)

Steps to Reproduce

  1. Create a file with an unbalanced U+202E inside a NatSpec doc comment:
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.0;

contract C {
    function f() external {
        bool isOwner = false;

        /** audit: enforce owner check ‮*/ isOwner = true; // contains unbalanced U+202E

        if (isOwner) { /* privileged path */ }
    }
}
  1. Compile (parsing is enough):

solc --stop-after parsing <file>.sol

Expected: compiler rejects unbalanced BiDi markers consistently (as it does for regular /* ... */ comments).
Actual: no directional-override error for the NatSpec case.

This is also accepted because validateBiDiMarkup() does not track isolate markers.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bug 🐛easy difficultylow effortThere is not much implementation work to be done. The task is very easy or tiny.low impactChanges are not very noticeable or potential benefits are limited.natspec

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions