CC cleanups and optimization#570

Merged
jgarzik merged 18 commits into main from updates
Mar 23, 2026

Conversation

Contributor

@jgarzik commented Mar 23, 2026

No description provided.

jgarzik and others added 13 commits March 22, 2026 22:57
Add cc/ir/hwmap.rs, a new pass running after linearize and before
optimize. It inspects each IR instruction and decides whether the
target handles it natively (Legal), via a runtime library call
(LibCall), or via a comparison-expansion pattern (CmpLibCall).

Phase 1: infrastructure — HwMapAction enum, TargetHwMap trait,
X86_64HwMap/Aarch64HwMap, wired into pipeline.

Phase 2a: int128 div/mod → LibCall (__divti3 etc.) on all targets.
Phase 2b: int128↔float conversions → LibCall (__floattisf etc.).
Phase 2c: long double ops on aarch64/Linux → LibCall/CmpLibCall
  (__addtf3, __negtf2, __lttf2+SetLt, __extendsftf2, etc.).
  x86_64 and macOS aarch64 remain Legal (native x87 / ld==double).

Removes ~350 lines of rtlib decision logic from the linearizer and
~380 lines of dead RtlibNames methods + tests from rtlib.rs.
30 unit tests in hwmap.rs; all 159 integration tests pass.
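As a rough sketch of the shape this commit describes — the names HwMapAction, TargetHwMap, X86_64HwMap, and __divti3/__modti3 come from the commit message itself, while the signatures and string-keyed dispatch are invented for illustration:

```rust
/// How the target handles a given IR instruction.
#[derive(Debug, PartialEq)]
enum HwMapAction {
    /// Native hardware support; leave the instruction alone.
    Legal,
    /// Replace the op with a runtime library call, e.g. __divti3.
    LibCall(&'static str),
    /// Call a comparison helper returning an int, then compare that
    /// result against zero (e.g. __lttf2 followed by SetLt).
    CmpLibCall(&'static str),
}

trait TargetHwMap {
    fn classify(&self, opcode: &str, is_int128: bool) -> HwMapAction;
}

struct X86_64HwMap;

impl TargetHwMap for X86_64HwMap {
    fn classify(&self, opcode: &str, is_int128: bool) -> HwMapAction {
        match (opcode, is_int128) {
            // No single-instruction 128-bit divide on x86-64.
            ("div", true) => HwMapAction::LibCall("__divti3"),
            ("mod", true) => HwMapAction::LibCall("__modti3"),
            _ => HwMapAction::Legal,
        }
    }
}
```

The optimize pass then never sees target-illegal instructions, and the backends only receive ops they can emit directly.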

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
--dump-ir now accepts an optional stage name:
  post-linearize, post-hwmap, post-opt (default), post-lower, all
Bare --dump-ir remains backward compatible (dumps post-opt).

--dump-ir-func=<name> filters output to a single function.

Also add codegen_float16_mega integration test covering arithmetic,
negation, comparisons, float/double conversions, and compound
assignment for _Float16 on x86-64.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On x86-64, Float16 arithmetic/negation/comparisons have no native
hardware support. Previously the linearizer intercepted these and
emitted a promote-operate-truncate pattern inline. Now the linearizer
emits generic FAdd/FSub/FMul/FDiv/FNeg/FCmpO* with Float16 type, and
the hwmap pass detects these on x86-64 and expands them:

  - Arithmetic: __extendhfsf2(left), __extendhfsf2(right), native
    float op, __truncsfhf2(result)
  - Negation: __extendhfsf2(src), native FNeg, __truncsfhf2(result)
  - Comparisons: __extendhfsf2(left), __extendhfsf2(right), native
    float compare (no truncate — result is int)

AArch64 has native FP16 and remains Legal.

Removes ~230 lines from linearizer (5 helper methods + 3 intercepts).
Removes dead RtlibNames::float16_needs_softfloat().
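The three expansion patterns can be sketched against a toy string-based IR — the helper names __extendhfsf2/__truncsfhf2 are from the commit message, but the IR syntax and function are invented for illustration:

```rust
// Expand one Float16 binary op into the extend / operate / (truncate)
// sequence described above. Comparisons skip the truncation because
// their result is an int, not a Float16.
fn expand_f16_binop(op: &str, lhs: &str, rhs: &str, is_compare: bool) -> Vec<String> {
    let mut seq = vec![
        format!("t0 = call __extendhfsf2({lhs})"), // promote lhs to f32
        format!("t1 = call __extendhfsf2({rhs})"), // promote rhs to f32
        format!("t2 = {op}.f32 t0, t1"),           // native f32 operation
    ];
    if !is_compare {
        seq.push(String::from("t3 = call __truncsfhf2(t2)")); // demote result
    }
    seq
}
```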

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
complex_mul_name() and complex_div_name() centralize the
target-dependent function name selection (__mulsc3/__muldc3/__mulxc3/
__multc3 etc.) in hwmap.rs. The linearizer calls these instead of
RtlibNames methods.

Removes complex_mul(), complex_div(), longdouble_is_double() from
RtlibNames and their tests. Adds 6 unit tests in hwmap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New opcodes: Lo64, Hi64, Pair64, AddC, AdcC, SubC, SbcC, UMulHi.
These enable shared int128 decomposition in hwmap.rs instead of
duplicated per-backend code.

hwmap now expands int128 operations into 64-bit sequences:
  - Bitwise (And/Or/Xor): Lo64+Hi64 each operand, 64-bit op, Pair64
  - Neg: SubC(0,lo), SbcC(0,hi,carry), Pair64
  - Not: Lo64+Hi64, Not each, Pair64
  - Add: AddC(lo,lo), AdcC(hi,hi,carry), Pair64
  - Sub: SubC(lo,lo), SbcC(hi,hi,borrow), Pair64
  - Mul: cross-product via Mul+UMulHi+Add
  - Eq/Ne: xor+or reduction, then 64-bit compare
  - Ordered comparisons: hi compare + Select(hi_eq, lo_cmp, hi_cmp)
  - Zext/Sext: Pair64 with zero/sign-extended halves

Backend support for all 8 opcodes on x86_64 and aarch64.
Fix x86_64 int128_pseudos to not mark Lo64/Hi64/Pair64 operands
incorrectly (caused Sext misrouting to emit_int128_extend).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Conversion opcodes (Sext, Zext, Trunc, FCvtS, FCvtU, SCvtF, UCvtF,
FCvtF) now display as e.g. sext.32to64 instead of sext.64, making
the source width visible in --dump-ir output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Now that hwmap expands int128 add/sub/mul/bitwise/neg/not/comparisons/
zext/sext into 64-bit sequences using Lo64/Hi64/Pair64/AddC/etc.,
the per-backend implementations are dead code.

x86_64: removed emit_int128_mul (~120 lines), emit_int128_div (~50),
  emit_int128_compare (~145), emit_int128_unary (~55), Zext/Sext-to-128
  in emit_int128_extend (~75), Add/Sub/And/Or/Xor in emit_int128_binop
  (~100), int128_src2_lo/hi_operand helpers (~50). Total: ~645 lines.

aarch64: removed Add/Sub/And/Or/Xor/Mul from emit_int128_binop (~140),
  emit_int128_div (~50), emit_int128_compare (~120), emit_int128_unary
  (~50), Zext/Sext-to-128 in emit_extend (~75). Removed dead LIR
  variants MAdd/Negs/Ngc (~60). Total: ~550 lines.

Only int128 shifts (Shl/Lsr/Asr) remain in the backends — these
require arch-specific branching (SHLD/SHRD vs LSL+LSR+ORR).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes from self-review:

1. expand_insn: explicit dispatch instead of fallthrough to
   expand_float16. Panics on unhandled Expand with context.

2. map_int128_expand: comparison arms now verify insn.typ is
   Int128, preventing misrouting of 128-bit long double comparisons
   if map_common ordering ever changes.

3. Remove 7 duplicate func.add_pseudo calls after alloc_reg64
   (which already registers the pseudo).

4. Factor out extract_halves() helper — replaces ~120 lines of
   repeated Lo64+Hi64 pairs across 8 expansion arms.

5. Simplify Float16 comparison detection — remove redundant inner
   matches!() and dead `let _ = typ`.

6. Validate --dump-ir stage names early with helpful error message.

7. Add codegen_int128_carry_chain_optimized integration test that
   exercises AddC→AdcC carry propagation, SubC→SbcC borrow, and
   cross-boundary multiply under -O2. Confirms optimizer does not
   break the flag-chain invariant.

8. Add explanatory comments on int128_load_lo/hi Loc::Reg handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move arch-specific instruction mapping logic from the monolithic
cc/ir/hwmap.rs into cc/arch/{x86_64,aarch64}/mapping.rs, with shared
trait and helpers in cc/arch/mapping.rs. This follows the existing
cc/arch/codegen.rs pattern of shared trait + per-arch implementations.

Key design changes:
- Replace HwMapAction enum + shared dispatcher with ArchMapper trait
  where arch code directly builds replacement IR via MappedInsn::Replace
- Split expand_int128 monolith into 10 individual functions
- Split expand_float16 into 3 individual functions
- Float16 soft-float handling stays in x86_64 mapper only
- Long double rtlib handling stays in aarch64 mapper only
- Shared test helpers extracted into test_helpers module

Pure refactoring — zero behavioral changes. All 161 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add Function::create_reg_pseudo() and Instruction::call_with_abi() as
canonical IR primitives for passes that synthesize instructions. These
replace duplicated patterns across mapping.rs (alloc_reg64), lower.rs
(open-coded pseudo alloc), and the 6+ copies of the ABI-classify-and-
build-call sequence (build_rtlib_call, build_rtlib_call_explicit,
RtlibCallParams — all deleted).

Migrate int128 constant shifts (Shl/Lsr/Asr) from backend codegen to
the mapping pass as expand_int128_const_shl/lsr/asr. Variable shifts
remain in backends (require arch-specific branching).

Migrate Float16 conversions from linearizer (emit_float16_convert_call)
to the x86_64 mapping pass. The linearizer now emits standard FCvtF/
FCvtS/FCvtU/SCvtF/UCvtF ops; the mapper lowers Float16 variants to
rtlib calls with proper compiler-rt vs libgcc ABI dispatch.

Fix uint128 large constant sign-extension: (*v >> 64) as i64 sign-
extends for values like 0xFFFFFFFFFFFFFFFF; change to as u64 as i64
in three x86_64 locations.
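A hypothetical reconstruction of the pitfall (not the compiler's actual code): once the low half of an unsigned constant sits in a signed 64-bit container, widening through a signed type sign-fills the upper bits, while routing through u64 zero-fills them:

```rust
// Buggy shape: the signed widening drags the sign bit into the hi half.
fn hi_half_buggy(lo: i64) -> i64 {
    ((lo as i128) >> 64) as i64
}

// Fixed shape: go through u64 first so the widening zero-extends.
fn hi_half_fixed(lo: i64) -> i64 {
    ((lo as u64 as i128) >> 64) as i64
}
```

For a constant like 0xFFFFFFFFFFFFFFFF the low half is -1 when viewed signed, so the buggy form reports a hi half of all-ones instead of zero.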

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mapping pass now expands all int128 binops except variable shifts
(Shl/Lsr/Asr). Replace the silent `_ => {}` catch-all in
emit_int128_binop with a panic so any unexpanded opcode reaching the
backend is caught immediately instead of silently dropped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…p table

Add QUOTE and COMMENT flags to the character classification, build the
table via const fn at compile time, and use pre-computed class bits in
get_special() to dispatch string literals and comments. Includes unit
tests covering every byte category.
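A minimal sketch of such a table — QUOTE and COMMENT are named in the commit message, while IDENT and the table layout are assumptions added for illustration:

```rust
// Character-class bit flags.
const QUOTE: u8 = 1 << 0;   // '"' and '\''
const COMMENT: u8 = 1 << 1; // '/' may start "//" or "/* */"
const IDENT: u8 = 1 << 2;   // identifier start: [A-Za-z_]

// Built once at compile time; no runtime match on each byte.
const fn build_class_table() -> [u8; 256] {
    let mut t = [0u8; 256];
    t[b'"' as usize] |= QUOTE;
    t[b'\'' as usize] |= QUOTE;
    t[b'/' as usize] |= COMMENT;
    t[b'_' as usize] |= IDENT;
    let mut c = b'a';
    while c <= b'z' {
        t[c as usize] |= IDENT;
        t[(c - b'a' + b'A') as usize] |= IDENT; // uppercase twin
        c += 1;
    }
    t
}

const CLASS: [u8; 256] = build_class_table();

fn is_quote(b: u8) -> bool { CLASS[b as usize] & QUOTE != 0 }
fn starts_comment(b: u8) -> bool { CLASS[b as usize] & COMMENT != 0 }
```

Dispatch in the hot path then becomes a single indexed load plus a bit test instead of a multi-arm match.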

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pre-intern ~240 well-known strings (C keywords, builtins, attribute names,
preprocessor directives) at StringTable creation time. Each gets a
deterministic StringId and a u32 tag bitmask, replacing string comparisons
with integer comparisons in all hot parser/preprocessor paths.

New file cc/kw.rs provides: define_keywords! macro, 14 tag bit constants,
DECL_START composite, has_tag()/tags() query API, and 11 unit tests.

Converted dispatch sites: is_declaration_start (43 arms), parse_type_specifiers
(35 arms), parse_statement (12 arms), try_parse_type_name (25 arms),
consume_type_qualifiers (5 arms), builtin dispatch (60 arms),
handle_directive (14 arms), is_type_keyword, is_nullability_qualifier,
is_attribute_keyword, is_asm_keyword, is_static_assert, is_builtin,
is_supported_attribute, sizeof/alignof, pointer/array qualifier parsing
(4 sites), and parse_asm_statement qualifiers.
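The tag scheme can be sketched as follows — the constant names loosely follow the commit message, but the table contents and the index-as-StringId convention are assumptions for illustration:

```rust
// Hypothetical tag bits; the real kw.rs defines 14 of these.
const TAG_TYPE: u32 = 1 << 0;      // int, char, ...
const TAG_QUALIFIER: u32 = 1 << 1; // const, volatile, ...
const TAG_STORAGE: u32 = 1 << 2;   // static, extern, ...
const DECL_START: u32 = TAG_TYPE | TAG_QUALIFIER | TAG_STORAGE;

// Pre-interned in declaration order, so a keyword's StringId (its index
// here) is deterministic across runs.
const KEYWORDS: &[(&str, u32)] = &[
    ("int", TAG_TYPE),
    ("const", TAG_QUALIFIER),
    ("static", TAG_STORAGE),
    ("return", 0),
];

// Membership test is one index plus one bitwise AND — no string compare.
fn has_tag(id: usize, tag: u32) -> bool {
    KEYWORDS.get(id).map_or(false, |&(_, t)| t & tag != 0)
}
```

A 43-arm `is_declaration_start` match collapses to `has_tag(id, DECL_START)` under this scheme.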

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jgarzik jgarzik requested a review from Copilot March 23, 2026 14:12
@jgarzik jgarzik self-assigned this Mar 23, 2026
@jgarzik jgarzik added the enhancement New feature or request label Mar 23, 2026
@jgarzik jgarzik changed the title from "Updates" to "CC cleanups and optimization" Mar 23, 2026
Copilot AI left a comment

Pull request overview

This PR refactors the cc (pcc) compiler to replace many string-based keyword/builtin/directive checks with deterministic StringId/tag lookups, and centralizes target-specific lowering decisions into a new “mapping” pass that expands complex IR ops into simpler sequences or rtlib calls.

Changes:

  • Add a pre-interned keyword system (cc/kw.rs) with tag-based membership checks, and update parser/preprocessor/builtin checks to use StringId comparisons instead of &str.
  • Introduce an architecture mapping pass (shared + per-arch mappers) to lower int128 ops, Float16 soft-float on x86-64, and long double rtlib calls on aarch64/Linux.
  • Enhance developer tooling and tests: staged --dump-ir, lexer char-classification table + tests, and expanded codegen/IR tests for Float16 and int128 behaviors.

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 3 comments.

Summary per file:
cc/token/preprocess.rs Switch directive dispatch and __has_* evaluation to StringId/tag-based checks.
cc/token/lexer.rs Replace match-based char classification with a compile-time lookup table; update tokenization and add unit tests.
cc/tests/codegen/misc.rs Add large codegen regression tests for Float16 and int128 carry/shift/constant behaviors.
cc/strings.rs Pre-intern all keyword strings at startup to ensure deterministic StringId assignments.
cc/rtlib.rs Remove complex/longdouble/int128 rtlib name selection helpers and their tests (logic moved elsewhere).
cc/parse/parser.rs Replace many keyword string comparisons with crate::kw StringId constants/tags.
cc/parse/mod.rs Change nullability qualifier predicate to operate on StringId via tags.
cc/main.rs Add kw module, recursion limit, staged --dump-ir with optional function filter, and integrate mapping pass in pipeline.
cc/lib.rs Export kw module and set recursion limit for macro-heavy keyword generation.
cc/kw.rs New keyword/tag system: deterministic IDs + bitmask tags + tests.
cc/ir/test_linearize.rs Update Float16 conversion tests to expect IR conversion opcodes (mapping pass performs rtlib lowering).
cc/ir/mod.rs Add new opcodes for int128 decomposition, add Instruction::call_with_abi, improve instruction display for conversions, add Function::create_reg_pseudo.
cc/ir/lower.rs Use Function::create_reg_pseudo helper when creating temporaries.
cc/ir/linearize.rs Remove direct rtlib emission/Float16 soft-float helpers and rely on mapping pass; route complex rtlib name selection through mapping utilities.
cc/builtins.rs Add is_builtin_id fast-path using keyword tags.
cc/arch/x86_64/regalloc.rs Adjust int128 pseudo tracking for new decomposition opcodes.
cc/arch/x86_64/mod.rs Expose x86-64 mapping module.
cc/arch/x86_64/mapping.rs New x86-64 mapper: expand int128 ops, map int128 div/mod + int128↔float, and expand Float16 soft-float sequences.
cc/arch/x86_64/expression.rs Remove backend-handled int128 ops now expanded by mapping pass; add support for new decomposition ops and fix hi-half constant handling.
cc/arch/x86_64/codegen.rs Dispatch new decomposition opcodes (Lo64/Hi64/Pair64/AddC/AdcC/SubC/SbcC/UMulHi).
cc/arch/mod.rs Export the new shared mapping module.
cc/arch/codegen.rs Fix hi-half emission of 128-bit immediates to avoid sign-extension artifacts.
cc/arch/aarch64/mod.rs Expose aarch64 mapping module.
cc/arch/aarch64/mapping.rs New aarch64 mapper: expand int128 ops, map int128 div/mod + int128↔float, and lower long double ops via rtlib on Linux.
cc/arch/aarch64/lir.rs Remove AArch64 LIR ops that were used by backend int128 handling (now mapping-expanded).
cc/arch/aarch64/expression.rs Limit backend int128 handling to shifts + truncation-from-128; add support for new decomposition ops.
cc/arch/aarch64/codegen.rs Dispatch new decomposition opcodes (Lo64/Hi64/Pair64/AddC/AdcC/SubC/SbcC/UMulHi).

jgarzik and others added 5 commits March 23, 2026 14:23
Use `_` as the name for keyword entries that are only needed for interning
and tag-based membership (nullability qualifiers, attribute names, fortified
builtins, etc.). The define_ids! macro skips pub const emission for `_`
entries, eliminating 71 dead_code warnings without #[allow(dead_code)].

Also fix redundant_closure clippy lint in is_nullability_qualifier call,
and remove unused tags() function.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keyword ID determinism is a correctness requirement — wrong IDs cause
silent misparsing in release builds. Use unconditional assert_eq!.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract first_tok early and return false if args[0] is empty, avoiding
unwrap() panic on __has_builtin()/__has_attribute() with no tokens.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…sync

Cross-checks that every entry in the SUPPORTED_BUILTINS string list is
pre-interned in kw.rs with the BUILTIN tag, preventing the two sources
from silently diverging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
emit_float_to_float() only recognized Float↔Double conversions in its
needs_convert check, causing Float16↔Float/Double conversions to emit
fmov (bit-copy) instead of fcvt (type conversion). The float16 bit
pattern was zero-extended to 32 bits rather than properly converted,
producing wrong values for all float16 arithmetic on aarch64.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jgarzik jgarzik merged commit aecc388 into main Mar 23, 2026
6 checks passed
@jgarzik jgarzik deleted the updates branch March 23, 2026 14:59
