[browser][CoreCLR] GlobalizationNative_* P/Invokes are not trimmed under InvariantGlobalization #129242

@pavelsavara

Related #129243

Summary

When building a Browser WASM app with the CoreCLR runtime flavor and
InvariantGlobalization=true, the managed-to-native P/Invoke table generator
(ManagedToNativeGenerator / PInvokeTableGenerator) still emits extern "C"
references to every GlobalizationNative_* entrypoint declared in
System.Private.CoreLib. Because the System.Globalization.Invariant feature
switch does not cause these [DllImport("libSystem.Globalization.Native")]
declarations to be trimmed away, the generated callhelpers-pinvoke.cpp
references native symbols that should not need to exist at all in a truly
invariant build.

Today this is masked at link time only because we link the ICU/globalization
archives unconditionally (see the workaround in
src/mono/browser/build/BrowserWasmApp.CoreCLR.targets). The underlying problem
is that the invariant feature switch does not let the trimmer remove the ICU
interop surface, so:

  • The generated P/Invoke table is larger than necessary in invariant mode.
  • We are forced to link libSystem.Globalization.Native.a + libicuuc.a +
    libicui18n.a + libicudata.a even when the app opted into invariant
    globalization, inflating the linked dotnet.native.wasm and pulling in ICU
    data/code that should be eliminable.

This is tracked as a follow-up to the link-time fix; the link-time change keeps
builds working, but the trimming gap remains.

Affected area

  • src/tasks/WasmAppBuilder/coreclr/ManagedToNativeGenerator.cs
  • src/tasks/WasmAppBuilder/coreclr/PInvokeTableGenerator.cs
  • src/libraries/System.Private.CoreLib/src/ILLink/ILLink.Substitutions.Shared.xml
  • src/libraries/System.Private.CoreLib/src/System/Globalization/GlobalizationMode.cs
  • src/mono/browser/build/BrowserWasmApp.CoreCLR.targets (current link-time workaround)

Root cause analysis

There are two layered reasons the DllImports survive:

1. The relink scans the un-trimmed runtime-pack CoreLib

The CoreCLR relink path resolves System.Private.CoreLib directly from the
runtime pack's native directory:

<!-- BrowserWasmApp.CoreCLR.targets -->
<_CoreLibPath Condition="'$(_HasCoreLib)' != 'true'">$(MicrosoftNetCoreAppRuntimePackRidNativeDir)System.Private.CoreLib.dll</_CoreLibPath>

That assembly is the fully built, never-trimmed CoreLib shipped in the pack.
ManagedToNativeGenerator.ScanAssembly reads its IL metadata, so every
[DllImport("libSystem.Globalization.Native")] GlobalizationNative_* token is
present regardless of the app's InvariantGlobalization setting. A plain
dotnet build relink (as used by Wasm.Build.Tests IcuTests) does not run
ILLink over CoreLib at all.

2. The feature switch only constant-folds; it does not strip the ICU interop

The System.Globalization.Invariant switch is wired as a body substitution:

<!-- ILLink.Substitutions.Shared.xml -->
<type fullname="System.Globalization.GlobalizationMode">
  <method signature="System.Boolean get_Invariant()" body="stub" value="true"
          feature="System.Globalization.Invariant" featurevalue="true" />
</type>

This rewrites GlobalizationMode.get_Invariant() to return true and lets the
trimmer fold if (GlobalizationMode.Invariant) … guards. The
Interop.Globalization.GlobalizationNative_* extern methods are only removed if
every caller becomes statically unreachable, which does not happen:

  • Many ICU code paths (sort handles, casing, calendars, IDN, normalization) are
    reached through CompareInfo / TextInfo / CalendarData instances whose
    call graphs the trimmer cannot prove dead from a single folded bool.
  • Some entrypoints are reached via virtual dispatch or kept alive by other
    features.
  • The substitution's documented purpose is to let the GlobalizationMode.Settings
    nested class (the AppContext read + ICU load trigger) be trimmed — not to
    strip the ICU interop layer.

Net effect: even in a fully trimmed publish, the GlobalizationNative_*
DllImports generally remain reachable, so PInvokeTableGenerator emits them and
the linker needs the defining archives.
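
To make the reachability argument concrete, here is a toy model of it, not the trimmer's actual algorithm. Every node name except GlobalizationNative_ChangeCase is hypothetical, and the graph does not describe the real CoreLib call graph. It only shows that folding one guard removes the externs behind that guard, while any extern with a second, unguarded caller stays reachable.

```python
from collections import deque


def reachable(graph, roots):
    """Return every node reachable from roots by following call edges."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for callee in graph.get(node, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def fold_guards(graph, dead_edges):
    """Drop the call edges that become dead once the guard is a constant."""
    return {
        caller: [c for c in callees if (caller, c) not in dead_edges]
        for caller, callees in graph.items()
    }


def surviving_externs(graph, roots):
    """List the GlobalizationNative_* nodes still reachable from roots."""
    return sorted(
        n for n in reachable(graph, roots) if n.startswith("GlobalizationNative_")
    )


# Hypothetical call graph; this is an assumption made for illustration only.
CALL_GRAPH = {
    "App.Main": ["GuardedCaller", "UnguardedCaller"],
    # Models: if (GlobalizationMode.Invariant) InvariantPath() else IcuPath()
    "GuardedCaller": ["InvariantPath", "IcuPath"],
    "IcuPath": ["GlobalizationNative_GuardedOnly"],
    # Models a caller reached without a foldable guard (e.g. through an instance).
    "UnguardedCaller": ["GlobalizationNative_ChangeCase"],
}

# The only edge provably dead when get_Invariant() is stubbed to return true.
DEAD_WHEN_INVARIANT = {("GuardedCaller", "IcuPath")}

folded = fold_guards(CALL_GRAPH, DEAD_WHEN_INVARIANT)
print(surviving_externs(CALL_GRAPH, ["App.Main"]))
print(surviving_externs(folded, ["App.Main"]))
```

In this model the second list is shorter than the first but not empty, which mirrors the situation described above: the P/Invoke table still needs the surviving entrypoint, so the defining archive must still be linked.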

Steps to reproduce

The repro below uses the existing Wasm.Build.Tests IcuTests harness, which is
the most direct way to exercise the CoreCLR invariant relink.

  1. Build the Browser WASM CoreCLR runtime + libs:

    .\build.cmd -bl -os browser -subset clr+libs+host -c Debug /p:RuntimeFlavor=CoreCLR

  2. Run the invariant IcuTests scenario (this performs a native relink with
    InvariantGlobalization=true):

    $env:DOTNET_CLI_HOME="D:\runtime\artifacts\tmp\cli-home-icu"
    .\dotnet.cmd build .\src\mono\wasm\Wasm.Build.Tests\Wasm.Build.Tests.csproj `
      -c Debug -t:Test -p:TargetOS=browser -p:TargetArchitecture=wasm `
      -p:RuntimeFlavor=CoreCLR -p:TestUsingWorkloads=false `
      "-p:XUnitMethodName=Wasm.Build.Tests.IcuTests.FullIcuFromRuntimePackWithInvariant" `
      -bl:wbt-icu-invariant.binlog

  3. Inspect the generated P/Invoke table for one of the invariant build dirs
    (icu_Release_True_*):

    Select-String -Path (Get-ChildItem -Recurse -Filter callhelpers-pinvoke.cpp |
      Where-Object FullName -match 'icu_Release_True_' | Select-Object -First 1).FullName `
      -Pattern 'GlobalizationNative_'
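
For a count of distinct entrypoints rather than raw matching lines, the inspection in step 3 could also be scripted. The following is a minimal sketch; the helper names are hypothetical, and it assumes nothing about the layout of callhelpers-pinvoke.cpp beyond the symbol names appearing as plain identifiers.

```python
import re
from pathlib import Path

# Matches whole identifiers that start with the GlobalizationNative_ prefix.
_SYMBOL = re.compile(r"\bGlobalizationNative_\w+")


def globalization_symbols(source_text):
    """Return the distinct GlobalizationNative_* identifiers, sorted."""
    return sorted(set(_SYMBOL.findall(source_text)))


def globalization_symbols_in_file(path):
    """Read a generated source file and list the identifiers it references."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return globalization_symbols(text)
```

Calling globalization_symbols_in_file on the callhelpers-pinvoke.cpp found in an icu_Release_True_* directory returns an empty list only if the table has no GlobalizationNative_* references.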

Expected

In an invariant-globalization build, the generated P/Invoke table should contain
no (or only a minimal, justified set of) GlobalizationNative_* entries, and
the linker should not require libicuuc.a / libicui18n.a / libicudata.a.

Actual

The generated callhelpers-pinvoke.cpp contains the full set of
GlobalizationNative_* externs. The build succeeds only because the targets file
links the ICU/globalization archives unconditionally. If those archives were
linked only when InvariantGlobalization != 'true' (the original behavior), the
relink would fail with:

wasm-ld: error: callhelpers-pinvoke.o: undefined symbol: GlobalizationNative_ChangeCase
... (and every other GlobalizationNative_* symbol)

Impact

  • Code size: invariant WASM CoreCLR apps still link ICU code + data, defeating
    one of the main benefits of invariant globalization (smaller download).
  • Correctness of the abstraction: the feature switch claims to remove
    globalization, but the native ICU surface is retained.
  • Workaround coupling: the link step is forced to always include ICU archives
    to keep the unconditional P/Invoke table satisfied.

Possible directions for the fix (to be evaluated later)

  1. Make the generator feature-aware: teach ManagedToNativeGenerator /
    PInvokeTableGenerator to skip the libSystem.Globalization.Native module
    (and emit unresolved/throwing stubs) when InvariantGlobalization=true,
    matching the runtime expectation that those entrypoints are never called.
  2. Improve trimming of the ICU interop: extend the substitution/feature
    wiring so that, in invariant mode, the Interop.Globalization.* externs and
    their reachable callers fold away, letting the trimmer remove them from a
    trimmed publish (does not help the un-trimmed runtime-pack relink path).
  3. Scan a representative (trimmed/feature-aware) CoreLib during relink instead
    of the raw runtime-pack copy, so generation reflects the app's feature switches.

Note: option (1) is the most targeted for the relink path, since that path does
not run ILLink over CoreLib and therefore cannot rely on trimming alone.
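
As an illustration of what option (1) amounts to, the sketch below filters collected P/Invoke entries by module when the invariant flag is set. It is written in Python for brevity and is not the generator's code: the PInvoke record, the select_pinvokes helper and the sample entries are all hypothetical, and the real ManagedToNativeGenerator / PInvokeTableGenerator types may be structured quite differently.

```python
from dataclasses import dataclass

# Module name taken from the DllImport declarations discussed in this issue.
GLOBALIZATION_MODULE = "libSystem.Globalization.Native"


@dataclass(frozen=True)
class PInvoke:
    """Hypothetical record for one scanned DllImport."""
    module: str
    entry_point: str


def select_pinvokes(pinvokes, invariant_globalization):
    """Split entries into (kept, skipped) for table generation.

    When invariant_globalization is true, every entry from the globalization
    module is skipped; a real generator would have to decide whether to omit
    it or to emit a throwing stub in its place.
    """
    kept, skipped = [], []
    for p in pinvokes:
        if invariant_globalization and p.module == GLOBALIZATION_MODULE:
            skipped.append(p)
        else:
            kept.append(p)
    return kept, skipped


# Illustrative input only; the second entry is a made-up non-ICU example.
SCANNED = [
    PInvoke(GLOBALIZATION_MODULE, "GlobalizationNative_ChangeCase"),
    PInvoke("libExample.Native", "ExampleNative_DoWork"),
]

kept, skipped = select_pinvokes(SCANNED, invariant_globalization=True)
print([p.entry_point for p in kept])
print([p.entry_point for p in skipped])
```

Filtering by module rather than by entrypoint name keeps the rule independent of which GlobalizationNative_* functions CoreLib happens to declare, at the cost of assuming that nothing in that module is needed in invariant mode.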

Related

  • Introduced by the CoreCLR in-tree relink work (PR #126946, "[browser] CoreCLR
    in-tree relink", commit cbb1e13b082), which added the unconditional
    _WasmPInvokeModules Include="libSystem.Globalization.Native" alongside the
    (then) conditional archive link.
  • Surfaced by PR #129098 ("[WASI][Mono] Enable Mono AOT on WASI in runtime-wasm
    CI leg"), which enabled the optional CoreCLR IcuTests leg
    (excludeOptional: false).
  • Current link-time workaround: link ICU/globalization archives unconditionally
    in src/mono/browser/build/BrowserWasmApp.CoreCLR.targets (mirrors Mono).

Metadata

Labels

  • arch-wasm: WebAssembly architecture
  • area-System.Globalization
  • os-browser: Browser variant of arch-wasm
  • size-reduction: Issues impacting final app size primary for size sensitive workloads
