Identify issues when 'char' is used as a numeric type. #5779

@hjmjohnson

Description

Key fact: In C/C++, the signedness of char is implementation-defined. Do not assume it.
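A minimal sketch of how code can query the signedness of char instead of assuming it. The names kCharIsSigned and DescribeCharSignedness are hypothetical helpers for illustration, not ITK API.

```cpp
#include <climits>
#include <type_traits>

// True when plain char behaves as a signed type on this toolchain.
constexpr bool kCharIsSigned = std::is_signed<char>::value;

// The same fact seen through <climits>: CHAR_MIN is negative only
// when char is signed.
static_assert(kCharIsSigned == (CHAR_MIN < 0),
              "type trait and CHAR_MIN must agree on char signedness");

// Hypothetical helper: a short description suitable for a build log.
constexpr const char * DescribeCharSignedness()
{
  return kCharIsSigned ? "signed" : "unsigned";
}
```

Both checks are resolved at compile time, so they add no runtime cost and can be placed in any header that depends on the answer.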

Platform Summary

| Platform / Toolchain | char signedness | Notes |
|---|---|---|
| x86 / x86-64 (Linux, GCC/Clang) | Signed | De-facto standard on desktop/server Linux |
| x86 / x86-64 (Windows, MSVC) | Signed | Consistent across MSVC versions |
| macOS (Intel, Clang) | Signed | Historical BSD behavior |
| macOS (Apple Silicon / ARM64) | Signed | Apple enforces signed char |
| ARM (Linux, GCC/Clang) | Unsigned (often) | Common on embedded and SBCs |
| ARM bare-metal / RTOS | Unsigned (usually) | Toolchain-dependent |
| PowerPC | Unsigned (typically) | Historically big-endian |
| SPARC | Signed (typically) | Rare today |
| MIPS | Unsigned (typically) | Embedded-heavy |

Steps to Reproduce

Build on a platform where 'char' is unsigned.

Expected behavior

Numeric data types behave consistently across platforms.

Using 'char' as a numeric type is dangerous because its signedness varies by platform.

Actual behavior

Templated itk::Image instantiations (and other locations in the codebase) use "unsigned char" where an unsigned 8-bit numeric type is intended, and "char" where a signed 8-bit numeric type is intended. The second assumption fails on platforms where char is unsigned.
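The hazard can be sketched by widening the same stored byte through each of the three 8-bit character types. The function names are hypothetical, and the negative results assume a two's-complement target (guaranteed from C++20, implementation-defined but universal in practice before that).

```cpp
#include <type_traits>

// Reinterpret one stored byte as plain char, then widen to int.
// The result depends on the platform's choice of char signedness.
constexpr int WidenViaChar(unsigned char raw)
{
  return static_cast<char>(raw);
}

// Same byte through signed char: always sign-extends.
constexpr int WidenViaSignedChar(unsigned char raw)
{
  return static_cast<signed char>(raw);
}

// Same byte through unsigned char: always zero-extends.
constexpr int WidenViaUnsignedChar(unsigned char raw)
{
  return raw;
}

// The explicit types give one answer everywhere.
static_assert(WidenViaSignedChar(200) == -56, "signed char sign-extends");
static_assert(WidenViaUnsignedChar(200) == 200, "unsigned char zero-extends");

// Plain char gives whichever answer the platform picked.
static_assert(WidenViaChar(200) == (std::is_signed<char>::value ? -56 : 200),
              "char follows the platform's signedness");
```

A pixel value of 200 stored in a char image therefore reads back as -56 on x86 Linux and as 200 on ARM Linux, which is the kind of inconsistency described here.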

Reproducibility

Building on Linux ARM-based computers produced behavior inconsistent with that seen on most other platforms.

Versions

All of ITK prior to addressing this issue.

Environment

The Linux ARM CI build in #5137 exposed a concerning inconsistency when 'char' is used as a numeric type.

The Platform Summary above lists the default signedness of the 'char' type on common platforms.

Compiler Controls

  • GCC / Clang
    • -fsigned-char
    • -funsigned-char
  • MSVC
    • char is signed by default; the /J option makes it unsigned
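The effect of these controls is visible to the preprocessor. The sketch below assumes the predefined macros __CHAR_UNSIGNED__ (GCC and Clang) and _CHAR_UNSIGNED (MSVC); other compilers may not define either, so the cross-check is an assumption that holds only for those three toolchains. The constant name is hypothetical.

```cpp
#include <type_traits>

// GCC and Clang define __CHAR_UNSIGNED__ when char is unsigned, whether
// by target default or because -funsigned-char was passed.
// MSVC defines _CHAR_UNSIGNED when char has been made unsigned.
#if defined(__CHAR_UNSIGNED__) || defined(_CHAR_UNSIGNED)
constexpr bool kCompilerReportsUnsignedChar = true;
#else
constexpr bool kCompilerReportsUnsignedChar = false;
#endif

// Cross-check the macro against the type system (assumes one of the
// three compilers named above).
static_assert(kCompilerReportsUnsignedChar == std::is_unsigned<char>::value,
              "predefined macro and type trait disagree on char signedness");
```

Compiling the same file once with -fsigned-char and once with -funsigned-char under GCC or Clang should flip the constant, which makes this a cheap way to confirm which setting a given build actually used.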

Best Practices

  • Never use char for numeric data.
  • Prefer:
    • signed char / int8_t for signed bytes
    • unsigned char / uint8_t for raw bytes
  • HDF5 conventions:
    • Disk: H5T_STD_U8LE or H5T_STD_I8LE
    • Memory: H5T_NATIVE_UINT8 or H5T_NATIVE_INT8
  • Treat char as text-only or legacy interop.
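One way to enforce these practices at compile time is sketched below. It relies on the fact that char, signed char, and unsigned char are three distinct types in C++, so plain char can be detected and rejected. The trait kHasExplicitSignedness and the struct NumericBuffer are hypothetical illustrations, not ITK API.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Plain char is its own type, distinct from both explicit variants,
// even though it shares a representation with one of them.
static_assert(!std::is_same<char, signed char>::value, "distinct types");
static_assert(!std::is_same<char, unsigned char>::value, "distinct types");

// Hypothetical trait: true for integral types other than plain char.
template <typename T>
constexpr bool kHasExplicitSignedness =
  std::is_integral<T>::value && !std::is_same<T, char>::value;

// Hypothetical container that refuses plain char as a numeric element.
template <typename TElement, std::size_t VSize>
struct NumericBuffer
{
  static_assert(kHasExplicitSignedness<TElement>,
                "use signed char / int8_t or unsigned char / uint8_t");
  TElement data[VSize];
};

// Explicit types are accepted.
using SignedBytes = NumericBuffer<std::int8_t, 4>;
using RawBytes = NumericBuffer<std::uint8_t, 4>;
```

Declaring a variable of type `NumericBuffer<char, 4>` would fail to compile with the message above, turning a silent portability problem into a build error.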

Bottom Line

If portability matters, char is a liability.
Make signedness explicit at the type level.

Metadata

Labels

type:Compiler (Compiler support or related warnings)
