Better support for string conversions #86

Open
konard wants to merge 3 commits into main from issue-76-2ce7e576

konard commented Sep 13, 2025

Summary

This PR implements better support for string conversions as requested in issue #76. The implementation adds specialized converters for UTF string types while maintaining backward compatibility.

✨ Features Added

  • Bidirectional conversion between std::u8string and std::string

    • Direct memory reinterpretation for efficient conversion
    • Assumes UTF-8 encoding for std::string (as per issue requirements)
  • Chain conversions for UTF-16 and UTF-32 strings

    • std::u16string → std::u8string → std::string
    • std::u32string → std::u8string → std::string
    • Proper UTF-8 encoding for multi-byte characters
  • Comprehensive UTF-8 encoding support

    • Handles 1-byte ASCII characters (0x00-0x7F)
    • Handles 2-byte characters (0x80-0x7FF)
    • Handles 3-byte characters (0x800-0xFFFF)
    • Handles 4-byte characters (0x10000-0x10FFFF) including emojis

🧪 Test Coverage

Added comprehensive test cases covering:

  • Basic ASCII string conversions
  • Unicode character conversions (accented characters)
  • Emoji conversions (4-byte UTF-8)
  • Bidirectional conversion verification
  • Chain conversion validation

All tests pass successfully with the new implementation.

🔧 Implementation Details

  • Template specializations: Following the existing code pattern and avoiding large function modifications as suggested in the issue comments
  • Memory efficient: Direct reinterpretation for std::string ↔ std::u8string conversions
  • Standards compliant: Proper UTF-8 encoding implementation for multi-byte sequences
  • Chain pattern: Reuses conversion logic for maintainability

📋 Supported Conversions

| Source | Target | Method |
| --- | --- | --- |
| std::u8string | std::string | Direct memory reinterpretation |
| std::string | std::u8string | Direct memory reinterpretation |
| std::u16string | std::u8string | UTF-8 encoding |
| std::u32string | std::u8string | UTF-8 encoding |
| std::u16string | std::string | Chain: u16 → u8 → string |
| std::u32string | std::string | Chain: u32 → u8 → string |

⚠️ Requirements

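As specified in the issue, these conversions are correct only when the bytes held in std::string are UTF-8 encoded, for example when the compiler's execution character set is UTF-8. No transcoding is performed between std::string and std::u8string.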
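
One way to probe this requirement at compile time is sketched below. The function name is hypothetical and the check is not part of the PR. It only tells you how narrow string literals are encoded, not what encoding run-time data in a std::string uses.

```cpp
// Hypothetical probe: true when the narrow execution character set encodes
// U+00E9 as the two UTF-8 bytes C3 A9 (plus the terminating null).
constexpr bool narrow_literals_are_utf8() {
    constexpr char probe[] = "\u00E9";
    return sizeof(probe) == 3 &&
           static_cast<unsigned char>(probe[0]) == 0xC3 &&
           static_cast<unsigned char>(probe[1]) == 0xA9;
}
```

A project could guard the converters with `static_assert(narrow_literals_are_utf8());`. With MSVC this typically requires building with the `/utf-8` option.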


Fixes #76

🤖 Generated with Claude Code

Adding CLAUDE.md with task information for AI processing.
This file will be removed when the task is complete.

Issue: #76
@konard konard self-assigned this Sep 13, 2025
Add specialized converters for UTF string types as requested in issue #76:
- std::u8string <-> std::string bidirectional conversion
- std::u16string -> std::u8string -> std::string conversion chain
- std::u32string -> std::u8string -> std::string conversion chain
- Proper UTF-8 encoding for multi-byte characters
- Comprehensive test coverage for all conversion scenarios

The implementation follows the existing code style and uses template
specializations to keep the functions focused and maintainable.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@konard konard changed the title [WIP] Better support for string convertions Better support for string conversions Sep 13, 2025
@konard konard marked this pull request as ready for review September 13, 2025 02:37