|
| 1 | +# RBS File Encoding |
| 2 | + |
| 3 | +## Best Practice |
| 4 | + |
| 5 | +**Use UTF-8** for both file encoding and your system locale. |
| 6 | + |
| 7 | +## Supported Encodings |
| 8 | + |
| 9 | +The RBS parser supports ASCII-compatible encodings, similar to Ruby's script encoding support. |
| 10 | + |
| 11 | +**Examples**: UTF-8, US-ASCII, Shift JIS, EUC-JP, ... |
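Whether a given encoding is ASCII-compatible can be checked from plain Ruby with `Encoding#ascii_compatible?`. The snippet below is a minimal sketch that does not involve the RBS parser at all; the `ascii_compatible_name?` helper is a hypothetical name introduced here for illustration only.

```ruby
# Hypothetical helper (not part of RBS): true when the named encoding
# is ASCII-compatible according to Ruby itself.
def ascii_compatible_name?(name)
  Encoding.find(name).ascii_compatible?
end

# The first four are ASCII-compatible; the UTF-16/UTF-32 variants are not.
%w[UTF-8 US-ASCII Shift_JIS EUC-JP UTF-16LE UTF-32LE].each do |name|
  puts "#{name}: #{ascii_compatible_name?(name)}"
end
```

Encodings for which this check returns `false` are the ones to avoid for RBS source text.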
| 12 | + |
| 13 | +## Unicode Codepoint Symbols |
| 14 | + |
| 15 | +String literal types in RBS can contain Unicode codepoint escape sequences (`\uXXXX`). |
| 16 | + |
| 17 | +When the file encoding is UTF-8, the parser translates Unicode codepoint escape sequences into the characters they denote: |
| 18 | + |
| 19 | +```rbs |
| 20 | +# In UTF-8 encoded files |
| 21 | + |
| 22 | +type t = "\u0123" # Translated to the actual Unicode character ģ |
| 23 | +type s = "\u3042" # Translated to the actual Unicode character あ |
| 24 | +``` |
| 25 | + |
| 26 | +When the file encoding is not UTF-8, Unicode escape sequences are interpreted literally as the string `\uXXXX`: |
| 27 | + |
| 28 | +```rbs |
| 29 | +# In non-UTF-8 encoded files |
| 30 | + |
| 31 | +type t = "\u0123" # Remains as the literal string "\u0123" |
| 32 | +``` |
| 33 | + |
| 34 | +## Implementation |
| 35 | + |
| 36 | +The RBS gem currently does no file encoding handling of its own. It relies on Ruby's encoding handling, specifically `Encoding.default_external` and `Encoding.default_internal`. |
| 37 | + |
| 38 | +`Encoding.default_external` is the encoding Ruby assumes when it reads external resources like files. The Ruby interpreter sets it based on the locale. `Encoding.default_internal` is the encoding Ruby converts those external resources to. The default is `nil` (no conversion). |
| 39 | + |
| 40 | +When your locale is set to use `UTF-8` encoding, `default_external` is `Encoding::UTF_8`. So the RBS file content read from the disk will have UTF-8 encoding. |
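The effect of these two settings can be observed without RBS. The following is a minimal sketch in plain Ruby; `encoding_when_read` is a hypothetical helper name, and the temporary file merely stands in for an RBS file on disk.

```ruby
require "tempfile"

# Hypothetical helper: the encoding Ruby assigns to a file's content when it
# is read without an explicit encoding argument.
def encoding_when_read(path)
  File.read(path).encoding
end

Tempfile.create(["sample", ".rbs"]) do |file|
  file.binmode
  file.write("type t = \"\u3042\"\n") # the literal is UTF-8, so UTF-8 bytes are written
  file.flush

  # With default_internal left as nil, the content is tagged with default_external.
  puts "default_external: #{Encoding.default_external}"
  puts "default_internal: #{Encoding.default_internal.inspect}"
  puts "content encoding: #{encoding_when_read(file.path)}"
end
```

The printed values depend on the locale of the machine running the script, which is why a UTF-8 locale is the recommended setup.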
| 41 | + |
| 42 | +### Parsing non-UTF-8 RBS source text |
| 43 | + |
| 44 | +If you want to work with another encoding, ensure the source string has an ASCII-compatible encoding. |
| 45 | + |
| 46 | +```ruby |
| 47 | +source = '"日本語"' |
| 48 | +RBS::Parser.parse_type(source.encode(Encoding::EUC_JP)) # => Parses successfully |
| 49 | +RBS::Parser.parse_type(source.encode(Encoding::UTF_32)) # => Returns `nil` since UTF-32 is not ASCII-compatible |
| 50 | +``` |
| 51 | + |
| 52 | +### Specifying file encoding |
| 53 | + |
| 54 | +Currently, RBS doesn't support specifying file encoding directly. |
| 55 | + |
| 56 | +As a workaround, you can set `Encoding.default_external` to the desired encoding for the period in which the gem loads RBS files from storage. |
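One way to scope such a change is to switch `Encoding.default_external` only around the code that reads the files and restore it afterwards. This is a minimal sketch, not something RBS provides: `with_default_external` is a hypothetical helper, and `File.read` on a temporary file stands in for whatever call loads the RBS files.

```ruby
require "tempfile"

# Hypothetical helper: run the block with Encoding.default_external switched
# to `encoding`, then restore the previous value even if the block raises.
def with_default_external(encoding)
  previous = Encoding.default_external
  Encoding.default_external = encoding
  yield
ensure
  Encoding.default_external = previous if previous
end

Tempfile.create(["sample", ".rbs"]) do |file|
  file.binmode
  file.write("type t = \"\u3042\"\n".encode(Encoding::EUC_JP)) # EUC-JP bytes on disk
  file.flush

  content = with_default_external(Encoding::EUC_JP) { File.read(file.path) }
  puts content.encoding        # EUC-JP when Encoding.default_internal is nil
  puts content.valid_encoding?
end
```

`Encoding.default_external` is a process-wide setting, so a helper like this affects every thread while the block runs; Ruby may also print a warning about the assignment when run in verbose mode.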