Bidi CCSID Support via Extension Settings#2826

Open
yuvalnn wants to merge 28 commits into codefori:master from yuvalnn:bidi-support

@yuvalnn

@yuvalnn yuvalnn commented Aug 2, 2025

Changes

This PR introduces configurable automatic CCSID conversion for non-UTF compatible source members (including BiDi CCSIDs such as 424, 420, etc.).

Features

  • New extension settings under "Source Code":
  1. Automatic Conversion for non UTF compatible CCSIDs
  2. Source CCSID
  3. Target CCSID
  • Conversion logic only runs when enabled in the settings.
  • Encoding is handled via IFS-based copy to avoid corruption.
  • The implementation is cleanly isolated to avoid side effects.
  • Added full compatibility with "Source Dates" mode.

How to test this PR

  1. Open the extension settings → "Source Code" tab.
  2. Enable Automatic Conversion for non UTF compatible CCSIDs.
  3. Select a Source CCSID.
  4. Select a Target CCSID.
  5. Open a member encoded with the selected Source CCSID.
  6. Verify the content is displayed correctly and saved as expected.
  7. Open a member with a different CCSID and confirm no conversion occurs.
  8. Disable conversion and confirm default behavior.

Checklist

  • have tested my change
  • have created one or more test cases
  • updated relevant documentation
  • Remove any/all console.logs I added
  • have added myself to the contributors' list in CONTRIBUTING.md

yuvalnn added 4 commits July 25, 2025 19:38
- Added two new extension settings(index.ts):
  1. Enable Bidi conversion
  2. Target CCSID for Bidi conversion

- Modified IBMiContent to apply Bidi-aware conversion logic
  when enabled via user settings, including fallback and error handling.
Extends the Bidi handling to also apply when uploading source members back to the IBM i.
@worksofliam worksofliam self-requested a review August 11, 2025 14:42
@worksofliam
Member

@yuvalnn This looks like a good start, but we're going to need some granular tests in encoding.test.ts to make sure this works long term. Can you look into adding some? Let me know if you have any questions. Thanks for your work so far!

- Added BiDi test cases in src/api/tests/suites/encoding.test.ts
- Covers Hebrew (CCSID 424) and Arabic (CCSID 420)
- Includes helper convertToUTF8WithCCSID and edge cases
@yuvalnn
Author

yuvalnn commented Aug 16, 2025

@worksofliam I’ve added a new synchronous test suite in encoding.test.ts to cover the BiDi changes, thanks to the work from @Itsanexpriment.

The suite includes:
  • Valid/invalid conversions for CCSID 420 (Arabic) and 424 (Hebrew)
  • A centralized BidiContents test data object for easier extension
  • The helper convertToUTF8WithCCSID (similar to runCommandsWithCCSID)

We kept this in a separate suite to avoid interfering with existing async tests.
All tests are passing on my side.

@yuvalnn
Author

yuvalnn commented Sep 21, 2025

Hi @sebjulliand , could you please review this PR?

@sebjulliand sebjulliand self-requested a review October 1, 2025 14:07
@sebjulliand
Member

Hi @yuvalnn , I finally have time to look at this, sorry about the delay.
Quick question after looking at the code: I like that it's well isolated, but do you think we could simply automate detecting whether Bidi conversion is needed, based on the source file's CCSID?

Something like: if the target source file's CCSID is 420 or 424, then convert using the special Bidi method.
Could we use a hardcoded map to know which CCSID to use for the conversion, based on the source file CCSID (424 => 62211, 420 => 8612, etc.)?

What do you think?
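
To illustrate the mapping idea in the comment above, here is a minimal sketch. The names `BIDI_CCSID_MAP` and `resolveBidiCcsid` are hypothetical and do not come from the PR; only the two CCSID pairs mentioned in this thread are listed.

```typescript
// Hypothetical lookup table: source file CCSID -> CCSID used for the BiDi-aware conversion.
// Only the pairs quoted in the review comment are included.
const BIDI_CCSID_MAP: ReadonlyMap<number, number> = new Map<number, number>([
  [424, 62211], // Hebrew
  [420, 8612],  // Arabic
]);

// Returns the conversion CCSID for a source file CCSID,
// or undefined when the source CCSID has no entry (no BiDi handling).
function resolveBidiCcsid(sourceCcsid: number): number | undefined {
  return BIDI_CCSID_MAP.get(sourceCcsid);
}
```

A caller would treat `undefined` as "no special conversion needed", so non-BiDi source files follow the default path.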

@yuvalnn
Author

yuvalnn commented Oct 3, 2025

Hi @sebjulliand, thanks for the review.
That was actually my initial thought as well, but I wanted to avoid always checking the CCSID unless the user enabled it.
Thinking about it again, your approach is simpler and with no user involvement required, which seems better.
If needed, we can extend the CCSID map later.

I'll make the change and update.

@yuvalnn
Author

yuvalnn commented Oct 7, 2025

Hi @sebjulliand, after further checking, there are situations where automation is not enough.
Hebrew or Arabic with CCSID 62211/8612 usually works best, but not always.

  • If the file CCSID is 65535, we might need to look at an override
  • Support for more languages may be needed in the future (I just found out Persian is needed as well)
  • Doesn’t cover complex RTL/LTR cases.

By the way, other IBM open-source projects, like JTOpen, follow the same approach: the user explicitly enables Bidi and chooses the algorithm (similar to CCSID).
So I think it’s better to keep it like this.

Itsanexpriment and others added 2 commits October 19, 2025 14:12
* Convert to Utf8 with bidi compatible ccsid

* Convert from Utf8 to bidi ccsid when uploading member with source dates
@SanjulaGanepola SanjulaGanepola self-requested a review November 21, 2025 20:24
@sebjulliand
Member

Hello there (no, you haven't been forgotten... 😉). Something crossed my mind: shouldn't the isBidi test also include a test of the source file CCSID? The underlying question would be: what happens if you enable bidi support but work on a non-bidi source file at some point?

@yuvalnn yuvalnn temporarily deployed to testing_environment January 13, 2026 08:36 — with GitHub Actions Inactive
@sebjulliand
Member

@yuvalnn any comments on #2826 (comment) ?

@yuvalnn
Author

yuvalnn commented Feb 1, 2026

Hi @sebjulliand,
I checked this in more depth. For non-bidi source files (e.g. CCSID 37), things work fine with English-only content, but non-ASCII characters can get corrupted on upload. The current direction is to slightly adjust the UI so it's clearer that this involves CCSID conversion.

@worksofliam
Member

@sebjulliand @yuvalnn What is the status here? Should we do something?

}
}

async function convertToUTF8WithCCSID(connection: IBMi, text: string, baseCcsid: string, intermediateCcsid: string): Promise<string> {
Member

CCSIDs are numeric, so string should be number

Author

Handled internally. The value is normalized inside the conversion function, so the typing doesn't affect behavior.

let insertResult: CommandResult = { code: 0, stdout: '', stderr: '' };
if (isBidi) {
await connection.runSQL([
`@QSYS/CPY OBJ('${tempRmt}') TOOBJ('${tempRmt}') TOCCSID(${bidiCcsid}) DTAFMT(*TEXT) REPLACE(*YES);`,
Member

Above, we are always setting the CCSID tag of tempRmt to 1208. What does CPY actually do when TOCCSID is used? Is there a conversion, or does it just change the tag?

Author

CPY with TOCCSID performs actual conversion, not just tag change.


// BiDi handling if enabled
const isBidi = this.config.bidi;
const bidiCcsid = this.config.bidiCcsid;
Member

Type is missing in config

Author

Added according to the new structure

- Add helper function uploadWithoutBidiDownloadWithBidi to test non-BiDi files
- Add helper function testFullCycleWithBidi to test full upload/download cycle
- Add tests for CCSID 37 and 273 with BiDi CCSID enabled
- Fix copyToImport to use stmfCcsid variable instead of hardcoded 1208
- Add Tools.determineCcsidConversion() to check if conversion is needed
- Check source file CCSID before applying conversion in download/upload
- Only convert if source CCSID matches configured ccsidConvertFrom
- Update tests to use new config fields
- Add CCSID conversion to extendedContent and copyToImport
- Separate non-BiDi tests into dedicated suite for better filtering
- Add test case: file with CCSID 273 downloaded with BiDi settings 424→62211
- Verifies conversion is skipped when source CCSID doesn't match config
- Update uploadWithoutBidiDownloadWithBidi to support optional sourceCcsid parameter
@yuvalnn
Author

yuvalnn commented Feb 12, 2026

Hi @sebjulliand @worksofliam ,
Changes were made following the feedback. Before performing any conversion, we now verify that the source file actually matches the requested conversion criteria. If the source is not bidi, no conversion is applied, even when CCSID handling is enabled. Non-bidi sources remain unaffected.

The UI was updated to require explicit selection of both source and target CCSID.
Relevant tests were added.

@yuvalnn yuvalnn had a problem deploying to testing_environment February 22, 2026 13:48 — with GitHub Actions Failure
@sebjulliand sebjulliand added the enhancement New feature or request label Feb 23, 2026
@sebjulliand sebjulliand self-assigned this Feb 23, 2026
Member

@sebjulliand sebjulliand left a comment

Thanks for your work so far @yuvalnn .
There is a conflict to resolve and some changes to be made before carrying on.

Thanks in advance!

Comment on lines +19 to +21
ccsidConversionEnabled: boolean;
ccsidConvertFrom: string;
ccsidConvertTo: string;
Member

Default values for these three fields must be provided here (since they're missing, it's producing an error):
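
The suggested defaults did not survive in this capture. As a rough sketch only, assuming conversion is off by default and both CCSIDs start empty, the defaults could look like the following. The interface name, the constant, and the helper are hypothetical and are not the PR's code.

```typescript
// The three fields quoted in the review comment above.
interface CcsidConversionSettings {
  ccsidConversionEnabled: boolean;
  ccsidConvertFrom: string;
  ccsidConvertTo: string;
}

// Assumed defaults: conversion disabled, no source or target CCSID selected.
const DEFAULT_CCSID_CONVERSION: CcsidConversionSettings = {
  ccsidConversionEnabled: false,
  ccsidConvertFrom: "",
  ccsidConvertTo: "",
};

// Hypothetical helper: fills in any missing field from the defaults.
function withCcsidDefaults(partial: Partial<CcsidConversionSettings>): CcsidConversionSettings {
  return { ...DEFAULT_CCSID_CONVERSION, ...partial };
}
```

With defaults like these, an existing connection configuration that predates the new fields would load with conversion disabled.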

Comment on lines +198 to +224
/**
* Determines if CCSID conversion should be applied based on configuration and source file CCSID.
* @param sourceCcsid The CCSID of the source file
* @param config Connection configuration containing conversion settings
* @returns [requiresConversion, targetCcsid] - whether conversion is required and the target CCSID to use
*/
export function determineCcsidConversion(sourceCcsid: number, config: { ccsidConversionEnabled?: boolean, ccsidConvertFrom?: string, ccsidConvertTo?: string }): [boolean, number] {
// If conversion is not enabled, don't convert
if (!config.ccsidConversionEnabled) {
return [false, 0];
}

const configuredSourceCcsid = Number(config.ccsidConvertFrom) || 0;
const configuredTargetCcsid = Number(config.ccsidConvertTo) || 0;

// If no source or target CCSID configured, don't convert
if (configuredSourceCcsid === 0 || configuredTargetCcsid === 0) {
return [false, 0];
}

// Only convert if the source file CCSID matches the configured source CCSID
if (sourceCcsid === configuredSourceCcsid) {
return [true, configuredTargetCcsid];
}

return [false, 0];
}
Member

Return an object instead of an array here. An object directly gives a view of what is being returned by the function. An array doesn't and forces whoever uses the function to look at its description to figure out what's returned.

The return type should be:

{ requiresConversion: boolean, targetCcsid: number }
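
For illustration, the function quoted above could be reshaped along these lines. This is a sketch of the reviewer's suggestion, not the change that was actually committed; the two interface names are hypothetical, and the `export` keyword is omitted so the snippet stands alone.

```typescript
// Hypothetical names for the config subset and the returned shape.
interface CcsidConversionConfig {
  ccsidConversionEnabled?: boolean;
  ccsidConvertFrom?: string;
  ccsidConvertTo?: string;
}

interface CcsidConversionDecision {
  requiresConversion: boolean;
  targetCcsid: number;
}

function determineCcsidConversion(sourceCcsid: number, config: CcsidConversionConfig): CcsidConversionDecision {
  const noConversion: CcsidConversionDecision = { requiresConversion: false, targetCcsid: 0 };

  // If conversion is not enabled, don't convert
  if (!config.ccsidConversionEnabled) {
    return noConversion;
  }

  const configuredSourceCcsid = Number(config.ccsidConvertFrom) || 0;
  const configuredTargetCcsid = Number(config.ccsidConvertTo) || 0;

  // If no source or target CCSID is configured, don't convert
  if (configuredSourceCcsid === 0 || configuredTargetCcsid === 0) {
    return noConversion;
  }

  // Only convert if the source file CCSID matches the configured source CCSID
  return sourceCcsid === configuredSourceCcsid
    ? { requiresConversion: true, targetCcsid: configuredTargetCcsid }
    : noConversion;
}
```

Callers can then destructure by name, for example `const { requiresConversion, targetCcsid } = determineCcsidConversion(424, config);`, which matches the variable names used in the quoted upload code.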

Comment on lines +184 to +204
let insertResult: CommandResult = { code: 0, stdout: '', stderr: '' };
if (requiresConversion) {
await connection.runSQL([
`@QSYS/CPY OBJ('${tempRmt}') TOOBJ('${tempRmt}') TOCCSID(${targetCcsid}) DTAFMT(*TEXT) REPLACE(*YES)`
].join("\n")).catch(e => {
insertResult.code = -1;
insertResult.stderr = String(e);
});

if (insertResult.code === 0) {
insertResult = await connection.runCommand({
command: `QSYS/RUNSQLSTM SRCSTMF('${tempRmt}') COMMIT(*NONE) NAMING(*SQL)`,
noLibList: true
});
}
} else {
insertResult = await connection.runCommand({
command: `QSYS/RUNSQLSTM SRCSTMF('${tempRmt}') COMMIT(*NONE) NAMING(*SQL)`,
noLibList: true
});
}
Member

The RUNSQLSTM call does not need to be duplicated here. We can safely test insertResult to decide whether to run it.

Suggested change (replaces the block quoted above):
if (requiresConversion) {
await connection.runSQL([
`@QSYS/CPY OBJ('${tempRmt}') TOOBJ('${tempRmt}') TOCCSID(${targetCcsid}) DTAFMT(*TEXT) REPLACE(*YES)`
].join("\n")).catch(e => {
insertResult.code = -1;
insertResult.stderr = String(e);
});
}
if(insertResult.code === 0){
insertResult = await connection.runCommand({
command: `QSYS/RUNSQLSTM SRCSTMF('${tempRmt}') COMMIT(*NONE) NAMING(*SQL)`,
noLibList: true
});
}

Comment on lines +29 to +1215
const CCSID_Options:SelectItem[] = [
{
"value": "37",
"description": "37 - US, Canada, Netherlands, Portugal, Brazil, New Zealand, Australia",
"text": ""
},
{
"value": "256",
"description": "256 - Netherlands",
"text": ""
},
{
"value": "273",
"description": "273 - Austria, Germany",
"text": ""
},
{
"value": "277",
"description": "277 - Denmark, Norway",
"text": ""
},
{
"value": "278",
"description": "278 - Finland, Sweden",
"text": ""
},
{
"value": "280",
"description": "280 - Italy",
"text": ""
},
{
"value": "284",
"description": "284 - Spanish, Latin America",
"text": ""
},
{
"value": "285",
"description": "285 - United Kingdom",
"text": ""
},
{
"value": "290",
"description": "290 - Japan Katakana",
"text": ""
},
{
"value": "297",
"description": "297 - France",
"text": ""
},
{
"value": "300",
"description": "300 - Japan English",
"text": ""
},
{
"value": "301",
"description": "301 - Japanese PC Data",
"text": ""
},
{
"value": "367",
"description": "367 - ANSI X3.4 ASCII standard; USA",
"text": ""
},
{
"value": "420",
"description": "420 - Arabic-speaking countries",
"text": ""
},
{
"value": "423",
"description": "423 - Greece",
"text": ""
},
{
"value": "424",
"description": "424 - Hebrew",
"text": ""
},
{
"value": "425",
"description": "425 - Arabic-speaking countries",
"text": ""
},
{
"value": "437",
"description": "437 - PC Data; PC Base; USA",
"text": ""
},
{
"value": "500",
"description": "500 - Belgium, Canada, Switzerland, International Latin-1",
"text": ""
},
{
"value": "720",
"description": "720 - MS-DOS Arabic",
"text": ""
},
{
"value": "737",
"description": "737 - MS-DOS Greek PC-Data",
"text": ""
},
{
"value": "775",
"description": "775 - MS-DOS Baltic PC-Data",
"text": ""
},
{
"value": "813",
"description": "813 - ISO 8859-7; Greek/Latin",
"text": ""
},
{
"value": "819",
"description": "819 - ISO 8859-1; Latin Alphabet No. 1",
"text": ""
},
{
"value": "833",
"description": "833 - Korea (extended range)",
"text": ""
},
{
"value": "834",
"description": "834 - Korea host double byte (including 1880 UDC)",
"text": ""
},
{
"value": "835",
"description": "835 - Traditional Chinese host double byte (including 6204 UDC)",
"text": ""
},
{
"value": "836",
"description": "836 - Simplified Chinese (extended range)",
"text": ""
},
{
"value": "837",
"description": "837 - Simplified Chinese",
"text": ""
},
{
"value": "838",
"description": "838 - Thailand (extended range)",
"text": ""
},
{
"value": "850",
"description": "850 - PC Data; MLP 222 Latin Alphabet 1",
"text": ""
},
{
"value": "851",
"description": "851 - PC Data; Greek",
"text": ""
},
{
"value": "852",
"description": "852 - PC Data; Latin-2 Multilingual",
"text": ""
},
{
"value": "855",
"description": "855 - PC Data; ROECE Cyrillic",
"text": ""
},
{
"value": "857",
"description": "857 - PC Data; Turkey Latin #5",
"text": ""
},
{
"value": "858",
"description": "858 - PC Data: MLP 222; Latin Alphabet Number 1 w/euro; Latin-1 Countries",
"text": ""
},
{
"value": "860",
"description": "860 - PC Data; Portugal",
"text": ""
},
{
"value": "861",
"description": "861 - PC Data; Iceland",
"text": ""
},
{
"value": "862",
"description": "862 - PC Data; Hebrew",
"text": ""
},
{
"value": "863",
"description": "863 - PC Data; Canada",
"text": ""
},
{
"value": "864",
"description": "864 - PC Data; Arabic",
"text": ""
},
{
"value": "865",
"description": "865 - PC Data; Denmark, Norway",
"text": ""
},
{
"value": "866",
"description": "866 - PC Data; Cyrillic #2 - Personal Computer",
"text": ""
},
{
"value": "868",
"description": "868 - PC Data: Urdu",
"text": ""
},
{
"value": "869",
"description": "869 - PC Data; Greek",
"text": ""
},
{
"value": "870",
"description": "870 - Latin-2 Multilingual",
"text": ""
},
{
"value": "871",
"description": "871 - Iceland",
"text": ""
},
{
"value": "874",
"description": "874 - Thai PC Data",
"text": ""
},
{
"value": "875",
"description": "875 - Greece",
"text": ""
},
{
"value": "878",
"description": "878 - Russian Internet KOI8-R Cyrillic",
"text": ""
},
{
"value": "880",
"description": "880 - Cyrillic Multilingual",
"text": ""
},
{
"value": "891",
"description": "891 - Korean PC Data (non-extended)",
"text": ""
},
{
"value": "897",
"description": "897 - Japanese PC Data (non-extended)",
"text": ""
},
{
"value": "903",
"description": "903 - Simplified Chinese PC Data (non-extended)",
"text": ""
},
{
"value": "904",
"description": "904 - Traditional Chinese PC Data",
"text": ""
},
{
"value": "905",
"description": "905 - Turkey Latin-3",
"text": ""
},
{
"value": "912",
"description": "912 - ISO 8859-2; ROECE Latin-2 Multilingual",
"text": ""
},
{
"value": "914",
"description": "914 - Latin 4 - ISO 8859-4",
"text": ""
},
{
"value": "915",
"description": "915 - ISO 8859-5; Cyrillic; 8-bit ISO",
"text": ""
},
{
"value": "916",
"description": "916 - ISO 8859-8; Hebrew",
"text": ""
},
{
"value": "918",
"description": "918 - Urdu EBCDIC",
"text": ""
},
{
"value": "920",
"description": "920 - ISO 8859-9; Latin 5",
"text": ""
},
{
"value": "921",
"description": "921 - Baltic, 8-bit (ISO 8859-13)",
"text": ""
},
{
"value": "922",
"description": "922 - Estonia, 8-bit (ISO)",
"text": ""
},
{
"value": "923",
"description": "923 - ISO 8859-15: Latin Alphabet with euro",
"text": ""
},
{
"value": "924",
"description": "924 - Latin 9 EBCDIC",
"text": ""
},
{
"value": "926",
"description": "926 - Korean PC Data DBCS, UDC 1880",
"text": ""
},
{
"value": "927",
"description": "927 - Traditional Chinese PC Data DBCS, UDC 6204",
"text": ""
},
{
"value": "928",
"description": "928 - Simplified Chinese PC Data DBCS, UDC 1880",
"text": ""
},
{
"value": "930",
"description": "930 - Japan Katakana (extended range) 4370 UDC (User Defined Characters)",
"text": ""
},
{
"value": "932",
"description": "932 - Japan PC Data Mixed",
"text": ""
},
{
"value": "933",
"description": "933 - Korea (extended range), 1880 UDC",
"text": ""
},
{
"value": "934",
"description": "934 - Korean PC Data",
"text": ""
},
{
"value": "935",
"description": "935 - Simplified Chinese (extended range)",
"text": ""
},
{
"value": "936",
"description": "936 - Simplified Chinese (non-extended)",
"text": ""
},
{
"value": "937",
"description": "937 - Traditional Chinese (extended range)",
"text": ""
},
{
"value": "938",
"description": "938 - Traditional Chinese (non-extended)",
"text": ""
},
{
"value": "939",
"description": "939 - Japan English (extended range) 4370 UDC",
"text": ""
},
{
"value": "941",
"description": "941 - Japanese DBCS PC for Open environment (Multi-vendor code):\n6878 JIS X 0208-1990 characters, 386 IBM® selected\ncharacters, 1880 IBM UDC (X'F040' to X'F9FC')",
"text": ""
},
{
"value": "942",
"description": "942 - Japanese PC Data Mixed",
"text": ""
},
{
"value": "943",
"description": "943 - Japanese PC Data Mixed for Open environment (Multi-vendor code):\n6878 JIS X 0208-1990 characters, 386 IBM selected\nDBCS characters, 1880 UDC (X'F040' to X'F9FC')",
"text": ""
},
{
"value": "944",
"description": "944 - Korean PC Data Mixed",
"text": ""
},
{
"value": "946",
"description": "946 - Simplified Chinese PC Data Mixed",
"text": ""
},
{
"value": "947",
"description": "947 - ASCII Double-byte",
"text": ""
},
{
"value": "948",
"description": "948 - Traditional Chinese PC Data Mixed 6204 UDC (User Defined Characters)",
"text": ""
},
{
"value": "949",
"description": "949 - Republic of Korea National Standard Graphic Character Set (KS)\nPC Data mixed-byte including 1800 UDC",
"text": ""
},
{
"value": "950",
"description": "950 - Traditional Chinese PC Data Mixed for Big5",
"text": ""
},
{
"value": "951",
"description": "951 - Republic of Korea National Standard Graphic Character Set (KS)\nPC Data double-byte including 1800 UDC",
"text": ""
},
{
"value": "954",
"description": "954 - Japanese EUC; G0 - JIS X201 Roman set (00895); G1 - JIS X208-1990\nset (00952); G2 - JIS X201 Katakana set (04992 ); G3 - JIS X212 set\n(00953)",
"text": ""
},
{
"value": "956",
"description": "956 - JIS X201 Roman for CP 00895; JIS X208-1983 for CP 00952",
"text": ""
},
{
"value": "957",
"description": "957 - JIS X201 Roman for CP 00895; JIS X208-1978 for CP 00955",
"text": ""
},
{
"value": "958",
"description": "958 - ASCII for CP 00367; JIS X208-1983 for CP 00952",
"text": ""
},
{
"value": "959",
"description": "959 - ASCII for CP 00367; JIS X208-1978 for CP 00955",
"text": ""
},
{
"value": "964",
"description": "964 - G0 - ASCII for CP 00367; G1- CNS 11643 plane 1 for CP 960",
"text": ""
},
{
"value": "965",
"description": "965 - ASCII for CP 00367; CNS 11643 plane 1 for CP 960",
"text": ""
},
{
"value": "970",
"description": "970 - G0 ASCII for CP 00367; G1 KSC X5601-1989 (including 188 UDCs)\nfor CP 971",
"text": ""
},
{
"value": "971",
"description": "971 - Korean EUC, G1 - KS C5601-1989 (including 188 UDC)",
"text": ""
},
{
"value": "1008",
"description": "1008 - Arabic 8-bit ISO/ASCII",
"text": ""
},
{
"value": "1009",
"description": "1009 - IS0-7: IRV",
"text": ""
},
{
"value": "1010",
"description": "1010 - ISO-7; France",
"text": ""
},
{
"value": "1011",
"description": "1011 - ISO-7; Germany",
"text": ""
},
{
"value": "1012",
"description": "1012 - ISO-7; Italy",
"text": ""
},
{
"value": "1013",
"description": "1013 - ISO-7; United Kingdom",
"text": ""
},
{
"value": "1014",
"description": "1014 - ISO-7; Spain",
"text": ""
},
{
"value": "1015",
"description": "1015 - ISO-7; Portugal",
"text": ""
},
{
"value": "1016",
"description": "1016 - ISO-7; Norway",
"text": ""
},
{
"value": "1017",
"description": "1017 - ISO-7; Denmark",
"text": ""
},
{
"value": "1018",
"description": "1018 - ISO-7; Finland and Sweden",
"text": ""
},
{
"value": "1019",
"description": "1019 - ISO-7; Belgium and Netherlands",
"text": ""
},
{
"value": "1025",
"description": "1025 - Cyrillic Multilingual",
"text": ""
},
{
"value": "1026",
"description": "1026 - Turkey Latin 5 CECP",
"text": ""
},
{
"value": "1027",
"description": "1027 - Japan English (extended range)",
"text": ""
},
{
"value": "1040",
"description": "1040 - Korean Latin PC Data extended",
"text": ""
},
{
"value": "1041",
"description": "1041 - Japanese PC Data extended",
"text": ""
},
{
"value": "1042",
"description": "1042 - Simplified Chinese PC Data extended",
"text": ""
},
{
"value": "1043",
"description": "1043 - Traditional Chinese PC Data extended",
"text": ""
},
{
"value": "1046",
"description": "1046 - PC Data - Arabic Extended",
"text": ""
},
{
"value": "1051",
"description": "1051 - HP Emulation(for use with Latin 1). GCGID SF150000 is mapped\nto a control X'7F'",
"text": ""
},
{
"value": "1088",
"description": "1088 - Korean PC Data single-byte",
"text": ""
},
{
"value": "1089",
"description": "1089 - ISO 8859-6: Arabic (string type 5)",
"text": ""
},
{
"value": "1097",
"description": "1097 - Farsi",
"text": ""
},
{
"value": "1098",
"description": "1098 - Farsi (IBM-PC)",
"text": ""
},
{
"value": "1112",
"description": "1112 - Baltic, Multilingual",
"text": ""
},
{
"value": "1114",
"description": "1114 - Traditional Chinese, Taiwan Industry Graphic Character Set\n(Big5)",
"text": ""
},
{
"value": "1115",
"description": "1115 - Simplified Chinese National Standard (GB), personal computer\nSBCS",
"text": ""
},
{
"value": "1122",
"description": "1122 - Estonia",
"text": ""
},
{
"value": "1123",
"description": "1123 - Cyrillic Ukraine EBCDIC",
"text": ""
},
{
"value": "1124",
"description": "1124 - Cyrillic Ukraine 8-Bit",
"text": ""
},
{
"value": "1125",
"description": "1125 - Cyrillic Ukraine PC-Data",
"text": ""
},
{
"value": "1126",
"description": "1126 - Windows Korean PC Data Single-Byte",
"text": ""
},
{
"value": "1129",
"description": "1129 - ISO-8 Vietnamese",
"text": ""
},
{
"value": "1130",
"description": "1130 - EBCDIC Vietnamese",
"text": ""
},
{
"value": "1131",
"description": "1131 - Cyrillic Belarus PC-Data",
"text": ""
},
{
"value": "1132",
"description": "1132 - EBCDIC Lao",
"text": ""
},
{
"value": "1133",
"description": "1133 - ISO-8 Lao",
"text": ""
},
{
"value": "1137",
"description": "1137 - Devanagari EBCDIC",
"text": ""
},
{
"value": "1140",
"description": "1140 - ECECP: USA, Canada, Netherlands, Portugal, Brazil, Australia,\nNew Zealand",
"text": ""
},
{
"value": "1141",
"description": "1141 - ECECP: Austria, Germany",
"text": ""
},
{
"value": "1142",
"description": "1142 - ECECP: Denmark, Norway",
"text": ""
},
{
"value": "1143",
"description": "1143 - ECECP: Finland, Sweden",
"text": ""
},
{
"value": "1144",
"description": "1144 - ECECP: Italy",
"text": ""
},
{
"value": "1145",
"description": "1145 - ECECP: Spain, Latin America (Spanish)",
"text": ""
},
{
"value": "1146",
"description": "1146 - ECECP: United Kingdom",
"text": ""
},
{
"value": "1147",
"description": "1147 - ECECP: France",
"text": ""
},
{
"value": "1148",
"description": "1148 - ECECP: International 1",
"text": ""
},
{
"value": "1149",
"description": "1149 - ECECP: Iceland",
"text": ""
},
{
"value": "1153",
"description": "1153 - Latin-2 - EBCDIC Multilingual with euro",
"text": ""
},
{
"value": "1154",
"description": "1154 - Cyrillic Multilingual with euro",
"text": ""
},
{
"value": "1155",
"description": "1155 - Turkey Latin 5 with euro",
"text": ""
},
{
"value": "1156",
"description": "1156 - Baltic, Multilingual with euro",
"text": ""
},
{
"value": "1157",
"description": "1157 - Estonia EBCDIC with euro",
"text": ""
},
{
"value": "1158",
"description": "1158 - Cyrillic Ukraine EBCDIC with euro",
"text": ""
},
{
"value": "1160",
"description": "1160 - Thai host with euro",
"text": ""
},
{
"value": "1164",
"description": "1164 - EBCDIC Vietnamese with euro",
"text": ""
},
{
"value": "1166",
"description": "1166 - Cyrillic multilingual with Euro for Kazakhstan",
"text": ""
},
{
"value": "1175",
"description": "1175 - Turkey with Euro and Turkish Lira",
"text": ""
},
{
"value": "1200",
"description": "1200 - Unicode: UTF-16, big endian",
"text": ""
},
{
"value": "1208",
"description": "1208 - Unicode: UTF-8",
"text": ""
},
{
"value": "1250",
"description": "1250 - Windows, Latin 2",
"text": ""
},
{
"value": "1251",
"description": "1251 - Windows, Cyrillic",
"text": ""
},
{
"value": "1252",
"description": "1252 - Windows,Latin 1",
"text": ""
},
{
"value": "1253",
"description": "1253 - Windows, Greek",
"text": ""
},
{
"value": "1254",
"description": "1254 - Windows, Turkish",
"text": ""
},
{
"value": "1255",
"description": "1255 - Windows, Hebrew",
"text": ""
},
{
"value": "1256",
"description": "1256 - Windows, Arabic",
"text": ""
},
{
"value": "1257",
"description": "1257 - Windows, Baltic Rim",
"text": ""
},
{
"value": "1258",
"description": "1258 - MS Windows, Vietnamese",
"text": ""
},
{
"value": "1275",
"description": "1275 - Apple Latin-1",
"text": ""
},
{
"value": "1280",
"description": "1280 - Apple Greek",
"text": ""
},
{
"value": "1281",
"description": "1281 - Apple Turkey",
"text": ""
},
{
"value": "1282",
"description": "1282 - Apple Central European (Latin-2)",
"text": ""
},
{
"value": "1283",
"description": "1283 - Apple Cyrillic",
"text": ""
},
{
"value": "1362",
"description": "1362 - Windows Korean PC DBCS-PC, including 11 172\nfull hangul",
"text": ""
},
{
"value": "1363",
"description": "1363 - Windows Korean PC Mixed, including 11 172\nfull hangul",
"text": ""
},
{
"value": "1364",
"description": "1364 - Korean host mixed extended including 11 172 full hangul",
"text": ""
},
{
"value": "1371",
"description": "1371 - Traditional Chinese host mixed including 6204 UDC, Extended SBCS including SBCS and DBCS euro\n(CCSID 9563 level)",
"text": ""
},
{
"value": "1377",
"description": "1377 - Hong Kong Traditional Chinese mixed host enhancement for HKSCS-2004 (Mapping is HKSCS-2004 to\nUnicode 17584 level)",
"text": ""
},
{
"value": "1380",
"description": "1380 - Simplified Chinese, People's Republic of China National Standard\n(GB), personal computer DBCS",
"text": ""
},
{
"value": "1381",
"description": "1381 - Simplified Chinese, People's Republic of China National Standard\n(GB) personal computer mixed SBCS and DBCS",
"text": ""
},
{
"value": "1382",
"description": "1382 - Simplified Chinese DBCS PC GB 2312-80 set, including 31 IBM selected\nand 1360 UDC.",
"text": ""
},
{
"value": "1383",
"description": "1383 - Simplified Chinese, EUC \nG0 set; ASCII\n\nG1 set; GB 2312-80 set (1382)\n\n\n",
"text": ""
},
{
"value": "1385",
"description": "1385 - Simplified Chinese DBCS-PC GBK, all GBK character set and others",
"text": ""
},
{
"value": "1386",
"description": "1386 - Simplified Chinese PC Data GBK mixed, all GBK character set\nand others",
"text": ""
},
{
"value": "1388",
"description": "1388 - Simplified Chinese DBCS- GB 18030 Host with UDCs and Uygur\nextension.",
"text": ""
},
{
"value": "1399",
"description": "1399 - Japanese Latin-Kanji Host Mixed including 4370 UDC, Extended\nSBCS (includes SBCS and DBCS euro)",
"text": ""
},
{
"value": "4396",
"description": "4396 - Japanese Host DB including 1880",
"text": ""
},
{
"value": "4930",
"description": "4930 - Korean DBCS-Host extended including 11 172 full hangul",
"text": ""
},
{
"value": "4933",
"description": "4933 - Simplified Chinese DBCS Host (GBK), all GBK character set and\nothers",
"text": ""
},
{
"value": "4948",
"description": "4948 - Latin 2 PC Data Multilingual",
"text": ""
},
{
"value": "4951",
"description": "4951 - Cyrillic PC Data Multilingual",
"text": ""
},
{
"value": "4952",
"description": "4952 - Hebrew PC Data",
"text": ""
},
{
"value": "4953",
"description": "4953 - Turkey PC Data Latin 5",
"text": ""
},
{
"value": "4960",
"description": "4960 - Arabic PC Data",
"text": ""
},
{
"value": "4965",
"description": "4965 - Greek PC Data",
"text": ""
},
{
"value": "4970",
"description": "4970 - Thai PC Data Single-Byte",
"text": ""
},
{
"value": "4971",
"description": "4971 - Greek (including euro)",
"text": ""
},
{
"value": "5026",
"description": "5026 - Japan Katakana (extended range) 1880 UDC",
"text": ""
},
{
"value": "5035",
"description": "5035 - Japan English (extended range) 1880 UDC",
"text": ""
},
{
"value": "5050",
"description": "5050 - G0 - JIS X201 Roman for CP 895; G1 JIS X208-1990 for CP 952",
"text": ""
},
{
"value": "5052",
"description": "5052 - JIS X201 Roman for CP 895; JIS X208-1983 for CP 952",
"text": ""
},
{
"value": "5053",
"description": "5053 - JIS X201 Roman for CP 895; JIS X208-1978 for CP 955",
"text": ""
},
{
"value": "5054",
"description": "5054 - ASCII for CP 367; JIS X208-1983 for CP 952",
"text": ""
},
{
"value": "5055",
"description": "5055 - ASCII for CP 367; JIS X208-1978 for CP 955",
"text": ""
},
{
"value": "5123",
"description": "5123 - Japanese Latin Host Extended SBCS (includes euro)",
"text": ""
},
{
"value": "5210",
"description": "5210 - Simplified Chinese PC Data Single-Byte (GBK), growing CS",
"text": ""
},
{
"value": "5233",
"description": "5233 - Devanagari EBCDIC, including Indian Rupee",
"text": ""
},
{
"value": "5348",
"description": "5348 - Windows,\nLatin 1 with euro",
"text": ""
},
{
"value": "8612",
"description": "8612 - Arabic (base shapes only)",
"text": ""
},
{
"value": "9030",
"description": "9030 - Thai Host Extended SBCS",
"text": ""
},
{
"value": "9056",
"description": "9056 - PC Data: Arabic PC Storage/Interchange",
"text": ""
},
{
"value": "9066",
"description": "9066 - Thai PC Data Extended SBCS",
"text": ""
},
{
"value": "12708",
"description": "12708 - Arabic (base shapes, Lamaleph ligatures and Hindi digits) (string\ntype 7)",
"text": ""
},
{
"value": "13121",
"description": "13121 - Korean Host Extended SBCS",
"text": ""
},
{
"value": "13124",
"description": "13124 - Simplified Chinese Host Data Single-Byte (GBK) equivalent to\nSimplified Chinese Host Data Single-Byte (GB) except growing CS",
"text": ""
},
{
"value": "13488",
"description": "13488 - Unicode: UTF-16 as defined in the Unicode Standard. Fixed\nCS as defined by Unicode 2.0. Big endian",
"text": ""
},
{
"value": "16684",
"description": "16684 - Japanese Latin Host Double-Byte including 4370 UDC (includes\neuro)",
"text": ""
},
{
"value": "17354",
"description": "17354 - G0 - ASCII for CP 00367; G1 - KSC X5601-1989 (including 188\nUDCs) for CP 00971",
"text": ""
},
{
"value": "25546",
"description": "25546 - Korean 2022-KR TCP, ASCII, KS C5601-1989 (includes 188 UDC, RFC1557 using SO/SI)",
"text": ""
},
{
"value": "28709",
"description": "28709 - Traditional Chinese (extended range)",
"text": ""
},
{
"value": "33722",
"description": "33722 - Japanese EUC",
"text": ""
},
{
"value": "57345",
"description": "57345 - All Japanese 2022 characters",
"text": ""
},
{
"value": "61175",
"description": "61175 - Character positions. ",
"text": ""
},
{
"value": "61952",
"description": "61952 - (old CCSID for UCS). Use of 13488 is recommended instead.",
"text": ""
},
{
"value": "62210",
"description": "62210 - ISO 8859-8; Hebrew, string type 4.",
"text": ""
},
{
"value": "62211",
"description": "62211 - EBCDIC; Hebrew, string type 5",
"text": ""
},
{
"value": "62215",
"description": "62215 - MS Windows; Hebrew, string type 4",
"text": ""
},
{
"value": "62218",
"description": "62218 - PC data; Arabic, string type 4",
"text": ""
},
{
"value": "62222",
"description": "62222 - ISO 8859-9; Hebrew, string type 6",
"text": ""
},
{
"value": "62223",
"description": "62223 - MS Windows; Hebrew, string type 6",
"text": ""
},
{
"value": "62224",
"description": "62224 - EBCDIC; Arabic, string type 6",
"text": ""
},
{
"value": "62228",
"description": "62228 - MS Windows; Arabic, string type 6",
"text": ""
},
{
"value": "62235",
"description": "62235 - EBCDIC; Hebrew, string type 6",
"text": ""
},
{
"value": "62238",
"description": "62238 - ISO 8859-9; Hebrew, string type 10",
"text": ""
},
{
"value": "62239",
"description": "62239 - MS Windows; Hebrew, string type 10",
"text": ""
},
{
"value": "62245",
"description": "62245 - EBCDIC; Hebrew, string type 10",
"text": ""
},
{
"value": "65534",
"description": "65534 - Look at lower level CCSID",
"text": ""
},
{
"value": "65535",
"description": "65535 - Special value indicating data is hex and should not be converted.\nThis is the default for the QCCSID system value.",
"text": ""
}
];
Member

Given the size of this array (taking 2/3 of that file 😅), it should be put in its own file, under src/api/CCSIDS.ts for example.
Don't bother making the elements SelectItem yet. Keeping it general purpose will allow other components to reuse it for other purposes.

Make it a simple data structure:

export const CCSIDS = [
    {
        ccsid: "37",
        description: "37 - US, Canada, Netherlands, Portugal, Brazil, New Zealand, Australia"
    },
    {
        ccsid: "256",
        description: "256 - Netherlands"
    },
    ...
]

Then map it to a SelectItem[] array when the UI needs it - using an anonymous function to avoid repeating the code if needed.
No need to use a Set to wrap ENCODINGS; the Array includes method should be enough.

So it should look something like this:

const getCCSIDItems = (selected: string) => [{ ccsid: "0", description: "Select CCSID" }, ...CCSIDS].map(e => ({ value: e.ccsid, description: e.description, text: "", selected: selected === e.ccsid }));

const sourceCcsidOptions = getCCSIDItems(config.ccsidConvertFrom).filter(option => ENCODINGS.includes(option.value));
const targetCcsidOptions = getCCSIDItems(config.ccsidConvertTo).filter(option => !ENCODINGS.includes(option.value));
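A self-contained sketch of that idea, for illustration only. The `SelectItem` and `CcsidEntry` shapes, the three sample entries, and the contents of `ENCODINGS` are assumptions made here so the snippet runs on its own; they are not taken from the extension. This variant filters before the placeholder is prepended, so the "Select CCSID" entry stays in both lists:

```typescript
// Hypothetical stand-ins for the extension's real types (assumed shapes).
interface SelectItem {
  value: string;
  description: string;
  text: string;
  selected?: boolean;
}

interface CcsidEntry {
  ccsid: string;
  description: string;
}

// Tiny sample list; the real one would live in its own module.
const CCSIDS: CcsidEntry[] = [
  { ccsid: "37", description: "37 - US, Canada, Netherlands, Portugal, Brazil, New Zealand, Australia" },
  { ccsid: "424", description: "424 - Hebrew (sample description)" },
  { ccsid: "1208", description: "1208 - UTF-8 (sample description)" },
];

// Assumption: ENCODINGS holds the CCSIDs treated as non UTF compatible.
const ENCODINGS: string[] = ["37", "424"];

const PLACEHOLDER: CcsidEntry = { ccsid: "0", description: "Select CCSID" };

// Filter the plain entries first, then prepend the placeholder and map to SelectItem.
function getCcsidItems(selected: string, keep: (ccsid: string) => boolean): SelectItem[] {
  return [PLACEHOLDER, ...CCSIDS.filter(e => keep(e.ccsid))].map(e => ({
    value: e.ccsid,
    description: e.description,
    text: "",
    selected: selected === e.ccsid,
  }));
}

const sourceCcsidOptions = getCcsidItems("424", c => ENCODINGS.includes(c));
const targetCcsidOptions = getCcsidItems("1208", c => !ENCODINGS.includes(c));
```

Passing the filter as a predicate keeps a single mapping function for both drop-downs while leaving the CCSID list itself free of UI concerns.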

Comment on lines 1303 to 1329
const nonUtfEncodings = new Set<string>(ENCODINGS);

const sourceCcsidOptions = CCSID_Options.filter(option => nonUtfEncodings.has(option.value)).map((i) => { return {...i}});
const targetCcsidOptions = CCSID_Options.filter(option => !nonUtfEncodings.has(option.value)).map((i) => { return {...i}});

sourceCcsidOptions.unshift({
  value: "0",
  description: "Select CCSID",
  text: ""
});

targetCcsidOptions.unshift({
  value: "0",
  description: "Select CCSID",
  text: ""
});

const selectedSourceCcsid = sourceCcsidOptions.find(option => option.value === config.ccsidConvertFrom);
const selectedTargetCcsid = targetCcsidOptions.find(option => option.value === config.ccsidConvertTo);

if (selectedSourceCcsid) {
  selectedSourceCcsid.selected = true;
}
if (selectedTargetCcsid) {
  selectedTargetCcsid.selected = true;
}

Member

See my other comment about how to refactor this part.

}
return true;
} else {
console.log(copyResult.command);
Member

Don't forget to eventually remove this console.log.

Comment on lines +156 to +158
// Fetch source file CCSID and determine if conversion is needed
const attr = await this.getAttributes(path, "CCSID");
const sourceCcsid = Number(attr?.["CCSID"]) || 0;
Member

Unconditionally fetching the member's CCSID each time we upload/download members comes with a slight overhead that could be avoided by caching this information in the IBMi class (since both IBMiContent and ExtendedContent will reuse that information).
This could be put in a class field of type Map<string, number>, where the key would be the source physical file path and the value its CCSID.
Then wrap the logic to get and fetch the information once per file (not per member) in a class method.
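A minimal sketch of such a per-file cache, under stated assumptions: `CcsidCache`, `CcsidFetcher` and the `library/file` key format are hypothetical names chosen for illustration, and the fetcher stands in for whatever call actually reads the attribute (for example a wrapper around `getAttributes`).

```typescript
// Hypothetical: stands in for the real call that reads a source file's CCSID.
type CcsidFetcher = (sourceFilePath: string) => Promise<number>;

class CcsidCache {
  // Key: source physical file path, value: its CCSID.
  private readonly ccsids = new Map<string, number>();
  private readonly fetchCcsid: CcsidFetcher;

  constructor(fetchCcsid: CcsidFetcher) {
    this.fetchCcsid = fetchCcsid;
  }

  // Looks the CCSID up once per source file; members of the same file share the entry.
  async getFileCcsid(library: string, sourceFile: string): Promise<number> {
    const key = `${library}/${sourceFile}`.toUpperCase();
    const cached = this.ccsids.get(key);
    if (cached !== undefined) {
      return cached;
    }
    const ccsid = await this.fetchCcsid(key);
    this.ccsids.set(key, ccsid);
    return ccsid;
  }
}
```

With this shape, opening ten members of the same source file triggers a single lookup. Two requests issued at the same moment for an uncached file would still both reach the fetcher; whether that matters depends on how the callers are sequenced.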

Comment on lines +261 to +262
const attr = await this.getAttributes(path, "CCSID");
const sourceCcsid = Number(attr?.["CCSID"]) || 0;
Member

See my comment about caching this.

Comment on lines +180 to +181
const attr = await connection.getContent().getAttributes(memberPath, "CCSID");
const sourceCcsid = Number(attr?.["CCSID"]) || 0;
Member

See my comment about caching this.

Itsanexpriment and others added 7 commits February 24, 2026 01:32
- Resolved conflict in src/webviews/settings/index.ts
- Preserved SelectItem import needed for CCSID_Options
- Added new checkLoginForm import from upstream
- Synced with upstream master (commit 154bc1b)
- Add default values for CCSID config fields (ccsidConversionEnabled, ccsidConvertFrom, ccsidConvertTo)
- Change determineCcsidConversion to return object instead of array
- Implement CCSID caching in IBMi class with Map and 10-minute TTL
- Add shared getFileCcsid() method in IBMi class
- Update IBMiContent and ExtendedContent to use shared method
- Add CacheItem interface for cache entries
- Move CCSID definitions to src/api/CCSIDs.ts
- Use plain objects instead of SelectItem format
- Replace Set.has() with Array.includes()
- Map to SelectItem only when needed
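For readers unfamiliar with the pattern, a Map-based cache with a 10-minute TTL can be sketched roughly as follows. This is not the code from these commits: the fields of `CacheItem`, the `TtlCcsidCache` class name and the injectable clock are assumptions made for illustration.

```typescript
// Assumed shape; the CacheItem interface in the PR may carry different fields.
interface CacheItem {
  ccsid: number;
  expiresAt: number; // epoch milliseconds
}

const TTL_MS = 10 * 60 * 1000; // 10 minutes

class TtlCcsidCache {
  private readonly items = new Map<string, CacheItem>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Returns the cached CCSID, or undefined when missing or expired.
  get(key: string): number | undefined {
    const item = this.items.get(key);
    if (!item) {
      return undefined;
    }
    if (this.now() >= item.expiresAt) {
      this.items.delete(key); // drop the stale entry
      return undefined;
    }
    return item.ccsid;
  }

  set(key: string, ccsid: number): void {
    this.items.set(key, { ccsid, expiresAt: this.now() + TTL_MS });
  }
}
```

An expiry time bounds how long a stale value can be served if a source file's CCSID is changed on the server while the connection stays open.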
@yuvalnn
Author

yuvalnn commented Mar 25, 2026

@sebjulliand Thanks again for the review! All requested changes have been addressed.

Main changes:

  • Added default values for CCSID config
  • Refactored determineCcsidConversion to return an object
  • Implemented CCSID caching with a shared method
  • Extracted CCSID options and simplified the settings UI

Tested locally (including the UI).
Also thanks to @Itsanexpriment for the help on this!

Let me know if anything else is needed.

Labels

enhancement New feature or request

4 participants