Skip to content

Conversation

@david-roper
Copy link
Collaborator

@david-roper david-roper commented Sep 24, 2025

PR to deal with any non-visible character present in a csv template uploaded

new function to check string entry for non-visible chars
display error if one is present

closes issue #1196

Summary by CodeRabbit

  • New Features

    • Blocks non-visible Unicode (ANSI, zero‑width) in CSV headers and cells across all parsing paths; errors now include row/column context and U+ hex codes; header trimming now uses trimmed header values.
  • Bug Fixes

    • Consistent row-numbered error reporting and earlier detection of hidden-character issues before schema validation.
  • Localization

    • English and French error messages refined for consistent wording and punctuation.
  • Tests

    • CSV parsing tests updated to reflect adjusted field ordering and validation behavior.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 24, 2025

Walkthrough

Adds detection and rejection of non-visible Unicode characters during CSV upload for both Zod3 and Zod4 flows; validates headers (subjectID, date) and each data cell with a rowNumber; returns localized (EN/FR) UploadError messages including row/column and U+ hex context; changes header-trimming to use simple trimming (removing prior Mongolian-vowel-separator-specific replacement); updates tests to reflect changed schema field-order behavior.

Changes

Cohort / File(s) Summary of Changes
CSV upload validation & localization
apps/web/src/utils/upload.ts
Added nonVisibleCharChecker plus ANSI/zero-width regex constants; header-level non-visible-character validation for subjectID and date; per-row/per-cell non-visible-character checks with rowNumber in both Zod3 and Zod4 paths; returns localized (EN/FR) UploadErrors containing row/column and U+ hex code; changed header trimming to use trimmed header values (removed prior Mongolian-vowel-separator-specific handling); adjusted punctuation/translation text.
Upload tests (expectation & ordering updates)
apps/web/src/utils/__tests__/upload.test.ts
Updated test fixtures and expectations to match swapped field orders in mocked validationSchema across multiple Zod3/Zod4 cases; changed type assertions to as unknown as ...; updated expected mapped output ordering and related assertions.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant W as Web App
  participant UU as upload.ts
  participant Z as Zod Parser

  U->>W: Upload CSV
  W->>UU: processInstrumentCSV(csv)
  rect #f3f7ff
    note right of UU: Trim headers (use .trim())\nHeader pre-checks for non-visible chars (subjectID/date)\nReturn localized UploadError on header failure
  end

  UU->>Z: parse validation schema (Zod3 or Zod4)
  Z-->>UU: schema parse result / error
  alt schema error
    UU-->>W: localized UploadError (schema parse)
  else schema OK
    rect #f6fff5
      loop per CSV row (rowNumber++)
        UU->>UU: per-cell non-visible-char check\n(if found -> localized UploadError with row/col and U+ hex)
        UU->>UU: trim/map fields according to schema order
      end
    end
    UU-->>W: parsed data or first UploadError
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Zod4 upload #1190 — Overlaps upload CSV parsing and UploadError/Zod3/Zod4 flows; touches the same file.
  • fix tests #1207 — Modifies upload tests and CSV processing; similar test fixture and processing changes.

Suggested reviewers

  • joshunrau

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title clearly states the main change of displaying an error message when non-visible characters are detected in a CSV, aligning directly with the PR’s objective and making the purpose immediately understandable to reviewers.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 553cee9 and e81547a.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/web/src/utils/upload.ts

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
apps/web/src/utils/upload.ts (2)

585-587: Fix French string: missing closing quote around row number.

-                    fr: `${error.description.fr} au nom de colonne : '${key}' et numéro de ligne '${rowNumber}`
+                    fr: `${error.description.fr} au nom de colonne : '${key}' et numéro de ligne '${rowNumber}'`

939-941: Fix French string: missing closing quote around row number (Zod4 path).

-                    fr: `${error.description.fr} au nom de colonne : '${key}' et numéro de ligne '${rowNumber}`
+                    fr: `${error.description.fr} au nom de colonne : '${key}' et numéro de ligne '${rowNumber}'`
🧹 Nitpick comments (6)
apps/web/src/utils/upload.ts (6)

66-72: Simplify and harden the non‑visible character check (return boolean, cover more code points).

Use .test(), return boolean, and include LRM/RLM (200E/200F) and WORD JOINER (2060).

-function nonVisibleCharChecker(entry: string | undefined) {
-  if (!entry) {
-    return null;
-  }
-  const nonVisibleCharCheck = /[\u200B-\u200D\uFEFF\u180E]/g.exec(entry);
-  return nonVisibleCharCheck;
-}
+function nonVisibleCharChecker(entry: string | undefined): boolean {
+  if (!entry) return false;
+  // Zero‑width: 200B..200F, WORD JOINER 2060, BOM FEFF, MVS 180E
+  return /[\u200B-\u200F\u2060\uFEFF\u180E]/.test(entry);
+}

517-537: Remove redundant first‑row Subject/Date checks (covered by per‑cell loop and can trip on sample row).

These checks duplicate the per‑cell validation and run before sample‑row removal. Drop them to avoid inconsistent behavior.

-        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
-        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
-
-        if (regexResultSubject !== null) {
-          return reject(
-            new UploadError({
-              en: `Subject ID at row ${rowNumber} contains Non-visible characters`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles`
-            })
-          );
-        }
-        if (regexResultDate !== null) {
-          return reject(
-            new UploadError({
-              en: `Date at row ${rowNumber} contains Non-visible characters`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles`
-            })
-          );
-        }

541-543: Make sample‑row detection resilient to hidden characters (MVS/ZW).*

Strip non‑visible chars before comparing to sample tokens. This preserves compatibility with older templates containing MVS.

-          dataLines[0][0]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[0] &&
-          dataLines[0][1]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[1]
+          dataLines[0][0]?.replace(/[\u200B-\u200F\u2060\uFEFF\u180E]/g, '').trim() === INTERNAL_HEADERS_SAMPLE_DATA[0] &&
+          dataLines[0][1]?.replace(/[\u200B-\u200F\u2060\uFEFF\u180E]/g, '').trim() === INTERNAL_HEADERS_SAMPLE_DATA[1]

870-871: Fix comment typos and align terminology.

-        //return an error if non space charaters are found
+        // return an error if non‑visible characters are found

872-892: Remove redundant first‑row Subject/Date checks (same rationale as Zod3 path).

-        let rowNumber = 1;
-
-        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
-        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
-
-        if (regexResultSubject !== null) {
-          return reject(
-            new UploadError({
-              en: `Subject ID at row ${rowNumber} contains Non-visible characters`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles`
-            })
-          );
-        }
-        if (regexResultDate !== null) {
-          return reject(
-            new UploadError({
-              en: `Date at row ${rowNumber} contains Non-visible characters`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles`
-            })
-          );
-        }
+        let rowNumber = 1;

895-899: Same resilience for sample‑row detection in Zod4 path.

-          dataLines[0][0]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[0] &&
-          dataLines[0][1]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[1]
+          dataLines[0][0]?.replace(/[\u200B-\u200F\u2060\uFEFF\u180E]/g, '').trim() === INTERNAL_HEADERS_SAMPLE_DATA[0] &&
+          dataLines[0][1]?.replace(/[\u200B-\u200F\u2060\uFEFF\u180E]/g, '').trim() === INTERNAL_HEADERS_SAMPLE_DATA[1]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bf78bab and 17d56d4.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (5 hunks)
🔇 Additional comments (1)
apps/web/src/utils/upload.ts (1)

605-607: LGTM: clearer schema‑parse failure copy with FR translation.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/web/src/utils/upload.ts (1)

551-555: Guard against short rows (avoid crash on undefined cell values)

elements[i]!.trim() will throw if a row has fewer columns than headers. Default missing cells to empty string.

-          for (let i = 0; i < headers.length; i++) {
-            const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
-            if (rawValue === '\n') {
-              continue;
-            }
+          for (let i = 0; i < headers.length; i++) {
+            const key = headers[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
+            if (rawValue === '\n') continue;
🧹 Nitpick comments (8)
apps/web/src/utils/upload.ts (8)

66-72: Return a boolean from the hidden-character checker

Make the helper boolean and prefer .test over .exec to simplify call sites and avoid nullable checks.

-function nonVisibleCharChecker(entry: string | undefined) {
-  if (!entry) {
-    return null;
-  }
-  const nonVisibleCharCheck = /[\u200B-\u200D\uFEFF\u180E]/g.exec(entry);
-  return nonVisibleCharCheck;
-}
+function nonVisibleCharChecker(entry?: string): boolean {
+  if (!entry) return false;
+  return /[\u200B-\u200D\uFEFF\u180E]/.test(entry);
+}

517-537: Remove redundant pre-check on row 1 (avoid false positives with sample row)

This duplicate check is unnecessary (per-cell validation below catches it) and can incorrectly flag the sample row. Remove it and keep rowNumber initialization.

-        let rowNumber = 1;
-
-        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
-        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
-
-        if (regexResultSubject !== null) {
-          return reject(
-            new UploadError({
-              en: `Subject ID at row ${rowNumber} contains Non-visible characters`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles`
-            })
-          );
-        }
-        if (regexResultDate !== null) {
-          return reject(
-            new UploadError({
-              en: `Date at row ${rowNumber} contains Non-visible characters`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles`
-            })
-          );
-        }
+        let rowNumber = 1;

539-543: Update stale comment to reflect current behavior

The code no longer strips MVS; it just checks the sample row values. Update the comment.

-        //remove sample data if included remove any mongolian vowel separators
+        // Remove sample data row if present
         if (
           dataLines[0][0]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[0] &&
           dataLines[0][1]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[1]

583-585: Clarify parse error context (EN/FR phrasing)

Shorten EN and fix FR prepositions.

-                    en: `${error.description.en} at column name: '${key}' and row number '${rowNumber}'`,
-                    fr: `${error.description.fr} au nom de colonne : '${key}' et numéro de ligne '${rowNumber}'`
+                    en: `${error.description.en} at column '${key}' and row ${rowNumber}`,
+                    fr: `${error.description.fr} à la colonne '${key}' et à la ligne ${rowNumber}`

591-594: Improve FR translation for generic CSV error

-                  en: `Error parsing CSV`,
-                  fr: `Erreur avec la CSV`
+                  en: `Error parsing CSV`,
+                  fr: `Erreur lors de l'analyse du CSV`

867-890: Remove Zod4 pre-check and fix typo

Same redundancy and sample-row false positive risk as Zod3; also fix “charaters”.

-        //remove sample data if included (account for old mongolian vowel separator templates)
-        //return an error if non space charaters are found
-
-        let rowNumber = 1;
-
-        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
-        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
-
-        if (regexResultSubject !== null) {
-          return reject(
-            new UploadError({
-              en: `Subject ID at row ${rowNumber} contains Non-visible characters`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles`
-            })
-          );
-        }
-        if (regexResultDate !== null) {
-          return reject(
-            new UploadError({
-              en: `Date at row ${rowNumber} contains Non-visible characters`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles`
-            })
-          );
-        }
+        // Remove sample data row if present
+        let rowNumber = 1;

935-937: Clarify parse error context (EN/FR phrasing) — Zod4

-                    en: `${error.description.en} at column name: '${key}' and row number '${rowNumber}'`,
-                    fr: `${error.description.fr} au nom de colonne : '${key}' et numéro de ligne '${rowNumber}'`
+                    en: `${error.description.en} at column '${key}' and row ${rowNumber}`,
+                    fr: `${error.description.fr} à la colonne '${key}' et à la ligne ${rowNumber}`

943-945: Improve FR translation for generic CSV error — Zod4

-                  en: `Error parsing CSV`,
-                  fr: `Erreur avec la CSV`
+                  en: `Error parsing CSV`,
+                  fr: `Erreur lors de l'analyse du CSV`
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa8eedc and 128853d.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)
🔇 Additional comments (3)
apps/web/src/utils/upload.ts (3)

558-567: Use boolean checker and polish messages (quotes, casing, FR preposition)

Same change as previously suggested.

-            //Check for non visible char in every row, return error if present
-            if (nonVisibleCharChecker(rawValue) !== null) {
+            // Return error if any non‑visible character is present
+            if (nonVisibleCharChecker(rawValue)) {
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column ${key} contains Non-visible characters`,
-                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles`
+                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible characters`,
+                  fr: `La valeur à la ligne ${rowNumber} et à la colonne '${key}' contient des caractères non visibles`
                 })
               );
             }

603-605: LGTM: clearer schema parse failure messaging

Message content and FR translation look good.


911-919: Use boolean checker and polish messages (quotes, casing, FR preposition) — Zod4

Same change as previously suggested.

-            // Return error if any non‑visible character is present
-            if (nonVisibleCharChecker(rawValue) !== null) {
+            // Return error if any non‑visible character is present
+            if (nonVisibleCharChecker(rawValue)) {
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column ${key} contains Non-visible characters`,
-                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles`
+                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible characters`,
+                  fr: `La valeur à la ligne ${rowNumber} et à la colonne '${key}' contient des caractères non visibles`
                 })
               );
             }

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/web/src/utils/upload.ts (1)

551-554: Guard against short rows to avoid .trim() on undefined (Zod3 path)

Short rows will throw here; guard before trimming.

-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
🧹 Nitpick comments (4)
apps/web/src/utils/upload.ts (4)

66-72: Return a boolean from nonVisibleCharChecker; prefer RegExp.test with Unicode flag

Simplifies usage and avoids exec-with-g side effects.

-function nonVisibleCharChecker(entry: string | undefined) {
-  if (!entry) {
-    return null;
-  }
-  const nonVisibleCharCheck = /[\u200B-\u200D\uFEFF\u180E]/g.exec(entry);
-  return nonVisibleCharCheck;
-}
+function nonVisibleCharChecker(entry?: string): boolean {
+  return !!entry && /[\u200B-\u200D\uFEFF\u180E]/u.test(entry);
+}

517-537: Use boolean checker and fix messages (Subject/Date pre-check)

Remove temp vars; lowercase “non-visible”; improve FR wording.

-        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
-        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
-
-        if (regexResultSubject !== null) {
+        if (nonVisibleCharChecker(dataLines[0][0])) {
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains Non-visible characters`,
+              en: `Subject ID at row ${rowNumber} contains non-visible characters`,
               fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles`
             })
           );
         }
-        if (regexResultDate !== null) {
+        if (nonVisibleCharChecker(dataLines[0][1])) {
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains Non-visible characters`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles`
+              en: `Date at row ${rowNumber} contains non-visible characters`,
+              fr: `La date à la ligne ${rowNumber} contient des caractères non visibles`
             })
           );
         }

585-585: French phrasing nit in aggregated error

“à la colonne … et à la ligne …” reads better.

-                    fr: `${error.description.fr} au nom de colonne : '${key}' et numéro de ligne '${rowNumber}'`
+                    fr: `${error.description.fr} à la colonne '${key}' et à la ligne '${rowNumber}'`

935-937: French phrasing nit in aggregated error (Zod4)

Align with Zod3 suggestion.

-                    fr: `${error.description.fr} au nom de colonne : '${key}' et numéro de ligne '${rowNumber}'`
+                    fr: `${error.description.fr} à la colonne '${key}' et à la ligne '${rowNumber}'`
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 128853d and d5f0ba7.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)
🔇 Additional comments (8)
apps/web/src/utils/upload.ts (8)

541-543: LGTM: sample-row detection via trim is cleaner

Trim-based comparison is a good replacement for MVS stripping.


558-567: Use boolean checker and polish messages (per‑cell Zod3 path)

Lowercase “non-visible” and quote column name.

-            //Check for non visible char in every row, return error if present
-            if (nonVisibleCharChecker(rawValue) !== null) {
+            // Return error if any non-visible character is present
+            if (nonVisibleCharChecker(rawValue)) {
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column ${key} contains Non-visible characters`,
+                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible characters`,
                   fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles`
                 })
               );
             }

603-605: LGTM: clearer schema-failure message (Zod3)

Improved guidance is helpful.


868-890: Use boolean checker; fix comment and messages (Subject/Date pre-check, Zod4 path)

Updates wording and removes temp vars.

-        //return an error if non space charaters are found
+        // Return an error if non-visible characters are found
-
-        let rowNumber = 1;
-
-        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
-        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
-
-        if (regexResultSubject !== null) {
+        let rowNumber = 1;
+        if (nonVisibleCharChecker(dataLines[0][0])) {
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains Non-visible characters`,
+              en: `Subject ID at row ${rowNumber} contains non-visible characters`,
               fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles`
             })
           );
         }
-        if (regexResultDate !== null) {
+        if (nonVisibleCharChecker(dataLines[0][1])) {
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains Non-visible characters`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles`
+              en: `Date at row ${rowNumber} contains non-visible characters`,
+              fr: `La date à la ligne ${rowNumber} contient des caractères non visibles`
             })
           );
         }

893-895: LGTM: sample-row detection via trim (Zod4)

Consistent with Zod3 path.


911-919: Use boolean checker and polish messages (per‑cell Zod4 path)

Lowercase “non-visible” and quote column name.

-            // Return error if any non‑visible character is present
-            if (nonVisibleCharChecker(rawValue) !== null) {
+            // Return error if any non‑visible character is present
+            if (nonVisibleCharChecker(rawValue)) {
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column ${key} contains Non-visible characters`,
+                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible characters`,
                   fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles`
                 })
               );
             }

955-957: LGTM: clearer schema-failure message (Zod4)

Matches Zod3.


903-910: Guard against short rows to avoid .trim() on undefined (Zod4 path)

Same crash risk as Zod3 path.

-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/web/src/utils/upload.ts (1)

552-566: Fix false negatives when BOM characters are present.

rawValue is trimmed before invoking nonVisibleCharChecker, but String.prototype.trim() strips some of the very characters we want to detect (e.g. \uFEFF / zero-width no-break space, possibly at either edge). As a result, a cell containing only a leading/trailing BOM passes the new guard even though it should be rejected. Please run the checker on the original cell string before trimming (and keep the trimmed value for later parsing). This also lets us avoid the unsafe elements[i]! when the row is short.

-            const rawValue = elements[i]!.trim();
-            if (rawValue === '\n') {
-              continue;
-            }
-
-            //Check for non visible char in every row, return error if present
-            if (nonVisibleCharChecker(rawValue) !== null) {
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
+            if (rawValue === '\n') {
+              continue;
+            }
+
+            // Return error if any non-visible character is present
+            if (nonVisibleCharChecker(cell) !== null) {
               return reject(
                 new UploadError({
                   en: `Value at row ${rowNumber} and column ${key} contains non-visible characters`,
                   fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles`
                 })
               );
             }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5f0ba7 and 5ebfdca.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
apps/web/src/utils/upload.ts (3)

519-537: Enhance error messages with character code.

Per past feedback, error messages should indicate which character was detected. Currently users only know "non-visible characters" exist but not which one.

Apply this diff to include the character code:

-        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
-        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
-
-        if (regexResultSubject !== null) {
+        const matchSubject = nonVisibleCharChecker(dataLines[0][0]);
+        const matchDate = nonVisibleCharChecker(dataLines[0][1]);
+
+        if (matchSubject !== null) {
+          const charCode = matchSubject[0]?.charCodeAt(0).toString(16).toUpperCase();
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible characters`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
-        if (regexResultDate !== null) {
+        if (matchDate !== null) {
+          const charCode = matchDate[0]?.charCodeAt(0).toString(16).toUpperCase();
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible characters`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }

872-890: Enhance error messages with character code (Zod4 path).

Same as Zod3: include the character code in error messages for better debugging.

Apply the same enhancement as suggested for lines 519-537 in the Zod3 path.


66-72: Simplify return type to boolean.

The function returns a regex exec result but all call sites only check !== null. Since the matched character info is never used in error messages, return a boolean instead:

return /[\u200B-\u200D\uFEFF\u180E]/g.test(entry);

The character selection (zero-width spaces, joiners, and BOM) is standard for CSV validation.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe8460e and 1c21715.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)
🔇 Additional comments (5)
apps/web/src/utils/upload.ts (5)

539-545: Good fix for sample data detection.

Using trimmed comparison instead of character stripping is cleaner and more robust.


603-605: French translation added.

Good addition for consistency.


892-897: Good fix for sample data detection (Zod4 path).

Consistent with Zod3 improvement.


952-954: French translation present.

Consistent localization across both paths.


905-916: Check raw cell value before trimming (Zod4 path).

The cell safety guard is good, but you're still checking rawValue (trimmed) for non-visible characters. This misses leading/trailing characters like BOM. Check cell before trimming, as flagged in past comments.

Apply this diff:

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
             const cell = elements[i];
             const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') continue;
-            // Return error if any non‑visible character is present
-            if (nonVisibleCharChecker(rawValue) !== null) {
+            // Check for non-visible character in raw cell before trimming
+            const match = nonVisibleCharChecker(cell);
+            if (match !== null) {
+              const charCode = match[0]?.charCodeAt(0).toString(16).toUpperCase();
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column ${key} contains non-visible characters`,
-                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles`
+                  en: `The value at row ${rowNumber} and column '${key}' contains non-visible character (U+${charCode})`,
+                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient un caractère non visible (U+${charCode})`
                 })
               );
             }

Based on past review comments.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
apps/web/src/utils/upload.ts (2)

518-538: Minor formatting inconsistency in French error message.

Line 527 has an extra space before the character placeholder compared to line 535.

-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visible(s) ${regexResultSubject[0]}`
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visible(s) ${regexResultSubject[0]}`

Ensure consistent spacing around interpolated values in both French messages.


540-546: Update outdated comment.

The comment references removing Mongolian vowel separators, but the code now uses trimmed string comparison.

-        //remove sample data if included remove any mongolian vowel separators
+        // Remove sample data row if present
         if (
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c21715 and fe5e7c6.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)
🧰 Additional context used
🪛 Biome (2.1.2)
apps/web/src/utils/upload.ts

[error] 70-70: Unexpected control character in a regular expression.

Control characters are unusual and potentially incorrect inputs, so they are disallowed.

(lint/suspicious/noControlCharactersInRegex)


[error] 70-70: Unexpected control character in a regular expression.

Control characters are unusual and potentially incorrect inputs, so they are disallowed.

(lint/suspicious/noControlCharactersInRegex)


[error] 70-70: Unexpected control character in a regular expression.

Control characters are unusual and potentially incorrect inputs, so they are disallowed.

(lint/suspicious/noControlCharactersInRegex)

🔇 Additional comments (2)
apps/web/src/utils/upload.ts (2)

870-870: Fix typo.

"charaters" → "characters"

-        //return an error if non space charaters are found
+        // Return an error if non-visible characters are found

Based on learnings (past comment flagged this typo).


559-568: Check raw cell value before trimming.

Checking the trimmed rawValue loses leading/trailing non-visible characters like BOM (\uFEFF). Check elements[i] directly before trimming.

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') {
               continue;
             }
 
-            //Check for non visible char in every row, return error if present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non-visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {

Based on learnings (past comments flagged this issue).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/web/src/utils/upload.ts (1)

553-568: Check raw cell before trimming.

Line 554 trims first, then line 560 checks the trimmed value. Leading/trailing non-visible characters (e.g., BOM \uFEFF) are lost.

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') {
               continue;
             }
 
-            //Check for non visible char in every row, return error if present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non-visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
+              const charCode = nonVisibleChars[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column ${key} contains non-visible characters ${nonVisibleChars[0]}`,
-                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles ${nonVisibleChars[0]}`
+                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible character (U+${charCode})`,
+                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient un caractère non visible (U+${charCode})`
                 })
               );
             }

Based on learnings (past comments flagged this exact issue).

🧹 Nitpick comments (2)
apps/web/src/utils/upload.ts (2)

520-538: Include Unicode code point in error messages.

Error messages show the character itself ${regexResultSubject[0]}, which may render as invisible. Format as hex code (e.g., U+FEFF) for clarity:

         const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
         const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
 
         if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible character(s) ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visible(s) ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const charCode = regexResultDate[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible character(s) ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visible(s)  ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }

Based on learnings (past comment asked for char code indication).


874-892: Include Unicode code point in error messages.

Error messages show the character itself, which may be invisible. Format as hex code:

         const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
         const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
 
         if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible characters ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const charCode = regexResultDate[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible characters ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe5e7c6 and a1ca11a.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)
🧰 Additional context used
🪛 Biome (2.1.2)
apps/web/src/utils/upload.ts

[error] 70-70: Unexpected control character in a regular expression.

Control characters are unusual and potentially incorrect inputs, so they are disallowed.

(lint/suspicious/noControlCharactersInRegex)


[error] 70-70: Unexpected control character in a regular expression.

Control characters are unusual and potentially incorrect inputs, so they are disallowed.

(lint/suspicious/noControlCharactersInRegex)


[error] 70-70: Unexpected control character in a regular expression.

Control characters are unusual and potentially incorrect inputs, so they are disallowed.

(lint/suspicious/noControlCharactersInRegex)

🔇 Additional comments (2)
apps/web/src/utils/upload.ts (2)

542-546: LGTM: Sample data detection.

Trimmed comparison correctly identifies sample data rows without character-specific stripping.


894-899: LGTM: Sample data detection.

Trimmed comparison correctly identifies sample data rows, consistent with Zod3 path.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
apps/web/src/utils/upload.ts (3)

520-538: Include character code in error messages.

The error messages show the character but should include its Unicode code point for clarity:

         const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
         const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
 
         if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible character(s) ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visible(s) ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const charCode = regexResultDate[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible character(s) ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visible(s)  ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `La date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }

Based on learnings.


870-870: Fix typo.

"charaters" → "characters"

-        //return an error if non space charaters are found
+        //return an error if non-space characters are found

Based on learnings.


874-892: Include character code in error messages.

Mirror the Zod3 fix to include Unicode code points:

         const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
         const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
 
         if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible characters ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const charCode = regexResultDate[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible characters ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `La date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }

Based on learnings.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1ca11a and 2254695.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)
🧰 Additional context used
🪛 Biome (2.1.2)
apps/web/src/utils/upload.ts

[error] 70-70: Unexpected control character in a regular expression.

Control characters are unusual and potentially incorrect inputs, so they are disallowed.

(lint/suspicious/noControlCharactersInRegex)


[error] 70-70: Unexpected control character in a regular expression.

Control characters are unusual and potentially incorrect inputs, so they are disallowed.

(lint/suspicious/noControlCharactersInRegex)


[error] 70-70: Unexpected control character in a regular expression.

Control characters are unusual and potentially incorrect inputs, so they are disallowed.

(lint/suspicious/noControlCharactersInRegex)

🔇 Additional comments (1)
apps/web/src/utils/upload.ts (1)

559-568: Check raw cell before trimming.

Line 554 trims first, then line 560 checks the trimmed value. Leading/trailing non-visible characters (e.g., BOM) are lost.

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') {
               continue;
             }
 
-            //Check for non visible char in every row, return error if present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non-visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
+              const charCode = nonVisibleChars[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column ${key} contains non-visible characters ${nonVisibleChars[0]}`,
-                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles ${nonVisibleChars[0]}`
+                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible character (U+${charCode})`,
+                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient un caractère non visible (U+${charCode})`
                 })
               );
             }

Based on learnings.

Comment on lines 907 to 935
const cell = elements[i];
const rawValue = cell == null ? '' : cell.trim();
if (rawValue === '\n') continue;
// Return error if any non‑visible character is present
const nonVisibleChars = nonVisibleCharChecker(rawValue);
if (nonVisibleChars !== null) {
return reject(
new UploadError({
en: `Value at row ${rowNumber} and column ${key} contains non-visible characters ${nonVisibleChars[0]}`,
fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles ${nonVisibleChars[0]}`
})
);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Check raw cell before trimming.

Line 908 trims first, then line 911 checks the trimmed value. Leading/trailing non-visible characters (e.g., BOM) are lost.

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
             const cell = elements[i];
-            const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') continue;
-            // Return error if any non‑visible character is present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non-visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
+              const charCode = nonVisibleChars[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column ${key} contains non-visible characters ${nonVisibleChars[0]}`,
-                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles ${nonVisibleChars[0]}`
+                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible character (U+${charCode})`,
+                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient un caractère non visible (U+${charCode})`
                 })
               );
             }
+            const rawValue = cell == null ? '' : cell.trim();

Based on learnings.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In apps/web/src/utils/upload.ts around lines 907 to 919, the code trims the cell
before checking for non-visible characters so leading/trailing non-visible
characters (e.g., BOM) are lost; change the flow to first capture the original
cell string (e.g., rawBeforeTrim = cell == null ? '' : cell), run
nonVisibleCharChecker against rawBeforeTrim and return the UploadError if it
finds characters, then set the trimmed value (rawValue = rawBeforeTrim.trim())
for the subsequent logic (including the '\n' check) so trimming happens after
validation.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
apps/web/src/utils/upload.ts (2)

521-539: Include Unicode code point in error messages.

Pre-validation correctly runs before trimming, but error messages show the raw character rather than its code point. Past feedback requested displaying char codes (e.g., "U+FEFF").

         if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible character(s) ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visible(s) ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const charCode = regexResultDate[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible character(s) ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visible(s)  ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }

Based on learnings.


875-893: Include Unicode code point in error messages (Zod4 path).

Mirror the Zod3 feedback: error messages should display the Unicode code point (e.g., "U+FEFF") rather than the raw character for clarity.

         if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible characters ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const charCode = regexResultDate[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible characters ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }

Based on learnings.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2254695 and a4ebd03.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)
🔇 Additional comments (3)
apps/web/src/utils/upload.ts (3)

66-74: Positive-match regex now permits non-Latin scripts.

The regex correctly targets specific non-visible/format characters (zero-width spaces, BOM, soft hyphen, etc.) without blocking valid Cyrillic, CJK, Arabic, or other scripts. This addresses the critical feedback from past reviews.


908-920: Check raw cell before trimming to catch leading/trailing BOM (Zod4 path).

Line 909 trims first, then line 912 checks the trimmed value. Leading/trailing non-visible characters are lost before validation.

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
             const cell = elements[i];
-            const rawValue = cell == null ? '' : cell.trim();
-            if (rawValue === '\n') continue;
-            // Return error if any non‑visible character is present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non-visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
+              const charCode = nonVisibleChars[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column ${key} contains non-visible characters ${nonVisibleChars[0]}`,
-                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles ${nonVisibleChars[0]}`
+                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible character (U+${charCode})`,
+                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient un caractère non visible (U+${charCode})`
                 })
               );
             }
+            const rawValue = cell == null ? '' : cell.trim();
+            if (rawValue === '\n') continue;

Based on learnings.


560-569: Check raw cell before trimming to catch leading/trailing BOM.

Line 555 trims the cell, then line 561 checks the trimmed value. Leading/trailing non-visible characters (e.g., BOM at \uFEFF) are stripped before validation and won't be detected.

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') {
               continue;
             }
 
-            //Check for non visible char in every row, return error if present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non-visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
+              const charCode = nonVisibleChars[0]?.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column ${key} contains non-visible characters ${nonVisibleChars[0]}`,
-                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles ${nonVisibleChars[0]}`
+                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible character (U+${charCode})`,
+                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient un caractère non visible (U+${charCode})`
                 })
               );
             }

Based on learnings.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
apps/web/src/utils/upload.ts (2)

560-570: Check raw cell before trimming.

Line 561 checks rawValue (already trimmed on line 555), so leading/trailing non-visible characters like BOM are lost. Check elements[i] before trimming:

          for (let i = 0; i < headers.length; i++) {
            const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
            if (rawValue === '\n') {
              continue;
            }

-            //Check for non visible char in every row, return error if present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non-visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
            if (nonVisibleChars !== null) {
              const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
              return reject(
                new UploadError({
                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible character(s) (U+${charCode})`,
                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractère(s) non visibles (U+${charCode})`
                })
              );
            }

Based on learnings.


913-922: Check raw cell before trimming (Zod4 path).

Line 913 checks rawValue (trimmed on line 910), losing leading/trailing non-visible characters. Check cell before trimming:

          for (let i = 0; i < headers.length; i++) {
            const key = headers[i]!.trim();
            const cell = elements[i];
-            const rawValue = cell == null ? '' : cell.trim();
            if (rawValue === '\n') continue;
-            // Return error if any non‑visible character is present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non-visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
            if (nonVisibleChars !== null) {
              const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
              return reject(
                new UploadError({
                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible characters (U+${charCode})`,
                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles (U+${charCode})`
                })
              );
            }
+            const rawValue = cell == null ? '' : cell.trim();

Based on learnings.

🧹 Nitpick comments (2)
apps/web/src/utils/upload.ts (2)

521-539: Include character codes in pre-validation errors.

The pre-validation checks raw values correctly but error messages show the character directly rather than as a Unicode code point. Match the per-cell format:

        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);

        if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
          return reject(
            new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible character(s) ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visible(s) ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
            })
          );
        }
        if (regexResultDate !== null) {
+          const charCode = regexResultDate[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
          return reject(
            new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible character(s) ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visible(s)  ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
            })
          );
        }

876-894: Include character codes in pre-validation errors (Zod4 path).

Mirror the Zod3 enhancement—format error messages with Unicode code points:

        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);

        if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
          return reject(
            new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible characters ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
            })
          );
        }
        if (regexResultDate !== null) {
+          const charCode = regexResultDate[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
          return reject(
            new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible characters ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
            })
          );
        }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cdcba35 and cd5b487.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)
🔇 Additional comments (1)
apps/web/src/utils/upload.ts (1)

66-74: Good fix—positive pattern now allows non-Latin scripts.

The regex correctly targets specific non-visible characters (zero-width spaces, BOM, soft hyphen, etc.) without blocking valid Cyrillic, CJK, or other Unicode text. The commented-out regex on line 71 can be removed to reduce clutter.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/web/src/utils/upload.ts (1)

554-570: Check raw cell before trimming.

Line 555 trims first, then line 561 checks the trimmed value. Leading/trailing non-visible characters (e.g., BOM) are lost before validation runs.

Apply this diff:

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') {
               continue;
             }
 
-            //Check for non visible char in every row, return error if present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non-visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
               const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
                   en: `Value at row ${rowNumber} and column '${key}' contains non-visible character(s) (U+${charCode})`,
                   fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractère(s) non visibles (U+${charCode})`
                 })
               );
             }

Based on learnings.

♻️ Duplicate comments (2)
apps/web/src/utils/upload.ts (2)

909-922: Check raw cell before trimming.

Line 910 trims first, then line 913 checks the trimmed value. Leading/trailing non-visible characters are lost.

Apply this diff:

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
             const cell = elements[i];
-            const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') continue;
-            // Return error if any non‑visible character is present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non-visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
               const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
                   en: `Value at row ${rowNumber} and column '${key}' contains non-visible characters (U+${charCode})`,
                   fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles (U+${charCode})`
                 })
               );
             }
+            const rawValue = cell == null ? '' : cell.trim();

Based on learnings.


521-539: Include character code in header error messages.

The per-cell validation (lines 563-567) includes the Unicode code point (e.g., U+FEFF), but header validation only shows the raw character. For consistency and clarity, add the character code here as well.

Apply this diff:

         const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
         const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
 
         if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible character(s) ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visible(s) ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const charCode = regexResultDate[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible character(s) ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visible(s)  ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd5b487 and 15f6843.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)
🔇 Additional comments (5)
apps/web/src/utils/upload.ts (5)

66-74: Regex correctly targets non-visible characters.

The positive pattern now matches specific format/zero-width characters without blocking valid non-Latin scripts. The function returning the match array (instead of boolean) is appropriate since downstream error messages extract the character code from nonVisibleChars[0].


543-544: LGTM.

Sample data detection correctly uses trimmed comparison after validation has already caught non-visible characters.


588-608: LGTM.

French translations added for error context messages.


897-898: LGTM.

Sample data detection matches Zod3 approach.


939-960: LGTM.

French translations added for error context messages, matching Zod3 path.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
apps/web/src/utils/upload.ts (4)

568-578: Check raw cell before trimming.

Line 563 trims elements[i], then line 569 checks the trimmed value. Leading/trailing non-visible characters (e.g., BOM) are lost before validation.

Apply this diff:

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') {
               continue;
             }
 
             //Check for non visible char in every row, return error if present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
               const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
                   en: `Value at row ${rowNumber} and column '${key}' contains non-visible character(s) (U+${charCode})`,
                   fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractère(s) non visibles (U+${charCode})`
                 })
               );
             }

Based on learnings.


917-930: Check raw cell before trimming.

Line 918 trims the cell, then line 921 checks the trimmed value. Leading/trailing non-visible characters (e.g., BOM) are lost.

Apply this diff:

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
             const cell = elements[i];
-            const rawValue = cell == null ? '' : cell.trim();
-            if (rawValue === '\n') continue;
-            // Return error if any non‑visible character is present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Return error if any non‑visible character is present
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
               const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
                   en: `Value at row ${rowNumber} and column '${key}' contains non-visible characters (U+${charCode})`,
                   fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles (U+${charCode})`
                 })
               );
             }
+            const rawValue = cell == null ? '' : cell.trim();
+            if (rawValue === '\n') continue;

Based on learnings.


884-902: Include character code in header error messages.

Same issue as Zod3 path: header validation shows only the raw character, while per-cell validation (lines 923-927) includes the Unicode code point.

Apply this diff:

         const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
         const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
 
         if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible characters ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const charCode = regexResultDate[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible characters ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }

529-547: Include character code in header error messages.

Header validation (lines 532-547) displays only the raw character, while per-cell validation (lines 571-575) includes the Unicode code point. This inconsistency makes debugging harder.

Apply this diff:

         const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
         const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
 
         if (regexResultSubject !== null) {
+          const charCode = regexResultSubject[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible character(s) ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visible(s) ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const charCode = regexResultDate[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible character(s) ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visible(s)  ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
             })
           );
         }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 15f6843 and 85df63e.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (4)
apps/web/src/utils/upload.ts (4)

560-577: Guard against short rows and check raw cell before trimming.

Calling trim() on undefined will throw; trimming also hides leading/trailing BOM and similar characters. Check the original cell first; then trim for parsing.

-            const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
-            if (rawValue === '\n') {
-              continue;
-            }
-
-            //Check for non visible char in every row, return error if present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            const key = headers[i]!.trim();
+            const cell = elements[i];
+            // Check for non-visible characters before trimming (preserves leading/trailing BOM, etc.)
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
-              const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
+              const charCode = nonVisibleChars[0].codePointAt(0)?.toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible character(s) (U+${charCode})`,
-                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractère(s) non visibles (U+${charCode})`
+                  en: `The value at row ${rowNumber} and column '${key}' contains a non-visible character (U+${charCode})`,
+                  fr: `La valeur à la ligne ${rowNumber} et à la colonne '${key}' contient un caractère non visible (U+${charCode})`
                 })
               );
             }
+            const rawValue = cell == null ? '' : cell.trim();
+            if (rawValue === '\n') {
+              continue;
+            }

526-546: Move sample-row removal before header checks; include code point and fix FR.

Running the header checks before removing the sample row can raise false errors on legacy templates with hidden chars. Also include the Unicode code point and fix French grammar.

         let rowNumber = 1;
 
-        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
-        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
+        // Remove sample data row first (handles old templates with hidden chars)
+        if (
+          dataLines[0]?.[0]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[0] &&
+          dataLines[0]?.[1]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[1]
+        ) {
+          dataLines.shift();
+        }
+
+        const [subjectCell, dateCell] = dataLines[0] ?? [];
+        const regexResultSubject = nonVisibleCharChecker(subjectCell);
+        const regexResultDate = nonVisibleCharChecker(dateCell);
 
         if (regexResultSubject !== null) {
+          const code = regexResultSubject[0]?.codePointAt(0)?.toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible character(s) ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visible(s) ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains a non-visible character (U+${code})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${code})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const code = regexResultDate[0]?.codePointAt(0)?.toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible character(s) ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visible(s)  ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains a non-visible character (U+${code})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${code})`
             })
           );
         }
-
-        //remove sample data if included remove any mongolian vowel separators
-        if (
-          dataLines[0][0]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[0] &&
-          dataLines[0][1]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[1]
-        ) {
-          dataLines.shift();
-        }
+        // sample data row already removed above

Also applies to: 548-552


879-901: Do sample-row removal first; include code point and fix FR.

Mirror the Zod3 fix: remove the sample row before header validation; include Unicode code point in messages; fix French wording.

-        //remove sample data if included (account for old mongolian vowel separator templates)
-        //return an error if non space characters are found
-
-        let rowNumber = 1;
-
-        const regexResultSubject = nonVisibleCharChecker(dataLines[0][0]);
-        const regexResultDate = nonVisibleCharChecker(dataLines[0][1]);
+        // remove sample data if included (account for old Mongolian vowel separator templates)
+        if (
+          dataLines[0]?.[0]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[0] &&
+          dataLines[0]?.[1]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[1]
+        ) {
+          dataLines.shift();
+        }
+        // return an error if non-space characters are found
+        let rowNumber = 1;
+        const [subjectCell, dateCell] = dataLines[0] ?? [];
+        const regexResultSubject = nonVisibleCharChecker(subjectCell);
+        const regexResultDate = nonVisibleCharChecker(dateCell);
 
         if (regexResultSubject !== null) {
+          const code = regexResultSubject[0]?.codePointAt(0)?.toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Subject ID at row ${rowNumber} contains non-visible characters ${regexResultSubject[0]}`,
-              fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles ${regexResultSubject[0]}`
+              en: `Subject ID at row ${rowNumber} contains a non-visible character (U+${code})`,
+              fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${code})`
             })
           );
         }
         if (regexResultDate !== null) {
+          const code = regexResultDate[0]?.codePointAt(0)?.toString(16).toUpperCase().padStart(4, '0');
           return reject(
             new UploadError({
-              en: `Date at row ${rowNumber} contains non-visible characters ${regexResultDate[0]}`,
-              fr: `Date à la ligne ${rowNumber} contient des caractères non visibles ${regexResultDate[0]}`
+              en: `Date at row ${rowNumber} contains a non-visible character (U+${code})`,
+              fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${code})`
             })
           );
         }
-
-        if (
-          dataLines[0][0]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[0] &&
-          dataLines[0][1]?.trim() === INTERNAL_HEADERS_SAMPLE_DATA[1]
-        ) {
-          dataLines.shift();
-        }
+        // sample data row already removed above

Also applies to: 903-907


916-929: Check raw cell before trimming in Zod4 path.

Same issue: trimming first hides leading/trailing non-visible chars (BOM, etc.). Check the original cell, then trim for parsing.

             const key = headers[i]!.trim();
             const cell = elements[i];
-            const rawValue = cell == null ? '' : cell.trim();
-            if (rawValue === '\n') continue;
-            // Return error if any non‑visible character is present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            // Check for non‑visible character before trimming
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
               const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
-                  en: `Value at row ${rowNumber} and column '${key}' contains non-visible characters (U+${charCode})`,
-                  fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles (U+${charCode})`
+                  en: `The value at row ${rowNumber} and column '${key}' contains a non-visible character (U+${charCode})`,
+                  fr: `La valeur à la ligne ${rowNumber} et à la colonne '${key}' contient un caractère non visible (U+${charCode})`
                 })
               );
             }
+            const rawValue = cell == null ? '' : cell.trim();
+            if (rawValue === '\n') continue;
🧹 Nitpick comments (1)
apps/web/src/utils/upload.ts (1)

66-81: Precompile the regex and clarify intent.

Avoid recompiling the heavy RegExp on every call; hoist it to module scope. Also the name “zeroWidthCharacterExceptNewline” includes a wide control range, not just zero‑width. Consider a clearer name.

-function nonVisibleCharChecker(entry: string | undefined) {
-  if (!entry) {
-    return null;
-  }
-  const ansiEscapeCode = '[\u001b\u009b][[()#;?]*(?:[0-9]{1,4}(?:;[0-9]{0,4})*)?[0-9A-PRZcf-nqry=><]',
-    zeroWidthCharacterExceptNewline =
-      '\u0000-\u0008\u000B-\u0019\u001b\u009b\u00ad\u200b\u2028\u2029\ufeff\ufe00-\ufe0f';
-
-  const zeroWidthCharactersExceptNewline = new RegExp(
-    '(?:' + ansiEscapeCode + ')|[' + zeroWidthCharacterExceptNewline + ']',
-    'g'
-  );
-  const nonVisibleCharCheck = zeroWidthCharactersExceptNewline.exec(entry);
-
-  return nonVisibleCharCheck;
-}
+function nonVisibleCharChecker(entry: string | undefined) {
+  if (!entry) return null;
+  return NON_VISIBLE_RE.exec(entry);
+}

Add at module scope:

// Matches ANSI escape sequences and non-visible/control characters excluding \t, \n, \r, and printable ASCII
const ANSI_ESCAPE = String.raw`[\u001b\u009b][[()#;?]*(?:\d{1,4}(?:;\d{0,4})*)?[0-9A-PRZcf-nqry=><]`;
const NON_VISIBLE_RANGE = String.raw`\u0000-\u0008\u000B-\u0019\u001B\u009B\u00AD\u200B\u2028\u2029\uFEFF\uFE00-\uFE0F`;
const NON_VISIBLE_RE = new RegExp(`(?:${ANSI_ESCAPE})|[${NON_VISIBLE_RANGE}]`, 'g');
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85df63e and f9cc434.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/web/src/utils/upload.ts (1)

560-577: Run the non-visible check on the original cell value.

We trim the cell before validation, so leading/trailing markers like \uFEFF are stripped and slip through. Calling the checker on the raw cell also avoids trimming undefined when rows are short.

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') {
               continue;
             }
 
-            //Check for non visible char in every row, return error if present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            //Check for non visible char in every row, return error if present
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
               const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
                   en: `Value at row ${rowNumber} and column '${key}' contains non-visible character(s) (U+${charCode})`,
♻️ Duplicate comments (1)
apps/web/src/utils/upload.ts (1)

915-928: Check the untrimmed cell in the Zod4 path as well.

Same issue here: trimming before validation drops leading/trailing non-visible characters (e.g., BOM), so they pass the check.

             const key = headers[i]!.trim();
             const cell = elements[i];
             const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') continue;
             // Return error if any non-visible character is present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
               const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
               return reject(
                 new UploadError({
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4225391 and 7d9c2fa.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
apps/web/src/utils/upload.ts (2)

570-580: Check raw cell value before trimming.

Line 565 trims elements[i] before line 571 checks for non-visible characters. Leading/trailing non-visible characters (e.g., BOM \uFEFF) are stripped by .trim() and won't be detected. The header validation (lines 529-530) correctly checks raw values—apply the same pattern here.

Apply this diff:

           for (let i = 0; i < headers.length; i++) {
             const key = headers[i]!.trim();
-            const rawValue = elements[i]!.trim();
+            const cell = elements[i];
+            const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') {
               continue;
             }
 
             //Check for non visible char in every row, return error if present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {

921-933: Check raw cell value before trimming.

Line 922 trims the cell, then line 925 checks the trimmed rawValue. Leading/trailing non-visible characters are lost. You've already captured the raw cell in the cell variable on line 921—use it for the check.

Apply this diff:

             const key = headers[i]!.trim();
             const cell = elements[i];
             const rawValue = cell == null ? '' : cell.trim();
             if (rawValue === '\n') continue;
             // Return error if any non‑visible character is present
-            const nonVisibleChars = nonVisibleCharChecker(rawValue);
+            const nonVisibleChars = nonVisibleCharChecker(cell);
             if (nonVisibleChars !== null) {
🧹 Nitpick comments (2)
apps/web/src/utils/upload.ts (2)

66-82: Document the rationale for the chosen non-visible character set.

The regex pattern covers ANSI escapes, control characters, zero-width characters, BOM, and variation selectors. However, a past reviewer asked "Why only these five chars?" (referencing an earlier version). Consider adding a comment explaining which character classes are matched and why these were chosen, to help future maintainers understand the coverage.


529-549: Consider using singular forms in error messages.

Since you're reporting a single character code (the first match), the messages would read more naturally in singular form:

  • English: "contains non-visible character" (not "character(s)")
  • French: "contient un caractère non visible" (not "des caractères non visible(s)")

This applies to lines 536-537 and 545-546.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3876707 and a553823.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
apps/web/src/utils/__tests__/upload.test.ts (4)

61-64: Field-order updates are fine; note toMatchObject ignores order.

Reflecting schema order (notes before score) is OK, but toMatchObject doesn’t depend on property ordering. Keep if it improves readability.

Also applies to: 77-80


64-64: DRY up repeated instrument stubs.

Consider tiny helpers to avoid repeating as unknown as AnyUnilingualFormInstrument:

const inst3 = (schema: z3.ZodTypeAny) =>
  ({ validationSchema: schema } as unknown as AnyUnilingualFormInstrument);
const inst4 = (schema: z4.ZodTypeAny) =>
  ({ validationSchema: schema } as unknown as AnyUnilingualFormInstrument);

Then use inst3(z3.object({...})) / inst4(z4.object({...})).

Also applies to: 95-95, 117-117, 136-136, 203-203, 237-237, 256-256, 280-280


200-203: Zod4 reorder looks good; ordering in expectations is cosmetic.

Switching to feedback before score and updating expectations is consistent. Reminder: toMatchObject is order-agnostic.

Also applies to: 216-219


61-63: Add tests for non‑visible character rejection (headers and cells).

Given the PR goal, add focused cases for both Zod3 and Zod4:

  • Header with zero‑width char → reject with codepoint in message (e.g., U+200B), include row/column info.
  • Cell value with bidi/format char (e.g., U+200E) → reject similarly.

Example sketches:

it('rejects non-visible char in header (Zod3)', async () => {
  const instrument = inst3(z3.object({ score: z3.number() }));
  const csv = unparse([['subjectID', `date\u200B`, 'score'], ['s1','2024-01-01','10']]);
  const file = new File([csv], 'data.csv', { type: 'text/csv' });
  await expect(Zod3.processInstrumentCSV(file, instrument)).rejects.toThrow(/U\+200B/i);
});

it('rejects non-visible char in cell (Zod4)', async () => {
  const instrument = inst4(z4.object({ notes: z4.string() }));
  const csv = unparse([['subjectID','date','notes'], ['s1','2024-01-01', `ok\u200E`] ]);
  const file = new File([csv], 'data.csv', { type: 'text/csv' });
  await expect(Zod4.processInstrumentCSV(file, instrument)).rejects.toThrow(/U\+200E/i);
});

Happy to add coverage for U+180E and localized (FR) messages as well.

Also applies to: 200-203

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 28fab6d and 10e8c78.

📒 Files selected for processing (2)
  • apps/web/src/utils/__tests__/upload.test.ts (12 hunks)
  • apps/web/src/utils/upload.ts (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/web/src/utils/upload.ts
🔇 Additional comments (2)
apps/web/src/utils/__tests__/upload.test.ts (2)

37-46: Nice regression guard on key normalization.

Reversing object keys in the input and asserting multiKeys: ['name','age'] plus aligned multiValues is a good way to enforce deterministic ordering.


92-95: Optional-before-required case reads well.

LGTM. This verifies empty string → undefined for optional, and preserves the required field.

Also applies to: 107-109

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
apps/web/src/utils/upload.ts (2)

572-582: Check raw cell before trimming to catch leading/trailing non-visible characters.

Line 567 trims the cell, then line 573 checks the trimmed value. Leading/trailing non-visible characters (e.g., BOM \uFEFF) are lost. This was flagged in past reviews but remains unaddressed.

Apply this diff:

         for (let i = 0; i < headers.length; i++) {
           const key = headers[i]!.trim();
-          const rawValue = elements[i]!.trim();
+          const cell = elements[i];
+          // Check for non-visible character in raw cell before trimming
+          const nonVisibleChars = nonVisibleCharChecker(cell);
+          if (nonVisibleChars !== null) {
+            const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
+            return reject(
+              new UploadError({
+                en: `Value at row ${rowNumber} and column '${key}' contains non-visible character (U+${charCode})`,
+                fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient un caractère non visible (U+${charCode})`
+              })
+            );
+          }
+          const rawValue = cell == null ? '' : cell.trim();
           if (rawValue === '\n') {
             continue;
           }
-
-          //Check for non visible char in every row, return error if present
-          const nonVisibleChars = nonVisibleCharChecker(rawValue);
-          if (nonVisibleChars !== null) {
-            const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
-            return reject(
-              new UploadError({
-                en: `Value at row ${rowNumber} and column '${key}' contains non-visible character(s) (U+${charCode})`,
-                fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractère(s) non visibles (U+${charCode})`
-              })
-            );
-          }

Based on learnings.


927-936: Check raw cell before trimming to catch leading/trailing non-visible characters.

Line 924 trims the cell, then line 927 checks the trimmed value. Leading/trailing non-visible characters (e.g., BOM \uFEFF) are lost. This was flagged in past reviews but remains unaddressed.

Apply this diff:

         for (let i = 0; i < headers.length; i++) {
           const key = headers[i]!.trim();
           const cell = elements[i];
-          const rawValue = cell == null ? '' : cell.trim();
-          if (rawValue === '\n') continue;
-          // Return error if any non‑visible character is present
-          const nonVisibleChars = nonVisibleCharChecker(rawValue);
+          // Return error if any non-visible character is present
+          const nonVisibleChars = nonVisibleCharChecker(cell);
           if (nonVisibleChars !== null) {
             const charCode = nonVisibleChars[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
             return reject(
               new UploadError({
-                en: `Value at row ${rowNumber} and column '${key}' contains non-visible characters (U+${charCode})`,
-                fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient des caractères non visibles (U+${charCode})`
+                en: `Value at row ${rowNumber} and column '${key}' contains non-visible character (U+${charCode})`,
+                fr: `La valeur à la ligne ${rowNumber} et colonne '${key}' contient un caractère non visible (U+${charCode})`
               })
             );
           }
+          const rawValue = cell == null ? '' : cell.trim();
+          if (rawValue === '\n') continue;

Based on learnings.

🧹 Nitpick comments (2)
apps/web/src/utils/upload.ts (2)

534-551: Refine error messages for consistency.

Two minor issues:

  1. Use singular "character" since you're displaying a single char code.
  2. Remove double space in French Date message before "(U+".

Apply this diff:

         const charCode = regexResultSubject[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
         return reject(
           new UploadError({
-            en: `Subject ID at row ${rowNumber} contains non-visible character(s) (U+${charCode})`,
-            fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visible(s) (U+${charCode})`
+            en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+            fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
           })
         );
       }
       if (regexResultDate !== null) {
         const charCode = regexResultDate[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
         return reject(
           new UploadError({
-            en: `Date at row ${rowNumber} contains non-visible character(s) (U+${charCode})`,
-            fr: `Date à la ligne ${rowNumber} contient des caractères non visible(s)  (U+${charCode})`
+            en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+            fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
           })
         );

891-908: Use singular "character" in error messages.

Since you display a single char code, use singular "character" instead of plural "characters".

Apply this diff:

         const charCode = regexResultSubject[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
         return reject(
           new UploadError({
-            en: `Subject ID at row ${rowNumber} contains non-visible characters (U+${charCode})`,
-            fr: `L'ID du sujet à la ligne ${rowNumber} contient des caractères non visibles (U+${charCode})`
+            en: `Subject ID at row ${rowNumber} contains non-visible character (U+${charCode})`,
+            fr: `L'ID du sujet à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
           })
         );
       }
       if (regexResultDate !== null) {
         const charCode = regexResultDate[0].charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
         return reject(
           new UploadError({
-            en: `Date at row ${rowNumber} contains non-visible characters (U+${charCode})`,
-            fr: `Date à la ligne ${rowNumber} contient des caractères non visibles (U+${charCode})`
+            en: `Date at row ${rowNumber} contains non-visible character (U+${charCode})`,
+            fr: `Date à la ligne ${rowNumber} contient un caractère non visible (U+${charCode})`
           })
         );
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 10e8c78 and e8292c9.

📒 Files selected for processing (1)
  • apps/web/src/utils/upload.ts (7 hunks)

david-roper and others added 2 commits October 8, 2025 10:57
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@joshunrau joshunrau merged commit 5836659 into DouglasNeuroInformatics:main Oct 9, 2025
1 of 2 checks passed
@david-roper david-roper deleted the non-visisble-chars branch November 6, 2025 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants