Terminal search fails to match multi-character CJK text (Chinese/Japanese/Korean) #319

@arvinaij

Description

Hi,
The terminal search function (Ctrl+F) cannot correctly match CJK (Chinese, Japanese, Korean) text when the search pattern contains more than one character. For example, searching for "中文" (two Chinese characters) yields no results, but searching for a single character like "中" works correctly.

The issue is in TerminalSearchUtil.searchInTerminalTextBuffer().

JediTerm stores each double-width character (for example, a CJK character) followed by a special placeholder, DWC ('\uE000'), so that buffer positions map 1:1 to screen cells. For example:

User input: "中文"
Buffer: ['中', '\uE000', '文', '\uE000']

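The current searchInTerminalTextBuffer() method in TerminalSearchUtil does not convert the search pattern to this internal buffer format, so multi-character CJK searches fail: the buffer has a DWC between the two characters, while the pattern does not. A single-character pattern still matches because it never has to span a placeholder.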
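The mismatch can be reproduced outside JediTerm with plain strings. The following is only a minimal sketch: the class name DwcSearchDemo and the method bufferLine() are made up for illustration, and the buffer layout is taken from the example above rather than read from a real TerminalTextBuffer.

```java
// Sketch: why a raw multi-character CJK pattern misses the buffer contents.
// DwcSearchDemo and bufferLine() are hypothetical names, not JediTerm API.
final class DwcSearchDemo {
  // Placeholder that follows every double-width character (value from the issue text).
  static final char DWC = '\uE000';

  // Buffer layout for the on-screen text U+4E2D U+6587 (the two characters
  // from the example): each character is followed by the placeholder.
  static String bufferLine() {
    return "\u4E2D" + DWC + "\u6587" + DWC;
  }

  public static void main(String[] args) {
    String line = bufferLine();
    // Single character: found at index 0.
    System.out.println(line.indexOf("\u4E2D"));
    // Raw two-character pattern: -1, because DWC sits between the characters.
    System.out.println(line.indexOf("\u4E2D\u6587"));
    // Pattern expanded to the buffer layout: found at index 0.
    System.out.println(line.indexOf("\u4E2D" + DWC + "\u6587" + DWC));
  }
}
```

The third lookup is what the proposed fix below aims for: expand the pattern first, then search.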

Proposed Solution

1. Modify JediTermWidget.java - findText method

Pass the ambiguousCharsAreDoubleWidth setting to the search function:
  private void findText(String text, boolean ignoreCase) {
      boolean ambiguousIsDWC = myTerminalPanel.ambiguousCharsAreDoubleWidth();
      FindResult results = TerminalSearchUtil.searchInTerminalTextBuffer(
          getTerminalTextBuffer(), text, ignoreCase, ambiguousIsDWC);
      myTerminalPanel.setFindResult(results);
      myFindComponent.onResultUpdated(results);
      myScrollBar.repaint();
  }

2. Modify TerminalSearchUtil.java

Update the search method to preprocess the pattern for double-width characters:

class TerminalSearchUtil {

  static @Nullable SubstringFinder.FindResult searchInTerminalTextBuffer(@NotNull TerminalTextBuffer textBuffer, @NotNull String pattern,
    boolean ignoreCase, boolean ambiguousIsDWC) {
    if (pattern.isEmpty()) {
      return null;
    }

    pattern = stringToDoubleWidthCharacter(pattern, ambiguousIsDWC);

    final SubstringFinder finder = new SubstringFinder(pattern, ignoreCase);

    textBuffer.processHistoryAndScreenLines(-textBuffer.getHistoryLinesCount(), -1, new StyledTextConsumer() {
      @Override public void consume(int x, int y, @NotNull TextStyle style, @NotNull CharBuffer characters, int startRow) {
        int offset = 0;
        int length = characters.length();
        if (characters instanceof SubCharBuffer) {
          SubCharBuffer subCharBuffer = (SubCharBuffer)characters;
          characters = subCharBuffer.getParent();
          offset = subCharBuffer.getOffset();
        }
        for (int i = offset; i < offset + length; i++) {
          finder.nextChar(x, y - startRow, characters, i);
        }
      }

      @Override public void consumeNul(int x, int y, int nulIndex, @NotNull TextStyle style, @NotNull CharBuffer characters, int startRow) {
      }

      @Override public void consumeQueue(int x, int y, int nulIndex, int startRow) {
      }
    });

    return finder.getResult();
  }

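  // Inserts the DWC placeholder ('\uE000') after every double-width character
  // so that the pattern has the same layout as the terminal text buffer.
  private static String stringToDoubleWidthCharacter(String str, boolean ambiguousIsDWC) {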
    char[] chars = str.toCharArray();

    int dwcCount = CharUtils.countDoubleWidthCharacters(chars, 0, chars.length, ambiguousIsDWC);

    char[] buf;

    if (dwcCount > 0) {
      // Leave gaps for the private use "DWC" character, which simply tells the rendering code to advance one cell.
      buf = new char[chars.length + dwcCount];

      int j = 0;
      for (int i = 0; i < chars.length; i++) {
        buf[j] = chars[i];
        int codePoint = Character.codePointAt(chars, i);
        boolean doubleWidthCharacter = CharUtils.isDoubleWidthCharacter(codePoint, ambiguousIsDWC);
        if (doubleWidthCharacter) {
          j++;
          buf[j] = '\uE000';
        }
        j++;
      }
    }
    else {
      buf = chars;
    }
    return new String(buf);
  }
}
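To sanity-check the expected behavior on mixed-width input without a running terminal, the expansion can be sketched in isolation. This is an illustration under stated assumptions, not the proposed patch: PatternExpansionSketch, expand() and isWide() are hypothetical names, and isWide() is a deliberately simplified stand-in for CharUtils.isDoubleWidthCharacter that only treats CJK Unified Ideographs (U+4E00..U+9FFF) as double-width.

```java
// Sketch: expected pattern expansion for mixed ASCII/CJK input.
// All names here are hypothetical; isWide() is NOT JediTerm's width check.
final class PatternExpansionSketch {
  // Placeholder value taken from the issue text.
  static final char DWC = '\uE000';

  // Simplified assumption: only U+4E00..U+9FFF counts as double-width.
  static boolean isWide(int codePoint) {
    return codePoint >= 0x4E00 && codePoint <= 0x9FFF;
  }

  // Appends DWC after every character that isWide() reports as double-width.
  static String expand(String pattern) {
    StringBuilder sb = new StringBuilder(pattern.length() * 2);
    for (int i = 0; i < pattern.length(); i++) {
      char c = pattern.charAt(i);
      sb.append(c);
      if (isWide(c)) {
        sb.append(DWC);
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // ASCII-only patterns must pass through unchanged.
    System.out.println(expand("ls").equals("ls"));
    // 'a', U+4E2D, 'b' should become 'a', U+4E2D, DWC, 'b'.
    System.out.println(expand("a\u4E2Db").equals("a\u4E2D\uE000b"));
  }
}
```

A unit test for the real stringToDoubleWidthCharacter could assert the same two properties: narrow-only patterns are returned unchanged, and every double-width character is immediately followed by exactly one DWC.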
