Conversation

Contributor

Copilot AI commented Dec 28, 2025

  • Understand the comment from @nixel2007 about text insertion without line breaks
  • Analyze the current algorithm in findSuffixMatchWithOffset
  • Create a test case that demonstrates the issue
  • Fix the algorithm to handle inline text insertions (lineOffset=0 with deltaStart changes)
  • Verify the fix with tests (all SemanticTokensProviderTest tests pass)
  • Run full test suite to ensure no regressions (all tests pass)
  • Code review and cleanup (removed debug statements, improved code clarity)
  • Final security scan (CodeQL passed with no alerts)
  • Add specific field assertions for decoded tokens (per @nixel2007 feedback)
  • Add detailed delta edit assertions (per @nixel2007 feedback)

Summary

Successfully addressed @nixel2007's feedback about text insertion without line breaks. The findSuffixMatchWithOffset algorithm now correctly handles inline text insertions (lineOffset=0) by allowing the boundary token to have a different deltaStart value while maintaining the same deltaLine, length, type, and modifiers.

Changes:

  • Modified SemanticTokensProvider.findSuffixMatchWithOffset() to handle inline insertions
  • Added test case deltaWithTextInsertedOnSameLine_shouldReturnOptimalDelta() with comprehensive assertions:
    • Decoded token field assertions for both original and modified arrays
    • Delta edit field assertions (start, deleteCount, data size)
  • Improved code clarity with named constants
  • All tests pass, no security issues found

@coderabbitai
Contributor

coderabbitai bot commented Dec 28, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot AI and others added 2 commits December 28, 2025 11:06
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
When text is inserted on the same line without line breaks (lineOffset=0),
the boundary token may have a different deltaStart value while keeping
the same deltaLine. The algorithm now accounts for this case by allowing
deltaStart to differ for the boundary token when lineOffset=0.

Addresses comment from @nixel2007 about text insertion without line breaks.

Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
@nixel2007
Member

@coderabbitai in the original PR you said there was no problem. But here a problem was found. Please re-check: who is right?

@coderabbitai
Contributor

coderabbitai bot commented Dec 28, 2025

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get details about PR #3707 including description and comments
gh pr view 3707 --json title,body,comments --jq '{title, body, comments: [.comments[] | {author: .author.login, body: .body, createdAt: .createdAt}]}'

Length of output: 1859


🏁 Script executed:

#!/bin/bash
# Search for related PRs about semantic tokens
gh pr list --state all --search "semantic token" --limit 20 --json number,title,author,state,createdAt

# Also look at the files changed in this PR to understand what's being modified
gh pr view 3707 --json files --jq '.files[] | .path'

Length of output: 4314


🏁 Script executed:

#!/bin/bash
# Get details about PR #3706 including my comments
gh pr view 3706 --json title,body,comments --jq '{title, body, comments: [.comments[] | select(.author.login == "coderabbitai") | {body: .body, createdAt: .createdAt}]}'

# Also get all comments to see the context
gh pr view 3706 --json comments --jq '.comments[] | {author: .author.login, body: .body, createdAt: .createdAt}' | head -50

Length of output: 27649


🏁 Script executed:

#!/bin/bash
# Get review comments (on specific code lines) from PR #3706
gh api /repos/1c-syntax/bsl-language-server/pulls/3706/reviews --jq '.[] | select(.user.login == "coderabbitai") | {state: .state, body: .body, submitted_at: .submitted_at}'

# Get review comments on specific diff hunks
gh api /repos/1c-syntax/bsl-language-server/pulls/3706/comments --jq '.[] | select(.user.login == "coderabbitai") | {path: .path, body: .body, diff_hunk: .diff_hunk, line: .line}'

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Get all comments from PR #3706 including review threads
gh api /repos/1c-syntax/bsl-language-server/pulls/3706/comments --jq '.[] | {author: .user.login, body: .body, path: .path, line: .line, in_reply_to_id: .in_reply_to_id}'

# Also check if there was a previous PR that was merged about semantic tokens
gh pr view 3696 --json title,body,number,state,mergedAt

Length of output: 6026


🏁 Script executed:

#!/bin/bash
# Get the diff from PR #3707 to see what was actually fixed
gh pr diff 3707

# Also get the test file to understand the test case that demonstrates the issue
gh pr view 3707 --json files --jq '.files[] | select(.path | contains("Test")) | .path' | head -1 | xargs -I {} gh api /repos/1c-syntax/bsl-language-server/contents/{} --jq '.content' | base64 -d | tail -100

Length of output: 9638


@nixel2007, you are right: I was wrong in PR #3706.

The problem I missed:

In the original PR #3706 I incorrectly claimed that the algorithm correctly handles text insertion without a line break. In fact, when text is inserted on the same line (for example, "Перем Новая, А;" instead of "Перем А;"):

  • Tokens after the insertion point have the same deltaLine (the line did not change)
  • But a different deltaStart (the position shifted to the right)

The old algorithm in findSuffixMatchWithOffset compared all fields starting from index j=1 (which includes deltaStart) and found no match, even though the tokens themselves are identical and merely shifted horizontally.
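
To make the deltaLine/deltaStart point concrete, here is a minimal sketch of the relative encoding that the LSP semantic tokens format uses (five integers per token). The token type ids, the single-line layout, and the class and method names are assumptions made for illustration; they are not taken from the project's code.

```java
import java.util.Arrays;

// Illustrative sketch only: type ids and positions are assumptions.
final class TokenEncodingSketch {
    // Hypothetical token type ids (not the server's real legend).
    static final int KEYWORD = 0;
    static final int VARIABLE = 1;
    static final int OPERATOR = 2;

    // Absolute tokens as {line, startChar, length, type, modifiers}.
    static final int[][] BEFORE = {   // Перем А;
        {0, 0, 5, KEYWORD, 0},        // Перем
        {0, 6, 1, VARIABLE, 0},       // А
        {0, 7, 1, OPERATOR, 0},       // ;
    };
    static final int[][] AFTER = {    // Перем Новая, А;
        {0, 0, 5, KEYWORD, 0},        // Перем
        {0, 6, 5, VARIABLE, 0},       // Новая
        {0, 11, 1, OPERATOR, 0},      // ,
        {0, 13, 1, VARIABLE, 0},      // А
        {0, 14, 1, OPERATOR, 0},      // ;
    };

    // Relative encoding: deltaLine is relative to the previous token's line;
    // deltaStart is relative to the previous token's start only when both
    // tokens are on the same line, otherwise it is the absolute start.
    static int[] encode(int[][] tokens) {
        int[] data = new int[tokens.length * 5];
        int prevLine = 0;
        int prevStart = 0;
        for (int i = 0; i < tokens.length; i++) {
            int[] t = tokens[i];
            int deltaLine = t[0] - prevLine;
            int deltaStart = (deltaLine == 0) ? t[1] - prevStart : t[1];
            data[i * 5] = deltaLine;
            data[i * 5 + 1] = deltaStart;
            data[i * 5 + 2] = t[2];
            data[i * 5 + 3] = t[3];
            data[i * 5 + 4] = t[4];
            prevLine = t[0];
            prevStart = t[1];
        }
        return data;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode(BEFORE)));
        System.out.println(Arrays.toString(encode(AFTER)));
    }
}
```

Under these assumptions the "А" token is encoded with deltaStart 6 before the edit and deltaStart 2 after it, while its deltaLine, length, type, and modifiers are unchanged. The ";" token keeps deltaStart 1 in both arrays because it is measured relative to "А".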

The fix in PR #3707:

Now, when lineOffset == 0 (insertion without a line break), for the boundary token the algorithm:

  • Starts the comparison at j=2, skipping both deltaLine and deltaStart
  • Compares only length, type, and modifiers
  • This lets it recognize the token as matching despite the shifted position
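
The behavior described above can be sketched roughly as follows. This is a simplified illustration restricted to lineOffset == 0, not the project's actual findSuffixMatchWithOffset; the method name, its signature, and the sample arrays (hypothetical type ids 0, 1, 2 for keyword, variable, operator) are assumptions.

```java
// Simplified sketch of the described check for the lineOffset == 0 case only.
// It is NOT the real findSuffixMatchWithOffset implementation.
final class InlineSuffixMatchSketch {
    static final int TOKEN_SIZE = 5;
    static final int DELTA_LINE_INDEX = 0;
    static final int DELTA_START_INDEX = 1;

    // Counts tokens that match from the end of both arrays, allowing one
    // boundary token whose deltaStart differs. prefixTokens is the number of
    // leading tokens already matched, so the suffix never overlaps the prefix.
    static int suffixMatch(int[] prev, int[] curr, int prefixTokens) {
        int prevTokens = prev.length / TOKEN_SIZE;
        int currTokens = curr.length / TOKEN_SIZE;
        int maxSuffix = Math.min(prevTokens, currTokens) - prefixTokens;
        int matched = 0;
        boolean foundBoundary = false;
        for (int i = 0; i < maxSuffix && !foundBoundary; i++) {
            int p = (prevTokens - 1 - i) * TOKEN_SIZE;
            int c = (currTokens - 1 - i) * TOKEN_SIZE;
            // length, type, and modifiers must always be equal
            boolean otherFieldsMatch = true;
            for (int j = DELTA_START_INDEX + 1; j < TOKEN_SIZE; j++) {
                if (prev[p + j] != curr[c + j]) {
                    otherFieldsMatch = false;
                    break;
                }
            }
            // with lineOffset == 0 the deltaLine values must be equal as well
            if (!otherFieldsMatch
                || prev[p + DELTA_LINE_INDEX] != curr[c + DELTA_LINE_INDEX]) {
                break;
            }
            // a differing deltaStart marks the boundary token; it still counts
            // as matched, but nothing before it is examined
            if (prev[p + DELTA_START_INDEX] != curr[c + DELTA_START_INDEX]) {
                foundBoundary = true;
            }
            matched++;
        }
        return matched;
    }

    public static void main(String[] args) {
        // "Перем А;" and "Перем Новая, А;" in relative encoding
        int[] prev = {0, 0, 5, 0, 0,  0, 6, 1, 1, 0,  0, 1, 1, 2, 0};
        int[] curr = {0, 0, 5, 0, 0,  0, 6, 5, 1, 0,  0, 5, 1, 2, 0,
                      0, 2, 1, 1, 0,  0, 1, 1, 2, 0};
        System.out.println(suffixMatch(prev, curr, 1)); // prints 2
    }
}
```

With the sample data, ";" matches on every field and "А" matches as the boundary token, so the sketch reports a suffix of two tokens.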

Your question in the original PR was entirely justified, and the problem that was found is real. I apologize for the incorrect initial analysis.

   __  

__( o)>
\ <
. )
`---'
learning


✏️ Learnings added
Learnt from: nixel2007
URL: https://github.com/1c-syntax/bsl-language-server/pull/3707

Timestamp: 2025-12-28T11:15:32.317Z
Learning: In PR #3706 about semantic token optimization, I incorrectly stated that the algorithm in findSuffixMatchWithOffset correctly handles inline text insertions (text insertion without line breaks). The actual bug is that when lineOffset=0, tokens after the insertion point have the same deltaLine but different deltaStart, and the old algorithm failed to match them because it checked deltaStart equality. PR #3707 fixes this by allowing deltaStart to differ for the boundary token when lineOffset==0.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copilot AI and others added 2 commits December 28, 2025 11:16
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
- Added DELTA_LINE_INDEX and DELTA_START_INDEX constants for clarity
- Simplified redundant condition check
- Improved test assertion to compare against new token size

Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Address semantic token storage optimization and delta calculation" to "Fix inline text insertion handling in semantic token delta calculation" Dec 28, 2025
Copilot AI requested a review from nixel2007 December 28, 2025 11:22
…test

Added detailed assertions on decoded token fields (line, start, length, type, modifiers)
for both original and modified token arrays to verify the exact token structure.

Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Added specific assertions on delta edit fields:
- start position (5 - after prefix match)
- deleteCount (0 - no deletion needed due to suffix match)
- data size (10 integers = 2 tokens to insert)

The test now verifies that with lineOffset=0, the suffix matching correctly
identifies "А" and ";" tokens despite different deltaStart positions.

Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
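
The three asserted values in the commit message above follow from simple arithmetic on token counts. The sketch below assumes 3 tokens before the edit, 5 after it, a prefix match of 1 token, and a suffix match of 2 tokens; those counts are inferred from the discussion, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Arithmetic sketch: how prefix/suffix match counts translate into the
// start, deleteCount, and data size of a single delta edit.
final class DeltaEditArithmeticSketch {
    static final int TOKEN_SIZE = 5; // integers per encoded token

    // Returns {start, deleteCount, dataSize}, all measured in integers.
    static int[] editShape(int prevTokens, int currTokens, int prefix, int suffix) {
        int start = prefix * TOKEN_SIZE;
        int deleteCount = (prevTokens - prefix - suffix) * TOKEN_SIZE;
        int dataSize = (currTokens - prefix - suffix) * TOKEN_SIZE;
        return new int[] {start, deleteCount, dataSize};
    }

    public static void main(String[] args) {
        // assumed counts for "Перем А;" -> "Перем Новая, А;"
        System.out.println(Arrays.toString(editShape(3, 5, 1, 2))); // [5, 0, 10]
    }
}
```

A suffix match of only 1 token with the same inputs would give a deleteCount of 5 and a data size of 15, which is why the asserted deleteCount of 0 depends on both "А" and ";" being recognized as suffix.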
Copilot AI requested a review from nixel2007 December 28, 2025 11:37
@nixel2007 nixel2007 marked this pull request as ready for review December 28, 2025 11:48
Copilot AI review requested due to automatic review settings December 28, 2025 11:48
@github-actions
Contributor

github-actions bot commented Dec 28, 2025

Test Results

2 781 files  ±0  2 781 suites  ±0   48m 33s ⏱️ -28s
1 109 tests +1  1 109 ✅ +1  0 💤 ±0  0 ❌ ±0 
9 981 runs  +9  9 981 ✅ +9  0 💤 ±0  0 ❌ ±0 

Results for commit a967801. ± Comparison against base commit 65674c2.

♻️ This comment has been updated with latest results.

Contributor

Copilot AI left a comment

Pull request overview

This PR addresses inline text insertion handling in semantic token delta calculations, specifically for cases where text is inserted on the same line without line breaks (lineOffset = 0). The changes modify the findSuffixMatchWithOffset algorithm to detect when tokens have shifted deltaStart values while maintaining the same deltaLine.

Key Changes:

  • Modified the suffix matching algorithm to handle inline text insertions where lineOffset == 0 and deltaStart values differ
  • Added comprehensive test case with detailed token and delta edit assertions
  • Introduced named constants for improved code readability

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/main/java/com/github/_1c_syntax/bsl/languageserver/providers/SemanticTokensProvider.java | Updated findSuffixMatchWithOffset method to handle inline edits by conditionally skipping deltaStart checks and tracking boundary tokens |
| src/test/java/com/github/_1c_syntax/bsl/languageserver/providers/SemanticTokensProviderTest.java | Added test case deltaWithTextInsertedOnSameLine_shouldReturnOptimalDelta with detailed assertions for decoded tokens and delta edits |

Comment on lines +318 to +324
// For the boundary token during inline editing (lineOffset == 0),
// allow deltaStart to differ
int firstFieldToCheck = (!foundBoundary && lineOffset == 0) ? DELTA_START_INDEX + 1 : DELTA_START_INDEX;

// Check the fields other than deltaLine (and possibly deltaStart for the boundary token)
boolean otherFieldsMatch = true;
-for (int j = 1; j < tokenSize; j++) {
+for (int j = firstFieldToCheck; j < tokenSize; j++) {

Copilot AI Dec 28, 2025

There appears to be a logical inconsistency between the algorithm implementation and the test expectations.

The test expects that when inserting text inline (lineOffset=0), BOTH the "А" token and the ";" token should be part of the suffix match, even though "А" has a different deltaStart value (line 1401-1402 comment states "suffix match includes А and ;").

However, the algorithm only allows ONE boundary token. After finding the first token with a different deltaStart (the ";" token), it sets foundBoundary = true. Then for the next token ("А"), it sets firstFieldToCheck = DELTA_START_INDEX (line 320), which means it WILL check deltaStart. Since "А" also has a different deltaStart, the check at line 325 will fail, breaking the loop.

This means the algorithm would find suffixMatch = 1 (only ";"), but the test expects suffixMatch = 2 (both "А" and ";"), resulting in deleteCount = 0.

Either:

  1. The algorithm needs to be fixed to allow ALL tokens with matching deltaLine when lineOffset=0 (not just one boundary token), OR
  2. The test expectations are incorrect and should expect deleteCount > 0

Please verify which behavior is intended and correct either the implementation or the test.

Member

@copilot check the comment

Contributor Author

I checked the algorithm: it works correctly. The bot's analysis is wrong.

Actual behavior:

  • i=0: ";" tokens: the deltaStart values are equal (1==1), NOT a boundary token, suffixMatch=1
  • i=1: "А" tokens: the deltaStart values differ (6≠2), this IS the boundary token, suffixMatch=2, foundBoundary=true
  • i=2: the loop stops because foundBoundary=true

The algorithm finds suffixMatch=2 ("А" and ";"), which matches the test's expectations (deleteCount=0, insert 10 integers). The test passes.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

@nixel2007 nixel2007 merged commit 9ee333e into feature/optimize-delta-semantic-tokens Dec 28, 2025
49 checks passed
@nixel2007 nixel2007 deleted the copilot/sub-pr-3706 branch December 28, 2025 12:32