
Conversation

Contributor

Copilot AI commented Dec 21, 2025

  • Implement textDocument/semanticTokens/full/delta endpoint
  • Add resultId tracking for delta computation
  • Add ServerContextDocumentClosedEvent for cache cleanup via AOP
  • Add comprehensive tests for delta functionality
  • Refactor cache to use Caffeine via Spring CacheManager
  • Fix tests for URI normalization with Absolute.uri()
    • Updated TestUtils.getDocumentContext() to normalize URI
    • Updated BSLTextDocumentServiceTest helper methods to use Absolute.uri()
  • Update documentation to mark semanticTokens/full/delta as supported
  • Remove debug logging from SemanticTokensProvider
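
For orientation, here is a minimal sketch of the delta flow described by the list above, using real LSP4J types (SemanticTokens, SemanticTokensDelta, SemanticTokensEdit, Either). The class shape, the CachedTokenData record, and the naive computeEdits stub are illustrative assumptions, not this PR's exact code:

import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import org.eclipse.lsp4j.SemanticTokens;
import org.eclipse.lsp4j.SemanticTokensDelta;
import org.eclipse.lsp4j.SemanticTokensEdit;
import org.eclipse.lsp4j.jsonrpc.messages.Either;

class SemanticTokensDeltaSketch {

  record CachedTokenData(URI uri, List<Integer> data) {}

  // resultId -> token data that was last returned to the client
  private final Map<String, CachedTokenData> tokenCache = new ConcurrentHashMap<>();

  Either<SemanticTokens, SemanticTokensDelta> semanticTokensFullDelta(
      URI uri, String previousResultId, List<Integer> currentData) {

    String resultId = UUID.randomUUID().toString();
    CachedTokenData previous = tokenCache.get(previousResultId);
    tokenCache.put(resultId, new CachedTokenData(uri, currentData));

    // Unknown resultId or a different document: fall back to full tokens.
    if (previous == null || !previous.uri().equals(uri)) {
      return Either.forLeft(new SemanticTokens(resultId, currentData));
    }
    return Either.forRight(
      new SemanticTokensDelta(computeEdits(previous.data(), currentData), resultId));
  }

  // Naive placeholder: replace everything. A real implementation diffs the arrays
  // (a prefix/suffix diff is sketched later in this thread).
  private List<SemanticTokensEdit> computeEdits(List<Integer> previous, List<Integer> current) {
    return List.of(new SemanticTokensEdit(0, previous.size(), current));
  }
}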
Original prompt

This section details the original issue to resolve

<issue_title>Support for textDocument/semanticTokens/full/delta</issue_title>
<issue_description></issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI self-assigned this Dec 21, 2025
@coderabbitai
Contributor

coderabbitai bot commented Dec 21, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot AI and others added 2 commits December 21, 2025 10:22
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add support for textDocument/semanticTokens/full/delta" to "Support textDocument/semanticTokens/full/delta" Dec 21, 2025
Copilot AI requested a review from nixel2007 December 21, 2025 10:37
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
@github-actions
Contributor

github-actions bot commented Dec 21, 2025

Test Results

2 781 files  ± 0  2 781 suites  ±0   48m 41s ⏱️ -15s
1 108 tests + 8  1 108 ✅ + 8  0 💤 ±0  0 ❌ ±0 
9 972 runs  +72  9 972 ✅ +72  0 💤 ±0  0 ❌ ±0 

Results for commit ab6b235. ± Comparison against base commit 655b637.

This pull request removes 8 and adds 12 tests. Note that renamed tests count towards both.
Removed tests:

com.github._1c_syntax.bsl.languageserver.diagnostics.MetadataObjectNameLengthDiagnosticTest ‑ [2] content=, count=0
com.github._1c_syntax.bsl.languageserver.diagnostics.MetadataObjectNameLengthDiagnosticTest ‑ [6] content=A, count=1
com.github._1c_syntax.bsl.languageserver.diagnostics.MetadataObjectNameLengthDiagnosticTest ‑ [7] content=#, count=1
com.github._1c_syntax.bsl.languageserver.diagnostics.MetadataObjectNameLengthDiagnosticTest ‑ [8] content=#, count=1
…

Added tests:
com.github._1c_syntax.bsl.languageserver.providers.SemanticTokensProviderTest ‑ clearCache_removesCachedTokenData()
com.github._1c_syntax.bsl.languageserver.providers.SemanticTokensProviderTest ‑ deltaWithChangedDocument_returnsEdits()
com.github._1c_syntax.bsl.languageserver.providers.SemanticTokensProviderTest ‑ deltaWithLineInsertedAtBeginning_shouldHaveSmallDelta()
com.github._1c_syntax.bsl.languageserver.providers.SemanticTokensProviderTest ‑ deltaWithLineInsertedInMiddle_shouldReturnOptimalDelta()
com.github._1c_syntax.bsl.languageserver.providers.SemanticTokensProviderTest ‑ deltaWithModifiedSameDocument_returnsEdits()
com.github._1c_syntax.bsl.languageserver.providers.SemanticTokensProviderTest ‑ deltaWithSameDocument_returnsEmptyEdits()
…

♻️ This comment has been updated with latest results.

@nixel2007
Member

@copilot check that the delta is filled in correctly, especially the "deleteCount" calculation. I inserted a single line of text of ~25 characters into a 40,000-line document and got "deleteCount": 71942. That doesn't look even remotely correct. Not to mention that the request execution time went from 16 seconds to a minute and a half.

Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Contributor Author

Copilot AI commented Dec 21, 2025

@copilot check that the delta is filled in correctly, especially the "deleteCount" calculation. I inserted a single line of text of ~25 characters...

Added an optimization: if the delta size is greater than or equal to the size of the full data, the full tokens are now returned instead of a delta. This fixes the performance problem when the delta ends up too large (for example, inserting a line at the beginning of the document changes every deltaLine). Also added tests for insertions at the beginning and in the middle of the document (326065c).
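
In code, that guard could look roughly like this (a sketch with assumed names; only the LSP4J types are confirmed):

import java.util.List;
import org.eclipse.lsp4j.SemanticTokens;
import org.eclipse.lsp4j.SemanticTokensDelta;
import org.eclipse.lsp4j.SemanticTokensEdit;
import org.eclipse.lsp4j.jsonrpc.messages.Either;

class PayloadGuard {
  // Prefer the delta only when it is actually smaller than resending all data.
  static Either<SemanticTokens, SemanticTokensDelta> preferSmallerPayload(
      String resultId, List<Integer> currentData, List<SemanticTokensEdit> edits) {

    int deltaSize = edits.stream()
      .mapToInt(edit -> edit.getData() == null ? 0 : edit.getData().size())
      .sum();

    if (deltaSize >= currentData.size()) {
      return Either.forLeft(new SemanticTokens(resultId, currentData));
    }
    return Either.forRight(new SemanticTokensDelta(edits, resultId));
  }
}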

@nixel2007
Member

/buildJar

@github-actions
Contributor

✅ JAR files for this PR were built via the /buildJar command.

Artifact: 4974718081

Files inside:

  • bsl-language-server-copilot-support-semantic-tokens-delta-05dbd20-exec.jar

@nixel2007
Member

@copilot take a look at the failing tests; they are related to the recent changes in how Absolute.uri works.

We need to look at the test infrastructure. Fix the tests themselves if necessary.

Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Contributor Author

Copilot AI commented Dec 27, 2025

@copilot take a look at the failing tests; they are related to the recent changes in how Absolute.uri works.

We need to look at the test…

Fixed the tests in ab6b235. TestUtils.getDocumentContext() now normalizes the URI via Absolute.uri(); BSLTextDocumentServiceTest was updated as well.

@nixel2007
Member

/buildJar

@github-actions
Contributor

✅ JAR files for this PR were built via the /buildJar command.

Artifact: 4975152784

Files inside:

  • bsl-language-server-copilot-support-semantic-tokens-delta-ab6b235-exec.jar

nixel2007 marked this pull request as ready for review December 27, 2025 13:24
Copilot AI review requested due to automatic review settings December 27, 2025 13:24
@nixel2007
Member

@copilot also update the documentation

Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Contributor Author

Copilot AI commented Dec 27, 2025

@copilot also update the documentation

Updated the documentation: semanticTokens/full/delta is now marked as supported in the protocol operation tables (a973c76).


// If previous data is not available or belongs to a different document, return full tokens
if (previousData == null || !previousData.uri().equals(documentContext.getUri())) {
  LOGGER.info("Returning full tokens: previousData={}, uri match={}",
Member


@copilot remove the debug output here and below

Contributor Author


Removed the debug output and the @Slf4j annotation (2c48c95).

Contributor

Copilot AI left a comment


Pull request overview

This PR implements support for the Language Server Protocol's textDocument/semanticTokens/full/delta endpoint, enabling incremental semantic token updates for improved editor performance. The implementation adds resultId tracking, intelligent delta computation, and automatic cache cleanup through Spring events and AOP.

Key Changes

  • Delta semantic tokens support: Implements delta computation algorithm that finds common prefix/suffix to minimize data transfer between server and client
  • Cache management: Adds ConcurrentHashMap-based cache with automatic cleanup on document close/removal via Spring event listeners
  • URI normalization refactoring: Standardizes URI handling throughout the codebase by introducing getDocumentUnsafe() methods that normalize URIs for external callers
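
A sketch of the prefix/suffix computation named in the first bullet above (SemanticTokensEdit is the real LSP4J type; the method body illustrates the approach and is not the PR's verbatim code):

import java.util.ArrayList;
import java.util.List;
import org.eclipse.lsp4j.SemanticTokensEdit;

class DeltaDiffSketch {
  // Single-edit diff over the flat token data arrays: skip the common prefix
  // and common suffix, then replace whatever differs in between.
  static List<SemanticTokensEdit> computeEdits(List<Integer> previous, List<Integer> current) {
    int common = Math.min(previous.size(), current.size());

    int prefix = 0;
    while (prefix < common && previous.get(prefix).equals(current.get(prefix))) {
      prefix++;
    }

    int suffix = 0;
    while (suffix < common - prefix
      && previous.get(previous.size() - 1 - suffix).equals(current.get(current.size() - 1 - suffix))) {
      suffix++;
    }

    int deleteCount = previous.size() - prefix - suffix;
    List<Integer> inserted = new ArrayList<>(current.subList(prefix, current.size() - suffix));

    if (deleteCount == 0 && inserted.isEmpty()) {
      return List.of(); // token data is identical
    }
    return List.of(new SemanticTokensEdit(prefix, deleteCount, inserted));
  }
}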

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 12 comments.

Summary per file:

  • SemanticTokensProvider.java: Implements the delta endpoint with caching, parallel token collection using ForkJoinPool, and event-based cache cleanup
  • BSLLanguageServer.java: Registers the delta capability in server initialization with SemanticTokensServerFull
  • BSLTextDocumentService.java: Adds the semanticTokensFullDelta() endpoint handler and switches all document lookups to getDocumentUnsafe()
  • ServerContextDocumentClosedEvent.java: New Spring event for document close notifications to trigger cache cleanup
  • EventPublisherAspect.java: Publishes ServerContextDocumentClosedEvent via AOP after closeDocument() calls
  • Pointcuts.java: Adds an AOP pointcut for intercepting closeDocument() method calls
  • ServerContext.java: Refactors URI handling with separate getDocument() (expects a normalized URI) and getDocumentUnsafe() (normalizes the URI) methods
  • SemanticTokensProviderTest.java: Comprehensive tests for delta functionality, including same document, different document, cache clearing, and optimal delta scenarios
  • BSLTextDocumentServiceTest.java: Updates test helpers to use Absolute.uri() and getDocumentUnsafe() for proper URI normalization
  • TestUtils.java: Normalizes URIs in getDocumentContext() to ensure consistent test behavior
  • FormatCommand.java: Uses Absolute.uri(file) instead of file.toURI() for consistent URI handling
  • AnalyzeCommand.java: Uses Absolute.uri(file) instead of file.toURI() for consistent URI handling
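
For the BSLLanguageServer.java entry, advertising delta support during initialization with LSP4J looks roughly like this (the legend contents are an illustrative subset, not the server's actual token types):

import java.util.List;
import org.eclipse.lsp4j.SemanticTokensLegend;
import org.eclipse.lsp4j.SemanticTokensServerFull;
import org.eclipse.lsp4j.SemanticTokensWithRegistrationOptions;
import org.eclipse.lsp4j.ServerCapabilities;

class CapabilitiesSketch {
  static ServerCapabilities semanticTokensCapabilities() {
    var legend = new SemanticTokensLegend(
      List.of("keyword", "string", "comment"),  // token types (illustrative)
      List.of("declaration", "documentation")   // token modifiers (illustrative)
    );
    var options = new SemanticTokensWithRegistrationOptions(legend);
    options.setFull(new SemanticTokensServerFull(true)); // true => full/delta supported

    var capabilities = new ServerCapabilities();
    capabilities.setSemanticTokensProvider(options);
    return capabilities;
  }
}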

Comment on lines +180 to +182
public DocumentContext getDocumentUnsafe(URI uri) {
  return getDocument(Absolute.uri(uri));
}

Copilot AI Dec 27, 2025


The method name "getDocumentUnsafe" suggests it might throw an exception or have unsafe behavior, but it actually just performs URI normalization for external callers. The "Unsafe" suffix is misleading. Consider naming it "getDocumentNormalized" or "getDocumentWithNormalization" to better convey its actual purpose of normalizing the URI before lookup.

Comment on lines +194 to 196
public DocumentContext getDocumentUnsafe(String uri) {
  return getDocument(Absolute.uri(uri));
}

Copilot AI Dec 27, 2025


The method name "getDocumentUnsafe" suggests it might throw an exception or have unsafe behavior, but it actually just performs URI normalization for external callers. The "Unsafe" suffix is misleading. Consider naming it "getDocumentNormalized" or "getDocumentWithNormalization" to better convey its actual purpose of normalizing the URI before lookup.

Comment on lines +75 to +79
/**
 * Cache for storing previous token data by resultId.
 * Key: resultId, Value: token data list
 */
private final Map<String, CachedTokenData> tokenCache = new ConcurrentHashMap<>();

Copilot AI Dec 27, 2025


The cache stores resultId -> CachedTokenData mappings, but there's no eviction policy or size limit configured. Over time, as documents are edited repeatedly, the cache can grow unbounded (each edit generates a new resultId). Although cache entries are cleared on document close/removal, documents that remain open for long periods with frequent edits could accumulate many stale cache entries. Consider adding a maximum cache size or TTL-based eviction policy to prevent unbounded memory growth.

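The PR description above notes that the cache was later refactored to use Caffeine via Spring CacheManager, which addresses exactly this concern; a plain-Caffeine equivalent might look like this (the size and TTL values are illustrative assumptions, not the project's actual settings):

import java.net.URI;
import java.time.Duration;
import java.util.List;
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

class BoundedTokenCache {
  record CachedTokenData(URI uri, List<Integer> data) {}

  // Bounded and TTL-evicted, so stale resultIds cannot accumulate without limit.
  final Cache<String, CachedTokenData> tokenCache = Caffeine.newBuilder()
    .maximumSize(1_000)
    .expireAfterAccess(Duration.ofMinutes(10))
    .build();
}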
Comment on lines +171 to +173
// Remove the old cached data
tokenCache.remove(previousResultId);


Copilot AI Dec 27, 2025


The old cached data is removed from the cache after computing the delta (line 172), but this creates a potential issue: if the same previousResultId is used in concurrent delta requests before line 172 executes, those requests could succeed incorrectly. Additionally, if a client retries a delta request with the same previousResultId after it's been removed, they'll get full tokens instead of a proper error or cached response. Consider keeping old cache entries for a grace period or tracking active requests to prevent premature removal.

Suggested change
// Remove the old cached data
tokenCache.remove(previousResultId);

// Log delta statistics for debugging
if (!edits.isEmpty()) {
  var edit = edits.get(0);
  LOGGER.info("Delta computed: previousSize={}, currentSize={}, start={}, deleteCount={}, dataSize={}",

Copilot AI Dec 27, 2025


The logging at INFO level for every delta computation could generate excessive logs in production environments with active editing. Delta computations are frequent operations (on every keystroke in some editors), and logging each one at INFO level would clutter logs. Consider using DEBUG level for these operational details, reserving INFO level for significant events like cache evictions or errors.

Suggested change
LOGGER.info("Delta computed: previousSize={}, currentSize={}, start={}, deleteCount={}, dataSize={}",
LOGGER.debug("Delta computed: previousSize={}, currentSize={}, start={}, deleteCount={}, dataSize={}",

Comment on lines +1187 to +1188
URI differentUri = URI.create("file:///fake/different-document.bsl");
DocumentContext context2 = TestUtils.getDocumentContext(differentUri, bsl2);

Copilot AI Dec 27, 2025


The test creates a URI using URI.create which doesn't normalize it, but TestUtils.getDocumentContext now normalizes URIs before adding them to ServerContext. This means the test's intent to create "different documents" might not work as expected if both URIs normalize to the same value. Consider using Absolute.uri() here to ensure the URI is normalized before creation, or verify that the normalization maintains the distinction between different URIs.

Comment on lines +274 to +282
return CompletableFuture
  .supplyAsync(
    () -> suppliers.parallelStream()
      .map(supplier -> supplier.getSemanticTokens(documentContext))
      .flatMap(Collection::stream)
      .toList(),
    executorService
  )
  .join();

Copilot AI Dec 27, 2025


The collectTokens method uses CompletableFuture.supplyAsync with parallelStream inside, which creates nested parallelism. The suppliers.parallelStream() runs on the custom executorService (ForkJoinPool), but this introduces an unnecessary layer of asynchronous wrapping. Since the parallel stream already handles parallel execution, the outer CompletableFuture.supplyAsync adds overhead without benefit. Consider simplifying to direct parallel stream execution without the CompletableFuture wrapper, or if async execution is needed for a specific reason, use sequential stream inside the async block.

Suggested change
return CompletableFuture
  .supplyAsync(
    () -> suppliers.parallelStream()
      .map(supplier -> supplier.getSemanticTokens(documentContext))
      .flatMap(Collection::stream)
      .toList(),
    executorService
  )
  .join();
return suppliers.parallelStream()
  .map(supplier -> supplier.getSemanticTokens(documentContext))
  .flatMap(Collection::stream)
  .toList();

Comment on lines +316 to +317
// Convert to List<Integer> for LSP4J API
return Arrays.stream(data).boxed().toList();

Copilot AI Dec 27, 2025


The conversion from int[] to List using Arrays.stream(data).boxed().toList() creates a new immutable list with boxed Integer objects. While this is correct for the LSP4J API, it incurs boxing overhead for potentially large arrays. Since the data is immediately returned and likely serialized by LSP4J, this is acceptable, but if this becomes a performance bottleneck, consider whether LSP4J supports primitive int[] arrays directly.

Suggested change
// Convert to List<Integer> for LSP4J API
return Arrays.stream(data).boxed().toList();
// Convert to List<Integer> for LSP4J API with minimal overhead
List<Integer> result = new ArrayList<>(index);
for (int i = 0; i < index; i++) {
  result.add(data[i]);
}
return result;

Comment on lines +228 to +229
* Compute edits to transform previousData into currentData.
* Uses a simple algorithm that produces a single edit covering the entire change.

Copilot AI Dec 27, 2025


The JavaDoc comment "Compute edits to transform previousData into currentData. Uses a simple algorithm that produces a single edit covering the entire change." could be more detailed. Consider documenting the algorithm's approach (finding common prefix and suffix, then creating a single edit for the differing middle section) and mentioning that this is optimized for common edit patterns like insertions at beginning, middle, or end of the document.

Suggested change
* Compute edits to transform previousData into currentData.
* Uses a simple algorithm that produces a single edit covering the entire change.
* Compute edits to transform {@code previousData} into {@code currentData}.
* <p>
* The algorithm:
* <ul>
* <li>Finds the longest common prefix of the two token data lists.</li>
* <li>Finds the longest common suffix of the remaining part of the lists.</li>
* <li>Treats the middle section between prefix and suffix as the changed range and
* creates a single {@link SemanticTokensEdit} that deletes the old middle section
* and inserts the new one.</li>
* </ul>
* If the lists are identical, an empty list of edits is returned.
* <p>
* This is intentionally a simple, single-edit diff optimized for typical semantic
* token update patterns such as insertions, deletions or replacements at the
* beginning, middle or end of the document, rather than for computing a minimal
* multi-edit diff.
*
* @param previousData previously cached semantic tokens data
* @param currentData newly computed semantic tokens data
* @return list containing at most one {@link SemanticTokensEdit} describing the change

Comment on lines +209 to +211
protected void clearCache(URI uri) {
  tokenCache.entrySet().removeIf(entry -> entry.getValue().uri().equals(uri));
}

Copilot AI Dec 27, 2025


The clearCache method uses removeIf with a lambda that checks if the URI equals the provided URI. Since CachedTokenData is a record with URI as a component, calling .uri() on each entry during iteration is fine, but this could be improved for performance by using a secondary index (Map<URI, Set>) to track which resultIds belong to each URI. This would avoid iterating through all cache entries when clearing cache for a specific document, especially as the cache grows.

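A sketch of the secondary index suggested here (the names are assumptions; illustrative only):

import java.net.URI;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class TokenCacheIndex {
  // uri -> resultIds currently cached for that document
  private final Map<URI, Set<String>> resultIdsByUri = new ConcurrentHashMap<>();

  void onPut(URI uri, String resultId) {
    resultIdsByUri
      .computeIfAbsent(uri, k -> ConcurrentHashMap.newKeySet())
      .add(resultId);
  }

  // Returns the resultIds to invalidate in the primary resultId -> data cache,
  // avoiding a full scan of all cache entries on document close.
  Set<String> onDocumentClosed(URI uri) {
    Set<String> ids = resultIdsByUri.remove(uri);
    return ids == null ? Set.of() : ids;
  }
}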
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Copilot AI requested a review from nixel2007 December 27, 2025 13:37
nixel2007 merged commit 6e09504 into develop Dec 27, 2025
1 check passed
nixel2007 deleted the copilot/support-semantic-tokens-delta branch December 27, 2025 13:44
