Support textDocument/semanticTokens/full/delta #3696
Conversation
Note from CodeRabbit: review skipped (bot user detected). Other AI code review bots were detected in this pull request; CodeRabbit will avoid duplicating their findings in the review comments, which may lead to a less comprehensive review.
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Review thread on src/main/java/com/github/_1c_syntax/bsl/languageserver/BSLTextDocumentService.java (outdated, resolved).
Review thread on ...test/java/com/github/_1c_syntax/bsl/languageserver/providers/SemanticTokensProviderTest.java (resolved).
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Review thread on src/main/java/com/github/_1c_syntax/bsl/languageserver/providers/SemanticTokensProvider.java (resolved).
Force-pushed from c7bfb5b to dd1767d.
Test Results: 2 781 files ±0, 2 781 suites ±0, 48m 41s ⏱️ −15s. Results for commit ab6b235; comparison against base commit 655b637. This pull request removes 8 and adds 12 tests (renamed tests count towards both). ♻️ This comment has been updated with latest results.
@copilot Check that the delta is being filled in correctly, especially the "deleteCount" calculation. I inserted a single line of text (~25 characters) into a 40,000-line document and got "deleteCount": 71942, which doesn't look even remotely correct. Not to mention that the request execution time went from 16 seconds to a minute and a half.
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
Added an optimization: if the size of the delta is greater than or equal to the size of the full data, the full tokens are returned instead of a delta. This fixes the performance problem when the delta becomes too large (for example, inserting a line at the beginning of the document changes every deltaLine). Also added tests for insertion at the beginning and in the middle of the document (326065c).
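For context, here is a minimal sketch of this kind of fallback, assuming LSP4J types and that the edits have already been computed; the method name `respond` and its shape are illustrative, not the PR's actual code:

```java
import java.util.List;

import org.eclipse.lsp4j.SemanticTokens;
import org.eclipse.lsp4j.SemanticTokensDelta;
import org.eclipse.lsp4j.SemanticTokensEdit;
import org.eclipse.lsp4j.jsonrpc.messages.Either;

class DeltaFallbackSketch {

  // Return the delta only if it is actually smaller than resending the full data.
  static Either<SemanticTokens, SemanticTokensDelta> respond(
    String resultId, List<Integer> currentData, List<SemanticTokensEdit> edits) {

    // Total number of token integers carried by the delta's edits.
    int deltaSize = edits.stream()
      .mapToInt(edit -> edit.getData() == null ? 0 : edit.getData().size())
      .sum();

    if (deltaSize >= currentData.size()) {
      // Delta is no cheaper than the full response, e.g. after inserting a line
      // at the top of the document, where every deltaLine shifts.
      return Either.forLeft(new SemanticTokens(resultId, currentData));
    }

    var delta = new SemanticTokensDelta(edits);
    delta.setResultId(resultId);
    return Either.forRight(delta);
  }
}
```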
/buildJar |
✅ JAR files for this PR have been built on command. Artifact: 4974718081. Files inside:
@copilot Look at the failing tests; they are related to the recent changes in how Absolute.uri works. The test infrastructure needs to be looked at. Fix the tests themselves if necessary.
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
/buildJar |
✅ JAR files for this PR have been built on command. Artifact: 4975152784. Files inside:
@copilot Update the documentation as well.
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>
```java
// If previous data is not available or belongs to a different document, return full tokens
if (previousData == null || !previousData.uri().equals(documentContext.getUri())) {
  LOGGER.info("Returning full tokens: previousData={}, uri match={}",
```
@copilot Remove the debug output here and below.
Pull request overview
This PR implements support for the Language Server Protocol's textDocument/semanticTokens/full/delta endpoint, enabling incremental semantic token updates for improved editor performance. The implementation adds resultId tracking, intelligent delta computation, and automatic cache cleanup through Spring events and AOP.
Key Changes
- Delta semantic tokens support: Implements a delta computation algorithm that finds the common prefix/suffix to minimize data transfer between server and client (a sketch follows this list)
- Cache management: Adds a ConcurrentHashMap-based cache with automatic cleanup on document close/removal via Spring event listeners
- URI normalization refactoring: Standardizes URI handling throughout the codebase by introducing getDocumentUnsafe() methods that normalize URIs for external callers
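For illustration, here is a minimal sketch of the common prefix/suffix approach, using LSP4J's SemanticTokensEdit; this is my own rendering of the idea, not the code from this PR:

```java
import java.util.List;

import org.eclipse.lsp4j.SemanticTokensEdit;

class DeltaComputationSketch {

  // Single-edit diff: longest common prefix, then longest common suffix of the
  // remainder; everything in between becomes one SemanticTokensEdit.
  static List<SemanticTokensEdit> computeEdits(List<Integer> previous, List<Integer> current) {
    int minSize = Math.min(previous.size(), current.size());

    int prefix = 0;
    while (prefix < minSize && previous.get(prefix).equals(current.get(prefix))) {
      prefix++;
    }

    if (prefix == previous.size() && prefix == current.size()) {
      return List.of(); // identical data, nothing to edit
    }

    int suffix = 0;
    int maxSuffix = minSize - prefix; // keep the suffix from overlapping the prefix
    while (suffix < maxSuffix
      && previous.get(previous.size() - 1 - suffix).equals(current.get(current.size() - 1 - suffix))) {
      suffix++;
    }

    int deleteCount = previous.size() - prefix - suffix;
    List<Integer> inserted = List.copyOf(current.subList(prefix, current.size() - suffix));
    return List.of(new SemanticTokensEdit(prefix, deleteCount, inserted));
  }
}
```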
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 12 comments.
Summary per file:
| File | Description |
|---|---|
| SemanticTokensProvider.java | Implements delta endpoint with caching, parallel token collection using ForkJoinPool, and event-based cache cleanup |
| BSLLanguageServer.java | Registers delta capability in server initialization with SemanticTokensServerFull |
| BSLTextDocumentService.java | Adds semanticTokensFullDelta() endpoint handler and switches all document lookups to getDocumentUnsafe() |
| ServerContextDocumentClosedEvent.java | New Spring event for document close notifications to trigger cache cleanup |
| EventPublisherAspect.java | Publishes ServerContextDocumentClosedEvent via AOP after closeDocument() calls |
| Pointcuts.java | Adds AOP pointcut for intercepting closeDocument() method calls |
| ServerContext.java | Refactors URI handling with separate getDocument() (expects normalized URI) and getDocumentUnsafe() (normalizes URI) methods |
| SemanticTokensProviderTest.java | Comprehensive tests for delta functionality including same document, different document, cache clearing, and optimal delta scenarios |
| BSLTextDocumentServiceTest.java | Updates test helpers to use Absolute.uri() and getDocumentUnsafe() for proper URI normalization |
| TestUtils.java | Normalizes URIs in getDocumentContext() to ensure consistent test behavior |
| FormatCommand.java | Uses Absolute.uri(file) instead of file.toURI() for consistent URI handling |
| AnalyzeCommand.java | Uses Absolute.uri(file) instead of file.toURI() for consistent URI handling |
```java
public DocumentContext getDocumentUnsafe(URI uri) {
  return getDocument(Absolute.uri(uri));
}
```
Copilot (AI) commented on Dec 27, 2025:
The method name "getDocumentUnsafe" suggests it might throw an exception or have unsafe behavior, but it actually just performs URI normalization for external callers. The "Unsafe" suffix is misleading. Consider naming it "getDocumentNormalized" or "getDocumentWithNormalization" to better convey its actual purpose of normalizing the URI before lookup.
```java
public DocumentContext getDocumentUnsafe(String uri) {
  return getDocument(Absolute.uri(uri));
}
```
Copilot (AI) commented on Dec 27, 2025:
Same remark as for the URI overload above: the "Unsafe" suffix is misleading for this String overload as well; consider "getDocumentNormalized" or "getDocumentWithNormalization".
```java
/**
 * Cache for storing previous token data by resultId.
 * Key: resultId, Value: token data list
 */
private final Map<String, CachedTokenData> tokenCache = new ConcurrentHashMap<>();
```
Copilot (AI) commented on Dec 27, 2025:
The cache stores resultId -> CachedTokenData mappings, but there's no eviction policy or size limit configured. Over time, as documents are edited repeatedly, the cache can grow unbounded (each edit generates a new resultId). Although cache entries are cleared on document close/removal, documents that remain open for long periods with frequent edits could accumulate many stale cache entries. Consider adding a maximum cache size or TTL-based eviction policy to prevent unbounded memory growth.
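For example, a size-bounded LRU cache is one way to implement such an eviction policy. This is a sketch only; MAX_ENTRIES and the CachedTokenData shape are assumptions, not the PR's code:

```java
import java.net.URI;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BoundedTokenCacheSketch {

  // Assumed shape; the PR's CachedTokenData record may differ.
  record CachedTokenData(URI uri, List<Integer> data) {}

  private static final int MAX_ENTRIES = 100; // illustrative limit

  // Access-ordered LinkedHashMap that evicts the least-recently-used entry
  // once the cache grows past MAX_ENTRIES.
  private final Map<String, CachedTokenData> tokenCache = Collections.synchronizedMap(
    new LinkedHashMap<String, CachedTokenData>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, CachedTokenData> eldest) {
        return size() > MAX_ENTRIES;
      }
    }
  );
}
```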
```java
// Remove the old cached data
tokenCache.remove(previousResultId);
```
Copilot (AI) commented on Dec 27, 2025:
The old cached data is removed from the cache after computing the delta (line 172), but this creates a potential issue: if the same previousResultId is used in concurrent delta requests before line 172 executes, those requests could succeed incorrectly. Additionally, if a client retries a delta request with the same previousResultId after it's been removed, they'll get full tokens instead of a proper error or cached response. Consider keeping old cache entries for a grace period or tracking active requests to prevent premature removal.
Suggested change (remove the premature cache eviction):
```diff
-// Remove the old cached data
-tokenCache.remove(previousResultId);
```
```java
// Log delta statistics for debugging
if (!edits.isEmpty()) {
  var edit = edits.get(0);
  LOGGER.info("Delta computed: previousSize={}, currentSize={}, start={}, deleteCount={}, dataSize={}",
```
Copilot (AI) commented on Dec 27, 2025:
The logging at INFO level for every delta computation could generate excessive logs in production environments with active editing. Delta computations are frequent operations (on every keystroke in some editors), and logging each one at INFO level would clutter logs. Consider using DEBUG level for these operational details, reserving INFO level for significant events like cache evictions or errors.
Suggested change:
```diff
-LOGGER.info("Delta computed: previousSize={}, currentSize={}, start={}, deleteCount={}, dataSize={}",
+LOGGER.debug("Delta computed: previousSize={}, currentSize={}, start={}, deleteCount={}, dataSize={}",
```
```java
URI differentUri = URI.create("file:///fake/different-document.bsl");
DocumentContext context2 = TestUtils.getDocumentContext(differentUri, bsl2);
```
Copilot (AI) commented on Dec 27, 2025:
The test creates a URI using URI.create which doesn't normalize it, but TestUtils.getDocumentContext now normalizes URIs before adding them to ServerContext. This means the test's intent to create "different documents" might not work as expected if both URIs normalize to the same value. Consider using Absolute.uri() here to ensure the URI is normalized before creation, or verify that the normalization maintains the distinction between different URIs.
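A possible adjustment along the lines suggested, assuming Absolute.uri accepts a URI as in the snippets above (hypothetical test code, not from the PR):

```java
// Normalize up front so the "different document" stays distinct after
// ServerContext/TestUtils normalization.
URI differentUri = Absolute.uri(URI.create("file:///fake/different-document.bsl"));
DocumentContext context2 = TestUtils.getDocumentContext(differentUri, bsl2);
```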
```java
return CompletableFuture
  .supplyAsync(
    () -> suppliers.parallelStream()
      .map(supplier -> supplier.getSemanticTokens(documentContext))
      .flatMap(Collection::stream)
      .toList(),
    executorService
  )
  .join();
```
Copilot (AI) commented on Dec 27, 2025:
The collectTokens method uses CompletableFuture.supplyAsync with parallelStream inside, which creates nested parallelism. The suppliers.parallelStream() runs on the custom executorService (ForkJoinPool), but this introduces an unnecessary layer of asynchronous wrapping. Since the parallel stream already handles parallel execution, the outer CompletableFuture.supplyAsync adds overhead without benefit. Consider simplifying to direct parallel stream execution without the CompletableFuture wrapper, or if async execution is needed for a specific reason, use sequential stream inside the async block.
Suggested change:
```diff
-return CompletableFuture
-  .supplyAsync(
-    () -> suppliers.parallelStream()
-      .map(supplier -> supplier.getSemanticTokens(documentContext))
-      .flatMap(Collection::stream)
-      .toList(),
-    executorService
-  )
-  .join();
+return suppliers.parallelStream()
+  .map(supplier -> supplier.getSemanticTokens(documentContext))
+  .flatMap(Collection::stream)
+  .toList();
```
```java
// Convert to List<Integer> for LSP4J API
return Arrays.stream(data).boxed().toList();
```
Copilot (AI) commented on Dec 27, 2025:
The conversion from int[] to List using Arrays.stream(data).boxed().toList() creates a new immutable list with boxed Integer objects. While this is correct for the LSP4J API, it incurs boxing overhead for potentially large arrays. Since the data is immediately returned and likely serialized by LSP4J, this is acceptable, but if this becomes a performance bottleneck, consider whether LSP4J supports primitive int[] arrays directly.
Suggested change:
```diff
-// Convert to List<Integer> for LSP4J API
-return Arrays.stream(data).boxed().toList();
+// Convert to List<Integer> for LSP4J API with minimal overhead
+List<Integer> result = new ArrayList<>(index);
+for (int i = 0; i < index; i++) {
+  result.add(data[i]);
+}
+return result;
```
```java
 * Compute edits to transform previousData into currentData.
 * Uses a simple algorithm that produces a single edit covering the entire change.
```
Copilot (AI) commented on Dec 27, 2025:
The JavaDoc comment "Compute edits to transform previousData into currentData. Uses a simple algorithm that produces a single edit covering the entire change." could be more detailed. Consider documenting the algorithm's approach (finding common prefix and suffix, then creating a single edit for the differing middle section) and mentioning that this is optimized for common edit patterns like insertions at beginning, middle, or end of the document.
Suggested change:
```diff
- * Compute edits to transform previousData into currentData.
- * Uses a simple algorithm that produces a single edit covering the entire change.
+ * Compute edits to transform {@code previousData} into {@code currentData}.
+ * <p>
+ * The algorithm:
+ * <ul>
+ *   <li>Finds the longest common prefix of the two token data lists.</li>
+ *   <li>Finds the longest common suffix of the remaining part of the lists.</li>
+ *   <li>Treats the middle section between prefix and suffix as the changed range and
+ *       creates a single {@link SemanticTokensEdit} that deletes the old middle section
+ *       and inserts the new one.</li>
+ * </ul>
+ * If the lists are identical, an empty list of edits is returned.
+ * <p>
+ * This is intentionally a simple, single-edit diff optimized for typical semantic
+ * token update patterns such as insertions, deletions or replacements at the
+ * beginning, middle or end of the document, rather than for computing a minimal
+ * multi-edit diff.
+ *
+ * @param previousData previously cached semantic tokens data
+ * @param currentData  newly computed semantic tokens data
+ * @return list containing at most one {@link SemanticTokensEdit} describing the change
```
```java
protected void clearCache(URI uri) {
  tokenCache.entrySet().removeIf(entry -> entry.getValue().uri().equals(uri));
}
```
Copilot (AI) commented on Dec 27, 2025:
The clearCache method uses removeIf with a lambda that checks if the URI equals the provided URI. Since CachedTokenData is a record with URI as a component, calling .uri() on each entry during iteration is fine, but this could be improved for performance by using a secondary index (Map<URI, Set<String>>) to track which resultIds belong to each URI. This would avoid iterating through all cache entries when clearing the cache for a specific document, especially as the cache grows.
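A minimal sketch of that secondary-index idea (illustrative; CachedTokenData's shape is assumed, not taken from the PR):

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class IndexedTokenCacheSketch {

  // Assumed shape; the PR's CachedTokenData record may differ.
  record CachedTokenData(URI uri, List<Integer> data) {}

  private final Map<String, CachedTokenData> tokenCache = new ConcurrentHashMap<>();
  // Secondary index: which resultIds belong to each document URI.
  private final Map<URI, Set<String>> resultIdsByUri = new ConcurrentHashMap<>();

  void put(String resultId, CachedTokenData data) {
    tokenCache.put(resultId, data);
    resultIdsByUri
      .computeIfAbsent(data.uri(), k -> ConcurrentHashMap.newKeySet())
      .add(resultId);
  }

  void clearCache(URI uri) {
    Set<String> resultIds = resultIdsByUri.remove(uri);
    if (resultIds != null) {
      // Touches only this document's entries instead of scanning the whole cache.
      resultIds.forEach(tokenCache::remove);
    }
  }
}
```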
Co-authored-by: nixel2007 <1132840+nixel2007@users.noreply.github.com>



- Implement the textDocument/semanticTokens/full/delta endpoint
- Add ServerContextDocumentClosedEvent for cache cleanup via AOP
- Use Absolute.uri() in TestUtils.getDocumentContext() to normalize URIs
- Update BSLTextDocumentServiceTest helper methods to use Absolute.uri()
- Register semanticTokens/full/delta as supported
Original prompt