Commit 387826b

Address review feedback.
1 parent 01b09ec commit 387826b

10 files changed

+263
-166
lines changed

CONTRIBUTING.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+# Contributing guide
+
+See https://sourcegraph.github.io/lsif-java/docs/contributing.html

README.md

Lines changed: 7 additions & 148 deletions
Original file line numberDiff line numberDiff line change
@@ -1,150 +1,9 @@
11
# Java indexer for the Language Server Index Format (LSIF) ![](https://img.shields.io/badge/status-development-yellow?style=flat)
22

3-
## Usage
4-
5-
Visit https://sourcegraph.github.io/lsif-java to get started with lsif-java.
6-
7-
⚠ The rest of this readme is targeted at contributors to the lsif-java codebase.
8-
9-
## Overview
10-
11-
This project is implemented as a
12-
[Java compiler plugin](https://docs.oracle.com/en/java/javase/11/docs/api/jdk.compiler/com/sun/source/util/Plugin.html)
13-
that generates one
14-
[SemanticDB](https://scalameta.org/docs/semanticdb/specification.html) file for
15-
every `*.java` source file. After compilation completes, the SemanticDB files
16-
are processed to produce LSIF.
17-
18-
![A three stage pipeline that starts with a list of Java sources, creates a list of SemanticDB files that then become a single LSIF index.](docs/assets/semanticdb-javac-pipeline.svg)
19-
20-
### Why Java compiler plugin?
21-
22-
There are several benefits to implementing lsif-java as a compiler plugin:
23-
24-
- **Simple installation**: compiler plugins are enabled with the `-Xplugin`
25-
compiler option. All Java build tools support a way to customize compiler
26-
options, simplifying installation.
27-
- **Language fidelity**: by using the Java compiler to produce semantic
28-
information, we ensure that the produced LSIF data is accurate even as new
29-
Java language versions with new language features are released.
30-
- **Environment fidelity**: by hooking into the compilation process of the build
31-
tool, we minimize the risk of diverging from the CI build environment such as
32-
installed system dependencies, custom compiler options and custom annotation
33-
processors.
34-
35-
### Why SemanticDB?
36-
37-
SemanticDB is Protobuf schema for information about symbols and types in Java
38-
programs, Scala programs and other languages. There are several benefits to
39-
using SemanticDB as an intermediary representation for LSIF:
40-
41-
- **Simplicity**: It's easy to translate a single Java source file into a single
42-
SemanticDB file inside a compiler plugin. It's more complicated to produce
43-
LSIF because compiler plugins does not have access to a project-wide context,
44-
which is necessary to produce accurate definitions and hovers in multi-module
45-
projects with external library dependencies.
46-
- **Performance**: SemanticDB is fast to write and read. Each compilation unit
47-
can be processed independently to keep memory usage low. The final conversion
48-
from SemanticDB to LSIF can be safely parallelized.
49-
- **Cross-language**: SemanticDB has a
50-
[spec](https://scalameta.org/docs/semanticdb/specification.html) for Java and
51-
Scala enabling cross-language navigation in hybrid Java/Scala codebases.
52-
- **Cross-repository**: Compiler plugins have access to both source code and the
53-
classpath (compiled bytecode of upstream dependencies). SemanticDB has been
54-
designed so that it's also possible to generate spec-compliant symbols from
55-
the classpath alone (no source code) and from the syntax tree of an individual
56-
source file (no classpath). This flexibility allows the
57-
[Metals](https://scalameta.org/metals/) language server to index codebases
58-
from a variety of different inputs, and will be helpful for lsif-java in the
59-
future to unblock cross-repository navigation.
60-
61-
## Contributing
62-
63-
The following sections provide tips on how to contribute to this codebase.
64-
65-
### System dependencies
66-
67-
- `java`: any version should work
68-
- `git`: any version should work
69-
- `lsif-semanticdb`:
70-
`go get github.com/sourcegraph/lsif-semanticdb/cmd/lsif-semanticdb`
71-
- `gradle`: `brew install gradle`, or see
72-
[general installation guide](https://gradle.org/install/).
73-
- `mvn`: `brew install maven`, or see
74-
[general installation guide](https://www.baeldung.com/install-maven-on-windows-linux-mac).
75-
76-
### Project structure
77-
78-
These are the main components of the project.
79-
80-
- `semanticdb-javac/src/main/java`: the Java compiler plugin that creates
81-
SemanticDB files.
82-
- `tests/minimized`: minimized Java source files that reproduce interesting test
83-
cases.
84-
- `tests/unit`: fast running unit tests that are helpful for local edit-and-test
85-
workflows.
86-
- `tests/snapshots`: slow running
87-
["snapshot tests"](https://jestjs.io/docs/en/snapshot-testing) that index a
88-
corpus of published Java libraries.
89-
- `build.sbt`: the sbt build definition.
90-
- `project/plugins.sbt`: plugins for the sbt build.
91-
92-
### Helpful commands
93-
94-
| Command | Where | Description |
95-
| ------------------------------------------------------------------- | -------- | ----------------------------------------------------------------------------------- |
96-
| `./sbt` | terminal | Start interactive sbt shell with Java 11. Takes a while to load on the first run. |
97-
| `unit/test` | sbt | Run fast unit tests. |
98-
| `~unit/test` | sbt | Start watch mode to run tests on file save, good for local edit-and-test workflows. |
99-
| `buildTools/test` | sbt | Run slow build tool tests (Gradle, Maven). |
100-
| `snapshots/testOnly tests.MinimizedSnapshotSuite` | sbt | Runs fast snapshot tests. Indexes a small set of files under `tests/minimized`. |
101-
| `snapshots/testOnly tests.MinimizedSnapshotSuite -- *InnerClasses*` | sbt | Runs only individual tests cases matching the name "InnerClasses". |
102-
| `snapshots/testOnly tests.LibrarySnapshotSuite` | sbt | Runs slow snapshot tests. Indexes a corpus of external Java libraries. |
103-
| `snapshots/test` | sbt | Runs all snapshot tests. |
104-
| `snapshots/run` | sbt | Update snapshot tests. Use this command after you have fixed a bug. |
105-
| `cli/run --cwd DIRECTORY` | sbt | Run `lsif-java` command-line tool against a given Gradle/Maven build. |
106-
| `cd website && yarn install && yarn start` | terminal | Start live-reload preview of the website at http://localhost:3000/lsif-java. |
107-
| `docs/mdoc --watch` | sbt | Re-generate markdown files in the `docs/` directory. |
108-
| `fixAll` | sbt | Run Scalafmt, Scalafix and Javafmt on all sources. Run this before opening a PR. |
109-
110-
### Import the project into IntelliJ
111-
112-
It's recommended to use IntelliJ when editing code in this codebase.
113-
114-
First, install the
115-
[IntelliJ Community Edition](https://www.jetbrains.com/idea/download/). The
116-
community edition is
117-
[open source](https://github.com/JetBrains/intellij-community) and free to use.
118-
119-
Next, install the IntelliJ Scala plugin.
120-
121-
Finally, run "File > Project From Existing Sources" to import the sbt build into
122-
IntelliJ. Select the "sbt" option if it asks you to choose between
123-
sbt/BSP/Bloop.
124-
125-
It's best to run tests from the sbt shell, not from the IntelliJ UI.
126-
127-
### Don't use VS Code/Vim/Sublime Text/Emacs
128-
129-
If you want to use completions and precise code navigation, it's not recommended
130-
to use other editors than IntelliJ. IntelliJ is the only IDE that properly
131-
supports hybrid Java/Scala codebases at the moment, although that may change
132-
soon thanks to lsif-java :)
133-
134-
### Tests are written in Scala
135-
136-
This codebases uses the Scala library [MUnit](https://scalameta.org/munit/) to
137-
write tests because:
138-
139-
- MUnit has built-in assertions that print readable multiline diffs in color.
140-
- MUnit makes it easy to implement
141-
[snapshot testing](https://jestjs.io/docs/en/snapshot-testing), which is a
142-
testing technique that's heavily used in this codebase.
143-
- Multiline literal strings in Scala make it easy to write unit tests for source
144-
code (which is always multiline). Modern versions of Java support multiline
145-
string literals, but they're not supported in Java 8, which is supported by
146-
lsif-java.
147-
148-
## Benchmarks
149-
150-
See [docs/benchmarks.md] for benchmark results.
3+
| Documentation | Link |
4+
| -------------------- | ---------------------------------------------------------------------- |
5+
| Landing page | https://sourcegraph.github.io/lsif-java |
6+
| Getting started | https://sourcegraph.github.io/lsif-java/docs/getting-started.html |
7+
| Manual configuration | https://sourcegraph.github.io/lsif-java/docs/manual-configuration.html |
8+
| Contributing | https://sourcegraph.github.io/lsif-java/docs/contributing.html |
9+
| Design | https://sourcegraph.github.io/lsif-java/docs/design.html |

build.sbt

Lines changed: 40 additions & 1 deletion
@@ -1,3 +1,5 @@
+import scala.xml.{Node => XmlNode, NodeSeq => XmlNodeSeq, _}
+import scala.xml.transform.{RewriteRule, RuleTransformer}
 import java.io.File
 import java.util.Properties
 import scala.collection.mutable.ListBuffer
@@ -97,6 +99,16 @@ lazy val plugin = project
       old.withEnabled(false)
     },
     fatjarPackageSettings,
+    assemblyShadeRules.in(assembly) :=
+      Seq(
+        ShadeRule
+          .rename(
+            "com.google.**" -> "com.sourcegraph.shaded.com.google.@1",
+            "google.**" -> "com.sourcegraph.shaded.google.@1",
+            "org.relaxng.**" -> "com.sourcegraph.shaded.relaxng.@1"
+          )
+          .inAll
+      ),
     crossPaths := false,
     PB.targets.in(Compile) :=
       Seq(PB.gens.java -> (Compile / sourceManaged).value)
@@ -280,7 +292,7 @@ lazy val fatjarPackageSettings = List[Def.Setting[_]](
     case PathList("sun", _ @_*) =>
       MergeStrategy.discard
     case PathList("META-INF", "versions", "9", "module-info.class") =>
-      MergeStrategy.first
+      MergeStrategy.discard
     case x =>
       val oldStrategy = (assemblyMergeStrategy in assembly).value
       oldStrategy(x)
@@ -291,6 +303,33 @@ lazy val fatjarPackageSettings = List[Def.Setting[_]](
     val _ = assembly.value
     IO.copyFile(fatJar, slimJar, CopyOptions().withOverwrite(true))
     slimJar
+  },
+  packagedArtifact.in(Compile).in(packageBin) := {
+    val (art, slimJar) = packagedArtifact.in(Compile).in(packageBin).value
+    val fatJar =
+      new File(crossTarget.value + "/" + assemblyJarName.in(assembly).value)
+    val _ = assembly.value
+    IO.copy(List(fatJar -> slimJar), CopyOptions().withOverwrite(true))
+    (art, slimJar)
+  },
+  pomPostProcess := { node =>
+    new RuleTransformer(
+      new RewriteRule {
+        private def isAbsorbedDependency(node: XmlNode): Boolean = {
+          node.label == "dependency" &&
+          node.child.exists(child => child.label == "artifactId")
+        }
+        override def transform(node: XmlNode): XmlNodeSeq =
+          node match {
+            case e: Elem if isAbsorbedDependency(node) =>
+              Comment(
+                "the dependency that was here has been absorbed via sbt-assembly"
+              )
+            case _ =>
+              node
+          }
+      }
+    ).transform(node).head
   }
 )

cli/src/main/scala/com/sourcegraph/lsif_java/IndexCommand.scala

Lines changed: 7 additions & 0 deletions
@@ -86,6 +86,7 @@ case class IndexCommand(
   def workingDirectory: Path = AbsolutePath.of(app.env.workingDirectory)
   def finalTargetroot(default: Path): Path =
     AbsolutePath.of(targetroot.getOrElse(default), workingDirectory)
+  def finalOutput: Path = AbsolutePath.of(output, workingDirectory)
   def finalBuildCommand(default: List[String]): List[String] =
     if (buildCommand.isEmpty)
       default
@@ -149,8 +150,14 @@ case class IndexCommand(
         } else {
           val generateLsifResult = process(
             "lsif-semanticdb",
+            s"--out=${finalOutput}",
             s"--semanticdbDir=${tool.targetroot}"
           )
+          if (
+            generateLsifResult.exitCode == 0 && Files.isRegularFile(finalOutput)
+          ) {
+            app.info(finalOutput.toAbsolutePath().toString())
+          }
           generateSemanticdbResult.exitCode + generateLsifResult.exitCode
         }
       case many =>
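
The `finalOutput` change above can be illustrated with a small standalone Java sketch. It is not the project's code: `OutputPaths`, `resolveOutput`, `reportableOutput` and the file name `dump.lsif` are hypothetical names chosen for illustration, and it assumes that `AbsolutePath.of(output, workingDirectory)` behaves like `Path.resolve` (relative paths are resolved against the working directory, absolute paths are kept). The second method mirrors the new guard: the output path is reported only when the process exited with code 0 and the file actually exists.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

final class OutputPaths {
    // Assumed stand-in for AbsolutePath.of(output, workingDirectory):
    // Path.resolve keeps absolute paths and anchors relative ones at the base.
    static Path resolveOutput(Path output, Path workingDirectory) {
        return workingDirectory.resolve(output);
    }

    // Mirrors the guard in the diff: only report the path when the tool
    // succeeded and the output is a regular file on disk.
    static Optional<String> reportableOutput(int exitCode, Path finalOutput) {
        if (exitCode == 0 && Files.isRegularFile(finalOutput)) {
            return Optional.of(finalOutput.toAbsolutePath().toString());
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        Path cwd = Files.createTempDirectory("lsif-java-demo");
        Path out = resolveOutput(Paths.get("dump.lsif"), cwd);
        // Nothing is reported before the file exists.
        System.out.println(reportableOutput(0, out).isPresent());
        Files.createFile(out);
        // Reported once the file exists and the exit code is 0.
        System.out.println(reportableOutput(0, out).isPresent());
        // Not reported when the process failed, even if a file is present.
        System.out.println(reportableOutput(1, out).isPresent());
    }
}
```

Checking for the file as well as the exit code avoids printing a path that a successful but empty run never wrote.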

docs/architecture.md

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+---
+id: design
+title: Design
+---
+
+This project is implemented as a
+[Java compiler plugin](https://docs.oracle.com/en/java/javase/11/docs/api/jdk.compiler/com/sun/source/util/Plugin.html)
+that generates one
+[SemanticDB](https://scalameta.org/docs/semanticdb/specification.html) file for
+every `*.java` source file. After compilation completes, the SemanticDB files
+are processed to produce LSIF.
+
+![A three stage pipeline that starts with a list of Java sources, creates a list of SemanticDB files that then become a single LSIF index.](docs/assets/semanticdb-javac-pipeline.svg)
+
+### Why Java compiler plugin?
+
+There are several benefits to implementing lsif-java as a compiler plugin:
+
+- **Simple installation**: compiler plugins are enabled with the `-Xplugin`
+  compiler option. All Java build tools support a way to customize compiler
+  options, simplifying installation.
+- **Language fidelity**: by using the Java compiler to produce semantic
+  information, we ensure that the produced LSIF data is accurate even as new
+  Java language versions with new language features are released.
+- **Environment fidelity**: by hooking into the compilation process of the build
+  tool, we minimize the risk of diverging from the CI build environment such as
+  installed system dependencies, custom compiler options and custom annotation
+  processors.
+
+### Why SemanticDB?
+
+SemanticDB is Protobuf schema for information about symbols and types in Java
+programs, Scala programs and other languages. There are several benefits to
+using SemanticDB as an intermediary representation for LSIF:
+
+- **Simplicity**: It's easy to translate a single Java source file into a single
+  SemanticDB file inside a compiler plugin. It's more complicated to produce
+  LSIF because compiler plugins does not have access to a project-wide context,
+  which is necessary to produce accurate definitions and hovers in multi-module
+  projects with external library dependencies.
+- **Performance**: SemanticDB is fast to write and read. Each compilation unit
+  can be processed independently to keep memory usage low. The final conversion
+  from SemanticDB to LSIF can be safely parallelized.
+- **Cross-language**: SemanticDB has a
+  [spec](https://scalameta.org/docs/semanticdb/specification.html) for Java and
+  Scala enabling cross-language navigation in hybrid Java/Scala codebases.
+- **Cross-repository**: Compiler plugins have access to both source code and the
+  classpath (compiled bytecode of upstream dependencies). SemanticDB has been
+  designed so that it's also possible to generate spec-compliant symbols from
+  the classpath alone (no source code) and from the syntax tree of an individual
+  source file (no classpath). This flexibility allows the
+  [Metals](https://scalameta.org/metals/) language server to index codebases
+  from a variety of different inputs, and will be helpful for lsif-java in the
+  future to unblock cross-repository navigation.
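
To make the "one output per source file" idea in the design page concrete, here is a minimal sketch of a Java compiler plugin built on the `com.sun.source.util.Plugin` API that the page links to. It is not the lsif-java plugin: `SketchPlugin`, the plugin name `sketch` and the `analyze` helper are hypothetical, and the listener only records which source files finished the ANALYZE phase instead of writing SemanticDB. In a real build the plugin would normally be registered through `META-INF/services/com.sun.source.util.Plugin` and enabled with `-Xplugin:sketch`; here the helper attaches it to an in-memory compilation by hand so the example is self-contained.

```java
import com.sun.source.util.JavacTask;
import com.sun.source.util.Plugin;
import com.sun.source.util.TaskEvent;
import com.sun.source.util.TaskListener;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class SketchPlugin implements Plugin {
    // Names of source files whose classes have been type-checked.
    final List<String> analyzed = new ArrayList<>();

    @Override
    public String getName() {
        return "sketch"; // hypothetical plugin name
    }

    @Override
    public void init(JavacTask task, String... args) {
        task.addTaskListener(new TaskListener() {
            @Override
            public void started(TaskEvent e) {}

            @Override
            public void finished(TaskEvent e) {
                // After ANALYZE the tree is fully attributed; a real indexer
                // would walk e.getCompilationUnit() at this point.
                if (e.getKind() == TaskEvent.Kind.ANALYZE) {
                    analyzed.add(e.getSourceFile().getName());
                }
            }
        });
    }

    // Helper (an assumption of this sketch): compile one in-memory source
    // with the plugin attached manually and return the files it observed.
    static List<String> analyze(String fileName, String source) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavaFileObject unit = new SimpleJavaFileObject(
                URI.create("string:///" + fileName), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, Collections.singletonList(unit));
        SketchPlugin plugin = new SketchPlugin();
        plugin.init(task);
        task.analyze(); // parse and type-check without generating class files
        return plugin.analyzed;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(analyze("Hello.java", "class Hello { int x = 1; }"));
    }
}
```

Because the listener runs inside the compiler's own pipeline, it sees exactly the sources, classpath and options that the build tool passed to `javac`, which is the "environment fidelity" point made above.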

docs/benchmarks.md

Lines changed: 10 additions & 2 deletions
@@ -1,7 +1,15 @@
-# Benchmarks results
+---
+id: benchmarks
+title: Benchmarks
+---
+
+The repository contains benchmarks to measure the overhead of the SemanticDB
+compiler plugin.

 ```
-sbt:root> bench/jmh:run -i 3 -wi 3 -f1 -t1
+$ sbt
+...
+sbt:root> bench/jmh:run -i 10 -wi 10 -f1 -t1
 ...
 [info] Benchmark (lib) Mode Cnt Score Error Units
 [info] CompileBench.compile guava ss 10 2291.036 ± 243.428 ms/op 1x
