Compute preliminary overlay database mode #3141

cklin · 2025-09-23T19:52:03Z

This PR updates the init action to compute a preliminary overlay database mode before the CodeQL CLI becomes available. The preliminary overlay database mode will be used in a future PR.

Risk assessment

For internal use only. Please select the risk level of this change:

Low risk: Changes are fully under feature flags, or have been fully tested and validated in pre-production environments and are highly observable, or are documentation or test only.

Merge / deployment checklist

Confirm this change is backwards compatible with existing workflows.
Consider adding a changelog entry for this change.
Confirm the readme and docs have been updated if necessary.

Copilot

Pull Request Overview

This PR updates the init action to compute a preliminary overlay database mode before the CodeQL CLI becomes available. This computation will be used in a future PR.

Key changes:

- Creates a new JSON file for overlay language aliases mapping
- Extracts the inputs object creation to happen earlier in the workflow
- Adds a new getPreliminaryOverlayDatabaseMode function to determine overlay database mode without CodeQL CLI
- Updates isOverlayAnalysisFeatureEnabled to work without CodeQL dependency

Reviewed Changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/overlay-language-aliases.json | New mapping file for language aliases used in overlay analysis |
| src/init.ts | Updates `initConfig` to accept CodeQL parameter |
| src/init-action.ts | Restructures input creation and adds preliminary overlay mode computation |
| src/feature-flags.ts | Removes CodeQL dependency for overlay analysis feature minimum version |
| src/config-utils.ts | Adds functions for computing overlay mode and loading config without CodeQL |
| src/config-utils.test.ts | Updates tests to accommodate new function signatures |
| lib/* | Generated JavaScript code reflecting TypeScript changes |

mbg

Thanks for having a go at this as an alternative to #3116. As discussed elsewhere, I think that having just the hard-coded alias mappings (in overlay-language-aliases.json here) is a better approach than having all of the CLI output and maintaining that with an extra workflow.

I have had a look over the details of the implementation here (but I have not yet looked at #3158). For the changes here, I have some fairly significant concerns about the duplication of the logic producing the UserConfig (i.e. the database configuration) in both getPreliminaryOverlayDatabaseMode and initConfig -- see my comments for details there.

This is also a fairly big change generally, and I would class it as high risk. That is partly because of the concerns I discuss in the comments, and partly because I don't have much confidence that our existing test coverage is enough to understand what this may or may not break.

I would propose one of the following:

- Depending on the extent to which the UserConfig for getOverlayDatabaseMode must align with the one computed in initConfig, modifying getPreliminaryOverlayDatabaseMode so that what it does has no effect on initConfig, and initConfig continues to work largely as is; or
- Placing all of the changes behind a FF, although this is likely just going to make the logic even more complex and could introduce more issues than it prevents if the first option isn't possible; or
- Making smaller changes incrementally, perhaps starting by refactoring some existing functions, in ways that don't change any behaviour, to set things up for the changes here.

mbg · 2025-10-01T10:55:53Z

src/config-utils.ts

+  logger: Logger,
+): Promise<string[]> {
+  // Obtain languages without filtering them.
+  const { rawLanguages } = await getRawLanguages(


getRawLanguages makes an API call as part of calling getRawLanguagesInRepo (if there's no languages input). Since getRawLanguages is currently called at most once, the result of that API call is not cached. With this change, it could get called more than once. It might make sense to cache the response, since it shouldn't change between calls.
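
One way to picture the caching suggested here is to memoize the promise itself, so that every caller shares a single request. This is only a sketch: `memoizeAsync`, `fetchRepoLanguages`, and `getRepoLanguagesCached` are hypothetical names, and `fetchRepoLanguages` merely stands in for the API call behind getRawLanguagesInRepo rather than reproducing its real signature.

```typescript
// Hypothetical helper: caches the promise returned by an async fetcher so that
// repeated callers share one underlying request.
function memoizeAsync<T>(fetch: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      // Forget the cached promise on failure so that a later call can retry.
      cached = fetch().catch((err: unknown) => {
        cached = undefined;
        throw err;
      });
    }
    return cached;
  };
}

// Stand-in for the API call; it counts invocations so the effect is visible.
let apiCalls = 0;
async function fetchRepoLanguages(): Promise<string[]> {
  apiCalls++;
  return ["javascript", "python"];
}

const getRepoLanguagesCached = memoizeAsync(fetchRepoLanguages);
```

Caching the promise rather than the resolved value also covers the case where a second caller arrives while the first request is still in flight.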

mbg · 2025-10-01T10:58:19Z

src/config-utils.ts

+ * CodeQL CLI. It is intended to be used for overlay analysis preparations
+ * before the CodeQL CLI is available.
+ */
+async function getUnverifiedLanguagesForOverlay(


Minor: It would be nice if this could live in overlay-database-utils.ts since it is specific to overlay databases. I imagine it's here instead because of the call to getRawLanguages. Do you think you could make rawLanguages a parameter of getUnverifiedLanguagesForOverlay and move it to overlay-database-utils.ts?

mbg · 2025-10-01T11:02:06Z

src/config-utils.ts

+  );
+  const languageAliases = overlayLanguageAliases as Record<string, string>;
+
+  const languagesSet: string[] = [];


Minor: languagesSet suggests that this is a Set object, but it is actually an array. The name is also misleading because I don't think that getRawLanguages attempts any de-duplication, so in theory the same language could appear multiple times here if the languages input contains it more than once.
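
For illustration, a version that takes the raw languages as a parameter and really does de-duplicate could look roughly like this. The function name and the alias table are hypothetical; the actual contents of overlay-language-aliases.json are not shown in this thread.

```typescript
// Hypothetical alias table, used only for this example.
const exampleAliases: Record<string, string> = {
  typescript: "javascript",
  kotlin: "java",
};

// Resolves aliases and removes duplicates while preserving first-seen order.
function resolveUniqueLanguages(
  rawLanguages: readonly string[],
  aliases: Record<string, string>,
): string[] {
  const seen = new Set<string>();
  for (const raw of rawLanguages) {
    const name = raw.trim().toLowerCase();
    if (name === "") {
      continue; // skip empty entries
    }
    // Only consult own properties, so names like "constructor" are not
    // accidentally resolved through the prototype chain.
    const resolved = Object.prototype.hasOwnProperty.call(aliases, name)
      ? aliases[name]
      : name;
    seen.add(resolved);
  }
  return Array.from(seen);
}
```

Because the raw languages are passed in rather than fetched, a function shaped like this has no dependency on getRawLanguages and could live in overlay-database-utils.ts.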

mbg · 2025-10-01T11:07:29Z

src/config-utils.ts

+ * This function should be called only once on any specific `InitConfigInputs`
+ * object. Otherwise it could emit a false warning.
+ */
+export function amendInputConfigFile(


This name is misleading because it suggests that we are somehow changing an existing configuration file, while we are actually just determining the one we wish to use (and writing the one provided as input to disk, if needed).

mbg · 2025-10-01T11:09:01Z

src/config-utils.ts

+ * This function should be called only once on any specific `InitConfigInputs`
+ * object. Otherwise it could emit a false warning.


It would be nice to enforce this, e.g. in the types, or with some state that records whether this has already been called for the given inputs.
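
As a rough sketch of the state-based option, a module-level `WeakSet` can record which inputs objects have already been processed. `ExampleInputs` is a hypothetical, pared-down stand-in for `InitConfigInputs`, and `markProcessedOnce` is an invented name.

```typescript
// Hypothetical stand-in for the relevant part of InitConfigInputs.
interface ExampleInputs {
  configFile: string | undefined;
  configInput: string | undefined;
}

// Tracks inputs objects that have already been processed. A WeakSet does not
// keep those objects alive once nothing else references them.
const processedInputs = new WeakSet<object>();

// Throws if called a second time with the same inputs object.
function markProcessedOnce(inputs: ExampleInputs): void {
  if (processedInputs.has(inputs)) {
    throw new Error("The config input was already processed for these inputs.");
  }
  processedInputs.add(inputs);
}
```

Whether a second call should throw, log, or silently return is a design choice this sketch does not try to settle.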

mbg · 2025-10-01T11:11:26Z

src/config-utils.ts

+        `Both a config file and config input were provided. Ignoring config file.`,
+      );
+    }
+    inputs.configFile = userConfigFromActionPath(inputs.tempDir);


Minor: I don't love that this relies on mutating inputs and that the return type of this function is void. Perhaps you could refactor this function so that it returns the appropriate value for inputs.configFile (i.e. either the initial inputs.configFile or the result of userConfigFromActionPath(inputs.tempDir)) and then update inputs.configFile to the result at the call site.
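
A minimal sketch of the non-mutating shape described above, under some assumptions: `ConfigSourceInputs` is a hypothetical subset of the real inputs type, `pathForConfigInput` stands in for userConfigFromActionPath, and writing the config input to disk is left out entirely.

```typescript
// Hypothetical, read-only subset of the inputs used by this sketch.
interface ConfigSourceInputs {
  readonly configFile: string | undefined;
  readonly configInput: string | undefined;
  readonly tempDir: string;
}

// Returns the config file path to use instead of assigning to inputs.configFile.
function chooseConfigFile(
  inputs: ConfigSourceInputs,
  pathForConfigInput: (tempDir: string) => string,
  warn: (message: string) => void,
): string | undefined {
  if (inputs.configInput === undefined) {
    // No config input: keep whatever config file was provided, if any.
    return inputs.configFile;
  }
  if (inputs.configFile !== undefined) {
    warn("Both a config file and config input were provided. Ignoring config file.");
  }
  return pathForConfigInput(inputs.tempDir);
}
```

The call site would then assign the result itself, for example `inputs.configFile = chooseConfigFile(inputs, userConfigFromActionPath, warnFn)`, where `warnFn` is whatever logging callback is at hand.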

mbg · 2025-10-01T11:22:48Z

src/config-utils.ts

+ * @returns An object containing the overlay database mode and whether the
+ * action should perform overlay-base database caching.
+ */
+export async function getPreliminaryOverlayDatabaseMode(


I am concerned about the amount of logic duplication that is taking place here (particularly the involvement of calculateAugmentation and generateCodeScanningConfig). I understand that these are required for getOverlayDatabaseMode, but we risk turning this into a big source of errors if there is inconsistency in what this function determines vs what initActionState determines.

For example, this function does not account for analysis-kinds being only code-quality, in which case all query customisation should be disabled. Notably, that has different semantics for repositoryProperties (where additional configuration is an info-level log message) and for inputs/config file values (where it is a ConfigurationError).

I assume that it is important that the computedConfig here mirrors the one in initConfig as accurately as possible, but correct me if I am wrong and this particular aspect doesn't matter.

If that does matter and viewing this issue in isolation, we would ideally have one function which figures computedConfig out given the necessary arguments for calculateAugmentation and generateCodeScanningConfig taking analysis-kinds into account, and then use that in both this function and initConfig.

mbg · 2025-10-01T11:24:23Z

src/init-action.ts

-    const qualityQueriesInput = getOptionalInput("quality-queries");
-
-    if (qualityQueriesInput !== undefined) {
+    if (inputs.qualityQueriesInput !== undefined) {


Minor: is there any particular reason for removing this constant, since it's still used in two places as before?

mbg · 2025-10-01T11:37:15Z

src/config-utils.ts

+  const userConfig = await loadUserConfig(
+    inputs.configFile,
+    inputs.workspacePath,
+    inputs.apiDetails,
+    tempDir,
+    logger,
+  );
+  const config = await initActionState(inputs, userConfig, codeql);


The interaction between this and what happens in getPreliminaryOverlayDatabaseMode is not obvious and possibly problematic. Specifically, getPreliminaryOverlayDatabaseMode suggests that it is related to only overlay databases, but it actually writes the configuration file that loadUserConfig ends up loading here. That configuration file already contains a CLI configuration that is a combination of a config-file or config input, with other Action inputs such as queries and packs as well as repository property values.

initActionState then performs essentially duplicate work calling calculateAugmentation (although I understand that it takes the languages as input and therefore could lead to different results than for the first call to it) which then gets fed to a second call to generateCodeScanningConfig. This could lead to some odd behaviours where assumptions that we make in generateCodeScanningConfig are invalidated because userConfig already contains merged results of various inputs from the first call to generateCodeScanningConfig that took place in getPreliminaryOverlayDatabaseMode. So we might end up with duplicate entries in the final configuration file, duplicate or confusing log messages, etc.

mbg · 2025-10-01T11:53:31Z

src/config-utils.ts

+  languages: Language[] | undefined,
+  languagesInput: string | undefined,


I am worried that having both (possibly disagreeing) values here could be a source of confusion or errors.
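
One possible way to rule out disagreeing values is to make the two sources mutually exclusive in the type. The sketch below is purely illustrative: `LanguageSource` and its variant names are hypothetical, and plain strings are used in place of the `Language` type.

```typescript
// Hypothetical type: exactly one of the two language sources is present.
type LanguageSource =
  | { kind: "resolved"; languages: string[] }
  | { kind: "requested"; languagesInput: string | undefined };

// Example consumer; the switch covers both variants, so the compiler can
// check that no case is forgotten.
function describeLanguageSource(source: LanguageSource): string {
  switch (source.kind) {
    case "resolved":
      return `resolved: ${source.languages.join(", ")}`;
    case "requested":
      return `requested: ${source.languagesInput ?? "(auto-detect)"}`;
  }
}
```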

cklin · 2025-10-01T22:35:55Z

Hi @mbg,

Thank you for your detailed comments. Before we go into the details, can you confirm that I understand your main concerns correctly?

1. You are concerned that this change is high-risk because it is a fairly big change, not under feature flags, and there is insufficient test coverage.
2. You are concerned that there is too much code duplication between getPreliminaryOverlayDatabaseMode() and the rest of the init action, and that the behavior of getPreliminaryOverlayDatabaseMode() is inconsistent with the main init action logic.
3. You are concerned that there is insufficient logical separation between getPreliminaryOverlayDatabaseMode() and the rest of the init action (for example, with initActionState()), and that the hidden interactions will cause maintenance burden down the road.

Do I understand your main concerns correctly? Am I missing anything?

henrymercer

Can we simplify our decision about whether to match the CLI version to the overlay version, and therefore reduce the amount of complexity here? I think it would be OK if we matched the CLI version to the overlay version, even if once we obtained the CLI we decided later we aren't actually going to run overlay. The main thing we need to ensure is that the decision is stable and doesn't flip on and off.

In particular, it would simplify things greatly if we didn't need to compute the full code scanning configuration, and instead made a decision based on the user's requested languages and build modes, and the feature flags.
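
To make the shape of such a simplified decision concrete, here is a sketch that depends only on requested languages, build mode, and feature flags. Every name in it is hypothetical, and the actual rules (which languages and build modes qualify) are not specified in this thread, so they are passed in as callbacks.

```typescript
// Hypothetical inputs: only what the user requested, no computed configuration.
interface PreliminaryOverlayInputs {
  readonly requestedLanguages: readonly string[];
  readonly buildMode: string | undefined;
}

// Returns true when the CLI version should be matched to the overlay version.
// The result depends only on its arguments, so the same inputs and flags give
// the same answer on every run.
function wantsOverlayCompatibleCli(
  inputs: PreliminaryOverlayInputs,
  isOverlayEnabledFor: (language: string) => boolean,
  isBuildModeSupported: (buildMode: string | undefined) => boolean,
): boolean {
  return (
    inputs.requestedLanguages.length > 0 &&
    isBuildModeSupported(inputs.buildMode) &&
    inputs.requestedLanguages.every((language) => isOverlayEnabledFor(language))
  );
}
```

A later, fully informed check could still decide not to run overlay analysis; this function would only answer the earlier question of which CLI version to obtain.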

cklin marked this pull request as ready for review September 23, 2025 20:32

cklin requested a review from a team as a code owner September 23, 2025 20:32

Copilot AI review requested due to automatic review settings September 23, 2025 20:32

Copilot AI reviewed Sep 23, 2025

cklin force-pushed the cklin/preliminary-overlay-mode branch from 0e88ef0 to 0d55271 Compare September 25, 2025 17:52

cklin added 10 commits September 26, 2025 15:13

Remove Feature.OverlayAnalysis minimumVersion

6e69a92

getOverlayDatabaseMode() already performs the same version check, so we can remove minimumVersion from Feature.OverlayAnalysis. Doing so allows the action to perform feature checks without CodeQL CLI.

Overlay: check features without CodeQL CLI

f6247bb

This commit changes isOverlayAnalysisFeatureEnabled() so that it uses the overlay-language-aliases.json file to resolve language aliases instead of relying on the CodeQL CLI.

Overlay: choose database mode without CodeQL CLI

046ce56

This commit makes getOverlayDatabaseMode() accept undefined as arguments for codeql and languages.

Move codeql out of InitConfigInputs

9ebca4c

Compute InitConfigInputs early

fcd4657

Call amendInputConfigFile() early

c079287

This commit extracts into amendInputConfigFile() the code that processes configInput, and moves the call from initConfig() into init-action.ts.

Move support code into loadUserConfig()

25b6845

Add getPreliminaryOverlayDatabaseMode()

57444cc

Compute preliminary overlay database mode

c3d80a1

build: refresh js files

c4d96be

cklin force-pushed the cklin/preliminary-overlay-mode branch from 0d55271 to c4d96be Compare September 26, 2025 22:15

cklin requested a review from mbg September 29, 2025 14:39

mbg requested changes Oct 1, 2025

henrymercer reviewed Oct 2, 2025
