Add removing unused parameter bytecode process to parser #1543

Merged
ksh8281 merged 9 commits into Samsung:master from kwonjeomsim:optimize
Mar 8, 2026

@kwonjeomsim kwonjeomsim commented Feb 27, 2026

This PR adds a step to the parser that skips bytecode generation for unused function parameters, in order to reduce memory usage.

To implement this, new fields that track parameter usage are added to ASTScopeContext and InterpretedCodeBlock. The actual check runs when the function scope is closed.

When eval or arguments appears in a function body, all parameters in that scope (and, for arguments, in all parent scopes as well) are treated as used, to keep the implementation simple.

Also, the if statement starting at line 3806 in esprima.cpp is needed to handle the test/vendortest/v8/test/mjsunit/es6/regress/regress-4395.js test.

I tested this feature with web-tooling-benchmark to observe peak memory usage, but GC runs irregularly, so the results are sometimes better and sometimes worse. The difference is about 10 to 100 KB in either direction.

Here are some of the benchmark results:

  • buble.js : 582,367 -> 582,344 (-23 KB)
  • terser.js : 582,387 -> 582,427 (+40 KB)
  • acorn.js : 582,404 -> 582,412 (+8 KB)
  • babel-minify.js : 582,480 -> 582,377 (-103 KB)
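
As a rough illustration of the idea described above (not the code in this PR), the usage information can be thought of as a 16-bit mask with one bit per parameter. The function name `computeParameterUsedMask`, the constant `kAllParametersUsed`, and the flat `usedNames` list are assumptions made for this sketch; the real parser collects used names per scope and per block.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical name: stands for "treat every parameter as used".
constexpr uint16_t kAllParametersUsed = 0xFFFF;

// Returns a bitmask in which bit i is set when parameter i is referenced.
// `usedNames` is an assumed flat list of names referenced in the function body.
inline uint16_t computeParameterUsedMask(const std::vector<std::string>& parameters,
                                         const std::vector<std::string>& usedNames,
                                         bool usesEvalOrArguments)
{
    // eval/arguments can reach any parameter, and a 16-bit mask cannot
    // describe more than 16 parameters, so fall back to "all used".
    if (usesEvalOrArguments || parameters.size() > 16) {
        return kAllParametersUsed;
    }

    uint16_t mask = 0;
    for (size_t i = 0; i < parameters.size(); i++) {
        if (std::find(usedNames.begin(), usedNames.end(), parameters[i]) != usedNames.end()) {
            mask = static_cast<uint16_t>(mask | (1u << i));
        }
    }
    return mask;
}
```

With parameters `a, b, c` and a body that references only `c` and `a`, this sketch sets bits 0 and 2, so a bytecode generator could skip the load for `b`.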

#endif

#ifndef ESCARGOT_DEBUGGER
uint16_t m_parameterUsed : 16;
Contributor

Please add a comment explaining the meaning of this field, e.g. that 0xFFFF means the function has many parameters (>16), and so on.

@kwonjeomsim kwonjeomsim force-pushed the optimize branch 2 times, most recently from 3345bfa to 72bba12 Compare February 27, 2026 11:27
for (size_t j = 0; j < ret->m_childBlockScopes.size(); j++) {
ASTBlockContext* block = ret->m_childBlockScopes[j];
if (VectorUtil::findInVector(block->m_usingNames, paramName) != VectorUtil::invalidIndex) {
scopeCtx->m_parameterUsed |= (1 << i);

The scopeCtx->m_parameterUsed |= (1 << i); line may shift 1 by i bits, which can overflow if i is greater than the bit width of m_parameterUsed, leading to undefined behavior. Ensure i is within bounds or use a wider type or bitset.
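
One way to guard the shift, sketched here with hypothetical names (`markParameterUsed`, `kParameterMaskBits`, `kDisableParameterCheck`), is to check the index against the mask width before shifting and to fall back to an "all used" value otherwise. This assumes no earlier check has already excluded functions with more than 16 parameters.

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kParameterMaskBits = 16;           // width of the usage mask
constexpr uint16_t kDisableParameterCheck = 0xFFFF; // hypothetical sentinel name

// Sets the bit for parameter `index`, or marks every parameter as used
// when the index does not fit into the 16-bit mask.
inline void markParameterUsed(uint16_t& mask, size_t index)
{
    if (index >= kParameterMaskBits) {
        mask = kDisableParameterCheck;
        return;
    }
    // Shift an unsigned value; the index is known to be below 16 here.
    mask = static_cast<uint16_t>(mask | (1u << index));
}
```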

#ifndef ESCARGOT_DEBUGGER
uint16_t m_parameterUsed : 16; // 0xFFFF means all parameters are used or function has more than 16 parameters

ASTScopeContext *m_parentScope;
Contributor

This approach searches for every use of a parameter bottom-up, which is why the additional m_parentScope is required.
Have you tried a top-down approach instead?
It seems possible to search from top to bottom without this additional information.

Contributor Author

I tried that already, and it is possible. But it uses almost the same amount of memory as this approach, and the code gets more complicated because it needs an extra step to deal with the arguments object.
Additionally, the top-down search itself is more complicated to implement than the bottom-up search.
So I think the bottom-up search is better than the top-down one.

Contributor Author

@hyukwoo-park
I rewrote it using the top-down method, and it looks better than I expected.
Please check again!

void setParameterUsedValue(ASTScopeContext* scopeCtx)
{
while (scopeCtx) {
for (size_t i = 0; i < this->currentScopeContext->m_parameters.size(); i++) {

this->currentScopeContext->m_parameters is used in the loop header of setParameterUsedValue, but the function receives scopeCtx. As a result, when setParameterUsedValue is called recursively on child scopes, the loop still iterates over the original currentScopeContext's parameters, leaving child scopes' parameters unprocessed and potentially mis-marking used parameters. This logic flaw can lead to incorrect parameter usage detection in the parser. The minimal fix is to replace this->currentScopeContext with scopeCtx in the loop header and all subsequent accesses inside the function.

}

#ifndef ESCARGOT_DEBUGGER
void setParameterUsedValue(ASTScopeContext* scopeCtx)

Use the function argument scopeCtx consistently inside setParameterUsedValue instead of this->currentScopeContext. The current loop accesses this->currentScopeContext->m_parameters and this->currentScopeContext->m_parameterUsed, which can be confusing because the function receives scopeCtx. Switching to scopeCtx clarifies that the function operates on the passed context and prevents accidental reliance on the outer currentScopeContext. This small change improves readability and maintainability without altering the overall logic. It also reduces the risk of subtle bugs when the function is called recursively on child scopes.

}

#ifndef ESCARGOT_DEBUGGER
void setParameterUsedValue(ASTScopeContext* scopeCtx)
Contributor

(minor) IMO the name setParameterUsedValue is not intuitive. How about another name, such as checkParameterUsage?

#endif
#ifndef ESCARGOT_DEBUGGER
if (UNLIKELY(paramNames.size() > 16)) {
this->currentScopeContext->m_parameterUsed = 0xFFFF;
Contributor

0xFFFF might be confusing, so it would be better to use a macro constant like the one below (just an example):

#define DISABLE_PARAMETER_REMOVE 0xFFFF

@kwonjeomsim kwonjeomsim force-pushed the optimize branch 2 times, most recently from b28e581 to a07c5f4 Compare March 3, 2026 06:08
Comment on lines +325 to +327
if (this->currentScopeContext->m_parameterUsed == DISABLE_PARAM_CHECK) {
break;
}
Contributor

Please move this check out of the outermost while loop (with the UNLIKELY macro). It also seems possible to return directly inside the check.

break;
}

AtomicString name = this->currentScopeContext->m_parameters[i];
Contributor

This line could be improved as shown below, by using a reference:

// the parameter list is fixed, so take a reference outside the while loop to shorten the code
AtomicStringTightVector& parameters = this->currentScopeContext->m_parameters;

while (...) {
  ...
  AtomicString& name = parameters[i];

Contributor

With this approach, the outermost while loop could also be removed.

Comment on lines +342 to +351
if (scopeCtx->firstChild()) {
checkUsedParameters(scopeCtx->firstChild());
}

if (scopeCtx != this->currentScopeContext) {
scopeCtx = scopeCtx->nextSibling();
} else {
break;
}
}
Contributor

This part could be refactored as below:

size_t childCount = scopeCtx->childCount();
if (childCount > 0) {
  ASTScopeContext* childScope = scopeCtx->firstChild();
  for (size_t i = 0; i < childCount; i++) {
    checkUsedParameters(childScope);
    childScope = childScope->nextSibling();
  }
}

I think this approach is easier to maintain.
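
For readers unfamiliar with the first-child/next-sibling layout used here, the following is a minimal, self-contained sketch of a top-down search over such a tree. The `Scope` struct and `isNameUsedInSubtree` are hypothetical stand-ins, not the real ASTScopeContext API, and the sketch ignores shadowing (a nested function redeclaring the same name), which a real implementation would have to consider.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a scope node; the real ASTScopeContext differs.
struct Scope {
    std::vector<std::string> usingNames; // names referenced directly in this scope
    Scope* firstChild = nullptr;         // first nested scope, if any
    Scope* nextSibling = nullptr;        // next scope at the same nesting level
};

// Returns true if `name` is referenced in `scope` or in any scope nested in it.
inline bool isNameUsedInSubtree(const Scope* scope, const std::string& name)
{
    if (std::find(scope->usingNames.begin(), scope->usingNames.end(), name) != scope->usingNames.end()) {
        return true;
    }
    // Visit each child in turn by following the sibling links.
    for (const Scope* child = scope->firstChild; child; child = child->nextSibling) {
        if (isNameUsedInSubtree(child, name)) {
            return true;
        }
    }
    return false;
}
```

Because the walk starts at the function scope and only descends, no parent pointer is needed, which matches the motivation given earlier in the thread for trying the top-down method.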

ASTScopeContext* popScopeContext(ASTScopeContext* lastPushedScopeContext)
{
#ifndef ESCARGOT_DEBUGGER
checkUsedParameters(this->currentScopeContext);
Contributor

checkUsedParameters is invoked at each popScopeContext, which means the parameter check runs after the scan of each function completes.
At that point, we already know whether the function uses eval or has more than 16 parameters.
So, we can check those cases right before invoking checkUsedParameters here.

for (size_t j = 0; j < scopeCtx->m_childBlockScopes.size(); j++) {
ASTBlockContext& blockCtx = *scopeCtx->m_childBlockScopes[j];
if (VectorUtil::findInVector(blockCtx.m_usingNames, name) != VectorUtil::invalidIndex) {
this->currentScopeContext->m_parameterUsed |= (1 << i);

The expression 1 << i in this->currentScopeContext->m_parameterUsed |= (1 << i); performs a left shift by the loop index i. If i is greater than or equal to the bit width of m_parameterUsed, the shift is undefined and can corrupt the bitmask. Cast the shift to a wider type or guard against i exceeding the width of m_parameterUsed before setting the bit.

if (VectorUtil::findInVector(blockCtx.m_usingNames, name) != VectorUtil::invalidIndex) {
this->currentScopeContext->m_parameterUsed |= (1 << i);
break;
} else if (scopeCtx->m_parameterUsed == DISABLE_PARAM_CHECK && VectorUtil::findInVector(blockCtx.m_usingNames, stringArguments) != VectorUtil::invalidIndex) {

The condition checks scopeCtx->m_parameterUsed == DISABLE_PARAM_CHECK but the assignment later sets this->currentScopeContext->m_parameterUsed = DISABLE_PARAM_CHECK. This mismatch means the disable flag may never be detected, causing incorrect parameter usage checks. Change the condition to test this->currentScopeContext->m_parameterUsed instead of scopeCtx->m_parameterUsed to ensure the flag is correctly evaluated.

#endif
#ifndef ESCARGOT_DEBUGGER
, m_hasStringArguments(false)
, m_parameterUsed(0)

Remove the duplicate initialization of m_parameterUsed(0) inside the second #ifndef ESCARGOT_DEBUGGER block. The struct already initializes m_parameterUsed in the first conditional block, so the second occurrence causes a duplicate member initialization error when ESCARGOT_DEBUGGER is not defined. Keeping a single initialization ensures the constructor remains well-formed and avoids compilation failures. If the intention was to add a new member, consider renaming or moving the initialization to a separate block.

if (VectorUtil::findInVector(blockCtx.m_usingNames, name) != VectorUtil::invalidIndex) {
this->currentScopeContext->m_parameterUsed |= (1 << i);
break;
} else if (scopeCtx->m_parameterUsed == DISABLE_PARAM_CHECK) {

The else if now triggers whenever scopeCtx->m_parameterUsed == DISABLE_PARAM_CHECK, causing an early return that skips checking remaining parameters. This can lead to incorrect m_parameterUsed state. Restore the original condition or add a guard for stringArguments.

Comment on lines +336 to +337
bool m_hasStringArguments : 1;
uint16_t m_parameterUsed : 16;
Contributor

To minimize padding, please place the boolean and uint16_t members next to the other boolean and 16-bit members.
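
The effect of member ordering on padding can be seen with a small sketch. The two structs below are hypothetical and use plain members rather than the bit-fields in this PR; exact sizes are implementation-defined, but on common ABIs with 4-byte alignment for uint32_t the interleaved layout typically takes 20 bytes and the grouped one 12.

```cpp
#include <cstdint>

// Hypothetical structs for illustration only; not the real Escargot types.
// Small members are scattered between 4-byte members, so the compiler
// usually inserts padding after each of them.
struct Interleaved {
    bool flagA;
    uint32_t countA;
    uint16_t mask;
    uint32_t countB;
    bool flagB;
};

// The same members, with the 16-bit and boolean members grouped at the end
// so they can share a single 4-byte slot.
struct Grouped {
    uint32_t countA;
    uint32_t countB;
    uint16_t mask;
    bool flagA;
    bool flagB;
};

static_assert(sizeof(Grouped) <= sizeof(Interleaved),
              "grouping small members should not grow the struct");
```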

@clover2123
Contributor

Looks much better 👍

@kwonjeomsim kwonjeomsim force-pushed the optimize branch 4 times, most recently from 9d23dc6 to c26915d Compare March 5, 2026 12:27
@clover2123
Contributor

@ksh8281
Could you review this patch?
In real apps, quite a lot of parameters are defined but never used.
So we try to find these cases and optimize them by simply removing the parameter-load bytecodes.
If there is a better way, please share your opinion.

@ksh8281
Contributor

ksh8281 commented Mar 6, 2026

Please add a comment explaining the meaning of m_parameterUsed, e.g. that 0xFFFF means the function has many parameters (>16).

@ksh8281 ksh8281 merged commit f41ec34 into Samsung:master Mar 8, 2026
31 of 32 checks passed