Skip to content

Conversation

dstogov
Copy link
Member

@dstogov dstogov commented Oct 1, 2024

Execution of ZEND_FETCH_STATIC_PROP_R_SPEC_HANDLER in interpreter now requires ~35% less CPU instructions (according to callgrind).

JIT eliminates ~80% of CPU instructions for each ZEND_FETCH_STATIC_PROP_R opcode.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's late here, so I only looked at VM so far, I will check JIT tomorrow evening.

}
if (EXPECTED(op1_type == IS_CONST) && EXPECTED(CACHED_PTR(cache_slot) == ce)) {
*retval = CACHED_PTR(cache_slot + sizeof(void *));
if (EXPECTED(op1_type == IS_CONST) && EXPECTED(CACHED_PTR(cache_slot + sizeof(void *)) == ce)) {
Copy link
Member

@nielsdos nielsdos Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem right that you add sizeof(void *) here. CACHE_POLYMORPHIC_PTR takes offset 0 to the cache_slot for ce. The result is stored in offset sizeof(void *) instead of ce.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem right that you add sizeof(void *) here. CACHE_POLYMORPHIC_PTR takes offset 0 to the cache_slot for ce. The result is stored in offset sizeof(void *) instead of ce.

You are right! Thanks!

@kocsismate
Copy link
Member

I ran a benchmark (with the instruction count metric enabled) for this branch against the current master: https://gist.github.com/kocsismate/30d5e043d412f19fbebf7bc8f3fa16b8
Apparently this improves basically all tests indeed.

@dstogov
Copy link
Member Author

dstogov commented Oct 2, 2024

I ran a benchmark (with the instruction count metric enabled) for this branch against the current master: https://gist.github.com/kocsismate/30d5e043d412f19fbebf7bc8f3fa16b8 Apparently this improves basically all tests indeed.

Thanks for benchmarking. FETCH_STATIC_PROP is not very often used in real-life apps, so the average speed-up is less than a percent. On a syntactic benchmark the speed-up is really good.6.7%

I started to look into FETCH_STATIC_PROP speed now, because it is very useful in FFI code.
Together with #14491 it opens a way to write performance-competitive PHP extensions in PHP.

@dstogov dstogov closed this Oct 2, 2024
@iluuu1994
Copy link
Member

@dstogov I don't think you meant to close this?

@dstogov dstogov reopened this Oct 2, 2024
@dstogov
Copy link
Member Author

dstogov commented Oct 2, 2024

@dstogov I don't think you meant to close this?

I pressed to "close" by accident :)

@iluuu1994
Copy link
Member

Can't blame you 😉 GH is not great at UI.

Copy link
Member

@iluuu1994 iluuu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't see any other mistakes.

return result;
}

ZEND_API zval* zend_fetch_static_property(zend_execute_data *ex, int fetch_type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be marked with ZEND_FASTCALL since it is called from JIT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. It makes sense to switch to FASTCALL. This will allow to pass parameters in register in 32-bit x86 build.
I don't care about 32-bit x86 a lot.

ZEND_ASSERT(Z_TYPE_P(zv) == IS_STRING);
class_name = Z_STR_P(zv);
ce = zend_lookup_class_ex(class_name, NULL, ZEND_FETCH_CLASS_NO_AUTOLOAD);
if (ce && (ce->type == ZEND_INTERNAL_CLASS || ce->info.user.filename != op_array->filename)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we skipping internal classes only because of Windows? Not that it matters, we don't have many static properties in internal classes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your question make me think about the preloaded user classes...
I'll think how to improve this condition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to commit this without changes, because this is consistent with regular properties.
Both should be improved later to support preloaded classes and may be internal ones.

fetch_type = BP_VAR_UNSET;
break;
default:
ZEND_UNREACHABLE();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: EMPTY_SWITCH_DEFAULT_CASE().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree!

ir_CONST_U32(_ZEND_TYPE_MASK)));
ir_IF_FALSE(if_typed);
ir_END_list(merge);
ir_IF_TRUE(if_typed);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is the cold-path, should we move the loading of the property info to zend_jit_uninit_static_prop()? That would reduce code size a bit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree!

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have a question left, code looks good for the rest.

goto done;
case ZEND_FETCH_STATIC_PROP_FUNC_ARG:
if (!JIT_G(current_frame)
|| !JIT_G(current_frame)->call
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: looking at similar code there is also a check for !JIT_G(current_frame)->call->func, is that missing here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These func check seem not necessary.
JIT/FFI patch is going to remove them, because it adds an ability to inline FFI calls implemented through trampolines (without known func).
Here I didn't add the check in the first place.

@dstogov
Copy link
Member Author

dstogov commented Oct 2, 2024

@nielsdos @iluuu1994 Thank you very much! Your feedback was very helpful.
I'm going to merge this when tests are passed.

@dstogov dstogov merged commit 3f913c1 into php:master Oct 2, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants