Memory model and its impact on refactoring #8
Replies: 18 comments 83 replies
-
Hi all. Senior software artisan here with 15 years of PHP experience, 8 years of async PHP experience (amphp v2, then v3), 5 years of Go experience, 3 years of Rust experience, and much more. My recommendation for async PHP is:
This model matches the status quo in Go and amphp v3. After 8 years of writing both async business logic and async abstractions in multiple languages, I consider it the best concurrency model.
Some evidence: I maintain MadelineProto, the biggest PHP MTProto client. MTProto is an async binary protocol over TCP (no HTTP involved); it requires heavy caching for clients to function correctly, as well as a lot of race-heavy abstractions. MadelineProto is a framework which contains a fully async MTProto client. I migrated MadelineProto to async PHP with amphp v2 (colored, stackless async) in 2018, and then to amphp v3 (colorless, stackful async) in 2022.
Outside of PHP: in 2020, I started writing heavily async business logic in Go and was very positively surprised by the superior developer experience, especially when using async. I was also very pleased to learn that Go encourages (but does not force) the use of the actor model through channels. In my professional Go experience, I wrote heavily concurrent and massively parallel high-load services using Go's stackful, colorless approach. I also used Rust to write async business logic, and suffered from the same issues I had with amphp v2: large amounts of boilerplate await keywords, heavily impacting UX. To this day, I consider Go's approach superior to that of any other language, a view confirmed by extensive development experience.
-
From the PHP Engine side, we can provide developers with additional tools to help detect errors. For example:
In addition, code safety can be reinforced through a special directive at the package / Composer level. There was also an idea to use a flag in php.ini. This gives us several mechanisms at once to avoid unintended side effects.
-
Treating WordPress and other code that uses globals heavily as not a target for this proposal is a likely path to rejection. A couple of voters have already said that such an approach is a no-go (that was in the FPM context, but I would guess it will be the same here). If you also consider that there are probably some "no" votes based purely on the complexity and on the opinion that this is not needed at all, then I think there is not much chance this could pass. So I think the model most likely to pass is shared nothing.
-
The INI approach is also a path to rejection: it will draw quite a few "no" votes straight away.
-
I want to share a few thoughts regarding models 2 and 3. It seems that the shared-nothing model is an absolute winner in terms of safety. I want to point out that this does not come for free. The second memory model (as well as the first, of course) requires manual synchronization from the programmer. The third memory model requires the same as the second. The difference is that shared memory must always be declared explicitly in the language.
I have a classic PHP task that must be solved efficiently.
Each request is handled by a separate coroutine.
Memory model 2 answers:
Memory model 3 answers:
What are the possible solutions for model 3?
Readonly classes.
Using actors. An actor is a class whose methods are guaranteed to be called within only one coroutine at a time. Actors significantly simplify writing asynchronous code, are thread-safe, and can be freely passed between coroutines.
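The actor idea can be sketched roughly as follows. Since the thread has not settled on a PHP API, this is only an analogy in Python's asyncio, not proposed PHP syntax; `CounterActor`, its mailbox, and the `add`/`stop` methods are hypothetical names invented for illustration. The point is that all state changes happen inside one worker coroutine, so callers can hold a reference to the actor without ever touching its state directly.

```python
import asyncio


class CounterActor:
    """Hypothetical actor: its state is only touched by the run() coroutine."""

    def __init__(self):
        self._mailbox = asyncio.Queue()
        self._count = 0

    async def run(self):
        # Single worker loop: messages are processed strictly one at a time.
        while True:
            amount, reply = await self._mailbox.get()
            if amount is None:  # stop message
                reply.set_result(self._count)
                return
            current = self._count
            # A suspension point inside a read-modify-write. With several
            # writers this would race; with one worker it cannot.
            await asyncio.sleep(0)
            self._count = current + amount
            reply.set_result(self._count)

    async def add(self, amount):
        # Callers send a message and wait for the reply.
        reply = asyncio.get_running_loop().create_future()
        await self._mailbox.put((amount, reply))
        return await reply

    async def stop(self):
        reply = asyncio.get_running_loop().create_future()
        await self._mailbox.put((None, reply))
        return await reply


async def main():
    actor = CounterActor()
    worker = asyncio.create_task(actor.run())
    # 100 concurrent callers share the same actor reference.
    await asyncio.gather(*(actor.add(1) for _ in range(100)))
    total = await actor.stop()
    await worker
    return total


total = asyncio.run(main())
print(total)  # 100
```

The cost of this design is that every call becomes a message round trip, which is the "does not come for free" part of the trade-off described above.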
-
I believe this "working group" already starts on the wrong foot with even the definition of "Stage 1". Its first purpose ought to be: "What is the single most basic thing for most PHP users?" I believe that would be a simple "Run these three things in parallel and then wait until they're done/exceptioned and get the return values". Launching straight into the intricacies of implementation isn't a constructive way forward, IMO.
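For readers who have not written async code, the "run three things and wait" use case looks roughly like the sketch below. It uses Python's asyncio as a stand-in, because no PHP API has been agreed on in this thread; `fetch` and the task names are made up for illustration, and `asyncio.sleep` stands in for real I/O.

```python
import asyncio


async def fetch(name: str, delay: float) -> str:
    # Stand-in for real I/O such as an HTTP call or a database query.
    await asyncio.sleep(delay)
    if name == "broken":
        raise RuntimeError(f"{name} failed")
    return f"{name} done"


async def run_three() -> list:
    # return_exceptions=True collects failures alongside results
    # instead of raising on the first error. Results keep call order.
    return await asyncio.gather(
        fetch("users", 0.02),
        fetch("broken", 0.01),
        fetch("orders", 0.03),
        return_exceptions=True,
    )


results = asyncio.run(run_three())
for result in results:
    print(repr(result))
```

The caller gets back either a return value or the exception for each of the three tasks, in the order they were started.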
-
PHP's reputation has long been tarnished because it's so "easy" to use that PHP gets used a lot by people who don't know what they're doing, and so produce crap code that happens to be in PHP. And for a long time, PHP let them do it. Hell, it still does. 😃 Introducing an "easy" async toolset that means people who don't have solid async experience will shoot themselves in the foot is actively harmful to the language and ecosystem. We must prioritize not allowing mixed-package applications (which is nearly all of them) to develop heisenbugs because some random library did something async. Even if that makes it harder to use, even if that means it's not as powerful as it could be, that must be priority 1. And I still hold that structured-only nursery-style is the best way to achieve that. If that means certain fringe use cases cannot be easily handled... that's an acceptable tradeoff if it protects users from sloppy other users. (Where "sloppy" means "anyone who doesn't really know how to think about async", which is 99% of PHP devs today.)
-
This comment was marked as off-topic.
-
Several cases regarding memory model 3.
Case 1. Getting the value of a coroutine or a Future. This issue can be partially mitigated by cloning the object, but that immediately introduces additional pitfalls.
Case 2. Lazy objects. I have a service, and this service has two dependencies provided as lazy objects. Consequences: Model 3 forbids using lazy objects, and DI resolution becomes difficult unless it is solved in some other way.
Case 3. The problem of passing complex data structures. I have a data structure that contains cyclic references. This is a classic headache in Rust, and not only in Rust.
Case 4. Complex unexpected errors. There is code that passes object X into a coroutine. I write a new function and accidentally capture the object in some closure without noticing it. This problem would not exist if PHP were a compiled language with a memory model that could be validated at compile time.
So memory model 3, while solving one problem, creates another. There is a way out: explicitly define two types of coroutines, those that run in the same thread and those that run in different threads.
-
This discussion is focused on the specific set of questions stated in the topic title. If you would like to share feedback, criticism, or general comments, please use a separate topic. Thank you!
-
First of all, thank you sincerely for all the time and dedication you've put into this. I know the community can be very demanding, and working through all the feedback in such detail is both tedious and exhausting. After going back through the threads and discussions, I realized that for developers who haven’t worked with asynchronous logic before, it’s genuinely difficult to grasp how it affects program flow, why shared globals are dangerous, how race conditions can arise, and why different memory models matter for safety and predictability. A big source of confusion also seems to be understanding how existing synchronous code would need to be refactored under each memory isolation model. Without clear intuition for that, it's hard for people to assess the real-world impact of each option. The recurring questions like “why do we even need this?” and calls for concrete use cases suggest that we’ll need to be extremely beginner friendly in all discussions where possible, if this is going to move forward and eventually gain adoption without friction. Would it make sense to expand the initial post with a few very simple, high-level examples under
These kinds of examples have come up in scattered places in the RFC proposals, but gathering them in one place could really help readers who aren’t used to async programming. Lastly, would it be worthwhile to actively reach out to major players like WordPress, Laravel, Symfony, Adobe, Prestashop, etc., and invite them to weigh in? Although I believe they would need a strong nudge from the active PHP maintainers to actually invest time in it, their perspective could help shape the proposal and build confidence that the direction aligns with the wider ecosystem and will eventually be adopted.
-
Support for no memory separation
-
Due to how static properties are implemented, I also tend to think it would be reasonable to limit isolation to GLOBALS and superglobals. At the same time, I would suggest isolating superglobals not per coroutine but per separate scope, which would make the code much more flexible. Superglobals could be used as a Coroutine/Scope context. The next step is to put all these options up for a vote.
-
One thing to consider properly is the separation of internal globals. If those are shared, we should straight away rule out any future parallelization, because extensions already rely on those globals being thread safe, so it would not be realistically possible to change this expectation in the future. If we separate them, we would be doing pretty much the same thing that parallel does while also supporting coroutines (which also means changes in TSRM, similar to the ones FrankenPHP needs for using goroutines). I honestly think that full separation is the way to go, as it closes all footguns and would allow using threads in the future.
-
There seems to be a misunderstanding about how the threading model of PHP is supposed to work. Parallel doesn't do any "tricks"; it implements the threading model as required. Talk of needing "sharable opcodes" and a new memory model betrays a misunderstanding: opcodes are already sharable, and there's no problem with the memory model. Talk of something called "true parallelism" does exactly the same thing. I'm not certain that this is a productive discussion. Talk of burdening lightweight concurrency primitives with not only what the engine requires, but also what you imagine is necessary on top, changes the nature of the primitive from something lightweight into something that, whatever else it is, can no longer be called lightweight. The reason coroutines are suitable as a primitive in a reactive application is their nature, the very fact that they are lightweight: re-entering IO doesn't require a heavy context switch ... This direction is a non-starter for me ...
-
Here you see many instances of WordPress running inside a single PHP process that never dies (well, almost never). Each request creates its own personal WordPress instance, which sees only its own global variables and therefore works perfectly without conflicting with other requests. The cherry on top is the SUPERGLOBALS: they are shared between multiple coroutines within a single request, forming a kind of sandbox. No performance degradation.
-
Memory model and its impact on refactoring
At this stage of the discussion, I propose focusing on one of the three main questions.
🔑 Key Questions
1. Choosing the Coroutine Model
How coroutines should behave internally:
At the moment, the stackful coroutine model has been chosen, since it allows interaction with C functions and enables mixing C and PHP function calls.
The likelihood of using a different model is low, so this question can be considered resolved.
2. Choosing the Memory Model
How state is managed in asynchronous execution:
The current RFC proposes using the least protected memory model, where coroutines freely share global and static variables.
This question is not resolved and requires further consideration.
The choice of memory model will have a significant impact on the evolution of the language for years to come.
3. Choosing the Runtime Strategy
Defining how async integrates into PHP:
This question will not be discussed at this stage, but it remains important for the consideration of question 2.
🎯 Purpose of Stage 1
The goal of the first discussion stage is to explore the memory model for async execution in PHP.
The memory model has a significant impact on how coroutines will interact with existing PHP code. This is of primary importance. In addition, the memory model determines the direction of the language’s evolution for many years, defining whether a multithreaded model becomes possible or not.
Without resolving this question, we cannot move forward.
Important elements of the discussion can be found here:
https://discourse.thephp.foundation/t/php-dev-vote-true-async-rfc-1-6/4777
Please note the message from Rowan_Tommins_IMSoP:
https://discourse.thephp.foundation/t/php-dev-vote-true-async-rfc-1-6/4777/75
Possible Memory Models for Coroutines
There are three potential memory models for coroutine execution in PHP:
1. No Memory Separation (current behavior)
Coroutines freely share:
Characteristics:
2. Separated Globals/Statics (objects still shared)
Each coroutine receives:
But:
Status:
3. Full Shared-Nothing Model (maximum isolation)
This model extends option 2 with strict memory isolation:
Rules:
RefCount > 1.
Advantages:
Disadvantages:
own) into the language.
P.S. Models one and two are used in many languages, including Go. In Go, for example, developers themselves are responsible for avoiding incorrect use of shared memory.
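As a rough illustration of the difference between models 1 and 2, the sketch below uses Python's asyncio, where coroutines are also cooperative and run on one thread. It is an analogy only and says nothing about how PHP would implement either model: a module-level variable plays the role of a shared global, `contextvars` plays the role of per-coroutine globals, and all function and variable names are hypothetical.

```python
import asyncio
import contextvars

# Model 1 analogue: one global visible to every coroutine.
current_user = None

# Model 2 analogue: each task works on its own copy of this variable...
user_var = contextvars.ContextVar("user_var", default=None)
# ...but ordinary objects are still shared between tasks.
shared_cache = []


async def handle_shared(user):
    global current_user
    current_user = user
    await asyncio.sleep(0.01)  # suspension point: other coroutines run here
    # By now another request may have overwritten the global.
    return (user, current_user)


async def handle_isolated(user):
    user_var.set(user)         # only visible inside this task
    shared_cache.append(user)  # visible to every task
    await asyncio.sleep(0.01)
    return (user, user_var.get())


async def main():
    shared = await asyncio.gather(
        handle_shared("alice"), handle_shared("bob")
    )
    isolated = await asyncio.gather(
        handle_isolated("alice"), handle_isolated("bob")
    )
    return shared, isolated


shared, isolated = asyncio.run(main())
print(shared)        # [('alice', 'bob'), ('bob', 'bob')]
print(isolated)      # [('alice', 'alice'), ('bob', 'bob')]
print(shared_cache)  # ['alice', 'bob']
```

In the first run, the "alice" request ends up reading "bob" after its suspension point, which is the kind of bug legacy code built on globals would hit under model 1. In the second run each task keeps its own value, while the list is still shared, which is the "objects still shared" part of model 2. Model 3 would additionally restrict sharing of objects such as `shared_cache`.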