[PoC] IAsyncProcess#52055

Closed
ArtificialOwl wants to merge 1 commit into master from enh/noid/async-process-run

@ArtificialOwl
Member

@ArtificialOwl ArtificialOwl commented Apr 8, 2025

AsyncProcess

IAsyncProcess allows code to be executed in a separate process in order to improve the user experience.

Concept

To shorten the time the user spends waiting on heavy processes, and to avoid a delay between
instruction and execution, this API lets you prepare instructions to be executed in a parallel process.

This is obtained by creating a loopback HTTP request, initiating a fresh PHP process that will execute the
prepared instruction after emulating a connection termination, freeing the main process.

Technology

The logic is to:

  • store a serialized version of the code to be executed by another process in the database,
  • start a new process as soon as possible that will retrieve the code from the database and execute it.
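
The two steps above can be sketched with plain PHP serialization. This is only an illustration under stated assumptions, not the code of this PR: `SumTask`, `storeBlock()` and `runBlock()` are hypothetical names, an array stands in for the database table, and both steps run in the same process. Note that native closures cannot be passed to `serialize()`, so the sketch uses an invokable object.

```php
<?php
declare(strict_types=1);

// Hypothetical invokable; stands in for the code a caller wants to defer.
final class SumTask {
	public function __construct(
		private array $numbers,
	) {
	}

	public function __invoke(int $factor): int {
		return array_sum($this->numbers) * $factor;
	}
}

// Stand-in for the database table: token => serialized payload.
$storage = [];

// Step 1 (main process): serialize the invokable and its arguments.
function storeBlock(array &$storage, object $invokable, array $args): string {
	$token = bin2hex(random_bytes(8));
	$storage[$token] = serialize(['code' => $invokable, 'args' => $args]);

	return $token;
}

// Step 2 (fresh process): load the payload, rebuild the object and run it.
function runBlock(array $storage, string $token): mixed {
	// Only allow the classes we expect to be rebuilt from the stored payload.
	$payload = unserialize($storage[$token], ['allowed_classes' => [SumTask::class]]);

	return ($payload['code'])(...$payload['args']);
}

$token = storeBlock($storage, new SumTask([1, 2, 3]), [10]);
$result = runBlock($storage, $token); // (1 + 2 + 3) * 10
```

Restricting `allowed_classes` is one way to limit what a stored payload can instantiate; it does not replace the signing question raised in the work-in-progress list.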

Setup

The feature requires a loopback URL to be set.
This is done automatically by a background job. The automatic process uses 'overwrite.cli.url' and 'trusted_domains' from config/config.php to build a list of domain names to test.
Discovery can also be initiated via occ:

./occ async:setup --discover

Or manually set with:

./occ async:setup --loopback https://cloud.example.net/

Blocks & Sessions

  • A Block is a complete, unsplittable part of code.
  • A list of Blocks can be grouped into a Session.
  • While Sessions are independent of each other, interactions can be set between Blocks of the same Session.

Interactions

  • Blocks are executed in the order they have been created.
  • A Block can get the results of a previous Block from the same Session.
  • A Block defined as blocker will stop further processing of that Session on failure.
  • A Block can require a previous Block to be successful before being executed.

Replayability

  • A Block can be set as replayable, meaning that in case of failure it can be run multiple times until it ends properly.

Quick example

// define all parts of the code that can run asynchronously
$this->asyncProcess->invoke($myInvoke1)->id('block1')->replayable();       // block1 can be replayed until successful
$this->asyncProcess->invoke($myInvoke2)->id('block2')->require('block1');  // block2 will not be executed until block1 has been successful
$this->asyncProcess->invoke($myInvoke3)->id('block3')->blocker();          // block3 runs whatever happened to block1 and block2, and its success is mandatory to continue with the session
$this->asyncProcess->invoke($myInvoke4)->id('block4');

$this->asyncProcess->async(); // close the session and initiate async execution
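
To make the semantics of `require()` and `blocker()` in the example above more concrete, here is a toy single-pass runner. It is a sketch under assumptions, not the scheduler of this PR: `runSession()` and the array layout are hypothetical, replays are ignored, and a Block whose requirement is not met is simply marked as skipped.

```php
<?php
declare(strict_types=1);

/**
 * Toy runner. Each block is an array with:
 *   'id' (string), 'run' (callable returning bool),
 *   optional 'require' (list of ids) and optional 'blocker' (bool).
 * Returns id => 'success' | 'error' | 'skipped', in creation order.
 */
function runSession(array $blocks): array {
	$outcome = [];
	$locked = false; // becomes true once a blocker has failed

	foreach ($blocks as $block) {
		// Ids of required blocks that did not end successfully.
		$missing = array_filter(
			$block['require'] ?? [],
			fn (string $id): bool => ($outcome[$id] ?? null) !== 'success'
		);

		if ($locked || $missing !== []) {
			$outcome[$block['id']] = 'skipped';
			continue;
		}

		$ok = ($block['run'])();
		$outcome[$block['id']] = $ok ? 'success' : 'error';

		if (!$ok && ($block['blocker'] ?? false)) {
			$locked = true;
		}
	}

	return $outcome;
}

$outcome = runSession([
	['id' => 'block1', 'run' => fn (): bool => false],
	['id' => 'block2', 'run' => fn (): bool => true, 'require' => ['block1']],
	['id' => 'block3', 'run' => fn (): bool => false, 'blocker' => true],
	['id' => 'block4', 'run' => fn (): bool => true],
]);
```

In this run block1 fails, so block2 is skipped; block3 still runs, and because it is a blocker and fails, block4 is never executed.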

ProcessExecutionTime

Code is to be executed as soon as defined, with alternative fallback solutions in that order:

  • ::NOW - main process will fork and execute the code in parallel (instantly)
  • ::ASAP - process will be executed by an optional live service (a second later)
  • ::LATER - process will be executed at the next execution of the cron tasks (within the next 10 minutes)
  • ::ON_REQUEST - process needs to be executed manually
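
The fallback order above could be modelled as follows. This is a minimal sketch, assuming that each mechanism can be probed for availability; the `ExecutionTime` enum, its integer values and `pickExecutionTime()` are hypothetical and only mirror the list, with manual execution assumed to be the last resort.

```php
<?php
declare(strict_types=1);

// Hypothetical mirror of the execution times listed above, in fallback order.
enum ExecutionTime: int {
	case NOW = 0;
	case ASAP = 1;
	case LATER = 2;
	case ON_REQUEST = 3;
}

/**
 * Return the first available execution time, walking the cases
 * from ::NOW down to ::ON_REQUEST.
 *
 * @param list<ExecutionTime> $available mechanisms usable on this instance
 */
function pickExecutionTime(array $available): ExecutionTime {
	foreach (ExecutionTime::cases() as $candidate) {
		if (in_array($candidate, $available, true)) {
			return $candidate;
		}
	}

	// Assumption: manual execution is always possible.
	return ExecutionTime::ON_REQUEST;
}
```

For instance, if no loopback URL is configured but the live service is running, `pickExecutionTime([ExecutionTime::LATER, ExecutionTime::ASAP])` picks `ExecutionTime::ASAP`.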

IAsyncProcess

IAsyncProcess is the public interface; it contains a few methods to prepare code to be run in a parallel process.

Closure

The code can be directly written in a closure by calling exec():

$this->asyncProcess->exec(function (int $value, string $line, MyObject $obj): void {
	// long process
},
	random_int(10000, 99999),
	'this is a line',
	$myObj
)->async();

Invokable

The code can live in the magic method __invoke() of a configured object:

class MyInvoke {
	public function __construct(
		private array $data = []
	) {
	}

	public function __invoke(int $n): void {
		// do long process
	}
}


$myInvoke = new MyInvoke(['123']);
$this->asyncProcess->invoke($myInvoke, random_int(10000, 99999))->async();

PHP Class

Via the method async() from a class

<?php

namespace OCA\MyApp;

class MyObj {
	public function __construct(
	) {
	}

	public function async(int $n): void {
		// run heavy stuff
	}
}
$this->asyncProcess->call(\OCA\MyApp\MyObj::class, random_int(10000, 99999))->async();

IBlockInterface

When a new Block is stored via IAsyncProcess::call(), IAsyncProcess::invoke() or IAsyncProcess::exec(), an IBlockInterface is returned to provide details about the Block.

name(string)

Identification and/or description of the Block for better understanding when debugging

$this->asyncProcess->call(\OCA\MyApp\MyObj::class)->name('my process');

id(string)

Identification of the Block for future interaction between Blocks within the same Session

$this->asyncProcess->call(\OCA\MyApp\MyObj::class)->id('my_process');

As an example, the id can be used to obtain the IBlockInterface of a specific Block:

ISessionInterface::byId('my_process'); // returns IBlockInterface

blocker()

Set the current Block as blocker, meaning that further Blocks of the Session are locked until this Block has run successfully

$this->asyncProcess->call(\OCA\MyApp\MyObj::class)->blocker();

require(string)

Define that the Block can only be executed if the given Block, identified by its id, ran successfully.
Multiple Blocks can be set as required.

$this->asyncProcess->call(\OCA\MyApp\MyObj::class)->require('other_block_1')->require('other_block_2');

replayable()

The Block is configured as replayable, meaning that it will be restarted until it runs correctly

$this->asyncProcess->call(\OCA\MyApp\MyObj::class)->replayable();

The delay before each retry is calculated as 6 (six) raised to the power of the current retry number, with the exponent capped at 6:

  • 1st retry after few seconds,
  • 2nd retry after 30 seconds,
  • 3rd retry after 3 minutes,
  • 4th retry after 20 minutes,
  • 5th retry after 2 hours,
  • 6th retry after 12 hours,
  • other retries every 12 hours.
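
The schedule above is consistent with a delay of 6^n seconds, which gives 6 s, 36 s, 216 s (about 3 minutes), 1296 s (about 20 minutes), 7776 s (about 2 hours) and 46656 s (about 12 hours). A minimal sketch, assuming the unit is seconds and using a hypothetical helper name `retryDelay()`:

```php
<?php
declare(strict_types=1);

/**
 * Delay before the given retry, assumed to be in seconds.
 * The exponent is clamped to the range 1..6, so every retry
 * after the 6th waits the same 46656 seconds.
 */
function retryDelay(int $retry): int {
	$exponent = min(max($retry, 1), 6);

	return 6 ** $exponent;
}
```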

delay(int)

Only try to initiate the process n seconds after current time.

dataset(array)

It is possible to set a list of arguments to be applied to the same Block.
The Block will be executed for each defined set of data

$this->asyncProcess->call(\OCA\MyApp\MyObj::class)->dataset(
    [
        ['this is a string', 1], 
        ['this is another string', 12],
        ['and another value as first parameter', 42],
    ]
);
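
Since dataset() is listed as not yet implemented, the following is only a sketch of the described behaviour: the same callable is run once per row, with the row spread as positional arguments. `runWithDataset()` is a hypothetical helper, not part of the proposed interface.

```php
<?php
declare(strict_types=1);

/**
 * Run the same callable once per row of the dataset, spreading each row
 * as positional arguments, and collect the results in order.
 */
function runWithDataset(callable $block, array $dataset): array {
	$results = [];
	foreach ($dataset as $arguments) {
		$results[] = $block(...$arguments);
	}

	return $results;
}

$results = runWithDataset(
	fn (string $line, int $n): string => $n . ': ' . $line,
	[
		['this is a string', 1],
		['this is another string', 12],
	]
);
```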

post-execution

After a Block has been executed, its IBlockInterface can be used to get details about it:

  • getExecutionTime() returns the ProcessExecutionTime that initiated the process,
  • getResult() returns the array returned by the process in case of success,
  • getError() returns the error in case of failure

ISessionInterface

ISessionInterface is available to your code via ABlockWrapper and helps interaction between all the Blocks of the same Session

getAll()

returns all IBlockInterface from the Session

byToken(string)

returns an IBlockInterface using its token

byId(string)

returns an IBlockInterface using its id

getGlobalStatus()

returns a BlockStatus (Enum) based on the status of all Blocks of the Session:

  • returns ::PREP if one block is still at prep stage,
  • returns ::BLOCKER if at least one block is set as blocker and is failing,
  • returns ::SUCCESS if all blocks are successful,
  • returns ::ERROR if all blocks have failed,
  • returns ::STANDBY or ::RUNNING if none of the previous conditions are met; ::RUNNING if at least one Block is currently being processed.
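
The rules above can be read as a cascade evaluated in the listed order. The sketch below makes that reading explicit; it is an assumption, not the implementation of this PR. The `Status` enum and the array shape of a block are hypothetical, and a "failing" blocker is modelled as a blocker whose status is `ERROR`.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the BlockStatus enum.
enum Status {
	case PREP;
	case STANDBY;
	case RUNNING;
	case BLOCKER;
	case SUCCESS;
	case ERROR;
}

/**
 * @param list<array{status: Status, blocker: bool}> $blocks
 */
function globalStatus(array $blocks): Status {
	$statuses = array_map(fn (array $b): Status => $b['status'], $blocks);
	// True when the session is not empty and every block has the given status.
	$all = fn (Status $s): bool => $statuses !== []
		&& array_filter($statuses, fn (Status $x): bool => $x !== $s) === [];

	if (in_array(Status::PREP, $statuses, true)) {
		return Status::PREP;
	}
	foreach ($blocks as $block) {
		if ($block['blocker'] && $block['status'] === Status::ERROR) {
			return Status::BLOCKER;
		}
	}
	if ($all(Status::SUCCESS)) {
		return Status::SUCCESS;
	}
	if ($all(Status::ERROR)) {
		return Status::ERROR;
	}

	return in_array(Status::RUNNING, $statuses, true) ? Status::RUNNING : Status::STANDBY;
}
```

With this ordering, a Session that mixes successful and failed non-blocker Blocks reports `STANDBY` unless one Block is still running.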

ABlockWrapper

This abstract class helps to interface with other Blocks from the same Session.
It will be generated and passed as argument to the defined block if an ABlockWrapper is expected as first parameter:

As a Closure

$this->asyncProcess->exec(function (ABlockWrapper $wrapper, array $data): array {
	$resultFromProc1 = $wrapper->getSessionInterface()->byId('block1')?->getResult(); // can be null if 'block1' was not successful, unless using require()
	$wrapper->activity(BlockActivity::NOTICE, 'result from previous process: ' . json_encode($resultFromProc1));

	return []; // the declared return type is array; this value is what getResult() exposes for this Block
},
	['mydata' => true]
)->id('block2');

When using invoke()

class MyInvoke {
	public function __construct(
	) {
	}

	public function __invoke(ABlockWrapper $wrapper, int $n): void {
		$data = $wrapper->getSessionInterface()->byId('block1')?->getResult(); // can be null if 'block1' was not successful
	}
}

$myInvoke = new MyInvoke();
$this->asyncProcess->invoke($myInvoke, random_int(10000, 99999))->require('block1'); // require() ensures block1 has run successfully before this one

Syntax is the same with call(), when defining the async() method

class MyObj {
	public function __construct(
	) {
	}
	
	public function async(ABlockWrapper $wrapper): void {
	}
}

Abstract methods

ABlockWrapper is an abstract class whose methods behave differently depending on the concrete wrapper sent by the framework:

  • DummyBlockWrapper will do nothing,
  • CliBlockWrapper will generate/manage console output,
  • LoggerBlockWrapper will only create new Nextcloud log entries.

List of useful methods:

  • activity(BlockActivity $activity, string $line = ''); can be used to report what your code is currently doing while it runs
  • getSessionInterface() returns the ISessionInterface for the session
  • getReplayCount() returns the number of retries

Other tools

Live Service

This will cycle every few seconds to check for any session in stand-by mode and will execute its blocks

 ./occ async:live

Manage sessions and blocks

Get a summary of the sessions still in the database.

./occ async:manage

Get details about a session.

./occ async:manage --session <sessionId>

Get details about a block.

./occ async:manage --details <blockId>

Get extensive details about a block.

./occ async:manage --details <blockId> --full-details

Replay an unsuccessful block.

./occ async:manage --replay <blockId>

Mocking process

./occ async:setup allows an admin to generate fake processes to emulate the feature:

  • --mock-session int create n sessions
  • --mock-block int create n blocks
  • --fail-process string create failing process

Work in Progress

Missing elements of the feature:

  • discussing the need to sign the PHP code stored in the database to ensure its authenticity before execution,
  • ability to overwrite the code or arguments of a process to fix a failing process,
  • checking and confirming value type compatibility between parameters and arguments when storing a new block,
  • implementing dataset(),
  • implementing delay(),
  • full documentation,
  • tests, tests, tests.

set_time_limit(max($timeLimit, 0));
ob_start();

echo($result);

Check failure

Code scanning / Psalm

TaintedHtml Error

Detected tainted HTML
set_time_limit(max($timeLimit, 0));
ob_start();

echo($result);

Check failure

Code scanning / Psalm

TaintedTextWithQuotes Error

Detected tainted text with possible quotes
@ArtificialOwl ArtificialOwl force-pushed the enh/noid/async-process-run branch 5 times, most recently from 856ec01 to 6a348ea Compare April 24, 2025 20:25
@ArtificialOwl ArtificialOwl force-pushed the enh/noid/async-process-run branch from 6a348ea to 404c3e0 Compare April 28, 2025 20:56
Signed-off-by: Maxence Lange <maxence@artificial-owl.com>
@ArtificialOwl ArtificialOwl force-pushed the enh/noid/async-process-run branch from 404c3e0 to 6d272ff Compare April 28, 2025 20:58
@ArtificialOwl ArtificialOwl changed the title [draft] async process [PoC] async process Apr 28, 2025
@ArtificialOwl ArtificialOwl changed the title [PoC] async process [PoC] IAsyncProcess Apr 28, 2025
@icewind1991
Member

I don't think loopback requests should be used for this if we want to implement "async". I don't believe they would be reliable enough, they would take up limited slots in the fpm worker queue, create "unusual" traffic that firewalls etc might need to be adjusted for and just seems unexpected behavior of software in general.

Moving this to a new request also means we pay the setup overhead of various resources again, and given how often the setup is (among the) most expensive part of the request, this would make it easy to accidentally double (or more) the resource usage of a request.

I think in general, running code in the background should be avoided in most instances since it makes it harder to reason about possible failure and can lead to logic issues around other code/requests relying on the async code being executed (or not yet executed).
In that way, background jobs having a minimum delay of 5-15 min is a "feature" as it discourages using it for user-interactive things.

To me it feels that instead of trying to move longer running logic out of the user-initiated request, it would make more sense to have the UI not block on the request in the first place. That way the code can remain linear and easier to reason about, and also do a better job of relaying failure/completion status to the user.

I'm sorry for such a negative response on 3k+ lines of work, but we need to make sure that any advantage of "async" code is worth the extra complexity and new potential failure cases.

@icewind1991
Member

After some discussion we've decided to work towards improving the background jobs instead to cover the use cases better.

See #52877 for the discussion around that

@ChristophWurst
Member

Please consider adopting https://amphp.org/ before re-implementing the wheel

I've worked with it. It works surprisingly well.

@icewind1991
Member

Closing in favor of to-be-implemented background job improvements

@ChristophWurst ChristophWurst deleted the enh/noid/async-process-run branch May 16, 2025 08:08