Add a new retry feature to `block` #824

hirokuni-kitahara · 2025-03-25T09:20:08Z

Signed-off-by: hirokuni-kitahara [email protected]

This PR is based on the issue #823 and it adds two new fields retry and trace_error_on_retry to the block class.
retry is an integer value which indicates the number of retry when some errors happen within the block. retry is enabled only when the value is specified and positive.
trace_error_on_retry is a boolean value whether to add error information to the trace. This defaults to None (=False), but when it is set to True, errors during retry are added to the trace. This is useful for multiple trials with model block to leverage self-reflection behavior of LLM.

The following is the original PR description before discussion in the comments.
---
This PR is based on the issue https://github.com/IBM/prompt-declaration-language/issues/823 and it adds two new fields `retry_on_error` and `retry_max` to the existing `repeat` block.
`retry_on_error` is a simple boolean value which indicates if the retry feature is enabled or not, and if true, errors while running the `repeat` block are added to the background context of the LLM.
`retry_max` is an integer value of the number of maximum retry.
This PR contains the following changes
- Update to the RepeatBlock model in `pdl_ast.py`
- Update while loop for repeat block in `pdl_interpreter.py`
- Update to the `schema.json`

vazirim · 2025-03-25T13:12:51Z

Hi @hirokuni-kitahara, the PR looks great. How about instead of adding 2 new fields we only have retry_max, set to 0 by default. When it's set to 0 it means that retry_on_error == False. When it's set to anything strictly greater than 0, then it's equivalent to retry_on_error == True with that number of max tries. wdyt?

When you make changes to the AST, you can run:

pdl --schema > src/pdl/pdl-schema.json

to automatically regenerate the jsonschema. This is probabaly what you did, but just double checking.

There are also some other files that need to change (with an AST change):
pdl_dumper.py
pdl_ast_utils.py

Finally, you can run the following locally to make sure everything is in good shape:

pytest

and:

pre-commit run --all-files

See the contribution docs
Thank you so much!

hirokuni-kitahara · 2025-03-26T08:00:52Z

Thank you for your feedback @vazirim !
I have updated the codes so that it uses retry_max for the feature flag and I removed retry_on_error based on your suggestion.
Also, I followed the steps in the contribution doc and now all checks passed.
I really appreciate your if you can review the updated codes.

starpit · 2025-03-26T09:37:33Z

@hirokuni-kitahara can you rebase (other PRs have also changed the schema) and then run npm run types in pdl-live-react? this will re-generate the typescript types in accordance with the changes to the schema. thanks!

mandel · 2025-03-26T19:32:23Z

Thank you that is a great feature!

Instead of doing it only on repeat blocks, what about adding this field in the Block class inherited by all the blocks?

I would also rename retry_max simply into retry.

hirokuni-kitahara · 2025-03-27T03:29:11Z

@starpit @mandel
Thank you very much for your feedbacks!
Yes, I will rebase and update ts with the command, and also I will add just retry to the Block class.

hirokuni-kitahara · 2025-04-11T07:03:58Z

Hi @vazirim @mandel @starpit,
the retry feature is ready now and I think I already updated all the related files (including the schema.json and ts file).
Could you please review the changes? Thank you!

vazirim · 2025-04-14T14:54:26Z

Thank you very much for the changes @hirokuni-kitahara!

LGTM

@mandel @starpit any more feedback?

starpit · 2025-04-22T14:40:18Z

does this handle the case where an async/future'd model block invocation fails?

mandel

It is great. I just added a couple of comments.

mandel · 2025-05-01T17:48:08Z

src/pdl/pdl_interpreter.py

+                raise exc from exc
+            if do_retry:
+                error = f"An error occurred in a PDL block. Error details: {err_msg}"
+                print(f"\n\033[0;31m[Retry {trial_idx+1}/{max_retry}] {error}\033[0m\n")


This could be printed on stderr.

Thank you @mandel , I will fix this part.

mandel · 2025-05-01T18:00:47Z

src/pdl/pdl_interpreter.py

+            if do_retry:
+                error = f"An error occurred in a PDL block. Error details: {err_msg}"
+                print(f"\n\033[0;31m[Retry {trial_idx+1}/{max_retry}] {error}\033[0m\n")
+                scope = set_error_to_scope_for_retry(scope, error, block.pdl__id)


Why do you add the error to the trace instead of re-executing the block in the original scope?

Thank you @mandel ,
I'm adding the error to the trace so that the agent's LLM can understand what went wrong in the previous trial. (I'm assuming the retry feature is being used within a ReAct-style loop.)

This allows the LLM to generate a different output from the model block in the next iteration.

However, I now realize there's also a need to support more traditional retry scenarios—such as retrying an HTTP connection—where adding the error to the trace might be excessive.

What do you think about adding a new block attribute like trace_error_on_retry as a boolean field? It could default to false and can be set to true for agent use-case.

Signed-off-by: hirokuni-kitahara <[email protected]>

hirokuni-kitahara · 2025-05-23T00:16:49Z

@vazirim @mandel I could fix the schema issue and now all the checks passed!
Could you please review this? Thank you!

mandel · 2025-05-23T14:51:38Z

That's great. Thank you!

hirokuni-kitahara mentioned this pull request Mar 25, 2025

Add retry_on_error feature to repeat block to help developing agents #823

Closed

hirokuni-kitahara force-pushed the error-handling-in-repeat-1 branch from fe52fdc to 4ea3dde Compare March 26, 2025 07:29

hirokuni-kitahara changed the title ~~Add a new retry feature to repeat block~~ Add a new retry feature to block Apr 10, 2025

hirokuni-kitahara force-pushed the error-handling-in-repeat-1 branch 2 times, most recently from a4bd87d to 506e109 Compare April 10, 2025 08:44

mandel reviewed May 1, 2025

View reviewed changes

hirokuni-kitahara added 16 commits May 22, 2025 22:01

add retry on error to repeat block

f418c7a

Signed-off-by: hirokuni-kitahara <[email protected]>

update pdl-schema.json for retry of repeat block

9efef37

Signed-off-by: hirokuni-kitahara <[email protected]>

use retry_max only

373f1c0

Signed-off-by: hirokuni-kitahara <[email protected]>

apply lint to pdl_interpreter.py

f3f9a09

Signed-off-by: hirokuni-kitahara <[email protected]>

fix unnecessary changes in schema json

ebe4eaa

Signed-off-by: hirokuni-kitahara <[email protected]>

add retry feature to block class

9b16144

Signed-off-by: hirokuni-kitahara <[email protected]>

update schema with typescript types

bfa13a6

Signed-off-by: hirokuni-kitahara <[email protected]>

update pdl_dumper.py for retry feature

2e180d5

Signed-off-by: hirokuni-kitahara <[email protected]>

reduce complexity of retry function

342168b

Signed-off-by: hirokuni-kitahara <[email protected]>

apply lint fixes to pdl_interpreter.py

bafc5df

Signed-off-by: hirokuni-kitahara <[email protected]>

update ts file and remove unnecessary change in schema.json

c07a8b4

Signed-off-by: hirokuni-kitahara <[email protected]>

remove unnecessary changes in schema.json

10bd167

Signed-off-by: hirokuni-kitahara <[email protected]>

remove unnecessary changes in schema.json

a329027

Signed-off-by: hirokuni-kitahara <[email protected]>

add test code for retry

2eae481

Signed-off-by: hirokuni-kitahara <[email protected]>

apply lint to new retry test code

ca6f5ce

Signed-off-by: hirokuni-kitahara <[email protected]>

fix rebase inconsistency

80c459a

Signed-off-by: hirokuni-kitahara <[email protected]>

add trace_error_on_retry for controlling trace while retrying

baa673e

Signed-off-by: hirokuni-kitahara <[email protected]>

hirokuni-kitahara force-pushed the error-handling-in-repeat-1 branch from 2845347 to baa673e Compare May 22, 2025 13:29

hirokuni-kitahara added 6 commits May 22, 2025 22:33

apply lint for retry related codes

de3d75a

Signed-off-by: hirokuni-kitahara <[email protected]>

revert wrong change of shcema

5f6694c

Signed-off-by: hirokuni-kitahara <[email protected]>

update pdl-schema.json

40dd650

Signed-off-by: hirokuni-kitahara <[email protected]>

update pdl-schema.json

37fbd64

Signed-off-by: hirokuni-kitahara <[email protected]>

update pdl-schema.json

ec7b5be

Signed-off-by: hirokuni-kitahara <[email protected]>

fix schema inconsistency

789e45f

Signed-off-by: hirokuni-kitahara <[email protected]>

mandel merged commit 47f4adc into IBM:main May 23, 2025
7 checks passed

Add a new retry feature to block #824

Add a new retry feature to block #824

Uh oh!

Conversation

hirokuni-kitahara commented Mar 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vazirim commented Mar 25, 2025

Uh oh!

hirokuni-kitahara commented Mar 26, 2025

Uh oh!

starpit commented Mar 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mandel commented Mar 26, 2025

Uh oh!

hirokuni-kitahara commented Mar 27, 2025

Uh oh!

hirokuni-kitahara commented Apr 11, 2025

Uh oh!

vazirim commented Apr 14, 2025

Uh oh!

starpit commented Apr 22, 2025

Uh oh!

mandel left a comment

Choose a reason for hiding this comment

Uh oh!

mandel May 1, 2025

Choose a reason for hiding this comment

Uh oh!

hirokuni-kitahara May 8, 2025

Choose a reason for hiding this comment

Uh oh!

mandel May 1, 2025

Choose a reason for hiding this comment

Uh oh!

hirokuni-kitahara May 8, 2025

Choose a reason for hiding this comment

Uh oh!

hirokuni-kitahara commented May 23, 2025

Uh oh!

Uh oh!

mandel commented May 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Add a new retry feature to `block` #824

Add a new retry feature to `block` #824

hirokuni-kitahara commented Mar 25, 2025 •

edited

Loading

starpit commented Mar 26, 2025 •

edited

Loading