@borzunov borzunov commented Sep 28, 2023

#331 introduced a bug in inference retries that caused the following error:

[INFO] Route found: 0:18 via1EBzGt
[WARN] [petals.client.inference_session.step:327] Caught exception when running inference via RemoteSpanInfo(peer_id=<libp2p.peer.id.ID (12D3KooWLRHAtX9ccW9i1NvpPLigwGX9MstGw3oyCZwmh21EBzGt)>, start=0, end=18, server_info=ServerInfo(state=<ServerState.ONLINE: 2>, throughput=1040.823002928876, public_name=':duck:FYY:sun_with_face:', version='2.2.0', network_rps=1185.4980980484086, forward_rps=9887.81852782432, inference_rps=343.100557603763, adapters=(), torch_dtype='bfloat16', quant_type='nf4', using_relay=False, cache_tokens_left=1179648, next_pings={...})) (retry in 2 sec): AssertionError("Broken input cache: span=RemoteSpanInfo(peer_id=<libp2p.peer.id.ID (12D3KooWLRHAtX9ccW9i1NvpPLigwGX9MstGw3oyCZwmh21EBzGt)>, start=0, end=18, server_info=ServerInfo(state=<ServerState.ONLINE: 2>, throughput=1040.823002928876, public_name=':duck:FYY:sun_with_face:', version='2.2.0', network_rps=1185.4980980484086, forward_rps=9887.81852782432, inference_rps=343.100557603763, adapters=(), torch_dtype='bfloat16', quant_type='nf4', using_relay=False, cache_tokens_left=1179648, next_pings={...})) shape=torch.Size([1, 579, 8192]) position=0 n_input_tokens=1")

# If there is a failed span, this code replaces it, otherwise it just adds new ones
if server_idx < n_prev_spans:
    updated_sessions[0].history = self._server_sessions[server_idx].history
    updated_sessions[0].position = self._position

This is the fix
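
To illustrate why the replacement session needs the position as well as the history, here is a minimal toy sketch. `ToySession` and `replace_failed_session` are hypothetical stand-ins invented for this example, not Petals classes, and the invariant checked in `step` is only an assumption loosely modeled on the "Broken input cache" assertion in the log above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToySession:
    """Hypothetical stand-in for a per-server session; not the Petals class."""

    history: List[int] = field(default_factory=list)  # tokens sent to this server so far
    position: int = 0  # number of tokens already processed

    def step(self, new_tokens: List[int]) -> None:
        # Append new inputs only when the history is in sync with the position.
        if len(self.history) == self.position:
            self.history = self.history + new_tokens
        # Assumed invariant, modeled on the "Broken input cache" assertion.
        assert len(self.history) == self.position + len(new_tokens), (
            f"Broken input cache: history={len(self.history)} "
            f"position={self.position} n_input_tokens={len(new_tokens)}"
        )
        self.position += len(new_tokens)


def replace_failed_session(failed: ToySession, copy_position: bool) -> ToySession:
    """Build a replacement for a failed span, optionally carrying over its position."""
    replacement = ToySession(history=list(failed.history))
    if copy_position:
        replacement.position = failed.position
    return replacement


# A span that had processed 578 tokens before it failed.
failed = ToySession(history=list(range(578)), position=578)

# History and position copied: the next single-token step passes the check.
fixed = replace_failed_session(failed, copy_position=True)
fixed.step([0])

# History copied but position left at 0: the same step trips the assertion.
broken = replace_failed_session(failed, copy_position=False)
try:
    broken.step([0])
except AssertionError as exc:
    print(f"retry failed: {exc}")
```

In this toy model, copying only the history leaves the replacement with a long history and a position of 0, so the consistency check fails on the very next step. Copying both keeps them in sync.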

@borzunov borzunov marked this pull request as draft September 28, 2023 17:33
@borzunov borzunov force-pushed the fix-inference-retry branch 2 times, most recently from 160211a to 567e34b Compare September 28, 2023 18:40
@borzunov borzunov force-pushed the fix-inference-retry branch 2 times, most recently from ef59cf6 to 3f70ab6 Compare October 23, 2023 16:50
@borzunov borzunov force-pushed the fix-inference-retry branch from 9289d93 to 63282af Compare October 23, 2023 19:00