Commit 55f6d68

habanalabs: flush EQ workers in hard reset
During hard-reset, there can be multiple events received from the H/W. For each event, the driver opens a worker thread to handle it. For some of the events, the driver will read/write registers in the code that handles the event.

In case of hard-reset, we must prevent reads/writes to the registers during the reset operation because the device might get stuck if that happens.

Therefore, flush the EQ workers before resetting the device (in hard-reset only). Additional events won't arrive as we synced and disabled the interrupts.

Signed-off-by: Oded Gabbay <[email protected]>
Reviewed-by: Tomer Tayar <[email protected]>
1 parent 1af69d3 commit 55f6d68

1 file changed: +11 -5 lines changed
drivers/misc/habanalabs/device.c

Lines changed: 11 additions & 5 deletions
@@ -887,13 +887,19 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset,
 	/* Go over all the queues, release all CS and their jobs */
 	hl_cs_rollback_all(hdev);
 
-	/* Kill processes here after CS rollback. This is because the process
-	 * can't really exit until all its CSs are done, which is what we
-	 * do in cs rollback
-	 */
-	if (hard_reset)
+	if (hard_reset) {
+		/* Kill processes here after CS rollback. This is because the
+		 * process can't really exit until all its CSs are done, which
+		 * is what we do in cs rollback
+		 */
 		device_kill_open_processes(hdev);
 
+		/* Flush the Event queue workers to make sure no other thread is
+		 * reading or writing to registers during the reset
+		 */
+		flush_workqueue(hdev->eq_wq);
+	}
+
 	/* Release kernel context */
 	if ((hard_reset) && (hl_ctx_put(hdev->kernel_ctx) == 1))
 		hdev->kernel_ctx = NULL;
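
The ordering this change enforces (wait for every queued event handler to finish, then start the reset) can be illustrated with a small userspace model. This is only a sketch using POSIX threads, not driver code: `struct fake_dev`, `fake_queue_event()`, `fake_flush_events()` and `fake_begin_hard_reset()` are hypothetical names invented for illustration, and the counter-plus-condition-variable wait merely stands in for what `flush_workqueue(hdev->eq_wq)` provides in the kernel. As in the commit message, the model assumes no new events are queued once the reset begins.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; not the driver's struct hl_device. */
struct fake_dev {
	pthread_mutex_t lock;
	pthread_cond_t idle;      /* signalled when in_flight drops to 0 */
	int in_flight;            /* event handlers queued or still running */
	bool in_reset;            /* set once the simulated reset has begun */
	int reg_writes;           /* simulated register accesses by handlers */
	int writes_during_reset;  /* accesses that raced with the reset */
};

static void fake_dev_init(struct fake_dev *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	pthread_cond_init(&dev->idle, NULL);
	dev->in_flight = 0;
	dev->in_reset = false;
	dev->reg_writes = 0;
	dev->writes_during_reset = 0;
}

/* Simulated event handler: touches a "register", then reports completion. */
static void *fake_eq_handler(void *arg)
{
	struct fake_dev *dev = arg;

	pthread_mutex_lock(&dev->lock);
	if (dev->in_reset)
		dev->writes_during_reset++; /* the situation the flush prevents */
	dev->reg_writes++;
	if (--dev->in_flight == 0)
		pthread_cond_broadcast(&dev->idle);
	pthread_mutex_unlock(&dev->lock);
	return NULL;
}

/* Queue one handler on its own detached thread; returns 0 on success. */
static int fake_queue_event(struct fake_dev *dev)
{
	pthread_t t;
	int rc;

	/* Hold the lock so the handler cannot finish before it is counted. */
	pthread_mutex_lock(&dev->lock);
	rc = pthread_create(&t, NULL, fake_eq_handler, dev);
	if (rc == 0) {
		dev->in_flight++;
		pthread_detach(t);
	}
	pthread_mutex_unlock(&dev->lock);
	return rc;
}

/* Block until every queued handler has completed. */
static void fake_flush_events(struct fake_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	while (dev->in_flight > 0)
		pthread_cond_wait(&dev->idle, &dev->lock);
	pthread_mutex_unlock(&dev->lock);
}

/* Flush first, and only then mark the device as being in reset. */
static void fake_begin_hard_reset(struct fake_dev *dev)
{
	fake_flush_events(dev);
	pthread_mutex_lock(&dev->lock);
	dev->in_reset = true;
	pthread_mutex_unlock(&dev->lock);
}
```

In this model, dropping the `fake_flush_events()` call from `fake_begin_hard_reset()` would let a late handler observe `in_reset` and increment `writes_during_reset`. With the flush in place that counter stays at zero, which mirrors the guarantee the patch is after.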
