Job.data_model may reference deleted BaseDataModel on save failure #3225

@gks281263

Location

File: api_app/engines_manager/models.py
Method: EngineConfig.run()

Description

In EngineConfig.run(), when a job already has a data model, the existing data model is deleted and a new one is assigned. These operations are not atomic. If job.save() fails after the deletion, the job's GenericForeignKey fields (data_model_content_type, data_model_object_id) can end up referencing a deleted BaseDataModel.

This leaves the job in a partially updated and inconsistent state.

Execution Flow

Current order of operations:

  1. Existing job.data_model is deleted from the database
  2. New BaseDataModel is assigned to job.data_model in memory
  3. job.save() is called to persist the change

If step 3 fails, the previous deletion is not rolled back.
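The ordering problem can be reproduced outside Django. The following is a minimal sketch, not the IntelOwl code: it uses the standard-library sqlite3 module in autocommit mode, and the table layout, the helper names, and the `fail_on_save` flag are all hypothetical stand-ins. As with a GenericForeignKey, the `job` table has no database-level constraint on `data_model_object_id`, so nothing stops the reference from dangling.

```python
import sqlite3


def make_db():
    # isolation_level=None puts sqlite3 in autocommit mode:
    # every statement is committed on its own, with no enclosing transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE data_model (id INTEGER PRIMARY KEY)")
    # No foreign key constraint here, mirroring a generic relation.
    conn.execute(
        "CREATE TABLE job (id INTEGER PRIMARY KEY, data_model_object_id INTEGER)"
    )
    conn.execute("INSERT INTO data_model (id) VALUES (1)")
    conn.execute("INSERT INTO job (id, data_model_object_id) VALUES (1, 1)")
    return conn


def replace_data_model_unsafely(conn, fail_on_save):
    # Step 1: the existing data model is deleted (committed immediately).
    conn.execute("DELETE FROM data_model WHERE id = 1")
    # Step 2: the new data model is created (hypothetical id 2).
    conn.execute("INSERT INTO data_model (id) VALUES (2)")
    # Step 3: persisting the job fails before the reference is updated.
    if fail_on_save:
        raise RuntimeError("simulated job.save() failure")
    conn.execute("UPDATE job SET data_model_object_id = 2 WHERE id = 1")


def has_dangling_reference(conn):
    # True when the job points at a data_model row that no longer exists.
    row = conn.execute(
        "SELECT j.id FROM job j "
        "LEFT JOIN data_model d ON d.id = j.data_model_object_id "
        "WHERE j.id = 1 AND d.id IS NULL"
    ).fetchone()
    return row is not None
```

Calling `replace_data_model_unsafely(conn, fail_on_save=True)` raises, and afterwards `has_dangling_reference(conn)` reports that the job still points at the deleted row while the new row sits unreferenced.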

Trigger Scenarios

job.save() can fail due to normal conditions such as:

  • Validation errors
  • Signal exceptions
  • Database constraint errors
  • Connection or transaction failures

Observed Invalid State

After a failed save:

  • The old BaseDataModel is already deleted
  • The job row still stores data_model_content_type and data_model_object_id values that point to the deleted object
  • The new BaseDataModel exists but is not linked to the job
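  • Accessing job.data_model on a reloaded job raises DoesNotExist or returns None (Django's GenericForeignKey typically returns None when the target row is missing)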

This results in an inconsistent GenericForeignKey relationship.

Impact

If this occurs, it can lead to:

  • Runtime errors when accessing job.data_model
  • Engines, pivots, or connectors operating on inconsistent job state
  • Queries involving job.data_model returning incorrect results
  • Orphaned BaseDataModel records accumulating over time
  • Difficult debugging due to silent data corruption

Expected Safe Behavior

The update should be atomic, so that either:

  • The old data model remains until the new one is safely saved, or
  • The whole operation is rolled back on failure, so no partial state is persisted

This ensures job.data_model always references a valid object.
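One possible shape for such an update is sketched below. It is an illustration under assumptions rather than a proposed patch: it again uses sqlite3 with a hypothetical two-table layout, and the function names and the `fail_on_save` flag are invented for the example. It combines both ideas from the list above: the old row is deleted only after the job has been repointed, and everything runs inside one transaction so a failure rolls back the new row as well.

```python
import sqlite3


def make_db():
    # Default isolation level: sqlite3 opens a transaction implicitly
    # before INSERT/UPDATE/DELETE, so changes can be rolled back.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data_model (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE job (id INTEGER PRIMARY KEY, data_model_object_id INTEGER)"
    )
    conn.execute("INSERT INTO data_model (id) VALUES (1)")
    conn.execute("INSERT INTO job (id, data_model_object_id) VALUES (1, 1)")
    conn.commit()  # persist the starting state before any update attempt
    return conn


def replace_data_model_atomically(conn, fail_on_save):
    # `with conn` commits if the block finishes and rolls back if it raises.
    with conn:
        # Create the new data model first (hypothetical id 2).
        conn.execute("INSERT INTO data_model (id) VALUES (2)")
        if fail_on_save:
            raise RuntimeError("simulated job.save() failure")
        # Repoint the job, then delete the old data model last.
        conn.execute("UPDATE job SET data_model_object_id = 2 WHERE id = 1")
        conn.execute("DELETE FROM data_model WHERE id = 1")


def linked_data_model_id(conn):
    # Returns the id of the data model the job points at, or None if dangling.
    row = conn.execute(
        "SELECT d.id FROM job j "
        "JOIN data_model d ON d.id = j.data_model_object_id "
        "WHERE j.id = 1"
    ).fetchone()
    return row[0] if row else None
```

With this ordering, a failed run leaves the job linked to its original data model and creates no orphaned row. In Django, the analogous construct would be wrapping the update in `django.db.transaction.atomic()`; whether that fits EngineConfig.run() as written is left open here.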

Notes

This issue is about atomicity and data integrity, not about changing intended behavior. The exact solution approach is open for discussion. Scope is limited to EngineConfig.run().
