Skip to content

Sequential worker triggered twice after parallel flow finishes #43

@danielcj

Description

@danielcj

We (@michetti and I) were experiencing some issues in which sequential jobs that should be triggered once after parallel flows finishes were being triggered twice. After investigation, we found that when parallel jobs finishes at the same time (within 2-10 milliseconds) the sequential worker is invoked twice with the same JID.

Following is a extract of Sidekiq logs illustrating some of the errors:

2016-01-26T22:26:42.576Z 14805 TID-ctgu8 TestingJob1 JID-9142aad113477418ed919a77:2 INFO: start
2016-01-26T22:26:42.577Z 14805 TID-d14aw TestingJob2 JID-9142aad113477418ed919a77:3 INFO: start
2016-01-26T22:26:45.591Z 14805 TID-d14aw TestingJob2 JID-9142aad113477418ed919a77:3 INFO: done: 3.014 sec
2016-01-26T22:26:45.592Z 14805 TID-cq05w TestingJob3 JID-9142aad113477418ed919a77:4 INFO: start
2016-01-26T22:26:45.594Z 14805 TID-ctgu8 TestingJob1 JID-9142aad113477418ed919a77:2 INFO: done: 3.017 sec
2016-01-26T22:26:45.594Z 14805 TID-ckep4 TestingJob3 JID-9142aad113477418ed919a77:4 INFO: start
2016-01-26T22:26:48.601Z 14805 TID-ckep4 TestingJob3 JID-9142aad113477418ed919a77:4 INFO: done: 3.007 sec
2016-01-26T22:26:48.603Z 14805 TID-cq05w TestingJob3 JID-9142aad113477418ed919a77:4 INFO: done: 3.01 sec

To create this scenario, I created the superworker as following:

Superworker.define(:TestingSuperWorker, :random) do
  parallel do
    TestingJob1 :random
    TestingJob2 :random
  end
  TestingJob3 :random
end

TestingJob1, 2 and 3 have the same definition, as follows:

class TestingJob1
  include Sidekiq::Worker
  def perform(random)
    sleep 3
  end
end
class TestingJob2
  include Sidekiq::Worker
  def perform(random)
    sleep 3
  end
end
class TestingJob3
  include Sidekiq::Worker
  def perform(random)
    sleep 3
  end
end

It's possible to invoke the superworker from Rails console as following:

TestingSuperWorker.perform_async "100"

Maybe you'll have to run this a few times to see the error.

This seems to be related to a race condition, as it happens only when TestJob1 and TestJob2 finishes within a difference that is small then 10 ms (so it does not happens all the time; we could reproduce consistently, but sometimes after invoking the Superworker 3 or 4 times, so both TestJob1 and TestJob2 finishes almost at the same time).

Environment:

  • Ubuntu Linux 14.04.3 LTS
  • Sidekiq 3.4.2
  • Rails 4.1.11
  • Redis 2.8.4
  • Redis Gem 3.2.1
  • Sidekiq-superworker 1.2.0

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions