This repository was archived by the owner on Apr 22, 2025. It is now read-only.

JVM SuspendApp never terminates #140

@weronkagolonka

Hello!

After upgrading to 0.5.0, we noticed strange behaviour: our Kafka consumer, launched in a separate coroutine, was abruptly stopped a few seconds after startup. Notably, the application itself neither terminated nor crashed. The Kafka coroutine runs within the scope of the SuspendApp and polls for messages as long as the job is active. Initially, it seemed that simply wrapping it in a new scope would solve the issue, and it did: the consumer polled and handled messages successfully. However, while digging for the root cause we noticed that the application would never stop, regardless of what was put in the SuspendApp lambda. Take this extremely simple example:

fun main() = SuspendApp { logger.info { "The app is running!" } }

We found a significant change in the top-level coroutine launched by SuspendApp: after the lambda block finishes, the process is now exited with status code 0:

@OptIn(ExperimentalStdlibApi::class)
fun SuspendApp(
  context: CoroutineContext = Dispatchers.Default,
  uncaught: (Throwable) -> Unit = Throwable::printStackTrace,
  timeout: Duration = Duration.INFINITE,
  process: Process = process(),
  block: suspend CoroutineScope.() -> Unit,
): Unit =
  process.use { env ->
    env.runScope(context) {
      val job =
        launch(start = CoroutineStart.LAZY) {
          try {
            block()
            env.exit(0) // <- here
          } catch (_: SuspendAppShutdown) {} catch (e: Throwable) {
            uncaught(e)
            env.exit(-1)
          }
        }
      val unregister =
        env.onShutdown {
          withTimeout(timeout) {
            job.cancel(SuspendAppShutdown)
            job.join()
          }
        }
      job.start()
      job.join()
      unregister()
    }
  }

This call triggers the JVM shutdown hooks, which run the env.onShutdown lambda (the one whose unregister handle is stored in the unregister variable). That lambda cancels the job, which explains why our Kafka consumer was suddenly stopped and why running it in a new scope allowed the consumer to proceed. The hook then waits for the cancelled job to complete, but that never happens: the application simply hangs. If a finite timeout is specified, it terminates with a timeout exception instead.
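To illustrate one plausible reading of the hang, here is a minimal sketch using plain JVM threads instead of coroutines. It assumes that env.exit(0) ends up calling System.exit on the JVM, which is not confirmed here. The names joinHook and app-job are hypothetical stand-ins for the onShutdown hook and the SuspendApp job. The wait in the hook is bounded only so that the demo terminates.

```kotlin
import kotlin.concurrent.thread

// Hypothetical helper: a shutdown hook that waits for `worker`, mirroring job.join()
// in onShutdown. The wait is bounded here; an unbounded join() would wait forever.
fun joinHook(worker: Thread, waitMillis: Long, report: (Boolean) -> Unit): Thread =
    Thread {
        worker.join(waitMillis)
        report(worker.isAlive)
    }

fun main() {
    // Stand-in for the SuspendApp job: the block returns, then exit(0) is called.
    // Assumption: env.exit(0) behaves like System.exit(0) on the JVM.
    val worker = thread(start = false, name = "app-job") {
        println("block() returned, calling exit(0)")
        // System.exit does not return; the caller stays alive while the hooks run.
        System.exit(0)
    }

    Runtime.getRuntime().addShutdownHook(
        joinHook(worker, 1_000L) { alive ->
            println("hook: worker still alive after waiting = $alive")
        }
    )

    worker.start()
    worker.join()
}
```

If this reading is right, the hook should report the worker as still alive: the thread that called exit is itself waiting for the hooks to finish, so a hook that waits for that thread without a timeout can never return.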

My question is: what is the rationale for exiting the process after the lambda has executed, instead of letting the job complete and then calling unregister(), as was done in version 0.4.0? Right now it seems the unregistering logic is never executed, because exit(0) is called first.

If this behaviour is expected, how should the SuspendApp logic be written to make sure that the application actually terminates at some point?

Thanks in advance for your help! For now we're skipping the upgrade of the library until we get some more insight 😄
