-
Notifications
You must be signed in to change notification settings - Fork 129
Description
I believe this backend is leaking file descriptors after the backend has been removed.
Our application starts a number of temporary GenServers that are started to handle on-demand data processing. I have the following code (actual module and structs renamed) that is called during the GenServer's init/2
callback to add a file backend for the GenServer and remove the backend when the GenServer terminates:
defmodule JobServer.Jobs.Logging do
@moduledoc """
Conveniences for configuring logging for job runners.
"""
alias JobServer.Jobs.Job
require Logger
@doc """
Setup a logger file backend for the current process and given `job_id`.
The backend will be automatically removed when the current process exits.
"""
def setup_logger_file_backend(%Job{id: job_id} = job) do
Logger.metadata(job_id: job_id)
opts = [
path: job |> Job.outputs_path() |> Path.join("log"),
format: "$date $time [$level]$levelpad $message\n",
level: :info,
metadata_filter: [application: :job_server, job_id: job_id]
]
{:ok, _pid} = Logger.add_backend({LoggerFileBackend, job_id}, flush: true)
Logger.configure_backend({LoggerFileBackend, job_id}, opts)
# Start a background task to monitor the current process and remove
# the backend when this process exits
Task.start(__MODULE__, :teardown_logger_file_backend_on_exit, [job_id, self()])
end
@doc """
Monitor the given `pid` and, upon exit, remove the logger file backend
associated with `job_id`.
"""
def teardown_logger_file_backend_on_exit(job_id, pid) do
ref = Process.monitor(pid)
receive do
{:DOWN, ^ref, :process, ^pid, _reason} ->
Logger.remove_backend({LoggerFileBackend, job_id}, flush: true)
end
end
end
I can see in Observer that the Erlang process associated with the logger file backend is created and terminated as expected but looking at the operating system file descriptors shows that these log files are still open after the associated backend processes are shutdown:
> ls -l /proc/<os_pid>/fd
...
l-wx------. 1 vagrant vagrant 64 Oct 5 10:41 35 -> /home/vagrant/code/software/jobserver/_tmp/jobs/outputs/e7dea011-bcc5-4f31-bb0d-5e9f16de8616/log
l-wx------. 1 vagrant vagrant 64 Oct 5 10:41 36 -> /home/vagrant/code/software/jobserver/_tmp/jobs/outputs/0af6fe65-4f54-415d-9ff3-c1722feaad08/log
...
I believe this is a fairly serious one IMO as it breaks the assumption that removing a file backend will free all operating system resources claimed when starting said backend. We discovered this issue after our application became unresponsive after exhausting the file descriptors available to the Linux process.
Let me know if there's any additional information that would help with identifying the cause of this issue.