
PG_JOB_CACHE_DIR disallowed as mount (tmpfs backed) #8362

@wdoekes


Hi!

I'm investigating excessive load in a particular setup. We see a great many small writes (38 bytes each) on a ZFS-backed filesystem, and they look like they cause write amplification there. The writes appear to land in PG_JOB_CACHE_DIR.

I thought I'd mount tmpfs on that path, or at least something with much weaker durability guarantees, but then I ran into this:

2025-11-26 16:39:32 UTC [45]: [1-1] 69272d44.2d 0     FATAL:  could not remove file "base/pgsql_job_cache": Device or resource busy
2025-11-26 16:39:32 UTC [45]: [2-1] 69272d44.2d 0     LOG:  database system is shut down

I traced that back to the Citus startup code, where rmdir/mkdir failures are fatal. A mount point cannot be removed with rmdir (hence the "Device or resource busy"), so this effectively makes it impossible to put the cache on a different filesystem.

My initial minimal patch would look like this:

diff --git a/src/backend/distributed/utils/directory.c b/src/backend/distributed/utils/directory.c
index 6701bf8fb..fcf2745f2 100644
--- a/src/backend/distributed/utils/directory.c
+++ b/src/backend/distributed/utils/directory.c
@@ -32,7 +32,13 @@ CitusCreateDirectory(StringInfo directoryName)
 	int makeOK = MakePGDirectory(directoryName->data);
 	if (makeOK != 0)
 	{
-		ereport(ERROR, (errcode_for_file_access(),
+		/*
+		 * Don't raise an ERROR here. If we do, we cannot use a (bind)
+		 * mount to move the job path to another filesystem (type).
+		 * (Postgres treats ERRORs as fatal and aborts the current task.
+		 * That also applies to the initialize task.)
+		 */
+		ereport((errno == EEXIST ? WARNING : ERROR), (errcode_for_file_access(),
 						errmsg("could not create directory \"%s\": %m",
 							   directoryName->data)));
 	}
@@ -147,7 +153,13 @@ CitusRemoveDirectory(const char *filename)
 
 		if (removed != 0 && errno != ENOENT)
 		{
-			ereport(ERROR, (errcode_for_file_access(),
+			/*
+			 * Don't raise an ERROR here. If we do, we cannot use a (bind)
+			 * mount to move the job path to another filesystem (type).
+			 * (Postgres treats ERRORs as fatal and aborts the current task.
+			 * That also applies to the initialize task.)
+			 */
+			ereport(WARNING, (errcode_for_file_access(),
 							errmsg("could not remove file \"%s\": %m", filename)));
 		}
 

Thoughts:

  • this does affect all calls to CitusCreateDirectory, but that is only used for PG_JOB_CACHE_DIR, so not a problem;
  • same for CitusRemoveDirectory;
  • I don't like that CitusRemoveDirectory and CitusCreateDirectory have different function signatures, that's a bit smelly;
  • for a better fix, we could check whether the path is a mount point and then skip the remove/create entirely, and/or replace the CitusRemoveDirectory+CitusCreateDirectory pair with a CitusClearDirectory that removes the contents but leaves the parent directory in place;
  • I'd also consider moving PG_JOB_CACHE_DIR to a configuration option, but the above fixes will still be needed.

Let me know what you think / how you would solve this.

Cheers,
Walter Doekes
OSSO B.V.
