Skip to content

How to make monitoring / rotating log handlers multiprocessing safe #287

@jikquantopian

Description

@jikquantopian

I've experimented with MultiProcessingSubscriber and MultiProcessingHandler and for whatever reason they don't always seem to work. For example, I had a huge amount of trouble getting them to work -- and ultimately didn't -- with running a Flask app under Werkzeug's run_simple with processes set to a >1 value.

It seems like much of what these classes are trying to accomplish -- though certainly not all of it -- is to make logging to monitoring or rotating file handlers multiprocessing safe, since if you have multiple processes checking if log files need to be reopened or rotated, there are all sorts of race conditions leading to log messages being lost.

There is, however, a solution. Using fcntl.flock, I've successfully built a rotating log handler which is -- as far as I can tell -- 100% resilient in a multiprocessing environment. Log messages are not lost, log files are not rotated prematurely, etc.

I present it below for your consideration. This isn't a finished product I'd expect to be merged into Logbook as-is, for a number of reasons, including:

  1. I wrote it for a Python3 environment so I made no effort to ensure Python2/3 compatibility for the code.

  2. I'm only using it in environments where fcntl.flock works, so I haven't tested what would happen in environments where it doesn't.

  3. On other operating systems from Linux there are different ways to do file locking, so it would be nice if the code were more generic about figuring out what locking mechanisms are available and using one of them, rather than just supporting fcntl.flock.

  4. This doesn't actually have to be its own handler -- the locking mechanism could be integrated into existing handlers, though whether that's appropriate is up for discussion.

  5. I made no effort in my handler to support mode values other than a because that's all I'm using and other modes are complicated because you have to be careful about multiple processes truncating the log file and causing log entries to be lost.

Having said all that, I wanted to submit this here to ask whether it is something you'd be interested in merging if its deficiencies were addressed, and if so what approach you'd like to take, i.e., adding to existing handlers vs. having separate ones. If there's interest we could talk about what you think would be needed to make it release-ready and I could try to find time to hammer it into shape.

So here's the proof-of-concept code:

class FlockFileLock(object):
    def __init__(self, path):
        self.path = path
        self.file = None

    def acquire(self):
        """Returns True if file has been changed, False otherwise"""
        ret = False
        while True:
            if not self.file:
                self.file = open(self.path, 'a')
                self.fstatd = os.fstat(self.file.fileno())
                ret = True
            try:
                fcntl.flock(self.file, fcntl.LOCK_EX)
            except Exception:
                try:
                    self.file.close()
                except Exception:
                    pass
                self.file = None
                raise
            try:
                statd = os.stat(self.path)
            except FileNotFoundError:
                try:
                    self.file.close()
                except Exception:
                    pass
                self.file = None
                continue
            except Exception:
                try:
                    self.file.close()
                except Exception:
                    pass
                self.file = None
                raise
            if self.fstatd[stat.ST_DEV] == statd[stat.ST_DEV] and \
               self.fstatd[stat.ST_INO] == statd[stat.ST_INO]:
                return ret
            self.file.close()
            self.file = None

    def release(self):
        if not self.file:
            return
        fcntl.flock(self.file, fcntl.LOCK_UN)


class FlockRotatingFileHandler(logbook.FileHandler):
    def __init__(self, filename, encoding='utf-8', level=logbook.NOTSET,
                 format_string=None, delay=False, max_size=1024 * 1024,
                 backup_count=5, filter=None, bubble=False):
        self.max_size = max_size
        self.backup_count = backup_count
        super(FlockRotatingFileHandler, self).__init__(
            filename, encoding=encoding, level=level,
            format_string=format_string, delay=delay, filter=filter,
            bubble=bubble)
        self.flock = FlockFileLock(filename)

    def should_rollover(self):
        self.stream.seek(0, 2)
        return self.stream.tell() >= self.max_size

    def perform_rollover(self):
        self.stream.close()
        for x in range(self.backup_count - 1, 0, -1):
            src = '%s.%d' % (self._filename, x)
            dst = '%s.%d' % (self._filename, x + 1)
            try:
                os.rename(src, dst)
            except FileNotFoundError:
                pass
        os.rename(self._filename, self._filename + '.1')
        self._open('a')

    def emit(self, record):
        try:
            self.lock.acquire()
            changed = self.flock.acquire()
            if changed:
                self.flush()
                if self.stream:
                    try:
                        self.stream.close()
                    except Exception:
                        pass
                    self.stream = None
                self.ensure_stream_is_open()
            super(FlockRotatingFileHandler, self).emit(record)
            if self.should_rollover():
                self.perform_rollover()
        finally:
            try:
                self.lock.release()
            except Exception:
                pass
            self.flock.release()

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions