
Batch write transaction log entries #254

@kriszyp

Currently, each commit writes its transaction entries to the transaction log(s) with a writev call to the transaction log file, guarded by a write mutex. Based on my performance testing, I estimate that each writev call takes about 5μs. Because the calls happen inside a mutex, they must be performed sequentially (regardless of how many threads we are using), which limits transaction log throughput to roughly 200K writes/second. Some applications have write requirements closer to a million writes/second, so improving this performance would eventually be beneficial.
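The estimate above, and what batching would have to deliver, can be checked with a few lines of arithmetic. This is only a sketch: the function names are illustrative, and the assumption that a batched writev call costs about the same as a single-entry call is not something the performance testing established.

```python
# Figure from the estimate above: roughly 5 microseconds per writev call,
# with all calls serialized by the write mutex.
WRITEV_SECONDS = 5e-6


def max_calls_per_second(call_seconds: float) -> float:
    """Upper bound on serialized calls per second for a given call cost."""
    return 1.0 / call_seconds


def entries_per_call_needed(target_per_second: float, call_seconds: float) -> float:
    """Average batch size needed to reach a target write rate.

    ASSUMPTION: a batched writev call costs about the same as a
    single-entry call. Real cost will grow with the bytes written.
    """
    return target_per_second * call_seconds


print(round(max_calls_per_second(WRITEV_SECONDS)))
print(round(entries_per_call_needed(1_000_000, WRITEV_SECONDS), 3))
```

Under that assumption, reaching a million writes/second would take an average of about five entries per writev call.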

One possible optimization is to move the writev call outside the write mutex, and only "stage" the writes in the mutex. We would need to eliminate the O_APPEND flag and do writes with exact offsets so that multiple writev calls could be done in parallel. However, I would guess this is simply going to move the contention into the kernel, and may not significantly improve performance (although possibly worth trying).
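As a rough illustration of this first option, the sketch below reserves an offset range inside a mutex and then issues a positional vectored write outside it. It is written in Python for brevity and is not the project's code: `OffsetStagedLog` and its members are hypothetical names, and `os.pwritev` (Python's wrapper around `pwritev(2)`, available on Linux and other Unix-like systems) stands in for the native call.

```python
import os
import threading
from typing import List


class OffsetStagedLog:
    """Hypothetical sketch: stage offsets under a mutex, write outside it."""

    def __init__(self, path: str) -> None:
        # No O_APPEND: every write targets an explicit offset.
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
        self.stage_mutex = threading.Lock()
        self.next_offset = 0

    def write(self, buffers: List[bytes]) -> int:
        size = sum(len(b) for b in buffers)
        # Only the offset reservation is serialized.
        with self.stage_mutex:
            offset = self.next_offset
            self.next_offset += size
        # The write itself runs outside the mutex, so several threads
        # can be inside pwritev at the same time.
        written = os.pwritev(self.fd, buffers, offset)
        if written != size:
            # A real implementation would retry the remainder instead.
            raise OSError("short write to transaction log")
        return offset

    def close(self) -> None:
        os.close(self.fd)
```

One consequence of this design is that a reader can observe a gap: a later range may be on disk before an earlier reserved range has been written, so readers would need some way to know how far the log is contiguous.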

I believe it is more likely that higher performance could be achieved by batching parallel transaction writes. Here is a rough outline of a proposed commit process with transaction log write batching:

  1. At the start of the commit, add all of the transaction entries to be written to a queue in TransactionLogStore (using a mutex for this queue). Record the log position at which these entries will end once they are written to the transaction log (computed from the size of the entries and of any entries already queued).
  2. Perform the RocksDB commit.
  3. Check to see if the transaction log has written this commit's queued entries. If so, commit is done.
  4. If not, acquire write mutex.
  5. Check again whether this commit's queued entries have been written; if so, the commit is done. Otherwise, construct an iovec array of all queued entries (from this commit and any other commits that have entries in the store queue), remove the transferred entries from the queue, and then perform the writev call for all of those entries.
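The steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual TransactionLogStore: the field names, the `rocksdb_commit` callback, and the `IOV_MAX` value of 1024 are all hypothetical, and positions are counted in bytes from the moment the store is opened.

```python
import os
import threading
from typing import Callable, List

IOV_MAX = 1024  # ASSUMPTION: typical Linux limit on buffers per writev call


class TransactionLogStore:
    """Hypothetical sketch of the batched commit outline."""

    def __init__(self, path: str) -> None:
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.queue_mutex = threading.Lock()
        self.write_mutex = threading.Lock()
        self.queue: List[bytes] = []
        self.queued_end = 0   # log position after everything queued so far
        self.written_end = 0  # log position after everything written so far
        self.writev_calls = 0

    def commit(self, entries: List[bytes], rocksdb_commit: Callable[[], None]) -> None:
        # Step 1: queue the entries and record where they will end.
        with self.queue_mutex:
            self.queue.extend(entries)
            self.queued_end += sum(len(e) for e in entries)
            my_end = self.queued_end
        # Step 2: perform the database commit (a stand-in callback here).
        rocksdb_commit()
        # Step 3: another commit may already have written our entries.
        # (Native code would read written_end atomically.)
        if self.written_end >= my_end:
            return
        # Step 4: otherwise take the write mutex.
        with self.write_mutex:
            # Step 5: re-check, then drain the whole queue into one batch.
            if self.written_end >= my_end:
                return
            with self.queue_mutex:
                batch, self.queue = self.queue, []
            self._write_all(batch)
            self.written_end += sum(len(b) for b in batch)

    def _write_all(self, batch: List[bytes]) -> None:
        # writev accepts a bounded number of buffers, so split large batches.
        for i in range(0, len(batch), IOV_MAX):
            chunk = batch[i:i + IOV_MAX]
            if os.writev(self.fd, chunk) != sum(len(b) for b in chunk):
                # A real implementation would retry the remainder instead.
                raise OSError("short write to transaction log")
            self.writev_calls += 1

    def close(self) -> None:
        os.close(self.fd)
```

In this sketch the queue mutex is held only long enough to append or swap the queue, so commits can keep queuing entries while a writev call is in progress; those entries are then picked up by the next commit that reaches step 5.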

Labels: enhancement (New feature or request)
