
Commit 29390bb

YuKuai-huawei authored and axboe committed
blk-throttle: support prioritized processing of metadata
Currently, blk-throttle handles all IO in FIFO order, so if data IO is
throttled and metadata IO is dispatched afterwards, the metadata IO has to
wait behind the data IO, causing priority inversion. This patch supports
handling metadata first and then paying off the debt while throttling data.

Test script: use cgroup v1 to throttle the root cgroup, then create a new
dir and file while writeback is throttled.

  test() {
    mkdir /mnt/test/xxx
    touch /mnt/test/xxx/1
    sync /mnt/test/xxx
    sync /mnt/test/xxx
  }

  mkfs.ext4 -F /dev/nvme0n1 -E lazy_itable_init=0,lazy_journal_init=0
  mount /dev/nvme0n1 /mnt/test
  echo "259:0 $((1024*1024))" > /sys/fs/cgroup/blkio/blkio.throttle.write_bps_device
  dd if=/dev/zero of=/mnt/test/foo1 bs=16M count=1 conv=fdatasync status=none &
  sleep 4
  time test
  echo "259:0 0" > /sys/fs/cgroup/blkio/blkio.throttle.write_bps_device
  sleep 1
  umount /dev/nvme0n1

Test result: time cost for creating the new dir and file
  before this patch: 14s
  after this patch:  0.1s

Signed-off-by: Yu Kuai <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
1 parent 3bf73e6 commit 29390bb

File tree

1 file changed: +43 −22 lines changed


block/blk-throttle.c

Lines changed: 43 additions & 22 deletions
@@ -1595,6 +1595,22 @@ void blk_throtl_cancel_bios(struct gendisk *disk)
 	spin_unlock_irq(&q->queue_lock);
 }
 
+static bool tg_within_limit(struct throtl_grp *tg, struct bio *bio, bool rw)
+{
+	/* throtl is FIFO - if bios are already queued, should queue */
+	if (tg->service_queue.nr_queued[rw])
+		return false;
+
+	return tg_may_dispatch(tg, bio, NULL);
+}
+
+static void tg_dispatch_in_debt(struct throtl_grp *tg, struct bio *bio, bool rw)
+{
+	if (!bio_flagged(bio, BIO_BPS_THROTTLED))
+		tg->carryover_bytes[rw] -= throtl_bio_data_size(bio);
+	tg->carryover_ios[rw]--;
+}
+
 bool __blk_throtl_bio(struct bio *bio)
 {
 	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
@@ -1611,29 +1627,34 @@ bool __blk_throtl_bio(struct bio *bio)
 	sq = &tg->service_queue;
 
 	while (true) {
-		/* throtl is FIFO - if bios are already queued, should queue */
-		if (sq->nr_queued[rw])
+		if (tg_within_limit(tg, bio, rw)) {
+			/* within limits, let's charge and dispatch directly */
+			throtl_charge_bio(tg, bio);
+
+			/*
+			 * We need to trim slice even when bios are not being
+			 * queued otherwise it might happen that a bio is not
+			 * queued for a long time and slice keeps on extending
+			 * and trim is not called for a long time. Now if limits
+			 * are reduced suddenly we take into account all the IO
+			 * dispatched so far at new low rate and newly queued
+			 * IO gets a really long dispatch time.
+			 *
+			 * So keep on trimming slice even if bio is not queued.
+			 */
+			throtl_trim_slice(tg, rw);
+		} else if (bio_issue_as_root_blkg(bio)) {
+			/*
+			 * IOs which may cause priority inversions are
+			 * dispatched directly, even if they're over limit.
+			 * Debts are handled by carryover_bytes/ios while
+			 * calculating wait time.
+			 */
+			tg_dispatch_in_debt(tg, bio, rw);
+		} else {
+			/* if above limits, break to queue */
 			break;
-
-		/* if above limits, break to queue */
-		if (!tg_may_dispatch(tg, bio, NULL))
-			break;
-
-		/* within limits, let's charge and dispatch directly */
-		throtl_charge_bio(tg, bio);
-
-		/*
-		 * We need to trim slice even when bios are not being queued
-		 * otherwise it might happen that a bio is not queued for
-		 * a long time and slice keeps on extending and trim is not
-		 * called for a long time. Now if limits are reduced suddenly
-		 * we take into account all the IO dispatched so far at new
-		 * low rate and newly queued IO gets a really long dispatch
-		 * time.
-		 *
-		 * So keep on trimming slice even if bio is not queued.
-		 */
-		throtl_trim_slice(tg, rw);
+		}
 
 	/*
 	 * @bio passed through this layer without being throttled.
