Skip to content

Commit 0c82882

Browse files
add a executor for checkoutRevision endpoint
Summary: # Context We’ve encountered issues in the past of certain functions having wide execution fan-out, meaning one Thrift call could tie up all of the Thrift CPU threads to serve the request. This is not ideal, as this results in other clients receiving `QueueTimeout`s due to waiting too long to get Thrift requests picked up for execution. We’ve combated this in the past by limiting some Thrift entry points to serial execution. This has become a game of whack-a-mole with new instances of `QueueTimeout`s, requiring copying this logic to different functions that seem to have similar fan-out, and is error prone since its easy to forget to set up a SerialExecutor while refactoring pre-existing functions. There are some mission critical entry points that are known to be more heavyweight and should not be executed serially. There are a few options to be explored for these cases, including adding thread pools for particular heavyweight entry points, or implementing a Executor which can use smarter internal load balancing, such as a token bucket backed solution. This diff explores the first option. # This Diff This diff adds a `UnboundedQueueExecutor` that can be used to drive `checkout()`. Without this, and while using serial execution for Thrift requests across the board, checkout becomes too slow. This allows us to use serial execution for everything while keeping `checkoutRevision` performant. We might need to introduce similar executors for other endpoints, TBD. # Technical Details The use of a `UnboundedQueueExecutor` is explicit here. I plan to go through and remove uses of bounded executors in EdenFS. It is not recommended to use bounded executors with async code [1] # Facebook [1] - https://fb.workplace.com/groups/474291069286180/posts/5360947373953834/?comment_id=5361092820605956 [Threadpool Project Plan](https://docs.google.com/document/d/1yvYFYk8UT3hW9M7Qw-NSIpvqYFUxjHL8FhM81cYQGeQ/edit?tab=t.0) Reviewed By: jdelliot Differential Revision: D70135765 fbshipit-source-id: 66634542dbc3c69e8bb2c75ab071313e0ab37f80
1 parent 195d8db commit 0c82882

File tree

4 files changed

+128
-0
lines changed

4 files changed

+128
-0
lines changed

eden/fs/config/EdenConfig.h

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -328,6 +328,27 @@ class EdenConfig : private ConfigSettingManager {
328328
false,
329329
this};
330330

331+
/**
332+
* Whether Eden should use a dedicated executor for checkout requests. This is
333+
* meant to help with checkoutRevision performance while using serial
334+
* execution for other Thrift requests. This feature can be used even if not
335+
* using serial execution for other Thrift requests, but if serial execution
336+
* is being used, its a good idea to turn on this config as well.
337+
*/
338+
ConfigSetting<bool> thriftUseCheckoutExecutor{
339+
"thrift:use-checkout-executor",
340+
false,
341+
this};
342+
343+
/**
344+
* Number of threads that will service the checkoutRevision Thrift endpoint
345+
* when using its own executor.
346+
*/
347+
ConfigSetting<uint64_t> numCheckoutThreads{
348+
"thrift:checkout-revision-num-servicing-threads",
349+
std::thread::hardware_concurrency(),
350+
this};
351+
331352
/**
332353
* How often to collect Thrift server metrics. The default value mirrors the
333354
* value from facebook::fb303::TServerCounters::kDefaultSampleRate

eden/fs/service/EdenServer.cpp

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -343,6 +343,17 @@ std::shared_ptr<folly::Executor> makeFsChannelThreads(
343343
return fsChannelThreads;
344344
}
345345

346+
std::shared_ptr<folly::Executor> makeCheckoutRevisionThreads(
347+
bool useCheckoutExecutor,
348+
std::shared_ptr<const EdenConfig>& edenConfig) {
349+
if (useCheckoutExecutor) {
350+
return std::make_shared<UnboundedQueueExecutor>(
351+
edenConfig->numCheckoutThreads.getValue(),
352+
"CheckoutRevisionThreadPool");
353+
}
354+
return nullptr;
355+
}
356+
346357
} // namespace
347358

348359
namespace facebook::eden {
@@ -486,6 +497,10 @@ EdenServer::EdenServer(
486497
thriftUseResourcePools_{edenConfig->thriftUseResourcePools.getValue()},
487498
thriftUseSerialExecution_{
488499
edenConfig->thriftUseSerialExecution.getValue()},
500+
thriftUseCheckoutExecutor_{
501+
edenConfig->thriftUseCheckoutExecutor.getValue()},
502+
checkoutRevisionExecutor_{
503+
makeCheckoutRevisionThreads(thriftUseCheckoutExecutor_, edenConfig)},
489504
progressManager_{
490505
std::make_unique<folly::Synchronized<EdenServer::ProgressManager>>()},
491506
startupStatusChannel_{std::move(startupStatusChannel)} {
@@ -2254,6 +2269,26 @@ ImmediateFuture<CheckoutResult> EdenServer::checkOutRevision(
22542269
return std::move(result);
22552270
});
22562271

2272+
if (thriftUseCheckoutExecutor_) {
2273+
// This is an UnboundedQueueExecutor, which is guaranteed to never block
2274+
// nor throw (except OOM), nor execute inline from `add()`.
2275+
//
2276+
// Thrift documentation states that "it is almost never a good idea to send
2277+
// work off Thrift’s CPU worker". However, it also notes that "user code may
2278+
// delegate the rest of the work to other thread pools, thus shifting load
2279+
// off Thrift’s CPU Workers thread and letting the thread pick up new
2280+
// incoming work" but warns that "if there is work offloaded to other thread
2281+
// pools, there should be backpressure mechanism that would block Thrift CPU
2282+
// Workers thread when those internal thread pools are overloaded."
2283+
//
2284+
// Without backpressure, EdenFS would see an increase in memory pressure.
2285+
// potential slowdowns, and/or OOMS. EdenFS only allows one checkout at a
2286+
// time (https://fburl.com/code/a6eoy8wu), which acts as the aformentioned
2287+
// backpressure mechanism.
2288+
checkoutFuture =
2289+
std::move(checkoutFuture).semi().via(checkoutRevisionExecutor_.get());
2290+
}
2291+
22572292
return std::move(checkoutFuture).ensure([mountHandle] {});
22582293
}
22592294

eden/fs/service/EdenServer.h

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -799,6 +799,19 @@ class EdenServer : private TakeoverHandler {
799799
*/
800800
bool thriftUseSerialExecution_;
801801

802+
/**
803+
* If the Thrift Server created at startup is configured to use a dedicated
804+
* executor for the checkoutRevision endpoint.
805+
*/
806+
bool thriftUseCheckoutExecutor_;
807+
808+
/**
809+
* Dedicated executor for checkoutRevision Thrift endpoint. This will be null
810+
* if the config to use a dedicated executor is not set at daemon startup, if
811+
* thriftUseCheckoutExecutor_ is false.
812+
*/
813+
std::shared_ptr<folly::Executor> checkoutRevisionExecutor_;
814+
802815
/**
803816
* Remounting progress state.
804817
*/

eden/integration/hg/update_test.py

Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1473,3 +1473,62 @@ def stop_eden() -> None:
14731473
# throws if eden exits uncleanly - which it does if there is some
14741474
# sort of crash.
14751475
shutdown_thread.join()
1476+
1477+
1478+
@hg_test
1479+
# pyre-ignore[13]: T62487924
1480+
class UpdateDedicatedExecutorTest(EdenHgTestCase):
1481+
# pyre-fixme[13]: Attribute `commit1` is never initialized.
1482+
commit1: str
1483+
# pyre-fixme[13]: Attribute `commit2` is never initialized.
1484+
commit2: str
1485+
# pyre-fixme[13]: Attribute `commit3` is never initialized.
1486+
commit3: str
1487+
enable_fault_injection: bool = True
1488+
1489+
def edenfs_logging_settings(self) -> Dict[str, str]:
1490+
return {
1491+
"eden.fs.inodes.TreeInode": "DBG5",
1492+
"eden.fs.inodes.CheckoutAction": "DBG5",
1493+
"eden.fs.inodes.CheckoutContext": "DBG5",
1494+
}
1495+
1496+
def edenfs_extra_config(self) -> Optional[Dict[str, List[str]]]:
1497+
parent_config = super().edenfs_extra_config()
1498+
if parent_config is None:
1499+
parent_config = {}
1500+
if "thrift" not in parent_config:
1501+
parent_config["thrift"] = []
1502+
1503+
parent_config["thrift"].append("use-checkout-executor = true")
1504+
1505+
return parent_config
1506+
1507+
def populate_backing_repo(self, repo: hgrepo.HgRepository) -> None:
1508+
repo.write_file("hello.txt", "hola")
1509+
repo.write_file(".gitignore", "ignoreme\n")
1510+
repo.write_file("foo/.gitignore", "*.log\n")
1511+
repo.write_file("foo/bar.txt", "test\n")
1512+
repo.write_file("foo/subdir/test.txt", "test\n")
1513+
self.commit1 = repo.commit("Initial commit.")
1514+
1515+
repo.write_file("foo/.gitignore", "*.log\n/_*\n")
1516+
self.commit2 = repo.commit("Update foo/.gitignore")
1517+
1518+
repo.write_file("foo/bar.txt", "updated in commit 3\n")
1519+
self.commit3 = repo.commit("Update foo/.gitignore")
1520+
1521+
def test_checkout_on_dedicated_executor(self) -> None:
1522+
"""Test that checkout can be completed on a dedicated executor."""
1523+
self.assert_status_empty()
1524+
1525+
self.chmod("hello.txt", 0o755)
1526+
self.assert_status({"hello.txt": "M"})
1527+
commit4 = self.repo.commit("Update hello.txt mode")
1528+
1529+
self.repo.update(self.commit1)
1530+
self.repo.update(commit4)
1531+
self.touch("hello.txt")
1532+
self.repo.update(self.commit1)
1533+
self.repo.update(commit4)
1534+
self.assert_status_empty()

0 commit comments

Comments
 (0)