- Two-level namespace: blob level and tract level
- Talk to raw disk interface
- Atomicity is guaranteed:
- Asynchronous programming API
Metadata server:
- monitor tractservers
- TLT changes
- Reply to client requests for the TLT
Client:
- Receive the TLT from the metadata server. The TLT is cached for a long time. Does the cache ever expire, so that the client contacts the metadata server for the TLT again? If the cache never expires, the client can never get an updated TLT unless it accesses a tractserver affected by a configuration change.
Hash(o) mod m is the ordinary hashing scheme, which maps object o to one of m servers.
The problem is that a lot of data has to move whenever m changes.
In FDS, Tract_Locator = (Hash(g) + i) mod TLT_Length is used to map a tract to a
tractserver, where g is the blob GUID and i is the tract number. Since TLT_Length is
fixed, this problem does not arise. Dynamo divides the hash space into partitions
(consistent hashing) to avoid the same problem.
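
The contrast between the two mappings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `stable_hash`, `mod_m_server`, `tract_locator`, and `fraction_moved` are hypothetical helper names, the key strings are made up, and truncated SHA-1 is used simply as a convenient deterministic hash.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (assumption: truncated SHA-1 of the key).

    Python's built-in hash() is salted per process, so it is avoided here.
    """
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def mod_m_server(obj: str, m: int) -> int:
    """Ordinary hashing: map object `obj` to one of m servers."""
    return stable_hash(obj) % m


def tract_locator(blob_guid: str, tract_index: int, tlt_length: int) -> int:
    """FDS-style mapping: row of the TLT for tract i of blob g.

    The result depends only on the table length, not on the server count.
    """
    return (stable_hash(blob_guid) + tract_index) % tlt_length


def fraction_moved(keys, m_old: int, m_new: int) -> float:
    """Fraction of keys whose server changes when m_old becomes m_new."""
    moved = sum(1 for k in keys if mod_m_server(k, m_old) != mod_m_server(k, m_new))
    return moved / len(keys)


# Hypothetical workload: 10,000 object names, cluster grows from 10 to 11 servers.
keys = [f"object-{n}" for n in range(10_000)]
moved = fraction_moved(keys, 10, 11)
print(f"fraction remapped by Hash(o) mod m: {moved:.3f}")

# Consecutive tracts of one blob land on consecutive TLT rows.
rows = [tract_locator("blob-guid-example", i, 1000) for i in range(4)]
print("TLT rows for tracts 0..3:", rows)
```

With Hash(o) mod m, a key stays put only if its hash has the same remainder under both moduli, so going from 10 to 11 servers remaps roughly 10/11 of the keys. The tract locator never changes for a fixed TLT_Length; a configuration change only alters which tractservers a TLT row points to.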
For an ordinary hash table, resizing happens only in the following two cases. The first is growing m to 2m when the load factor reaches 1. The second is shrinking m to m/2 when the load factor drops to 1/4.
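
As a minimal sketch of that resize rule, the function below returns the new bucket count for a given element count. The name `next_capacity` and the `min_capacity` floor are assumptions added for illustration; the thresholds are treated as "at or above 1" and "at or below 1/4".

```python
def next_capacity(count: int, capacity: int, min_capacity: int = 8) -> int:
    """Return the bucket count m after applying the doubling/halving rule.

    count        -- number of stored elements
    capacity     -- current number of buckets (m)
    min_capacity -- assumed floor so the table never shrinks to nothing
    """
    load_factor = count / capacity
    if load_factor >= 1.0:
        # Table is full: grow m to 2m.
        return capacity * 2
    if load_factor <= 0.25 and capacity > min_capacity:
        # Table is sparse: shrink m to m/2, but not below the floor.
        return max(min_capacity, capacity // 2)
    return capacity


print(next_capacity(8, 8))    # full table: grows
print(next_capacity(4, 16))   # quarter full: shrinks
print(next_capacity(5, 16))   # in between: unchanged
```

The gap between the two thresholds means a table that has just been resized sits at load factor 1/2, so it takes many inserts or deletes before the next resize.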
b(k; n, p)
n = 1 TB / 8 MB = 125k, p = 1/1000, q = 1 - p, μ = np = 125, σ = sqrt(npq) = 11.2
Now use N(μ, σ) to approximate b(k; n, p). By the 3σ rule, the number of tracts on a tractserver falls in [μ-3σ, μ+3σ], which is [91.4, 158.6]. The paper says it is [92, 161].
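
The arithmetic can be checked with a short Python snippet. It uses only the values from the note (n = 125k, p = 1/1000); the variable names are illustrative. Because it keeps σ unrounded, the bounds differ slightly from those computed with σ = 11.2.

```python
import math

n = 125_000          # number of tracts: 1 TB / 8 MB
p = 1 / 1000         # probability that a tract lands on a given tractserver
q = 1 - p

mu = n * p                    # mean of the binomial b(k; n, p)
sigma = math.sqrt(n * p * q)  # standard deviation of the binomial

# 3-sigma interval of the normal approximation N(mu, sigma)
low = mu - 3 * sigma
high = mu + 3 * sigma

print(f"mu = {mu:.1f}, sigma = {sigma:.2f}")
print(f"3-sigma interval: [{low:.1f}, {high:.1f}]")
```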
Similar to GFS.