[ENH] s3heap, meet the compactor #5584
Changes from all commits
2224127
da2371f
ddec8ce
c844996
c3f061a
648a9f2
6c7e23c
cf7022a
1d9c7a5
6e1c380
4b3c1ba
a140142
8bf9197
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
# s3heap-service | ||
|
||
The s3heap-service integrates with the task manager to trigger tasks no faster than a particular | ||
cadence, with reasonable guarantees that writing data will cause a task to run. | ||
|
||
This document refines the design of the heap-tender and heap service until it can be implemented | ||
safely. | ||
|
||
## Abstract: A heap and a sysdb. | ||
|
||
At the most abstract level, we have a heap and the sysdb. An item is either in the heap or not in | ||
the heap. For the sysdb, an item is not in the sysdb, in the sysdb and should be scheduled, or in | ||
the sysdb and waiting for writes to trigger the next scheduled run. | ||
Review comment: Would be good to bullet point this entire sentence with one bullet per state.
Reply: Can I leave the prose? This is meant to match the table below and the prose is what I need more than a table or bullets. Happy to add redundancy so we all have what we need to understand.
|
||
That gives this chart: | ||
|
||
| Heap State | Sysdb State | | ||
|------------|-------------| | ||
| Not in heap | Not in sysdb | | ||
| Not in heap | In sysdb, should be scheduled | | ||
| Not in heap | In sysdb, waiting for writes | | ||
| In heap | Not in sysdb | | ||
| In heap | In sysdb, should be scheduled | | ||
| In heap | In sysdb, waiting for writes | | ||
|
||
More abstractly, view it like this: | ||
|
||
| | On Heap | Not On Heap | | ||
|---------------------|------------|-------------| | ||
| Not in sysdb | A_1 | A_2 | | ||
| In sysdb, should be scheduled | B_1 | B_2 | | ||
| In sysdb, waiting for writes | C_1 | C_2 | | ||
|
||
When viewed like this, we can establish rules for state transitions in our system. Each operation | ||
touches either the sysdb or the heap, never both, because there is no transactionality between S3 | ||
and databases. Thus a single operation can move to any row within the same column, or to | ||
another column within the same row. | ||
|
||
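The rule above can be sketched as a small check. This is only an illustration, not code from the PR: the `HeapState`, `SysDbState`, and `State` types and the `is_single_store_step` function are hypothetical names introduced here. A step is structurally possible only when exactly one of the two stores changes.

```rust
// Hypothetical types for illustration; they are not part of s3heap or s3heap-service.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum HeapState {
    OnHeap,    // column 1 in the table (A_1, B_1, C_1)
    NotOnHeap, // column 2 in the table (A_2, B_2, C_2)
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SysDbState {
    NotInSysDb, // row A
    Scheduled,  // row B: in sysdb, should be scheduled
    Waiting,    // row C: in sysdb, waiting for writes
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct State {
    heap: HeapState,
    sysdb: SysDbState,
}

/// Returns true when exactly one of the two stores changes between `from` and `to`.
/// A step that changes both would need a transaction across S3 and the database,
/// and a step that changes neither is not a transition at all.
fn is_single_store_step(from: State, to: State) -> bool {
    (from.heap != to.heap) ^ (from.sysdb != to.sysdb)
}
```

For example, A_1 to A_2 changes only the heap and passes the check, while A_1 to B_2 changes both stores and fails it. Passing the check does not mean the transition is allowed, only that it is not ruled out by the lack of transactionality.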
## State space diagram | ||
|
||
From | ||
| | | A_1 | A_2 | B_1 | B_2 | C_1 | C_2 | | ||
|-----|------|------|------|------|------|------|------| | ||
| | A_1 | - | IMP1 | YES1 | X | YES1 | X | | ||
| | A_2 | GC1 | - | X | GC2 | X | YES1 | | ||
| To | B_1 | IMP2 | X | - | NEW2 | YES3 | X | | ||
| | B_2 | X | NEW1 | IMP3 | - | X | YES3 | | ||
| | C_1 | IMP2 | X | YES2 | X | - | IMP4 | | ||
| | C_2 | X | NO1 | X | YES2 | IMP3 | - | | ||
|
||
- GC1: Item gets a perpetual "is-done" from the sysdb and transitions to A_2. | ||
- GC2: Garbage collection. | ||
- NEW1: Create a new task in the sysdb. | ||
- NEW2: Finish the new operation by priming the task and putting it on the heap. | ||
- YES1: Task gets deleted from sysdb. | ||
- YES2: This implies that we move from scheduled to waiting while the task is on the heap. This happens when a job completes and reads all data from the log. | ||
- YES3: There was a write and the heap needed to schedule, so it picked a time and updated the sysdb. | ||
- NO1: This implies that the state transitioned from being not-in-sysdb to in-sysdb. A new task will always run straight away, so it should not be put into waiting state. | ||
- IMP1: The item is not on the heap or in the database. The first transition is to B_2 or C_2. | ||
- IMP2: Task UUIDs are not re-used. Starting from A_1 implies the task was created and then put on the heap and subsequently removed from sysdb. There should be no means by which it reappears in the sysdb. Therefore this path is impossible. | ||
- IMP3: We never take something off the heap until the sysdb is updated to reflect the job being done. Therefore we don't take this transition. | ||
- IMP4: We don't add something to the heap until it has been scheduled. | ||
- X: Impossible. |
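As a rough sketch, the table above can be transcribed into a lookup so that a transition can be checked by label. The names `STATES`, `TABLE`, `label`, and `is_permitted` are hypothetical and not part of the PR. Treating `YES*`, `NEW*`, and `GC*` as the permitted transitions, and `IMP*`, `NO1`, and `X` as not taken, is an assumption read off the legend.

```rust
/// State labels in the order used by the table.
const STATES: [&str; 6] = ["A_1", "A_2", "B_1", "B_2", "C_1", "C_2"];

/// TABLE[to][from], transcribed from the state space table: rows are "To", columns are "From".
const TABLE: [[&str; 6]; 6] = [
    ["-", "IMP1", "YES1", "X", "YES1", "X"],
    ["GC1", "-", "X", "GC2", "X", "YES1"],
    ["IMP2", "X", "-", "NEW2", "YES3", "X"],
    ["X", "NEW1", "IMP3", "-", "X", "YES3"],
    ["IMP2", "X", "YES2", "X", "-", "IMP4"],
    ["X", "NO1", "X", "YES2", "IMP3", "-"],
];

/// Position of a state label in `STATES`, or None for an unknown label.
fn index_of(state: &str) -> Option<usize> {
    STATES.iter().position(|s| *s == state)
}

/// Looks up the table entry for the transition `from` -> `to`.
fn label(from: &str, to: &str) -> Option<&'static str> {
    Some(TABLE[index_of(to)?][index_of(from)?])
}

/// Assumption: only YES*, NEW*, and GC* entries are transitions the system takes.
fn is_permitted(from: &str, to: &str) -> bool {
    matches!(
        label(from, to),
        Some(l) if l.starts_with("YES") || l.starts_with("NEW") || l.starts_with("GC")
    )
}
```

With this transcription, `label("A_2", "B_2")` is `NEW1` (create the task in the sysdb) and `label("B_2", "B_1")` is `NEW2` (put it on the heap), which matches the two-step creation path described in the legend.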
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
use std::collections::HashMap; | ||
|
||
use chroma_sysdb::SysDb; | ||
use chroma_types::{CollectionUuid, ScheduleEntry}; | ||
use s3heap::{HeapScheduler, Schedule, Triggerable}; | ||
use uuid::Uuid; | ||
|
||
/// Scheduler that integrates with SysDb to manage task scheduling. | ||
pub struct SysDbScheduler { | ||
sysdb: SysDb, | ||
} | ||
|
||
impl SysDbScheduler { | ||
pub fn new(sysdb: SysDb) -> SysDbScheduler { | ||
Self { sysdb } | ||
} | ||
} | ||
|
||
#[async_trait::async_trait] | ||
impl HeapScheduler for SysDbScheduler { | ||
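/// Reports, for each item, whether it is done: true when the sysdb has no schedule for the | ||
/// triggerable, or when the stored task run nonce differs from the item's nonce. | ||
async fn are_done(&self, items: &[(Triggerable, Uuid)]) -> Result<Vec<bool>, s3heap::Error> { | ||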
let collection_ids = items | ||
.iter() | ||
.map(|item| CollectionUuid(*item.0.partitioning.as_uuid())) | ||
.collect::<Vec<_>>(); | ||
let schedules = self | ||
.sysdb | ||
.clone() | ||
.peek_schedule_by_collection_id(&collection_ids) | ||
.await | ||
.map_err(|e| s3heap::Error::Internal(format!("sysdb error: {}", e)))?; | ||
let mut by_triggerable: HashMap<Triggerable, ScheduleEntry> = HashMap::default(); | ||
for schedule in schedules.into_iter() { | ||
by_triggerable.insert( | ||
Triggerable { | ||
partitioning: schedule.collection_id.0.into(), | ||
scheduling: schedule.task_id.into(), | ||
}, | ||
schedule, | ||
); | ||
} | ||
let mut results = Vec::with_capacity(items.len()); | ||
for (triggerable, nonce) in items.iter() { | ||
let Some(schedule) = by_triggerable.get(triggerable) else { | ||
// No schedule in the sysdb for this triggerable: report the item as done. | ||
results.push(true); | ||
continue; | ||
}; | ||
results.push(schedule.task_run_nonce != *nonce); | ||
} | ||
Ok(results) | ||
} | ||
|
||
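/// Returns a `Schedule` for every sysdb entry of the given collections that has `when_to_run` set. | ||
async fn get_schedules(&self, ids: &[Uuid]) -> Result<Vec<Schedule>, s3heap::Error> { | ||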
let collection_ids = ids.iter().cloned().map(CollectionUuid).collect::<Vec<_>>(); | ||
let schedules = self | ||
.sysdb | ||
.clone() | ||
.peek_schedule_by_collection_id(&collection_ids) | ||
.await | ||
.map_err(|e| s3heap::Error::Internal(format!("sysdb error: {}", e)))?; | ||
let mut results = Vec::new(); | ||
tracing::info!("schedules {schedules:?}"); | ||
for schedule in schedules.into_iter() { | ||
if let Some(when_to_run) = schedule.when_to_run { | ||
results.push(Schedule { | ||
triggerable: Triggerable { | ||
partitioning: schedule.collection_id.0.into(), | ||
scheduling: schedule.task_id.into(), | ||
}, | ||
nonce: schedule.task_run_nonce, | ||
next_scheduled: when_to_run, | ||
}); | ||
} | ||
} | ||
Ok(results) | ||
} | ||
} |
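The decision made by `are_done` can be isolated from the sysdb call for testing. The following is a minimal sketch under stated assumptions: `TaskKey` and `Nonce` are hypothetical stand-ins for `Triggerable` and `Uuid`, and the `current` map stands in for the result of `peek_schedule_by_collection_id`. It is not the PR's implementation.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins: a (partitioning, scheduling) pair instead of Triggerable,
// and a plain integer instead of a Uuid nonce.
type TaskKey = (u64, u64);
type Nonce = u64;

/// For each (key, nonce) item, report whether the run it refers to is done.
/// An item is done when the sysdb has no entry for the key, or when the nonce
/// stored in the sysdb no longer matches the nonce recorded with the item.
fn are_done(items: &[(TaskKey, Nonce)], current: &HashMap<TaskKey, Nonce>) -> Vec<bool> {
    items
        .iter()
        .map(|(key, nonce)| match current.get(key) {
            None => true,
            Some(current_nonce) => current_nonce != nonce,
        })
        .collect()
}
```

The result has one entry per input item, in input order, which is the same shape the trait method returns.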