The data format of the results as published on the website hasn't changed significantly in comparison to OpenProblems v1. Each benchmark's outputs are eventually translated into JSON files named task_info.json, method_info.json, metric_info.json, dataset_info.json, results.json, and quality_control.json, whose schemas are shown in the class diagram below.
```mermaid
classDiagram
class TaskInfo{
task_id: String
commit_sha: String?
task_name: String
task_summary: String
task_description: String
repo: String
authors: Author[]
}
class MethodInfo{
task_id: String
method_id: String
method_name: String
method_summary: String
method_description: String
is_baseline: Boolean
paper_reference: String[]?
code_url: String[]
implementation_url: String?
code_version: String?
commit_sha: String?
}
class MetricInfo{
task_id: String
metric_id: String
metric_name: String
metric_summary: String
metric_description: String
paper_reference: String[]?
implementation_url: String?
code_version: String?
commit_sha: String?
maximize: Boolean
}
class DatasetInfo {
task_id: String
dataset_id: String
dataset_name: String
dataset_summary: String
dataset_description: String
data_reference: String[]?
data_url: String?
date_created: Date
file_size: Long
}
class Results {
task_id: String
dataset_id: String
method_id: String
normalization_id: String?
metric_values: Map[String, Double | NA]
scaled_scores: Map[String, Double]
mean_score: Double
resources: Resources
}
class Resources {
exit_code: Integer
duration_sec: Integer
cpu_pct: Double
peak_memory_mb: Double
disk_read_mb: Double
disk_write_mb: Double
}
class Author {
name: String
roles: String[]
info: AuthorInfo
}
class AuthorInfo {
email: String?
github: String?
orcid: String?
}
class QualityControl {
task_id: String
category: String
name: String
value: Double
severity: Integer
severity_value: Double
code: String
message: String
}
TaskInfo --> Author
Author --> AuthorInfo
Results --> Resources
Results --> MetricInfo
Results --> MethodInfo
Results --> DatasetInfo
```
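For context, these files are flat JSON lists that downstream code joins on the shared ID fields. A minimal sketch of consuming results.json, assuming it holds a list of records with the fields shown above (the file path is an assumption):

```python
import json
from collections import defaultdict

# Load the per-(dataset, method) result records (path is an assumption).
with open("results.json") as f:
    results = json.load(f)

# Aggregate each method's mean scaled score across datasets.
per_method = defaultdict(list)
for record in results:
    per_method[record["method_id"]].append(record["mean_score"])

for method_id, scores in sorted(per_method.items()):
    print(f"{method_id}: {sum(scores) / len(scores):.3f}")
```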
However, there are several issues with this format:
- Naming inconsistencies: some columns are prefixed with the data type they belong to while others are not, e.g. the metric info data has `metric_id` but also `maximize`.
- The format does not allow for different parameterisations of datasets, methods, or metrics, other than creating a unique ID for each parameterisation (see the first sketch after this list). The `normalization_id` is an exception: it is effectively a parameterisation of the dataset processing steps and has a separate field in the results.
- The format only supports linear Dataset -> Method -> Metric graphs. However, several tasks involve more steps in the processing (see the second sketch after this list):
  - In some tasks, running a method is split up into multiple steps in the form of a subworkflow, which makes it harder to track the resources used to run a specific method.
  - In the task_ist_preprocessing benchmark, it is not a single method that is being evaluated, but a combination of computational steps.
- There is more metadata available in the Viash components than is currently shown here. Look into which fields are missing.
- All references should be updated to `doi: String[]?` and `bibtex: String[]?`.
- The new data format should allow task leaders to specify how the results should be rendered on the website by default (see the third sketch after this list). For instance, task_spatially_variable_genes has so many datasets that the default results heatmap becomes hard to interpret.
- No versioning of datasets, components (methods, metrics), or benchmarking results.
- No traceability of how common datasets were produced.
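A minimal sketch of the parameterisation issue. The method and parameter names below are made up, and the `parameters` field is an assumption, not part of the current schema; the other field names follow the diagram above:

```python
# Current format: two parameterisations of the same method must be
# published as two unrelated method entries (names are hypothetical).
knn_k10 = {"task_id": "example_task", "method_id": "knn_k10", "method_name": "KNN (k=10)"}
knn_k50 = {"task_id": "example_task", "method_id": "knn_k50", "method_name": "KNN (k=50)"}

# With an explicit (hypothetical) "parameters" field, both runs could
# share a single component ID, making the relationship machine-readable:
knn = {
    "task_id": "example_task",
    "method_id": "knn",
    "method_name": "KNN",
    "parameters": {"k": 10},  # hypothetical field, not in the current schema
}
```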
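A second sketch, of why the linear Dataset -> Method -> Metric assumption breaks down. In the current format, a multi-step pipeline such as the one in task_ist_preprocessing has to be squashed into a single `method_id`; a hypothetical list of steps would preserve the graph structure and per-step resources (all IDs and numbers below are made up):

```python
# Current format: the combination of steps is flattened into one ID,
# so only aggregate resources can be reported.
result_v1 = {
    "dataset_id": "example_dataset",
    "method_id": "segmentation_a__assignment_b",
    "resources": {"duration_sec": 840, "peak_memory_mb": 4096.0},
}

# Hypothetical alternative: record each step with its own resources.
result_v2 = {
    "dataset_id": "example_dataset",
    "steps": [  # hypothetical field
        {"component_id": "segmentation_a",
         "resources": {"duration_sec": 600, "peak_memory_mb": 4096.0}},
        {"component_id": "assignment_b",
         "resources": {"duration_sec": 240, "peak_memory_mb": 1024.0}},
    ],
}
```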
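And a third sketch, of how a task leader could control the default rendering, assuming a hypothetical `default_view` block in task_info.json (none of these field names exist in the current schema, and the dataset IDs are made up):

```python
task_info = {
    "task_id": "spatially_variable_genes",
    # Hypothetical block: which subset of the results the website
    # shows by default, instead of every dataset at once.
    "default_view": {
        "plot": "heatmap",
        "datasets": ["example_dataset_1", "example_dataset_2"],
    },
}
```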