Generic questions about the challenge #40
-
Hello, thank you for organizing this nice challenge. I have a few generic questions about Task_1.

3b) How can I cut down the wall time spent per round? Running run_challenge_experiment was very slow, even when I restricted collaborator training selection to just one collaborator at every round. For example, if there is a problem with my setup (e.g., not enough disk space to save checkpoints), I don't want to wait 48 hours to find that out, and testing it shouldn't require training and validating with all collaborators.

3c) How do I plot the metrics displayed by collaborator.py (in openfl) during training? Can we only visualize the results after the entire experiment has finished? Again, this is problematic because even the simplest test takes a very long time to train.

Thank you
-
Hi @danpak94, thank you for your interest in the FeTS Challenge!

3a. You can ask OpenFL/GaNDLF-related questions right here, and the organizers will either answer directly or point you to the relevant documentation, whenever appropriate.

3b & 3c: tagging @brandon-edwards @msheller @psfoley @alexey-gruzdev

Cheers,
-
Hi danpak94,
For testing purposes (i.e., something like making sure you have enough disk space to save the checkpoints), you can set 'challenge_metrics_validation_interval' to a large value in order to skip the validation step for most rounds (assuming your test does not require much model validation). Limiting training to one collaborator is also a good idea, and when doing so try to pick a small one. You can also make your own partitioning CSV, in order to create institutions of whatever size you wish for such tests. Running on GPU, I get through a single round this way (single small institution) in minutes. A configuration sketch is given below.

In addition, evaluating the Hausdorff distance is computationally expensive and so adds quite a bit to the runtime. To address this (for those who do not care to collect this aspect of validation), we will soon be pushing a change that allows removing Hausdorff from validation altogether.
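To make the above concrete, here is a minimal smoke-test sketch. The import and the rounds_to_train / challenge_metrics_validation_interval keywords follow this thread; the remaining keyword names, the callback signature, and the CSV column layout are assumptions, so check the Task_1 example script in your checkout for the exact interface (other required arguments, such as your aggregation function and hyperparameter callbacks, are omitted for brevity).

```python
# Sketch of a fast smoke-test setup for Task_1. Names marked
# "assumed" are illustrative and may differ in your version.
from fets_challenge import run_challenge_experiment  # as in the Task_1 example

def pick_one_small_collaborator(collaborators, db_iterator, fl_round,
                                collaborators_chosen_each_round,
                                collaborator_times_per_round):
    """Train a single small institution every round to keep rounds short.

    Signature assumed to match the choose_training_collaborators hook
    from the Task_1 example script.
    """
    return [collaborators[0]]  # assumes collaborators[0] is a small institution

# A hand-made partitioning CSV lets you define a tiny institution for
# tests, e.g. (column names assumed):
#
#   Partition_ID,Subject_ID
#   1,FeTS_001
#   1,FeTS_002

metrics = run_challenge_experiment(
    # ... plus your aggregation_function and hyperparameter callbacks ...
    choose_training_collaborators=pick_one_small_collaborator,
    institution_split_csv_filename='tiny_split.csv',       # hypothetical file
    brats_training_data_parent_dir='/path/to/brats_data',  # assumed keyword
    rounds_to_train=2,                           # just enough to exercise I/O
    challenge_metrics_validation_interval=1000,  # skip validation on most rounds
    save_checkpoints=True,                       # assumed keyword; tests disk usage
    device='cuda',                               # assumed keyword
)
```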
For plotting during training, I would suggest using checkpointing and running your experiment with progressively greater values of rounds_to_train (e.g., a first run with rounds_to_train=5 and checkpoint saving enabled, then a run restoring from that checkpoint with rounds_to_train=10, and so on). Each run returns a dataframe from run_challenge_experiment, which you can plot as the results come in. A short script wrapping run_challenge_experiment could automate this; a sketch follows.
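The loop below sketches that idea. That run_challenge_experiment returns a dataframe is taken from the reply above; the checkpoint-related keywords (save_checkpoints, restore_from_checkpoint_folder), the checkpoint folder name, and the metric column names are assumptions to adapt to your setup.

```python
# Sketch: progressively extend training and plot metrics between runs.
# Checkpoint keywords and metric column names are assumptions.
import matplotlib.pyplot as plt
from fets_challenge import run_challenge_experiment

checkpoint_folder = None  # first run starts from scratch
for total_rounds in (5, 10, 15):
    metrics_df = run_challenge_experiment(
        # ... same arguments as your main experiment ...
        rounds_to_train=total_rounds,
        save_checkpoints=True,                             # assumed keyword
        restore_from_checkpoint_folder=checkpoint_folder,  # assumed keyword
    )
    checkpoint_folder = 'checkpoint'  # wherever your run saved its state

    # Plot whichever validation columns your dataframe carries;
    # 'round' and 'binary_DICE' are placeholder column names.
    metrics_df.plot(x='round', y='binary_DICE')
    plt.savefig(f'metrics_through_round_{total_rounds}.png')
    plt.close()
```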
Hi danpak94,
Thanks for your questions. Please find responses to 3b and 3c below.