
Continue autograding if some cells are duplicated #18

Open
tuncbkose wants to merge 3 commits into dev-2023 from duplicate-cells

Conversation

@tuncbkose

Just putting this here to have it in the repo as well. This is not necessarily complete. I can also make this PR target another dev branch, one that is numbered dev700 or something instead of dev999.

Changes

  • deduplicateids.py: upon seeing a duplicated id, converts the cell into an "unimportant" cell that nbgrader can ignore, but leaves additional cell metadata under "nbgrader_local".
  • Added a checkduplicateflag.py postprocessor, called after autograding, so that a DuplicateCellError can be logged once autograding is complete.
  • Changed the Converter base class to catch DuplicateCellErrors per notebook instead of per assignment.
  • Changed the error message in checkcellmetadata.py to be more informative (i.e. when validating).

Long Explanation

Currently, encountering multiple cells with the same grade_id (which is what cell duplication produces) leads to nbgrader deleting the content of the nbgrader metadata of one of the cells (I believe the one that occurs first in the notebook). The problem is that the deletion happens like this

cell_metadata = {
    "key1": value1,
    "key2": value2,
    "nbgrader": {}
}

instead of

cell_metadata = {
    "key1": value1,
    "key2": value2
}

This then creates a problem in CheckCellMetadata, which sees the empty "nbgrader" dictionary and interprets it as corrupted metadata.
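To make the distinction concrete, here is a minimal sketch (simplified names, not nbgrader's actual code) of the check that CheckCellMetadata effectively performs, which treats an empty "nbgrader" dict differently from a missing one:

```python
def metadata_state(cell_metadata: dict) -> str:
    """Classify a cell's nbgrader metadata (illustrative only)."""
    if "nbgrader" not in cell_metadata:
        return "plain"        # an ordinary cell: nothing to validate
    if not cell_metadata["nbgrader"]:
        return "corrupted"    # empty dict: looks like damaged metadata
    return "managed"          # normal nbgrader-managed cell

metadata_state({"key1": 1, "nbgrader": {}})  # flagged as "corrupted"
metadata_state({"key1": 1})                  # fine: "plain"
```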

Thus, the initial idea was to delete the "nbgrader" dictionary completely. The problem is that then we would have no record that something might have gone wrong, like a potential duplication.

Another complication is that we cannot take much action in the preprocessing stage, since that would interrupt the autograding of the rest of the assignment (for a given person).

Thus, I decided to change the cell into one that nbgrader can ignore (without deleting the basic metadata, to differentiate it from other cells; this part might not be strictly necessary), and to add additional metadata under "nbgrader_local" that we can detect later in post-processing.

cell_metadata = {
    "key1": value1,
    "key2": value2,
    # Unimportant cell
    "nbgrader": {"solution": False, "grade": False, "locked": False},
    "nbgrader_local": {"duplicate": True}
}
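The transformation above can be sketched as a small helper, assuming cells are plain dicts in the nbformat layout (the helper name and the UNIMPORTANT constant are illustrative, not nbgrader API):

```python
# Metadata that makes nbgrader treat a cell as "unimportant" (illustrative).
UNIMPORTANT = {"solution": False, "grade": False, "locked": False}

def mark_duplicate(cell: dict) -> dict:
    """Turn a duplicated cell into an 'unimportant' one and flag it."""
    md = cell.setdefault("metadata", {})
    md["nbgrader"] = dict(UNIMPORTANT)          # nbgrader now ignores the cell
    md["nbgrader_local"] = {"duplicate": True}  # detectable in post-processing
    return cell
```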

This "duplicate" flag could be incorporated into the "nbgrader" dictionary as well, but that would require changing the schema in nbgraderformat, so it might create future merging headaches unless upstreamed.

With this change, autograding can continue without any problems, and duplicate flags can be detected as each notebook is graded. CheckDuplicateFlag simply checks whether the flag added by DeduplicateIds is present, and if so, removes it and raises an error for Converter to log.
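A rough sketch of that post-processing check, assuming notebooks are plain dicts (DuplicateCellError here stands in for nbgrader's exception of the same name; the function name is illustrative):

```python
class DuplicateCellError(Exception):
    """Stand-in for nbgrader's DuplicateCellError."""

def check_duplicate_flags(notebook: dict) -> None:
    """Strip any 'nbgrader_local' duplicate flags; raise if any were found."""
    flagged = 0
    for cell in notebook.get("cells", []):
        local = cell.get("metadata", {}).pop("nbgrader_local", None)
        if local and local.get("duplicate"):
            flagged += 1
    if flagged:
        # Converter can catch and log this after the notebook is graded.
        raise DuplicateCellError(f"{flagged} duplicated cell(s) detected")
```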

One significant change: the general catching of errors during autograding happens at the assignment level (see converters/base.py::convert_notebooks). In order not to skip the rest of the assignment upon encountering a duplication error, I added a second try/except block inside the for loop over notebooks that catches only DuplicateCellError.
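The control flow looks roughly like this (a sketch, not the actual convert_notebooks code; the convert and log callables are hypothetical parameters):

```python
class DuplicateCellError(Exception):
    """Stand-in for nbgrader's DuplicateCellError."""

def convert_notebooks(notebooks, convert, log):
    """Grade every notebook; a duplicate-cell error skips only that notebook."""
    failed = []
    for nb in notebooks:
        try:
            convert(nb)
        except DuplicateCellError as exc:
            log(f"Duplicate cells in {nb}: {exc}")
            failed.append(nb)   # note it, but keep grading the rest
        # any other exception still propagates to the assignment-level handler
    return failed
```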

The changes under nbgraderformat ensure that the required metadata for "unimportant" cells is not hard-coded into DeduplicateIds, so it will be easier to update if the format ever changes.
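In other words, the shape is roughly this (a sketch with invented names; the constant stands in for whatever nbgraderformat exposes):

```python
# Single source of truth for "unimportant cell" metadata, so DeduplicateIds
# does not hard-code it. Names here are illustrative, not nbgraderformat API.
SCHEMA_UNIMPORTANT_CELL = {"solution": False, "grade": False, "locked": False}

def unimportant_metadata() -> dict:
    """Return a fresh copy, so callers cannot mutate the schema constant."""
    return dict(SCHEMA_UNIMPORTANT_CELL)
```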

@github-actions

Binder 👈 Launch a Binder on branch AaltoSciComp/nbgrader/duplicate-cells

@tuncbkose
Author

The failed checks seem to be mostly due to the changed DeduplicateIds behavior. I'll take a closer look at the tests next, both fixing the ones relevant to this PR and making them work in general.

@rkdarst
Member

rkdarst commented Jan 31, 2023

Great explanation! And great work figuring this out. I think your analysis looks good; the main question is what to do to make it reusable for everyone. The question of updating the global metadata is a good one: I don't know how hard that would be to do, or how disruptive.

I find it interesting that nbgrader tries to handle the error of duplicated cells, but just produces a different error, with a worse error message!

Now my biggest thought is, long-term: what's the best way to handle this? Print a warning, or fail completely? If nbgrader removes metadata for all later cells, will that fail autograding? Or is it that people are adding metadata that isn't needed (e.g. giving cells an ID when they only contain supporting material that may as well be split out)? That makes me wonder: should we recommend not giving cells an ID or any nbgrader metadata when there is no purpose in doing so? Should it be configurable, with a command-line option to fail or not? Should it check whether the cell is an important type (autograded solution, etc.) and fail if such a cell is duplicated, but not if it is an unimportant type?

These are all questions that the bigger nbgrader community could answer, too...

@Gehock Gehock force-pushed the dev-2023 branch 2 times, most recently from 2340d59 to 5b7acf9 Compare July 18, 2023 10:11
