When working with version control systems to maintain code, it is possible to want different versions of a file or even a whole repository. An example of this would be a teacher using GitHub to manage separate template repositories (repo) for code assignments, solutions, and unit tests.
Managing solutions and assignments separately becomes an issue when updates are made. If the assignment is updated, the solution and tests also have to be updated, and the teacher has to ensure that all repos are compatible, we call this drift. Working with separate repos simultaneously can be a pain as the teacher has to jump around and make changes in different places, essentially cluttering one's workplace.
To combat the issues mentioned above, we have developed Sanitizer, a plug-in for RepoBee that
allows the user to manage their assignments and solutions inside a single repository.
Sanitizer adds the commands sanitize file and sanitize repo that lets the user sanitize
files and git repositories. Sanitizer works by removing or replacing text by going through
files as text, looking for a certain markup described below. The most simple usage
consists of a start and end marker, where content between these two markers will be
removed, or "sanitized" as it were.
This solution allows teachers to safely work inside a single repository, without the chance of solutions reaching students when creating student repositories (human error not accounted for). The problem of drift is also removed, and as a bonus, teachers can easily create their assignments using solution driven development.
Use RepoBee's plugin manager to install and activate.
$ repobee plugin install
$ repobee plugin activate # persistent activation
Sanitizer only adds new commands, so we recommend activating it persistently after installing. For general instructions on installing and using plugins, see RepoBee's plugin docs.
For Sanitizer to work, marker syntax must be correct, this includes
spelling of the markers themselves, the markers currently are as follows:
- REPOBEE-SANITIZER-START
- REQUIRED: A block is not a block without a start marker
- Indicates the start of a block. Any text will be removed until reaching
a
REPLACE-WITHorENDmarker.
- REPOBEE-SANITIZER-REPLACE-WITH
- OPTIONAL: but requires a start and end block.
- Any text between this marker and the next
ENDmarker will remain.
- REPOBEE-SANITIZER-END
- REQUIRED: Must exist for each start block
- Indicates the end of a block.
- REPOBEE-SANITIZER-SHRED
- OPTIONAL: Can only exist on the first line of a file. If this exists, there cannot be any other markers of any type in the file
- Having this marker will remove the entire file when running the
sanitize-repoorsanitize-filecommands
If a marker is incorrectly spelled, repobee-sanitizer will report an error.
Consider the following code:
class StackTest {
@Test
public void topIsLastPushedValue() {
REPOBEE-SANITIZER-START
// Arrange
int value = 1338;
// Act
emptyStack.push(value);
stack.push(value);
int emptyStackTop = emptyStack.top();
int stackTop = stack.top();
// Assert
assertThat(emptyStackTop, equalTo(value));
assertThat(stackTop, equalTo(value));
REPOBEE-SANITIZER-END
}
}Example 1: The simplest usage of
Sanitizerusing a .java file
For this .java test file, the santize-file command will identify the START
and END markers, and proceed to remove the code between the markers. The result will look
like this:
class StackTest {
@Test
public void topIsLastPushedValue() {
}
}Sanitizer also supports the REPOBEE-SANITIZER-REPLACE-WITH marker.
By adding a replace marker, we can specify code that should replace the removed
code. Example as follows:
class StackTest {
@Test
public void topIsLastPushedValue() {
REPOBEE-SANITIZER-START
// Arrange
int value = 1338;
// Act
emptyStack.push(value);
stack.push(value);
int emptyStackTop = emptyStack.top();
int stackTop = stack.top();
// Assert
assertThat(emptyStackTop, equalTo(value));
assertThat(stackTop, equalTo(value));
REPOBEE-SANITIZER-REPLACE-WITH
fail("Not implemented");
REPOBEE-SANITIZER-END
}
}Example 2: The code is the same as for example 1, but we have added a
REPOBEE-SANITIZER-REPLACE-WITHmarker.
As we can see in Example 2, this lets us provide two versions of a function,
one that is current, and one that will replace it. Example 1 and 2 shows us a
piece of code used in the KTH course DD1338. This code is part of an assignment
where students are asked to implement a test function. The example shows a
finished solution that is available to the teachers of the course. However,
because of Sanitizer and the REPLACE-WITH marker, the code can be
reduced to the following:
class StackTest {
@Test
public void topIsLastPushedValue() {
fail("Not implemented");
}
}Example 3: Sanitized code that is provided to students.
We can see that the only code that remains inside the function is that of the
REPLACE-WITH marker. This gives us the main usage that Sanitizer was
developed for, it allows us to combine finished solutions with the
"skeletonized" solutions that are provided to students.
Sometimes (usually) we want code that can run, its a good thing then that
Sanitizer blocks can be commented out! Example 2 produces the same output as the following:
class StackTest {
@Test
public void topIsLastPushedValue() {
//REPOBEE-SANITIZER-START
// Arrange
int value = 1338;
// Act
emptyStack.push(value);
stack.push(value);
int emptyStackTop = emptyStack.top();
int stackTop = stack.top();
// Assert
assertThat(emptyStackTop, equalTo(value));
assertThat(stackTop, equalTo(value));
//REPOBEE-SANITIZER-REPLACE-WITH
//fail("Not implemented");
//REPOBEE-SANITIZER-END
}
}Example 4: Java code with
Sanitizerrelated syntax commented out
Sanitizer automatically detects if there is a prefix in front of any
markers. This way we can have java comments: //, python comments: # or
similar preceding our markers. This means code can still compile!
Prefixes are valid for a single block, and are defined on the same line as the REPOBEE-SANITIZER-START marker of that block. The prefix is defined as all characters occurring before the START marker, without leading and trailing whitespace. Whitespace within the prefix, such as in / /, counts as part of the prefix.
If a prefix is defined in a block, all lines in the remainder of the block that contain a marker, as well as all lines in REPLACE blocks, must contain the prefix before any other non-whitespace characters. There is no limit to the amount of whitespace that may appear before or after the prefix, however. All whitespace is also preserved when sanitizing, and only the first occurrence of a prefix on each line is removed, allowing code comments inside the REPLACE block to be preserved.
Sanitizer supports two main commands: sanitize file and sanitize repo
repobee sanitize file <infile> <outfile> performs the sanitize operation described below directly on a file infile and writes the output to outfile.
Running the following command will sanitize input.txt (given that it exists) and create the file sanitized.txt containing, you guessed it, the sanitized file. sanitized.txt will be overwritten if it already exists.
$ repobee sanitize file input.txt sanitized.txt
The --strip flag can also be used to reverse the operation, instead removing all Sanitizer syntax from the file.
sanitize repo performs the sanitization protocol on an entire repository. It's most basic usage looks like so:
$ repobee sanitize repo --no-commit
Assuming that you're currently at the root of a Git repository, this will sanitize the current branch without making a commit.
If you are not currently at the root of a repository, a path can be specified using the --repo-root <path> option.
Another important feature is working with branches, Sanitizer was essentially made to manage differing branches. For an example, a repo can have a solutions and a main branch, where the solutions branch contains an entire solution to an assignment, as well as Sanitizer markers (specified below). When we then sanitize the solutions branch, we create a slimmed-down version of our repo (that we would send to students), what we can then do is to specify which branch we want the sanitized version to end up on.
This is done using the --target-branch <branch-name> option. For example, if our repo is checked out to the branch solutions (that contains full solutions and Sanitizer markers), Running:
$ repobee sanitize repo --target-branch main
will sanitize the currently checked out branch (in this case solutions) and commit the result to the specified branch, in this case main. Successfully using the --target-branch feature allows us to essentially retain two concurrent repositories while only having to update one of them should any changes or improvements be made to our course tasks.
If your target branch is the main branch of your repository, it can be quite intimidating to force commit any changes to that branch. Therefore, we have added the --create-pr-branch (or -p) flag to Sanitizer that will create a new branch from the one specified by --target-branch and sanitize to the new branch from where a pull request can be created. This way, you can lock your main branch on your git platform, to ensure that your code is safe when using Sanitizer
Sanitizer does its best to ensure that nothing breaks while sanitizing. Therefore, when sanitizing a repo, Sanitizer will always make sure you have no uncommitted files in the repo as well as no syntax errors in your Sanitizer syntax. If you do have an error, Sanitizer will not sanitize anything until you fix any errors and run the command again. Sanitizer even prevents you from committing if no changes will be made to the repo!
If you are completely and utterly sure that computers are stupid things and that you are a far superior being, you may use the --force flag to ignore any warnings related to uncommitted files/changes in the repo (Jokes aside this is necessary sometimes, like if you want to commit when no changes were made).
See LICENSE for details.
