diff --git a/README.md b/README.md index 45e75e9..d051bcc 100644 --- a/README.md +++ b/README.md @@ -1,24 +1,25 @@ # Reproducible Research -Booklet started with the Sigma team. Software engineering techniques applied to the reproductibility of scientific experiences and projects. +Software engineering techniques applied to the reproducibility of scientific experiments and projects. + +This booklet started with the [SigMA](https://www.cristal.univ-lille.fr/?rubrique29&eid=30) team. ## Building this book This booklet is build with [the latest version of pillar](https://github.com/pillar-markup/pillar/tree/newpipeline). - To build this book, follow the installation notes of the "newpipeline" branch. - - Make sure you have a latex installation with all required dependencies. + - Make sure you have a LaTeX installation with all required dependencies. - Finally on the root of the repository: ```bash $ pillar build pdf ``` -And you may have your results in the "_result/pdf" directory. +And you may have your results in the `_result/pdf` directory. ## About Pillar Markup Syntax The pillar markup syntax has been described in the following [cheatsheet](https://www.cheatography.com/benjaminvanryseghem/cheat-sheets/pillar/). -More info can be found in the pillar tutorial here: -https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/PillarChap/Pillar.html \ No newline at end of file +More info can be found in the [pillar tutorial](https://ci.inria.fr/pharo-contribution/job/EnterprisePharoBook/lastSuccessfulBuild/artifact/book-result/PillarChap/Pillar.html). \ No newline at end of file diff --git a/continuous_integration.pillar b/continuous_integration.pillar index a0de1f9..0f532d2 100644 --- a/continuous_integration.pillar +++ b/continuous_integration.pillar @@ -4,27 +4,27 @@ This is always good practice to test some features of your application e.g. some functions, methods, etc. 
And it is even better practice to guaranty the project will run under a given configuration i.e. a given OS' and associated packages' version. -Imagine you are developing on a Mac machine but wish your application to be run on Linux, too. Besides, your project may rely on many dependencies and some could even require to be compiled beforehand. +Now, imagine you are developing on a Mac machine but wish your application to be run on Linux, too. Besides, your project may rely on many dependencies and some could even require to be compiled beforehand. -Continuous integration tools e.g. *Travis>https://travis-ci.com/* or *Jenkins>https://jenkins.io/* will allow to automate these tasks and certify that your project can not only be run on your own machine but also under different configurations i.e. it is portable and your code is reproducible. +Continuous integration tools e.g. *Travis>https://travis-ci.com/* or *Jenkins>https://jenkins.io/* will allow you to automate these tasks and certify that your project can not only be run on your own machine but also under different configurations i.e. it is portable and your code is reproducible. -Once the configuration is set you can run a bunch of tests on your code and report the results to identify some bugs at different scales. +Once the configuration is set you can run a bunch of tests on your code and report the results to identify some bugs at different levels and scales. +This whole list of operations can be run after every push or pull request. -All this is performed on an remote machine that may be physically far distant from your personal computer. -You can configure the integration tool to execute this whole list of operations after each commit :) +@@note Keep in mind that all this is performed on a remote machine that may be physically distant from your personal computer. In this booklet we consider *Travis>https://travis-ci.com/* !!! 
Configuration of Travis -The associated configuration file ==.travis.yml== collects all info required for testing the project on various configurations +The associated configuration file ==.travis.yml== collects all the information required for testing the project on various configurations [[[language=yaml language: python sudo: require # In case you need to use apt-get -matrix: # matrix os/Python version +matrix: # matrix OS/Python version include: ## Linux - os: linux @@ -45,12 +45,13 @@ install: - pip install . # Install the project ]]] -@@note For more detailed information see the ''Getting started'' *tutorial>https://docs.travis-ci.com/user/getting-started/* +@@note For more detailed information see the Travis ''Getting started'' *tutorial>https://docs.travis-ci.com/user/getting-started/* !!! Usecases of Travis +@continuous_integration_travis_usecases - Check installation of the project of several machine configurations -- Test parts of code -- Build LaTex project +- Run unit tests and/or scripts +- Attach the ==.pdf== file to a ==tag== on GitHub after building the corresponding LaTeX project using Travis See section *@workflows_for_researchers* \ No newline at end of file diff --git a/figures/initialization_reproducible_paper.png b/figures/initialization_reproducible_paper.png new file mode 100644 index 0000000..cc9dd94 Binary files /dev/null and b/figures/initialization_reproducible_paper.png differ diff --git a/git_practical.pillar b/git_practical.pillar index 6a5e3db..432050c 100644 --- a/git_practical.pillar +++ b/git_practical.pillar @@ -99,26 +99,32 @@ $ git reset --hard $ git clean -df ]]] -The ==-d== option removes untracked directories in addition to untracked files, while the ==-f== option is a shortcut ==--force==, forcing the corresponding deletions. +The ==\-d== option removes untracked directories in addition to untracked files, while the ==\-f== option is a shortcut for ==\-\-force==, forcing the corresponding deletions. 
-The reason for needing two commands instead of one relies on the fact that Git has several staging areas (such as the ones used to keep the tracked files), which we usually would like to clean when we discard the repository. Of course, experienced readers may search why they would need both in Git's documentation. +The reason for needing two commands instead of one is that Git has several staging areas (such as the ones used to keep the tracked files), which we usually would like to clean when we discard the repository. +Of course, experienced readers may look up in Git's documentation why both are needed. @@todo what about *this>https://stackoverflow.com/a/21718540* ? !!! Ignoring files (100%) @gitignore -Many times we will find that we do not want to commit some files that are in our repository's directory. This is mostly the case of generated or automatically downloaded files. For example, imagine you have a C project and some makefiles to compile it, generating a binary library. While it would be good to store the result of compilation from time to time, storing it in a Git repository (or SVN, or Bazar) may be a cause of headaches. First, as you will see in *@expert_git*, this may be a cause for conflicts. +Many times we will find that we do not want to commit some files that are in our repository's directory. +This is mostly the case of generated or automatically downloaded files. +For example, imagine you have a C project and some makefiles to compile it, generating a binary library. +While it would be good to store the result of compilation from time to time, storing it in a Git repository (or SVN, or Bazaar) may be a cause of headaches. +First, as you will see in *@expert_git*, this may be a cause for conflicts. Second, since we should be able to generated such binary library from the sources, having the already compiled result in the repository does not add so much value. 
-This same ideas can be used to ignore any kind of generated file. For example, pdfs generated by document generation tools, meta-data files generated by IDEs and tools (e.g., Eclipse), compiled libraries (e.g., dll, so, or dylib files). +These same ideas can be used to ignore any kind of generated file. +For example, pdfs generated by document generation tools, meta-data files generated by IDEs and tools (e.g., Eclipse), compiled libraries (e.g., dll, so, or dylib files). In such cases, we can tell Git to ignore cetain files using the ==.gitignore== file. The ==.gitignore== file is an optional text file that we can write in the root of our repository with a list of file paths to ignore. -[[[ +[[[language=bash # Example of .gitignore file - +$ # Lines starting with hashtags are comments # A file name will ignore that file @@ -138,14 +144,15 @@ $ git add .gitignore $ git commit -m "Added gitignore" ]]] -From this moment on, all listed files will be ignored by ==git add== and ==git status==. And you will be able to perform further commands to add "all but ignored files": +From this moment on, all listed files will be ignored by ==git add== and ==git status==. +And you will be able to perform further commands to add "all but ignored files": [[[language=bash $ git add . ]]] @@note If a file or a file type is tracked but you want git to ignore its changes afterward, adding it to .gitignore file will not make the job i.e. git will continue to track it. -To avoid keeping track of it in the future, but secure it locally in your working directory, it must be removed from the tracking list using ==git rm --cached (.)==. +To avoid keeping track of it in the future, but secure it locally in your working directory, it must be removed from the tracking list using ==git rm \-\-cached file #or *.file_type==. Nevertheless, be aware that the file is still present in the past history! 
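The untracking step described in this note can be sketched as follows, in a throwaway repository. The file name `build.log`, the `*.log` pattern, the author identity and the temporary directory are illustrative assumptions, not taken from the booklet.

```shell
set -e

# Work in a throwaway repository (hypothetical example, no real project is touched)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity, local to this repository
git config user.name "Example Author"

# A generated file gets tracked by mistake
echo "generated output" > build.log
git add build.log
git commit -q -m "Add build.log by mistake"

# Listing the pattern in .gitignore is not enough: build.log is already tracked
echo "*.log" > .gitignore

# Remove the file from the index only; the copy in the working directory is kept
git rm -q --cached build.log
git add .gitignore
git commit -q -m "Stop tracking log files"

# build.log is no longer listed among the tracked files
tracked=$(git ls-files)
echo "$tracked"
```

The file stays on disk and later changes to it are ignored, but the first commit still contains it, which is why removing a sensitive file from the history needs a separate procedure.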
@@todo Link to *@expert_git_remove_file* to remove a (sensitive) file from history @@ -168,7 +175,9 @@ Think it this way: you need to tell the server who you are on every interaction Otherwise, Github will reject any operation against your repository. Such a setup requires the creation and uploading of SSH keys. -An SSH key works as a lock: a key is actually a pair of a public and a private key. The private key is meant to reside in your machine and not be published at all. A public key is meant to be shared with others to prove your identity. Whenever you want to prove your identity, SSH will exchange messages encrypted with your public key, and see if you are able to decrypt it using your private key. +An SSH key works as a lock: a key is actually a pair of a public and a private key. +The private key is meant to reside in your machine and not be published at all. +A public key is meant to be shared with others to prove your identity. Whenever you want to prove your identity, SSH will exchange messages encrypted with your public key, and see if you are able to decrypt it using your private key. To create an SSH key, in *nix systems you can simply type in your terminal @@ -176,16 +185,23 @@ To create an SSH key, in *nix systems you can simply type in your terminal $ ssh-keygen -t rsa -b 4096 -C "your_email@some_domain.com" ]]] -Follow the instructions in your terminal such as setting the location for your key pair (usually it is ==$HOME/.ssh==) and the passphrase (a kind of password). Finally, you'll end up with your public/private pair on the selected location. It is now time to upload it to Github. +Follow the instructions in your terminal such as setting the location for your key pair (usually it is ==$HOME/.ssh==) and the passphrase (a kind of password). +Finally, you'll end up with your public/private pair on the selected location. It is now time to upload it to Github. 
-Connect yourself to your Github settings (usually https://github.com/settings/profile) and go to the "SSH and GPG keys" menu. Import there the contents of your public key file. You should be now able to use your repository. +Connect yourself to your Github settings (usually https://github.com/settings/profile) and go to the "SSH and GPG keys" menu. +Import the contents of your public key file there. You should now be able to use your repository. !!! Rewriting the history (Guille 100%) Many times it happens that we accidentally commit something wrong. -Maybe we wanted to commit more or less things, maybe a completely different content, or we did a mistake in the commit's message. In these cases, we can rewrite Git's history, e.g, undo our current commit and go back to the previous commit, or rewrite the current commit with some new properties. +Maybe we wanted to commit more or fewer things, maybe a completely different content, or we made a mistake in the commit's message. +In these cases, we can rewrite Git's history, e.g., undo our current commit and go back to the previous commit, or rewrite the current commit with some new properties. -Be careful! Rewriting the history can have severe consequences. Imagine that the commit you want to undo was already pushed. This means that somebody else could have pulled this commit into her/his repository. If we undo this already publised commit, we are making everybody else's repositories obsolete! This can be indeed problematic depending on the number of users the project has, and their knowledge on Git to be able to solve this issue. +Be careful! Rewriting the history can have severe consequences. +Imagine that the commit you want to undo was already pushed. +This means that somebody else could have pulled this commit into her/his repository. +If we undo this already published commit, we are making everybody else's repositories obsolete! 
+This can be indeed problematic depending on the number of users the project has, and their knowledge on Git to be able to solve this issue. !!!! Undo a commit using ==git reset \-\-hard== @@ -195,9 +211,11 @@ To undo the last commit, it is as easy as: $ git reset --hard HEAD~1 ]]] -==git reset \-\-hard [commitish]== makes your current branch point to [commitish]. ==HEAD== is your current head, and you can read ==~1== as \"minus one\". In other words, ==HEAD~1== is head minus one, which boils down to the parent of head, our previous commit. +==git reset \-\-hard [commitish]== makes your current branch point to [commitish]. ==HEAD== is your current head, and you can read ==~1== as \"minus one\". +In other words, ==HEAD~1== is head minus one, which boils down to the parent of head, our previous commit. -You can use this same trick to rewrite the history in any other way, since you can use any commitish expression to reset. For example, ==HEAD~17== means 17 versions before head, or ==someBranch~4== means four commits before the branch ==someBranch==. +You can use this same trick to rewrite the history in any other way, since you can use any commitish expression to reset. +For example, ==HEAD~17== means 17 versions before head, or ==someBranch~4== means four commits before the branch ==someBranch==. !!!! Update a commit's message using ==git commit \-\-amend== @@ -213,12 +231,15 @@ Or, if you don't use the ==\-m== option, a text editor will be prompt so you can $ git commit --amend ]]] -You can use the same trick not only to modify a commit's message but to modify your entire commit. Actually, just adding new things with ==git add== before an ==\-\-amend== will replace the current commit with a new commit merging the previous commit changes with what you just added. +You can use the same trick not only to modify a commit's message but to modify your entire commit. 
+Actually, just adding new things with ==git add== before an ==\-\-amend== will replace the current commit with a new commit merging the previous commit changes with what you just added. + +!!! How to overwrite/modify commits (Guillaume 100%) -!!! How to overwrite/modify commits (Guillaume 80%) +@@todo There's some overlap between this part and the previous one. Need to merge the 2 at some point WARNING: It is highly not recommended to rewrite the history of a repo especially when part of it has already been pushed to a remote. -Modifying the history will most likely break the history shared by the different collaborators and you may deal with an inextricable merge conflict. +Modifying the history will most likely break the history shared by the different collaborators and you may deal with inextricable merge conflicts. !!!! Change the last commit @@ -254,7 +275,7 @@ f039832 Old cca92f1 Even older ]]] -Then, you can interactively ==-i== focus on the last three ==HEAD~3== commit. +Then, you can interactively ==\-i== focus on the last three ==HEAD~3== commits. [[[language=bash $ git rebase -i HEAD~3 @@ -264,7 +285,7 @@ pick 71c0c64 Intermidiate pick eae7846 New ]]] -@@note Observe that the commits are displayed in the reversed order. +@@note The list of commits is displayed in the reverse order of ==git log==, i.e. oldest first. Now you can squash the commit "Intermediate" into its parent commit "Old" @@ -302,7 +323,9 @@ cca92f1 Even older !!!! Pushing rewritten history -As soon as the history we have rewritten was never pushed before, we can continue working normally and pushing our changes then without problems. However, if we have already pushed the commit we want to undo, this means that we are potentially impacting all users of our repository. Because of the problems it can pose to other people, pushing a rewritten history is not a completely favoured by Git. 
Better said, it is not allowed by default and you'll be warned about it: +As long as the history we have rewritten has not been pushed yet, we can continue working normally and then think about pushing our changes with peace of mind. +However, if we have already pushed the commit we want to undo, we must face the dilemma of rewriting the history at the cost of potentially breaking other users' history, which would cause them a lot of trouble (merge conflicts, etc.). +By default, ==git== does not allow this behaviour and warns you about it: [[[ $ git push @@ -315,14 +338,14 @@ hint: 'git pull ...') before pushing again. hint: See the 'Note about fast-forwards' in 'git push --help' for details. ]]] -With this message Git means that you should not blindly overwrite the history. +With this message ==git== means that you should not blindly overwrite the history. Also, it suggests to pull changes from the remote repository. However, doing that will bring back to our repository the history we wanted to undo! What we want to do is to impose our current (undone) state in the remote repository. To do that, we need to ""force"" the push using the ==git push \-\-force== or the ==git push \-f== option. [[[ -$git push -f +$ git push -f Total 0 (delta 0), reused 0 (delta 0) To git@github.com:REPOSITORY_OWNER/YOUR_REPOSITORY.git + a1713f3...6e0c7bf YOUR_BRANCH -> YOUR_BRANCH (forced update) diff --git a/project_management.pillar b/project_management.pillar index 5eaeb78..75141d5 100644 --- a/project_management.pillar +++ b/project_management.pillar @@ -4,17 +4,18 @@ @private_repos Firstly, private repository does not entail, by any means, non collaborative repository. -They work exactly the same as public repositories except commits are only available to select collaborators. +They work exactly the same as public repositories except that they are only accessible to select collaborators. 
In the early stages, your project may be geared towards a very specific goal and you may not want to display the incremental developments to the public but simply collaborate privately with some people. From this perspective, private repositories can be seen as hidden development branches of the project. -@@note You can always turn a public repository into a private one and vice versa -@@note Free private repositories for academics -@@todo Remark on Travis for private repos -@@todo Warning that when the repo goes public, all previous commits will be accessible to anyone, so be politically correct. +@@note You can always turn a public repository into a private one and vice versa. Nevertheless, when the repo goes public, all previous commits will be accessible to anyone... -!!!Setting +@@note Academics can enjoy unlimited: +-private GitHub repositories +-Travis builds for public GitHub repositories + +!!!!Setting As a first step, you could create a public repository that may simply be used as a showcase for the project, for the world to know you are working on something and be the place where you provide stable releases. @@ -23,7 +24,7 @@ As a first step, you could create a public repository that may simply be used as In parallel, you can develop the project in a private repository. To make releases available once ready, you must add the public repository to the remotes of your private repository. -!!!In practice +!!!!In practice Create the public and private repositories on GitHub. 
@@ -54,15 +55,15 @@ Start tracking the files you just pulled [[[language=bash $ git add -A ]]] -@@note "-A" for tracking all files tracked file +@@note ==-A== stages every change in the working directory, including new (untracked) and deleted files Commit the updated state of the private repository [[[language=bash $ git commit -a -m "Initial synchronization of public and private repositories" ]]] -@@note "-a" for gathering every (tracked) file in the same commit, "-m" for assigning a message to this commit +@@note ==-a== for gathering every (tracked) file in the same commit, ==-m== for assigning a message to this commit. This can be shortened using ==-am==. -Push i.e. save these changes in the history of the private repository +Push i.e. send these changes to the private repository hosted on GitHub [[[language=bash $ git push origin master ]]] @@ -70,12 +71,12 @@ $ git push origin master First stage is complete, public and private repositories are synchronized and both share the same initial history. Your private project can now begin. -Tell GIT not to track and version specific files/folders +Tell ==git== not to track and version specific files or folders. For some reasons you may not want specific folders, files or file types to be revealed to the public world. -One main reason is that some files are pointless to share (e.g. .aux or .log), another reason could be that they are heavy files or because you work with sensitive data that must be kept locally. +One main reason is that some files are pointless to share (e.g. ==.log== files), another reason could be that they are heavy files or because you work with sensitive data that must be kept locally. -For this purpose, you can tell GIT not to track a list of different files/folders. +For this purpose, you can tell ==git== not to track a list of different files/folders (see *@gitignore*). 
Create a .gitignore file [[[language=bash @@ -93,7 +94,7 @@ confidential/ *.ipynb """ -@@note You can find *more examples>https://www.atlassian.com/git/tutorials/saving-changes/gitignore* about .gitignore files. +@@note You can find *more examples>https://www.atlassian.com/git/tutorials/saving-changes/gitignore* about the ==.gitignore== file. @@note You could have used the "Add .gitignore" button at the creation of the repo on GitHub, to automatically exclude common useless files relative a language, e.g. ".log", ".aux" ... with LaTex. Good practice diff --git a/workflows.pillar b/workflows.pillar index 9dcd95b..e376d42 100644 --- a/workflows.pillar +++ b/workflows.pillar @@ -1,49 +1,123 @@ -!! Workflows for researchers (50% Remi, Guillaume) +!! Workflows for researchers (75% Remi, Guillaume) @workflows_for_researchers In this section, we detail example workflows for researchers. !!! Version control in research -Whenever working on code, packaging and versioning it before even writing the code is good practice. Once you are used to using version control tools, it adds minimum overhead while making collaboration, sharing, and testing much easier later on. Also it has been originally thought for code, we also find version control useful for writing papers. At the extreme, you may want to use version control for pretty much everything, even when you are the only one working locally on a project, so as to keep track of changes, try new features on new branches, etc. +Whenever working on code, it is good practice to think of versioning it and ways to package it before even writing the code. +Although it was originally designed for code, we also find version control very useful when writing papers. -!!! Having private/public remotes -When collaboratively writing a scientific paper, you may want to keep the repositories that contain the source material of the paper and the related code private. You can then turn your repositories public when the paper is released. 
We note that explicitly sharing the sources of a paper is not yet common practice in all fields, but platforms like Arxiv nowadays strongly recommend it. +At the extreme, you may want to use version control for pretty much everything, even when you are the only one working locally on a project, so as to keep track of changes, try new features on new branches, etc. -There are also situations where researchers may need a private copy of a public repository. For instance, if you work on a toolbox, implementing various existing algorithms that fall under a common flag, while also developing novel algorithms that you may want to add to the toolbox after the related paper has been published. In that case, you may want to have a private copy of the toolbox, and push your changes to the public remote only once the paper describing the novel stuff is published. +Remember that, on the one hand, ==git== is a versioning tool, used to version your files locally and set milestones on the project with ==tags==. +On the other hand, GitHub is a hosting platform, used to host your project and collaborate with other contributors. -Overall, playing with private and public remotes allows researchers to mimic their usual workflow when working on papers and code. -See *@private_repos* +Once you are acquainted with version control tools like ==git==, testing becomes much easier later on, and collaborating or sharing through GitHub adds minimal overhead. +In particular, GitHub gives ==tags== a greater role than simple markers. +As we will see later, GitHub associates a release with a ==git== ==tag==. -!!! 
Writing a scientific paper +It is worth stressing again that you don't have to push the whole content of your local repository to a GitHub remote: +- you can have ==git== version or track some files of your local repository and ignore others (see *@gitignore*), +- you can have local branches that you do not even push to GitHub. -In that case, your repository contains at least the source text of the paper, and typically also includes figures, style and bibliography files. As always, it is advised not to include final products such as pdfs directly in your repository, only source material that allows any user to clone the repo and produce the outputs himself. +Although you cannot have hidden or private branches on GitHub, you can collaborate with select people through a private GitHub remote and share only specific features via a public one. -You can use GitHub tags to mark important steps in the writing of your paper. For researchers, this typically means labeling the version of the paper that was submitted to an outlet, the version that was accepted, and the camera-ready version. -You can also link a version of the paper with other material: code, a submission on another platform like ArXiv, a demo. -Since papers and supplementary material now tend to spend some time online and undergo changes before a final version is printed or permanently stored by journals, it becomes important to label each version of the paper and its material correctly, so that the community can refer to specific versions when commenting on your research. +!!! Having private/public remotes + +Firstly, it is worth mentioning again that academics can enjoy unlimited private GitHub repositories (see *@private_repos*). +Only people listed as contributors of the project can view it and have the rights to participate. -Travis can be used to compile your LaTex project and output the pdf generated after each Git tags, creating milestones of the project. 
+Overall, playing with private and public remotes allows researchers to mimic their usual workflow when working on papers and code. + +In the early stages of a collaborative project, you may want to work with a private GitHub project to keep your code and findings private but still allow for select collaborators to contribute. +This private GitHub project will play the role of private remote of your local repository. +Then, the two most common situations that can occur are the following. + +-In a first setting, based on the maturity of the project, or more pragmatically based on the status of your scientific submission, you may wish to share the associated source code with the world. +To this end, at any time, you can turn your private repository into a public one. -@note See ==deploy:== to the ==.travis.yml== +@@note Everything you wrote in the past will be made public (a big mathematical blunder, a private joke, etc.), so this might be a bad idea. +In that case, you should rather create a new public repo and push there the latest version of your files. +-In the second setting, you may want to showcase your project publicly but only partially. +In that case, you can create a public GitHub project and add it to the list of remotes of your local repository. +Then, you can push some specific branches including the relevant features for a public showcase on GitHub. +@@todo add drawing showing the links between public/private repos and local computer. -*Here>https://github.com/CRIStAL-Sigma/latex-travis-test* is an example of such a repository. It contains -- a ==README==, -- a ==tex== file, -- a ==scripts== folder, -- a ==.travis.yml== configuration file. +!!! Writing a scientific paper -As always, the ==README== is a description of the content of the repository. The ==tex== file is the source material of the paper, and a minimal version of the repository should contain only this tex file and the ==README==. 
-The ==scripts== folder and the ==yml== file are optional advanced material: we use continuous integration tools to actually compile the paper and provide a pdf as a GitHub release. We will cover continuous integration and come back to this example in Chapter XXX. +!!!! Preliminaries +There are various ways to use tools like ==git== and GitHub when writing a paper, and there is no sanctified good practice yet. +Ideas include using ==git== to version your files locally and GitHub to host your project online and to collaborate with your coauthors. -There are various ways to use tools like git and GitHub when writing a paper, and there is no sanctified good practice yet. Ideas include using GitHub issues to keep track of subtasks and distribute writing to collaborators, and using GitHub versions to mark important steps in the writing of your paper. For researchers, this typically means labeling the version of the paper that was submitted to an outlet, the version that was accepted, and the camera-ready version. -You can also link a version of the paper with other material: code, a submission on another platform like ArXiv, a demo. Since papers and supplementary material now tend to spend some time online and undergo changes before a final version is printed or permanently stored by journals, it becomes important to label each version of the paper and its material correctly, so that the community can refer to specific versions when commenting on your research. -@@note If you turn a private repo into a public one, all your past commits will be public. So if you wrote anything in the past that you do not wish to be made public (a big mathematical blunder, a private joke, etc.), this is probably a bad idea. In that case, you rather should create a new public repo and push there the latest version of your files. -@@note Recall that binary files e.g. ==.pdf== files are listed in the ==.gitignore== file see *@gitignore* to avoid conflicts between users. 
-@@todo reference to git tags section +@@note Explicitly sharing the sources of a paper is not yet common practice in all fields, but platforms like *arXiv>https://arxiv.org* nowadays strongly recommend it. + +In this section we refer to the GitHub project *CRIStAL-Sigma/reproducible-paper >https://github.com/CRIStAL-Sigma/reproducible-paper* which serves as an illustrative example. + +!!!! Useful GitHub features +The main GitHub features you may like and use are: +- GitHub *issues >https://github.com/CRIStAL-Sigma/reproducible-paper/issues* to keep track of subtasks and distribute writing to collaborators. +- GitHub *pull requests >https://github.com/CRIStAL-Sigma/reproducible-paper/pulls* to propose improvements, to fix a bug or a previous issue. Note that each issue and pull request is assigned a unique ID that can be cross-referenced in commit messages, comments, etc. Such references appear in the time line of the related issue or pull request. +- GitHub *tags and releases >https://github.com/CRIStAL-Sigma/reproducible-paper/releases* to mark important steps in the writing of your paper. For researchers, this typically means labeling the version of the paper that was submitted to an outlet, the version that was accepted, and the camera-ready version. You can also link a version of the paper with other material: code, a submission on another platform like ArXiv, a demo. + +!!!! In practice +A minimal version of the repository should only contain source material that allows any user to produce the outputs by himself. + +At the creation of your GitHub project you are suggested to include a basic ==README== file, a ==LICENCE== file and you can also select a ==.gitignore== template that is tailored to your project (see *@gitignore*). 
+ ++Initialization of GitHub project>figures/initialization_reproducible_paper.png|label=fig_initialization_reproducible_paper+ + +@@note As always, it is advised not to track binary files such as ==.pdf== to avoid inextricable merge conflicts later on. + +CRIStAL-Sigma/reproducible-paper/ +├── tex/ +│ ├── paper.tex +│ └── biblio.bib +├── README.md (.rst, .txt) +├── LICENCE +└── .gitignore + +Here, the *`TeX' template >https://github.com/github/gitignore/blob/master/TeX.gitignore* tells ==git== not to track specific file types such as ==.aux==, ==.log==, etc. +However, in this template the line corresponding to ==.pdf== files is commented out by default. +You may have a look at the *==.gitignore== file >https://github.com/CRIStAL-Sigma/reproducible-paper/blob/master/.gitignore* of the illustrative example to see how to avoid tracking the output ==paper.pdf== while still tracking ==.pdf== images. + +CRIStAL-Sigma/reproducible-paper/ +├── tex/ +│ ├── paper.tex +│ ├── biblio.bib +│ └── images/ +│ ├── img_1.pdf +│ └── img_2.jpg +├── README.md (.rst, .txt) +├── LICENCE +└── .gitignore + +In a more advanced setting, you can use *Travis >https://travis-ci.org/CRIStAL-Sigma/reproducible-paper* to compile your LaTeX project and attach the generated ==.pdf== file to a GitHub release automatically in order to create a milestone of the project. +See *@continuous_integration_travis_usecases*. + +@@note For Travis to display the bibliography and hyperlinks, the ==.aux== and ==.bbl== files must be tracked. To this end you can add the corresponding files to the exception list of the *==.gitignore== file >https://github.com/CRIStAL-Sigma/reproducible-paper/blob/master/.gitignore*. 
+ +CRIStAL-Sigma/reproducible-paper/ +├── tex/ +│ ├── paper.tex +│ ├── paper.aux +│ ├── paper.bbl +│ ├── packages.tex +│ ├── commands.tex +│ ├── biblio.bib +│ └── images/ +│ ├── img_1.pdf +│ └── img_2.jpg +├── README.md (.rst, .txt) +├── LICENCE +├── .gitignore +├── scripts/ +│ ├── ensure_latex.sh +│ └── ensure_book_dependencies.sh +└── .travis.yml !!! Packaging scientific code Your repository contains @@ -88,6 +162,10 @@ Next, you must certify that your project can be installed and run by users havin !!!! Certify your project runs +As presented in *@private_repos*, academics can enjoy unlimited: +- private GitHub repositories +- Travis builds for public GitHub repositories + Installation and testing can be automated using a *@continuous_integration* tool e.g. *Travis>https://travis-ci.com/* *Jenkins>https://jenkins.io/*. Here we propose to use *Travis>https://docs.travis-ci.com/user/getting-started/*