1 | | -How to contribute |
2 | | ------------------ |
| 1 | +This document describes the workflow for contributing to the openml-python package. |
| 2 | +If you are interested in connecting a machine learning package with OpenML (i.e. |
| 3 | +write an openml-python extension) or want to find other ways to contribute, see [this page](https://openml.github.io/openml-python/master/contributing.html#contributing). |
3 | 4 |
|
4 | | -The preferred workflow for contributing to the OpenML python connector is to |
| 5 | +Scope of the package |
| 6 | +-------------------- |
| 7 | + |
| 8 | +The scope of the OpenML Python package is to provide a Python interface to |
| 9 | +the OpenML platform which integrates well with Python's scientific stack, most |
| 10 | +notably [numpy](http://www.numpy.org/), [scipy](https://www.scipy.org/) and |
| 11 | +[pandas](https://pandas.pydata.org/). |
| 12 | +To reduce opportunity costs and demonstrate the usage of the package, it also |
| 13 | +implements an interface to the most popular machine learning package written |
| 14 | +in Python, [scikit-learn](http://scikit-learn.org/stable/index.html). |
| 15 | +Through this interface, it is automatically compatible with the many machine |
| 16 | +learning libraries written in Python that follow the scikit-learn API. |
| 17 | + |
| 18 | +We aim to keep the package as lightweight as possible and to |
| 19 | +keep the number of potential installation dependencies as low as possible. |
| 20 | +Therefore, the connection to other machine learning libraries such as |
| 21 | +*pytorch*, *keras* or *tensorflow* should not be done directly inside this |
| 22 | +package, but in a separate package using the OpenML Python connector. |
| 23 | +More information on OpenML Python connectors can be found [here](https://openml.github.io/openml-python/master/contributing.html#contributing). |
| 24 | + |
| 25 | +Reporting bugs |
| 26 | +-------------- |
| 27 | +We use GitHub issues to track all bugs and feature requests; feel free to |
| 28 | +open an issue if you have found a bug or wish to see a feature implemented. |
| 29 | + |
| 30 | +It is recommended to check that your issue complies with the |
| 31 | +following rules before submitting: |
| 32 | + |
| 33 | +- Verify that your issue is not currently being addressed by other |
| 34 | + [issues](https://github.com/openml/openml-python/issues) |
| 35 | + or [pull requests](https://github.com/openml/openml-python/pulls). |
| 36 | + |
| 37 | +- Please ensure all code snippets and error messages are formatted in |
| 38 | + appropriate code blocks. |
| 39 | + See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks). |
| 40 | + |
| 41 | +- Please include your operating system type and version number, as well |
| 42 | + as your Python, openml, scikit-learn, numpy, and scipy versions. This information |
| 43 | + can be found by running the following code snippet: |
| 44 | +```python |
| 45 | +import platform; print(platform.platform()) |
| 46 | +import sys; print("Python", sys.version) |
| 47 | +import numpy; print("NumPy", numpy.__version__) |
| 48 | +import scipy; print("SciPy", scipy.__version__) |
| 49 | +import sklearn; print("Scikit-Learn", sklearn.__version__) |
| 50 | +import openml; print("OpenML", openml.__version__) |
| 51 | +``` |
| 52 | + |
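If one of the packages is not installed, the snippet above stops at the failing import. A minimal sketch of a more forgiving variant follows; the helper name `collect_versions` is hypothetical and not part of openml-python, and it assumes the packages are installed under the distribution names `numpy`, `scipy`, `scikit-learn`, and `openml`.

```python
import platform
import sys
from importlib import metadata


def collect_versions(packages=("numpy", "scipy", "scikit-learn", "openml")):
    """Return a multi-line report of platform, Python, and package versions.

    `collect_versions` is a hypothetical helper, not an openml-python API.
    """
    lines = [platform.platform(), "Python " + sys.version.replace("\n", " ")]
    for name in packages:
        try:
            # Look up the installed distribution without importing the package.
            lines.append(f"{name} {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Report missing packages instead of raising, so the report stays complete.
            lines.append(f"{name} not installed")
    return "\n".join(lines)


print(collect_versions())
```

The printed report can be pasted into an issue inside a code block.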
| 53 | +Determine what contribution to make |
| 54 | +----------------------------------- |
| 55 | +Great! You've decided you want to help out. Now what? |
| 56 | +All contributions should be linked to issues on the [GitHub issue tracker](https://github.com/openml/openml-python/issues). |
| 57 | +For new contributors in particular, the *good first issue* label should help you find |
| 58 | +issues which are suitable for beginners. Resolving these issues allows you to start |
| 59 | +contributing to the project without much prior knowledge. Your assistance in this area |
| 60 | +will be greatly appreciated by the more experienced developers as it helps free up |
| 61 | +their time to concentrate on other issues. |
| 62 | + |
| 63 | +If you find a part of the documentation or code that you want to improve, |
| 64 | +but there is no related open issue yet, open one first. |
| 65 | +This lets you get feedback or pointers from experienced contributors before you start working. |
| 66 | + |
| 67 | +To let everyone know you are working on an issue, please leave a comment that states you will work on the issue |
| 68 | +(or, if you have permission, *assign* yourself to the issue). This avoids duplicate work! |
| 69 | + |
| 70 | +General git workflow |
| 71 | +-------------------- |
| 72 | + |
| 73 | +The preferred workflow for contributing to openml-python is to |
5 | 74 | fork the [main repository](https://github.com/openml/openml-python) on |
6 | 75 | GitHub, clone, check out the branch `develop`, and develop on a new |
7 | 76 | branch. Steps: |
@@ -109,75 +178,79 @@ following rules before you submit a pull request: |
109 | 178 | - If any source file is being added to the repository, please add the BSD 3-Clause license to it. |
110 | 179 |
|
111 | 180 |
|
112 | | -You can also check for common programming errors with the following |
113 | | -tools: |
114 | | - |
115 | | -- Code with good unittest **coverage** (at least 80%), check with: |
116 | | - |
| 181 | +First install openml with its test dependencies by running |
117 | 182 | ```bash |
118 | | - $ pip install pytest pytest-cov |
119 | | - $ pytest --cov=. path/to/tests_for_package |
| 183 | + $ pip install -e .[test] |
120 | 184 | ``` |
121 | | - |
122 | | -- No style warnings, check with: |
123 | | - |
| 185 | +from the repository folder. |
| 186 | +Then configure pre-commit through |
| 187 | + ```bash |
| 188 | + $ pre-commit install |
| 189 | + ``` |
| 190 | +The `pip install` command above installs the dependencies needed to run the unit tests, as well as [pre-commit](https://pre-commit.com/). |
| 191 | +To run the unit tests and check their code coverage, run: |
124 | 192 | ```bash |
125 | | - $ pip install flake8 |
126 | | - $ flake8 --ignore E402,W503 --show-source --max-line-length 100 |
| 193 | + $ pytest --cov=. path/to/tests_for_package |
127 | 194 | ``` |
128 | | - |
129 | | -- No mypy (typing) issues, check with: |
130 | | - |
| 195 | +Make sure your code has good unit test **coverage** (at least 80%). |
| 196 | + |
| 197 | +Pre-commit is used for style checking and code formatting. |
| 198 | +Before each commit, it will automatically run: |
| 199 | + - [black](https://black.readthedocs.io/en/stable/), a code formatter. |
| 200 | + This will automatically format your code. |
| 201 | + Make sure to take a second look after any formatting takes place; |
| 202 | + if the resulting code is very bloated, consider a (small) refactor. |
| 203 | + *Note*: If black reformats your code, the commit will automatically be aborted. |
| 204 | + Make sure to add the formatted files (back) to your commit after checking them. |
| 205 | + - [mypy](https://mypy.readthedocs.io/en/stable/), a static type checker. |
| 206 | + In particular, make sure each function you work on has type hints. |
| 207 | + - [flake8](https://flake8.pycqa.org/en/latest/index.html), a style guide enforcement tool. |
| 208 | + Almost all black-formatted code should automatically pass this check, |
| 209 | + but make sure to make adjustments if it does fail. |
| 210 | + |
| 211 | +If you want to run the pre-commit checks without making a commit, run: |
131 | 212 | ```bash |
132 | | - $ pip install mypy |
133 | | - $ mypy openml --ignore-missing-imports --follow-imports skip |
| 213 | + $ pre-commit run --all-files |
134 | 214 | ``` |
| 215 | +Make sure to do this at least once before your first commit to check that your setup works. |
135 | 216 |
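To illustrate the kind of type hints mypy checks, here is a minimal sketch. The function `count_labels` is a hypothetical example and not part of openml-python; it only shows annotated parameters, an annotated return type, and an annotated local variable.

```python
from typing import Dict, List, Optional


def count_labels(labels: List[str], ignore: Optional[str] = None) -> Dict[str, int]:
    """Count how often each label occurs, optionally skipping one label.

    Hypothetical example function used only to demonstrate type hints.
    """
    counts: Dict[str, int] = {}
    for label in labels:
        if label == ignore:
            # Skip the label the caller asked to ignore.
            continue
        counts[label] = counts.get(label, 0) + 1
    return counts
```

With hints like these, mypy can flag a call such as `count_labels("abc", ignore=3)` before the code runs.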
|
136 | | -Filing bugs |
137 | | ------------ |
138 | | -We use GitHub issues to track all bugs and feature requests; feel free to |
139 | | -open an issue if you have found a bug or wish to see a feature implemented. |
140 | | - |
141 | | -It is recommended to check that your issue complies with the |
142 | | -following rules before submitting: |
143 | | - |
144 | | -- Verify that your issue is not being currently addressed by other |
145 | | - [issues](https://github.com/openml/openml-python/issues) |
146 | | - or [pull requests](https://github.com/openml/openml-python/pulls). |
147 | | - |
148 | | -- Please ensure all code snippets and error messages are formatted in |
149 | | - appropriate code blocks. |
150 | | - See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks). |
| 217 | +Executing a specific unit test can be done by specifying the module, test case, and test. |
| 218 | +To obtain a hierarchical list of all tests, run |
151 | 219 |
|
152 | | -- Please include your operating system type and version number, as well |
153 | | - as your Python, openml, scikit-learn, numpy, and scipy versions. This information |
154 | | - can be found by running the following code snippet: |
| 220 | + ```bash |
| 221 | + $ pytest --collect-only |
| 222 | +
|
| 223 | + <Module 'tests/test_datasets/test_dataset.py'> |
| 224 | + <UnitTestCase 'OpenMLDatasetTest'> |
| 225 | + <TestCaseFunction 'test_dataset_format_constructor'> |
| 226 | + <TestCaseFunction 'test_get_data'> |
| 227 | + <TestCaseFunction 'test_get_data_rowid_and_ignore_and_target'> |
| 228 | + <TestCaseFunction 'test_get_data_with_ignore_attributes'> |
| 229 | + <TestCaseFunction 'test_get_data_with_rowid'> |
| 230 | + <TestCaseFunction 'test_get_data_with_target'> |
| 231 | + <UnitTestCase 'OpenMLDatasetTestOnTestServer'> |
| 232 | + <TestCaseFunction 'test_tagging'> |
| 233 | + ``` |
155 | 234 |
|
156 | | - ```python |
157 | | - import platform; print(platform.platform()) |
158 | | - import sys; print("Python", sys.version) |
159 | | - import numpy; print("NumPy", numpy.__version__) |
160 | | - import scipy; print("SciPy", scipy.__version__) |
161 | | - import sklearn; print("Scikit-Learn", sklearn.__version__) |
162 | | - import openml; print("OpenML", openml.__version__) |
163 | | - ``` |
| 235 | +You may then run a specific module, test case, or unit test respectively: |
| 236 | +```bash |
| 237 | + $ pytest tests/test_datasets/test_dataset.py |
| 238 | + $ pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest |
| 239 | + $ pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest::test_get_data |
| 240 | +``` |
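The identifiers passed to pytest above are module path, test case, and test name joined by `::`. The sketch below builds such an identifier and runs it from Python; the helper names `build_node_id` and `run_tests` are hypothetical, and it assumes pytest is installed in the current environment.

```python
import subprocess
import sys
from typing import List, Optional


def build_node_id(module: str, test_case: Optional[str] = None, test: Optional[str] = None) -> str:
    """Join a module path, test case, and test name with '::' (hypothetical helper)."""
    parts = [module]
    if test_case is not None:
        parts.append(test_case)
    if test is not None:
        parts.append(test)
    return "::".join(parts)


def run_tests(node_id: str, extra_args: Optional[List[str]] = None) -> int:
    """Run pytest on one node id with the current interpreter and return its exit code."""
    command = [sys.executable, "-m", "pytest", node_id] + list(extra_args or [])
    return subprocess.run(command).returncode
```

For example, `run_tests(build_node_id("tests/test_datasets/test_dataset.py", "OpenMLDatasetTest", "test_get_data"))` corresponds to the third command above.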
164 | 241 |
|
165 | | -New contributor tips |
166 | | --------------------- |
| 242 | +*NOTE*: If the examples build fails during the online Continuous Integration tests, please |
| 243 | +fix the first failing example first. If that example switched the server from live to test |
| 244 | +or vice versa, and the subsequent examples expect the other server, those examples will fail |
| 245 | +to build as well. |
167 | 246 |
|
168 | | -A great way to start contributing to openml-python is to pick an item |
169 | | -from the list of [Good First Issues](https://github.com/openml/openml-python/labels/Good%20first%20issue) |
170 | | -in the issue tracker. Resolving these issues allow you to start |
171 | | -contributing to the project without much prior knowledge. Your |
172 | | -assistance in this area will be greatly appreciated by the more |
173 | | -experienced developers as it helps free up their time to concentrate on |
174 | | -other issues. |
| 247 | +Happy testing! |
175 | 248 |
|
176 | 249 | Documentation |
177 | 250 | ------------- |
178 | 251 |
|
179 | 252 | We are glad to accept any sort of documentation: function docstrings, |
180 | | -reStructuredText documents (like this one), tutorials, etc. |
| 253 | +reStructuredText documents, tutorials, etc. |
181 | 254 | reStructuredText documents live in the source code repository under the |
182 | 255 | doc/ directory. |
183 | 256 |
|