@@ -71,105 +71,155 @@ to their interfaces (especially numpy):
7171* IPython / Jupyter: interactive work
7272
7373
74-
7574Core numerics libraries
7675~~~~~~~~~~~~~~~~~~~~~~~
7776
78- * `numpy <https://numpy.org/doc/stable/ >`__ - arrays and array math.
79- * `scipy <https://docs.scipy.org/doc/scipy/reference/ >`__ - software
77+ * `numpy <https://numpy.org/doc/stable/ >`__ - Arrays and array math.
78+ * `scipy <https://docs.scipy.org/doc/scipy/reference/ >`__ - Software
8079 for math, science, and engineering.
8180
8281
83-
8482Plotting
8583~~~~~~~~
8684
87- * `matplotlib <https://matplotlib.org/ >`__ - base plotting package,
85+ * `matplotlib <https://matplotlib.org/ >`__ - Base plotting package,
8886 somewhat low level but almost everything builds on it.
89- * `seaborn <https://seaborn.pydata.org/ >`__ - higher level plotting
87+ * `seaborn <https://seaborn.pydata.org/ >`__ - Higher level plotting
9088 interface; statistical graphics.
89+ * `Vega-Altair <https://altair-viz.github.io/ >`__ - Declarative Python
90+ plotting.
9191* `mayavi <https://docs.enthought.com/mayavi/mayavi/ >`__ - 3D plotting
92- * `PIL <https://python-pillow.org/ >`__ - image manipulation. The
93- original PIL is no longer maintained, the new "Pillow" is a drop-in
94- replacement.
95-
92+ * `Plotly <https://plotly.com/python/ >`__ - Interactive graphing library.
9693
9794
9895Data analysis and other important core packages
9996~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10097
101- * `pandas <https://pandas.pydata.org/docs/user_guide/ >`__ - columnar
102- data analysis
103- * `statsmodels <https://www.statsmodels.org/stable/ >`__ - just what it says
104- * `SymPy <https://www.sympy.org/ >`__ - symbolic math
105- * `networkx <https://networkx.org/ >`__ - graph and network analysis
106- * `h5py <https://www.h5py.org/ >`__ and `PyTables <https://www.pytables.org/ >`__ - interfaces to
107- the `HDF5 <https://en.wikipedia.org/wiki/Hierarchical_Data_Format >`__ on-disk file format
108- * `dateutil <https://dateutil.readthedocs.io/ >`__ and `pytz
109- <https://pythonhosted.org/pytz/> `__ - date arithmetic and handling,
110- timezone database and conversion
111-
98+ * `pandas <https://pandas.pydata.org/docs/user_guide/ >`__ - Columnar
99+ data analysis.
100+ * `polars <https://pola.rs/>`__ - Alternative to pandas that uses a similar
101+ API, but is re-imagined for more speed.
102+ * `Vaex <https://vaex.io/docs/index.html >`__ - Alternative to pandas
103+ that uses a similar API for lazy-loading and processing huge DataFrames.
104+ * `Dask <https://www.dask.org/ >`__ - Alternative to Pandas that uses
105+ similar API and can do analysis in parallel.
106+ * `xarray <https://docs.xarray.dev/en/stable/ >`__ - Framework for
107+ working with multi-dimensional arrays.
108+ * `statsmodels <https://www.statsmodels.org/stable/ >`__ - Statistical
109+ models and tests.
110+ * `SymPy <https://www.sympy.org/ >`__ - Symbolic math.
111+ * `networkx <https://networkx.org/ >`__ - Graph and network analysis.
112+ * `graph-tool <https://graph-tool.skewed.de/ >`__ - Graph and network analysis
113+ toolkit implemented in C++.
112114
113115
114116Interactive computing and human interface
115117~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
116118* Interactive computing
117119
118- * `IPython <https://ipython.org/ >`__ - nicer interactive interpreter
119- * `Jupyter <https://jupyter.org/ >`__ (notebook, lab, hub, ...) -
120- web-based interface to IPython and other languages
120+ * `IPython <https://ipython.org/ >`__ - Nicer interactive interpreter
121+ * `Jupyter <https://jupyter.org/ >`__ - Web-based interface to IPython
122+ and other languages (includes projects such as jupyter notebook,
123+ lab, hub, ...)
121124
122125* Testing
123126
124- * `pytest <https://docs.pytest.org/ >`__ - automated testing interface
127+ * `pytest <https://docs.pytest.org/ >`__ - Automated testing interface
125128
126129* Documentation
127130
128- * `Sphinx <https://www.sphinx-doc.org/ >`__ - documentation generator
131+ * `Sphinx <https://www.sphinx-doc.org/ >`__ - Documentation generator
129132 (also used for this lesson...)
130133
131134* Development environments
132135
133- * `Spyder <https://www.spyder-ide.org/ >`__ - interactive Python
136+ * `Spyder <https://www.spyder-ide.org/ >`__ - Interactive Python
134137 development environment.
138+ * `Visual Studio Code <https://code.visualstudio.com/ >`__ - Microsoft's
139+ flagship code editor.
140+ * `PyCharm <https://www.jetbrains.com/pycharm/ >`__ - JetBrains's
141+ Python IDE.
135142
136143* `Binder <https://mybinder.org/ >`__ - load any git repository in
137144 Jupyter automatically, good for reproducible research
138145
139146
147+ Data format support and data ingestion
148+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
149+
150+ * `pillow <https://python-pillow.org/ >`__ - Image manipulation. The
151+ original PIL is no longer maintained, the new "Pillow" is a drop-in
152+ replacement.
153+ * `h5py <https://www.h5py.org/ >`__ and `PyTables <https://www.pytables.org/ >`__ -
154+ Interfaces to the `HDF5 <https://en.wikipedia.org/wiki/Hierarchical_Data_Format >`__
155+ file format.
156+
140157
141158Speeding up code and parallelism
142159~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
143- * `PyMPI <https://sourceforge.net/projects/pympi/ >`__ - Message
160+
161+ * `MPI for Python (mpi4py) <https://mpi4py.readthedocs.io/en/stable/ >`__ - Message
144162 Passing Interface (MPI) in Python for parallelizing jobs.
145163* `cython <https://cython.org/ >`__ - easily make C extensions for
146164 Python, also interface to C libraries
147165* `numba <https://numba.pydata.org/ >`__ - just in time compiling of
148166 functions for speed-up
149167* `PyPy <https://www.pypy.org/ >`__ - Python written in Python so that
150168 it can internally optimize more.
151- * `Dask <https://www.dask.org/ >`__ - distributed array data structure for
169+ * `Dask <https://www.dask.org/ >`__ - Distributed array data structure for
152170 distributed computation
153- * `Joblib <https://joblib.readthedocs.io/ >`__ - easy embarrassingly
171+ * `Joblib <https://joblib.readthedocs.io/ >`__ - Easy embarrassingly
154172 parallel computing
155- * `IPyParallel <https://ipyparallel.readthedocs.io/ >`__ - easy
156- parallel task engine
173+ * `IPyParallel <https://ipyparallel.readthedocs.io/ >`__ - Easy
174+ parallel task engine.
157175* `numexpr <https://numexpr.readthedocs.io/ >`__ - Fast evaluation of
158176 array expressions by automatically compiling the arithmetic.
159177
160178
161-
162179Machine learning
163180~~~~~~~~~~~~~~~~
164181
165- If you need some machine learning, you probably already know what you
166- need and this list is short and irrelevant.
182+ * `nltk <https://www.nltk.org/ >`__ - Natural language processing
183+ toolkit.
184+ * `scikit-learn <https://scikit-learn.org/ >`__ - Traditional
185+ machine learning toolkit.
186+ * `xgboost <https://xgboost.readthedocs.io/en/stable/ >`__ - Toolkit for
187+ gradient boosting algorithms.
188+
189+
190+ Deep learning
191+ ~~~~~~~~~~~~~
192+
193+ * `tensorflow <https://www.tensorflow.org/ >`__ - Deep learning
194+ library by Google.
195+ * `pytorch <https://pytorch.org/ >`__ - Currently the most popular
196+ deep learning library.
197+ * `keras <https://keras.io/ >`__ - Simple library for doing deep learning.
198+ * `huggingface <https://huggingface.co >`__ - Ecosystem for sharing
199+ and running deep learning models and datasets. Includes packages
200+ like ``transformers``, ``datasets``, ``accelerate``, etc.
201+ * `jax <https://jax.readthedocs.io/en/latest/index.html >`__ - Google's
202+ Python library for running NumPy and automatic differentiation
203+ on GPUs.
204+ * `flax <https://flax.readthedocs.io/en/latest/ >`__ - Neural network
205+ framework built on Jax.
206+ * `equinox <https://docs.kidger.site/equinox/ >`__ - Another neural
207+ network framework built on Jax.
208+ * `DeepSpeed <https://www.deepspeed.ai/ >`__ - Algorithms for running
209+ massive-scale training. Included in many of the frameworks.
210+ * `PyTorch Lightning <https://lightning.ai/docs/pytorch/stable/ >`__ -
211+ Framework for creating and training PyTorch models.
212+ * `Tensorboard <https://www.tensorflow.org/tensorboard/>`__ - Tool
213+ for visualizing model training on a web page.
214+
215+
216+ Other packages for special cases
217+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
218+
219+ * `dateutil <https://dateutil.readthedocs.io/ >`__ and `pytz
220+ <https://pythonhosted.org/pytz/> `__ - Date arithmetic and handling,
221+ timezone database and conversion.
167222
168- - `tensorflow <https://www.tensorflow.org/ >`__
169- - `pytorch <https://pytorch.org/ >`__
170- - `nltk <https://www.nltk.org/ >`__ - natural language processing
171- - `scikit-learn <https://scikit-learn.org/ >`__ - simple tools for
172- predictive data analysis
173223
174224
175225
@@ -198,7 +248,6 @@ support them very well. Read more: `Extending and embedding Python
198248<https://docs.python.org/extending/index.html> `__.
199249
200250
201-
202251Tools for interfacing with other languages
203252~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
204253