Commit e7f552f

Author: Jencir Lee
Message: update README
1 parent 5cf6b82 commit e7f552f

1 file changed: +58 -33 lines changed

README.md

Lines changed: 58 additions & 33 deletions
@@ -1,49 +1,76 @@
 
 # micrograd
+A tiny Autograd engine whose only dependency is NumPy. Implements backpropagation (reverse-mode autodiff) over a dynamically built DAG and a small neural networks library on top of it with a PyTorch-like API. Both are tiny.
 
-![awww](assets/puppy.jpg)
-
-A tiny Autograd engine (with a bite! :)). Implements backpropagation (reverse-mode autodiff) over a dynamically built DAG and a small neural networks library on top of it with a PyTorch-like API. Both are tiny, with about 100 and 50 lines of code respectively. The DAG only operates over scalar values, so e.g. we chop up each neuron into all of its individual tiny adds and multiplies. However, this is enough to build up entire deep neural nets doing binary classification, as the demo notebook shows. Potentially useful for educational purposes.
-
-### Installation
+This version is capable of working with matrices and higher-order tensors. For @karpathy's original scalar-based version, locate the code with tag `scalar`.
 
+## Installation
 ```bash
-pip install micrograd
+python3 -m venv venv
+. venv/bin/activate
+pip install .
 ```
 
-### Example usage
+## Get Started
+```python
+from micrograd import Value
+from numpy import array
+
+a = Value(array([[2, 3], [5, 4]]))
+b = Value(array([1, -1]))
+c = (a @ b).relu()
+print(c) # Value(data=[0 1], grad=None)
+c.backward()
+print(c) # Value(data=[0 1], grad=[1. 1.])
+print(a) # Value(data=..., grad=[[0. 0.], [1. -1.]])
+print(b) # Value(data=..., grad=[5. 4.])
+```
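The gradients printed in the Get Started example can be cross-checked with plain NumPy. The sketch below does not use micrograd at all. It assumes, as the printed `grad=[1. 1.]` suggests, that `backward()` seeds the output gradient with ones, and then applies the chain rule for `relu(a @ b)` by hand.

```python
import numpy as np

a = np.array([[2.0, 3.0], [5.0, 4.0]])
b = np.array([1.0, -1.0])

z = a @ b                      # pre-activation: [-1., 1.]
c = np.maximum(z, 0.0)         # relu: [0., 1.]

grad_c = np.ones_like(c)       # assumed seed gradient of ones
grad_z = grad_c * (z > 0)      # relu passes gradient only where z > 0
grad_a = np.outer(grad_z, b)   # gradient of a @ b with respect to a
grad_b = a.T @ grad_z          # gradient of a @ b with respect to b

print(grad_a)
print(grad_b)
```

The hand-derived `grad_a` and `grad_b` agree with the values shown in the README comments.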
 
-Below is a slightly contrived example showing a number of possible supported operations:
+## Lazy evaluation
+When defining a tensor, one may specify only its `shape` and `name`, and provide the value later.
 
 ```python
-from micrograd.engine import Value
-
-a = Value(-4.0)
-b = Value(2.0)
-c = a + b
-d = a * b + b**3
-c += c + 1
-c += 1 + c + (-a)
-d += d * 2 + (b + a).relu()
-d += 3 * d + (b - a).relu()
-e = c - d
-f = e**2
-g = f / 2.0
-g += 10.0 / f
-print(f'{g.data:.4f}') # prints 24.7041, the outcome of this forward pass
-g.backward()
-print(f'{a.grad:.4f}') # prints 138.8338, i.e. the numerical value of dg/da
-print(f'{b.grad:.4f}') # prints 645.5773, i.e. the numerical value of dg/db
+from micrograd import Value
+from numpy import array
+
+a = Value(shape=(2, 2), name='var1')
+b = Value(shape=(2,), name='var2')
+c = (a @ b).relu()
+c.forward(var1=array([[2, 3], [5, 4]]),
+          var2=array([1, -1]))
+c.backward()
 ```
 
-### Training a neural net
+The **essential pattern** is to call `forward()` once with the values for the variables, then `backward()` once to compute the derivatives.
+
+```python
+x.forward(var1=value1, var2=value2, ...)
+x.backward()
+```
 
+Each time `forward()` is called (e.g. for minibatch evaluation), the lazily defined variables must be given values as keyword arguments. Any variable left without a value is filled with `nan`, so the final result will likely be `nan`, signalling the missing values.
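Why a single unfed variable tends to turn the whole result into `nan` can be seen with plain NumPy. The sketch below is not micrograd code. It only mimics the described behavior by using an all-`nan` array as a stand-in for a `var2` that was never fed, and it assumes a relu built on `np.maximum`.

```python
import numpy as np

var1 = np.array([[2.0, 3.0], [5.0, 4.0]])
var2 = np.full((2,), np.nan)   # stand-in for a variable that was never fed

z = var1 @ var2                # nan propagates through the matrix product
c = np.maximum(z, 0.0)         # np.maximum propagates nan as well

print(np.isnan(c).all())       # True
```

A check such as `np.isnan(result).any()` after the forward pass is a cheap way to catch a forgotten keyword argument.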
+
+## Efficient operator dependency topology computation
+The operator dependency topology is computed only once and then cached, on the assumption that the topology is static once a variable is defined.
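The caching idea can be illustrated with a small stand-alone sketch. The `Node` class, its `parents` attribute, and the `topo_order` method are hypothetical names chosen for this example and are not taken from micrograd's source. The sketch computes a topological order by depth-first traversal on first use and returns the stored list on later calls.

```python
class Node:
    """Hypothetical graph node; not micrograd's actual Value class."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = tuple(parents)
        self._topo = None  # cached topological order, filled on first use

    def topo_order(self):
        # Reuse the cached order; valid only while the graph stays static.
        if self._topo is None:
            order, seen = [], set()

            def visit(node):
                if id(node) in seen:
                    return
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)  # a node comes after all of its inputs

            visit(self)
            self._topo = order
        return self._topo


a = Node('a')
b = Node('b')
c = Node('c', parents=(a, b))
d = Node('d', parents=(c, a))

first = d.topo_order()
second = d.topo_order()          # served from the cache, no second traversal
print([n.name for n in first])   # ['a', 'b', 'c', 'd']
print(first is second)           # True
```

With such a cache, repeated forward and backward passes over the same graph (for example, one per minibatch) skip the traversal, which is where the static-topology assumption matters.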
+
+## Supported operators
+* `__pow__`
+* `__matmul__`
+* `tensordot` for tensor contraction
+* `relu`
+* `log`
+* `log1p`
+* `arctanh`
+* `T` for transpose
+* `sum`
+* `mean`
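For the elementwise operators in this list, a backward pass relies on the standard local derivatives: 1/x for `log`, 1/(1+x) for `log1p`, 1/(1-x^2) for `arctanh`, and n*x^(n-1) for a power. The sketch below checks these textbook formulas against central finite differences using only the standard library. It says nothing about how micrograd implements them internally, and the `numeric_derivative` helper is introduced here purely for illustration.

```python
import math


def numeric_derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 0.3
checks = {
    'log': (math.log, 1 / x),
    'log1p': (math.log1p, 1 / (1 + x)),
    'arctanh': (math.atanh, 1 / (1 - x ** 2)),
    'pow(3)': (lambda t: t ** 3, 3 * x ** 2),
}

errors = {}
for name, (f, analytic) in checks.items():
    numeric = numeric_derivative(f, x)
    errors[name] = abs(numeric - analytic)
    print(f'{name}: analytic={analytic:.6f} numeric={numeric:.6f}')
```

The same finite-difference comparison is a common way to unit-test the gradient of any newly added operator.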
+
+## Training a neural net
 The notebook `demo.ipynb` provides a full demo of training a 2-layer neural network (MLP) binary classifier. This is achieved by initializing a neural net from the `micrograd.nn` module, implementing a simple SVM "max-margin" binary classification loss and using SGD for optimization. As shown in the notebook, using a 2-layer neural net with two 16-node hidden layers we achieve the following decision boundary on the moon dataset:
 
 ![2d neuron](assets/moon_mlp.png)
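As a rough illustration of the "max-margin" loss mentioned above, the following NumPy sketch computes a mean hinge loss and its subgradient with respect to the model scores. It is a generic formulation, not the notebook's code: the function names are hypothetical, labels are assumed to be -1 or +1, and the notebook may add terms (such as regularization) that are omitted here.

```python
import numpy as np


def hinge_loss(scores, labels):
    """Mean max-margin loss; labels are assumed to be -1 or +1."""
    margins = np.maximum(0.0, 1.0 - labels * scores)
    return margins.mean()


def hinge_grad(scores, labels):
    """Subgradient of the mean hinge loss with respect to the scores."""
    active = (1.0 - labels * scores) > 0   # samples inside the margin
    return -(labels * active) / scores.size


scores = np.array([2.0, -0.5, 0.3])
labels = np.array([1.0, -1.0, -1.0])

loss = hinge_loss(scores, labels)
grad = hinge_grad(scores, labels)
print(loss)
print(grad)
```

Only samples that violate the margin contribute to the gradient, so a correctly classified point with a score beyond the margin (the first sample here) leaves the parameters untouched in an SGD step.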
 
-### Tracing / visualization
-
+## Tracing / visualization
 For added convenience, the notebook `trace_graph.ipynb` produces graphviz visualizations. E.g. this one below is of a simple 2D neuron, arrived at by calling `draw_dot` on the code below, and it shows both the data (left number in each node) and the gradient (right number in each node).
 
 ```python
@@ -56,14 +83,12 @@ dot = draw_dot(y)
 
 ![2d neuron](assets/gout.svg)
 
-### Running tests
-
+## Running tests
 To run the unit tests:
 
 ```bash
 python -m unittest tests/*.py
 ```
 
-### License
-
+## License
 MIT
