# Assignment 1: Hopfield Networks

## Overview

In this assignment, you will explore computational memory models by implementing a Hopfield network.

In the original article ([Hopfield, 1982](https://www.dropbox.com/scl/fi/iw9wtr3xjvrbqtk38obid/Hopf82.pdf?rlkey=x3my329oj9952er68sr28c7xc&dl=1)), neuronal activations were set to either 0 ("not firing") or 1 ("firing"). Modern Hopfield networks nearly always follow an updated implementation, first proposed by [Amit et al. (1985)](https://www.dropbox.com/scl/fi/3a3adwqf70afb9kmieezn/AmitEtal85.pdf?rlkey=78fckvuuvk9t3o9fbpjrmn6de&dl=1). In their framing, neurons take on activation values of either –1 ("down state") or +1 ("up state"). This has three important benefits:

- It provides a cleaner way to implement the Hebbian learning rule (i.e., without subtracting means or shifting values).
- It avoids a bias toward 0 (i.e., +1 and –1 are equally "attractive," whereas 0-valued neurons have a stronger "pull").
- The energy function (which describes the attractor dynamics of the network) can be directly mapped onto the [Ising model](https://en.wikipedia.org/wiki/Ising_model) from statistical physics.

You should start by reading [Amit et al. (1985)](https://www.dropbox.com/scl/fi/3a3adwqf70afb9kmieezn/AmitEtal85.pdf?rlkey=78fckvuuvk9t3o9fbpjrmn6de&dl=1) closely. Then implement the model in a Google Colaboratory notebook. Unless otherwise noted, all references to "the paper" refer to Amit et al. (1985).

---

## Tasks

### 1. Implement Memory Storage and Retrieval

**Objective:** Write functions that implement the core operations of a Hopfield network.

- **Memory Storage:** Implement the Hebbian learning rule to compute the weight matrix, given a set of network configurations (memories). This is described in **Equation 1.5** of the paper:

Let \( p \) be the number of patterns, \( N \) the number of neurons, and \( \xi_i^\mu \in \{-1, +1\} \) the value of neuron \( i \) in pattern \( \mu \). The synaptic coupling between neurons \( i \) and \( j \) is:

$$
J_{ij} = \frac{1}{N} \sum_{\mu=1}^p \xi_i^\mu \xi_j^\mu
$$

Note that the matrix is symmetric (\( J_{ij} = J_{ji} \)), and no self-connections are allowed (by definition, \( J_{ii} = 0 \)).
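
A minimal sketch of this storage step (assuming memories are stored as rows of a NumPy array, as in the pattern-generation snippet in Task 3; the function name is illustrative, not prescribed):

```python
import numpy as np

def store_memories(xi):
    """Compute the Hopfield weight matrix from memory patterns xi (shape: p x N)."""
    N = xi.shape[1]
    J = (xi.T @ xi) / N      # Hebbian rule (Eq. 1.5): J_ij = (1/N) * sum_mu xi_i^mu xi_j^mu
    np.fill_diagonal(J, 0)   # no self-connections: J_ii = 0
    return J
```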
- **Memory Retrieval:** Implement the retrieval rule using **Equation 1.3** and surrounding discussion. At each time step, each neuron updates according to its **local field**:

$$
h_i = \sum_{j=1}^N J_{ij} S_j
$$

The neuron updates its state to align with the sign of the field:

$$
S_i(t+1) = \text{sign}(h_i(t)) = \text{sign} \left( \sum_{j} J_{ij} S_j(t) \right)
$$

Here \( S_i \in \{-1, +1\} \) is the current state of neuron \( i \).
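
One possible implementation sketch using asynchronous updates (a synchronous variant is also possible but can oscillate for some states; the helper name and the convention that sign(0) counts as +1 are illustrative assumptions):

```python
import numpy as np

def retrieve(J, S, max_sweeps=100):
    """Update state S (entries in {-1, +1}, or 0 for 'no information') until it stops changing."""
    S = np.array(S, dtype=float)
    N = len(S)
    for _ in range(max_sweeps):
        changed = False
        for i in np.random.permutation(N):       # visit neurons in a random order
            h = J[i] @ S                         # local field h_i = sum_j J_ij S_j
            new_state = 1.0 if h >= 0 else -1.0  # one possible convention: treat h = 0 as +1
            if new_state != S[i]:
                S[i] = new_state
                changed = True
        if not changed:                          # a full sweep with no changes: stable state
            break
    return S
```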
---

### 2. Test with a Small Network

Encode the following test memories in a Hopfield network with \( N = 5 \) neurons:

$$
\xi^1 = [+1, -1, +1, -1, +1] \\
\xi^2 = [-1, +1, -1, +1, -1]
$$

- Store these memories using the Hebbian rule.
- Test retrieval by presenting the network with noisy versions of the stored patterns (e.g., flipping one neuron's sign, or setting some entries to 0).
- Briefly discuss your observations; you might find Figure 3 of [Hopfield (1984)](https://www.dropbox.com/scl/fi/7wktieqztt60b8wyhg2au/Hopf84.pdf?rlkey=yi3baegby8x6olxznsvm8lyxz&dl=1) useful. You can write a few sentences, sketch or code a figure, or combine both.
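
A possible starting point for this test, assuming the illustrative `store_memories` and `retrieve` helpers sketched in Task 1:

```python
import numpy as np

xi = np.array([[+1, -1, +1, -1, +1],
               [-1, +1, -1, +1, -1]])   # the two patterns to store

J = store_memories(xi)                  # Hebbian weight matrix

noisy = xi[0].copy()
noisy[2] *= -1                          # flip one neuron's sign
recovered = retrieve(J, noisy)
print(recovered, np.array_equal(recovered, xi[0]))
```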
Questions to consider:

- Can you tell how and why the network stores memories?
- Why do some memories interfere while others don't?
- Can you construct memory sets that do or don't work in a small network?
- What factors do you think affect the **capacity** of the network (the maximum number of memories that can be successfully retrieved)?

---
### 3. Evaluate Storage Capacity

**Objective:** Determine how memory recovery degrades as you vary:

- **Network Size** (the total number of neurons in the network)
- **Number of Stored Memories** (the number of patterns stored in the network)

To generate \( m \) memories \( \xi_1, \dots, \xi_m \) for a network of size \( N \), you can use the following snippet; each row of the resulting matrix `xi` contains a single memory:

```python
import numpy as np
xi = 2 * (np.random.rand(m, N) > 0.5) - 1
```

**Method:**

- For each configuration, run multiple trials.
- For each trial, measure whether **at least 99%** of the memory is accurately recovered.
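
One way to organize this simulation (a sketch that assumes the illustrative helpers from Task 1; the size ranges, trial counts, and the choice to test stability by starting from the stored pattern itself are assumptions you should adjust):

```python
import numpy as np

network_sizes = [10, 25, 50, 100]      # illustrative ranges; tune them to show the behavior
memory_counts = list(range(1, 21))
n_trials = 10

# P[m_idx, n_idx]: proportion of memories recovered with >= 99% accuracy
P = np.zeros((len(memory_counts), len(network_sizes)))

for n_idx, N in enumerate(network_sizes):
    for m_idx, m in enumerate(memory_counts):
        n_success, n_total = 0, 0
        for _ in range(n_trials):
            xi = 2 * (np.random.rand(m, N) > 0.5) - 1    # m random patterns
            J = store_memories(xi)
            for mu in range(m):
                recovered = retrieve(J, xi[mu])          # start from the stored pattern itself
                if np.mean(recovered == xi[mu]) >= 0.99:
                    n_success += 1
                n_total += 1
        P[m_idx, n_idx] = n_success / n_total

# P can be shown directly as the heatmap described below (e.g., with plt.imshow).
```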
**Visualization 1:**
Create a heatmap:

- \( x \)-axis: network size
- \( y \)-axis: number of stored memories
- Color: proportion of memories retrieved with ≥99% accuracy

Play around with the ranges to find network sizes and numbers of memories that adequately illustrate the system's behavior.

**Visualization 2:**
Plot the expected number of accurately retrieved memories as a function of network size.

Let:

- \( P[m, N] \in [0, 1] \): the proportion of \( m \) memories accurately retrieved in a network of size \( N \)
- \( \mathbb{E}[R_N] \): the expected number of successfully retrieved memories

Then:

$$
\mathbb{E}[R_N] = \sum_{m=1}^{M} m \cdot P[m, N]
$$

where \( M \) is the maximum number of memories tested.
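
Given the matrix of proportions behind the heatmap, this sum can be computed directly (a sketch assuming `P` is indexed as in the capacity sketch above, with rows corresponding to \( m = 1, \dots, M \)):

```python
import numpy as np

m_values = np.arange(1, P.shape[0] + 1)                    # m = 1, ..., M
expected_retrieved = (m_values[:, None] * P).sum(axis=0)   # E[R_N], one value per network size
```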
**Follow-Up:**

- What relationship (if any) emerges between network size and capacity?
- Can you develop rules or intuitions that would let you estimate a network's capacity from its size alone?

---
### 4. Simulate Cued Recall

**Objective:** Evaluate how well the network performs associative recall when only a partial input (a **cue**) is presented and the corresponding response must be recovered.

#### Setup: A–B Pair Structure

- Each memory is a concatenated pair of patterns:
  - First half: the **cue** (\( A \))
  - Second half: the **response** (\( B \))

If \( N \) is odd:

- Let the cue occupy the first \( \lfloor N/2 \rfloor \) neurons
- Let the response occupy the remaining \( \lceil N/2 \rceil \) neurons

Each full memory \( \xi^\mu \in \{-1, +1\}^N \) is defined as:

$$
\xi^\mu = \begin{bmatrix} A^\mu \\ B^\mu \end{bmatrix}
$$

For each trial:

1. **Choose a stored memory** \( \xi^\mu \)
2. **Construct the initial state** \( x \):
   - Cue half: set to \( A^\mu \)
   - Response half: set to 0 (no initial information)
3. **Evolve the network** to a stable state using the usual update rule:
   $$
   x_i \leftarrow \text{sign} \left( \sum_j J_{ij} x_j \right)
   $$
   You may either let the cue neurons update along with the rest of the network, or **clamp** the cue (i.e., keep \( x_i = A^\mu_i \) fixed for the cue indices).
4. **Evaluate success**:
   - Extract the response portion of the final state \( x^* \) and compare it to \( B^\mu \)
   - Mark the trial as a successful recall if at least 99% of the bits match:
   $$
   \frac{1}{|B|} \sum_{i \in \text{response}} \mathbb{1}[x^*_i = B^\mu_i] \geq 0.99
   $$
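
A sketch of a single unclamped trial, assuming the illustrative helpers from Task 1 (a clamped variant would re-impose the cue values after every update):

```python
import numpy as np

def cued_recall_trial(J, pattern, cue_len):
    """Cue with the first cue_len entries of a stored pattern; the response half starts at 0."""
    x = np.zeros(len(pattern))
    x[:cue_len] = pattern[:cue_len]                          # cue half = A^mu
    x_final = retrieve(J, x)                                 # evolve to a stable state
    response = x_final[cue_len:]
    return np.mean(response == pattern[cue_len:]) >= 0.99    # success: >= 99% of bits match
```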
#### Analysis

- Repeat the simulation across many stored \( A \)–\( B \) pairs
- For each network size \( N \), compute the **expected number** of correctly retrieved responses
- Plot this value as a function of \( N \)

#### Optional Extensions

- Compare performance with and without clamping the cue neurons
- Test whether cueing with noisy or partial versions of \( A \) still leads to correct retrieval of \( B \)

---

154169
### 5. Simulate Contextual Drift
155170

156-
**Objective:** Explore how gradual changes in context influence which memories are retrieved. This models how temporal or environmental drift might bias recall toward memories with similar contexts.
171+
**Objective:** Investigate how gradual changes in **context** influence which memories are recalled.
157172

158-
#### Setup: Item–Context Memory Representation
173+
#### Setup: Item–Context Representation
159174

160-
- Use a Hopfield network with **100 neurons**.
161-
- Each memory is a combination of:
162-
- **Item features**: 50 neurons (first half)
163-
- **Context features**: 50 neurons (second half)
175+
- Use a Hopfield network with 100 neurons.
176+
- Each memory:
177+
- First 50 neurons: **Item**
178+
- Last 50 neurons: **Context**
164179

165-
- Create a **sequence of 10 memories** \($\{\xi^1, \xi^2, \dots, \xi^{10}\}$\), each composed of:
166-
$$
167-
\xi^t = \begin{bmatrix} \text{item}^t \\ \text{context}^t \end{bmatrix}
168-
$$
180+
Create a **sequence of 10 memories**:
181+
$$
182+
\xi^t = \begin{bmatrix} \text{item}^t \\ \text{context}^t \end{bmatrix}
183+
$$
169184

170-
- Initialize the **context vector** for the first memory randomly (set it to a random vector of +1s and -1s).
171-
- For each subsequent memory \($t + 1$\), create a new context vector by:
172-
- **Copying** the previous context
173-
- **Perturbing** a small number of bits (e.g., 5% of context features flipped)
185+
Context drift:
174186

175-
This creates a **drifting context** across the sequence.
187+
- Set \( \text{context}^1 \) randomly
188+
- For each subsequent \( \text{context}^{t+1} \), copy \( \text{context}^t \) and flip ~5% of the bits
176189

177190
#### Simulation Procedure
178191

179-
1. **Train the network** on all 10 memory patterns using Hebbian learning (encode them into a weight matrix, $J$)
180-
181-
2. For each memory index \($i = 1, \dots, 10$\):
182-
183-
a. **Cue the network** with the **context vector** from \($\xi^i$\):
184-
- Set the **context neurons** (second half) to match \($\text{context}^i$\)
185-
- Set the **item neurons** (first half) to zero (i.e., no initial item input)
186-
187-
b. **Run the network dynamics** until convergence
188-
189-
c. **Compare the final state** to all 10 stored patterns:
190-
- For each stored pattern \($\xi^j$\), extract the item portion and compare it to the final item state
191-
- If the item portion of \($\xi^j$\) matches the recovered state with ≥99% accuracy, consider memory \( j \) to have been retrieved
192-
193-
d. Record the **retrieved index** (if any), and compute the **relative position**:
194-
$$
195-
\Delta = j - i
196-
$$
197-
(e.g., \($\Delta = 0$\) means the correct memory was retrieved; \($\Delta = 1$\) means the next one in the sequence was recalled, etc.)
192+
1. Store all 10 memories in the network.
193+
2. For each memory \( i = 1, \dots, 10 \):
194+
- Cue the network with \( \text{context}^i \)
195+
- Set item neurons to 0
196+
- Run until convergence
197+
- For each stored memory \( j \), compare recovered item to \( \text{item}^j \)
198+
- If ≥99% of bits match, record \( j \) as retrieved
199+
- Record \( \Delta = j - i \) (relative offset)
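
A sketch of this loop, assuming the illustrative helpers from Task 1 and the `memories`, `items`, and `contexts` arrays from the drift sketch above:

```python
import numpy as np

J = store_memories(memories)                       # store all 10 memories
offsets = []                                       # collected Delta = j - i values

for i in range(n_memories):
    x = np.zeros(item_len + context_len)
    x[item_len:] = contexts[i]                     # cue with context^i; item half stays 0
    x_final = retrieve(J, x)
    recovered_item = x_final[:item_len]
    for j in range(n_memories):
        if np.mean(recovered_item == items[j]) >= 0.99:
            offsets.append(j - i)                  # record the relative position
```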
#### Analysis

- Repeat the simulation multiple times (e.g., 100 runs) to account for randomness
- For each relative position \( \Delta \in [-9, +9] \), compute:
  - The probability that a memory at offset \( \Delta \) was retrieved
  - A 95% confidence interval for that probability estimate
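
For the confidence intervals, a simple normal-approximation interval over runs is one option (a sketch; other interval constructions are equally acceptable):

```python
import numpy as np

def retrieval_ci(successes, n_runs, z=1.96):
    """Approximate 95% confidence interval for a retrieval probability."""
    p_hat = successes / n_runs
    half_width = z * np.sqrt(p_hat * (1 - p_hat) / n_runs)
    return p_hat - half_width, p_hat + half_width
```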
#### Visualization

Create a line plot:

- \( x \)-axis: relative position in the sequence, \( \Delta \)
- \( y \)-axis: probability of retrieval
- Error bars: 95% confidence intervals

Write a brief description of what you think is happening in this plot (and why).

#### Optional Extensions

- Vary the context perturbation (drift) rate to see how sharply it affects retrieval
- Try random (non-gradual) context changes
- Explore links to recency effects, intrusion errors, or memory generalization

---

## Submission Instructions

- [Submit](https://canvas.dartmouth.edu/courses/71051/assignments/517353) a single standalone Google Colaboratory notebook (or similar) that includes:
  - Your full model implementation
  - Markdown (text) cells explaining your approach, methodology, any design decisions you want to highlight, and discussion points
  - Plots and results for each simulation task
- Your notebook should run **without errors** in Google Colaboratory.
