Closed
25 commits
d6609cb  Create README.md (Pritam3355, Oct 18, 2024)
28b1f02  Update README.md (Pritam3355, Oct 18, 2024)
998eed4  Add files via upload (Pritam3355, Oct 18, 2024)
7019bf4  Delete llm_experiments directory (Pritam3355, Oct 18, 2024)
f3d43e8  Create README.md (Pritam3355, Oct 19, 2024)
2dad12b  Add files via upload (Pritam3355, Oct 19, 2024)
4ecdca1  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 19, 2024)
f8510d7  Add files via upload (Pritam3355, Oct 19, 2024)
fb102e6  Delete neural_network/chatbot/main.py (Pritam3355, Oct 20, 2024)
6e7a428  Delete neural_network/chatbot/llm_service.py (Pritam3355, Oct 20, 2024)
922a230  Delete neural_network/chatbot/chatbot.py (Pritam3355, Oct 20, 2024)
a1d4cd9  Delete neural_network/chatbot/db.py (Pritam3355, Oct 20, 2024)
508249e  Update README.md (Pritam3355, Oct 20, 2024)
4af7a67  Add files via upload (Pritam3355, Oct 20, 2024)
7c49052  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 20, 2024)
276528d  Add files via upload (Pritam3355, Oct 20, 2024)
789e975  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 20, 2024)
3e4430d  Add files via upload (Pritam3355, Oct 20, 2024)
322434d  Add files via upload (Pritam3355, Oct 20, 2024)
5d91b30  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 20, 2024)
30170fd  Delete neural_network/chatbot directory (Pritam3355, Oct 20, 2024)
b23cc1a  Add files via upload (Pritam3355, Oct 20, 2024)
b3c2a73  Add files via upload (Pritam3355, Oct 20, 2024)
9e9a313  Add files via upload (Pritam3355, Oct 20, 2024)
0415717  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 20, 2024)
101 changes: 101 additions & 0 deletions neural_network/sliding_window_attention.py
@@ -0,0 +1,101 @@
"""
- - - - - -- - - - - - - - - - - - - - - - - - - - - - -
Name - - sliding_window_attention.py
Goal - - Implement a neural network architecture using sliding window attention for sequence

Check failure on line 4 in neural_network/sliding_window_attention.py

View workflow job for this annotation

GitHub Actions / ruff

Ruff (E501)

neural_network/sliding_window_attention.py:4:89: E501 Line too long (92 > 88)
modeling tasks.
Detail: Total 5 layers neural network
* Input layer
* Sliding Window Attention Layer
* Feedforward Layer
* Output Layer
Author: Stephen Lee
Github: [email protected]
Date: 2024.10.20
References:
1. Choromanska, A., et al. (2020). "On the Importance of Initialization and Momentum in

Check failure on line 15 in neural_network/sliding_window_attention.py

View workflow job for this annotation

GitHub Actions / ruff

Ruff (E501)

neural_network/sliding_window_attention.py:15:89: E501 Line too long (91 > 88)
Deep Learning." *Proceedings of the 37th International Conference on Machine Learning*.

Check failure on line 16 in neural_network/sliding_window_attention.py

View workflow job for this annotation

GitHub Actions / ruff

Ruff (E501)

neural_network/sliding_window_attention.py:16:89: E501 Line too long (94 > 88)
2. Dai, Z., et al. (2020). "Transformers are RNNs: Fast Autoregressive Transformers
with Linear Attention." *arXiv preprint arXiv:2006.16236*.
3. [Attention Mechanisms in Neural Networks](https://en.wikipedia.org/wiki/Attention_(machine_learning))
- - - - - -- - - - - - - - - - - - - - - - - - - - - - -
"""

import numpy as np


class SlidingWindowAttention:
    """Sliding Window Attention Module.

    This class implements a sliding window attention mechanism where the model
    attends to a fixed-size window of context around each token.

    Attributes:
        window_size (int): The size of the attention window.
        embed_dim (int): The dimensionality of the input embeddings.
    """

    def __init__(self, embed_dim: int, window_size: int) -> None:
        """
        Initialize the SlidingWindowAttention module.

        Args:
            embed_dim (int): The dimensionality of the input embeddings.
            window_size (int): The size of the attention window.
        """
        self.window_size = window_size
        self.embed_dim = embed_dim
        rng = np.random.default_rng()
        self.attention_weights = rng.standard_normal((embed_dim, embed_dim))

    def forward(self, input_tensor: np.ndarray) -> np.ndarray:
        """
        Forward pass for the sliding window attention.

        Args:
            input_tensor (np.ndarray): Input tensor of shape (batch_size, seq_length,
                embed_dim).

        Returns:
            np.ndarray: Output tensor of shape (batch_size, seq_length, embed_dim).

        >>> rng = np.random.default_rng(0)
        >>> x = rng.standard_normal((2, 10, 4))  # batch 2, seq len 10, embed dim 4
        >>> attention = SlidingWindowAttention(embed_dim=4, window_size=3)
        >>> output = attention.forward(x)
        >>> output.shape
        (2, 10, 4)
        >>> (output.sum() != 0).item()  # Check if output is non-zero
        True
        """
        batch_size, seq_length, _ = input_tensor.shape
        output = np.zeros_like(input_tensor)

        for i in range(seq_length):
            # Define the window range: window_size tokens centered on position i,
            # clipped at the sequence boundaries
            start = max(0, i - self.window_size // 2)
            end = min(seq_length, i + self.window_size // 2 + 1)

            # Extract the local window
            local_window = input_tensor[:, start:end, :]

            # Project the window with the learned weights to get attention scores
            attention_scores = np.matmul(local_window, self.attention_weights)

            # Average the scores over the window (mean pooling rather than a
            # softmax-weighted summation)
            output[:, i, :] = np.mean(attention_scores, axis=1)

        return output


if __name__ == "__main__":
import doctest

doctest.testmod()

# usage
rng = np.random.default_rng()
x = rng.standard_normal(
(2, 10, 4)
) # Batch size 2, sequence length 10, embedding dimension 4
attention = SlidingWindowAttention(embed_dim=4, window_size=3)
output = attention.forward(x)
print(output)
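Note on the attention computation: the forward pass in this file applies a single learned projection to each window and then averages the result, which is a simplified stand-in for attention rather than the softmax-weighted form used in the cited references. For comparison, below is a minimal sketch of a softmax-normalized sliding window attention step using only NumPy. The function name softmax_window_attention and the query/key/value matrices w_q, w_k, w_v are illustrative assumptions and are not part of this pull request.

import numpy as np


def softmax_window_attention(
    input_tensor: np.ndarray, window_size: int, rng: np.random.Generator
) -> np.ndarray:
    """Hypothetical softmax-weighted sliding window attention (illustrative only)."""
    _, seq_length, embed_dim = input_tensor.shape

    # Illustrative, untrained query/key/value projections.
    w_q = rng.standard_normal((embed_dim, embed_dim))
    w_k = rng.standard_normal((embed_dim, embed_dim))
    w_v = rng.standard_normal((embed_dim, embed_dim))

    queries = input_tensor @ w_q
    keys = input_tensor @ w_k
    values = input_tensor @ w_v

    output = np.zeros_like(input_tensor)
    for i in range(seq_length):
        # Same clipped window as in the forward pass above.
        start = max(0, i - window_size // 2)
        end = min(seq_length, i + window_size // 2 + 1)

        # Scaled dot-product scores between the query at position i and the
        # keys inside the window: shape (batch, window).
        scores = np.einsum("bd,bwd->bw", queries[:, i, :], keys[:, start:end, :])
        scores /= np.sqrt(embed_dim)

        # Softmax over the window, then a weighted sum of the window's values.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        output[:, i, :] = np.einsum("bw,bwd->bd", weights, values[:, start:end, :])
    return output


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 10, 4))  # batch 2, seq len 10, embed dim 4
    print(softmax_window_attention(x, window_size=3, rng=rng).shape)  # (2, 10, 4)

In a full model along the lines of the module docstring (input, attention, feedforward, and output layers), a layer like this would sit between the input embeddings and the feedforward block, with the projections trained jointly with the rest of the network rather than drawn at random.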