Multi-Head Attention Layer Implementation #2172
Closed
MaximilianSchreff wants to merge 25 commits into apache:main from
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:

@@             Coverage Diff              @@
##               main    #2172      +/-   ##
============================================
- Coverage     71.97%   71.86%    -0.11%
- Complexity    43855    44427      +572
============================================
  Files          1441     1445        +4
  Lines        166018   168173     +2155
  Branches      32396    32827      +431
============================================
+ Hits         119494   120865     +1371
- Misses        37294    38019      +725
- Partials       9230     9289       +59

☔ View full report in Codecov by Sentry.
Contributor
Thanks for the patch @MaximilianSchreff. Can you please add the missing license header to the Java test file?
Contributor (Author)
@phaniarnab, sorry, forgot that. Now added.
Contributor
Thanks for the changes. I will merge it in. @MaximilianSchreff
This PR introduces a multi-head attention layer as a built-in layer with forward and backward passes.
Description
The multi-head attention layer is the base layer of almost all Transformer models, with many variations across models. This implementation is in line with the basic BERT attention layer. The functionality is currently kept to a minimum, without features like attention masking, head masking, or cross-attention.
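The PR body does not reproduce the layer code itself, so below is a minimal NumPy sketch of what a mask-free, BERT-style multi-head attention forward pass computes. All names and shapes here (`multi_head_attention`, `Wq`/`Wk`/`Wv`/`Wo`, the `(seq_len, d_model)` layout) are illustrative assumptions for this sketch and do not mirror the SystemDS implementation.

```python
# Illustrative sketch only; not SystemDS code.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """X: (seq_len, d_model); Wq, Wk, Wv, Wo: (d_model, d_model)."""
    seq_len, d_model = X.shape
    d_head = d_model // num_heads

    # Project, then split the model dimension into heads:
    # (seq_len, d_model) -> (num_heads, seq_len, d_head)
    def split_heads(M):
        return M.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    Q = split_heads(X @ Wq)
    K = split_heads(X @ Wk)
    V = split_heads(X @ Wv)

    # Scaled dot-product attention, computed per head
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)  # (h, s, s)
    attn = softmax(scores, axis=-1)
    context = attn @ V                                    # (h, s, d_head)

    # Concatenate heads and apply the output projection
    concat = context.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo

# Tiny smoke test with random weights
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
Wq, Wk, Wv, Wo = (rng.standard_normal((8, 8)) for _ in range(4))
print(multi_head_attention(X, Wq, Wk, Wv, Wo, num_heads=2).shape)  # (4, 8)
```

The backward pass (not shown) differentiates this same chain of matrix products and the softmax, producing gradients with respect to the input and the four projection matrices.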
Testing
Notes
This PR is the first in a series of PRs aimed at supporting the BERT model, and eventually other Transformer models, in SystemDS.