
Commit 82b0465

Merge pull request #39 from Abhishek-1Bhatt/deeponet: DeepONet

2 parents f47637f + d3e58f1

File tree

12 files changed: +334 −2 lines changed

README.md

Lines changed: 35 additions & 0 deletions
@@ -36,11 +36,15 @@ It performs Fourier transformation across infinite-dimensional function spaces a

With only one time step of information for learning, it can predict the following few steps with low loss by linking the operators into a Markov chain.

**DeepONet** (Deep Operator Network) learns a neural operator with the help of two sub-networks, described as the branch and the trunk net. The branch network is fed the initial-condition data, whereas the trunk is fed the locations where the target (output) is evaluated from the corresponding initial conditions. The output sizes of the branch and trunk subnets must be the same so that a dot product can be performed between them.
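To make that combination concrete, here is a minimal, self-contained sketch using plain Flux `Chain`s; the sizes match the usage example below and are otherwise arbitrary, and the manual `b' * t` stands in for what the `DeepONet` layer computes internally.

```julia
using Flux

branch = Chain(Dense(32, 64, σ), Dense(64, 72, σ))        # takes 32 sensor values
trunk  = Chain(Dense(24, 64, tanh), Dense(64, 72, tanh))  # takes 24 query coordinates

x = rand(Float32, 32, 200)  # 200 initial-condition instances
y = rand(Float32, 24, 100)  # 100 evaluation locations

b = branch(x)  # 72 × 200: one embedding column per instance
t = trunk(y)   # 72 × 100: one embedding column per location
u = b' * t     # 200 × 100: the dot product Σᵢ bᵢⱼ tᵢₖ over the shared dimension
```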
Currently, the `FourierOperator` layer is provided in this work. As for models, `FourierNeuralOperator` and `MarkovNeuralOperator` are provided. Please take a glance at them [here](src/model.jl).

## Usage

### Fourier Neural Operator

```julia
model = Chain(
    # lift (d + 1)-dimensional vector field to n-dimensional vector field
```

@@ -76,6 +80,32 @@ opt = Flux.Optimiser(WeightDecay(1f-4), Flux.ADAM(1f-3))

```julia
Flux.@epochs 50 Flux.train!(loss, params(model), data, opt)
```

### DeepONet

```julia
# Tuple of Ints for the branch-net architecture and then for the trunk net,
# followed by the activations for the branch and trunk respectively
model = DeepONet((32, 64, 72), (24, 64, 72), σ, tanh)
```

Or specify the branch and trunk nets as separate Flux `Chain`s and pass them to `DeepONet`:

```julia
branch = Chain(Dense(32, 64, σ), Dense(64, 72, σ))
trunk = Chain(Dense(24, 64, tanh), Dense(64, 72, tanh))
model = DeepONet(branch, trunk)
```

You can again specify the loss, optimiser and training parameters just as you would for a simple neural network with Flux.

```julia
loss(xtrain, ytrain, sensor) = Flux.Losses.mse(model(xtrain, sensor), ytrain)
evalcb() = @show(loss(xval, yval, grid))

learning_rate = 0.001
opt = ADAM(learning_rate)
parameters = params(model)
Flux.@epochs 400 Flux.train!(loss, parameters, [(xtrain, ytrain, grid)], opt, cb = evalcb)
```
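Once trained, the model is queried with the same two-argument call used in the loss above; for instance, with `xval` holding held-out initial conditions and `grid` the evaluation locations:

```julia
ỹ = model(xval, grid)  # predicted solutions at the grid points
```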
## Examples

PDE training examples are provided in the `example` folder.

@@ -84,6 +114,10 @@ PDE training examples are provided in `example` folder.

[Burgers' equation](example/Burgers)

### DeepONet implementation for solving Burgers' equation

[Burgers' equation](example/Burgers/src/Burgers_deeponet.jl)

### Two-dimensional Fourier Neural Operator

[Double Pendulum](example/DoublePendulum)

@@ -113,3 +147,4 @@ PDE training examples are provided in `example` folder.

- [Neural Operator: Graph Kernel Network for Partial Differential Equations](https://arxiv.org/abs/2003.03485)
- [zongyi-li/graph-pde](https://github.com/zongyi-li/graph-pde)
- [Markov Neural Operators for Learning Chaotic Systems](https://arxiv.org/abs/2106.06898)
- [DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators](https://arxiv.org/abs/1910.03193)

docs/src/index.md

Lines changed: 33 additions & 1 deletion
@@ -10,7 +10,7 @@ Documentation for [NeuralOperators](https://github.com/foldfelis/NeuralOperators

|:----------------:|:--------------:|
| ![](https://github.com/foldfelis/NeuralOperators.jl/blob/master/example/FlowOverCircle/gallery/ans.gif?raw=true) | ![](https://github.com/foldfelis/NeuralOperators.jl/blob/master/example/FlowOverCircle/gallery/inferenced.gif?raw=true) |

- The demonstration showing above is Navier-Stokes equation learned by the `MarkovNeuralOperator` with only one time step information.
+ The demonstration shown above is the Navier-Stokes equation learned by the `MarkovNeuralOperator` with only one time step of information.

An example can be found in [`example/FlowOverCircle`](https://github.com/foldfelis/NeuralOperators.jl/tree/master/example/FlowOverCircle).

## Abstract

@@ -30,6 +30,8 @@ It performs Fourier transformation across infinite-dimensional function spaces a

With only one time step of information for learning, it can predict the following few steps with low loss by linking the operators into a Markov chain.

**DeepONet** (Deep Operator Network) learns a neural operator with the help of two sub-networks, described as the branch and the trunk net. The branch network is fed the initial-condition data, whereas the trunk is fed the locations where the target (output) is evaluated from the corresponding initial conditions. The output sizes of the branch and trunk subnets must be the same so that a dot product can be performed between them.

Currently, the `FourierOperator` layer is provided in this work. As for models, `FourierNeuralOperator` and `MarkovNeuralOperator` are provided. Please take a glance at them [here](apis.html#Models).
@@ -44,6 +46,8 @@ pkg> add NeuralOperators

## Usage

### Fourier Neural Operator

```julia
model = Chain(
    # lift (d + 1)-dimensional vector field to n-dimensional vector field
```

@@ -78,3 +82,31 @@ loss(𝐱, 𝐲) = sum(abs2, 𝐲 .- model(𝐱)) / size(𝐱)[end]

```julia
opt = Flux.Optimiser(WeightDecay(1f-4), Flux.ADAM(1f-3))
Flux.@epochs 50 Flux.train!(loss, params(model), data, opt)
```

### DeepONet

```julia
# Tuple of Ints for the branch-net architecture and then for the trunk net,
# followed by the activations for the branch and trunk respectively
model = DeepONet((32, 64, 72), (24, 64, 72), σ, tanh)
```

Or specify the branch and trunk nets as separate Flux `Chain`s and pass them to `DeepONet`:

```julia
branch = Chain(Dense(32, 64, σ), Dense(64, 72, σ))
trunk = Chain(Dense(24, 64, tanh), Dense(64, 72, tanh))
model = DeepONet(branch, trunk)
```

You can again specify the loss, optimiser and training parameters just as you would for a simple neural network with Flux.

```julia
loss(xtrain, ytrain, sensor) = Flux.Losses.mse(model(xtrain, sensor), ytrain)
evalcb() = @show(loss(xval, yval, grid))

learning_rate = 0.001
opt = ADAM(learning_rate)
parameters = params(model)
Flux.@epochs 400 Flux.train!(loss, parameters, [(xtrain, ytrain, grid)], opt, cb = evalcb)
```

A more complete example using the DeepONet architecture to solve Burgers' equation can be found in the [examples](../../example/Burgers/src/Burgers_deeponet.jl).

example/Burgers/src/Burgers.jl

Lines changed: 2 additions & 1 deletion
@@ -5,6 +5,7 @@ using Flux
 using CUDA

 include("data.jl")
+include("Burgers_deeponet.jl")

 __init__() = register_burgers()

@@ -30,7 +31,7 @@ function train()
         Dense(128, 1),
         flatten
     ) |> device

     loss(𝐱, 𝐲) = sum(abs2, 𝐲 .- m(𝐱)) / size(𝐱)[end]

     loader_train, loader_test = get_dataloader()
example/Burgers/src/Burgers_deeponet.jl

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
function train_don()
    if has_cuda()
        @info "CUDA is on"
        device = gpu
        CUDA.allowscalar(false)
    else
        device = cpu
    end

    x, y = get_data_don(n=300)
    xtrain = x[1:280, :]' |> device
    xval = x[end-19:end, :]' |> device

    ytrain = y[1:280, :] |> device
    yval = y[end-19:end, :] |> device

    grid = collect(range(0, 1, length=1024))' |> device

    learning_rate = 0.001
    opt = ADAM(learning_rate)

    m = DeepONet((1024, 1024, 1024), (1, 1024, 1024), gelu, gelu)
    loss(xtrain, ytrain, sensor) = Flux.Losses.mse(m(xtrain, sensor), ytrain)
    evalcb() = @show(loss(xval, yval, grid))

    Flux.@epochs 400 Flux.train!(loss, params(m), [(xtrain, ytrain, grid)], opt, cb = evalcb)
    ỹ = m(xval, grid)

    diffvec = vec(abs.(yval .- ỹ))
    mean_diff = sum(diffvec) / length(diffvec)
    return mean_diff
end
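The returned mean absolute error is what the accuracy test below checks. Assuming the example package and its registered Burgers data dependency are available, a typical invocation is:

```julia
using Burgers

ϵ = Burgers.train_don()  # trains for 400 epochs and returns the mean absolute error
```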

example/Burgers/src/data.jl

Lines changed: 9 additions & 0 deletions
@@ -27,6 +27,15 @@ function get_data(; n=2048, Δsamples=2^3, grid_size=div(2^13, Δsamples), T=Flo
     return x_loc_data, y_data
 end

+function get_data_don(; n=2048, Δsamples=2^3, grid_size=div(2^13, Δsamples))
+    file = matopen(joinpath(datadep"Burgers", "burgers_data_R10.mat"))
+    x_data = collect(read(file, "a")[1:n, 1:Δsamples:end])
+    y_data = collect(read(file, "u")[1:n, 1:Δsamples:end])
+    close(file)
+
+    return x_data, y_data
+end
+
 function get_dataloader(; n_train=1800, n_test=200, batchsize=100)
     𝐱, 𝐲 = get_data(n=2048)
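Under the default keyword arguments, both returned matrices have `n` rows and 1024 columns (the 2^13-point grid downsampled by Δsamples = 2^3), which is why `train_don` pairs them with a 1024-point sensor grid. A quick shape check:

```julia
x, y = get_data_don(n=300)
size(x)  # (300, 1024): sampled initial conditions a(x)
size(y)  # (300, 1024): corresponding solutions u(x)
```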
3241

example/Burgers/test/deeponet.jl

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
@testset "DeepONet Training Accuracy" begin
    ϵ = Burgers.train_don()

    @test ϵ < 0.4
end

example/Burgers/test/runtests.jl

Lines changed: 1 addition & 0 deletions
@@ -3,4 +3,5 @@ using Test

 @testset "Burgers" begin
     include("data.jl")
+    include("deeponet.jl")
 end

src/DeepONet.jl

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
"""
`DeepONet(architecture_branch::Tuple, architecture_trunk::Tuple,
          act_branch = identity, act_trunk = identity;
          init_branch = Flux.glorot_uniform,
          init_trunk = Flux.glorot_uniform,
          bias_branch = true, bias_trunk = true)`

`DeepONet(branch_net::Flux.Chain, trunk_net::Flux.Chain)`

Create an (unstacked) DeepONet architecture as proposed by Lu et al., arXiv:1910.03193.

The model works as follows:

x --- branch --
               |
               -⊠--u-
               |
y --- trunk ---

Here `x` represents the input function, discretely evaluated at its respective sensors. So the input is of shape [m] for one instance or [m x b] for a training set. `y` are the probing locations for the operator to be trained. It has shape [N x n] for N different variables in the PDE (i.e. spatial and temporal coordinates), each with n distinct evaluation points. `u` is the solution of the queried instance of the PDE, given by the specific choice of parameters.

Both inputs `x` and `y` are multiplied together via the dot product Σᵢ bᵢⱼ tᵢₖ.

You can set up this architecture in two ways:

1. By specifying the architecture and all its parameters as given above. This always creates `Dense` layers for the branch and trunk net and corresponds to the DeepONet proposed by Lu et al.

2. By passing two architectures in the form of two `Chain` structs directly. Do this if you want more flexibility, e.g. to use an RNN or CNN instead of simple `Dense` layers.

Strictly speaking, DeepONet does not require either the branch or the trunk net to be a simple DNN. Usually, though, this is the case, which is why it is treated as the default here.

# Example

Consider a transient 1D advection problem ∂ₜu + u ⋅ ∇u = 0, with an IC u(x,0) = g(x). We are given several (b = 200) instances of the IC, discretized at 50 points each, and want to query the solution at 100 different locations and times in [0;1].

That makes the branch input of shape [50 x 200] and the trunk input of shape [2 x 100]. So the input size is 50 for the branch net and 2 for the trunk net.

# Usage

```julia
julia> model = DeepONet((32,64,72), (24,64,72))
DeepONet with
branch net: (Chain(Dense(32, 64), Dense(64, 72)))
Trunk net: (Chain(Dense(24, 64), Dense(64, 72)))

julia> model = DeepONet((32,64,72), (24,64,72), σ, tanh; init_branch=Flux.glorot_normal, bias_trunk=false)
DeepONet with
branch net: (Chain(Dense(32, 64, σ), Dense(64, 72, σ)))
Trunk net: (Chain(Dense(24, 64, tanh; bias=false), Dense(64, 72, tanh; bias=false)))

julia> branch = Chain(Dense(2,128),Dense(128,64),Dense(64,72))
Chain(
  Dense(2, 128),  # 384 parameters
  Dense(128, 64), # 8_256 parameters
  Dense(64, 72),  # 4_680 parameters
)                 # Total: 6 arrays, 13_320 parameters, 52.406 KiB.

julia> trunk = Chain(Dense(1,24),Dense(24,72))
Chain(
  Dense(1, 24),  # 48 parameters
  Dense(24, 72), # 1_800 parameters
)                # Total: 4 arrays, 1_848 parameters, 7.469 KiB.

julia> model = DeepONet(branch,trunk)
DeepONet with
branch net: (Chain(Dense(2, 128), Dense(128, 64), Dense(64, 72)))
Trunk net: (Chain(Dense(1, 24), Dense(24, 72)))
```
"""
struct DeepONet
    branch_net::Flux.Chain
    trunk_net::Flux.Chain
end

# Constructor that builds the two subnets and assigns weights and biases
function DeepONet(architecture_branch::Tuple, architecture_trunk::Tuple,
                  act_branch = identity, act_trunk = identity;
                  init_branch = Flux.glorot_uniform,
                  init_trunk = Flux.glorot_uniform,
                  bias_branch=true, bias_trunk=true)

    @assert architecture_branch[end] == architecture_trunk[end] "Branch and trunk net must share the same number of nodes in the last layer. Otherwise Σᵢ bᵢⱼ tᵢₖ won't work."

    # To construct the subnets we use the helper function in subnets.jl
    # Initialize the branch net
    branch_net = construct_subnet(architecture_branch, act_branch;
                                  init=init_branch, bias=bias_branch)
    # Initialize the trunk net
    trunk_net = construct_subnet(architecture_trunk, act_trunk;
                                 init=init_trunk, bias=bias_trunk)

    return DeepONet(branch_net, trunk_net)
end

Flux.@functor DeepONet

#= The actual layer that does the work:
x is the input function, evaluated at m locations (or m x b in case of batches)
y is the array of sensors, i.e. the variables of the output function,
with shape (N x n) - N different variables, each with n evaluation points =#
function (a::DeepONet)(x::AbstractArray, y::AbstractVecOrMat)
    # Assign the parameters
    branch, trunk = a.branch_net, a.trunk_net

    #= The dot product needs a dimension to contract over.
    The NNs, however, always perform their transformations in the first dim,
    so we need to adjust (i.e. transpose) one of the inputs,
    which we do on the branch input here =#
    return branch(x)' * trunk(y)
end

# Sensors stay the same and shouldn't be batched
(a::DeepONet)(x::AbstractArray, y::AbstractArray) =
    throw(ArgumentError("Sensor locations fed to trunk net can't be batched."))

# Print nicely
function Base.show(io::IO, l::DeepONet)
    print(io, "DeepONet with\nbranch net: (", l.branch_net)
    print(io, ")\n")
    print(io, "Trunk net: (", l.trunk_net)
    print(io, ")\n")
end
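As a quick shape check of the layer call above, consider the advection example from the docstring (the sizes are the ones assumed there):

```julia
using Flux, NeuralOperators

model = DeepONet((50, 64, 72), (2, 64, 72), σ, tanh)

x = rand(Float32, 50, 200)  # 200 IC instances, each sampled at 50 sensors
y = rand(Float32, 2, 100)   # 100 (x, t) query points

u = model(x, y)             # 200 × 100: every instance evaluated at every point
```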

src/NeuralOperators.jl

Lines changed: 4 additions & 0 deletions
@@ -8,6 +8,10 @@ module NeuralOperators
 using Zygote
 using ChainRulesCore

+export DeepONet
+
 include("fourier.jl")
 include("model.jl")
+include("DeepONet.jl")
+include("subnets.jl")
 end

src/subnets.jl

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
"""
Construct a `Chain` of `Dense` layers from a given tuple of integers.

Input:
A tuple `(m, n, o, p)` of integers, each describing the width of the i-th `Dense` layer to construct.

Output:
A Flux `Chain` of `Dense` layers, one fewer than the length of the input tuple, with the individual widths given by the tuple elements.

# Example

```julia
julia> model = NeuralOperators.construct_subnet((2,128,64,32,1))
Chain(
  Dense(2, 128),  # 384 parameters
  Dense(128, 64), # 8_256 parameters
  Dense(64, 32),  # 2_080 parameters
  Dense(32, 1),   # 33 parameters
)                 # Total: 8 arrays, 10_753 parameters, 42.504 KiB.

julia> model([2,1])
1-element Vector{Float32}:
 -0.7630446
```
"""
function construct_subnet(architecture::Tuple, σ = identity;
                          init=Flux.glorot_uniform, bias=true)
    # First, create an array that contains all Dense layers independently;
    # an n-element architecture yields n-1 layers
    layers = Array{Flux.Dense}(undef, length(architecture)-1)
    @inbounds for i in 2:length(architecture)
        layers[i-1] = Flux.Dense(architecture[i-1], architecture[i], σ;
                                 init=init, bias=bias)
    end

    # Splat the layers into a Chain
    return Flux.Chain(layers...)
end
