Conversation
Sample "flat" version template <int P_1D, int Q_1D>
inline __device__ void WeightTensor2dFlattened(SharedData_Cuda &data, const CeedScalar *__restrict__ q_weight_1d, CeedScalar *w) {
const int max = P_1D < Q_1D ? P_1D : Q_1D;
WeightTensor2d_Core<Q_1D>(data, data.t_id_x % max, data.t_id_x / max, q_weight_1d, w);
}
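For reference, a minimal sketch of what the `WeightTensor2d_Core` helper this calls might look like; the actual signature and bounds handling in the branch may differ, so treat the details here as assumptions:

```cpp
// Hypothetical sketch of the core helper, assuming it receives the decomposed
// 2D coordinates and masks off threads outside the Q_1D x Q_1D quadrature grid
template <int Q_1D>
inline __device__ void WeightTensor2d_Core(SharedData_Cuda &data, const int t_id_x, const int t_id_y,
                                           const CeedScalar *__restrict__ q_weight_1d, CeedScalar *w) {
  if (t_id_x < Q_1D && t_id_y < Q_1D) {
    // 2D tensor-product quadrature weight: w(i, j) = w_1d(i) * w_1d(j)
    w[0] = q_weight_1d[t_id_x] * q_weight_1d[t_id_y];
  }
}
```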
Ok, I need to check on the memory in the... Also, I'm not sure how 3D will be tackled.
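One way 3D might fall out of the same pattern is a three-way decomposition of the flat index. This is purely a sketch; `WeightTensor3d_Core` and its signature are assumptions:

```cpp
// Hypothetical 3D analogue: split the flat thread index into (x, y, z).
// Assumes a 1D thread block with at least max^3 threads, with the (assumed)
// core helper masking threads outside the Q_1D^3 quadrature grid
template <int P_1D, int Q_1D>
inline __device__ void WeightTensor3dFlattened(SharedData_Cuda &data, const CeedScalar *__restrict__ q_weight_1d, CeedScalar *w) {
  const int max = P_1D > Q_1D ? P_1D : Q_1D;

  WeightTensor3d_Core<Q_1D>(data, data.t_id_x % max, (data.t_id_x / max) % max, data.t_id_x / (max * max), q_weight_1d, w);
}
```

Note that a fully flat block needs max^3 threads in 3D, which blows past typical CUDA block limits even at moderate orders, so the real 3D strategy may need a loop over z slices instead.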
Ok, there's going to need to be more disentangling, as the operator I'm trying to target has different dims for different nodal spaces.
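For concreteness, a hypothetical sketch of what per-field dims could look like in the kernel template parameters (all names are illustrative, not the actual codegen output):

```cpp
// Hypothetical sketch: each field carries its own dimension in the template
// parameters instead of sharing one operator-wide DIM. Names are illustrative.
template <int DIM_U, int P_1D_U,  // e.g. input field on a tensor-product nodal space
          int DIM_V, int Q_V>     // e.g. output field on a non-tensor nodal space
inline __device__ void ApplyPerFieldDims(SharedData_Cuda &data, const CeedScalar *__restrict__ d_u, CeedScalar *__restrict__ d_v) {
  // Each basis action would be selected from its own DIM_* parameter here,
  // rather than assuming every nodal space shares the operator dimension
}
```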
Force-pushed from 9e04d4c to af7851a
Ok, separate dims for each field now, but there's some bug that's giving wrong results.
Force-pushed from 474fba4 to d334e61
Ugh, T_1D is wrong for this strategy. Pondering the options: move the slice, or stand up fully separate versions? Fully separate is probably the way to go at this point.
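For context on why T_1D bites here: assuming T_1D is the shared 1D thread-block extent (the max of every P_1D and Q_1D in the operator), the square block that serves the tensor strategy does not give the flattened strategy enough threads along x. A hypothetical host-side sizing sketch, with all names assumed:

```cpp
// Hypothetical sizing sketch; T_1D definition and names are assumptions.
// The square (T_1D, T_1D) block suits 2D tensor indexing, but the flattened
// decomposition wants all T_1D * T_1D threads along x.
const int P_1D           = 3, Q_1D = 4;                // example orders
const int T_1D           = (P_1D > Q_1D) ? P_1D : Q_1D;
const int elem_per_block = 1;                          // illustrative only
dim3      block_tensor(T_1D, T_1D, elem_per_block);    // classic tensor strategy
dim3      block_flat(T_1D * T_1D, 1, elem_per_block);  // flattened strategy
```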
Force-pushed from d334e61 to 336ccc0
Getting closer. 2D Tensor + 3D NonTensor is working now. Need 3D Tensor + 3D NonTensor next.
I'm starting to think this wasn't worth the time spent here - I think the restrictions also need to be hacked around. I'm going to abandon this effort for now and switch to AtPoints assembly.
Force-pushed from 292b18f to 8e1cf66
Holy crap, we've got non-tensor with 2D tensor working perfectly now. Just need non-tensor with 3D tensor.
Force-pushed from 4d404ab to 9a751b7
Force-pushed from abac4c0 to f33ffa5
Ok, ready for review. Plan is to squash + merge.
Force-pushed from f33ffa5 to b8245c6
Last step in the *gen refactor. This will allow us to run operators that have a mix of tensor product and non-tensor bases. 2D is easier; 3D will take more thought.
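To make the mixing concrete: the point of the flattened variants above is that tensor and non-tensor fields in one operator can share a single 1D thread layout over the quadrature points. A hedged sketch of the index mapping, with names assumed in the style of the helper above:

```cpp
// Hypothetical sketch of one flat thread layout serving both basis kinds.
// Assumes both fields share the same quadrature space, with the non-tensor
// field holding Q_1D * Q_1D points in 2D
template <int Q_1D>
inline __device__ void MixedQuadratureIndices(SharedData_Cuda &data, int &x_tensor, int &y_tensor, int &q_nontensor) {
  // Tensor-product field: decompose the flat index into 2D coordinates
  x_tensor = data.t_id_x % Q_1D;
  y_tensor = data.t_id_x / Q_1D;
  // Non-tensor field: the flat index is already the quadrature point index
  q_nontensor = data.t_id_x;
}
```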