You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
doing ggml_concat at on multiple layers can increase memory usage. one trick is to allocate one big tensor, then use ggml_set_rows to copy the intermediate result into the allocated tensor.