Hi @dimon777, just assigning 2 device IDs on the command line is not enough. For multi-GPU benchmarking/simulations you need to make some small adjustments to the benchmark setup (`-` lines removed, `+` lines added):
#ifdef BENCHMARK
#include "info.hpp"
void main_setup() { // benchmark; required extensions in defines.hpp: BENCHMARK, optionally FP16S or FP16C
// ################################################################## define simulation box size, viscosity and volume force ###################################################################
uint mlups = 0u; {
//LBM lbm( 32u, 32u, 32u, 1.0f);
//LBM lbm( 64u, 64u, 64u, 1.0f);
//LBM lbm(128u, 128u, 128u, 1.0f);
- LBM lbm(256u, 256u, 256u, 1.0f); // default
+ //LBM lbm(256u, 256u, 256u, 1.0f); // default
//LBM lbm(384u, 384u, 384u, 1.0f);
//LBM lbm(512u, 512u, 512u, 1.0f);
- //const uint memory = 31500u; // memory occupation in MB (for multi-GPU benchmarks: make this close to as large as the GPU's VRAM capacity)
- //const uint3 lbm_N = (resolution(float3(1.0f, 1.0f, 1.0f), memory)/4u)*4u; // input: simulation box aspect ratio and VRAM occupation in MB, output: grid resolution
+ const uint memory = 1488u; // memory occupation in MB (for multi-GPU benchmarks: make this close to as large as the GPU's VRAM capacity)
+ const uint3 lbm_N = (resolution(float3(1.0f, 1.0f, 1.0f), memory)/4u)*4u; // input: simulation box aspect ratio and VRAM occupation in MB, output: grid resolution
//LBM lbm(1u*lbm_N.x, 1u*lbm_N.y, 1u*lbm_N.z, 1u, 1u, 1u, 1.0f); // 1 GPU
- //LBM lbm(2u*lbm_N.x, 1u*lbm_N.y, 1u*lbm_N.z, 2u, 1u, 1u, 1.0f); // 2 GPUs
+ LBM lbm(2u*lbm_N.x, 1u*lbm_N.y, 1u*lbm_N.z, 2u, 1u, 1u, 1.0f); // 2 GPUs
//LBM lbm(2u*lbm_N.x, 2u*lbm_N.y, 1u*lbm_N.z, 2u, 2u, 1u, 1.0f); // 4 GPUs
//LBM lbm(2u*lbm_N.x, 2u*lbm_N.y, 2u*lbm_N.z, 2u, 2u, 2u, 1.0f); // 8 GPUs
// #########################################################################################################################################################################################
- for(uint i=0u; i<1000u; i++) {
- lbm.run(10u, 1000u*10u);
+ for(uint i=0u; i<100u; i++) {
+ lbm.run(10u, 100u*10u);
mlups = max(mlups, to_uint((double)lbm.get_N()*1E-6/info.runtime_lbm_timestep_smooth));
}
} // make lbm object go out of scope to free its memory
print_info("Peak MLUPs/s = "+to_string(mlups));
#if defined(_WIN32)
wait();
#endif // Windows
} /**/
#endif // BENCHMARK

Then, recompile and run.
Thanks and kind regards,
I have a Debian 12 system with two AMD R9700 GPUs. `./make.sh test` runs fine on one GPU, but I can't run the built-in simulation on both GPUs: `$ ./bin/FluidX3D 0 1` reports an error, and the test still runs on one GPU. What am I doing wrong?