Replies: 9 comments 2 replies
-
Automatic bug screening for operator libraries. Q: How do we tell a new hardware vendor exactly which operator has a bug?
-
The generated unit-test files must be able to run standalone; they must not depend on the GraphNet repository.
-
On the NVIDIA machine: ./gn-op-unittest-0.py --role=reference
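To make the "standalone file plus --role flag" idea concrete, here is a minimal sketch of what such a generated file might look like. Only --role=reference comes from the thread; the "target" role value, the --seed flag, and the ReLU stand-in operator are assumptions for illustration. It imports nothing outside the standard library, so it has no GraphNet dependency.

```python
import argparse
import random
from typing import List, Optional, Tuple


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Standalone operator unit test")
    # Only --role=reference appears in the thread; "target" is an assumed counterpart.
    parser.add_argument("--role", choices=["reference", "target"], required=True)
    # Assumed flag: the seed that makes inputs reproducible across machines.
    parser.add_argument("--seed", type=int, default=0)
    return parser


def run_op(seed: int) -> Tuple[List[float], ...]:
    """Placeholder for the operator under test: seeded inputs in, outputs out."""
    rng = random.Random(seed)
    inputs = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    # ReLU as a stand-in operator; a real file would embed the extracted op.
    return ([max(x, 0.0) for x in inputs],)


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    outputs = run_op(args.seed)
    print(f"role={args.role} seed={args.seed} outputs={outputs}")
    return 0
```

Calling main(["--role=reference"]) mimics the command line above; a real script would end with the usual `if __name__ == "__main__":` entry point that calls main().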
-
There should be a server behind ./gn-op-unittest-0.py --role=reference. The whole result-comparison mechanism is then expressed as a server/client architecture.
-
The server should be abstracted as an RPC call. This RPC call accepts a single parameter, the random seed, and returns a tuple[Tensor].
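A rough sketch of the "seed in, tuple of outputs out" RPC, using Python's standard xmlrpc modules. Everything here is an assumption for illustration: the thread does not name an RPC framework, run_reference, start_server, and SeededRemoteCall are hypothetical names, and plain lists of floats stand in for Tensor because XML-RPC cannot carry tensors directly.

```python
import random
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def run_reference(seed: int) -> list:
    """Stand-in for running the sample on the reference device with seeded inputs."""
    rng = random.Random(seed)
    inputs = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    # One output "tensor" (a list of floats); XML-RPC transports tuples as lists.
    return [[max(x, 0.0) for x in inputs]]


def start_server(host: str = "127.0.0.1", port: int = 0) -> SimpleXMLRPCServer:
    """Serve run_reference in a background thread; port 0 lets the OS pick a free port."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(run_reference, "run")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


class SeededRemoteCall:
    """Client-side callable (hypothetical name): takes a seed, returns a tuple of outputs."""

    def __init__(self, machine: str, port: int):
        self._proxy = ServerProxy(f"http://{machine}:{port}")

    def __call__(self, seed: int) -> tuple:
        # Convert the list received over the wire back into a tuple.
        return tuple(self._proxy.run(seed))
```

Because the server derives its inputs from the seed alone, two calls with the same seed return the same outputs, which is what makes a later comparison against another device meaningful.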
-
We could develop a package, graph_net_toolkits:
# Runs on the client side
import graph_net_toolkits as gntk
ref_rpc_call = gntk.ReferenceRpcCall(machine=xxx, port=xxx)
ret: tuple[Tensor] = ref_rpc_call(sample_model_path, randomseed=xxx)  # executed on another machine internally via RPC
-
# Runs on the client side
import graph_net_toolkits as gntk
sample_rpc_call = gntk.SampleRpcCall(machine=xxx, port=xxx)
ret: tuple[Tensor] = sample_rpc_call(sample_model_path, randomseed=xxx)  # executed on another machine internally via RPC
-
Four roles: reference_device_server, reference_device_client, target_device_server, target_device_client. The call direction is reference_device_server <- reference_device_client. At most 4 processes are needed: [[reference_device_server], [reference_device_client], [target_device_server], [target_device_client]]
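One way to picture how the four roles fit together is the sketch below. The role names come from the comment above; the Role enum, compare_devices, and the tolerance values are assumptions, and the two callables stand in for the reference and target clients, each of which would forward the seed to its own server. Lists of floats again stand in for Tensor.

```python
import math
from enum import Enum
from typing import Callable, List, Sequence, Tuple

Outputs = Sequence[Sequence[float]]


class Role(Enum):
    REFERENCE_DEVICE_SERVER = "reference_device_server"
    REFERENCE_DEVICE_CLIENT = "reference_device_client"
    TARGET_DEVICE_SERVER = "target_device_server"
    TARGET_DEVICE_CLIENT = "target_device_client"


def compare_devices(
    reference_call: Callable[[int], Outputs],
    target_call: Callable[[int], Outputs],
    seed: int,
    rtol: float = 1e-5,  # assumed tolerance
    atol: float = 1e-8,  # assumed tolerance
) -> List[Tuple[int, int]]:
    """Return (output_index, element_index) pairs where the target deviates."""
    reference = reference_call(seed)
    target = target_call(seed)
    if len(reference) != len(target):
        raise ValueError("reference and target returned different output counts")
    mismatches = []
    for i, (ref_out, tgt_out) in enumerate(zip(reference, target)):
        if len(ref_out) != len(tgt_out):
            raise ValueError(f"output {i} has different lengths")
        for j, (r, t) in enumerate(zip(ref_out, tgt_out)):
            if not math.isclose(r, t, rel_tol=rtol, abs_tol=atol):
                mismatches.append((i, j))
    return mismatches
```

An empty result means the target device agrees with the reference for that seed; a non-empty result points at the unit test, and therefore the operator, that needs attention.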
-
# Runs on the client side
import graph_net_bench as gnb
sample_remote_executor = gnb.SampleRemoteExecutor(machine=xxx, port=yyy)
ret: tuple[Tensor] = sample_remote_executor(sample_model_path, random_seed=zzz)  # executed on another machine internally via RPC
-
discussion for AutoFaultLocator