Commit d3cfa31 (1 parent: be55afe)

committed: added qnn issue

File tree: 1 file changed (+3 −0 lines)


articles/ai-foundry/foundry-local/reference/reference-best-practice.md: 3 additions & 0 deletions
@@ -35,6 +35,7 @@ foundry model info <model> --license
 
 Foundry Local is designed for on-device inference and *not* distributed, containerized, or multi-machine production deployments.
 
+
 ## Troubleshooting
 
 ### Common issues and solutions
@@ -44,11 +45,13 @@ Foundry Local is designed for on-device inference and *not* distributed, contain
 | Slow inference | CPU-only model with large parameter count | Use GPU-optimized model variants when available |
 | Model download failures | Network connectivity issues | Check your internet connection and run `foundry cache list` to verify cache status |
 | The service fails to start | Port conflicts or permission issues | Try `foundry service restart` or [report an issue](https://github.com/microsoft/Foundry-Local/issues) with logs using `foundry zip-logs` |
+| Qualcomm NPU error (`Qnn error code 5005: "Failed to load from EpContext model. qnn_backend_manager."`) | Failure in the Qualcomm QNN backend while loading the precompiled (EpContext) model | |
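
The diagnostic commands cited in the table can be chained into a quick first-pass check. A minimal sketch, using only commands that appear verbatim in this doc (output varies by install):

```bash
# Model download failures: confirm what is already cached locally
foundry cache list

# Service fails to start: restart the local service
foundry service restart

# Still failing: bundle service logs to attach to a GitHub issue report
foundry zip-logs
```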
 
 ### Improving performance
 
 If you experience slow inference, consider the following strategies:
 
+- Running ONNX models in the AI Toolkit for VS Code at the same time causes resource contention. Stop the AI Toolkit inference session before running Foundry Local.
 - Use GPU acceleration when available
 - Identify bottlenecks by monitoring memory usage during inference
 - Try more quantized model variants, like INT8 instead of FP16 (see the sketch after this list)
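
For the quantization strategy, you can list the available variants and inspect a candidate before switching. A minimal sketch: `foundry model info <model> --license` appears earlier in this doc, while `foundry model list` is assumed to be available in your CLI build:

```bash
# List models Foundry Local can run on this machine; variant names
# typically indicate precision and target hardware (CPU, GPU, NPU)
foundry model list

# Inspect a candidate variant (and its license) before switching to it
foundry model info <model> --license
```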
