17 changes: 15 additions & 2 deletions .gitignore
@@ -136,8 +136,21 @@ kokoro-multi-lang-v1_0
sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16
cmake-build-debug
README-DEV.txt
*.rknn
*.jit

# WASM combined build artifacts
wasm/combined/*.wasm
wasm/combined/sherpa-onnx-wasm-combined.js
build-wasm-combined/
# Don't ignore the build script
!build-wasm-combined.sh

##clion
.idea
scripts/dotnet/examples/obj/Debug/net8.0/Common.AssemblyInfo.cs
scripts/dotnet/examples/obj/Debug/net8.0/Common.GeneratedMSBuildEditorConfig.editorconfig
scripts/dotnet/examples/obj/Debug/net8.0/Common.AssemblyInfoInputs.cache
wasm/asr/sherpa-onnx-wasm-main-asr.data
wasm/asr/sherpa-onnx-wasm-main-asr.js
wasm/asr/sherpa-onnx-wasm-main-asr.wasm

sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02
47 changes: 47 additions & 0 deletions build-wasm-combined.sh
@@ -0,0 +1,47 @@
#!/bin/bash
#
# Copyright (c) 2024 Xiaomi Corporation

# Exit on error and print commands
set -ex

echo "=== Starting build process for sherpa-onnx WASM combined ==="

# Set environment flag to indicate we're using this script
export SHERPA_ONNX_IS_USING_BUILD_WASM_SH=1

# Create build directory
mkdir -p build-wasm-combined
cd build-wasm-combined

echo "=== Running CMake configuration ==="
# Configure with CMake
emcmake cmake \
-DCMAKE_BUILD_TYPE=Release \
-DSHERPA_ONNX_ENABLE_WASM=ON \
-DSHERPA_ONNX_ENABLE_CHECK=OFF \
-DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \
-DSHERPA_ONNX_ENABLE_BINARY=OFF \
-DSHERPA_ONNX_ENABLE_PYTHON=OFF \
-DSHERPA_ONNX_ENABLE_JNI=OFF \
-DSHERPA_ONNX_ENABLE_C_API=ON \
-DSHERPA_ONNX_ENABLE_TEST=OFF \
-DSHERPA_ONNX_ENABLE_WASM_COMBINED=ON \
-DSHERPA_ONNX_INSTALL_TO_REPO=ON \
..

echo "=== Building the target ==="
# Build the target with full path to the target
emmake make -j $(nproc) sherpa-onnx-wasm-combined

echo "=== Installing the files ==="
# Install the files
emmake make install/strip

if [ $? -eq 0 ]; then
echo "=== Build completed successfully! ==="
echo "Files have been installed to bin/wasm/combined and copied to wasm/combined/"
else
echo "=== Build failed! Check the error messages above ==="
exit 1
fi
4 changes: 4 additions & 0 deletions wasm/CMakeLists.txt
@@ -29,3 +29,7 @@ endif()
if(SHERPA_ONNX_ENABLE_WASM_NODEJS)
add_subdirectory(nodejs)
endif()

if(SHERPA_ONNX_ENABLE_WASM_COMBINED)
add_subdirectory(combined)
endif()
7 changes: 7 additions & 0 deletions wasm/combined/.gitignore
@@ -0,0 +1,7 @@
# Generated WASM files
*.wasm
sherpa-onnx-wasm-combined.js
sherpa-onnx-wasm-combined.data
# Local model files
*.onnx
*tokens.txt
220 changes: 220 additions & 0 deletions wasm/combined/CMakeLists.txt
@@ -0,0 +1,220 @@
if(NOT $ENV{SHERPA_ONNX_IS_USING_BUILD_WASM_SH})
message(FATAL_ERROR "Please use ./build-wasm-combined.sh to build for wasm combined module")
endif()

# Check for asset directories
if(NOT IS_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/assets/asr")
message(WARNING "ASR assets directory not found at ${CMAKE_CURRENT_SOURCE_DIR}/assets/asr")
endif()

if(NOT IS_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/assets/vad")
message(WARNING "VAD assets directory not found at ${CMAKE_CURRENT_SOURCE_DIR}/assets/vad")
endif()

if(NOT IS_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/assets/tts")
message(WARNING "TTS assets directory not found at ${CMAKE_CURRENT_SOURCE_DIR}/assets/tts")
endif()

if(NOT IS_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/assets/kws")
message(WARNING "KWS assets directory not found at ${CMAKE_CURRENT_SOURCE_DIR}/assets/kws")
endif()

# Collect all exported functions from all modules
set(exported_functions
# Core utilities
CopyHeap
MyPrintOnlineASR
MyPrintVAD
MyPrintTTS
MyPrintSpeakerDiarization
MyPrintSpeechEnhancement
MyPrintKeywordSpotting
SherpaOnnxFileExists

# Online ASR
SherpaOnnxCreateOnlineRecognizer
SherpaOnnxCreateOnlineStream
SherpaOnnxDecodeOnlineStream
SherpaOnnxDestroyOfflineStreamResultJson
SherpaOnnxDestroyOnlineRecognizer
SherpaOnnxDestroyOnlineRecognizerResult
SherpaOnnxDestroyOnlineStream
SherpaOnnxDestroyOnlineStreamResultJson
SherpaOnnxGetOfflineStreamResultAsJson
SherpaOnnxGetOnlineStreamResult
SherpaOnnxGetOnlineStreamResultAsJson
SherpaOnnxIsOnlineStreamReady
SherpaOnnxOnlineStreamAcceptWaveform
SherpaOnnxOnlineStreamInputFinished
SherpaOnnxOnlineStreamIsEndpoint
SherpaOnnxOnlineStreamReset

# Offline ASR
SherpaOnnxCreateOfflineRecognizer
SherpaOnnxCreateOfflineStream
SherpaOnnxDecodeOfflineStream
SherpaOnnxDecodeMultipleOfflineStreams
SherpaOnnxDestroyOfflineRecognizer
SherpaOnnxDestroyOfflineRecognizerResult
SherpaOnnxDestroyOfflineStream
SherpaOnnxAcceptWaveformOffline
SherpaOnnxGetOfflineStreamResult

# TTS
SherpaOnnxCreateOfflineTts
SherpaOnnxDestroyOfflineTts
SherpaOnnxDestroyOfflineTtsGeneratedAudio
SherpaOnnxOfflineTtsGenerate
SherpaOnnxOfflineTtsGenerateWithCallback
SherpaOnnxOfflineTtsSampleRate
SherpaOnnxOfflineTtsNumSpeakers
SherpaOnnxWriteWave

# VAD
SherpaOnnxCreateCircularBuffer
SherpaOnnxDestroyCircularBuffer
SherpaOnnxCircularBufferPush
SherpaOnnxCircularBufferGet
SherpaOnnxCircularBufferFree
SherpaOnnxCircularBufferPop
SherpaOnnxCircularBufferSize
SherpaOnnxCircularBufferHead
SherpaOnnxCircularBufferReset
SherpaOnnxCreateVoiceActivityDetector
SherpaOnnxDestroyVoiceActivityDetector
SherpaOnnxVoiceActivityDetectorAcceptWaveform
SherpaOnnxVoiceActivityDetectorEmpty
SherpaOnnxVoiceActivityDetectorDetected
SherpaOnnxVoiceActivityDetectorPop
SherpaOnnxVoiceActivityDetectorClear
SherpaOnnxVoiceActivityDetectorFront
SherpaOnnxDestroySpeechSegment
SherpaOnnxVoiceActivityDetectorReset
SherpaOnnxVoiceActivityDetectorFlush

# KWS
SherpaOnnxCreateKeywordSpotter
SherpaOnnxDestroyKeywordSpotter
SherpaOnnxCreateKeywordStream
SherpaOnnxIsKeywordStreamReady
SherpaOnnxDecodeKeywordStream
SherpaOnnxResetKeywordStream
SherpaOnnxGetKeywordResult
SherpaOnnxDestroyKeywordResult
)

set(mangled_exported_functions)
foreach(x IN LISTS exported_functions)
list(APPEND mangled_exported_functions "_${x}")
endforeach()
list(JOIN mangled_exported_functions "," all_exported_functions)

include_directories(${CMAKE_SOURCE_DIR})
set(MY_FLAGS " -s FORCE_FILESYSTEM=1 -s INITIAL_MEMORY=512MB -s ALLOW_MEMORY_GROWTH=1")
string(APPEND MY_FLAGS " -sSTACK_SIZE=10485760 ") # 10MB
string(APPEND MY_FLAGS " -sASYNCIFY=1 -sFETCH=1 ") # For async loading
string(APPEND MY_FLAGS " -sEXPORTED_FUNCTIONS=[_malloc,_free,${all_exported_functions}] ")
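# Runtime methods used by the JavaScript wrappers. Assets are preloaded below only when the matching assets/ directory exists; otherwise models must be loaded at runtime.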
string(APPEND MY_FLAGS " -sEXPORTED_RUNTIME_METHODS=['ccall','stringToUTF8','setValue','getValue','lengthBytesUTF8','UTF8ToString','FS'] ")

# Load precompiled assets using structured paths
if(IS_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/assets/asr")
string(APPEND MY_FLAGS "--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets/asr@/sherpa_assets/asr ")
endif()

if(IS_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/assets/vad")
string(APPEND MY_FLAGS "--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets/vad@/sherpa_assets/vad ")
endif()

if(IS_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/assets/tts")
string(APPEND MY_FLAGS "--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets/tts@/sherpa_assets/tts ")
endif()

if(IS_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/assets/kws")
string(APPEND MY_FLAGS "--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets/kws@/sherpa_assets/kws ")
endif()

if(IS_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/assets/speakers")
string(APPEND MY_FLAGS "--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets/speakers@/sherpa_assets/speakers ")
endif()

if(IS_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/assets/enhancement")
string(APPEND MY_FLAGS "--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets/enhancement@/sherpa_assets/enhancement ")
endif()

message(STATUS "MY_FLAGS: ${MY_FLAGS}")

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${MY_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${MY_FLAGS}")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${MY_FLAGS}")

add_executable(sherpa-onnx-wasm-combined sherpa-onnx-wasm-combined.cc)
target_link_libraries(sherpa-onnx-wasm-combined sherpa-onnx-c-api)
install(TARGETS sherpa-onnx-wasm-combined DESTINATION bin/wasm/combined)

install(
FILES
"$<TARGET_FILE_DIR:sherpa-onnx-wasm-combined>/sherpa-onnx-wasm-combined.js"
"$<TARGET_FILE_DIR:sherpa-onnx-wasm-combined>/sherpa-onnx-wasm-combined.wasm"
"$<TARGET_FILE_DIR:sherpa-onnx-wasm-combined>/sherpa-onnx-wasm-combined.data"
"sherpa-onnx-core.js"
"sherpa-onnx-asr.js"
"sherpa-onnx-vad.js"
"sherpa-onnx-tts.js"
"sherpa-onnx-kws.js"
"sherpa-onnx-speaker.js"
"sherpa-onnx-enhancement.js"
"sherpa-onnx-combined.js"
DESTINATION
bin/wasm/combined
)

# Add option to install to original repo
option(SHERPA_ONNX_INSTALL_TO_REPO "Install compiled WASM files to original repo directory" OFF)
set(SHERPA_ONNX_REPO_PATH "${CMAKE_SOURCE_DIR}/wasm/combined" CACHE PATH "Path to original repo wasm directory")

if(SHERPA_ONNX_INSTALL_TO_REPO)
# Add a custom target that runs as part of the default build, once sherpa-onnx-wasm-combined has been built
add_custom_target(install_to_repo ALL
COMMAND ${CMAKE_COMMAND} -E echo "Installing to original repo at ${SHERPA_ONNX_REPO_PATH}..."
COMMAND ${CMAKE_COMMAND} -E make_directory ${SHERPA_ONNX_REPO_PATH}

# Copy the JS file
COMMAND ${CMAKE_COMMAND}
-DSRC_DIR=${CMAKE_BINARY_DIR}/bin
-DDEST_DIR=${SHERPA_ONNX_REPO_PATH}
-DCOPY_FILES="sherpa-onnx-wasm-combined.js"
-P ${CMAKE_CURRENT_SOURCE_DIR}/copy_with_confirm.cmake

# Copy the WASM file
COMMAND ${CMAKE_COMMAND}
-DSRC_DIR=${CMAKE_BINARY_DIR}/bin
-DDEST_DIR=${SHERPA_ONNX_REPO_PATH}
-DCOPY_FILES="sherpa-onnx-wasm-combined.wasm"
-P ${CMAKE_CURRENT_SOURCE_DIR}/copy_with_confirm.cmake

# Copy the DATA file
COMMAND ${CMAKE_COMMAND}
-DSRC_DIR=${CMAKE_BINARY_DIR}/bin
-DDEST_DIR=${SHERPA_ONNX_REPO_PATH}
-DCOPY_FILES="sherpa-onnx-wasm-combined.data"
-P ${CMAKE_CURRENT_SOURCE_DIR}/copy_with_confirm.cmake

# Copy the index.html file
COMMAND ${CMAKE_COMMAND}
-DSRC_DIR=${CMAKE_CURRENT_SOURCE_DIR}
-DDEST_DIR=${SHERPA_ONNX_REPO_PATH}
-DCOPY_FILES="index.html"
-P ${CMAKE_CURRENT_SOURCE_DIR}/copy_with_confirm.cmake

# Copy the JS library file
COMMAND ${CMAKE_COMMAND}
-DSRC_DIR=${CMAKE_CURRENT_SOURCE_DIR}
-DDEST_DIR=${SHERPA_ONNX_REPO_PATH}
-DCOPY_FILES="sherpa-onnx-combined.js"
-P ${CMAKE_CURRENT_SOURCE_DIR}/copy_with_confirm.cmake

DEPENDS sherpa-onnx-wasm-combined
COMMENT "Checking and installing WASM files to original repo"
)
endif()
42 changes: 42 additions & 0 deletions wasm/combined/ISSUE.md
@@ -0,0 +1,42 @@
# Sherpa-ONNX WASM Combined Module Issue: Inconsistent Shared Module Context & HEAPF32 Access Failure

## Problem Description

The core issue is a fundamental limitation in the current Sherpa-ONNX WASM combined module architecture: **it fails to establish a reliably shared and synchronized WebAssembly (WASM) runtime context across the multiple, sequentially loaded JavaScript component files** (`sherpa-onnx-combined-core.js`, `sherpa-onnx-combined-asr.js`, etc.). Specifically, essential JavaScript views onto the WASM memory, like `HEAPF32`, are not consistently accessible across these script boundaries.

### Background: WASM Memory and HEAP Views

- **WASM Linear Memory**: WebAssembly modules operate on a contiguous block of memory.
- **Emscripten HEAP Views**: To let JavaScript interact with this memory, Emscripten (the compiler used) creates typed array views (e.g., `Float32Array`, `Int8Array`) that point directly into this memory block. These views are exposed on the global `Module` object as properties such as `Module.HEAPF32`, `Module.HEAP8`, and `Module.HEAPU8`. Depending on the Emscripten version, a view may be attached to `Module` only if it is listed in `-sEXPORTED_RUNTIME_METHODS`.
- **Initialization**: These `HEAP*` views are crucial for JS-WASM communication. They are normally initialized by the main Emscripten glue code (`sherpa-onnx-wasm-combined.js` in this case) *after* the WASM memory buffer is allocated and no later than the `Module.onRuntimeInitialized` callback, which signals that the runtime is ready.
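
To make the background concrete, here is a minimal sketch of how typed array views alias one block of linear memory. It uses a plain `WebAssembly.Memory` as a stand-in for the memory Emscripten allocates; the names `demoMemory`, `demoHeapF32`, and `demoHeapU8` are illustrative and do not come from the generated glue code. It also shows one reason a view can appear unusable: growing the memory (which `ALLOW_MEMORY_GROWTH=1` permits) detaches the old buffer, so any view created earlier must be recreated.

```javascript
// Stand-in for the linear memory Emscripten allocates (1 page = 64 KiB).
const demoMemory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

// Two views over the same bytes, analogous to HEAPF32 and HEAPU8.
let demoHeapF32 = new Float32Array(demoMemory.buffer);
let demoHeapU8 = new Uint8Array(demoMemory.buffer);

// Write one float at byte address 16, as C code would through a float*.
const byteOffset = 16;
demoHeapF32[byteOffset / 4] = 1.0; // Float32 index = byte address / 4

// The same four bytes are visible through the byte view.
const rawBytes = Array.from(demoHeapU8.subarray(byteOffset, byteOffset + 4));

// Growing the memory detaches the old ArrayBuffer: existing views become empty.
demoMemory.grow(1);
const staleLength = demoHeapF32.length;

// Recreate the views from the new buffer; the stored data is preserved.
demoHeapF32 = new Float32Array(demoMemory.buffer);
demoHeapU8 = new Uint8Array(demoMemory.buffer);
const valueAfterGrow = demoHeapF32[byteOffset / 4];

console.log({ rawBytes, staleLength, valueAfterGrow });
```

Code that caches a view in a local variable is therefore fragile; reading `Module.HEAPF32` at the moment of use avoids holding a stale view.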

### Detailed Explanation of the Failure

1. **Context/Scope Separation & HEAP Inaccessibility**: Despite all scripts referencing the global `window.Module`, they appear to operate within distinct execution contexts. Crucially, the standard `HEAP*` memory views (especially `HEAPF32`, essential for ASR audio data transfer) that *should* be initialized on `window.Module` by the main glue code are **not accessible or visible** within the context of the subsequently loaded component scripts (e.g., `sherpa-onnx-combined-core.js`). The repeated log messages `No suitable memory buffer found` and `HEAPF32 exists: false` from within `sherpa-onnx-combined-core.js` are direct evidence of this failure.

2. **Sequential Loading Barrier**: The architecture loads functional components (ASR, VAD, etc.) as separate JS files *after* the main WASM module and its memory are expected to initialize. This sequential loading creates context boundaries that prevent the component scripts from accessing the already initialized `HEAP*` views on the `Module` object.

3. **Initialization Callbacks Ineffective Across Contexts**: Callbacks like `onRuntimeInitialized` might fire in the main glue code's context, but this readiness state (including the availability of initialized `HEAP*` views) does not reliably propagate to the separate contexts of the component scripts.

4. **Runtime Errors**: Consequently, operations that require JavaScript to read or write WASM memory through these views fail. For example, `OnlineStream.acceptWaveform` in ASR must write audio samples to `HEAPF32`. Because `HEAPF32` is inaccessible in the `asr.js` or `core.js` context, the write fails. Downstream errors such as `TypeError: asr.createStream is not a function` then follow, probably because the recognizer itself failed during its own initialization, which may also require memory access.

5. **Selective Functionality Failure (Evidence)**: Functionalities like TTS (`tts.html`) appear less affected. This suggests their JS-WASM interaction pattern doesn't critically rely on the *JavaScript context* having direct write access to `HEAPF32` in the same way streaming ASR does, further supporting that the issue is specific to the accessibility of these memory views across script contexts.
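
The failure mode in items 1 and 4 can be narrowed down with a defensive lookup before any waveform is copied. The sketch below is illustrative only: `getFloat32Heap`, `copySamplesToHeap`, and `mockModule` are hypothetical names, not part of sherpa-onnx, and the fallback assumes that at least `HEAPU8` is present on the module object. `_malloc` is used because the build exports it.

```javascript
// Hypothetical helper: return a usable Float32Array view over the module's
// memory, or throw with a diagnostic instead of failing later.
function getFloat32Heap(wasmModule) {
  const f32 = wasmModule && wasmModule.HEAPF32;
  if (f32 instanceof Float32Array && f32.length > 0) {
    return f32;
  }
  // Assumed fallback: derive a float view from the byte view's buffer.
  const u8 = wasmModule && wasmModule.HEAPU8;
  if (u8 instanceof Uint8Array && u8.length > 0) {
    return new Float32Array(u8.buffer);
  }
  throw new Error("No usable heap view: HEAPF32 and HEAPU8 are missing or detached");
}

// Hypothetical helper: allocate WASM memory and copy samples into it.
// The caller is responsible for releasing the pointer with _free.
function copySamplesToHeap(wasmModule, samples) {
  const ptr = wasmModule._malloc(samples.length * 4); // 4 bytes per float
  getFloat32Heap(wasmModule).set(samples, ptr / 4);
  return ptr;
}

// Mock standing in for the Emscripten Module so the sketch runs anywhere.
// HEAPF32 is deliberately absent to exercise the fallback path.
const mockModule = (() => {
  const buffer = new ArrayBuffer(1024);
  let next = 8; // simple bump allocator, 8-byte aligned
  return {
    HEAPU8: new Uint8Array(buffer),
    _malloc(size) {
      const p = next;
      next += (size + 7) & ~7;
      return p;
    },
  };
})();

const samplePtr = copySamplesToHeap(mockModule, new Float32Array([0.5, -0.25]));
console.log("samples written at byte address", samplePtr);
```

If the lookup throws in the real page, the error message identifies which views are missing at the call site, which is more direct evidence than a later `TypeError`.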

### Impact

- **Unreliable Functionality**: Core features that require JS access to WASM memory views (such as streaming ASR via `HEAPF32`) fail consistently.
- **Debugging Dead End**: Standard synchronization techniques are ineffective because the fundamental issue is the inaccessibility of necessary `HEAP*` views due to context separation.

### Architectural Root Cause

The multi-file JavaScript approach, combined with Emscripten's standard output, fails to guarantee that the essential `HEAP*` memory views initialized on the `Module` object are accessible from the separate JavaScript files loaded later. Each script effectively gets a view of the `Module` object that might lack these critical, dynamically initialized properties.

### Potential Solutions

1. **Unified Script (Likely Viable but with Drawbacks)**: Combine *all* JavaScript glue code (core, ASR, VAD, TTS, etc.) and the main Emscripten `Module` interaction into a **single, large JavaScript file**. This forces all code into the same execution context, ensuring consistent access to the initialized `Module` object and its `HEAP*` views. **Drawback**: Creates a potentially very large initial JS file, impacting load performance.

2. **WASM Module Re-architecture (Complex)**: Fundamentally change how the C++ code is compiled, perhaps using Emscripten features explicitly designed for better JS module interoperability (e.g., `MODULARIZE=1`, ES6 modules output) that might handle state sharing differently. This likely requires significant changes to the build process and C++/JS interface.

3. ~~Delayed Functionality Binding~~ (Proven Ineffective): Delaying execution doesn't solve the problem that the necessary `HEAP*` views are fundamentally inaccessible from within the component script contexts.
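
As a rough illustration of option 2, a `MODULARIZE=1` build exposes a factory function that returns a Promise for the module instance, so each component can receive that instance explicitly instead of reading a global. The sketch below mocks the factory so that it runs without a WASM build; `makeMockInstance`, `createMockModule`, and `WaveformWriter` are hypothetical names, and the real factory name would depend on the `EXPORT_NAME` chosen for the build.

```javascript
// Mock instance with the two members this sketch needs.
function makeMockInstance() {
  const buffer = new ArrayBuffer(4096);
  let next = 8; // simple bump allocator, 8-byte aligned
  return {
    HEAPF32: new Float32Array(buffer),
    _malloc(size) {
      const p = next;
      next += (size + 7) & ~7;
      return p;
    },
  };
}

// Stand-in for the generated factory: a real one would resolve only after
// the runtime is initialized.
function createMockModule() {
  return Promise.resolve(makeMockInstance());
}

// Hypothetical component: it keeps the instance it was given and reads
// HEAPF32 at call time, so it never depends on a global or a cached view.
class WaveformWriter {
  constructor(wasm) {
    this.wasm = wasm;
  }

  write(samples) {
    const ptr = this.wasm._malloc(samples.length * 4);
    this.wasm.HEAPF32.set(samples, ptr / 4);
    return ptr;
  }
}

// Both components share one instance, and therefore one memory.
async function runDemo() {
  const wasm = await createMockModule();
  const asrWriter = new WaveformWriter(wasm);
  const vadWriter = new WaveformWriter(wasm);
  const asrPtr = asrWriter.write(new Float32Array([1, 2]));
  const vadPtr = vadWriter.write(new Float32Array([3]));
  return { wasm, asrPtr, vadPtr };
}

runDemo().then(({ asrPtr, vadPtr }) => {
  console.log("ASR samples at", asrPtr, "VAD samples at", vadPtr);
});
```

Passing the instance explicitly makes the dependency visible in each component's constructor, at the cost of changing how every existing wrapper obtains `Module`.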

This issue highlights a significant architectural challenge. The **Unified Script** approach appears the most practical path forward within the existing build system, despite performance implications.