Avatar Video Streaming in Realtime #68

@ajay4202j

Description

I want to implement an avatar in a streaming setup where I receive audio in chunks every 0.5 to 0.8 seconds, without knowing the total length in advance. Do you have any suggestions on how to implement this? I am referring to stream_pipeline_online.py, the relevant part of which is shown below.

import math

import librosa
import numpy as np

# Initialize the pipeline with the source image/video and the output path.
SDK.setup(source_path, output_path, **setup_kwargs)

# Load the complete audio up front and derive the total number of 25-fps frames.
audio, sr = librosa.core.load(audio_path, sr=16000)
num_f = math.ceil(len(audio) / 16000 * 25)

fade_in = run_kwargs.get("fade_in", -1)
fade_out = run_kwargs.get("fade_out", -1)
ctrl_info = run_kwargs.get("ctrl_info", {})
# setup_Nd takes the total frame count, so it assumes the full audio length is known.
SDK.setup_Nd(N_d=num_f, fade_in=fade_in, fade_out=fade_out, ctrl_info=ctrl_info)

online_mode = SDK.online_mode
if online_mode:
    chunksize = run_kwargs.get("chunksize", (3, 5, 2))
    # Prepend chunksize[0] frames of silence as left context (640 samples = 0.04 s at 16 kHz).
    audio = np.concatenate([np.zeros((chunksize[0] * 640,), dtype=np.float32), audio], 0)
    split_len = int(sum(chunksize) * 0.04 * 16000) + 80  # 6480-sample window per call
    # Slide over the audio in hops of chunksize[1] * 640 = 3200 samples (0.2 s of new audio).
    for i in range(0, len(audio), chunksize[1] * 640):
        audio_chunk = audio[i:i + split_len]
        if len(audio_chunk) < split_len:
            # Zero-pad the final partial window to the full window length.
            audio_chunk = np.pad(audio_chunk, (0, split_len - len(audio_chunk)), mode="constant")
        SDK.run_chunk(audio_chunk, chunksize)

In the current code, sending audio in chunks still requires knowing the total duration of the audio file up front, because num_f is computed from the full waveform and passed to setup_Nd. In my real use case, however, I will receive audio chunks one by one, without any information about the complete audio in advance. How can I implement avatar streaming in this scenario so that the video is generated in real time?
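
Below is a rough, untested sketch of the direction I am considering. It assumes that SDK.run_chunk only needs the current window plus the chunksize tuple, and that N_d in setup_Nd can be set to a generous upper bound when the real length is unknown (max_expected_frames below is a placeholder I made up). Please correct me if either assumption is wrong.

import numpy as np

chunksize = (3, 5, 2)
hop = chunksize[1] * 640                              # 3200 new samples (0.2 s) consumed per call
split_len = int(sum(chunksize) * 0.04 * 16000) + 80   # 6480-sample window, as in the offline loop

SDK.setup(source_path, output_path, **setup_kwargs)
# The offline code derives N_d from the full audio; here I would pass an upper bound
# (e.g. maximum expected duration in seconds * 25) since the true length is unknown.
SDK.setup_Nd(N_d=max_expected_frames, fade_in=-1, fade_out=-1, ctrl_info={})

# Start the buffer with the same left-context silence the offline loop prepends.
buffer = np.zeros((chunksize[0] * 640,), dtype=np.float32)
pos = 0  # start index of the next window to hand to run_chunk

def on_audio_chunk(chunk):
    """Called every 0.5-0.8 s with the newly received 16 kHz float32 samples."""
    global buffer, pos
    buffer = np.concatenate([buffer, chunk], 0)
    # Emit as many full windows as the buffered audio currently allows.
    while len(buffer) - pos >= split_len:
        SDK.run_chunk(buffer[pos:pos + split_len], chunksize)
        pos += hop

def on_stream_end():
    """Flush the remaining audio once the stream closes, zero-padding the last window."""
    global buffer, pos
    while pos < len(buffer):
        window = buffer[pos:pos + split_len]
        if len(window) < split_len:
            window = np.pad(window, (0, split_len - len(window)), mode="constant")
        SDK.run_chunk(window, chunksize)
        pos += hop

Would this work, or does the pipeline need the exact frame count (and fade_out position) before generation starts?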
