feat-riva-client-ts-v01 #114

Draft · wants to merge 1 commit into main
15 changes: 15 additions & 0 deletions riva-ts-client/.eslintrc.js
@@ -0,0 +1,15 @@
module.exports = {
  parser: '@typescript-eslint/parser',
  extends: [
    'plugin:@typescript-eslint/recommended'
  ],
  parserOptions: {
    ecmaVersion: 2020,
    sourceType: 'module'
  },
  rules: {
    '@typescript-eslint/explicit-function-return-type': 'warn',
    '@typescript-eslint/no-explicit-any': 'warn',
    '@typescript-eslint/no-unused-vars': ['error', { 'argsIgnorePattern': '^_' }]
  }
};
197 changes: 197 additions & 0 deletions riva-ts-client/README.md
@@ -0,0 +1,197 @@
# NVIDIA Riva TypeScript Client

A TypeScript implementation of the NVIDIA Riva client, providing a modern, type-safe interface for interacting with NVIDIA Riva services. The client is designed to be fully compatible with the Python implementation while leveraging TypeScript's type system for an enhanced developer experience.

## Features

### Automatic Speech Recognition (ASR)
- Real-time streaming transcription with configurable chunk sizes
- Offline transcription with full audio files
- Word boosting and custom vocabulary
- Speaker diarization with configurable speaker count
- Custom endpointing configuration
- Model selection and listing
- Multi-language support
- WAV file handling and audio format utilities

### Text-to-Speech (TTS)
- High-quality speech synthesis
- Streaming and offline synthesis modes
- Custom dictionary support
- Multi-voice and multi-language support
- SSML support
- Audio format conversion utilities
- WAV file output handling

### Natural Language Processing (NLP)
- Text classification with confidence scores
- Token classification with position information
- Entity analysis with type and score
- Intent recognition with slot filling
- Text transformation
- Natural language query processing
- Language code support

### Neural Machine Translation (NMT)
- Text-to-text translation
- Language pair configuration
- Batch translation support

## Prerequisites

- Node.js (v18.x or later)
- npm (v6.x or later)
- Protocol Buffers compiler (protoc)
- TypeScript (v5.x or later)

## Installation

```bash
npm install nvidia-riva-client
```

## Building from Source

```bash
git clone https://github.com/nvidia-riva/python-clients
cd python-clients/riva-ts-client
npm install
npm run build
```

## Quick Start

### ASR Example
```typescript
import { ASRService, AudioEncoding } from 'nvidia-riva-client';

const asr = new ASRService({
  serverUrl: 'localhost:50051'
});

// Streaming recognition (audioSource is an async iterable of audio chunks)
async function streamingExample() {
  const config = {
    encoding: AudioEncoding.LINEAR_PCM,
    sampleRateHz: 16000,
    languageCode: 'en-US',
    audioChannelCount: 1
  };

  for await (const response of asr.streamingRecognize(audioSource, config)) {
    console.log(response.results[0]?.alternatives[0]?.transcript);
  }
}

// Offline recognition (audioBuffer holds the contents of a complete audio file)
async function offlineExample() {
  const config = {
    encoding: AudioEncoding.LINEAR_PCM,
    sampleRateHz: 16000,
    languageCode: 'en-US',
    audioChannelCount: 1,
    enableSpeakerDiarization: true,
    maxSpeakers: 2
  };

  const response = await asr.recognize(audioBuffer, config);
  console.log(response.results[0]?.alternatives[0]?.transcript);
}
```
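
The `audioSource` and `audioBuffer` values in the examples above are placeholders for your own audio input. As a minimal sketch, assuming `streamingRecognize` accepts an `AsyncIterable<Buffer>`, a chunked file reader can serve as the streaming source (the chunk size here is arbitrary):

```typescript
import { createReadStream } from 'fs';

// Hypothetical helper: read a local audio file in fixed-size chunks so it can be
// fed to streamingRecognize as an async iterable (chunk size is arbitrary here).
async function* fileChunks(path: string, chunkBytes = 4096): AsyncGenerator<Buffer> {
  const stream = createReadStream(path, { highWaterMark: chunkBytes });
  for await (const chunk of stream) {
    yield chunk as Buffer;
  }
}

// Usage with the streaming example above (assumed signature):
// for await (const response of asr.streamingRecognize(fileChunks('speech.wav'), config)) { ... }
```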

### TTS Example
```typescript
import { SpeechSynthesisService } from 'nvidia-riva-client';

const tts = new SpeechSynthesisService({
  serverUrl: 'localhost:50051'
});

async function synthesizeExample() {
  const response = await tts.synthesize('Hello, welcome to Riva!', {
    language: 'en-US',
    voice: 'English-US-Female-1',
    sampleRateHz: 44100,
    customDictionary: {
      'Riva': 'R IY V AH'
    }
  });

  // Save to WAV file
  await response.writeToFile('output.wav');
}
```
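
If you need the raw audio rather than a file on disk, the `wavefile` dependency can wrap PCM samples in a WAV container yourself. This is a sketch under the assumption that the synthesis response exposes its samples as 16-bit PCM (a hypothetical field, not shown in the example above):

```typescript
import { writeFileSync } from 'fs';
import { WaveFile } from 'wavefile';

// Sketch: wrap raw 16-bit PCM samples in a WAV container and write them out.
// `samples` is a placeholder for whatever PCM data the synthesis response exposes.
function saveAsWav(samples: Int16Array, sampleRateHz: number, path: string): void {
  const wav = new WaveFile();
  // Mono, 16-bit samples at the requested rate
  wav.fromScratch(1, sampleRateHz, '16', samples);
  writeFileSync(path, wav.toBuffer());
}
```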

### NLP Example
```typescript
import { NLPService } from 'nvidia-riva-client';

const nlp = new NLPService({
  serverUrl: 'localhost:50051'
});

async function nlpExample() {
  // Text Classification
  const classifyResult = await nlp.classifyText(
    'Great product, highly recommend!',
    'sentiment',
    'en-US'
  );
  console.log(classifyResult.results[0]?.label);

  // Entity Analysis
  const entityResult = await nlp.analyzeEntities(
    'NVIDIA is headquartered in Santa Clara, California.'
  );
  console.log(entityResult.entities);

  // Intent Recognition
  const intentResult = await nlp.analyzeIntent(
    'What is the weather like today?'
  );
  console.log(intentResult.intent, intentResult.confidence);
}
```
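
The feature list above mentions intent recognition with slot filling. How slots are surfaced depends on the deployed model and on this client's response types; the following is a hedged sketch that assumes the intent result exposes a `slots` array with `name` and `value` fields (a hypothetical shape, reusing the `nlp` instance from the example above):

```typescript
// Sketch only: the slot shape below is an assumption, not a documented field.
async function intentSlotsExample(): Promise<void> {
  const result = await nlp.analyzeIntent('Book a table for two at 7 pm');
  for (const slot of result.slots ?? []) {
    console.log(`${slot.name}: ${slot.value}`);
  }
}
```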

### NMT Example
```typescript
import { NMTService } from 'nvidia-riva-client';

const nmt = new NMTService({
  serverUrl: 'localhost:50051'
});

async function translateExample() {
  const result = await nmt.translate(
    'Hello, how are you?',
    'en-US',
    'es-ES'
  );
  console.log(result.translations[0]?.text);
}
```
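
Batch translation (listed under the NMT features) can also be approximated with the single-text call shown above by translating several texts concurrently; if the client exposes a native batch call, that may be more efficient. The sketch below reuses the `nmt` instance from the example:

```typescript
// Translate several texts concurrently using the single-text call shown above.
async function translateBatchExample(texts: string[]): Promise<string[]> {
  const results = await Promise.all(
    texts.map((text) => nmt.translate(text, 'en-US', 'es-ES'))
  );
  return results.map((result) => result.translations[0]?.text ?? '');
}
```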

## API Documentation

For detailed API documentation, please refer to the [API Reference](docs/api.md).

## Testing

```bash
# Run all tests
npm test

# Run tests with coverage
npm run test:coverage

# Run tests in watch mode
npm run test:watch
```

## Contributing

We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

## License

This project is licensed under the terms of the [Apache 2.0 License](LICENSE).
83 changes: 83 additions & 0 deletions riva-ts-client/package.json
@@ -0,0 +1,83 @@
{
  "name": "nvidia-riva-client",
  "version": "2.18.0-rc0",
  "description": "TypeScript implementation of the Riva Client API",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",
  "scripts": {
    "build": "tsc",
    "test": "vitest run",
    "test:watch": "vitest",
    "test:coverage": "vitest run --coverage",
    "proto:generate": "ts-node scripts/generate-protos.ts",
    "lint": "eslint . --ext .ts",
    "format": "prettier --write \"src/**/*.ts\"",
    "clean": "rimraf dist",
    "prebuild": "npm run clean",
    "prepare": "npm run build",
    "tts:talk": "ts-node scripts/tts/talk.ts"
  },
  "dependencies": {
    "@grpc/grpc-js": "^1.8.0",
    "@grpc/proto-loader": "^0.7.10",
    "commander": "^9.4.1",
    "google-protobuf": "^3.21.2",
    "mic": "^2.1.2",
    "node-wav": "^0.0.2",
    "node-wav-player": "^0.2.0",
    "pino": "^8.17.2",
    "rxjs": "^7.8.1",
    "wavefile": "^11.0.0",
    "winston": "^3.11.0"
  },
  "devDependencies": {
    "@eslint/eslintrc": "^3.0.0",
    "@types/google-protobuf": "^3.15.12",
    "@types/jest": "^29.5.11",
    "@types/node": "^20.11.5",
    "@types/node-wav": "^0.0.2",
    "@typescript-eslint/eslint-plugin": "^6.19.0",
    "@typescript-eslint/parser": "^6.19.0",
    "@vitest/coverage-v8": "^1.6.0",
    "eslint": "^8.56.0",
    "jest": "^29.7.0",
    "prettier": "^3.2.4",
    "protoc-gen-ts": "^0.8.7",
    "rimraf": "^5.0.5",
    "ts-jest": "^29.1.1",
    "ts-node": "^10.9.2",
    "ts-proto": "^1.181.2",
    "typescript": "^5.3.3",
    "vitest": "^1.6.0"
  },
  "engines": {
    "node": ">=18.0.0"
  },
  "keywords": [
    "deep learning",
    "machine learning",
    "gpu",
    "NLP",
    "ASR",
    "TTS",
    "NMT",
    "nvidia",
    "speech",
    "language",
    "Riva",
    "client"
  ],
  "author": {
    "name": "Anton Peganov",
    "email": "[email protected]"
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/nvidia-riva/python-clients"
  },
  "homepage": "https://github.com/nvidia-riva/python-clients",
  "bugs": {
    "url": "https://github.com/nvidia-riva/python-clients/issues"
  },
  "license": "MIT"
}
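
The `proto:generate` script above points at `scripts/generate-protos.ts`, which is not part of this diff. A minimal sketch of what such a script might do, assuming `protoc` is available on the PATH and the `ts-proto` plugin from `devDependencies` is used to emit `@grpc/grpc-js` service stubs:

```typescript
import { execSync } from 'child_process';
import { mkdirSync } from 'fs';

// Hypothetical sketch of scripts/generate-protos.ts (the actual script is not shown in this diff).
// Assumes protoc is on the PATH and generates grpc-js service stubs into src/proto.
const outDir = 'src/proto';
mkdirSync(outDir, { recursive: true });

execSync(
  [
    'protoc',
    '--plugin=protoc-gen-ts_proto=./node_modules/.bin/protoc-gen-ts_proto',
    `--ts_proto_out=${outDir}`,
    '--ts_proto_opt=outputServices=grpc-js',
    '--proto_path=proto',
    'riva_asr.proto',
  ].join(' '),
  { stdio: 'inherit' }
);
```
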
49 changes: 49 additions & 0 deletions riva-ts-client/proto/riva_asr.proto
@@ -0,0 +1,49 @@
syntax = "proto3";

package nvidia.riva;

import "riva_services.proto";

service RivaSpeechRecognition {
rpc Recognize(RecognizeRequest) returns (RecognizeResponse);
rpc StreamingRecognize(stream StreamingRecognizeRequest) returns (stream StreamingRecognizeResponse);
}

message RecognizeRequest {
AudioConfig config = 1;
bytes audio = 2;
string model = 3;
}

message RecognizeResponse {
message Result {
string transcript = 1;
float confidence = 2;
repeated WordInfo words = 3;
}
repeated Result results = 1;
}

message StreamingRecognizeRequest {
oneof streaming_request {
AudioConfig config = 1;
bytes audio_content = 2;
}
}

message StreamingRecognizeResponse {
message Result {
string transcript = 1;
float confidence = 2;
bool is_final = 3;
repeated WordInfo words = 4;
}
repeated Result results = 1;
}

message WordInfo {
string word = 1;
float start_time = 2;
float end_time = 3;
float confidence = 4;
}
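
For reference, this service definition can also be loaded dynamically at runtime with the `@grpc/proto-loader` and `@grpc/grpc-js` dependencies rather than through generated stubs. A minimal sketch, assuming the file lives under `proto/` (so the `riva_services.proto` import resolves there) and the server listens on `localhost:50051`:

```typescript
import * as grpc from '@grpc/grpc-js';
import * as protoLoader from '@grpc/proto-loader';

// Load the service definition at runtime; includeDirs lets the riva_services.proto
// import above resolve from the same directory.
const packageDefinition = protoLoader.loadSync('proto/riva_asr.proto', {
  includeDirs: ['proto'],
  keepCase: true,
  longs: String,
  enums: String,
  defaults: true,
  oneofs: true,
});
const rivaPackage = grpc.loadPackageDefinition(packageDefinition) as any;

// Create a raw gRPC client for the RivaSpeechRecognition service defined above.
const client = new rivaPackage.nvidia.riva.RivaSpeechRecognition(
  'localhost:50051',
  grpc.credentials.createInsecure()
);
```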