Commit 863e59e

Changes done for v2.2.9.
1 parent 1566763 commit 863e59e

21 files changed: +1472 -1391 lines changed

README.md

Lines changed: 1 addition & 2 deletions
@@ -142,8 +142,7 @@ st submitjob -d <YOUR_STREAMS_DOMAIN> -i <YOUR_STREAMS_INSTANCE> output/co
 If you are planning to ingest the speech data from live voice calls, then you can invoke the **IBMVoiceGatewaySource** operator as shown below.

 ```
-(stream<BinarySpeech_t> BinarySpeechData as BSD;
-stream<EndOfCallSignal_t> EndOfCallSignal as EOCS) as VoiceGatewayInferface =
+(stream<BinarySpeech_t> BinarySpeechData as BSD) as VoiceGatewayInferface =
 IBMVoiceGatewaySource() {
 logic
 state: {

com.ibm.streamsx.sttgateway/CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -1,5 +1,10 @@
 # Changes

+## v2.2.9
+* Feb/11/2021
+* Removed the EndOfCallSignal (EOCS) output stream completely to avoid port locks and out of order processing between the binary speech data (BSD) and the EOCS tuples. Now, a single output stream will deliver both the BSD and EOCS tuples in the correct sequence for downstream processing.
+* The change described above triggered foundational changes in the IBMVoiceGatewaySource operator and in the examples that invoke that operator.
+
 ## v2.2.8
 * Feb/07/2021
 * Modified the IBMVoiceGatewaySource operator to handle the exception thrown when a given websocket connection handle can't be found in the connection metadata map.
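
Reviewer note: the v2.2.9 entry above means a downstream consumer now has to branch on `endOfCallSignal` instead of reading a second stream. The following is a minimal C++ sketch of that branching, not code from the toolkit. `SpeechTuple`, `ChannelStats` and `consume` are hypothetical names; the real tuple type is generated by the SPL compiler, and only the five attribute names are taken from this commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical plain-C++ mirror of the attributes named in this commit.
struct SpeechTuple {
    std::vector<unsigned char> speech;        // left empty on an EOCS tuple
    std::string vgwSessionId;
    bool isCustomerSpeechData = false;
    std::int32_t vgwVoiceChannelNumber = 0;
    bool endOfCallSignal = false;             // true => EOCS, false => speech data
};

// Hypothetical per-channel tally kept by a downstream consumer.
struct ChannelStats {
    std::size_t speechBytes = 0;
    bool ended = false;
    std::size_t fragmentsAfterEnd = 0;        // stays 0 while arrival order is preserved
};

// Handle tuples strictly in arrival order, branching on endOfCallSignal.
inline void consume(const SpeechTuple& t,
                    std::map<std::string, ChannelStats>& stats) {
    // Key follows the vgwSessionId_vgwVoiceChannelNumber convention.
    ChannelStats& s =
        stats[t.vgwSessionId + "_" + std::to_string(t.vgwVoiceChannelNumber)];
    if (t.endOfCallSignal) {
        s.ended = true;                       // channel closed its WebSocket connection
        return;
    }
    if (s.ended) {
        ++s.fragmentsAfterEnd;                // speech seen after EOCS: ordering problem
    }
    s.speechBytes += t.speech.size();
}
```

Because both kinds of tuple arrive on one stream, a counter such as `fragmentsAfterEnd` gives a cheap way to confirm that no speech fragment shows up after the EOCS for its channel.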

com.ibm.streamsx.sttgateway/com.ibm.streamsx.sttgateway.watson/IBMVoiceGatewaySource/IBMVoiceGatewaySource.xml

Lines changed: 20 additions & 25 deletions
@@ -304,6 +304,25 @@
 information and assign that meta data values to other optional attributes in this
 output port.

+In addition to sending the binary speech data on this port, this operator will
+also send End Of Call Signal (EOCS) on this port whenever a particular
+voice channel of an ongoing call closes its WebSocket connection. So, this operator
+produces periodic output tuples to give an indication about the end of a
+specific speaker (i.e. channel) in a voice call that was in progress moments ago for
+the given IBM Voice Gateway session id. When it sends EOCS, it only sets values to
+certain attributes of the output stream as shown here.
+rstring vgwSessionId, boolean isCustomerSpeechData, int32 vgwVoiceChannelNumber, boolean endOfCallSignal
+This source operator will set the appropriate values for these attributes to
+indicate which particular speaker (i.e. voice channel number) of a given voice call
+(i.e. session id) just ended the conversation. This tuple also has an attribute
+(i.e. isCustomerSpeechData) to tell whether that recently ended voice channel
+carried the speech data of a customer or an agent. More importantly, it will set
+a value of true for the endOfCallSignal attribute to indicate that it is an EOCS message and not a
+binary speech message. It was decided to use the same output port to send both of these
+messages in order to avoid any port locks and/or tuple ordering issues that may happen if we choose to
+do it using two different output ports. Downstream operators can make use of this
+"End Of Voice Call" signal as they see fit.
+
 **There are multiple available output functions**, and output attributes can also be
 assigned values with any SPL expression that evaluates to the proper type.
 </description>
@@ -319,31 +338,7 @@
 <tupleMutationAllowed>false</tupleMutationAllowed>
 <cardinality>1</cardinality>
 <optional>false</optional>
-</outputPortSet>
-
-<outputPortSet>
-<description>
-This port produces periodic output tuples to give an indication about the end of a
-specific speaker (i.e. channel) in a voice call that was in progress moments ago for
-the given IBM Voice Gateway session id. The schema for this port must have these
-three attributes with their correct data types as shown here.
-rstring vgwSessionId, boolean isCustomerSpeechData, int32 vgwVoiceChannelNumber
-This source operator will set the appropriate values for these attributes to
-indicate which particular speaker (i.e. voice channel number) of a given voice call
-(i.e. session id) just ended the conversation. This tuple also has an attribute
-(i.e. isCustomerSpeechData) to tell whether that recently ended voice channel
-carried the speech data of a customer or an agent. Downstream operators can make
-use of this "End Of Voice Call" signal as they see fit.
-</description>
-<expressionMode>Expression</expressionMode>
-<autoAssignment>false</autoAssignment>
-<completeAssignment>false</completeAssignment>
-<rewriteAllowed>true</rewriteAllowed>
-<windowPunctuationOutputMode>Free</windowPunctuationOutputMode>
-<tupleMutationAllowed>false</tupleMutationAllowed>
-<cardinality>1</cardinality>
-<optional>false</optional>
-</outputPortSet>
+</outputPortSet>
 </outputPorts>
 </cppOperatorModel>
 </operatorModel>
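
Reviewer note: the new port description says an EOCS tuple sets only four attributes, and the operator code in this commit emits one EOCS per channel (channel 1 as agent, channel 2 as customer) when a call has gone stale. The sketch below illustrates that shape in plain C++ under stated assumptions. `SpeechTuple`, `makeEndOfCallSignal` and `makeStaleCallSignals` are hypothetical helpers, not toolkit APIs; the operator itself uses the SPL-generated `OPort0Type` and `submit(oTuple, 0)`.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the SPL-generated output tuple type.
struct SpeechTuple {
    std::vector<unsigned char> speech;        // not set on an EOCS tuple
    std::string vgwSessionId;
    bool isCustomerSpeechData = false;
    std::int32_t vgwVoiceChannelNumber = 0;
    bool endOfCallSignal = false;
};

// Build an EOCS tuple: only the four attributes listed in the port
// description are assigned; everything else keeps its default value.
inline SpeechTuple makeEndOfCallSignal(const std::string& sessionId,
                                       bool isCustomerSpeechData,
                                       std::int32_t channel) {
    SpeechTuple t;
    t.vgwSessionId = sessionId;
    t.isCustomerSpeechData = isCustomerSpeechData;
    t.vgwVoiceChannelNumber = channel;
    t.endOfCallSignal = true;
    return t;
}

// For a stale call, the diff sends one EOCS for channel 1 (treated as the
// agent) and one for channel 2 (treated as the customer), in that order.
inline std::vector<SpeechTuple> makeStaleCallSignals(const std::string& sessionId) {
    return {makeEndOfCallSignal(sessionId, false, 1),
            makeEndOfCallSignal(sessionId, true, 2)};
}
```

Funnelling every EOCS through one constructor-like helper is a design choice of this sketch; the commit itself repeats the four setter calls at each call site.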

com.ibm.streamsx.sttgateway/com.ibm.streamsx.sttgateway.watson/IBMVoiceGatewaySource/IBMVoiceGatewaySource_cpp.cgt

Lines changed: 70 additions & 47 deletions
@@ -8,7 +8,7 @@
 /*
 ============================================================
 First created on: Sep/20/2019
-Last modified on: Feb/07/2021
+Last modified on: Feb/09/2021

 Please refer to the sttgateway-tech-brief.txt file in the
 top-level directory of this toolkit to read about
@@ -87,6 +87,15 @@ using websocketpp::lib::bind;
 my $audioOutputAsBlob = undef;
 my $outputAttrs1 = $outputPort1->getAttributes();
 my $speechAttributeFound = 0;
+my $vgwSessionIdAsString = undef;
+my $vgwSessionIdAttributeFound = 0;
+my $isCustomerSpeechDataAsBoolean = undef;
+my $isCustomerSpeechDataAttributeFound = 0;
+my $vgwVoiceChannelNumberAsInt32 = undef;
+my $vgwVoiceChannelNumberAttributeFound = 0;
+my $endOfCallSignalAsBoolean = undef;
+my $endOfCallSignalAttributeFound = 0;
+

 foreach my $outputAttr (@$outputAttrs1) {
 my $outAttrName = $outputAttr->getName();
@@ -100,64 +109,59 @@ using websocketpp::lib::bind;
 $audioOutputAsBlob = 1;
 }
 }
-}
-
-if ($speechAttributeFound == 0 ) {
-SPL::CodeGen::exitln(SttGatewayResource::STTGW_OUT_ATTRIBUTE_CHECK1("IBMVoiceGatewaySource", "speech"),
-$model->getContext()->getSourceLocation());
-}
-
-if (!(defined($audioOutputAsBlob))) {
-SPL::CodeGen::exitln(SttGatewayResource::STTGW_OUT_ATTRIBUTE_TYPE_CHECK1("IBMVoiceGatewaySource", "speech", "blob"),
-$model->getContext()->getSourceLocation());
-}
-
-# Check the output port number 1 i.e. the second output port.
-my $outputPort2 = $model->getOutputPortAt(1);
-my $outputTupleName2 = $outputPort2->getCppTupleName();
-my $vgwSessionIdAsString = undef;
-my $outputAttrs2 = $outputPort2->getAttributes();
-my $vgwSessionIdAttributeFound = 0;
-my $isCustomerSpeechDataAsBoolean = undef;
-my $isCustomerSpeechDataAttributeFound = 0;
-my $vgwVoiceChannelNumberAsInt32 = undef;
-my $vgwVoiceChannelNumberAttributeFound = 0;
-
-foreach my $outputAttr2 (@$outputAttrs2) {
-my $outAttrName2 = $outputAttr2->getName();
-my $outAttrType2 = $outputAttr2->getSPLType();

-if ($outAttrName2 eq "vgwSessionId") {
+if ($outAttrName eq "vgwSessionId") {
 $vgwSessionIdAttributeFound = 1;

-if ($outAttrType2 eq "rstring") {
+if ($outAttrType eq "rstring") {
 # This tuple attribute will carry the Voice Gateway Session Id.
 $vgwSessionIdAsString = 1;
 }
 }

-if ($outAttrName2 eq "isCustomerSpeechData") {
+if ($outAttrName eq "isCustomerSpeechData") {
 $isCustomerSpeechDataAttributeFound = 1;

-if ($outAttrType2 eq "boolean") {
+if ($outAttrType eq "boolean") {
 # This tuple attribute will indicate whether the
 # given channel of a given voice call carried the
 # speech data of a customer or an agent.
 $isCustomerSpeechDataAsBoolean = 1;
 }
 }

-if ($outAttrName2 eq "vgwVoiceChannelNumber") {
+if ($outAttrName eq "vgwVoiceChannelNumber") {
 $vgwVoiceChannelNumberAttributeFound = 1;

-if ($outAttrType2 eq "int32") {
+if ($outAttrType eq "int32") {
 # This tuple attribute will indicate the
 # channel number of given voice call.
 $vgwVoiceChannelNumberAsInt32 = 1;
 }
 }
+
+if ($outAttrName eq "endOfCallSignal") {
+$endOfCallSignalAttributeFound = 1;
+
+if ($outAttrType eq "boolean") {
+# This tuple attribute will indicate whether the
+# given channel of a given voice call has ended
+# sending speech data by closing its WebSocket connection.
+$endOfCallSignalAsBoolean = 1;
+}
+}
+}
+
+if ($speechAttributeFound == 0 ) {
+SPL::CodeGen::exitln(SttGatewayResource::STTGW_OUT_ATTRIBUTE_CHECK1("IBMVoiceGatewaySource", "speech"),
+$model->getContext()->getSourceLocation());
 }

+if (!(defined($audioOutputAsBlob))) {
+SPL::CodeGen::exitln(SttGatewayResource::STTGW_OUT_ATTRIBUTE_TYPE_CHECK1("IBMVoiceGatewaySource", "speech", "blob"),
+$model->getContext()->getSourceLocation());
+}
+
 if ($vgwSessionIdAttributeFound == 0 ) {
 SPL::CodeGen::exitln(SttGatewayResource::STTGW_OUT_ATTRIBUTE_CHECK2("IBMVoiceGatewaySource", "vgwSessionId"),
 $model->getContext()->getSourceLocation());
@@ -187,6 +191,16 @@ using websocketpp::lib::bind;
 SPL::CodeGen::exitln(SttGatewayResource::STTGW_OUT_ATTRIBUTE_TYPE_CHECK2("IBMVoiceGatewaySource", "vgwVoiceChannelNumber", "int32"),
 $model->getContext()->getSourceLocation());
 }
+
+if ($endOfCallSignalAttributeFound == 0 ) {
+SPL::CodeGen::exitln(SttGatewayResource::STTGW_OUT_ATTRIBUTE_CHECK2("IBMVoiceGatewaySource", "endOfCallSignal"),
+$model->getContext()->getSourceLocation());
+}
+
+if (!(defined($endOfCallSignalAsBoolean))) {
+SPL::CodeGen::exitln(SttGatewayResource::STTGW_OUT_ATTRIBUTE_TYPE_CHECK2("IBMVoiceGatewaySource", "endOfCallSignal", "boolean"),
+$model->getContext()->getSourceLocation());
+}

 # Following are the operator parameters.
 my $tlsPort = $model->getParameterByName("tlsPort");
@@ -1035,11 +1049,12 @@ void MY_OPERATOR::on_message(EndpointType* s, websocketpp::connection_hdl hdl,
 if (vgwSessionIdFoundInMap == true) {
 // Send the "End of Voice Call" signal now for this
 // vgwSessionId_vgwVoiceChannelNumber combo.
-OPort1Type oTuple;
+OPort0Type oTuple;
 oTuple.set_vgwSessionId(con_metadata.vgwSessionId);
 oTuple.set_isCustomerSpeechData(con_metadata.vgwIsCaller);
 oTuple.set_vgwVoiceChannelNumber(con_metadata.vgwVoiceChannelNumber);
-submit(oTuple, 1);
+oTuple.set_endOfCallSignal(true);
+submit(oTuple, 0);

 if (vgwSessionLoggingNeeded == true) {
 SPLAPPTRC(L_ERROR, "Operator " << operatorPhysicalName <<
@@ -1135,17 +1150,19 @@ void MY_OPERATOR::on_message(EndpointType* s, websocketpp::connection_hdl hdl,
 // vgwSessionId_vgwVoiceChannelNumber combo.
 // Send it for voice channel 1 which is an
 // agent channel most of the time.
-OPort1Type oTuple;
+OPort0Type oTuple;
 oTuple.set_vgwSessionId(*it);
 oTuple.set_isCustomerSpeechData(false);
 oTuple.set_vgwVoiceChannelNumber(1);
-submit(oTuple, 1);
+oTuple.set_endOfCallSignal(true);
+submit(oTuple, 0);
 // Do the same for voice channel 2 which is a
 // customer channel most of the time.
 oTuple.set_vgwSessionId(*it);
 oTuple.set_isCustomerSpeechData(true);
 oTuple.set_vgwVoiceChannelNumber(2);
-submit(oTuple, 1);
+oTuple.set_endOfCallSignal(true);
+submit(oTuple, 0);

 // We have a map where the agent and caller phone numbers of a given
 // call session id are stored. Since this call has gone stale,
@@ -1205,11 +1222,12 @@ void MY_OPERATOR::on_message(EndpointType* s, websocketpp::connection_hdl hdl,
 // do its own clean-up and release of the STT engines.
 // Send the "End of Voice Call" signal now for this
 // vgwSessionId_vgwVoiceChannelNumber combo.
-OPort1Type oTuple;
+OPort0Type oTuple;
 oTuple.set_vgwSessionId(cmd.vgwSessionId);
 oTuple.set_isCustomerSpeechData(cmd.vgwIsCaller);
 oTuple.set_vgwVoiceChannelNumber(cmd.vgwVoiceChannelNumber);
-submit(oTuple, 1);
+oTuple.set_endOfCallSignal(true);
+submit(oTuple, 0);

 // Added this logic on Sep/04/2020.
 // We have a map where the agent and caller phone numbers of a given
@@ -1368,6 +1386,7 @@ void MY_OPERATOR::on_message(EndpointType* s, websocketpp::connection_hdl hdl,
 speechBlob.setData((unsigned char*)payloadBuffer, (uint64_t)payloadSize);
 OPort0Type oTuple;
 oTuple.set_speech(speechBlob);
+oTuple.set_endOfCallSignal(false);

 // Now let us set any attributes that the caller of this operator is trying to
 // assign through this operator's output functions.
@@ -1565,11 +1584,12 @@ void MY_OPERATOR::on_close(websocketpp::connection_hdl hdl) {
 if (vgwSessionIdFoundInMap == true && con_metadata.vgwVoiceChannelNumber > 0) {
 // Send the "End of Voice Call" signal now for this
 // vgwSessionId_vgwVoiceChannelNumber combo.
-OPort1Type oTuple;
+OPort0Type oTuple;
 oTuple.set_vgwSessionId(con_metadata.vgwSessionId);
 oTuple.set_isCustomerSpeechData(con_metadata.vgwIsCaller);
 oTuple.set_vgwVoiceChannelNumber(con_metadata.vgwVoiceChannelNumber);
-submit(oTuple, 1);
+oTuple.set_endOfCallSignal(true);
+submit(oTuple, 0);

 if (vgwSessionLoggingNeeded == true) {
 SPLAPPTRC(L_ERROR, "Operator " << operatorPhysicalName <<
@@ -1688,17 +1708,19 @@ void MY_OPERATOR::on_close(websocketpp::connection_hdl hdl) {
 // vgwSessionId_vgwVoiceChannelNumber combo.
 // Send it for voice channel 1 which is an
 // agent channel most of the time.
-OPort1Type oTuple;
+OPort0Type oTuple;
 oTuple.set_vgwSessionId(*it);
 oTuple.set_isCustomerSpeechData(false);
 oTuple.set_vgwVoiceChannelNumber(1);
-submit(oTuple, 1);
+oTuple.set_endOfCallSignal(true);
+submit(oTuple, 0);
 // Do the same for voice channel 2 which is a
 // customer channel most of the time.
 oTuple.set_vgwSessionId(*it);
 oTuple.set_isCustomerSpeechData(true);
 oTuple.set_vgwVoiceChannelNumber(2);
-submit(oTuple, 1);
+oTuple.set_endOfCallSignal(true);
+submit(oTuple, 0);

 // We have a map where the agent and caller phone numbers of a given
 // call session id are stored. Since this call has gone stale,
@@ -1758,11 +1780,12 @@ void MY_OPERATOR::on_close(websocketpp::connection_hdl hdl) {
 // do its own clean-up and release of the STT engines.
 // Send the "End of Voice Call" signal now for this
 // vgwSessionId_vgwVoiceChannelNumber combo.
-OPort1Type oTuple;
+OPort0Type oTuple;
 oTuple.set_vgwSessionId(cmd.vgwSessionId);
 oTuple.set_isCustomerSpeechData(cmd.vgwIsCaller);
 oTuple.set_vgwVoiceChannelNumber(cmd.vgwVoiceChannelNumber);
-submit(oTuple, 1);
+oTuple.set_endOfCallSignal(true);
+submit(oTuple, 0);

 // Added this logic on Sep/04/2020.
 // We have a map where the agent and caller phone numbers of a given
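
Reviewer note: the code-generation checks in this file now require five attributes with fixed SPL types on the single output port. The sketch below restates that rule as a small, testable C++ function. It is only an illustration. The real checks run at SPL compile time in the Perl template and stop at the first failure via `SPL::CodeGen::exitln`, whereas this hypothetical `checkOutputSchema` collects every problem, and its error strings are made up for the example.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Validate a schema given as attribute name -> SPL type name.
// Returns a list of problems; an empty list means the schema is acceptable.
inline std::vector<std::string> checkOutputSchema(
        const std::map<std::string, std::string>& schema) {
    // Required attributes and types, as enforced by the checks in this diff.
    static const std::vector<std::pair<std::string, std::string>> required = {
        {"speech", "blob"},
        {"vgwSessionId", "rstring"},
        {"isCustomerSpeechData", "boolean"},
        {"vgwVoiceChannelNumber", "int32"},
        {"endOfCallSignal", "boolean"},
    };

    std::vector<std::string> errors;
    for (const auto& r : required) {
        const auto it = schema.find(r.first);
        if (it == schema.end()) {
            errors.push_back("missing attribute: " + r.first);
        } else if (it->second != r.second) {
            errors.push_back("attribute " + r.first + " must be of type " + r.second);
        }
    }
    return errors;
}
```

A schema written for v2.2.8 (without `endOfCallSignal`) fails this check with a single missing-attribute error, which mirrors why applications that invoke the operator need the updated `BinarySpeech_t` type.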

com.ibm.streamsx.sttgateway/info.xml

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@

 **Note:** This toolkit requires c++11 support.
 </description>
-<version>2.2.8</version>
+<version>2.2.9</version>
 <requiredProductVersion>4.2.1.6</requiredProductVersion>
 </identity>
 <dependencies>

samples/STTGatewayUtils/com.ibm.streamsx.sttgateway.utils/STTGatewayUtils.spl

Lines changed: 8 additions & 12 deletions
@@ -1,14 +1,14 @@
 /*
 ==============================================
 # Licensed Materials - Property of IBM
-# Copyright IBM Corp. 2018, 2020
+# Copyright IBM Corp. 2018, 2021
 ==============================================
 */

 /*
 ==============================================
 First created on: Nov/24/2020
-Last modified on: Nov/26/2020
+Last modified on: Feb/09/2021

 This is a utility composite that will get used in the following applications.

@@ -25,7 +25,7 @@ namespace com.ibm.streamsx.sttgateway.utils;
 // Code for the C++ native functions can be found in the impl/include directory of this project.
 //
 // The following is the schema of the first output stream for the
-// IBMVoiceGatewaySource operator. The first four attributes are
+// IBMVoiceGatewaySource operator. The first five attributes are
 // very important and the other ones are purely optional if some
 // scenarios really require them.
 // blob speech --> Speech fragments of a live conversation as captured and sent by the IBM Voice Gateway.
@@ -36,6 +36,9 @@ namespace com.ibm.streamsx.sttgateway.utils;
 // Whoever (caller or agent) sends the first round of
 // speech data bytes will get assigned a voice channel of 1.
 // The next one to follow will get assigned a voice channel of 2.
+// boolean endOfCallSignal --> This attribute will be set to true by the IBMVoiceGatewaySource
+// operator when it sends an EOCS for a voice channel. It will be
+// set to false by that operator when it sends binary speech data.
 // rstring id --> This attribute is needed by the WatsonS2T operator.
 // It is set to vgwSessionId_vgwVoiceChannelNumber
 // rstring callStartDateTime --> Call start date time i.e. system clock time.
@@ -47,18 +50,11 @@ namespace com.ibm.streamsx.sttgateway.utils;
 // int32 speechEngineId --> This attribute will be set in the speech processor job. (Please, read the comments there.)
 // int32 speechResultProcessorId --> This attribute will be set in the speech processor job. (Please, read the comments there.)
 type BinarySpeech_t = blob speech, rstring vgwSessionId, boolean isCustomerSpeechData,
-int32 vgwVoiceChannelNumber, rstring id, rstring callStartDateTime,
+int32 vgwVoiceChannelNumber, boolean endOfCallSignal,
+rstring id, rstring callStartDateTime,
 rstring callerPhoneNumber, rstring agentPhoneNumber,
 int32 speechDataFragmentCnt, int32 totalSpeechDataBytesReceived,
 int32 speechProcessorId, int32 speechEngineId, int32 speechResultProcessorId;
-// The following schema is for the second output stream of the
-// IBMVoiceGatewaySource operator. It has three attributes indicating
-// the speaker channel (vgwVoiceChannelNumber) of a given voice call (vgwSessionId) who
-// got completed with the call as well as an indicator (isCustomerSpeechData) to
-// denote whether the speech data we received on this channel belonged
-// to a caller or an agent.
-type EndOfCallSignal_t = rstring vgwSessionId,
-boolean isCustomerSpeechData, int32 vgwVoiceChannelNumber;

 // The following schema will be for the data being sent here by the
 // VgwDataRouter application. It sends us raw binary data which
