|
| 1 | +# SQS |
| 2 | +SQS is amazon's managed message queue. Thus, it should follow the [Open Telemetry specification for Messaging systems](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/messaging.md). |
| 3 | + |
| 4 | +## Specific trace semantic |
| 5 | +Following methods needs specific attention: |
| 6 | + |
| 7 | +### sendMessage / sendMessageBatch |
| 8 | +- Add [message attributes](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/messaging.md#messaging-attributes) to span in addition to the default attributes. These attributes are covered by the library according to the spec. |
| 9 | +- Inject trace context as SQS MessageAttributes, so the service receiving the message can link cascading spans to the trace which created the message. This is not implemented yet. |
| 10 | + |
| 11 | +### receiveMessage |
| 12 | +- Add [message attributes](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/messaging.md#messaging-attributes) to span in addition to the default attributes. These attributes are covered by the library according to the spec. |
| 13 | +- Create additional "processing spans" for each message received by the application. So if an application called `receiveMessage`, and got back 10 messages, a single `messaging.operation` = `receive` span will be created for the `receiveMessage` operation, and 10 `messaging.operation` = `process` spans will be created, one for each message. Those processing spans are created by the library. This behavior is partially implemented, [See discussion below](#processing-spans). |
| 14 | +- Set the inter process context correctly, so that additional spans created from message receiving and message processing will be linked to parent spans correctly. This behavior is partially implemented, [See discussion below](#processing-spans). |
| 15 | +- Extract trace context from SQS MessageAttributes, and set span's `parent` and `links` correctly according to the spec. This is not implemented yet. |
| 16 | + |
| 17 | +#### Processing spans |
| 18 | +According to open telemetry specification (and to reasonable expectation for trace structure), user of this library would expect to see one span for the operation of receiving messages batch from sqs, and then, for each message, a span with it's own sub-tree for the processing of this specific message. |
| 19 | + |
| 20 | +For example, if a `receiveMessages` returned 2 messages: msg1 is storing something to a DB, and msg2 is calling an external http endpoint, we should link the db span under msg1, and the http span under msg2, instead of mixing all those operations under the single `receive` span, or start a new trace for each of them. |
| 21 | + |
| 22 | +Unfortunately, this is not so easy to implement in JS: |
| 23 | +1. The SDK is calling a single callback for the messages batch, and it's not straight forward to understand when each individual message processing starts and ends (and set the context correctly for cascading spans). |
| 24 | +2. If async/await is used, context can be lost when returning data from async functions, for example: |
| 25 | +```js |
| 26 | +async function asyncRecv() { |
| 27 | + const data = await sqs.receiveMessage(recvParams).promise(); |
| 28 | + // context of receiveMessage is set here |
| 29 | + return data; |
| 30 | +} |
| 31 | + |
| 32 | +async function poll() { |
| 33 | + const result = await asyncRecv(); |
| 34 | + // context is lost when asyncRecv returns. following spans are created with root context. |
| 35 | + await Promise.all(result.Messages.map((message) => this.processMessage(message))); |
| 36 | +} |
| 37 | +``` |
| 38 | + |
| 39 | +Current implementation partially solves this issue by patching the `map` \ `forEach` functions on the `Messages` array of `receiveMessage` result. This handles issues like the one above, but will not handle situations where the processing is done in other patterns (multiple map\forEach calls, index access to the array, other array operations, etc). This is currently an open issue in the plugin. |
0 commit comments