feat: extract text content from message blocks in conversations_history #190
derodero24 wants to merge 1 commit into korotovsky:master from
Conversation
Messages using Slack's Block Kit format (emails forwarded via Slack, bot notifications from Grafana/Datadog, app messages) return empty text when retrieved via conversations_history. This extracts text from block structures so these messages are no longer empty. Closes korotovsky#186
Could you please provide a couple of screenshots/examples that illustrate before/after results?
@korotovsky Sure! Here are before/after examples based on real Slack messages.
Example 1: CloudWatch Alarm (attachment with Block Kit only)
This is a real message from an AWS monitoring bot. The message has
Attachment JSON (simplified):
{
"text": "",
"attachments": [{
"title": "",
"text": "",
"blocks": [
{"type": "section", "text": {"type": "mrkdwn", "text": "*<URL|:rotating_light: CloudWatch Alarm | MyAlarm | ap-northeast-1>*"}},
{"type": "section", "text": {"type": "mrkdwn", "text": "Threshold Crossed: 1 out of the last 1 datapoints [1.0] was greater than or equal to the threshold (1.0)"}},
{"type": "actions", "elements": [...]},
{"type": "section", "fields": [{"type": "mrkdwn", "text": "*Namespace*\nMyNamespace"}, {"type": "mrkdwn", "text": "*Metric*\nMyMetric"}]},
{"type": "section", "fields": [{"type": "mrkdwn", "text": "*Alarm State*\nALARM"}]},
{"type": "image", "...": "..."},
{"type": "context", "elements": [{"type": "mrkdwn", "text": "<!date^1738900000^{date_short_pretty} at {time}|2025-02-07>"}]}
]
}]
}
Before:
After: Block Kit content is extracted from
Example 2: Datadog Alert (legacy attachment format, no change)
A Datadog alert uses legacy attachment fields (
{
"text": "",
"attachments": [{
"title": "Triggered: ServerError - /ecs/my-service",
"text": "Host: /ecs/my-service\nLog status: error\nMore than 1 log event matched..."
}]
}
Before and After: same output (already extracted via legacy fields):
Example 3: Top-level Block Kit message (bot with
@derodero24 Thank you. What does the serialized CSV message look like in such cases?
@korotovsky Here's the actual serialized CSV output generated by running the code against representative Block Kit messages:
Example 1: CloudWatch Alarm (attachment with blocks only)
Before:
MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770433087.532289,U001,aws-bot,Amazon Q Developer,C001,,,2026-02-07T02:58:02Z,Amazon Q Developer,
After:
MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770433087.532289,U001,aws-bot,Amazon Q Developer,C001,,"Blocks: https://console.aws.amazon.com/cloudwatch - :rotating_light: CloudWatch Alarm MyAlarm ap-northeast-1 Account: 123456789012, Threshold Crossed: 1 out of the last 1 datapoints 1.0 07/02/26 02:57:00 was greater than or equal to the threshold 1.0 Namespace MyNamespace Metric MyMetric Alarm State ALARM",2026-02-07T02:58:02Z,Amazon Q Developer,
Example 2: Bot message with top-level blocks
Before:
MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770400000.000001,U002,deploy-bot,Deploy Bot,C002,,,2026-02-07T10:00:00Z,Deploy Bot,
After:
MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770400000.000001,U002,deploy-bot,Deploy Bot,C002,,Deploy Complete Service my-api deployed to production Deployed by CI/CD pipeline,2026-02-07T10:00:00Z,Deploy Bot,
Example 3: Legacy attachment (no blocks), unchanged
MsgID,UserID,UserName,RealName,Channel,ThreadTs,Text,Time,BotName,Cursor
1770433155.658769,U003,datadog,Datadog,C001,,Title: Triggered: ServerError Text: Host: /ecs/my-service Log status: error More than 1 log event matched,2026-02-07T02:59:15Z,Datadog,
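In the rows above, only the Example 1 "After" Text field is wrapped in double quotes, which is consistent with the extracted text containing a comma ("Account: 123456789012, Threshold Crossed"). The sketch below reproduces that quoting behaviour with Go's standard encoding/csv writer. It is an illustration only: the project's actual CSV serializer is not shown in this thread, and the helper name encodeRow and the shortened rows are made up for the example.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// encodeRow is a hypothetical helper (not part of the PR) that serializes
// one record with the standard library CSV writer.
func encodeRow(fields []string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write(fields); err != nil {
		return "", err
	}
	w.Flush()
	if err := w.Error(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// A text field containing a comma is quoted by the writer.
	withComma, _ := encodeRow([]string{"C001", "", "Account: 123456789012, Threshold Crossed", "Amazon Q Developer"})
	// A text field without commas, quotes, or newlines is written bare.
	plain, _ := encodeRow([]string{"C002", "", "Deploy Complete Service my-api", "Deploy Bot"})
	fmt.Print(withComma)
	fmt.Print(plain)
}
```

Consumers that parse this output with a CSV reader get the original field back regardless of quoting, so the difference between Example 1 and Example 2 is cosmetic.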
Summary
Extract text from Slack Block Kit structures in messages retrieved via conversations_history and conversations_replies, so that block-only messages (emails, bot notifications, app messages) are no longer returned empty.
Closes #186
Problem
Messages using Block Kit format (e.g., Slack Email integration, Grafana/Datadog alerts) have an empty Text field. The existing AttachmentsTo2CSV extracts text from attachment fields but ignores:
- blocks array
- attachments[].blocks array
Solution
- BlocksToText function in pkg/text/text_processor.go extracts text from common block types:
  - header.text
  - section.text and section.fields
  - rich_text (sections, lists, quotes, preformatted)
  - context.elements[].text
- When msg.Text is empty, fall back to BlocksToText(msg.Blocks). This avoids duplication, since Slack typically populates text as a plaintext fallback of blocks
- AttachmentToText now also extracts att.Blocks
Changes
- pkg/text/text_processor.go: Add BlocksToText and rich text helper functions; call BlocksToText in AttachmentToText
- pkg/handler/conversations.go: Use block text as fallback in convertMessagesFromHistory and convertMessagesFromSearch
- pkg/text/text_processor_test.go: Add 15 test cases covering all supported block types, edge cases, and attachment integration
Testing
go test -run TestUnit ./...)
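
For readers who have not seen the diff, the extraction and fallback described under Solution could look roughly like the sketch below. It is not the PR's code. It decodes raw Block Kit JSON with encoding/json instead of the Slack client types the project presumably uses, the names block, textObject, blocksToText, and messageText are hypothetical, and rich_text handling is left out to keep it short. It also leaves mrkdwn markers such as asterisks in place, whereas the CSV output shown in the thread appears to strip them.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// textObject mirrors a Block Kit text object such as
// {"type": "mrkdwn", "text": "..."}; only the text is kept.
type textObject struct {
	Text string `json:"text"`
}

// block is a hypothetical, trimmed-down view of a Block Kit block.
// Elements stays raw because its shape differs between block types
// (for example, button elements in "actions" carry a text object,
// not a string).
type block struct {
	Type     string          `json:"type"`
	Text     *textObject     `json:"text"`     // header.text, section.text
	Fields   []textObject    `json:"fields"`   // section.fields
	Elements json.RawMessage `json:"elements"` // context.elements[].text
}

// blocksToText collects text from header, section, and context blocks
// and joins the pieces with single spaces. Other block types are skipped.
func blocksToText(raw []byte) (string, error) {
	var blocks []block
	if err := json.Unmarshal(raw, &blocks); err != nil {
		return "", err
	}
	var parts []string
	add := func(s string) {
		// Collapse newlines and repeated spaces so the result fits one CSV cell.
		if s = strings.Join(strings.Fields(s), " "); s != "" {
			parts = append(parts, s)
		}
	}
	for _, b := range blocks {
		switch b.Type {
		case "header", "section":
			if b.Text != nil {
				add(b.Text.Text)
			}
			for _, f := range b.Fields {
				add(f.Text)
			}
		case "context":
			var elems []textObject
			if err := json.Unmarshal(b.Elements, &elems); err == nil {
				for _, e := range elems {
					add(e.Text)
				}
			}
		}
	}
	return strings.Join(parts, " "), nil
}

// messageText applies the fallback rule: block text is used only when the
// plain text is empty, so messages that already carry text are not duplicated.
func messageText(text string, rawBlocks []byte) string {
	if strings.TrimSpace(text) != "" {
		return text
	}
	out, err := blocksToText(rawBlocks)
	if err != nil {
		return ""
	}
	return out
}

func main() {
	raw := []byte(`[
		{"type": "header", "text": {"type": "plain_text", "text": "Deploy Complete"}},
		{"type": "section", "fields": [{"type": "mrkdwn", "text": "*Service*\nmy-api"}]},
		{"type": "actions", "elements": [{"type": "button", "text": {"type": "plain_text", "text": "Open"}}]},
		{"type": "context", "elements": [{"type": "mrkdwn", "text": "Deployed by CI/CD pipeline"}]}
	]`)
	fmt.Println(messageText("", raw))
}
```

Keeping the fallback in one small function makes the "only when Text is empty" rule easy to test separately from the block traversal.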