mng/faster-slack#913 (Open)

joshalbrecht wants to merge 25 commits into josh/new_slack_reaction from
mng/faster-slack

Conversation

@joshalbrecht (Contributor)

Automated PR created by a Claude Code session.

joshalbrecht and others added 25 commits March 18, 2026 17:01
…ons.info

Use the latest message timestamp from conversations.info (already
fetched for unread markers) to determine which channels have new
messages. Channels where latest hasn't changed since the last export
skip the conversations.history forward fetch entirely.

The conversations.info calls are no longer cached with the channel
list since we need fresh latest timestamps on every run. Channel
metadata (names, IDs, membership) is still cached with TTL.

The reaction scan still runs for all channels (via the last 100
messages fetch) and doubles as the message source for reply/thread
detection when the forward fetch was skipped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
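The skip check described above can be sketched as follows. This is a minimal illustration, assuming a mapping of channel ID to the latest timestamp seen at the last export; `should_fetch_history` and `last_export_latest` are hypothetical names, not the PR's actual code.

```python
# Hypothetical sketch of the "skip unchanged channels" check: compare the
# latest message ts from conversations.info against the previous export's
# record, and only fetch conversations.history when it has moved.

def should_fetch_history(channel_info: dict, last_export_latest: dict) -> bool:
    """Return True when the channel needs a conversations.history fetch."""
    channel_id = channel_info["id"]
    latest = (channel_info.get("latest") or {}).get("ts")
    previous = last_export_latest.get(channel_id)
    # A missing latest (empty channel) or no prior record forces a fetch.
    if latest is None or previous is None:
        return True
    return latest != previous
```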
… function

Replace with_rate_limit_retry (which returned a closure) with
retry_on_rate_limit (a module-level function that takes the api_caller,
sleep_fn, method, and params directly). This reverts the inline-function
ratchet count back to 1.

The time.sleep ratchet bump (0->1) remains because the regex catches
all forms of importing sleep from the time module. This is a
legitimate use of time.sleep for rate limit backoff -- the ratchet
description says to "poll for the condition" instead, but rate limit
backoff is not a polling situation; we genuinely need to wait before
retrying.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
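The module-level shape described above might look like the sketch below. The `RateLimitedError` class and the doubling backoff schedule are assumptions for illustration; only the signature (api_caller, sleep_fn, method, params) comes from the commit message.

```python
# Sketch of retry_on_rate_limit as a plain module-level function with its
# collaborators passed in, rather than a closure-returning factory.

class RateLimitedError(Exception):
    """Hypothetical error carrying Slack's Retry-After hint."""
    def __init__(self, retry_after: float = 1.0):
        self.retry_after = retry_after

def retry_on_rate_limit(api_caller, sleep_fn, method, params, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            return api_caller(method, params)
        except RateLimitedError as err:
            if attempt == max_attempts - 1:
                raise
            # Honor Retry-After, doubling it as a safety margin on repeats.
            sleep_fn(err.retry_after * (2 ** attempt))
```

Passing `sleep_fn` explicitly keeps the backoff testable without real delays.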
When using cached channel data, non-member channels were included in
the conversations.info calls for unread markers, making it pointlessly
slow for workspaces with many channels. Now filters by is_member from
the raw channel data when members_only is set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When --channels is specified, conversations.info was called for all
member channels instead of just the ones being exported. Now filters
to only the specified channels, avoiding unnecessary API calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test was using channels=(general,) which meant the channel-name
filter independently excluded the non-member channel, making the
membership filter redundant and untested. Now uses channels=None so
only the is_member filter is active.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ly loops

Logs "N/M" progress for each iteration so the user can gauge how much
longer the export will take. Progress lines are suppressed when there
is only a single item to avoid noise.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
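The progress behavior above can be sketched as a small wrapper; `iter_with_progress` and `log_fn` are illustrative stand-ins for the exporter's actual loop and logger.

```python
# Sketch: yield items while logging "N/M" progress, suppressing the log
# entirely when there is only one item to avoid noise.

def iter_with_progress(items, log_fn):
    total = len(items)
    for index, item in enumerate(items, start=1):
        if total > 1:
            log_fn(f"{index}/{total}")
        yield item
```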
…roup

The conversations.info calls (for unread markers) and conversations.history
calls (for message export) share the same rate limit, so running them
sequentially wastes time. Now the channel info fetch runs in a background
thread while message export proceeds in the main thread, cutting total
wall-clock time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
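Overlapping the two call groups might be structured like the sketch below; `fetch_all_channel_info` and `export_messages` are placeholder callables, not the PR's functions.

```python
# Sketch: run the conversations.info fetch in a background thread while
# message export proceeds on the main thread, then join before combining.
import threading

def run_export(channels, fetch_all_channel_info, export_messages):
    info_result = {}

    def fetch_info():
        info_result.update(fetch_all_channel_info(channels))

    worker = threading.Thread(target=fetch_info, daemon=True)
    worker.start()
    exported = [export_messages(ch) for ch in channels]  # main thread
    worker.join()  # unread-marker info must be complete before writing output
    return exported, info_result
```

Note that because both call groups share one rate limit, the win comes from interleaving waits, not from extra parallel throughput.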
New CLI tool that calls conversations.list directly (no caching) and
prints channels sorted by their "updated" timestamp (most recent first).
This lets the user identify inactive channels to exclude from export
via --channels.

Note: the "updated" field from conversations.list tracks channel
setting changes, not message activity, so it is an imperfect proxy.
However it avoids per-channel API calls, keeping the command fast.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nits, add tests

- Extract _get_channel_updated_timestamp to handle the Slack API unit
  mismatch (updated is milliseconds, created is seconds) and reuse it
  in format_channel_table instead of duplicating the logic
- Accept api_caller as a parameter (dependency injection) instead of
  hardcoding call_slack_api, matching the rest of the codebase
- Make fetch_and_sort_channels and format_channel_table public and
  testable
- Add comprehensive tests covering sorting, filtering, formatting,
  and the created/updated fallback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract fetch_raw_channel_list in channels.py so both fetch_channel_list
and list_channels.fetch_and_sort_channels share the same conversations.list
call and membership filter. Also add slack-channels usage to the README.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nt threads

Previously, reactions were checked per-channel via:
- A reaction scan (conversations.history limit=100) per channel
- A reaction lookback that re-fetched skipped relevant threads per channel

Both made many unnecessary API calls. Now all reaction checking is
deferred to a single pass after all channels are exported, which:
- Gathers all relevant threads (where the user participated or was mentioned)
- Sorts them by most recent reply timestamp
- Checks only the top N (--max-recent-threads-for-reactions, default 50)

This replaces --reaction-lookback with --max-recent-threads-for-reactions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
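The selection step of the deferred pass can be sketched as below. The thread-dict shape and `latest_reply_ts` key are assumptions for illustration.

```python
# Sketch: sort relevant threads by most recent reply and keep only the top N,
# as controlled by --max-recent-threads-for-reactions (default 50).

def select_threads_for_reaction_check(relevant_threads, max_recent=50):
    ordered = sorted(
        relevant_threads,
        key=lambda t: float(t.get("latest_reply_ts", t["thread_ts"])),
        reverse=True,
    )
    return ordered[:max_recent]
```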
Reactions on top-level messages are extracted from data we already have
(the forward fetch), requiring no extra API calls. This was incorrectly
removed in the previous commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lliseconds

The Slack API returns the "updated" field in inconsistent units (sometimes
seconds, sometimes milliseconds). Instead of assuming one format, detect
by magnitude: values above 1e12 are milliseconds, below are seconds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
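The magnitude heuristic amounts to a one-branch normalizer, sketched here with an assumed helper name:

```python
# Sketch: Unix timestamps in seconds are ~1.7e9 today, so any value above
# 1e12 must be milliseconds; divide those by 1000, pass seconds through.

def normalize_to_seconds(timestamp: float) -> float:
    if timestamp > 1e12:
        return timestamp / 1000.0
    return timestamp
```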
…rsations.info

The 'updated' field from conversations.list only tracks channel settings
changes, not message activity. Now calls conversations.info per channel
to get the latest message timestamp, which accurately reflects when the
last message was posted. Slower but correct.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The script duplicated the same conversations.info calls that
slack-exporter already makes, offering negligible time savings.
Running slack-exporter with a recent --since date achieves the same
result.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Curl exit code 35 (SSL connection reset) and similar transient network
errors now trigger the same exponential backoff retry as rate limit
errors. Transient curl exit codes: 7 (connection failed), 28 (timeout),
35 (SSL error), 56 (receive failure).

Renamed retry_on_rate_limit to retry_on_transient_error to reflect the
broader scope.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
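The exit-code classification above can be sketched as a set membership check; the helper name is illustrative.

```python
# Sketch: curl exit codes treated as transient and eligible for the same
# exponential backoff as rate limit errors.

TRANSIENT_CURL_EXIT_CODES = {
    7,   # connection failed
    28,  # operation timed out
    35,  # SSL connect error / connection reset
    56,  # failure receiving network data
}

def is_transient_curl_error(exit_code: int) -> bool:
    return exit_code in TRANSIENT_CURL_EXIT_CODES
```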
…n channels are cached

Two optimizations:

1. --channels now accepts space-separated names within a single argument
   (e.g. --channels "general random") in addition to separate arguments.

2. When explicit channels are provided and all are already known from a
   previous run, conversations.list is skipped entirely. Channel metadata
   is still updated from conversations.info responses (which we already
   make for unread markers), so channel data stays fresh without the
   extra paginated API call.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
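Point 1 above is essentially a flattening split; `parse_channel_args` is a hypothetical helper name, not necessarily the PR's.

```python
# Sketch: each --channels argument may contain space-separated names, so
# ["general random", "dev"] flattens to ["general", "random", "dev"].

def parse_channel_args(args):
    names = []
    for arg in args:
        names.extend(arg.split())
    return names
```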
- Include full channel data (name, is_member, etc.) in conversations.info
  responses from _standard_api_caller and _tracking_api_caller so the
  channel update path is properly exercised
- Only save channel updates from conversations.info when conversations.list
  was skipped (otherwise the raw format differs and causes spurious writes)
- Add test_run_export_deferred_reaction_pass_uses_threads_from_previous_runs
  which verifies the primary use case: checking reactions on relevant threads
  detected in a prior run, where the thread's latest_reply hasn't changed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… channels

New CLI option (mutually exclusive with --channels) that selects the N
channels with the most recent messages based on historical export data.
This is fast because it reads from the local message store rather than
making API calls.

The selected channels are logged at startup so the user can confirm the
selection. After resolution, the rest of the export proceeds as if those
channels were passed via --channels.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
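The local-store resolution can be sketched as a sort over stored timestamps. The store shape (channel name mapped to its latest message timestamp) is an assumption for illustration.

```python
# Sketch: pick the N channels with the newest messages in the local store,
# with no API calls involved.

def resolve_recently_active(latest_ts_by_channel, n):
    ranked = sorted(latest_ts_by_channel.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]
```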
- Add --recently-active-channels to README usage examples
- Add message-level reaction extraction back to README "How it works"
- When --recently-active-channels is used with no historical data,
  fall back to exporting all channels with a warning instead of
  silently doing nothing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… threads

New event stream that mirrors the replies stream but filtered to only
threads where the user participated or was mentioned. When a thread
becomes newly relevant, ALL its replies are saved. When new replies
arrive in an already-relevant thread, just those new replies are added.

This allows consumers to watch a single stream for all replies the user
cares about, without filtering the full replies stream.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
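The newly-relevant versus already-relevant split described above might look like this sketch; the function name and the per-thread emitted-timestamp state are illustrative assumptions.

```python
# Sketch: when a thread first becomes relevant, emit all of its replies;
# afterwards, emit only replies newer than the last one already emitted.

def replies_to_emit(thread_ts, all_replies, emitted_ts_by_thread):
    last_emitted = emitted_ts_by_thread.get(thread_ts)
    if last_emitted is None:
        new = list(all_replies)  # thread newly relevant: emit everything
    else:
        new = [r for r in all_replies if float(r["ts"]) > float(last_emitted)]
    if all_replies:
        emitted_ts_by_thread[thread_ts] = all_replies[-1]["ts"]
    return new
```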
channel, message, reaction, relevant_thread, relevant_thread_reply,
reply, unread_marker, user (self_identity was already singular).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
test_main_delivers_events_from_subprocess in mng_mind failed due to
timing -- not related to slack_exporter changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test waited for send_message to be called (via capture.wait_for_call)
then immediately set stop_event and joined the thread. But
_save_delivery_state runs AFTER send_message returns, so on slow systems
the state file hadn't been written yet when the thread was stopped.

Now polls for the state file to exist before stopping the thread,
using stop_event.wait(timeout=0.01) as the polling delay to avoid
triggering the time.sleep ratchet.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
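The polling pattern above can be sketched as follows; `wait_for_file` is an illustrative helper, with `stop_event.wait` supplying the delay so no `time.sleep` call appears.

```python
# Sketch: poll for a file to exist, using Event.wait as the polling delay.
import pathlib
import threading
import time

def wait_for_file(path, stop_event, timeout=2.0, interval=0.01):
    """Return True once path exists, False on timeout or stop request."""
    deadline = time.monotonic() + timeout
    p = pathlib.Path(path)
    while not p.exists():
        if stop_event.is_set() or time.monotonic() >= deadline:
            return False
        stop_event.wait(interval)  # delay without triggering the sleep ratchet
    return True
```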
The CI suite exceeded its 150s time limit. The file write should
complete within milliseconds; 2s is more than enough headroom.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>