@kalleep
Contributor

@kalleep kalleep commented Dec 4, 2025

PR Description

During a hackathon I decided to explore a new API for our internal / vendored tail package.

The package had questionable locking, and it was really hard to determine whether it was done correctly. Its design also spawned 2 goroutines per tailed file, and everything was communicated over a channel.

So the design I came up with has a much smaller API surface with only two public methods, Next and Stop.

Next returns the next line of the file, or blocks until either a file event occurs or the file is stopped.
Stop cancels any blocked calls to Next and closes the file.

I also added an offset to Line: the offset into the file right after the line we just consumed. This allows us to remove an additional goroutine in tailer.go, the background job updating position and metrics. We can instead do this during the read loop, still respecting the interval for updates.

I ran two Alloy collectors with the same config, both tailing 20 files:

[Screenshot: 2025-12-04-11:52:59]

This will drastically reduce the number of goroutines, and the memory overhead that comes with them, for Alloy instances tailing many files.

I also removed the usage of loki.EntryHandler and set labels directly on the entry we send; this removes one additional goroutine.

So in total it is 4 fewer goroutines per tailed file.

Which issue(s) this PR fixes

Notes to the Reviewer

  • Watcher was refactored into two functions instead: one that blocks until a file exists or the context is canceled, and one that blocks until a file event is detected. There is no need to run these as background jobs and send events through channels.

PR Checklist

  • CHANGELOG.md updated
  • Documentation added
  • Tests updated
  • Config converters updated

@kalleep kalleep requested a review from a team as a code owner December 4, 2025 10:56
Contributor Author

@kalleep kalleep left a comment

Not sure what I should put in the changelog, any suggestions?

@kalleep kalleep added publish-dev:linux builds and deploys an image to grafana/alloy-dev container repository and removed publish-dev:linux builds and deploys an image to grafana/alloy-dev container repository labels Dec 4, 2025
@kalleep kalleep requested a review from Copilot December 4, 2025 14:17
Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the internal tail package to simplify its API and reduce resource usage. The refactoring eliminates 4 goroutines per tailed file by replacing the channel-based design with direct method calls (Next() and Stop()), consolidating position updates into the main read loop, and removing the loki.EntryHandler wrapper. The new design reduces memory overhead and complexity while maintaining all existing functionality.

Key Changes:

  • Simplified tail package API from channel-based to blocking method calls with Next() and Stop() methods
  • Added Offset field to Line struct to enable inline position tracking without a separate goroutine
  • Consolidated file watching logic into standalone blocking functions instead of background goroutines

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| internal/component/loki/source/file/tailer.go | Refactored to use new tail API, removed separate position update goroutine, merged label handling into entry creation |
| internal/component/loki/source/file/tailer_test.go | Updated test to accommodate new tailer startup behavior |
| internal/component/loki/source/file/internal/tail/file.go | New implementation of file tailing with blocking Next() method |
| internal/component/loki/source/file/internal/tail/file_test.go | Comprehensive test suite for new File implementation |
| internal/component/loki/source/file/internal/tail/line.go | Added Line struct with Offset field for position tracking |
| internal/component/loki/source/file/internal/tail/config.go | New configuration structures for File and WatcherConfig |
| internal/component/loki/source/file/internal/tail/block.go | Standalone blocking functions for file existence and event detection |
| internal/component/loki/source/file/internal/tail/fileext/file_*.go | Consolidated and renamed from winfile package, includes platform-specific file operations |
| internal/component/loki/source/file/internal/tail/watch/*.go | Removed old watch package with channel-based file watching |
| internal/component/loki/source/file/internal/tail/tail.go | Removed old channel-based tail implementation |
| internal/component/loki/source/file/internal/tail/util/util.go | Removed unused utility package |
| internal/component/loki/source/file/internal/tail/README.md | Removed outdated README referencing old implementation |

Comments suppressed due to low confidence (1)

internal/component/loki/source/file/internal/tail/fileext/file_windows.go:104

  • Typo in comment: "ended information" should be "extended information" to match the description above.

@kalleep kalleep added publish-dev:linux builds and deploys an image to grafana/alloy-dev container repository and removed publish-dev:linux builds and deploys an image to grafana/alloy-dev container repository labels Dec 4, 2025
@kalleep kalleep force-pushed the kalleep/refactor-tailer branch from 9d7217a to 1fa3763 Compare December 5, 2025 09:02
@kalleep kalleep added os:windows publish-dev:linux builds and deploys an image to grafana/alloy-dev container repository and removed os:windows publish-dev:linux builds and deploys an image to grafana/alloy-dev container repository labels Dec 5, 2025
@kalleep kalleep marked this pull request as draft December 5, 2025 10:35
@kalleep kalleep force-pushed the kalleep/refactor-tailer branch from 4d413e9 to 1fa3763 Compare December 5, 2025 10:53
@kalleep
Contributor Author

kalleep commented Dec 5, 2025

Deployed this to our biggest internal cluster:
[Screenshot: 2025-12-05-14:05:00]

@kalleep kalleep added publish-dev:linux builds and deploys an image to grafana/alloy-dev container repository and removed publish-dev:linux builds and deploys an image to grafana/alloy-dev container repository labels Dec 8, 2025
@kalleep kalleep marked this pull request as ready for review December 8, 2025 09:06

Labels

os:windows publish-dev:linux builds and deploys an image to grafana/alloy-dev container repository
