Enables DefaultListFilesCache by default #19366
Changes from 1 commit
73c6672
da08e25
ab4c602
de5ec9f
80f7d63
fef57a3
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -45,15 +45,23 @@ directly on the `Field`. For example: | |||||
| In prior versions, `ListingTableProvider` would issue `LIST` commands to | ||||||
| the underlying object store each time it needed to list files for a query. | ||||||
| To improve performance, `ListingTableProvider` now caches the results of | ||||||
| `LIST` commands for the lifetime of the `ListingTableProvider` instance. | ||||||
| `LIST` commands for the lifetime of the `ListingTableProvider` instance or | ||||||
| until a cache entry expires. | ||||||
|
|
||||||
| Note that by default the cache has no expiration time, so if files are added or removed | ||||||
| from the underlying object store, the `ListingTableProvider` will not see | ||||||
| those changes until the `ListingTableProvider` instance is dropped and recreated. | ||||||
|
|
||||||
| You will be able to configure the maximum cache size and cache expiration time via a configuration option: | ||||||
| You can configure the maximum cache size and cache entry expiration time via configuration options: | ||||||
|
Contributor
👍
||||||
|
|
||||||
| See <https://github.com/apache/datafusion/issues/19056> for more details. | ||||||
| `datafusion.runtime.list_files_cache_limit` | ||||||
|
||||||
| `datafusion.runtime.list_files_cache_limit` | |
| * `datafusion.runtime.list_files_cache_limit` |
Outdated
These do not render well at the moment: https://github.com/BlakeOrth/datafusion/blob/73c667216ee2fcb85196694cc5a36633f9928c19/docs/source/library-user-guide/upgrading.md#listingtableprovider-now-caches-list-commands
Also, it would be good to mention the units of their values.
I added the technical "unit" and a quick explanation for both of these, but since the actual user input is a string that undergoes parsing (e.g. "1m30s") I also linked out to the user guide to direct users to the more detailed information.
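As a rough illustration of the kind of string mentioned above, here is a minimal sketch that converts a duration such as "1m30s" into seconds. The function name `parse_duration_seconds` and the accepted units (`h`, `m`, `s`) are assumptions made for this example; it is not DataFusion's parser, and the user guide remains the reference for the syntax the option actually accepts.

```python
import re

# Hypothetical unit table for this sketch; the real option may accept other units.
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}
_TOKEN = re.compile(r"(\d+)([hms])")


def parse_duration_seconds(text: str) -> int:
    """Convert a string like "1m30s" into a number of seconds."""
    pos = 0
    total = 0
    for match in _TOKEN.finditer(text):
        # Reject anything between tokens, e.g. "1m 30s".
        if match.start() != pos:
            raise ValueError(f"unexpected input at offset {pos}: {text!r}")
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    # Reject empty input and trailing characters that are not a valid token.
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return total
```

Under these assumptions, "1m30s" is read as one minute plus thirty seconds, and a string with an unknown unit is rejected rather than silently truncated.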
Outdated
| `datafusion.runtime.list_files_cache_ttl` | |
| * `datafusion.runtime.list_files_cache_ttl` |
`*` made `prettier` complain so I had to go for `-` instead. In either case these are formatting better now, thanks!
Outdated
| Caching can be disable by setting the limit to 0: | |
| Caching can be disabled by setting the limit to 0: |
If the user has disabled caching with `list_files_cache_limit = "0K"` then `None` will be returned here, but in this case `get_list_files_cache_limit()` will return "1M"
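The mismatch described in this comment can be modelled with a small sketch. Everything here is hypothetical: the `Runtime` and `ListFilesCache` classes, their methods, and the assumption that the getter falls back to a default when no cache object exists are illustrations only, not DataFusion's implementation. The sketch shows how a disabled cache could be reported as the default limit if the limit is read from the cache rather than from the stored configuration.

```python
from typing import Optional

DEFAULT_LIMIT = "1M"  # default value as quoted in the review comment


class ListFilesCache:
    """Hypothetical stand-in for a cache that remembers its configured limit."""

    def __init__(self, limit: str) -> None:
        self.limit = limit


class Runtime:
    """Hypothetical runtime: a zero limit means no cache object is created."""

    def __init__(self, limit: str) -> None:
        self.configured_limit = limit
        is_zero = limit.rstrip("KMG") == "0"
        self.cache: Optional[ListFilesCache] = (
            None if is_zero else ListFilesCache(limit)
        )

    def limit_from_cache(self) -> str:
        # Falls back to the default when the cache is absent, so a disabled
        # cache looks the same as an unconfigured one.
        return self.cache.limit if self.cache is not None else DEFAULT_LIMIT

    def limit_from_config(self) -> str:
        # Reports what the user actually set, even when caching is disabled.
        return self.configured_limit
```

In this model, reading the limit from the stored configuration keeps "0K" visible to the user, while reading it through the absent cache reports "1M".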