Avoid creating LZO indexes on files not spread across several blocks #82

Open
killerwhile wants to merge 1 commit into twitter:master from killerwhile:master

@killerwhile

LZO indexes for files stored in one single block are useless. Simply avoid the creation when the file is smaller than the block size.

@killerwhile
Author

Actually, I was wondering whether this should be a feature that can be enabled or disabled via a parameter (like lzo.skip.useless.indexes=true), since it may look strange to a user that running DistributedLzoIndexer results in no lzo.index being created. WDYT?

@rangadi
Contributor

rangadi commented Nov 15, 2013

Making it configurable sounds better. I wouldn't say it is completely useless (sometimes you might want to split even a 500 MB file across multiple mappers; our block size is 512 MB). The option could be 'lzo.indexer.skip.small.files'.
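
For readers following the thread, here is a minimal sketch of the behavior discussed so far. It is not the code in this PR. It assumes the option name suggested above, uses `java.util.Properties` as a stand-in for the Hadoop configuration object, and assumes the option defaults to off so that existing behavior is unchanged. The class and method names (`SkipSmallFilesSketch`, `shouldIndex`) are hypothetical.

```java
import java.util.Properties;

// Hypothetical sketch: not part of hadoop-lzo or of this pull request.
final class SkipSmallFilesSketch {
    // Option name as proposed in the comment above; nothing confirms it was ever adopted.
    static final String SKIP_SMALL_FILES_KEY = "lzo.indexer.skip.small.files";

    /**
     * Decides whether an index should be built for a file.
     * Assumption: the option defaults to "false", so every file is indexed
     * unless the user explicitly opts in to skipping.
     */
    static boolean shouldIndex(long fileLength, long blockSize, Properties conf) {
        boolean skipSmall =
            Boolean.parseBoolean(conf.getProperty(SKIP_SMALL_FILES_KEY, "false"));
        // A file that fits in a single block is skipped only when the option is on.
        boolean fitsInOneBlock = fileLength <= blockSize;
        return !(skipSmall && fitsInOneBlock);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        Properties conf = new Properties();

        // Option unset: a 500 MB file with a 512 MB block size is still indexed.
        System.out.println(shouldIndex(500 * mb, 512 * mb, conf)); // prints "true"

        conf.setProperty(SKIP_SMALL_FILES_KEY, "true");
        // Option on: the same file is skipped, a multi-block file is not.
        System.out.println(shouldIndex(500 * mb, 512 * mb, conf)); // prints "false"
        System.out.println(shouldIndex(600 * mb, 512 * mb, conf)); // prints "true"
    }
}
```

Leaving the option off by default would keep the 500 MB case above splittable across several mappers, which is the concern raised in this comment.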

@dvryaboy
Contributor

@rangadi don't we already skip index creation somewhere? I know we don't create them for small files (don't recall if small == block size).
