Use map/filter with built-in functions for performance#1237

Merged
bact merged 9 commits into dev from copilot/replace-map-reduce-with-comprehension
Jan 30, 2026

Copilot AI commented Jan 30, 2026

What does this change

Replaces list comprehensions with map() and filter() when calling built-in functions or methods. Removes redundant falsy value checks.

Optimized patterns:

[str(x) for x in items]           → list(map(str, items))
[x.strip() for x in items]        → list(map(str.strip, items))
[x.lower() for x in items]        → list(map(str.lower, items))
[x for x in items if x]           → list(filter(None, items))
[x for x in items if x is not None] → list(filter(None, items))  # only when other falsy values (e.g. empty strings) should be dropped too
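
A minimal sketch of the patterns above, using hypothetical sample data (not taken from the PyThaiNLP code base), to check that each rewritten form yields the same list. The last pair is the exception: `filter(None, ...)` drops every falsy value, not only `None`, so it is not a drop-in replacement for an `is not None` test.

```python
# Hypothetical sample data, chosen only for illustration.
numbers = [1, 2, 3]
words = ["  Foo ", "BAR", " baz"]
mixed = ["a", "", None, "b"]

# Comprehension and map()/filter() forms produce identical lists here.
assert [str(x) for x in numbers] == list(map(str, numbers))
assert [x.strip() for x in words] == list(map(str.strip, words))
assert [x.lower() for x in words] == list(map(str.lower, words))
assert [x for x in mixed if x] == list(filter(None, mixed))

# `is not None` keeps the empty string; filter(None, ...) removes it.
not_none = [x for x in mixed if x is not None]
truthy = list(filter(None, mixed))
print(not_none)  # ['a', '', 'b']
print(truthy)    # ['a', 'b']
```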

Files modified:

  • benchmarks/word_tokenization.py
  • corpus/core.py
  • soundex/prayut_and_somchaip.py
  • spell/words_spelling_correction.py
  • tokenize/thaisumcut.py (also removed redundant empty string check before filter)
  • ulmfit/preprocess.py
  • util/date.py, emojiconv.py, normalize.py

What was wrong

A list comprehension that only calls a built-in still runs its loop as Python bytecode, one iteration per element. Built-in callables (str, int, str.strip, str.lower) are implemented in C in CPython, so passing them directly to map() keeps the loop in C as well. Similarly, filter(None, ...) is typically faster than a comprehension for removing falsy values.
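
The performance claim can be checked locally with `timeit`. The following is only a sketch with a hypothetical input size; the actual difference depends on the Python version and the machine, so no timings are quoted here.

```python
import timeit

# Hypothetical input: 10,000 integers to convert to strings.
items = list(range(10_000))


def with_comprehension():
    return [str(x) for x in items]


def with_map():
    return list(map(str, items))


# Both forms must agree before their speed is compared.
assert with_comprehension() == with_map()

# Run each form 50 times and report the total elapsed time.
t_comp = timeit.timeit(with_comprehension, number=50)
t_map = timeit.timeit(with_map, number=50)
print(f"comprehension: {t_comp:.4f}s  map: {t_map:.4f}s")
```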

In tokenize/thaisumcut.py, manually checking for empty strings before filter(None, ...) was redundant since filter(None) already removes them.
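
A small sketch of why the extra check is redundant. The input list and variable names are hypothetical; the real code in thaisumcut.py is not reproduced here.

```python
# Hypothetical input, for illustration only.
sentences = ["first ", "", "  second", "   "]

stripped = list(map(str.strip, sentences))

# Explicit empty-string check followed by filter(None, ...).
with_check = list(filter(None, (s for s in stripped if s != "")))

# filter(None, ...) alone already drops "" because it is falsy.
without_check = list(filter(None, stripped))

assert with_check == without_check
print(without_check)  # ['first', 'second']
```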

How this fixes it

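Passing a function reference directly to map() or filter() keeps the iteration in C. Both return lazy iterators, so each call is wrapped in list() where a list is needed. No lambda is required, and the result is unchanged as long as dropping every falsy value is what filter(None, ...) is meant to do. List comprehensions are retained where a lambda would otherwise be required or where the logic is more complex.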
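
To illustrate the lazy evaluation mentioned above, here is a short sketch with hypothetical sample tokens. It shows that `map()` returns a one-shot iterator and that `list()` is what actually produces the result.

```python
from collections.abc import Iterator

tokens = ["Hello", "WORLD"]  # hypothetical sample tokens

lazy = map(str.lower, tokens)
# map() returns an iterator; nothing has been lowercased yet.
assert isinstance(lazy, Iterator)

lowered = list(lazy)  # list() consumes the iterator and builds the list
print(lowered)  # ['hello', 'world']

# The iterator is now exhausted, so a second pass yields nothing.
second_pass = list(lazy)
print(second_pass)  # []
```

This is why the rewritten call sites wrap `map()` and `filter()` in `list()` wherever the result is indexed or iterated more than once.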

Your checklist for this pull request

  • Passed code styles and structures
  • Passed code linting checks and unit test
Original prompt

Scan the repo, replace map/reduce/lambda with simple list comprehension where it makes sense


Copilot AI and others added 2 commits January 30, 2026 01:27
…dules

Co-authored-by: bact <128572+bact@users.noreply.github.com>
Co-authored-by: bact <128572+bact@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Replace map, reduce, and lambda with list comprehension" to "Replace map/filter with list comprehensions" Jan 30, 2026
Copilot AI requested a review from bact January 30, 2026 01:35
@bact bact added the refactoring label (a technical improvement which does not add any new features or change existing features) Jan 30, 2026
Copilot AI and others added 2 commits January 30, 2026 01:39
Co-authored-by: bact <128572+bact@users.noreply.github.com>
…ormance

Co-authored-by: bact <128572+bact@users.noreply.github.com>
@bact bact requested a review from Copilot January 30, 2026 01:49
Copilot AI changed the title from "Replace map/filter with list comprehensions" to "Use map/filter with built-in functions for performance optimization" Jan 30, 2026

coveralls commented Jan 30, 2026

Coverage Status

coverage: 53.534% (+0.001%) from 53.533%
when pulling b29d965 on copilot/replace-map-reduce-with-comprehension
into 03b92ab on dev.

Copilot AI left a comment

Pull request overview

This PR converts simple list comprehensions to list(map()) calls with built-in functions in four files. However, the PR title "Replace map/filter with list comprehensions" contradicts the actual direction of changes (which replace comprehensions WITH map).

Changes:

  • Converted [str(i) for i in range(...)] to list(map(str, range(...))) in date utilities
  • Converted [i.strip() for i in ...] patterns to list(map(str.strip, ...)) in spell correction and sentence segmentation
  • Converted [tok.lower() for tok in toks] to list(map(str.lower, toks)) in text preprocessing
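
One behavioural detail of the `str.strip` and `str.lower` conversions listed above is worth checking at each call site: the unbound method `str.strip` only accepts `str` instances, whereas `x.strip()` works on any object that has a `strip` method. The sketch below uses a hypothetical mixed list of `str` and `bytes` to show the difference; whether any of the modified call sites can receive non-`str` items is not stated in this PR.

```python
# Hypothetical input mixing str and bytes, for illustration only.
items = [" a ", b" b "]

# The comprehension dispatches on each element's own strip() method.
via_comprehension = [x.strip() for x in items]
print(via_comprehension)  # ['a', b'b']

# str.strip rejects the bytes element with a TypeError.
try:
    list(map(str.strip, items))
    raised = False
except TypeError:
    raised = True

print(raised)  # True
```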

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| pythainlp/util/date.py | Changed date range generation from list comprehension to map() for converting numbers to strings, creating inconsistency with the adjacent list comprehension on line 92 |
| pythainlp/ulmfit/preprocess.py | Converted lowercase transformation from list comprehension to map() in the lowercase_all function |
| pythainlp/tokenize/thaisumcut.py | Changed sentence stripping operation from list comprehension to map() in sentence segmentation logic |
| pythainlp/spell/words_spelling_correction.py | Converted word list reading from list comprehension to map() when loading spelling correction data |

Copilot AI and others added 2 commits January 30, 2026 01:56
Co-authored-by: bact <128572+bact@users.noreply.github.com>
Co-authored-by: bact <128572+bact@users.noreply.github.com>
Copilot AI changed the title from "Use map/filter with built-in functions for performance optimization" to "Optimize performance: use map/filter with built-in functions" Jan 30, 2026
…rings

Co-authored-by: bact <128572+bact@users.noreply.github.com>
Copilot AI changed the title from "Optimize performance: use map/filter with built-in functions" to "Revert incorrect filter(None) optimization in ulmfit preprocessing" Jan 30, 2026
Co-authored-by: bact <128572+bact@users.noreply.github.com>
Copilot AI changed the title from "Revert incorrect filter(None) optimization in ulmfit preprocessing" to "Use map/filter with built-in functions for performance" Jan 30, 2026
@bact bact marked this pull request as ready for review January 30, 2026 02:16
@bact bact added this to the 5.3 milestone Jan 30, 2026
@bact bact merged commit 1a2998e into dev Jan 30, 2026
30 checks passed
@bact bact deleted the copilot/replace-map-reduce-with-comprehension branch January 30, 2026 02:17