Question regarding pixel mean calculation and RGBA background handling #366

Description:
Hi there,
I was debugging the data generation process and noticed the code block responsible for checking the contrast between the text and the background. I have a few questions regarding the logic used to calculate the average pixel values.
[Current Code](https://github.com/Belval/TextRecognitionDataGenerator/blob/master/trdg/data_generator.py#L188)

# 1. Sums first 2 channels (R, G) but divides by 3?
resized_img_px_mean = sum(resized_img_st.mean[:2]) / 3
# 2. Sums all channels (including Alpha if RGBA) and divides by 3?
background_img_px_mean = sum(background_img_st.mean) / 3
My Questions:
1. Missing Blue Channel: For resized_img_px_mean, the code sums only mean[:2] (Red and Green) but still divides by 3, so Blue never contributes and the result is lower than the true three-channel average. Is there a specific reason to ignore the Blue channel?
2. Alpha Channel Interference: When background_img is in RGBA mode, ImageStat.Stat(background_img).mean returns 4 values. Summing them all includes the Alpha channel (usually 255), which adds up to 255 / 3 = 85 to the background brightness and makes dark backgrounds look mid-gray.
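To make both effects concrete, here is a small sketch that applies the current arithmetic to hand-written channel means. The lists are hypothetical stand-ins for what `ImageStat.Stat(...).mean` returns, and the two function names are mine, not from the repository.

```python
from typing import Sequence


def current_text_mean(mean: Sequence[float]) -> float:
    # Mirrors the current text-side expression: only R and G are summed.
    return sum(mean[:2]) / 3


def current_background_mean(mean: Sequence[float]) -> float:
    # Mirrors the current background-side expression: every channel is summed,
    # so a 4-element RGBA mean also contributes its alpha value.
    return sum(mean) / 3


# Hypothetical per-channel means (assumed values, for illustration only).
pure_blue_rgb = [0.0, 0.0, 255.0]            # blue text
opaque_black_rgba = [0.0, 0.0, 0.0, 255.0]   # black background with alpha = 255

print(current_text_mean(pure_blue_rgb))            # 0.0, blue is invisible to the check
print(current_background_mean(opaque_black_rgba))  # 85.0, black reads as mid-gray
```

With these inputs, pure blue text is scored the same as black text, and an opaque black RGBA background is scored 85 instead of 0.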
Proposed Solution (Minimal Change):
I suggest using the standard luma formula (ITU-R BT.601 weights: 0.299*R + 0.587*G + 0.114*B). It matches human brightness perception better than a plain channel average, and because it reads only the first 3 channels (RGB), it also handles RGBA images correctly.
Here is a minimal modification that fixes the logic without changing the image modes:

            resized_img_st = ImageStat.Stat(resized_img, resized_mask.split()[2])
            background_img_st = ImageStat.Stat(background_img)

            # Helper lambda for Luminosity Formula (0.299R + 0.587G + 0.114B)
            # It only uses the first 3 elements, safely ignoring Alpha channel in RGBA
            calc_luma = lambda x: 0.299 * x[0] + 0.587 * x[1] + 0.114 * x[2]

            resized_img_px_mean = calc_luma(resized_img_st.mean)
            background_img_px_mean = calc_luma(background_img_st.mean)

            if abs(resized_img_px_mean - background_img_px_mean) < 15:
                # ...
This change ensures that:
- All RGB channels are considered.
- The Alpha channel in the background image is ignored.
- The brightness comparison is more accurate to human vision.
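The three points above can be checked without loading any images. The sketch below is a standalone version of the helper, assuming the inputs are RGB or RGBA channel means (at least three values). `too_similar` and the sample lists are hypothetical names and values used only for illustration; the threshold of 15 is the one from the existing check.

```python
from typing import Sequence


def calc_luma(mean: Sequence[float]) -> float:
    """Weighted brightness from per-channel means.

    Only the first three entries (R, G, B) are read, so a fourth
    alpha entry in an RGBA mean has no effect on the result.
    """
    r, g, b = mean[:3]
    return 0.299 * r + 0.587 * g + 0.114 * b


def too_similar(text_mean: Sequence[float],
                background_mean: Sequence[float],
                threshold: float = 15.0) -> bool:
    # Same comparison as the existing check, using luma on both sides.
    return abs(calc_luma(text_mean) - calc_luma(background_mean)) < threshold


# Hypothetical per-channel means (assumed values, for illustration only).
black_rgb = [0.0, 0.0, 0.0]
black_rgba = [0.0, 0.0, 0.0, 255.0]
blue_rgb = [0.0, 0.0, 255.0]

print(calc_luma(black_rgb) == calc_luma(black_rgba))  # True: alpha is ignored
print(round(calc_luma(blue_rgb), 2))                  # blue now contributes
print(too_similar(blue_rgb, black_rgba))              # False: difference exceeds 15
```

Because the weights sum to 1, the result stays on the same 0 to 255 scale as before, so the threshold of 15 keeps a comparable meaning.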
What do you think about this adjustment?
Thanks!
