@mazulo mazulo commented Nov 3, 2022

Describe the change

Currently, model_bakery fills fields with random data. This PR proposes an opt-in feature: passing _use_faker_generator=True to baker.make makes Faker generate the values for fields whose names are in the list below. The list is small for now, but it can grow as needed:

  • username
  • email
  • first_name
  • last_name
  • name
  • fullname
  • full_name
  • ip
  • ipv4
  • ipv6

I think this is a really interesting feature because it gives users more realistic data to work with.

PR Checklist

Collaborator

@timjklein36 timjklein36 left a comment

My main concern with these changes is that the new faker_generator_mapping circumvents the existing generator mapping mechanism, which already allows for overriding or providing new custom generators for fields of specific types.

At the very least, the new behavior should coexist with the existing mechanism (meaning it should still use the existing generator mapping in any case that it can). Otherwise, there would be no way to override the generation of, say, EmailFields where the name of the field is in the new list.

I like the idea, but I think it deserves some extra thought/planning to make sure it can be integrated nicely with the existing features.
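
One way to picture the coexistence asked for here is as a lookup order. The sketch below is framework-free and is not model_bakery's actual code: pick_generator, user_overrides, name_mapping and default_mapping are hypothetical names, and the two field classes are stand-ins for Django's. It assumes that user-registered type generators win, the name-based Faker mapping comes next, and the built-in type mapping is the fallback.

```python
from typing import Callable, Dict, Type


class EmailField:  # stand-in for django.db.models.EmailField
    pass


class CharField:  # stand-in for django.db.models.CharField
    pass


Generator = Callable[[], str]


def pick_generator(
    field_name: str,
    field_type: Type,
    user_overrides: Dict[Type, Generator],
    name_mapping: Dict[str, Generator],
    default_mapping: Dict[Type, Generator],
) -> Generator:
    """Resolve a generator, letting explicit user overrides win."""
    # 1. A custom generator registered for the field type takes priority.
    if field_type in user_overrides:
        return user_overrides[field_type]
    # 2. Otherwise, fall back to the name-based (Faker-style) mapping.
    if field_name in name_mapping:
        return name_mapping[field_name]
    # 3. Finally, use the built-in type-based default.
    return default_mapping[field_type]


# Illustrative data only; real generators would produce random values.
defaults = {EmailField: lambda: "random@random.com", CharField: lambda: "xkcdq"}
by_name = {"email": lambda: "realistic@example.org"}
overrides = {EmailField: lambda: "override@example.com"}
```

With this order, an EmailField named email can still be overridden, which is the case the review calls out.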

@@ -0,0 +1,18 @@
from typing import Callable, Dict

from faker import Faker
Collaborator

This import (and its subsequent behavior) should be resilient to ImportError, since we don't want to require users to install faker if they are not going to use it.

model.
"""
field_name = field.attname
if self._use_faker_generator and field_name in faker_generator_mapping:
Collaborator

See other comment about ImportError. I think that if we fail to import faker we should output a warning message (when _use_faker_generator is True) and continue past this logic so that it does not break, but still gives the user feedback that they are trying to use an optional feature for which they do not have all the requirements.
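
A rough sketch of this warn-and-continue behavior, written as standalone functions rather than as the PR's Baker method. try_load_faker, generate_email and the module_name parameter are hypothetical names introduced for illustration; Faker().email() is the only real Faker call used.

```python
import importlib
import warnings
from typing import Callable


def try_load_faker(module_name: str = "faker"):
    """Return a Faker instance, or None (with a warning) if it is missing."""
    try:
        return importlib.import_module(module_name).Faker()
    except ImportError:
        warnings.warn(
            "_use_faker_generator=True was requested but 'faker' is not "
            "installed; falling back to the default generators.",
            RuntimeWarning,
            stacklevel=2,
        )
        return None


def generate_email(
    use_faker_generator: bool,
    fallback: Callable[[], str],
    module_name: str = "faker",
) -> str:
    """Use Faker when requested and available, else the fallback generator."""
    if use_faker_generator:
        fake = try_load_faker(module_name)
        if fake is not None:
            return fake.email()
    # Either the feature is off or Faker could not be imported.
    return fallback()
```

The trade-off is that a missing dependency produces a warning rather than a failing test, so it is easier to overlook.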

Contributor

I think it's cleaner to fail when someone tries to use the feature without installing faker.
But not using the feature should still be possible without faker being installed.

moving the from .faker_gen import faker_generator_mapping inside this if clause should do the trick...?
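
A minimal sketch of that suggestion, again as standalone functions rather than the PR's code. load_faker_mapping, generate_value and the module_name parameter are hypothetical names; the provider methods email, first_name and last_name are real Faker calls. The import only runs when the flag is set, so callers who never enable the feature do not need faker installed, and callers who enable it without faker get an explicit error.

```python
import importlib
from typing import Callable, Dict


def load_faker_mapping(module_name: str = "faker") -> Dict[str, Callable[[], str]]:
    """Import Faker lazily and build a small field-name mapping."""
    try:
        faker_module = importlib.import_module(module_name)
    except ImportError as exc:
        # Fail loudly: the user asked for a feature whose dependency is missing.
        raise ImportError(
            "_use_faker_generator=True requires the optional 'faker' package"
        ) from exc
    fake = faker_module.Faker()
    return {
        "email": fake.email,
        "first_name": fake.first_name,
        "last_name": fake.last_name,
    }


def generate_value(
    field_name: str,
    fallback: Callable[[], str],
    use_faker_generator: bool = False,
    module_name: str = "faker",
) -> str:
    """Only touch Faker inside the branch that actually needs it."""
    if use_faker_generator:
        mapping = load_faker_mapping(module_name)
        if field_name in mapping:
            return mapping[field_name]()
    return fallback()
```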

Author

mazulo commented Nov 7, 2022

Hey @timjklein36, thank you for your review! I'm going to read through the changes you requested and make sure I understand them (mainly the one about the new feature). If you have any other ideas, please let me know! More than happy to go through them. 😄

@mazulo mazulo force-pushed the mazulo/adds-faker-generator-feature branch from d6ad3f3 to 77221e9 Compare November 7, 2022 12:07

@@ -1 +1,2 @@
django>=3.2
Faker==15.1.3
Contributor

@hiaselhans hiaselhans Dec 16, 2022

this could be moved to extras_require["faker"]
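
As a hedged illustration of what that could look like, assuming the project is packaged with a setuptools setup.py: the extra name "faker" comes from the comment above, while the constant names and the unpinned Faker requirement are assumptions made for this sketch.

```python
# setup.py fragment (sketch). Only the two dependency declarations are shown.

# Hard requirements stay limited to what every user needs.
INSTALL_REQUIRES = ["django>=3.2"]

# Faker becomes opt-in, e.g. installed with: pip install "model_bakery[faker]"
EXTRAS_REQUIRE = {"faker": ["Faker"]}

# In a real setup.py these would be passed to setuptools.setup(), e.g.:
# setup(..., install_requires=INSTALL_REQUIRES, extras_require=EXTRAS_REQUIRE)
```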

v3ss0n commented Jan 16, 2023

Any chance of merging this? In 2023 there are still no working fixture generators with Faker baked in.

Member

amureki commented Jul 25, 2025

Hi @mazulo! 👋

This PR has been open for quite a while (which is also my oversight as a maintainer). Are you still interested in pursuing this Faker integration feature?

Since there wasn't any follow-up after the initial reviews, the following points are still open for discussion on how the implementation should work:

  1. Automatic field detection: When Faker is installed, it could automatically take over generation for fields it recognizes (by extending Baker.generate_value)
  2. Explicit opt-in: Keep the original approach where users explicitly pass _use_faker_generator=True to recipes/fixtures
  3. Field name mapping: Integrate with the existing generator system but add field name-based mapping alongside field type mapping
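
For option 1, detecting whether Faker is installed does not require importing it. A small sketch using the standard library; faker_available is a hypothetical helper name, not part of model_bakery.

```python
import importlib.util


def faker_available(module_name: str = "faker") -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None
```

A check like this could gate the automatic behavior, while options 2 and 3 would still need an explicit flag or mapping on top of it.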

We'd want to keep this feature optional since model_bakery serves its original needs perfectly right now, but this could be a nice optional improvement for those who need more realistic test data.

Could you also help us understand your actual use case? What specific need drove you to want Faker integration - was it just more realistic-looking data, or something more specific?

Let us know if you'd like to continue working on this!

Member

amureki commented Jul 25, 2025

> Any chance of merging this? In 2023 there are still no working fixture generators with Faker baked in.

@v3ss0n Any help is welcome! This is an open-source project maintained by volunteers in their spare time. If you or your company are using model_bakery and would like to contribute, we'd appreciate useful suggestions, reviews, ideas, or financial support to help maintain the project.

@amureki amureki added the question Further information is requested label Jul 25, 2025
Author

mazulo commented Jul 25, 2025

> Hi @mazulo! 👋
>
> This PR has been open for quite a while (which is also my oversight as a maintainer). Are you still interested in pursuing this Faker integration feature?
>
> Since there wasn't any follow-up after the initial reviews, the following points are still open for discussion on how the implementation should work:
>
>   1. Automatic field detection: When Faker is installed, it could automatically take over generation for fields it recognizes (by extending Baker.generate_value)
>   2. Explicit opt-in: Keep the original approach where users explicitly pass _use_faker_generator=True to recipes/fixtures
>   3. Field name mapping: Integrate with the existing generator system but add field name-based mapping alongside field type mapping
>
> We'd want to keep this feature optional since model_bakery serves its original needs perfectly right now, but this could be a nice optional improvement for those who need more realistic test data.

Hi @amureki! I have to say I'm kind of embarrassed because, for some reason, I didn't notice any notifications for the review comments here 😬 But yes, I'm still interested! I can see that this could add real value. This weekend I'll get back to this PR, resolve the conflicts, and go through both the review comments and your suggestions.

> Could you also help us understand your actual use case? What specific need drove you to want Faker integration - was it just more realistic-looking data, or something more specific?
>
> Let us know if you'd like to continue working on this!

Yes, I had a lot of cases where I needed more realistic data for tests (integration and E2E). Another reason is that, several times, it was hard to see why a test had failed because of the cluttered generated data (e.g. an email being apisugrbapeirfgubpeaijvaojdfvnpaidfjnvpiasdjnfvpsdfnvps29348294@kjnadkfvjbsdifvjbisdfb.com instead of something simpler like [email protected]). Does that make sense?

Member

amureki commented Jul 25, 2025

> Yes, I had a lot of cases where I needed more realistic data for tests (integration and E2E). Another reason is that, several times, it was hard to see why a test had failed because of the cluttered generated data (e.g. an email being apisugrbapeirfgubpeaijvaojdfvnpaidfjnvpiasdjnfvpsdfnvps29348294@kjnadkfvjbsdifvjbisdfb.com instead of something simpler like [email protected]). Does that make sense?

Yeah, totally makes sense! 👍
Custom generators can solve this temporarily, but I understand the appeal of having Faker integration built-in. No rush though - thanks for picking this back up!
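
As a sketch of that interim workaround, a custom generator can be as simple as a function that returns short, predictable addresses. gen_readable_email is a hypothetical name, and the registration call is left as a comment because it needs a configured Django project; it assumes model_bakery's documented baker.generators.add mechanism.

```python
import itertools

# Module-level counter so each generated address is unique and easy to read.
_email_counter = itertools.count(1)


def gen_readable_email() -> str:
    """Return short, sequential addresses instead of long random strings."""
    return f"user{next(_email_counter)}@example.com"


# Assumed registration in a Django project (not executed here):
# from model_bakery import baker
# baker.generators.add("django.db.models.EmailField", gen_readable_email)
```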
