-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Description
We are currently using a models.JSONField for the Build._config attribute. This field can be used in two different ways:
- store the whole YAML configuration file
- save a build ID where the full/whole YAML configuration file is stored
This is basically to avoid duplicating the data on our database. However, it only avoids duplicated config on the same projects and only for consecutive builds.
I found this approach pretty confusing when exploring Build objects because we have to do extra operations to find out the real YAML used for that build (see for example the config @property at
readthedocs.org/readthedocs/builds/models.py
Lines 313 to 331 in 12cbb3d
| @property | |
| def config(self): | |
| """ | |
| Proxy to the configuration of the build. | |
| :returns: The configuration used in the last successful build. | |
| :rtype: dict | |
| """ | |
| last_build = ( | |
| self.builds(manager=INTERNAL).filter( | |
| state=BUILD_STATE_FINISHED, | |
| success=True, | |
| ).order_by('-date') | |
| .only('_config') | |
| .first() | |
| ) | |
| if last_build: | |
| return last_build.config | |
| return None |
Also, it makes hard to answer questions like "What are the projects using build.os: ubuntu-22.04?" in an easy way and many other similar questions that require querying the config file.
New approach proposal
Due to these limitations and complications, I want to propose creating a new model called BuildConfig that's used as a foreign key from the Build model:
class Build(models.Model):
# ...
readthedocs_yaml_data = models.ForeignKey(
"BuildConfig"
null=True,
blank=True,
)
# ...Then the new model would be:
class BuildConfig(TimeStampedModel):
# ...
data = models.JSONField(unique=True)
# ...Benefits
- the "avoid duplicated" data is solved automatically by the database at a table level among all the projects, instead of per-project (e.g.
BuildConfig.objects.get_or_create(data=...)) - the previous point will reduce the size of the table considerably, since there won't be any duplicated config
- getting the YAML for a particular build is going to be pretty fast since we have direct access to it
- we can reduce the number of queries performed by using
.select_related("readthedocs_yaml_data") - answer interesting/complex questions pretty easily using Django ORM (e.g.
BuildConfig.objects.filter(data__build__os="ubuntu-22.04").projects.count()to answer my previous example question) - remove the requirement of a helper
@propertyto get the YAML for a particular build - remove the requirement for a
@config.setterhelper - remove the requirement of having extra logic at
Build.save - remove the need of using
self._config_changed - remove having two different meanings for the same field
- allows us to quickly show badges on "Build details page" as Anthony suggested for ext-theme
Summarizing, this approach keeps the same features but simplifies the application and modelling and gives us more features to analyze in an easier way platform usage and make better decisions based on them.
Rollout proposal
- add a
Build.readthedocs_yaml_datafield and create the newBuildConfigmodel - start saving both fields
Build.readthedocs_yaml_dataandBuild._configwhile doing the migration - make a data migration to convert
Build._configintoBuildConfigand linkBuild.readthedocs_yaml_datato them - make the application to use the new
Build.readthedocs_yaml_datafield - remove the old code that uses
Build._configfield
Metadata
Metadata
Assignees
Labels
Type
Projects
Status