Source and transform steps take a long time even in incremental builds in Gatsby v3 #31627
-
Source plugins themselves must also be written to fetch incrementally, i.e. on subsequent builds they only source what has changed. So for MySQL, you could store the time of the last build and query for everything that has been updated since then.
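For illustration, a minimal sketch of what that could look like in a custom `sourceNodes` implementation is below. The `mysql2` client, the table and column names (`articles`, `updated_at`), and the cache key are assumptions for the sake of the example, not something gatsby-source-mysql does out of the box:

```js
// gatsby-node.js: a minimal sketch of incremental sourcing from MySQL.
// Table/column names and the mysql2 client are illustrative assumptions.
const mysql = require("mysql2/promise")

exports.sourceNodes = async ({ actions, cache, createNodeId, createContentDigest }) => {
  const { createNode } = actions

  // Read the timestamp of the previous successful source run from Gatsby's cache.
  const lastFetched = (await cache.get("lastFetched")) || 0

  const connection = await mysql.createConnection({
    host: process.env.DB_HOST,
    user: process.env.DB_USER,
    password: process.env.DB_PASS,
    database: process.env.DB_NAME,
  })

  // Only query rows that changed since the previous build.
  const [rows] = await connection.execute(
    "SELECT * FROM articles WHERE updated_at > FROM_UNIXTIME(?)",
    [Math.floor(lastFetched / 1000)]
  )

  rows.forEach(article => {
    createNode({
      ...article,
      id: createNodeId(`Article-${article.id}`),
      internal: {
        type: "Article",
        contentDigest: createContentDigest(article),
      },
    })
  })

  // Note: nodes created in earlier builds that are not re-created here need to be
  // touched (actions.touchNode) or Gatsby will garbage-collect them; omitted for brevity.
  await cache.set("lastFetched", Date.now())
  await connection.end()
}
```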
-
So I dug deeper and found out that the gatsby-source-mysql plugin uses gatsby-source-filesystem under the hood and calls its createRemoteFileNode function to fetch the remote image URLs. Does the gatsby-source-mysql plugin need to keep or check the cache before fetching the remote image URLs from the MySQL data source, or does gatsby-source-filesystem handle that itself?
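For reference, the sketch below shows roughly how a source plugin typically calls createRemoteFileNode; whether gatsby-source-mysql wires it up exactly this way is not verified here, and the node type ("Article") and field name (`image_url`) are illustrative assumptions:

```js
// Sketch of attaching a remote image to a node with createRemoteFileNode
// from gatsby-source-filesystem. Node type and field names are assumptions.
const { createRemoteFileNode } = require("gatsby-source-filesystem")

exports.onCreateNode = async ({ node, actions, createNodeId, getCache }) => {
  if (node.internal.type !== "Article" || !node.image_url) {
    return
  }

  const fileNode = await createRemoteFileNode({
    url: node.image_url,          // remote image URL coming from the MySQL row
    parentNodeId: node.id,
    createNode: actions.createNode,
    createNodeId,
    getCache,                     // lets the helper reuse Gatsby's cache between builds
  })

  if (fileNode) {
    // Link the downloaded File node so gatsby-transformer-sharp can pick it up.
    actions.createNodeField({ node, name: "localImage", value: fileNode.id })
  }
}
```

Note that this relies on the `.cache` directory being preserved between builds; if it is wiped, every remote file is downloaded again.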
-
You'll also need to keep the .cache and public directories between builds.
-
After digging some more I found out that the plugins gatsby-plugin-sharp and gatsby-transformer-sharp, which I am using for the image blur-up and transform process, cache all images in the .cache folder on the initial build. On a subsequent (incremental) build, those plugins process all the images from scratch again even though they are already in the cache. Is there any way to stop the plugins from reprocessing all images on every build?
-
Description
I built a blog site with about 4,000 articles and 12,000 images (each article has 3 images). My data source is MySQL, and I am using the gatsby-source-mysql plugin to fetch data; I also pass my image URLs to remoteImageFieldNames for image processing. remoteImageFieldNames downloads the images and runs them through Gatsby's image processing pipeline using the gatsby-plugin-sharp and gatsby-transformer-sharp plugins. The build stats are given below.
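For context, the plugin setup looks roughly like the sketch below; the connection details, query, and field names are placeholders rather than the site's actual values:

```js
// gatsby-config.js: illustrative gatsby-source-mysql setup. The statement,
// idFieldName, and remoteImageFieldNames are placeholders, not the real ones.
module.exports = {
  plugins: [
    {
      resolve: "gatsby-source-mysql",
      options: {
        connectionDetails: {
          host: "localhost",
          user: "db_user",
          password: "db_password",
          database: "blog",
        },
        queries: [
          {
            statement: "SELECT * FROM articles",
            idFieldName: "id",
            name: "article",
            // URLs in these columns are downloaded and turned into File nodes
            // so gatsby-plugin-sharp / gatsby-transformer-sharp can process them.
            remoteImageFieldNames: ["image1_url", "image2_url", "image3_url"],
          },
        ],
      },
    },
    "gatsby-plugin-sharp",
    "gatsby-transformer-sharp",
  ],
}
```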
Steps to reproduce along with Actual result
Fresh build (which is acceptable for the first time):
For the actual data (4,000 articles and 12,000 images, 3 images per article), the whole build took 2.49 hrs, of which the source and transform step took 2.46 hrs.
Then I changed the author name in the article table in the database and built again. I expected this incremental build to take much less time, but it took 2.56 hrs (3 changed pages), and again most of the time, around 2.55 hrs, was spent in the source and transform step.
So I believe it is carrying out the whole source and transform process again even though it is an incremental build. Or is there a caching mechanism implemented within the plugin?
Moreover, during the source and transform step, CPU utilization stays at only 3-5%.
Expected result
Build time should be much lower for an incremental build, especially when working with a large data set.
Can you please help me resolve this issue?
Environment
I was using an AWS t3.medium EC2 instance with 4 GB RAM and 2 cores. Overall CPU utilization was about 3-5% the whole time.
Flags
System Info