GitLab Pages YML Breakdown
Rather than establish an understanding of CI/CD (continuous integration / continuous deployment) in this post, I would rather skip over that and talk about one specific example. There is a benefit in understanding the why; however, sometimes we just need to get something done, regardless of our understanding.
First of all, the full configuration file:
image: node:10.15.3
cache:
  paths:
    - node_modules/
before_script:
  - npm install hexo-cli -g
  - test -e package.json && npm install
  - hexo generate
pages:
  script:
    - hexo generate
  artifacts:
    paths:
      - public
  only:
    - master
image Breakdown
image: node:10.15.3
To build a project on a build agent, specifically a build agent we do not own, we must first define the environment and dependencies needed to build the project. Remember, this is not our local development environment anymore, so we have to define everything. The image property allows us to specify which Docker image the build agent runs as its container. Luckily, in this example the build agent configuration is quite simple: node:10.15.3 is readily available via Docker Hub.
cache Breakdown
cache:
  paths:
    - node_modules/
When building a project, the project is likely to reference similar, if not the same, items between builds. To save on repeat download time and speed up the build, we want to store these repeat files somewhere off to the side for the next build. For this we use a cache. In the configuration, we define the path we want to cache as node_modules/ and everything under it. The first build that is executed will prime the cache with all the required node modules being referenced. As the node modules change over time, the cache will be updated, so later builds only have to fetch the newly required or changed node modules.
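One possible refinement, which is not part of the example project, is to give the cache a key. The following is a minimal sketch, assuming you want each branch to keep its own copy of node_modules/ rather than share one. Here key is a standard GitLab CI keyword and CI_COMMIT_REF_SLUG is a predefined GitLab variable holding a URL-safe form of the branch name; whether per-branch caching is worth it depends on how much your branches' dependencies differ.

```yaml
# Sketch only: a per-branch cache (assumption, not in the original file).
cache:
  # One cache per branch, named after the branch.
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    # Same cached directory as in the original configuration.
    - node_modules/
```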
before_script Breakdown
before_script:
  - npm install hexo-cli -g
  - test -e package.json && npm install
  - hexo generate
Now that we have a brand new build agent and have told it there is a cache to use, we need to prepare it to build. This is where before_script comes in. We pass in an array of commands to be executed in the order they appear. First, the build agent installs the Hexo CLI tools needed to generate the static files. Then, it tests for the existence of a package.json file. If the file exists, the build agent installs all the necessary node modules. Finally, the build agent generates the static site from the repository. The output of the static site is referred to as a deployment artifact. This is what will be deployed to GitLab Pages in the final step.
pages Breakdown
pages:
  script:
    - hexo generate
  artifacts:
    paths:
      - public
  only:
    - master
The final step in the process is deployment to the webserver. pages is a special node in the yml file, unique to GitLab Pages. It gives the build agent the details needed to build and deploy to the GitLab Pages webserver(s). The first argument is script, which defines the command to execute to build the deployment artifact. Wait, didn't we just do this in the previous step? Funny story, yeah, we did. I will get to this later, so that a clearer explanation does not take away from the deployment configuration. Now that we have an artifact to deploy, we need to define which directory should be deployed onto the webserver. That is done by defining artifacts > paths. If you have ever executed the hexo generate command locally, you will have seen that the output is placed in a directory named public. Finally, only limits which branch or branches will trigger the execution. In my case, master is the only branch. However, if we had other branches for experimentation, I might add additional steps here to deploy the experiments to some other location.
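As a rough illustration of what such an additional step might look like, here is a minimal sketch of a second job that builds the site on every branch except master and keeps the output as a downloadable artifact. The job name preview is hypothetical, and this is an assumption about one way to do it rather than something from the example project. Because the job is not named pages, GitLab would not publish its output to GitLab Pages; except is the counterpart of the only keyword used above.

```yaml
# Sketch only: "preview" is a hypothetical job name, not in the original file.
preview:
  script:
    # Build the static site exactly as the pages job does.
    - hexo generate
  artifacts:
    paths:
      # Keep the generated site so it can be downloaded and inspected.
      - public
  except:
    # Run on every branch other than master.
    - master
```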
Why is hexo generate executed twice?
As mentioned above, hexo generate is executed 2 times. First, it is executed as part of before_script. Then it is executed as part of pages > script. But why? At first, this was not abundantly clear to me. Remember, this is all pretty new to me, so this seems like a learn-by-experiment opportunity.
Experiment #1 - Remove the command from before_script
before_script:
  - npm install hexo-cli -g
  - test -e package.json && npm install
pages:
  script:
    - hexo generate
  artifacts:
    paths:
      - public
  only:
    - master
Removing the execution of hexo generate from before_script is a logical separation. Now, before_script has the sole responsibility of preparing the build agent for the build and deployment. After committing this change, everything worked as expected. It even took ~5 seconds off the total pipeline time. Not bad, but is this the only solution?
Experiment #2 - Remove the command from pages > script
before_script:
  - npm install hexo-cli -g
  - test -e package.json && npm install
  - hexo generate
pages:
  artifacts:
    paths:
      - public
  only:
    - master
The above was an immediate failure. It turns out that script is required under the pages configuration. Time to roll the yml configuration back to experiment #1.
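For reference, the complete file after rolling back to experiment #1 would look roughly like the following. It is simply the original configuration with the hexo generate line removed from before_script; the comments are mine, and nothing else is new.

```yaml
# Docker image the build agent runs in.
image: node:10.15.3

# Reuse downloaded node modules between builds.
cache:
  paths:
    - node_modules/

# Prepare the build agent: install the Hexo CLI, then the project's dependencies.
before_script:
  - npm install hexo-cli -g
  - test -e package.json && npm install

# Build the site once and publish the public directory to GitLab Pages.
pages:
  script:
    - hexo generate
  artifacts:
    paths:
      - public
  only:
    - master
```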
This is great: in detailing the yml, I found an unnecessary configuration in it. I guess this is a prime opportunity to contribute to the open-source community. I am opening a pull request to the example project; hopefully, it will be accepted.
For additional resources check out this reference.