Free Use of Private Repositories for GitHub Pages Publishing

✍🏼 Written on Oct 31, 2020   

Preface

Recently, I explored using GitHub Pages in conjunction with GitHub Actions and found it highly suitable for my personal blog scenario, hence this article.

Previously, my blog directly stored the source code in the repository, using GitHub Pages’ default Jekyll setup with a custom domain. However, this approach had several issues:

  1. Inability to hide source code. The articles under your _posts directory could be freely copied and republished elsewhere as someone else’s content.
  2. Inability to hide revision history. If you suddenly wanted to remove certain content from your blog and deleted parts of it, others could still access the file’s history through the repository’s commit records, leaving your modifications fully exposed.
  3. Inability to use Gitalk. The first two issues could be resolved by upgrading to GitHub Pro or higher, as these paid accounts allow private repositories to use GitHub Pages. However, this would prevent the use of tools like Gitalk, which require third-party write access to issues for comments.
  4. Inconsistencies or untraceable errors between local and production builds. Since GitHub Pages’ Jekyll is a black box for users, troubleshooting becomes impossible.
  5. Inability to use certain custom/third-party plugins. GitHub Pages only supports [a limited set of plugins](https://pages.github.com/versions/), meaning tools like jekyll-paginate-v2, which enables pagination for categorized posts beyond just the homepage, cannot be used.

For these reasons, I decided to publish the compiled site to GitHub Pages instead of the original source. Since the blog repository already had some Gitalk-generated issue comments, I created a new private repository to store the source code and kept the old repository as the GitHub Pages deployment target, which now holds only the compiled output. This preserves the existing comment data while avoiding the issues above. Because I force-push every time I build and publish to GitHub Pages, the published repository keeps no file modification history, which provides some privacy and raises the cost of copying. The process is described below.

Overall Workflow

The basic workflow is as follows: assume the old repository is called A, and the new private repository that stores the source code is called B. A push to repository B triggers GitHub Actions in B, which builds the site and pushes the compiled output (i.e., the contents of the _site folder) to repository A. Since repository A is already set up for GitHub Pages with a custom domain, no additional configuration is needed.

Detailed Process

In GitHub Actions, a few key concepts need to be clarified first:

Hierarchy

From largest to smallest:

  1. The workflow file: This refers to the ci.yml file that GitHub Actions executes.
  2. Jobs: Tasks configured under the jobs key. By default, jobs run in parallel, but dependencies can be set using the needs keyword.
  3. Steps: Defined under steps, these are executed sequentially within a job. Each step runs in its own process, but all steps of a job share the same runner and its filesystem. A job can contain many steps.
  4. Actions: Not every step runs an action, but actions are always executed within steps. A step either runs a shell command directly (run), such as printing the current directory or installing dependencies, or calls a reusable action (uses).
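
The hierarchy above can be sketched as a minimal workflow file. This is an illustrative skeleton, not the workflow used for this blog; the workflow name, the job names build and deploy, and the echo command are placeholders.

```yaml
# .github/workflows/ci.yml: the workflow file (level 1)
name: Hierarchy Demo

on: push

jobs:
  # Level 2: jobs run in parallel unless linked with `needs`
  build:
    runs-on: ubuntu-latest
    steps:
      # Level 3: steps run one after another inside a job
      # Level 4: this step calls a prebuilt action...
      - uses: actions/checkout@v2
      # ...while this step runs a plain shell command
      - name: Print the current directory
        run: pwd

  deploy:
    needs: build # wait for the build job to finish first
    runs-on: ubuntu-latest
    steps:
      - name: Placeholder deploy command
        run: echo "deploying"
```

Without the needs line, build and deploy would start at the same time.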

Using Others’ Steps

Actions let you reuse steps written by others instead of writing them yourself. For example, to check out a branch, simply use:

    - uses: actions/checkout@v2
      with:
        persist-credentials: false # false: use a personal token; true: use the GitHub token
        fetch-depth: 0

The with key holds the action’s input parameters, which are listed in the action’s documentation.
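
As a further illustration of how with passes settings, the sketch below checks out a different repository into a subdirectory. repository, ref, and path are inputs documented by actions/checkout; the repository name, branch, and directory used here are placeholders chosen for the example.

```yaml
steps:
  - uses: actions/checkout@v2
    with:
      repository: octocat/Hello-World # placeholder: owner/name of the repository to fetch
      ref: master                     # placeholder: branch, tag, or commit SHA
      path: vendor/hello              # placeholder: directory to check out into
```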

Encrypted Data

You certainly wouldn’t want sensitive data like Personal Token or other private information exposed in your CI files for everyone to see. Therefore, you need an alternative way to use such data—by encrypting it and referencing it by name. Here, it’s important to understand the differences between types of encrypted data:

  1. GITHUB_TOKEN. If the workflow only builds or operates on its own repository, you don’t need to configure anything in the repository settings. GitHub automatically creates a GITHUB_TOKEN secret for each workflow run, available as secrets.GITHUB_TOKEN. It can be used for authentication, such as pushing code to the current repository.
  2. Personal Token. If the CI runs in one repository but needs to act on another (in this article, the workflow in repository B pushes to repository A), you need a Personal Token created by an account with write access to the target repository, with the necessary permissions assigned.
  3. Custom Encryption. Once you have the Personal Token from the second item, how do you use it in the repository that runs the CI? This is where secrets (custom encryption) come into play. In the settings of that repository, define a secret with a name of your choice, such as A_REPO_TOKEN, and paste in the Personal Token you just created. You can then reference it as secrets.A_REPO_TOKEN.
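
A minimal sketch of how the two kinds of token might be referenced in a step. It passes each secret through env, so the script sees it only as an environment variable. MY_TOKEN and TARGET_TOKEN are hypothetical variable names, and X_BLOG_SITE is the secret name used later in this article.

```yaml
steps:
  - name: Use the automatic token (current repository only)
    env:
      MY_TOKEN: ${{ secrets.GITHUB_TOKEN }} # created by GitHub for each run
    run: test -n "$MY_TOKEN" && echo "automatic token is available"

  - name: Use a custom secret (for another repository)
    env:
      TARGET_TOKEN: ${{ secrets.X_BLOG_SITE }} # defined by hand in the repository settings
    run: test -n "$TARGET_TOKEN" && echo "custom secret is available"
```

If the custom secret has not been defined, the expression evaluates to an empty string, so the second step fails at the test command rather than later during the push.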

Detailed Steps

The concepts above cover what is needed. Below is the complete workflow file, with comments on each part. You can copy it and make minor adjustments as needed:

    # Name of this workflow; shown in the repository's GitHub Actions tab
    name: Blog Generator

    # Trigger conditions
    on:
      # Run this workflow when code is pushed to the master branch
      # TODO: skip pushes whose commit message contains a specific keyword
      push:
        branches: [master]

    jobs:
      # build-and-push is the job name (any name works); there can be multiple jobs, which run in parallel by default
      build-and-push:
        runs-on: ubuntu-latest # environment this job runs in
        steps:
          # The first step is usually to check out the current repository, using the official actions/checkout@v2
          - uses: actions/checkout@v2
            with: # with passes the required parameters
              persist-credentials: false # false: use a personal token; true: use the GitHub token
              fetch-depth: 0 # ensures the push succeeds

          # Set up the Ruby environment; this also uses an official action
          - name: Set up Ruby
            uses: ruby/setup-ruby@v1
            with:
              ruby-version: 2.6

          # Install dependencies
          - name: Install dependencies # step name, shown in the GitHub Actions job output to help with debugging
            run: bundle install # command to run

          # Build the static site
          - name: Build Pages
            run: bundle exec jekyll build

          - name: Add Message
            working-directory: ./_site # jekyll builds to _site by default, so run the commands in ./_site
            run: | # a '|' after run, followed by a newline, lets you run multiple commands, one per line
              echo "www.xheldon.com" > CNAME
              echo -e "# [Xheldon's Tech blog](https://www.xheldon.com)" > README.md

          - name: Commit and Push # push the built files in _site to repository A
            working-directory: ./_site
            run: |
              git init
              git checkout -b master
              git add -A
              git -c user.name='github actions by ${{github.actor}}' -c user.email='NO' commit -m 'update'
              git push "https://${{github.actor}}:${{secrets.X_BLOG_SITE}}@github.com/Xheldon/x_blog.git" HEAD:master -f -q

The final step needs some explanation: X_BLOG_SITE is the name of my custom secret, whose value is the configured Personal Token. Additionally, if the CI only operates on the current repository, there’s no need to hardcode the repository name when pushing. Instead, you can use context values in the following format:

    git push "https://${{github.actor}}:${{secrets.GITHUB_TOKEN}}@github.com/${{github.repository}}.git" HEAD:master -f -q

That is, you reference the relevant information as github.XXX. The GitHub Actions documentation on contexts describes the available properties.
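
The github context can also address the TODO noted in the workflow file (skipping pushes whose commit message contains a keyword). The following is one possible sketch, not something the original workflow does. The keyword [skip blog] is a hypothetical choice, and github.event.head_commit.message is the message of the head commit carried by a push event.

```yaml
jobs:
  build-and-push:
    # Skip the whole job when the head commit message contains the keyword
    if: ${{ !contains(github.event.head_commit.message, '[skip blog]') }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
```

The condition is wrapped in ${{ }} because an expression that begins with ! would otherwise be read as YAML tag syntax.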

Differences from Travis

I’ve used Travis before, and the core concepts are largely similar. The most notable difference is that GitHub Actions allows referencing pre-written actions from others—similar to ‘requiring’ someone else’s package. You don’t need to maintain this package; you can use it directly. This offers more flexibility, requires less configuration, and enhances reusability.

- EOF -
Originally published at: Free Use of Private Repositories for GitHub Pages Publishing - Xheldon Blog