Webpack Asynchronous On-Demand Loading

✍🏼 Written on May 2, 2016   
❗️ Note: this article was written some time ago, please be aware of its timeliness

Introduction

Webpack aims to achieve asynchronous loading, where the main module is loaded first, and additional requests are sent to load specific modules (i.e., bundled chunks) only when they are needed.

The purpose of this approach is to speed up the initial page load, at the cost of sending extra requests later. The two are inherently a trade-off: you can’t have your cake and eat it too. Here, we’ll look at the details of asynchronous loading.

Main Content

The implementation relies primarily on require.ensure([], callback). I noticed it because the output field of webpack.config.js has a field called chunkFilename, and my nitpicking nature couldn’t help wondering how it differs from the filename field. After some research, I found that filename (say, bundle.js) is the file that bundles all the js the page requires into the final js (for multi-page setups, common modules can be extracted, which isn’t the focus here). chunkFilename, on the other hand, names the files generated for non-entry chunks (chunks that are not listed in the entry field), and is mainly used for on-demand asynchronous module loading.
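To make the difference between the two fields concrete, here is a minimal configuration sketch. The entry path, output directory, and publicPath are hypothetical values chosen for illustration; only filename and chunkFilename come from the discussion above.

```javascript
// webpack.config.js: a minimal sketch, not a complete configuration.
// The entry path, output directory, and publicPath are assumptions.
var path = require('path');

module.exports = {
    entry: {
        app: './src/app.js'
    },
    output: {
        path: path.join(__dirname, 'dist'),
        // URL prefix the browser uses to fetch chunks; the runtime exposes it as __webpack_require__.p
        publicPath: '/dist/',
        // File emitted for the entry chunk
        filename: 'bundle.js',
        // Files emitted for non-entry chunks, for example those created by require.ensure
        chunkFilename: '[id].[name].js'
    }
};
```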

The files named by chunkFilename are not bundled into bundle.js and are depended on by only some (not all) modules. Since they need to be loaded asynchronously, they are bundled into additional js files via require.ensure. These files are still loaded into the page by the final bundle.js, which creates script tags and appends them to the page:

// This file contains only the entry chunk.
// The chunk loading function for additional chunks
__webpack_require__.e = function requireEnsure(chunkId, callback) {
    // "0" is the signal for "already loaded"
    if (installedChunks[chunkId] === 0)
        return callback.call(null, __webpack_require__);

    // an array means "currently loading".
    if (installedChunks[chunkId] !== undefined) {
        installedChunks[chunkId].push(callback);
    } else {
        // start chunk loading
        installedChunks[chunkId] = [callback];
        var head = document.getElementsByTagName('head')[0];
        var script = document.createElement('script');
        script.type = 'text/javascript';
        script.charset = 'utf-8';
        script.async = true;

        script.src = __webpack_require__.p + '' + ({}[chunkId] || chunkId) + '.js';
        head.appendChild(script);
    }
};
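The three states of installedChunks (undefined, an array of callbacks, and 0) can be tried outside the browser with a small simulation. This is only a sketch of the idea, not webpack's runtime: injectScript and chunkArrived are hypothetical stand-ins for the script tag and for the function the downloaded chunk file calls when it executes.

```javascript
// Simulation of the chunk states: undefined = never requested,
// array = currently loading, 0 = already loaded.
var installedChunks = { 0: 0 };   // the entry chunk is already loaded
var modules = {};                 // moduleId -> factory function
var pendingScripts = [];          // chunk ids whose "script" was requested

// Hypothetical stand-in for creating and appending a <script> tag.
function injectScript(chunkId) {
    pendingScripts.push(chunkId);
}

function requireEnsure(chunkId, callback) {
    if (installedChunks[chunkId] === 0) {
        return callback();                        // already loaded: run immediately
    }
    if (installedChunks[chunkId] !== undefined) {
        installedChunks[chunkId].push(callback);  // loading: queue behind the first request
    } else {
        installedChunks[chunkId] = [callback];    // first request: start loading
        injectScript(chunkId);
    }
}

// Hypothetical stand-in for what the downloaded chunk file does when it runs:
// register its modules, mark the chunk as loaded, flush the queued callbacks.
function chunkArrived(chunkId, moreModules) {
    Object.keys(moreModules).forEach(function (id) {
        modules[id] = moreModules[id];
    });
    var callbacks = installedChunks[chunkId] || [];
    installedChunks[chunkId] = 0;
    callbacks.forEach(function (cb) { cb(); });
}

var log = [];
requireEnsure(1, function () { log.push('first'); });
requireEnsure(1, function () { log.push('second'); }); // queued, no second script request
chunkArrived(1, { 5: function () {} });
requireEnsure(1, function () { log.push('third'); });  // chunk is loaded, runs synchronously
console.log(pendingScripts, log);
```

The point of the array state is that any number of require.ensure calls for the same chunk trigger only one network request.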

Okay, this is all easy to understand, but when reviewing the official documentation, I noticed a few details worth paying attention to.

Differences Between CommonJS and AMD require

CommonJS uses require.ensure([''], callback) to load modules asynchronously. AMD uses require with a dependency array, require([''], callback), just as regular AMD modules do.

However, the CommonJS form only loads the modules in the array without executing them; they execute only when they are required again inside the callback:

require.ensure(['./other/ensure.js', './other/ensure2.js'], function () {
    // Both modules are downloaded by now, but each one runs only when it is required here.
    var ensure = require('./other/ensure.js');
    var ensure2 = require('./other/ensure2.js');

    ensure();
    ensure2();
}, chunkFilename);

The require.ensure method ensures that every dependency in dependencies can be synchronously required when the callback is called. An implementation of the require function is passed to the callback as a parameter.

That parameter is a function implementing the require interface (if I’m not mistaken, it is just a reference to the require function).

The chunkFilename passed as the third argument to require.ensure will be overridden by the chunkFilename setting in output.
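The distinction between a module being loaded and a module being executed can be sketched with a toy module registry. Everything here is an assumption made for illustration: ensure and localRequire are hypothetical stand-ins for require.ensure and the require function passed to the callback, and the sketch is synchronous, whereas the real call waits for a network request.

```javascript
// "Loaded" means the factory is registered; "executed" means the factory has run.
var factories = {};   // moduleId -> factory, filled in when a chunk arrives
var cache = {};       // moduleId -> exports, filled in on first require
var executed = [];    // order in which factories actually ran

// Hypothetical stand-in for the require function handed to the callback.
function localRequire(id) {
    if (!(id in cache)) {
        var mod = { exports: {} };
        factories[id](mod);          // the module body runs here, on first require
        cache[id] = mod.exports;
    }
    return cache[id];
}

// Hypothetical stand-in for require.ensure: register the factories, then call back.
function ensure(chunk, callback) {
    Object.keys(chunk).forEach(function (id) {
        factories[id] = chunk[id];
    });
    callback(localRequire);
}

var chunk = {
    './other/ensure.js': function (mod) {
        executed.push('ensure.js');
        mod.exports = function () { return 'ensure'; };
    },
    './other/ensure2.js': function (mod) {
        executed.push('ensure2.js');
        mod.exports = function () { return 'ensure2'; };
    }
};

var executedBeforeRequire;
var result;
ensure(chunk, function (require) {
    executedBeforeRequire = executed.slice();   // nothing has run yet
    result = require('./other/ensure2.js')();   // only ensure2.js runs
});
console.log(executedBeforeRequire, executed, result);
```

In this sketch ensure.js is available but never runs, because the callback never requires it.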

AMD, on the other hand, follows its usual dependency preloading: the modules are executed at the time of the require call, and their exports are passed to the callback:

require(['./other/ensure.js', './other/ensure2.js'], function (
    ensure,
    ensure2
) {
    ensure();
    ensure2();
});
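For contrast with the CommonJS form, the AMD behavior can be sketched the same way. Again, this is an illustration under stated assumptions rather than webpack's code: amdRequire and amdFactories are hypothetical names, and the sketch is synchronous.

```javascript
// AMD style: every dependency is executed up front and its exports are
// handed to the callback as arguments.
var amdExecuted = [];
var amdFactories = {
    './other/ensure.js': function () {
        amdExecuted.push('ensure.js');
        return function () { return 'ensure'; };
    },
    './other/ensure2.js': function () {
        amdExecuted.push('ensure2.js');
        return function () { return 'ensure2'; };
    }
};

// Hypothetical stand-in for the AMD form of require.
function amdRequire(deps, callback) {
    var exportsList = deps.map(function (id) {
        return amdFactories[id]();   // each module body runs before the callback
    });
    callback.apply(null, exportsList);
}

var executedOnEntry;
var results = [];
amdRequire(['./other/ensure.js', './other/ensure2.js'], function (ensure, ensure2) {
    executedOnEntry = amdExecuted.slice();   // both modules have already run
    results.push(ensure(), ensure2());
});
console.log(executedOnEntry, results);
```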

Okay, I’m less familiar with the AMD form, so let’s use CommonJS to illustrate some details.

First, if you pass a callback to require.ensure, all modules required in the callback function will also be bundled into the final asynchronously loaded file.

Chunk Bundling Optimization Strategies

  1. If two chunks contain the same modules, they will be merged into one.
  2. If a module is available in all parent chunks of a chunk, that module will be removed from the chunk.
  3. If a chunk contains all the modules of another chunk, the final bundle will include only the chunk with more modules. This rule also applies when a chunk contains all the modules of multiple other chunks.

The second point is a bit tricky to understand. It describes a scenario where an entry file A.js contains the b module, and a chunk.js file generated using require.ensure also contains this b module. Since require.ensure is called in A.js, A.js is considered the parent chunk of chunk.js, so the b module is removed from the final bundled chunk.js. The phrase “available in all parent chunks” ties back to the first point: if several chunks contain the same modules, they are merged into a single chunk, and that chunk may then have multiple parent chunks (i.e., the chunks corresponding to the entry files).
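The second rule can be expressed as a small set operation. The sketch below is not webpack's implementation; removeAvailableModules and the chunk objects are hypothetical, and entryC is an invented second parent that does not contain the b module.

```javascript
// Rule 2 as a filter: drop from a chunk every module that ALL of its parents contain.
function removeAvailableModules(chunk, parents) {
    if (parents.length === 0) {
        return chunk.modules.slice();   // no parents, nothing can be removed
    }
    return chunk.modules.filter(function (mod) {
        var inEveryParent = parents.every(function (parent) {
            return parent.modules.indexOf(mod) !== -1;
        });
        return !inEveryParent;
    });
}

var entryA = { name: 'A', modules: ['A.js', 'b.js'] };
var entryC = { name: 'C', modules: ['C.js'] };   // hypothetical parent without b.js
var asyncChunk = { name: 'chunk', modules: ['chunk-main.js', 'b.js'] };

// A.js is the only parent and already has b.js, so b.js is dropped from the chunk.
var withOneParent = removeAvailableModules(asyncChunk, [entryA]);

// With a second parent that lacks b.js, the chunk must keep its own copy.
var withTwoParents = removeAvailableModules(asyncChunk, [entryA, entryC]);

console.log(withOneParent, withTwoParents);
```

The word "all" matters: a module can be removed only when every possible loading path has already provided it.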

Let’s verify this:

The code for the entry file app.js:

require('../other/if_be_remove.js')();
require.ensure(
    ['../other/ensure.js'],
    function () {
        require('../other/ensure.js')();
    },
    'love'
);

The code for another entry file app2.js:

require('../other/if_be_remove.js')();
require.ensure(
    ['../other/ensure2.js'],
    function () {
        require('../other/ensure2.js')();
    },
    'hate'
);

The code for ensure.js:

require('./if_be_remove.js')();
module.exports = function () {
    console.log("i'm be ensure!");
};

The code for ensure2.js:

require('./if_be_remove.js')();
module.exports = function () {
    console.log("i'm be ensure2!");
};

Finally, the code for if_be_remove.js, which is required by both the child chunks and the parent chunks:

module.exports = function () {
    console.log('im be removed!');
};

Let’s examine the js files loaded in the Network panel of Chrome DevTools (the chunks use the naming convention [id].[name].js):

app.js page:

[Image: webpack-async]

app2.js page:

[Image: webpack-async]

As we can see, if_be_remove.js is referenced in both chunks, 1-love.js and 3.hate.js, and also by the parents of those chunks, app.js and app2.js, so its code does not appear in either chunk.

Supplement: Chunk Concepts and Definitions

To clarify further, a chunk is one or several modules that together form an independent js file. Chunks can be categorized into the following types:

  1. Entry Chunks: the most common type of chunk, containing our business logic code (typically unique code that won’t be extracted into shared chunks). An entry chunk usually executes after the Initial Chunks finish loading (or when the module with ID 0 is encountered).
  2. Normal Chunks: chunks of modules loaded dynamically while the application is running. Webpack creates an appropriate loader, such as JSONP, for the dynamic loading.
  3. Initial Chunks: essentially still Normal Chunks, but loaded during application initialization. This type of chunk is often generated by CommonsChunkPlugin and contains global module location information. Code in Entry Chunks depends on it, so it should be the first js loaded.

In our previous example, the bundle.js packaged as the shared js for all or some pages is an Initial Chunk. A page-specific chunk such as app.xxxxxx.js is an Entry Chunk, while chunks loaded asynchronously via require.ensure, such as 3-hate and 1-love, are Normal Chunks.

Originally published at: Webpack Asynchronous On-Demand Loading - Xheldon Blog