31 Jul 2018, 20:37

A Quick Look at WorkerDOM

In the browser, many operations occur on a single main thread. Due to the number of things we need to handle as web developers such as styling, DOM updates, fetches, data transforms, timers and so forth we can end up with a lot going on that thread. Unfortunately, this thread is also responsible for handling user inputs and rendering to the screen, which are user critical requirements and have large impacts on their experience. Here if we have tasks that run for long periods this can block responding to user input and rendering new frames.

In the browser we have access to threading via Web Workers. Web Workers allow us to run JavaScript in a separate thread whilst also being able to message back to the main thread. The limitations of Web Workers are that they have transfer times for data between the main and worker thread, limited transfer types (i.e. Functions and Errors aren’t allowed) and also perhaps most importantly no DOM access (i.e. document.getElementByID for example). I have experimented and written about transfer times for workers, showing that for small loads the fear is probably over played. However a core aspect of frontend development is, maybe somewhat obviously, trying to update what is happening on the screen (render new content). This makes the DOM limitation pretty painful for those using Web Workers, as it mostly limits it to computational logic which we then proxy back to the main thread.

On mobile devices, where performance perhaps matters the most, we’re now seeing low-end devices with multiple cores (for example a Nokia 1 and the Micromax Bharat Go both have quadcore processors). Arguably this is where multiple threaded code would probably have the largest benefit. However, we still have a particularly single-threaded approach to writing browser code. This is why it’s interesting to see the emergence of the recently announced WorkerDOM library from the AMP team, which aims to let developers leverage workers more easily, by removing the aforementioned limitation of lack of DOM access (or at least perceived lack of DOM access).

What is WorkerDOM?

WorkerDOM is a library which you can include in your applications and pages. It provides much of the regular DOM API and in turn proxied access to the real DOM on the main thread, passing mutations to the DOM over to the main-thread to be handled. This encourages a lot of logic that was previously happening on the main thread to be handled by the Web Worker, leaving the main thread to handle user input and DOM changes. The library itself is written in TypeScript and is compiled down to both global variable and module formats.

Setting up WorkerDOM

You can include WorkerDOM by installing it via NPM like so:

npm install @ampproject/worker-dom

or if you use Yarn then:

yarn add @ampproject/worker-dom

Alternatively you could use a script tag directly, using a CDN, in the following fashion:

<script src="https://unpkg.com/@ampproject/worker-dom@0.1.2/dist/index.js"></script>

We can actually take this a step further as WorkerDOM is distributed as both a global (aliased as MainThread) and an ES Module, allowing for the import syntax. This means you can do the following;

<script src="https://unpkg.com/@ampproject/worker-dom@0.1.2/dist/index.mjs" type="module"></script>
<script src="https://unpkg.com/@ampproject/worker-dom@0.1.2/dist/index.js" nomodule defer></script>

Here only browsers that understand modules (all modern browsers, except Samsung Internet now support modules) will receive the module script.

First Steps

WorkerDOM allows us to expose a specific part of the DOM to be upgraded, we can do this by doing something we would probably not normally do; we set the src attribute on the containing element we want to upgraded to work with WorkerDOM. The src attribute is updated to be the name of the script we’re interested in running as our worker script. For the purposes of this blog, we are going to generate some prime numbers and render them into the DOM. For our index.html file, we might start with something like this in our case:

<body>
    <div src="primes.js" id="primes">
    </div>
</body>

So now we have an element that we can upgrade in our body. We need to declare explicitly that we want to upgrade the element using the upgradeElement method. We can do that like so:


<script type="module">
    import {upgradeElement} from '/dist/index.mjs';
    upgradeElement(document.getElementById('primes'), '/dist/primes.mjs');
</script>

<script nomodule async=false defer>
    document.addEventListener('DOMContentLoaded', function() {
        MainThread.upgradeElement(document.getElementById('primes'), '/dist/primes.js');
    }, false);
</script>

You can see we take both the module and none module code paths allowing us to handle both scenarios. This means we can now begin looking at our actual worker logic!

The Worker File

The coolest part about WorkerDOM is that it allows us to behave as if we have DOM access in the worker thread. That means we have access to properties like document.createElement for example. Continuing on with the concept that we want to generate primes and add them to the DOM, we could do something like this:


const startNumber = 1;

function generatePrimes() {

    while (document.body.firstChild) {
        document.body.removeChild(document.body.firstChild);
    }

    const numDivs = 1000;
    const limit = startNumber + numDivs;
    const primes = sieveOfEratosthenes(limit); // An algorithm for generating primes up to 'limit'

    const div = document.createElement('div');
    div.className = 'parent';

    for (let i = startNumber; i < limit; i++) {
        const numberDiv = document.createElement('div');
        numberDiv.className = "number";
        const numberText = document.createTextNode(i);
        if (primes.has(i)) {
        numberDiv.style.fontWeight = 'bold';
        numberDiv.style.color = '#240098';
        }
        numberDiv.appendChild(numberText);
        div.appendChild(numberDiv);
    }

    document.body.appendChild(div);
    startNumber += numDivs;

}

setTimeout(generatePrimes, 0); // Not sure why we need this in a timeout?
document.body.addEventListener('click', generatePrimes);

This code will create divs with numbers in them, with prime numbers being highlighted and put in bold. The numbers will update on click on the document. Notice how we can behave as if this worker code is on the main thread, with access to DOM APIs. It’s worth pointing out the main difference here is that the "primes" div from the index.html is considered our document.body here.

As it stands, WorkerDOM is currently in alpha, and as such is still being worked on. Hopefully, this post has given you a reasonable overview of the library, and if you are interested in learning more about WorkerDOM I would recommend these resources:

22 May 2018, 19:37

Examining Web Worker Performance

Recently I’ve been writing about Web Workers and various options developers have for leveraging them in their applications. For those unfamiliar, Web Workers allow you to create a separate thread for execution in a web browser. This is powerful as it allows work to be done off the main thread which is responsible for rendering and responding to user events. Over recent times we’ve seen a growth in Web Worker libraries such as greenlet, Workerize and Comlink to name a few.

One thing that’s been swirling around in my head is the question of what is the tradeoff of using a Web Worker? They are great because we can leave the main thread for rendering and responding to user interactions, but at what cost? The purpose of this post is to examine empirically where Web Workers make sense and where they might improve an application.

Benchmarking Web Workers

I set out trying to benchmark the performance of Web Workers within the browser. This data was collected based on code I wrote which manifested as a hosted app which can be found here. All the performance numbers specified are on my Dell XPS, Intel Core i7-4500 CPU @ 1.80GHz, 8GB of RAM, running Xubuntu. References to Chrome are version 66 and for Firefox version 59. To preface there is some possibility that numbers are slightly skewed due to garbage collection which is automated by the browser.

Web Worker Performance

At a high level creation and termination of Web Workers is relatively cost free depending on your tolerance for main thread worker:

  • Creation normally comes in at sub 1ms on Chrome
  • Termination is sub 0.5ms on Chrome

The real cost of Web Workers comes from the transfer of a data from the main thread to the Web Worker (worker.postMessage) and the return of data to the main thread (postMessage - onmessage).

This graph reflects this cost. We can see that increased data transfer sizes result in increased transfer times. More usefully we can deduce:

  • Sending an object of 1000 keys or less via postMessage comes in sub-millisecond, and 10,000 is ~2.5ms on Chrome
  • Over this we have more noticeable transfer costs to the worker; 100,000 comes in at ~35ms and 1,000,000 at ~550ms again on Chrome
  • onmessage timings are fairly comparable to this, although coming in slightly higher

Greenlet Performance

There has been an open issue on Jason Miller’s greenlet library for a while now which asks about the performance implications of using the library. As such I extended my research to also explore the library.

Overall Greenlet performance is slower than inlined Web Workers when you combine posting to and from the worker thread. This comes to ~850ms vs ~1700ms (i.e. around double) in Chrome at the 1,000,0000 key level but is slightly less pronounced in Firefox at ~1500ms vs ~2300ms. It’s difficult to deduce why this is the case and may have something to do with the ES6+ to ES5 transpilation process in Webpack or some other factor that I am unaware of (please feel free to let me know if you have an idea!). Overall, however, it’s a substantially easier abstraction for developers to deal with, so this needs to be taken into consideration. The main takeaways for people interested in using greenlet in anger are:

  • Objects with sub 10,000 entries greenlet appear to be sub 50ms for Chrome and Firefox
  • Increase fairly linear after that point (~150ms for 100,000 vs ~1500ms for 1,000,000)

Data Transfer Using JSON.stringify and JSON.parse

Using stringify and parse appears to yield fairly comparable results on Chrome but generally performs better than passing raw objects on Firefox. As such I would recommend having a look for yourself at the demo here to make your own conclusions or test with your own data.

Browser Differences

On average Chrome outperforms Firefox especially under heavier data transfers. Interestingly it performs substantially better at postMessage back to the main thread from the worker by up to a factor of three, although I am unsure as to why. I would love to hear more about how this works on Safari, Edge and other browsers. Again here JSON.stringify and JSON.parse might behave differently on these browsers.

Transferables

Transferables behave more or less as expected with a near constant transfer cost; transfers of all sizes to and from the worker were sub 10ms.

The underlying idea here you can transfer values of type ArrayBuffer, MessagePort, ImageBitmap comparatively cheaply, which is a going to be a large performance boost if you’re using Web Workers, especially if your data is any considerable size. For example, you might transfer geometries as a Float32Array for speedier transfer.

Talking Points

This is the point at which we look at ways of making decisions around when to use Web Workers. Ultimately this is not a simple question but we can make some inferences from the data collected.

Smaller Workloads and Render Blocking

Objects of sub 50,0000 entries (or equivalent complexity) are on average in Chrome going to be less than 16ms to execute a postMessage and shouldn’t have too much noticeable effect on render performance for the user (i.e. there is some possibility that a frame render or two is skipped). However, overall using a Web Worker in a worse case in this situation will add up to ~50ms of overall processing time on top of the work the worker actually has to do. The trade-off is not blocking the main thread with heavy work, but taking a little extra time for the results to come back.

Large Workloads

Transfering over 100,000 entry object (or equivalent) is most likely going to have a noticeable blockage on the main thread because of the cost of postMessage. A recommendation could be to batch up heavy work into multiple postMessages so that the chance of frame rendering being blocked at any point is substantially reduced. You could even spin up a pool of workers and implement prioritisation strategies (see the fibrelite library I worked on for inspiration). Furthermore it may be worth considering if the data can be turned into Transferables which have a fairly minimal constant cost which could in turn be a massive performance boost.

The Trade Off

Ultimately here we are trading off transfer time to and from the Web Worker in exchange for preventing long render blocking tasks in the main thread. You may make overall times to results longer but prevent poor user experience in the process, for example a janky input or scroll experience.

If you can avoid your object transfers to and from the worker being render blocking you can move complex processing over to a Web Worker with the only cost of being the transfer times. We have shown for simple objects (1000 keys or less) should be sub-millisecond.

Final Thoughts On Web Worker Performance

Hopefully this data and commentary has helped explore the cost and benefits of Web Workers. Overall, we have shown how they can be a big win in the right situations. Blocking the main thread with heavy work is never going to be great for user experience, so we can make use of Web Workers here to prevent that, especially when transfer times are low (smaller data loads). For larger loads it might be worth batching work to prevent extensive blocking postMessages.

Although Web Workers are very useful, they do have a cost; the transfer times increases the overall time to work being finished, and they can add complexity to a code base. For some cases, this tradeoff might be undesired. In these situations it might be worth exploring using requestAnimationFrame and requestIdleCallback with batched workloads on the main thread to keep rendering and user interactions fluid.

It’s also worth concluding with the idea that Web Workers can be used for things other than simply running long running tasks however. David East recently wrote an article about wrapping the Firebase JavaScript SDK into a Web Worker which means the importing and parsing of the SDK is handled in the worker, leaving the main thread able to handle user input, reducing the First Input Delay (FID).

Lastly I think there is some strong potential for Web Workers to become core elements of some web frameworks. We’ve seen this jump recently in the start of React Native DOM by Vincent Riemer which tries to move work off the main thread into worker threads leaving the main thread for rendering and handling user input. Time will tell if this takes off as an approach!