sub-millisecond timing in browser audio

When I was building a browser-based instrument, the first thing I learned was that audio is unforgiving about timing. A keyboard event that arrives a few milliseconds late shows up as a faint inconsistency. Past around 20 milliseconds, the delay starts to feel sluggish. Somewhere beyond 50 milliseconds, the connection between gesture and sound breaks — you stop believing you caused it. These thresholds are approximate and vary by listener, but the direction is clear: audio tolerates less latency than almost anything else in a browser.

the wrong clock

My instinct was to reach for Date.now() or performance.now() when an input event fired. Both measure time, both report milliseconds, and both turned out to be the wrong clocks to trust for precise audio scheduling. Date.now() is wall-clock time with millisecond resolution, and it can jump when the system clock is adjusted. performance.now() is monotonic with nominally sub-millisecond resolution (browsers may coarsen it), but a reading taken in an event handler is a main-thread reading, and the main thread is not the thread that produces audio samples. There is a gap between when the main thread observes an event and when the audio thread can act on it. That gap varies. It is not zero.

the right clock

The Web Audio API exposes AudioContext.currentTime, a high-resolution monotonic clock measured in seconds — the clock the audio engine uses for scheduling. At 48 kHz, one sample period is roughly 0.02 milliseconds, which gives a sense of the time scale involved. Any sound scheduled against this clock is rendered on that same timeline, which is the only one that matters for what the listener actually hears.
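To make the time scale concrete, here is the arithmetic as a small sketch. The helper names are my own, and the 128-frame block size is an assumption: it is the default Web Audio render quantum, which is the granularity I would expect the engine to work in unless configured otherwise.

```javascript
// Duration of one sample frame, in milliseconds, at a given sample rate.
function samplePeriodMs(sampleRate) {
  return 1000 / sampleRate;
}

// Duration of one block of frames, in milliseconds.
// 128 frames is assumed here as the default render quantum.
function blockPeriodMs(sampleRate, framesPerBlock = 128) {
  return (framesPerBlock * 1000) / sampleRate;
}

console.log(samplePeriodMs(48000).toFixed(4)); // 0.0208
console.log(blockPeriodMs(48000).toFixed(2));  // 2.67
```

The second number is the one to keep in mind when reading the clock: currentTime advances a block at a time, so two reads within the same block can return the same value even though the scheduling timeline itself is sample-accurate.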

The pattern I ended up using: when a UI event arrives, immediately read audioCtx.currentTime, then schedule the resulting audio against that timestamp plus a small lookahead. The lookahead, typically 5 to 20 milliseconds, keeps the start time safely in the future by the time the audio thread sees the request, so the sound begins exactly where it was scheduled rather than whenever the engine gets to it. It is still short enough that the listener perceives the response as instant.

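// inside a keydown handler; assumes `gain` is already connected to audioCtx.destination
const startAt = audioCtx.currentTime + 0.005; // 5 ms lookahead
const osc = audioCtx.createOscillator(); // single-use: start() can only be called once
osc.connect(gain);
gain.gain.setValueAtTime(0, startAt); // silent at the moment the note begins
gain.gain.linearRampToValueAtTime(1, startAt + 0.01); // 10 ms attack
osc.start(startAt);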

Everything is anchored to currentTime. Nothing is anchored to Date.now(). The lookahead absorbs the variance between the main thread and the audio thread.
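A slightly fuller sketch of the same pattern, wrapped in a function so it can be reused per keypress. The name scheduleNote, its option names, and the default values are all hypothetical choices of mine, not part of the Web Audio API. The envelope (a short attack followed by a linear fade) is just one simple shape picked for illustration.

```javascript
// Hypothetical helper: plays one short note scheduled entirely on the audio clock.
// `ctx` is an AudioContext, or any object exposing the same methods.
function scheduleNote(ctx, options = {}) {
  const { frequency = 440, lookahead = 0.005, attack = 0.01, duration = 0.3 } = options;

  // One read of the audio clock; every later time is derived from it.
  const startAt = ctx.currentTime + lookahead;
  const stopAt = startAt + duration;

  const osc = ctx.createOscillator(); // oscillators are single-use, so make a new one per note
  const gain = ctx.createGain();
  osc.frequency.value = frequency;

  // Envelope: silent at startAt, full level after `attack`,
  // then a linear fade back to silence at stopAt.
  gain.gain.setValueAtTime(0, startAt);
  gain.gain.linearRampToValueAtTime(1, startAt + attack);
  gain.gain.linearRampToValueAtTime(0, stopAt);

  osc.connect(gain);
  gain.connect(ctx.destination);
  osc.start(startAt);
  osc.stop(stopAt);

  return { startAt, stopAt };
}

// Browser-only wiring; skipped when no AudioContext is available.
if (typeof window !== 'undefined' && typeof AudioContext !== 'undefined') {
  const audioCtx = new AudioContext();
  window.addEventListener('keydown', (event) => {
    if (event.repeat) return; // ignore keyboard auto-repeat
    audioCtx.resume();        // contexts often start suspended until a user gesture
    scheduleNote(audioCtx, { frequency: 440 });
  });
}
```

Returning the computed times is a design choice rather than a requirement: it lets the caller line up visuals or follow-on notes against the same audio-clock values instead of re-reading a different clock.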

two thread problems

The deeper issue is that the main thread and the audio thread do not run in lockstep. The main thread handles UI, scrolling, layout, garbage collection, and a dozen other jobs that can stall it for milliseconds at a time. The audio thread runs on a tight, fixed cadence, asking for new samples every few milliseconds whether the main thread is ready or not. If the main thread is busy when an event fires, the event is delivered late. If the audio thread is starved when it needs samples, the listener hears a glitch.
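One way to see the first problem in numbers is to compare the event's own timestamp with the time at which the handler actually runs. The sketch below assumes event.timeStamp and performance.now() share a time origin, which is how current browsers behave as far as I know. Both helper names are mine, and the idea of trimming the lookahead by the delay already incurred is one possible policy, not an established recipe.

```javascript
// How long an input event waited before its handler ran, in milliseconds.
// Assumes both arguments are measured from the same time origin.
function deliveryDelayMs(eventTimeStampMs, nowMs) {
  return Math.max(0, nowMs - eventTimeStampMs);
}

// Hypothetical policy: treat the lookahead as a budget measured from the gesture,
// so a late handler schedules with less lookahead, down to a small safety floor.
function remainingLookahead(lookaheadSec, delayMs, minLookaheadSec = 0.002) {
  return Math.max(minLookaheadSec, lookaheadSec - delayMs / 1000);
}

// Browser-only wiring; skipped when there is no window object.
if (typeof window !== 'undefined') {
  window.addEventListener('keydown', (event) => {
    const delay = deliveryDelayMs(event.timeStamp, performance.now());
    const lookahead = remainingLookahead(0.02, delay);
    console.log(`handler ran ${delay.toFixed(1)} ms late, lookahead ${(lookahead * 1000).toFixed(1)} ms`);
  });
}
```

With a 20 millisecond budget, a handler that runs 5 milliseconds late would schedule 15 milliseconds ahead, so the gesture-to-sound interval stays roughly constant. Once the delay exceeds the budget, the floor takes over and the note is simply late.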

The schedule-against-currentTime pattern contains the first problem: it cannot recover time already lost before the handler ran, but it keeps further main-thread jitter from reaching the sound. The second problem requires a different tool: the AudioWorklet, which moves synthesis work off the main thread and into the audio rendering path. For anything beyond simple oscillator scheduling, this is the most reliable path to low-latency audio in the browser.
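As a minimal sketch of what that looks like, here is a sine generator split into a pure DSP function and a thin AudioWorkletProcessor wrapper. The names renderSine, SineProcessor, 'sine-processor', and the file name in the comments are hypothetical, and the fixed 440 Hz frequency is only there to keep the example short.

```javascript
// Pure DSP core: fills `out` with a sine wave starting at `phase` (radians)
// and returns the phase to continue from on the next block.
function renderSine(out, phase, frequency, sampleRate) {
  const step = (2 * Math.PI * frequency) / sampleRate;
  for (let i = 0; i < out.length; i++) {
    out[i] = Math.sin(phase);
    phase += step;
    if (phase >= 2 * Math.PI) phase -= 2 * Math.PI; // keep the phase bounded
  }
  return phase;
}

// This part only runs inside an AudioWorkletGlobalScope, where
// AudioWorkletProcessor, registerProcessor, and sampleRate are globals.
if (typeof AudioWorkletProcessor !== 'undefined') {
  class SineProcessor extends AudioWorkletProcessor {
    constructor() {
      super();
      this.phase = 0;
    }

    // Called by the audio rendering thread once per block.
    process(inputs, outputs) {
      const output = outputs[0];
      const first = output[0];
      this.phase = renderSine(first, this.phase, 440, sampleRate);
      // Copy the first channel into any remaining channels.
      for (let ch = 1; ch < output.length; ch++) output[ch].set(first);
      return true; // keep the processor alive
    }
  }
  registerProcessor('sine-processor', SineProcessor);
}

// On the main thread (file name is hypothetical):
//   await audioCtx.audioWorklet.addModule('sine-processor.js');
//   new AudioWorkletNode(audioCtx, 'sine-processor').connect(audioCtx.destination);
```

Keeping the sample loop in a plain function is a convenience I like: it can be exercised outside the browser, while the class does nothing except carry state between blocks.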

what I took away

If audio matters, don't use a wall-clock or main-thread timer to time an event that produces sound. Read the audio context clock, schedule against it, give the audio thread a small lookahead to work with, and move synthesis work off the main thread when the synthesis is non-trivial.

The browser audio stack is more capable than its reputation suggests. It just punishes the wrong assumptions about which clock you are reading.