WebCodecs API in Practice: Low-Latency Audio/Video Encoding and Decoding in the Browser
The Evolution of Browser Audio/Video Processing
| Approach | Latency | Frame-Level Control | Codec Access | Best For |
|---|---|---|---|---|
| <video> + MSE | 2-5s | ❌ | Container format | Playback, streaming |
| FFmpeg.wasm | 1-3s | ✅ | Full | Transcoding (but 30MB+ size) |
| WebCodecs | <100ms | ✅ | Native | Real-time processing, low-latency transcoding |
WebCodecs provides direct access to the browser's built-in codecs. No WASM wrapper is needed, and by the figures in the table above latency is roughly 10-50x lower.
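Because WebCodecs is not available in every browser, it helps to check for its constructors before building on it. The following is a minimal sketch; the helper name `missingWebCodecs` and its `scope` parameter are illustrative assumptions, not part of the API.

```typescript
// Names of the WebCodecs constructors used throughout this article.
const WEBCODECS_GLOBALS = [
  "VideoEncoder",
  "VideoDecoder",
  "AudioEncoder",
  "AudioDecoder",
  "VideoFrame",
  "AudioData",
] as const;

// Returns the constructors that are missing from `scope`.
// `scope` defaults to globalThis; passing it explicitly is mainly useful for testing.
function missingWebCodecs(scope: object = globalThis): string[] {
  return WEBCODECS_GLOBALS.filter((name) => !(name in scope));
}
```

An empty result means every constructor is present; otherwise the page can fall back to another approach such as MSE playback or FFmpeg.wasm.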
Core Object Overview
VideoFrame ←→ VideoEncoder ←→ EncodedVideoChunk
↓ Network / Storage
AudioData ←→ AudioEncoder ←→ EncodedAudioChunk
↓
VideoFrame ←→ VideoDecoder ←→ EncodedVideoChunk
AudioData ←→ AudioDecoder ←→ EncodedAudioChunk
VideoFrame: Zero-Copy Frame Operations
// Create a VideoFrame from a canvas (the browser can usually avoid copying pixels; timestamp is in microseconds)
const frame = new VideoFrame(canvas, { timestamp: performance.now() * 1000 });
// Create from ImageBitmap
const bitmap = await createImageBitmap(videoElement);
const frame2 = new VideoFrame(bitmap, { timestamp: 0 });
// Access frame metadata (duration is null unless it was set at construction)
console.log(frame.width, frame.height, frame.duration, frame.timestamp);
// MUST close when done (otherwise GPU memory leak)
frame.close();
Critical: A VideoFrame can hold a reference to a GPU texture. You MUST call close() to release it, or GPU memory will leak.
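One way to make the close() rule harder to forget is to wrap frame usage in a helper that closes the frame in a `finally` block. This is a sketch only: `withClosable` and `withClosableAsync` are hypothetical helper names, and the `Closable` interface is a stand-in for VideoFrame or AudioData.

```typescript
// Anything with a close() method, such as VideoFrame or AudioData.
interface Closable {
  close(): void;
}

// Runs `fn` with the resource and always closes it afterwards, even if `fn` throws.
function withClosable<T extends Closable, R>(resource: T, fn: (resource: T) => R): R {
  try {
    return fn(resource);
  } finally {
    resource.close();
  }
}

// Async variant: waits for the callback's promise to settle before closing.
async function withClosableAsync<T extends Closable, R>(
  resource: T,
  fn: (resource: T) => Promise<R>,
): Promise<R> {
  try {
    return await fn(resource);
  } finally {
    resource.close();
  }
}
```

With a real frame this would read `withClosable(frame, (f) => ctx.drawImage(f, 0, 0))`. Use the async variant whenever the callback awaits anything, since the synchronous helper closes the frame as soon as the callback returns.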
VideoEncoder: Video Encoding
const encoder = new VideoEncoder({
output: (chunk, metadata) => {
const data = new Uint8Array(chunk.byteLength);
chunk.copyTo(data);
handleEncodedChunk(chunk.type, data, chunk.timestamp, metadata);
},
error: (e) => console.error('Encode error:', e),
});
encoder.configure({
codec: 'avc1.640028', // H.264 High Profile, Level 4.0 (Level 3.0, '1E', is too low for 1080p)
width: 1920,
height: 1080,
bitrate: 5_000_000, // 5 Mbps
framerate: 30,
});
// Encode a frame
encoder.encode(frame, { keyFrame: true });
frame.close();
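Requesting a key frame on every encode() call would inflate the output considerably, so real streams usually force one only at a fixed interval and let the encoder emit delta frames in between. The sketch below shows one way to compute that cadence along with microsecond timestamps for a constant frame rate; both function names are illustrative assumptions.

```typescript
// Decide whether frame number `frameIndex` (0-based) should be a key frame,
// given the frame rate and the desired key-frame interval in seconds.
function isKeyFrameDue(frameIndex: number, framerate: number, intervalSeconds: number): boolean {
  const framesPerInterval = Math.max(1, Math.round(framerate * intervalSeconds));
  return frameIndex % framesPerInterval === 0;
}

// Timestamp in microseconds for a frame on a constant-frame-rate timeline.
function frameTimestampUs(frameIndex: number, framerate: number): number {
  return Math.round((frameIndex * 1_000_000) / framerate);
}
```

In an encode loop this would be used as `encoder.encode(frame, { keyFrame: isKeyFrameDue(i, 30, 2) })`, which asks for a key frame every two seconds at 30 fps.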
Supported Codecs
| Codec | codec String | Characteristics |
|---|---|---|
| H.264 | avc1.64001E | Widest compatibility |
| H.265 | hev1.1.6.L93.B0 | About 40% better compression than H.264, limited browser support |
| VP8 | vp8 | WebM format |
| VP9 | vp09.00.10.08 | YouTube default |
| AV1 | av01.0.01M.08 | Next-gen, highest compression |
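The H.264 codec string packs three hex bytes after the `avc1.` prefix: profile, constraint flags, and level. Decoding them makes it easier to check that a string fits the target resolution. For example, `avc1.64001E` is High Profile at Level 3.0, which tops out around standard-definition resolutions, while 1080p30 generally needs Level 4.0 (`avc1.640028`). The parser below is a small sketch with hypothetical names and covers only the most common profiles.

```typescript
interface AvcCodecInfo {
  profileIdc: number;
  profileName: string;
  constraintFlags: number;
  level: number; // e.g. 3 for Level 3.0, 4 for Level 4.0
}

// Common profile_idc values; anything else is reported as "Unknown".
const AVC_PROFILE_NAMES: Record<number, string> = {
  66: "Baseline",
  77: "Main",
  88: "Extended",
  100: "High",
};

// Parses strings of the form avc1.PPCCLL (or avc3.PPCCLL); returns null for anything else.
function parseAvcCodecString(codec: string): AvcCodecInfo | null {
  if (!/^avc[13]\.[0-9a-fA-F]{6}$/.test(codec)) return null;
  const hex = codec.slice(5); // six hex digits: profile, constraints, level
  const profileIdc = parseInt(hex.slice(0, 2), 16);
  const constraintFlags = parseInt(hex.slice(2, 4), 16);
  const levelIdc = parseInt(hex.slice(4, 6), 16);
  return {
    profileIdc,
    profileName: AVC_PROFILE_NAMES[profileIdc] ?? "Unknown",
    constraintFlags,
    level: levelIdc / 10,
  };
}
```

The parsed level can then be compared against the resolution and frame rate passed to configure() before the encoder rejects the configuration.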
VideoDecoder: Video Decoding
const decoder = new VideoDecoder({
output: (frame) => {
ctx.drawImage(frame, 0, 0);
frame.close(); // Must close
},
error: (e) => console.error('Decode error:', e),
});
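decoder.configure({
codec: 'avc1.640028', // must match the encoded stream; Level 4.0 covers 1080p30
codedWidth: 1920,
codedHeight: 1080,
// For H.264 demuxed from MP4 (avcC format), also pass `description` with the avcC box bytes
});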
const chunk = new EncodedVideoChunk({
type: isKeyFrame ? 'key' : 'delta',
timestamp: timestamp,
data: encodedData,
});
decoder.decode(chunk);
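After configure() or flush(), a VideoDecoder requires the next chunk to be a key chunk; feeding it a delta chunk first raises an error. When joining a stream mid-way, a common workaround is to discard chunks until the first key frame arrives. A minimal sketch of that filter follows, using a hypothetical `ChunkLike` shape that mirrors the `type` and `timestamp` fields of EncodedVideoChunk.

```typescript
// Minimal shape shared with EncodedVideoChunk.
interface ChunkLike {
  type: "key" | "delta";
  timestamp: number;
}

// Returns the chunks starting at the first key frame, or an empty array if there is none.
function dropUntilKeyFrame<T extends ChunkLike>(chunks: T[]): T[] {
  const firstKey = chunks.findIndex((chunk) => chunk.type === "key");
  return firstKey === -1 ? [] : chunks.slice(firstKey);
}
```

The returned list can be passed to decoder.decode() one chunk at a time.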
AudioEncoder / AudioDecoder
const audioEncoder = new AudioEncoder({
output: (chunk) => {
const data = new Uint8Array(chunk.byteLength);
chunk.copyTo(data);
handleAudioChunk(data, chunk.timestamp);
},
error: (e) => console.error(e),
});
audioEncoder.configure({
codec: 'mp4a.40.2', // AAC-LC; WebCodecs expects the full codec string, not 'aac'
sampleRate: 48000,
numberOfChannels: 2,
bitrate: 128000,
});
const audioData = new AudioData({
format: 'f32-planar',
sampleRate: 48000,
numberOfFrames: 1024,
numberOfChannels: 2,
timestamp: 0,
data: float32Array, // planar layout: 1024 samples for channel 0, then 1024 for channel 1
});
audioEncoder.encode(audioData);
audioData.close();
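The `f32-planar` format expects all samples of channel 0 first, then all samples of channel 1, whereas many audio sources deliver interleaved samples (L, R, L, R, ...). Each AudioData also needs a timestamp in microseconds that advances with the number of frames already submitted. The helpers below sketch both conversions; their names are assumptions made for illustration.

```typescript
// Converts interleaved samples (L0, R0, L1, R1, ...) to planar (L0, L1, ..., R0, R1, ...).
function interleavedToPlanar(interleaved: Float32Array, numberOfChannels: number): Float32Array {
  const numberOfFrames = interleaved.length / numberOfChannels;
  if (!Number.isInteger(numberOfFrames)) {
    throw new RangeError("Sample count is not a multiple of the channel count");
  }
  const planar = new Float32Array(interleaved.length);
  for (let frame = 0; frame < numberOfFrames; frame++) {
    for (let channel = 0; channel < numberOfChannels; channel++) {
      planar[channel * numberOfFrames + frame] = interleaved[frame * numberOfChannels + channel];
    }
  }
  return planar;
}

// Timestamp in microseconds for a buffer that starts after `framesSoFar` audio frames.
function audioTimestampUs(framesSoFar: number, sampleRate: number): number {
  return Math.round((framesSoFar * 1_000_000) / sampleRate);
}
```

For 1024-frame buffers at 48 kHz, the second buffer would use `audioTimestampUs(1024, 48000)` as its timestamp.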
Practice: Video Format Transcoding Pipeline
The core transcoding flow for ToolsKu's Video Convert:
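// Assumes `inputConfig` (a VideoDecoderConfig from the demuxer) and `muxer` exist in the enclosing scope.
async function transcodeVideo(inputChunks: EncodedVideoChunk[], config: TranscodeConfig) {
  // 1. Create the encoder first so decoded frames can be fed straight into it
  const encoder = new VideoEncoder({
    output: (chunk) => muxer.addChunk(chunk),
    // flush() rejects if the codec fails, so the caller still sees the error
    error: (e) => console.error('Encode error:', e),
  });
  encoder.configure(config.outputCodec);
  // 2. Decode the source and encode each frame as soon as it is available
  const decoder = new VideoDecoder({
    output: (frame) => {
      if (encoder.state === 'configured') encoder.encode(frame);
      // Close immediately: holding every decoded frame can exhaust memory or stall the decoder
      frame.close();
    },
    error: (e) => console.error('Decode error:', e),
  });
  decoder.configure(inputConfig);
  for (const chunk of inputChunks) {
    decoder.decode(chunk);
  }
  // 3. Drain the decoder, then the encoder, before finalizing the container
  await decoder.flush();
  await encoder.flush();
  return muxer.finalize();
}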
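For long inputs, a pipeline like this can queue work faster than the encoder drains it. VideoEncoder exposes `encodeQueueSize` for this purpose, and a simple form of backpressure is to pause feeding while the queue is above a threshold. The sketch below polls with a timer to keep the example small; `hasCapacity`, `waitForCapacity`, and the default threshold of 4 are assumptions, not recommended values.

```typescript
// The only part of VideoEncoder this helper relies on.
interface EncoderQueue {
  readonly encodeQueueSize: number;
}

// True when the encoder has fewer than `maxQueued` pending encode requests.
function hasCapacity(encoder: EncoderQueue, maxQueued: number): boolean {
  return encoder.encodeQueueSize < maxQueued;
}

// Resolves once the encoder queue has drained below `maxQueued`.
async function waitForCapacity(encoder: EncoderQueue, maxQueued = 4, pollMs = 5): Promise<void> {
  while (!hasCapacity(encoder, maxQueued)) {
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
  }
}
```

Calling `await waitForCapacity(encoder)` before each decode() or encode() call keeps the number of in-flight frames bounded. The same idea applies to the decoder through its `decodeQueueSize` property.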
Comparison with MSE: Low-Latency Scenarios
| Metric | MSE (MediaSource) | WebCodecs |
|---|---|---|
| First-frame latency | 2-5s (buffering strategy) | <100ms |
| Frame-level control | ❌ Only appendBuffer | ✅ Per-frame encode/decode |
| Real-time processing | ❌ Needs extra muxing | ✅ Native support |
| Memory usage | High (buffer queue) | Low (per-frame processing) |
| Best for | Video playback | Real-time communication, transcoding, analysis |
Performance Benchmark: Video Compression
H.264→H.265 transcoding of 1080p 30fps 60s video:
| Approach | Transcode Time | Output Size | Latency |
|---|---|---|---|
| FFmpeg.wasm | 180s | 12MB | 3s (WASM init) |
| WebCodecs | 45s | 12MB | <100ms |
WebCodecs leverages native browser codecs: in this benchmark it is 4x faster, and no 30MB+ WASM download is needed.
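The figures in the table can be turned into a few derived numbers that are easier to compare across clips. The sketch below works only from the values reported above (a 60 s clip, 180 s versus 45 s, 12 MB output) and treats 1 MB as 10^6 bytes, which is an assumption.

```typescript
interface BenchmarkRow {
  transcodeSeconds: number;
  outputMegabytes: number;
}

const clipSeconds = 60;
const ffmpegWasm: BenchmarkRow = { transcodeSeconds: 180, outputMegabytes: 12 };
const webCodecs: BenchmarkRow = { transcodeSeconds: 45, outputMegabytes: 12 };

// How many times faster `candidate` is than `baseline`.
function speedup(baseline: BenchmarkRow, candidate: BenchmarkRow): number {
  return baseline.transcodeSeconds / candidate.transcodeSeconds;
}

// Values above 1 mean the transcode finishes faster than the clip plays.
function realTimeFactor(row: BenchmarkRow, durationSeconds: number): number {
  return durationSeconds / row.transcodeSeconds;
}

// Average output bitrate in megabits per second (1 MB taken as 10^6 bytes).
function averageBitrateMbps(row: BenchmarkRow, durationSeconds: number): number {
  return (row.outputMegabytes * 8) / durationSeconds;
}
```

With these inputs the speedup is 4, WebCodecs runs at about 1.33x real time while FFmpeg.wasm runs at about 0.33x, and both outputs average 1.6 Mbps.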
Common Questions
What container formats does WebCodecs support?
WebCodecs only handles raw encoded chunks, not container muxing. You need libraries like mp4-muxer or webm-muxer to package into MP4/WebM.
How to get encoded video data from a file?
Use an MP4 demuxer (e.g., mp4box.js) to demux the MP4 file and extract EncodedVideoChunks to feed into the Decoder.
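Demuxers typically report sample times in the track's own timescale, while EncodedVideoChunk expects microseconds, so a small conversion step sits between the two. The sketch below assumes a simplified sample shape; the `DemuxedSample` field names are hypothetical and should be mapped from whatever the chosen demuxer actually returns.

```typescript
// Assumed sample shape; adapt the field names to the demuxer in use.
interface DemuxedSample {
  isSync: boolean; // true for key frames
  cts: number; // composition timestamp, in `timescale` units
  duration: number; // in `timescale` units
  timescale: number; // ticks per second
  data: Uint8Array;
}

// Matches the init dictionary accepted by the EncodedVideoChunk constructor.
interface ChunkInit {
  type: "key" | "delta";
  timestamp: number; // microseconds
  duration: number; // microseconds
  data: Uint8Array;
}

function sampleToChunkInit(sample: DemuxedSample): ChunkInit {
  const toMicroseconds = (ticks: number): number =>
    Math.round((ticks * 1_000_000) / sample.timescale);
  return {
    type: sample.isSync ? "key" : "delta",
    timestamp: toMicroseconds(sample.cts),
    duration: toMicroseconds(sample.duration),
    data: sample.data,
  };
}
```

In the browser the result would be used as `decoder.decode(new EncodedVideoChunk(sampleToChunkInit(sample)))`.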
Can VideoFrame be transferred across Workers?
Yes. VideoFrame is a transferable object: use worker.postMessage(frame, [frame]) for zero-copy transfer to a Worker. After the transfer, the frame is no longer usable on the sending side.
Summary
The WebCodecs API gives browsers direct access to audio/video codecs, enabling zero-copy frame processing and <100ms latency encoding/decoding. VideoEncoder/Decoder handle video, AudioEncoder/Decoder handle audio, and VideoFrame/AudioData provide frame-level operations. Combined with demuxer/muxer libraries, you can build complete low-latency video transcoding pipelines entirely in the browser.