DOM Serialization

Technical Definition & Overview
Updated 2026-05-26
The process of converting the live state of a web page's Document Object Model into a persistent data format (like JSON) that can be stored and later reconstructed.

Detailed Explanation

DOM serialization is the foundation of modern session-replay products. The idea is simple but the engineering is not: instead of recording the screen as pixels (a video file), you capture the structural state of the page (every element, every attribute, every text node) as data, and then capture every change to that state over time. A teammate can later reconstruct the session by rebuilding the DOM from the initial snapshot and replaying the recorded changes with the same relative timing the original user experienced.

The initial snapshot is the first thing the SDK produces on page load. It walks the entire DOM tree (typically with a recursive serializer that handles standard elements, open Shadow DOM roots, same-origin iframes, and SVG namespaces) and emits a compact JSON representation: each node gets a unique numeric ID, its tag name, its attributes (with sensitive values, such as the contents of password fields, redacted), inherited styles that can't be reconstructed from stylesheets, and its text content. The result is normally 30–200 KB for a typical page, and the transport layer compresses it further with gzip. Stylesheets are captured separately: same-origin sheets are inlined, and cross-origin sheets are stored as URL references with computed-style fallbacks for cases where the cross-origin stylesheet fails to load in the replay environment.

After the initial snapshot, the serializer hands off to a MutationObserver subscribed to the document root with `{ childList: true, attributes: true, characterData: true, subtree: true }`. Every mutation (node added, node removed, attribute changed, text replaced) is delivered to the observer's callback, batched with the other mutations from the same task. The serializer encodes each mutation as a timestamped delta against the prior state and queues it for transport.
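The snapshot-then-delta flow described above can be sketched in a few dozen lines. This is a minimal illustration and not any particular SDK's implementation: `SourceNode`, `SerializedNode`, `Delta`, `Recorder`, and the password-redaction rule are all hypothetical names and choices made for the example. It walks a plain-object stand-in for the DOM so it can run outside a browser. In a real recorder the walk would visit `Element` and `Text` nodes, and `recordAttribute` would be called from a `MutationObserver` callback.

```typescript
// Minimal stand-in for a DOM node so the sketch runs outside a browser.
// A real recorder would walk Element and Text nodes instead.
interface SourceNode {
  kind: "element" | "text";
  tagName?: string;
  attributes?: Record<string, string>;
  textContent?: string;
  children?: SourceNode[];
}

interface SerializedNode {
  id: number;
  type: "element" | "text";
  tag?: string;
  attrs?: Record<string, string>;
  text?: string;
  children?: SerializedNode[];
}

// One recorded change, expressed against node IDs assigned by the snapshot.
type Delta =
  | { t: number; op: "attr"; id: number; name: string; value: string | null }
  | { t: number; op: "text"; id: number; value: string };

class Recorder {
  private nextId = 1;
  private ids = new Map<SourceNode, number>();
  readonly queue: Delta[] = [];

  // Recursive walk: assign an ID, copy tag/attributes/text, then descend.
  snapshot(node: SourceNode): SerializedNode {
    const id = this.nextId++;
    this.ids.set(node, id);
    if (node.kind === "text") {
      return { id, type: "text", text: node.textContent ?? "" };
    }
    const tag = (node.tagName ?? "").toLowerCase();
    return {
      id,
      type: "element",
      tag,
      attrs: this.redact(tag, node.attributes ?? {}),
      children: (node.children ?? []).map((child) => this.snapshot(child)),
    };
  }

  // Hypothetical redaction rule: never record what was typed into a password field.
  private redact(tag: string, attrs: Record<string, string>): Record<string, string> {
    const out = { ...attrs };
    if (tag === "input" && attrs["type"] === "password" && "value" in out) {
      out["value"] = "***";
    }
    return out;
  }

  // Stand-in for the MutationObserver callback path for an attribute change.
  recordAttribute(node: SourceNode, name: string, value: string | null, t: number): void {
    const id = this.ids.get(node);
    if (id === undefined) return; // node was never serialized, so there is no ID to reference
    this.queue.push({ t, op: "attr", id, name, value });
  }
}

const passwordInput: SourceNode = {
  kind: "element",
  tagName: "INPUT",
  attributes: { type: "password", value: "hunter2" },
};
const page: SourceNode = {
  kind: "element",
  tagName: "FORM",
  attributes: { id: "login" },
  children: [passwordInput, { kind: "text", textContent: "Sign in" }],
};

const recorder = new Recorder();
const snapshot = recorder.snapshot(page);
recorder.recordAttribute(passwordInput, "disabled", "", 1200);
```

Because IDs are assigned in pre-order, the form, the input, and the text node receive 1, 2, and 3, and the queued delta refers to the input only by its ID. Keeping deltas ID-based is what lets a player apply them later without access to the original page.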
A well-engineered implementation runs the delta encoder in a Web Worker so that encoding stays off the main thread, and applies "DoS protection" (caps on the rate and volume of mutations per second) so a runaway render loop in the host page doesn't blow up the recording.

On the playback side, the viewer reconstructs the DOM inside a sandboxed iframe. Scripts from the original page are stripped (so they don't re-fire side effects like analytics pings or duplicate API calls), and the stylesheet snapshot is injected into the iframe head. The recorded mutations are then applied at their original timing, with optional 0.5×–8× playback speed and a scrubbable timeline. The result is structurally identical to what the user saw, and it is fully inspectable: a developer can open browser dev tools inside the replay, hover any element, see the computed styles, replay console errors from the captured stream, and inspect the network waterfall alongside the visual playback.

There are edge cases that separate well-built serializers from naive ones. Canvas and WebGL content need special handling: pure DOM serialization can't capture pixel data, so the serializer either samples frames (for example at 4 FPS with WebP encoding) or substitutes a placeholder. CSS-in-JS libraries that generate runtime class names need the stylesheet captured along with the DOM. Iframes need a recursive nested-serializer call for same-origin children and an opaque-placeholder fallback for cross-origin ones (the browser's same-origin policy forbids reading their content). And Shadow DOM needs traversal through each host's `shadowRoot`, which is only exposed for roots created with `attachShadow({ mode: 'open' })`; closed shadow roots are unreachable by design and show up as opaque elements in the replay.

Why all this work instead of just a screen video? Because structured data participates in workflows that pixels can't. You can search across sessions for an element selector. You can diff two sessions to spot a regression. You can run an AI summarisation pass on the captured timeline and get a draft bug report. You can auto-generate a Playwright spec by walking the input events. None of those workflows is practical from a video; they all need the DOM to be queryable, which is what serialization makes possible.
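To make the "queryable" point concrete, here is a rough sketch of the replay side: rebuilding the page state at a chosen timestamp from a snapshot plus recorded events, then searching that state by attribute. Everything in it (`ReplayNode`, `ReplayEvent`, `stateAt`, `findByAttr`, and the event shapes) is an assumption made for illustration, not a documented format. It operates on plain JSON rather than a sandboxed iframe so it can run anywhere.

```typescript
interface ReplayNode {
  id: number;
  tag?: string; // absent for text nodes
  attrs?: Record<string, string>;
  text?: string;
  children?: ReplayNode[];
}

type ReplayEvent =
  | { t: number; op: "attr"; id: number; name: string; value: string | null }
  | { t: number; op: "text"; id: number; value: string }
  | { t: number; op: "add"; parentId: number; node: ReplayNode }
  | { t: number; op: "remove"; id: number };

type Located = { node: ReplayNode; parent: ReplayNode | null };

// Iterative depth-first search for a node ID, remembering the parent for removals.
function findById(root: ReplayNode, id: number): Located | null {
  const stack: Located[] = [{ node: root, parent: null }];
  while (stack.length > 0) {
    const entry = stack.pop()!;
    if (entry.node.id === id) return entry;
    for (const child of entry.node.children ?? []) {
      stack.push({ node: child, parent: entry.node });
    }
  }
  return null;
}

// Rebuild the state at time `until` by applying every event with t <= until
// to a deep copy of the snapshot. Events are assumed to be sorted by timestamp.
function stateAt(snapshotRoot: ReplayNode, recorded: ReplayEvent[], until: number): ReplayNode {
  const root: ReplayNode = JSON.parse(JSON.stringify(snapshotRoot));
  for (const ev of recorded) {
    if (ev.t > until) break;
    if (ev.op === "add") {
      const target = findById(root, ev.parentId);
      if (target) {
        const copy: ReplayNode = JSON.parse(JSON.stringify(ev.node));
        target.node.children = [...(target.node.children ?? []), copy];
      }
      continue;
    }
    const hit = findById(root, ev.id);
    if (!hit) continue; // target no longer exists; skip rather than fail
    if (ev.op === "attr") {
      const attrs = hit.node.attrs ?? {};
      if (ev.value === null) delete attrs[ev.name];
      else attrs[ev.name] = ev.value;
      hit.node.attrs = attrs;
    } else if (ev.op === "text") {
      hit.node.text = ev.value;
    } else if (hit.parent) {
      hit.parent.children = (hit.parent.children ?? []).filter((c) => c.id !== ev.id);
    }
  }
  return root;
}

// Collect every node whose attribute `name` equals `value`:
// a simplified stand-in for searching a session by selector.
function findByAttr(root: ReplayNode, name: string, value: string): ReplayNode[] {
  const hits: ReplayNode[] = [];
  const stack: ReplayNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (node.attrs && node.attrs[name] === value) hits.push(node);
    for (const child of node.children ?? []) stack.push(child);
  }
  return hits;
}

const initial: ReplayNode = {
  id: 1,
  tag: "form",
  attrs: { id: "checkout" },
  children: [
    { id: 2, tag: "button", attrs: { type: "submit" }, children: [{ id: 3, text: "Pay" }] },
  ],
};
const events: ReplayEvent[] = [
  { t: 100, op: "attr", id: 2, name: "disabled", value: "" },
  { t: 250, op: "text", id: 3, value: "Processing" },
  { t: 900, op: "remove", id: 2 },
];

const midSession = stateAt(initial, events, 300);
const endOfSession = stateAt(initial, events, 1000);
```

Scrubbing a timeline amounts to calling `stateAt` with a different `until` value, and diffing two sessions amounts to comparing two such trees. A production player would more likely apply events incrementally to a live DOM than rebuild from the snapshot each time; rebuilding is used here only because it keeps the sketch short.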

How Relyv helps: Relyv's serialization engine handles the edge cases above out of the box, including Shadow DOM (open roots), iframes (same-origin recursive, cross-origin placeholder), CSS-in-JS (stylesheet capture alongside DOM), canvas/WebGL (configurable frame sampling), SVG namespaces, and runtime-generated class names. Capture runs in a Web Worker; main-thread overhead stays under 1% on production traffic, and the SDK ships under 30 KB gzipped. Replays render in a sandboxed iframe with full dev-tools inspection, side-by-side session diff, and an AI-summarised root cause attached to every captured error.
