The one-sentence answer
Auto-generating a Playwright test from a session replay means walking the captured interaction stream (clicks, keystrokes, navigations, network responses), extracting a stable selector for each interacted element, and emitting a .spec.ts file that re-runs the same sequence with Playwright's API plus assertions inferred from the observed state. The recording is the source of truth; the generated spec is a deterministic re-execution of it.
Why generate tests from sessions instead of writing them
Hand-authoring end-to-end tests has two well-known problems: the tests drift from real user behaviour (engineers test the happy path they imagine; users hit the path no one thought to test), and they take time the QA team usually doesn't have. Generated tests address both. Any real user session that contained a bug can become a regression test for that bug. Any important user flow you watched in replay can become a test in one click. The test corpus grows alongside the product, grounded in production behaviour.
What a session replay captures that a Playwright test needs
A good replay SDK captures everything Playwright needs to replay the session faithfully:
- <strong>Click events</strong> with target selectors (DOM path + nearby data-* attributes + accessible role + text content).
- <strong>Keyboard input</strong> per element (the actual values typed, masked or unmasked depending on PII rules).
- <strong>Scroll position</strong> per scrollable container at each timestamp.
- <strong>Navigation events</strong> (URL changes, including soft-routed History API pushes).
- <strong>Network calls</strong> with request + response status + body shape (so assertions can verify "the POST /checkout returned 200").
- <strong>DOM mutations</strong> with timestamps (so assertions can verify "the success banner appeared after the click").
- <strong>Console errors</strong> (so the generated test can assert no error fired during the flow).
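One way to picture the capture is as a stream of typed events, one variant per item in the list above. The sketch below is an assumption for illustration only: the type names (`TargetHints`, `CapturedEvent`), the field names, and the `countByKind` helper are hypothetical and do not describe any particular SDK's schema.

```typescript
// Hypothetical event shapes: field names are illustrative, not a real SDK schema.
type TargetHints = {
  domPath: string;
  testId?: string;
  role?: string;
  text?: string;
};

type CapturedEvent =
  | { kind: 'click'; t: number; target: TargetHints }
  | { kind: 'input'; t: number; target: TargetHints; value: string; masked: boolean }
  | { kind: 'scroll'; t: number; container: string; top: number }
  | { kind: 'navigation'; t: number; url: string; soft: boolean }
  | { kind: 'network'; t: number; method: string; url: string; status: number }
  | { kind: 'mutation'; t: number; summary: string }
  | { kind: 'consoleError'; t: number; message: string };

// Tally events by kind, a quick sanity check that a capture has what the generator needs.
function countByKind(events: CapturedEvent[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const e of events) {
    counts[e.kind] = (counts[e.kind] ?? 0) + 1;
  }
  return counts;
}

const sampleSession: CapturedEvent[] = [
  { kind: 'navigation', t: 0, url: '/products', soft: false },
  { kind: 'click', t: 1200, target: { domPath: 'main > button', role: 'button', text: 'Add to cart' } },
  { kind: 'network', t: 1350, method: 'POST', url: '/checkout', status: 200 },
];
```

A discriminated union on `kind` lets later pipeline stages narrow each event safely before reading its fields.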
The five-step generation pipeline
How a captured session becomes a runnable spec, step by step:
1. Filter the action stream
A raw capture can contain hundreds of events per second (mouse moves, scroll deltas, micro-mutations). The generator first filters down to the meaningful actions, typically click, keystroke, navigation, and submit events. Mouse moves and scrolls are usually dropped unless they cross a meaningful threshold (scrolled past the fold, hovered over a tooltip target for more than 500 ms).
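The filter can be sketched as a single predicate over the raw stream. Everything here is an assumption made for illustration: the `RawEvent` shape, the idea that "past the fold" means a scroll offset greater than the viewport height, and the names `isMeaningful` and `filterActions` are all hypothetical.

```typescript
// Hypothetical raw event shapes; a real capture would carry more fields.
type RawEvent =
  | { kind: 'click' | 'keystroke' | 'navigation' | 'submit'; t: number }
  | { kind: 'mousemove'; t: number }
  | { kind: 'scroll'; t: number; y: number; viewportHeight: number }
  | { kind: 'hover'; t: number; durationMs: number; isTooltipTarget: boolean };

const HOVER_THRESHOLD_MS = 500;

function isMeaningful(e: RawEvent): boolean {
  // Assumption: "past the fold" means scrolled further than one viewport height.
  if (e.kind === 'scroll') return e.y > e.viewportHeight;
  // Hovers only count when they linger on something that shows a tooltip.
  if (e.kind === 'hover') return e.isTooltipTarget && e.durationMs > HOVER_THRESHOLD_MS;
  // Clicks, keystrokes, navigations and submits are kept; mouse moves are dropped.
  return e.kind !== 'mousemove';
}

function filterActions(events: RawEvent[]): RawEvent[] {
  return events.filter(isMeaningful);
}
```

Keeping the thresholds in named constants makes them easy to tune per application without touching the predicate logic.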
2. Extract a stable selector per interacted element
This is the hardest step. A click event has a target element — but the DOM path that worked at capture time may not work at test time (random IDs, CSS-in-JS class names, virtualised lists). A good selector-extraction algorithm tries strategies in order of stability:
- <strong>data-testid</strong> attribute (most stable; intentionally added by engineers).
- <strong>role + accessible name</strong> (semantic + screen-reader-aware; survives CSS refactors), e.g. <code>getByRole('button', { name: 'Add to cart' })</code>.
- <strong>text content</strong> for buttons, links (e.g. <code>getByText('Add to cart')</code>).
- <strong>label-based selector</strong> for form inputs (<code>getByLabel('Email')</code>).
- <strong>nth-of-type + parent role</strong> as fallback (least stable; flag for review).
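The priority order above can be sketched as a chain of checks that returns the first strategy that applies. This is a minimal illustration, not a production algorithm: `ElementInfo`, `PickedLocator`, `pickLocator`, and the `q` quoting helper are hypothetical names, and the function emits source text for real Playwright locator methods (`getByTestId`, `getByRole`, `getByText`, `getByLabel`, `locator`).

```typescript
// Hypothetical snapshot of the interacted element, as the capture might record it.
type ElementInfo = {
  tag: string;
  testId?: string;
  role?: string;
  accessibleName?: string;
  text?: string;
  label?: string;
  nthOfType: number; // 1-based position among same-tag siblings
  parentRole?: string;
};

type PickedLocator = { code: string; needsReview: boolean };

// Quote a value as a single-quoted string literal for the emitted source.
const q = (s: string): string => `'${s.replace(/\\/g, '\\\\').replace(/'/g, "\\'")}'`;

function pickLocator(el: ElementInfo): PickedLocator {
  if (el.testId) {
    return { code: `page.getByTestId(${q(el.testId)})`, needsReview: false };
  }
  if (el.role && el.accessibleName) {
    return {
      code: `page.getByRole(${q(el.role)}, { name: ${q(el.accessibleName)} })`,
      needsReview: false,
    };
  }
  if (el.text && (el.tag === 'button' || el.tag === 'a')) {
    return { code: `page.getByText(${q(el.text)})`, needsReview: false };
  }
  if (el.label) {
    return { code: `page.getByLabel(${q(el.label)})`, needsReview: false };
  }
  // Least stable: positional CSS, scoped to the parent role when one is known.
  const css = q(`${el.tag}:nth-of-type(${el.nthOfType})`);
  const code = el.parentRole
    ? `page.getByRole(${q(el.parentRole)}).locator(${css})`
    : `page.locator(${css})`;
  return { code, needsReview: true };
}
```

Returning `needsReview` alongside the locator lets the fallback case be surfaced to the engineer instead of silently landing in the spec.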
3. Emit the Playwright actions
Each filtered event maps to a Playwright API call: click → locator.click(); keystroke sequence on a single input → locator.fill(); navigation → page.goto() (or a URL assertion after the click for soft-routed navigation); scroll past the fold → page.evaluate() calling window.scrollTo(). Wait conditions are inserted automatically: when a click triggered a network call, the generator starts page.waitForResponse() against the captured URL pattern before the click and awaits the result afterwards, so a fast response is not missed.
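As a rough sketch of this mapping, the function below turns already-resolved actions into lines of spec source. The `Action` type, the `awaitsResponse` field (holding regex-literal text), and the `emitAction`/`emitActions` names are hypothetical; the emitted calls (`page.goto`, `locator.fill`, `locator.click`, `page.evaluate`, `page.waitForResponse`) are standard Playwright APIs.

```typescript
// Hypothetical intermediate form: filtered events with locators already resolved to source text.
type Action =
  | { kind: 'click'; locator: string; awaitsResponse?: string } // regex-literal text for the URL pattern
  | { kind: 'fill'; locator: string; value: string }
  | { kind: 'goto'; url: string }
  | { kind: 'scrollTo'; y: number };

// Quote a value as a single-quoted string literal for the emitted source.
const quote = (s: string): string => `'${s.replace(/\\/g, '\\\\').replace(/'/g, "\\'")}'`;

function emitAction(a: Action, index: number): string[] {
  switch (a.kind) {
    case 'goto':
      return [`await page.goto(${quote(a.url)});`];
    case 'fill':
      return [`await ${a.locator}.fill(${quote(a.value)});`];
    case 'scrollTo':
      return [`await page.evaluate((y) => window.scrollTo(0, y), ${a.y});`];
    case 'click': {
      const click = `await ${a.locator}.click();`;
      if (!a.awaitsResponse) return [click];
      // Register the wait before clicking so a fast response cannot slip past it.
      const name = `response${index}`;
      return [
        `const ${name} = page.waitForResponse(${a.awaitsResponse});`,
        click,
        `await ${name};`,
      ];
    }
  }
}

function emitActions(actions: Action[]): string[] {
  const lines: string[] = [];
  actions.forEach((a, i) => lines.push(...emitAction(a, i)));
  return lines;
}
```

The response promise is created without `await` and only awaited after the click, which matches the pattern Playwright documents for waiting on a response triggered by an action.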
4. Infer assertions from observed state
The captured state at each step grounds the assertions. After a successful checkout, the captured DOM showed an order-confirmation panel, so the generator emits await expect(page.getByRole('heading', { name: /order confirmed/i })).toBeVisible(). After a form submission, the captured network call returned 200, so the generator asserts the status. If no console error fired during the flow, the generator wires page.on('console', ...) to fail the test on any error-level message; uncaught exceptions are a separate event and need page.on('pageerror', ...).
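A minimal sketch of assertion inference, assuming the generator has already summarised what it observed: the `ObservedState` shape and the `inferAssertions` and `escapeRegex` names are hypothetical. The emitted lines use Playwright's `console` page event with `msg.type()` and `msg.text()`, plus the standard `expect` matchers.

```typescript
// Hypothetical summary of what the capture observed by the end of a flow.
type ObservedState = {
  headingText?: string; // heading that appeared after the last action
  response?: { variable: string; status: number }; // variable holding the waitForResponse promise
  consoleErrorCount: number;
};

// Escape text so it can sit inside an emitted regex literal.
const escapeRegex = (s: string): string => s.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');

function inferAssertions(state: ObservedState): { setup: string[]; checks: string[] } {
  const setup: string[] = [];
  const checks: string[] = [];
  const cleanConsole = state.consoleErrorCount === 0;

  if (cleanConsole) {
    // Collect error-level console messages from the start of the test.
    setup.push(
      'const consoleErrors: string[] = [];',
      "page.on('console', (msg) => { if (msg.type() === 'error') consoleErrors.push(msg.text()); });",
    );
  }
  if (state.response) {
    checks.push(`expect((await ${state.response.variable}).status()).toBe(${state.response.status});`);
  }
  if (state.headingText) {
    const pattern = escapeRegex(state.headingText);
    checks.push(`await expect(page.getByRole('heading', { name: /${pattern}/i })).toBeVisible();`);
  }
  if (cleanConsole) {
    // Only assert a clean console when the captured session had one.
    checks.push('expect(consoleErrors).toEqual([]);');
  }
  return { setup, checks };
}
```

Splitting the output into `setup` and `checks` reflects that the console listener has to be attached before the first action, while the assertions run at the end.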
5. Emit the .spec.ts file
A test file gets written with the session's metadata as comments at the top (URL, capture date, user-agent, replay-link) and the generated actions inside a test('description', async ({ page }) => { ... }) block. The description comes from the session's flow summary (AI-generated from the action stream) so future engineers can read it as documentation.
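The final assembly step is mostly string templating. The sketch below assumes a hypothetical `SessionMeta` shape and a `renderSpec` function; neither name comes from a real tool, and the header mirrors only the comment fields shown in the example further down.

```typescript
// Hypothetical session metadata; fields mirror the comment header of a generated spec.
type SessionMeta = {
  sessionId: string;
  capturedAt: string; // ISO 8601 timestamp
  flow: string;
  origin: string;
  replayUrl: string;
};

function renderSpec(meta: SessionMeta, description: string, body: string[]): string {
  // Quote the description as a single-quoted string literal.
  const quoted = `'${description.replace(/\\/g, '\\\\').replace(/'/g, "\\'")}'`;
  return [
    `// auto-generated from session ${meta.sessionId} (${meta.capturedAt})`,
    `// flow: ${meta.flow}`,
    `// captured: ${meta.origin}`,
    `// replay: ${meta.replayUrl}`,
    `import { test, expect } from '@playwright/test';`,
    '',
    `test(${quoted}, async ({ page }) => {`,
    ...body.map((line) => `  ${line}`), // indent generated actions inside the test block
    '});',
    '',
  ].join('\n');
}
```

Keeping the metadata in comments rather than in test annotations means the spec stays a plain Playwright file that runs without any generator-specific tooling.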
What the generated test looks like — an example
A captured checkout-flow session for an e-commerce site produces something like this:
// auto-generated from session 0x7a2f...e91d (2026-05-25T14:33:12Z)
// flow: search → add to cart → checkout → success
// captured: https://staging.shop.example.com
// replay: https://relyv.ai/s/0x7a2f...e91d
import { test, expect } from '@playwright/test';

// Relative URLs assume baseURL points at the captured origin in playwright.config.ts.
test('checkout: blue running shoes', async ({ page }) => {
  // Search for the product and open its page.
  await page.goto('/products');
  await page.getByPlaceholder('Search products').fill('blue running shoes');
  await page.getByRole('button', { name: 'Search' }).click();
  await page.getByRole('link', { name: /Nimbus 7.*Blue/ }).click();

  // Add to cart and fill in the payment form.
  await page.getByRole('button', { name: 'Add to cart' }).click();
  await page.getByRole('link', { name: 'Checkout' }).click();
  await page.getByLabel('Card number').fill('4242424242424242');
  await page.getByLabel('Expiry').fill('12/27');
  await page.getByLabel('CVC').fill('123');

  // Start waiting for the checkout response before the click that triggers it.
  const checkoutResp = page.waitForResponse(/\/checkout$/);
  await page.getByRole('button', { name: 'Pay now' }).click();
  expect((await checkoutResp).status()).toBe(200);
  await expect(page.getByRole('heading', { name: /order confirmed/i })).toBeVisible();
});
Limits + what manual review still has to catch
Generation isn't a free lunch. The generator should always flag, not silently include:
- <strong>nth-of-type selectors</strong> — fragile across UI changes; engineer should add a <code>data-testid</code> instead.
- <strong>Time-sensitive assertions</strong> — a captured network call that returned in 80ms might take 800ms in CI; explicit waits beat implicit timing.
- <strong>Hard-coded values</strong> — captured credit-card numbers, emails, addresses. Generator masks during capture; review prompts the engineer to template them.
- <strong>Cross-session state</strong> — a flow that depended on session-local state (logged-in user, populated cart) needs setup the generator can't infer.
- <strong>Visual regressions</strong> — Playwright generation captures functional behaviour; visual-regression assertions (screenshot comparisons) need a separate step.
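Some of these checks can be automated as a lint pass over the generated lines before the spec is shown for review. The sketch below is an assumption-heavy illustration: `ReviewFlag` and `flagForReview` are hypothetical names, and the regular expressions are deliberately crude heuristics that cover only positional selectors, card-like numbers, email addresses, and fixed sleeps.

```typescript
type ReviewFlag = { line: number; reason: string };

// Scan generated spec lines and report anything an engineer should look at.
function flagForReview(lines: string[]): ReviewFlag[] {
  const flags: ReviewFlag[] = [];
  lines.forEach((text, i) => {
    const line = i + 1; // 1-based, to match editor line numbers
    if (text.includes(':nth-of-type(')) {
      flags.push({ line, reason: 'positional selector; prefer a data-testid' });
    }
    // 12 to 19 digits, optionally separated by spaces or dashes.
    if (/\.fill\('(?:\d[ -]?){12,19}'\)/.test(text)) {
      flags.push({ line, reason: 'hard-coded card-like number; template it' });
    }
    if (/\.fill\('[^'\s]+@[^'\s]+'\)/.test(text)) {
      flags.push({ line, reason: 'hard-coded email; template it' });
    }
    if (text.includes('waitForTimeout(')) {
      flags.push({ line, reason: 'fixed sleep; wait on a response or locator instead' });
    }
  });
  return flags;
}
```

Cross-session state and visual regressions are not detectable from the spec text alone, so they stay on the human review checklist.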
How Relyv does it
Relyv generates Playwright + Cypress specs from any captured session, with the selector-extraction strategy described above (data-testid → role → text → label → fallback). Generated specs are previewed before commit so the engineer can edit, run locally, and add to the test suite via one-click PR. The full pipeline lives in /features/playwright-test-generation; the underlying capture event stream is documented in how DOM replay works.