Advanced Tools — Network, JavaScript Eval, File Upload

The core navigate/click/type tools handle most flows, but a handful of advanced tools turn Playwright MCP from "automated clicker" into "general-purpose browser agent." Network inspection lets the assistant verify what actually hit the API. JavaScript evaluation opens up DOM extraction and custom assertions. File upload, drag-and-drop, dialog handling, tabs, and viewport resize fill in the rest. This lesson walks each one with a real prompt and the moment it earns its slot.

These tools are powerful and also — for the unsafe ones — genuinely dangerous when pointed at production. The lesson closes on the security boundary; read that section before using any of them against a non-throwaway environment.

Network — verifying side effects, not just UI

Two tools work together:

browser_network_requests — list every HTTP request issued since page load (or since the last reset). The assistant uses this to confirm a click actually triggered the expected call.

browser_network_request — fetch full details of a specific request: URL, method, status, headers, request body, response body.

A real prompt where this matters:

Place an order for one Premium t-shirt. After clicking "Place order," verify that
POST /api/orders was sent with quantity 1, returned 201, and that the response body
contains an order id matching the value shown on the confirmation page.

The assistant orchestrates the click, calls browser_network_requests to find the matching POST, fetches the full request and response, and cross-checks the order id against the snapshot. UI assertions alone can't catch "the button looked clicked but no request fired" — network assertions can.

Use this any time the test is genuinely about what the system did rather than what the user saw.
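The cross-check in that prompt can be sketched as a pure function over captured traffic. This is an illustration only: the CapturedRequest shape and the verifyOrderRequest name are hypothetical, and the real tool output may be structured differently. It also assumes the response carries the order id in an id field.

```typescript
// Hypothetical shape for one captured request; the real tool output may differ.
interface CapturedRequest {
  method: string;
  url: string;
  status: number;
  requestBody: string;  // raw JSON text
  responseBody: string; // raw JSON text
}

// Returns a list of human-readable problems; an empty list means the order checks out.
function verifyOrderRequest(
  requests: CapturedRequest[],
  confirmationOrderId: string,
): string[] {
  // Assumes the URL has no query string after /api/orders.
  const match = requests.find(
    (r) => r.method === "POST" && r.url.endsWith("/api/orders"),
  );
  if (!match) {
    return ["no POST /api/orders request was captured"];
  }

  const problems: string[] = [];
  if (match.status !== 201) {
    problems.push(`expected status 201, got ${match.status}`);
  }

  const sent = JSON.parse(match.requestBody) as { quantity?: number };
  if (sent.quantity !== 1) {
    problems.push(`expected quantity 1, got ${String(sent.quantity)}`);
  }

  // Assumption: the API returns the order id under "id".
  const received = JSON.parse(match.responseBody) as { id?: string };
  if (received.id !== confirmationOrderId) {
    problems.push(
      `response id ${String(received.id)} does not match page id ${confirmationOrderId}`,
    );
  }
  return problems;
}
```

Returning a list of problems rather than a boolean lets one run report every mismatch at once, including the "button looked clicked but no request fired" case.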

browser_evaluate — when the snapshot isn't enough

browser_evaluate runs an arbitrary function in the page and returns the result.

() => Array.from(document.querySelectorAll('.product')).map(p => p.textContent.trim())

Prompt:

Use evaluate to return all product names visible on the page, then verify the list
is sorted alphabetically.

When to reach for it:

  • Custom DOM extraction the snapshot doesn't capture — data attributes, computed styles, dataset payloads.
  • App-state introspection when the app exposes a global (e.g., window.__APP_STATE__) for testing.
  • Computed assertions the assistant can't easily express through other tools — checking sort order, computing totals, parsing complex strings.

Two cautions: the function runs in page context, so it cannot import test utilities or read local files. And the return value must be JSON-serialisable — DOM nodes, functions, and circular references won't cross the bridge.
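Because the page function returns plain strings, the sort check can run on the returned array rather than inside the page. A minimal sketch follows; firstOutOfOrder is a hypothetical helper, and comparing lowercased names is an assumption about what "alphabetical" means for the app.

```typescript
// Returns the first name that sorts before its predecessor, or null if the list is in order.
// Comparison is case-insensitive (an assumption; adjust to the app's real collation).
function firstOutOfOrder(names: string[]): string | null {
  let previousKey: string | null = null;
  for (const name of names) {
    const key = name.toLowerCase();
    if (previousKey !== null && key < previousKey) {
      return name;
    }
    previousKey = key;
  }
  return null;
}
```

Returning the offending name instead of a bare true/false gives the assistant something concrete to report when the list is unsorted.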

browser_run_code_unsafe — power and danger

browser_run_code_unsafe lets the assistant execute arbitrary Playwright code, not just the predefined tools. It can construct any locator, call any Page API, write files, and chain Playwright operations the standard tools don't expose.

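// Record a trace with screenshots and DOM snapshots for the whole context.
await page.context().tracing.start({ screenshots: true, snapshots: true });
// A relative path only resolves if the context has a baseURL; otherwise pass a full URL.
await page.goto('/checkout');
// … steps …
// Writes the trace archive to the host filesystem.
await page.context().tracing.stop({ path: '/tmp/trace.zip' });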

The "unsafe" suffix is real. There is no allowlist of operations and no sandboxing beyond what your shell already enforces. Use it for:

  • Building Playwright traces during an exploratory session.
  • Custom retry loops the standard tools don't compose into.
  • Anything that genuinely needs the Page API directly.

Don't use it against production. Don't use it under credentials that have write access to anything you can't afford to lose. Treat it the way you'd treat shell access for an automation account.

File operations

browser_file_upload — supply a local path to a page's <input type="file">. File inputs can't be driven by typing; this tool wraps Playwright's setInputFiles so the assistant can complete CSV imports, image uploads, and document attachments.

browser_drop — drop file(s) onto a drop zone. For modern UIs that accept drag-and-drop file uploads alongside (or instead of) a file picker.

browser_drag — drag from one element to another. Reordering lists, moving cards across a kanban board, drawing in canvas-based design tools.

A combined prompt:

Go to /import. Drag the file /tmp/contacts.csv onto the upload zone. Wait for
"Import complete" to appear. Then verify the network call POST /api/contacts/import
returned 200 and processed 250 rows.

Two specialised tools, one network assertion, one user flow.

Dialogs, tabs, and viewport

browser_handle_dialog — accept, dismiss, or respond to JavaScript alert, confirm, and prompt dialogs. Without this, a confirm("Are you sure?") blocks the session indefinitely.

browser_tabs — list, switch, open, and close tabs. Required for multi-tab flows: opening a help link in a new tab, OAuth popups, multi-step wizards that span windows.

browser_resize — change the browser viewport. "Resize to 375×812 (iPhone portrait) and verify the mobile menu is visible" is one tool call away. Pair with vision mode for layout assertions across breakpoints.

A composed workflow

What makes these tools matter isn't any one of them — it's how the assistant chains them. Single prompt:

Log in as admin. Navigate to /products. Verify the GET /api/products call returned
within 2 seconds. Click any product whose name starts with "Premium." On the detail
page, take a screenshot. Then resize the viewport to 375×812 and confirm the
"Add to cart" button is still visible.

Behind the scenes:

  • browser_navigate → browser_snapshot → two browser_type → browser_click (login)
  • browser_navigate to /products
  • browser_network_requests → browser_network_request (verify timing on the GET)
  • browser_evaluate to find products with names starting "Premium"
  • browser_click on the matching product
  • browser_take_screenshot (vision)
  • browser_resize to mobile viewport
  • browser_snapshot to confirm the button is still in the tree

One prompt, one session, nine different tools, and a deterministic Playwright equivalent waiting on the other side.
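The two computed checks in that workflow, the 2-second budget and the "Premium" filter, could look roughly like this once the data is in hand. The RequestTiming shape is hypothetical: it assumes start and finish timestamps are available for each request, which depends on what the network tools actually report.

```typescript
// Hypothetical timing record for one request, in milliseconds.
interface RequestTiming {
  method: string;
  path: string;
  startedAtMs: number;
  finishedAtMs: number;
}

// True only if a matching request exists and finished within the budget.
function respondedWithin(
  timings: RequestTiming[],
  method: string,
  path: string,
  budgetMs: number,
): boolean {
  const match = timings.find((t) => t.method === method && t.path === path);
  return (
    match !== undefined && match.finishedAtMs - match.startedAtMs <= budgetMs
  );
}

// Trims each name, then keeps those that start with the given prefix.
function namesWithPrefix(names: string[], prefix: string): string[] {
  return names.map((n) => n.trim()).filter((n) => n.startsWith(prefix));
}
```

A missing request counts as a failure here, so a page that never called the API cannot pass the timing check by accident.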

Tools at a glance

Advanced MCP Tools
  • browser_network_requests
  • browser_network_request
  • browser_console_messages
  • browser_evaluate (sandboxed)
  • browser_run_code_unsafe (full Page API)
  • browser_file_upload
  • browser_drop (file drag-and-drop)
  • browser_drag (element-to-element)
  • browser_handle_dialog
  • browser_tabs (list/switch/open/close)
  • browser_resize (viewport)
  • browser_take_screenshot (vision)

The security boundary, restated

The advanced tools cross from driving the UI a user could drive into executing arbitrary code with browser-level access. That distinction matters:

  • The core tools (click, type, navigate) can do anything a logged-in user could do. That is risky enough, but it is bounded by the app's own permissions.
  • browser_evaluate can do anything page JavaScript can do — read tokens out of localStorage, exfiltrate state, mutate the DOM in ways no user could.
  • browser_run_code_unsafe can do anything Playwright can do, which includes filesystem writes via traces, network capture, and arbitrary navigations.

Run advanced-tool sessions against staging, with throwaway credentials, in a scoped browser profile. Don't paste production cookies into the chat. Don't enable --vision plus full page eval against an internal admin tool unless you fully trust the prompt and the audit trail.
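One way to make the "staging only" rule mechanical is to check every target URL against a short list of approved hosts before a session starts. The sketch below is a hypothetical helper you would write yourself, not part of Playwright MCP, and the host names are placeholders.

```typescript
// Returns true only when the URL parses and its host is on the approved list.
// Hypothetical guard; nothing in Playwright MCP enforces this for you.
function isAllowedTarget(targetUrl: string, allowedHosts: string[]): boolean {
  let host: string;
  try {
    host = new URL(targetUrl).hostname;
  } catch {
    // Relative or malformed URLs are rejected rather than guessed at.
    return false;
  }
  // Exact match only, so look-alike hosts with an approved prefix do not pass.
  return allowedHosts.includes(host);
}

// Placeholder host list for illustration.
const stagingHosts = ["staging.example.test"];
console.log(isAllowedTarget("https://staging.example.test/checkout", stagingHosts));
```

Exact host matching is deliberately strict: a substring or prefix check would let a host such as staging.example.test.evil.example through.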

⚠️ Common mistakes

  • Asserting on the UI when the system call is what matters. "The button shows a checkmark" is not the same as "the order was created." When a test is about side effects — orders placed, emails sent, rows inserted — assert against browser_network_request results or a follow-up DB check, not the post-click UI alone.
  • Reaching for browser_run_code_unsafe when browser_evaluate would do. Eval runs inside the page's JavaScript context; unsafe runs against the host-side Playwright API. The blast radius is wildly different. Default to eval; only escalate to unsafe when you genuinely need the Page API.
  • Forgetting to resize back, switch tabs back, or accept dialogs deterministically. Long sessions accumulate state — a leftover modal, a 375px viewport, an extra tab. The next prompt step then runs in unexpected conditions and fails. Reset explicitly between logical phases of a session, or restart the session for a clean slate.

🎯 Practice task

Build a multi-tool session and harden the generated test. 30 minutes.

  1. Pick a flow on your staging app that involves both UI and a clear API side effect — placing an order, submitting a form that emails a notification, uploading a file. Write a prompt that:
    • Drives the UI to the action.
    • Uses browser_network_requests to verify the expected API call fired.
    • Uses browser_network_request to assert the status code and one field of the response body.
  2. Run the session. Read the tool-call panel and confirm the network assertions ran. If the request you expected isn't in the list, that's a real finding — file or fix it.
  3. Ask the assistant to emit the equivalent Playwright TypeScript test, using page.waitForResponse(...) for the network assertion. Save and run it.
  4. Stretch: add a browser_evaluate step to extract a value from the page (e.g., the rendered total) and assert it equals the value from the API response. This kind of cross-source assertion catches whole classes of bugs UI-only tests miss.
  5. Security drill: review the prompt and generated code together. If anything in either references real user credentials or production URLs, replace with disposable equivalents before committing or sharing. Build the habit now while the stakes are low.
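For the stretch step, the rendered total is usually a formatted string while the API returns a number, so the comparison needs a parsing step. Here is a minimal sketch that assumes US-style formatting ("$1,234.50") and an API total expressed in integer cents; both helper names are hypothetical.

```typescript
// Converts a rendered amount such as "$1,234.50" to integer cents, or null if it does not parse.
// Assumes "." is the decimal separator; pass only the amount text, not a whole sentence.
function parseAmountToCents(text: string): number | null {
  const cleaned = text.replace(/[^0-9.]/g, "");
  if (!/^\d+(\.\d{1,2})?$/.test(cleaned)) {
    return null;
  }
  const [whole, fraction = ""] = cleaned.split(".");
  // Integer arithmetic avoids floating-point drift on currency values.
  return Number(whole) * 100 + Number(fraction.padEnd(2, "0"));
}

// True when the page's rendered total equals the API's total in cents.
function totalsMatch(renderedText: string, apiTotalCents: number): boolean {
  return parseAmountToCents(renderedText) === apiTotalCents;
}
```

An unparseable string yields null and therefore a failed match, which surfaces rendering bugs instead of hiding them.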

You've covered the full Playwright MCP tool surface. The next chapter shifts focus from running the tools to generating production-quality Playwright code from sessions like the one you just built.
