
AI-Native Web: What We Learned Deploying Browser Automation in Energy Operations

AI-Native Web · Practitioner Take

By EthosPower Editorial · March 1, 2026 · 10 min read · Verified Mar 1, 2026

Tools: Firecrawl (primary), Playwright, Postiz, ChromaDB, Ollama
Tags: browser-automation, web-scraping, playwright, firecrawl, ai-integration, compliance-monitoring, scada

The Problem Nobody Talks About

Every energy utility runs on web interfaces nobody designed for machine consumption. Your SCADA historian has a web UI. Your GIS platform serves data through a portal. Your state PUC posts critical regulatory updates to HTML pages. Your vendor publishes firmware advisories in PDF format embedded in JavaScript-heavy pages.

The conventional wisdom says APIs solve this. In practice, 70% of the systems we interact with in energy operations either have no API, have an API so poorly documented it's unusable, or charge enterprise pricing that makes the CFO's eye twitch. Meanwhile, the data you need is right there in the web interface your operators use every day.

We spent the last eighteen months building AI systems that treat the web as a first-class data source. Not scraping in the grey-market sense, but architecting workflows where browser automation and LLM-powered content extraction are primary integration patterns. Here's what we learned deploying this across three utilities and two renewable operators.

Why Browser Automation Became Critical

Our first deployment was straightforward: a Midwest utility needed to monitor their primary equipment vendor's support portal for critical firmware advisories. The vendor had no API. They posted updates to a web portal behind authentication. Missing an advisory meant potential NERC CIP violations.

We built it with Playwright in three days. The script logged in every six hours, navigated to the advisories section, extracted new posts, fed the content to Ollama running Llama 3.1 8B to classify severity, and posted alerts to our n8n workflow. Total infrastructure cost: one Ubuntu VM. The system caught a critical substation relay firmware issue four days before the vendor's email notification arrived.
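The core loop is small enough to sketch. Everything here that names the portal — the table selector, the `data-advisory-id` attribute, the advisory text layout — is an illustrative stand-in, not the vendor's actual markup; the Ollama call is the standard local `/api/generate` endpoint.

```python
"""Sketch of the advisory monitor: dedupe new posts, classify with a local LLM.
Portal selectors and attribute names are hypothetical placeholders."""
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_severity_prompt(advisory_text: str) -> str:
    # Constrain the model to a fixed label set so downstream routing
    # can switch on the answer instead of parsing free text.
    return (
        "Classify the severity of this firmware advisory as exactly one of: "
        "CRITICAL, HIGH, MEDIUM, LOW. Reply with the label only.\n\n"
        + advisory_text
    )


def classify_severity(advisory_text: str, model: str = "llama3.1:8b") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": build_severity_prompt(advisory_text),
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()


def fetch_new_advisories(page, seen_ids: set) -> list:
    # `page` is a logged-in Playwright page already on the advisories list.
    rows = page.locator("table.advisories tr[data-advisory-id]")
    found = []
    for i in range(rows.count()):
        row = rows.nth(i)
        aid = row.get_attribute("data-advisory-id")
        if aid not in seen_ids:
            found.append({"id": aid, "text": row.inner_text()})
    return found
```

The login flow and the n8n alert call are omitted; they are site- and deployment-specific.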

That changed the conversation. Within two months, we had requests to monitor eleven different vendor portals, three regulatory sites, and the RTO's outage coordination system. Browser automation went from "interesting experiment" to production infrastructure.

Playwright vs. Puppeteer: The Details Matter

We standardized on Playwright after running both in production. Puppeteer is mature and has more community resources, but Playwright's architecture made the difference in reliability.

First, Playwright's auto-wait mechanism is dramatically better. It automatically waits for elements to be actionable before interacting—not just present in the DOM, but actually clickable and stable. In energy sector portals built with legacy JavaScript frameworks, this matters enormously. Our Puppeteer scripts needed explicit waits everywhere. Our Playwright equivalents just worked.
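As a sketch of what "just worked" means in practice: the link name and table selector below are hypothetical, and Playwright is imported lazily so the module loads even where it isn't installed.

```python
def open_advisories(url: str) -> str:
    # Lazy import: keeps this sketch importable without Playwright present.
    from playwright.sync_api import sync_playwright, expect

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        # click() auto-waits until the link is attached, visible, stable,
        # and enabled -- no explicit waitForSelector dance required.
        page.get_by_role("link", name="Firmware Updates").click()
        # expect() retries its assertion until it passes or times out,
        # which replaces the manual polling our Puppeteer scripts needed.
        expect(page.locator("table.advisories")).to_be_visible()
        html = page.content()
        browser.close()
        return html
```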

Second, Playwright's network interception is cleaner. We monitor what data the page fetches, block unnecessary resources to speed execution, and capture API calls the page makes internally. One utility's asset management system had no documented API, but the web UI called internal JSON endpoints. We extracted those endpoints with Playwright network monitoring, then called them directly. Cut our data retrieval time from 40 seconds to 800 milliseconds.
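The endpoint-discovery trick looks roughly like this. The host name is a hypothetical stand-in for the asset management system; the `page.on("response", ...)` hook is Playwright's standard network-event API.

```python
from urllib.parse import urlparse


def is_internal_json(url: str, content_type: str, portal_host: str) -> bool:
    # Record only same-host responses that return JSON -- these are the
    # undocumented endpoints the web UI calls internally.
    return urlparse(url).hostname == portal_host and "json" in content_type.lower()


def capture_endpoints(page, portal_host: str) -> list:
    seen = []

    def on_response(resp):
        ctype = resp.headers.get("content-type", "")
        if is_internal_json(resp.url, ctype, portal_host) and resp.url not in seen:
            seen.append(resp.url)

    page.on("response", on_response)
    return seen  # populated as the page navigates
```

Once the endpoint list stabilizes, you can call those URLs directly with the session's cookies and skip the browser entirely, which is where the 40-second-to-800-millisecond improvement came from.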

Third, Playwright's trace files saved us dozens of debugging hours. When a script fails at 3 AM, the trace shows you exactly what the browser saw, every network request, every console message, every DOM mutation. You open it in Playwright's trace viewer and watch the failure happen in slow motion. With Puppeteer, we were guessing based on screenshots and logs.

The version that proved stable for us: Playwright 1.40.1 with Chromium. We pin versions aggressively. Browser automation breaks when upstream changes rendering behavior.

Firecrawl Changed How We Handle Content

Playwright gets you the page. Firecrawl makes it usable for LLMs.

The problem with raw HTML is that it's 90% structural markup and 10% content. You can strip tags with BeautifulSoup, but you lose document structure that LLMs need for context. You can feed raw HTML to an LLM, but you waste 80% of your context window on nav menus and footer links.

Firecrawl solves this by rendering the page in a real browser, extracting the semantic content, and returning clean markdown with document structure preserved. Headings become markdown headers. Lists stay lists. Tables convert to readable text. It handles JavaScript rendering, waits for dynamic content, and even chunks large documents intelligently.

Our most successful Firecrawl deployment monitors regulatory filings. State PUCs publish orders as HTML pages with embedded PDFs. We point Firecrawl at the index page, it crawls linked documents, returns structured markdown for each filing, we feed that to Ollama with a prompt that extracts action items and deadlines, and we route those to the compliance team's ERPNext project tasks.
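The scrape call itself is one request. This sketch assumes a self-hosted instance at a hypothetical internal host and follows the shape of Firecrawl's v1 scrape endpoint (a JSON body with `url` and `formats`, returning `success` plus a `data.markdown` field); verify against the version you deploy.

```python
import json
import urllib.request

# Hypothetical self-hosted Firecrawl instance.
FIRECRAWL_URL = "http://firecrawl.internal:3002/v1/scrape"


def markdown_from_response(body: dict) -> str:
    # Fail loudly on scrape errors rather than feeding an error page
    # to the LLM as if it were a regulatory filing.
    if not body.get("success"):
        raise RuntimeError(f"Firecrawl scrape failed: {body.get('error', 'unknown')}")
    return body["data"]["markdown"]


def scrape_to_markdown(url: str) -> str:
    payload = json.dumps({"url": url, "formats": ["markdown"]}).encode()
    req = urllib.request.Request(
        FIRECRAWL_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return markdown_from_response(json.loads(resp.read()))
```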

Before Firecrawl, this workflow used Playwright to download PDFs, pdf2text for extraction, custom Python to clean the output, and still produced garbage for scanned documents. Firecrawl handles it in one API call. The markdown quality is good enough that our Llama 3.1 8B model catches compliance deadlines with 94% accuracy based on our six-month audit.

The self-hosted version runs on a 4-core VM with 16GB RAM. We process about 200 pages per day across all monitoring workflows. API response time averages 3-4 seconds per page. For our use case, self-hosting made sense—data sovereignty and no per-page API costs.

The Integration Pattern That Actually Works

Our production architecture looks like this: Playwright collects raw content, Firecrawl converts it to markdown, Ollama processes it with task-specific prompts, n8n orchestrates the workflow, and ChromaDB stores embeddings for historical search.

The n8n workflow runs on a schedule or triggered by external events. It calls our Playwright scripts via HTTP endpoints—we wrapped them in FastAPI services. The script returns structured data including URLs, screenshots, and status codes. If the page content matters, we call Firecrawl's API with the URL. Firecrawl returns markdown. We send that markdown to Ollama with a prompt like "Extract equipment model numbers and firmware versions from this vendor advisory."

Ollama returns structured JSON. We validate the schema, store the embedding in ChromaDB for semantic search later, and route the data to appropriate systems. Critical items go to PagerDuty. Routine updates flow to ERPNext. Everything logs to our central monitoring.
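The validate-then-route step reduces to a few lines. The field names and the label set here are hypothetical, not our production schema; the point is that nothing unvalidated reaches PagerDuty or ERPNext.

```python
import json

# Hypothetical extraction schema for vendor advisories.
REQUIRED_FIELDS = {"equipment_model": str, "firmware_version": str, "severity": str}


def validate_extraction(raw: str) -> dict:
    # Ollama returns text; insist on parseable JSON with the expected
    # fields before anything is routed downstream.
    data = json.loads(raw)
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), ftype):
            raise ValueError(f"missing or malformed field: {field}")
    return data


def route(record: dict) -> str:
    # Critical items page someone; everything else becomes a routine task.
    return "pagerduty" if record["severity"].upper() == "CRITICAL" else "erpnext"
```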

This pattern handles about 85% of our web automation needs. The other 15% requires custom logic—complex multi-step interactions, CAPTCHAs we solve with vendor APIs, or sites so poorly built they require careful orchestration.

What Breaks and How to Fix It

Browser automation in production means handling failure. Here's what broke and how we fixed it:

Authentication state: Sites log you out. Sessions expire. We store authentication state in Playwright contexts and refresh it proactively. Every script checks for login indicators before proceeding. If logged out, re-authenticate and retry.
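A sketch of that pattern using Playwright's `storage_state` persistence. The state-file path and login markers are illustrative; `do_login` stands in for each portal's site-specific login flow.

```python
from pathlib import Path

STATE_FILE = Path("auth/vendor_portal_state.json")  # illustrative path


def looks_logged_out(page_text: str) -> bool:
    # Cheap login-indicator check run before every scrape step.
    markers = ("Sign in", "Session expired", "Please log in")
    return any(m in page_text for m in markers)


def get_context(browser, do_login):
    # Reuse saved cookies/localStorage when present; otherwise perform
    # a fresh login and persist the resulting state for the next run.
    if STATE_FILE.exists():
        return browser.new_context(storage_state=str(STATE_FILE))
    ctx = browser.new_context()
    do_login(ctx)  # site-specific login flow
    ctx.storage_state(path=str(STATE_FILE))
    return ctx
```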

Page structure changes: Vendors redesign their portals. Your selectors break. We use Playwright's text-based selectors ("click the link containing 'Firmware Updates'") over CSS selectors (".nav-menu > li:nth-child(3)") wherever possible. Text changes less frequently than DOM structure. When selectors do break, our monitoring catches it within one run cycle.
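In code, the preference looks like this; the link label is the same hypothetical "Firmware Updates" example.

```python
def open_firmware_updates(page):
    # Prefer user-visible text over DOM position: the link label survives
    # redesigns that rearrange the nav markup.
    page.get_by_role("link", name="Firmware Updates").click()
    # Brittle equivalent, kept only as a counter-example:
    # page.locator(".nav-menu > li:nth-child(3) a").click()
```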

Rate limiting: Hit a site too hard and you get blocked. We added jitter to our schedules, randomized user agents, and added respectful delays. Most importantly, we only scrape what we need. No bulk downloads. Target specific pages at human-like intervals.
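The jitter itself is one line; the base interval and the ±30% spread here are illustrative defaults, not tuned values.

```python
import random


def next_delay(base_seconds: float, jitter_frac: float = 0.3) -> float:
    # Human-ish pacing: the base interval plus up to +/-30% random jitter,
    # so repeated runs never land on an exact schedule a WAF can fingerprint.
    return base_seconds * (1 + random.uniform(-jitter_frac, jitter_frac))
```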

JavaScript rendering timing: Some pages load content via multiple async calls with unpredictable timing. Playwright's auto-wait helps, but not always enough. We added custom wait conditions: "wait until this specific element contains non-empty text." For especially difficult pages, we wait for network idle after the initial load.
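Both conditions are expressible with Playwright's standard waiting primitives; the selector below is a placeholder.

```python
def wait_for_populated(page, selector: str, timeout_ms: int = 30_000):
    # "Wait until this element contains non-empty text" -- covers pages
    # that attach the node early and fill it from a later async call.
    page.wait_for_function(
        "sel => { const el = document.querySelector(sel);"
        " return el && el.textContent.trim().length > 0; }",
        arg=selector,
        timeout=timeout_ms,
    )


def settle(page):
    # Heavier fallback for the worst pages: wait until the network has
    # been idle after the initial load before reading anything.
    page.wait_for_load_state("networkidle")
```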

Memory leaks: Long-running browser instances leak memory. We restart browser contexts every 50 pages. Our FastAPI services that wrap Playwright recycle worker processes every hour. Memory usage stays flat.

ChromaDB for Institutional Memory

Every piece of content we extract goes into ChromaDB as an embedding. This creates searchable institutional memory.

An operator asks, "What firmware versions did Vendor X recommend for our relays in the last year?" We query ChromaDB with that natural language question. It returns the top 10 relevant vendor advisories we've collected. The operator gets an answer in seconds instead of digging through email archives.

We run ChromaDB 0.4.18 in persistent mode on the same VM as our n8n instance. It's SQLite-backed, requires zero administration, and handles our embedding volume easily. We generate embeddings with Ollama's nomic-embed-text model—it's fast, runs locally, and produces quality vectors for retrieval.
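The query path is short. This sketch assumes Ollama's `/api/embeddings` endpoint (which returns an `embedding` array) and a ChromaDB collection opened elsewhere via `chromadb.PersistentClient`; the embedding function is injectable so the retrieval logic can be exercised without a live Ollama.

```python
import json
import urllib.request

OLLAMA_EMBED_URL = "http://localhost:11434/api/embeddings"


def embed(text: str, model: str = "nomic-embed-text") -> list:
    payload = json.dumps({"model": model, "prompt": text}).encode()
    req = urllib.request.Request(
        OLLAMA_EMBED_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]


def ask(collection, question: str, embed_fn=embed, n: int = 10) -> list:
    # Semantic search over everything the monitors have ever collected:
    # embed the natural-language question, return the top-n documents.
    hits = collection.query(query_embeddings=[embed_fn(question)], n_results=n)
    return hits["documents"][0]
```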

Storage grows about 2GB per month across all our monitoring workflows. We archive embeddings older than two years to a separate collection. Queries typically return in under 200ms for our corpus of about 40,000 embedded documents.

The Compliance Angle

Energy sector compliance documentation is web-native and terrible. NERC publishes standards as HTML. FERC posts orders as PDFs embedded in web pages. State regulators use document management systems designed in 2008. ISOs publish market rules across hundreds of interlinked pages.

We built a compliance monitoring system that crawls all these sources weekly, extracts relevant sections based on our registered assets and operations, identifies changes since the last crawl, and highlights new obligations. It uses Firecrawl for content extraction, Ollama for change detection and relevance scoring, and presents results in AnythingLLM where compliance staff can ask questions against the full document corpus.

This system caught a NERC CIP standard revision that applied to one specific substation type we operate. The revision was published to a "minor updates" page we didn't regularly monitor. Our AI flagged it four weeks before our manual quarterly review would have caught it. That's the difference between planned compliance and scrambling.

What We'd Do Differently

If we started today, we'd invest more heavily in monitoring and alerting from day one. Browser automation fails silently. A selector breaks, the script returns empty data, and you don't notice until someone asks why they stopped getting updates.

We now have comprehensive checks: every workflow validates that it extracted the expected data structure, flags anomalies in content length or format, and sends a daily summary of what it collected. If a workflow runs but extracts zero items three times in a row, PagerDuty fires. This caught a dozen silent failures in the first month after implementation.
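The zero-items check reduces to a few lines; the streak length of three matches the rule described above.

```python
def should_page(history: list, streak: int = 3) -> bool:
    # Fire PagerDuty when the last `streak` runs each extracted zero
    # items -- the signature of a silently broken selector.
    return len(history) >= streak and all(n == 0 for n in history[-streak:])
```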

We'd also standardize on TypeScript for Playwright scripts instead of Python. Playwright's TypeScript API has better type safety and catches errors at development time. Python's dynamic typing caused runtime errors we should have caught earlier. The trade-off is that our operations team knows Python better, so we accepted the risk.

Finally, we'd budget more time for site-specific quirks. Every web portal has unique authentication flows, pagination patterns, and content structures. The first script takes three days. The tenth takes four hours because you've built reusable patterns. Don't underestimate the learning curve.

The Verdict

Browser automation and AI-native web scraping are production-ready technologies for energy operations. The tooling is mature, the infrastructure requirements are modest, and the operational benefits are substantial.

Playwright is the correct choice for browser automation. It's more reliable than Puppeteer, better maintained than Selenium, and has the right abstractions for production deployment. Pair it with Firecrawl for content extraction and you've eliminated 80% of the complexity in building web-based AI workflows.

The pattern that works: Playwright for automation, Firecrawl for content conversion, Ollama for processing, ChromaDB for storage, n8n for orchestration. This stack runs on modest hardware, keeps data sovereign, and integrates cleanly with existing energy sector systems.

Start with one high-value use case—vendor portal monitoring, regulatory change tracking, or equipment documentation collection. Build it in a week. Put it in production. Measure the time saved. Then expand.

The web wasn't designed as a machine-readable API, but with the right tools, it's close enough. In an industry where half your critical systems have no API and won't for another decade, that matters.

Decision Matrix

| Dimension | Playwright | Firecrawl | Postiz |
| --- | --- | --- | --- |
| Content extraction quality | Raw HTML, requires parsing (★★★☆☆) | Clean markdown, LLM-ready (★★★★★) | N/A, outbound scheduling (★☆☆☆☆) |
| Setup complexity | 15 min install, moderate code (★★★★☆) | 5 min API key, minimal code (★★★★★) | Docker deploy, 20 min (★★★★☆) |
| JavaScript rendering | Full browser, perfect (★★★★★) | Full browser, excellent (★★★★★) | Not applicable (★☆☆☆☆) |
| Cost at scale | Self-hosted, $0/page (★★★★★) | $0.005/page or self-host (★★★★☆) | Self-hosted, unlimited (★★★★★) |
| OT network suitability | Runs fully air-gapped (★★★★★) | Self-host option available (★★★★☆) | Works in isolated networks (★★★★★) |
| Best for | Teams that need reliable browser automation with full control | Teams that need LLM-ready content with minimal engineering effort | Utilities managing social media presence with compliance requirements |
| Verdict | Best choice for production automation when you need JavaScript rendering and have engineering resources to write scripts. | Optimal for AI workflows where content quality matters more than fine-grained control over extraction logic. | Solves a different problem: outbound content publishing rather than data collection, but relevant for external communications teams. |

Last verified: Mar 1, 2026
