Bot handling

How OakData classifies bot and crawler traffic — tagged at ingest, stored but never billed, and filterable.

Bots, crawlers, and automation hit every public site — search indexers, AI agents, uptime monitors, preview scrapers. OakData's philosophy is tag, don't drop: bot traffic is classified and stored so you can see it, but it never counts against your billable usage, and you can exclude it from any view with a single flag.

Where classification happens

The server is the source of truth. At ingest, OakData classifies each request from its user agent and marks the resulting rows with is_bot and a bot_name (e.g. Googlebot, GPTBot, ClaudeBot). This runs on the server, where the real request headers are available.
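
As a rough illustration of user-agent classification at ingest, the sketch below matches a request's user agent against a short pattern list and produces the is_bot and bot_name fields. The function name classifyUserAgent, the BotVerdict shape, and the three-entry pattern list are assumptions made for this example; OakData's actual rule set is not described here and is certainly larger.

```typescript
// Hypothetical sketch: not OakData's actual classifier.
interface BotVerdict {
  is_bot: boolean;
  bot_name: string | null;
}

// Illustrative patterns only; a production list would be far longer.
const KNOWN_BOTS: ReadonlyArray<[string, RegExp]> = [
  ["Googlebot", /Googlebot/i],
  ["GPTBot", /GPTBot/i],
  ["ClaudeBot", /ClaudeBot/i],
];

// Returns the first matching bot name, or a non-bot verdict when nothing matches.
function classifyUserAgent(userAgent: string | undefined): BotVerdict {
  const ua = userAgent ?? "";
  for (const [name, pattern] of KNOWN_BOTS) {
    if (pattern.test(ua)) {
      return { is_bot: true, bot_name: name };
    }
  }
  return { is_bot: false, bot_name: null };
}
```

A missing User-Agent header is treated as non-bot in this sketch; a real classifier might reasonably decide otherwise.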

The $bot hint

Some bot signals exist only in the browser (navigator.webdriver, headless markers, automation brands), and the server can't see them. So the SDK does a lightweight client-side check and, when it suspects automation, tags its events with a $bot hint. The server ORs that hint with its own classification, so a headless browser that spoofs a normal user agent is still caught.
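
The interplay between the client hint and the server verdict might look roughly like the sketch below. Only navigator.webdriver and the $bot property name come from the text above; the function names, the BrowserSignals shape, and the specific HeadlessChrome check are assumptions chosen for illustration, and the real SDK may look at more signals.

```typescript
// Minimal view of the browser signals used here, so the sketch also runs outside a browser.
interface BrowserSignals {
  webdriver?: boolean;
  userAgent?: string;
}

// Client side (hypothetical): true when the environment looks automated.
function detectAutomation(signals: BrowserSignals): boolean {
  // navigator.webdriver is true in WebDriver-controlled browsers.
  if (signals.webdriver === true) return true;
  // Headless Chrome advertises itself in the user agent unless it is spoofed.
  if (/HeadlessChrome/.test(signals.userAgent ?? "")) return true;
  return false;
}

// Client side (hypothetical): attach the $bot hint instead of dropping the event.
function tagEvent(
  props: Record<string, unknown>,
  signals: BrowserSignals
): Record<string, unknown> {
  return detectAutomation(signals) ? { ...props, $bot: true } : { ...props };
}

// Server side (hypothetical): the final flag is the OR of both sources.
function resolveIsBot(serverIsBot: boolean, event: Record<string, unknown>): boolean {
  return serverIsBot || event["$bot"] === true;
}
```

In a browser, the signals would be read from the real navigator object, for example `{ webdriver: navigator.webdriver, userAgent: navigator.userAgent }`. Because the two sources are ORed, a spoofed user agent that passes the server check is still flagged when the client hint is present.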

Default behaviour: tag, not block

By default the SDK does not drop bot traffic client-side. It tags the traffic and lets the server decide, so crawlers still show up in your dashboard. The events are recorded; they're just flagged.

Stored, but not billed

Bot-flagged events are written to your project so the dashboard can surface crawler activity — useful for confirming an AI agent or search engine is reaching your pages. But OakData only bills for non-bot events, so crawler floods never inflate your usage or push you toward a plan limit.
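
To make the stored-versus-billed distinction concrete, here is a small sketch that counts both figures from a list of events. The usageSummary function and the StoredEvent shape are hypothetical; only the is_bot field and the rule that non-bot events alone are billed come from the text.

```typescript
// Hypothetical event shape: only the is_bot field is taken from the docs.
interface StoredEvent {
  is_bot: boolean;
}

// Every event is stored; only non-bot events count toward billable usage.
function usageSummary(
  events: ReadonlyArray<StoredEvent>
): { stored: number; billable: number } {
  const billable = events.filter((e) => !e.is_bot).length;
  return { stored: events.length, billable };
}
```

With this rule, a burst of crawler traffic raises the stored count but leaves the billable count unchanged.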

Excluding bots from a view

Most read surfaces default to excluding bots, and the traffic overview takes an explicit exclude_bots flag:

exclude bots
bash
curl "https://oakdata.co/api/v1/overview?range=30d&exclude_bots=true" \
  -H "Authorization: Bearer oak_sec_xxxxxxxxxxxxxxxxxxxxxxxx"
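
The same request could be assembled from TypeScript along the following lines. The endpoint, the range and exclude_bots parameters, and the Bearer header mirror the curl call above; the buildOverviewRequest helper and its return shape are assumptions for this sketch, and the response format is not covered here.

```typescript
// Hypothetical helper: builds the URL and headers for the overview call shown above.
function buildOverviewRequest(
  secretKey: string,
  range: string,
  excludeBots: boolean
): { url: string; headers: Record<string, string> } {
  const query =
    "range=" + encodeURIComponent(range) +
    "&exclude_bots=" + (excludeBots ? "true" : "false");
  return {
    url: "https://oakdata.co/api/v1/overview?" + query,
    headers: { Authorization: "Bearer " + secretKey },
  };
}
```

The result can be passed to any HTTP client, for example `fetch(req.url, { headers: req.headers })`.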

Over MCP, the get_overview tool accepts the same exclude_bots argument. Each session and event also carries its is_bot / bot_name fields, so you can split human and bot traffic however you like.
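
As one example of splitting traffic by those fields, the sketch below tallies human rows and groups bot rows by name. The splitTraffic function and the TrafficRow shape are hypothetical, and the "unknown" bucket for bot rows without a bot_name (such as a headless browser caught only by the $bot hint) is an assumption about how such rows might appear.

```typescript
// Hypothetical row shape: is_bot and bot_name are the documented fields.
interface TrafficRow {
  is_bot: boolean;
  bot_name: string | null;
}

// Count human rows, and count bot rows per bot_name.
function splitTraffic(
  rows: ReadonlyArray<TrafficRow>
): { human: number; bots: Record<string, number> } {
  let human = 0;
  const bots: Record<string, number> = {};
  for (const row of rows) {
    if (!row.is_bot) {
      human += 1;
      continue;
    }
    // Assumption: a flagged row may lack a name, so it lands in "unknown".
    const name = row.bot_name ?? "unknown";
    bots[name] = (bots[name] ?? 0) + 1;
  }
  return { human, bots };
}
```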