1 Abstract
This document specifies a set of web surfaces, headers, and primitives that together enable an AI agent to discover, register with, authenticate against, and operate a website as a first-class user — without relying on screen scraping, browser emulation, or out-of-band coordination with a human.
The specification organizes requirements around four orthogonal agent environments — browser agents, headless coding agents, MCP-client agents, and in-site agents — rather than a single linear conformance ladder. A site MAY support one environment without supporting any other; each has its own minimum viable surface.
This document is a draft. It incorporates lessons from production deployments and explicitly identifies surfaces whose underlying standards have not yet converged.
2 Conformance and terminology
The key words MUST, MUST NOT, SHOULD, and MAY in this document are to be interpreted as described in RFC 2119.
A "user agent" in this specification is any automated client, including but not limited to large-language-model-based coding assistants (Claude Code, Codex, Cursor), browser-integrated tool-invoking assistants (MCP-aware sidebars), MCP client hosts (Claude Desktop, Cursor IDE), and first-party in-site agent surfaces.
An "origin" in this specification is as defined in RFC 6454. Surfaces specified at well-known paths MUST be served from the site's canonical origin.
This specification does not define an agent-identity protocol. Section 6 discusses forward-compatibility with future agent-identity standards (such as RFC 8693 token exchange); implementers SHOULD follow the audit-log and soft-header guidance in §4.10 to remain migratable.
3 Environments
A conforming site MAY support any subset of the four environments. Each environment has a defined minimum surface and a recommended surface. A site claiming conformance with an environment MUST meet the minimum for that environment and SHOULD meet the recommended.
| Environment | Minimum (MUST) | Recommended (SHOULD) |
|---|---|---|
| E1. Browser agent: in-tab assistants using the user's existing session. | Same-origin API reachable from XHR/fetch with cookie credentials. JSON error bodies (§4.10). CORS (§4.10). | WebMCP tool registration on relevant pages (§4.8). Page-context tools auto-register on navigation. |
| E2. Headless coding agent: Claude Code, Cursor background, Codex, scripts. | /AGENTS.md (§4.2). Programmatic signup (§4.3). Bearer-token API. JSON errors (§4.10). | Content negotiation (§4.1). Re-retrievable PAT (§4.4). Credentials-file convention. Magic-link endpoint (§4.5). Coding-agent repo access (§4.9). |
| E3. MCP-client agent: Claude Desktop, Cursor IDE, MCP hosts. | At least one of /.well-known/mcp/server-card.json (SEP-1649) or /.well-known/mcp (SEP-1960) (§4.7). | Both. Typed tool definitions. OAuth-compatible or header-passed auth. Documented setup.md URL. |
| E4. In-site agent: first-party agent bar, sidebar, or command palette. | Some first-party UI element that invokes agent behavior against the site's own API with the logged-in session. | A "For Agents" navigation entry. Hand-off primitive ("Open in your agent"). Clear transitional framing: E4 fades once E1/E2 are ubiquitous. |
4 Patterns
4.1 Content negotiation and the .md suffix
A site implementing this pattern (recommended for environment E2 in §3 and required for the Markdown Native badge in §5) MUST support at least one of:
- Responding to Accept: text/markdown on an HTML URL by returning the canonical markdown representation of that resource (typically the rendered page's markdown source with YAML frontmatter). The response MUST include Vary: Accept.
- Serving the same markdown representation at a URL formed by appending .md to the HTML URL.
Sites SHOULD support both. When supporting the .md suffix,
the HTML response SHOULD advertise it via
Link: <URL.md>; rel="alternate"; type="text/markdown".
GET /pages/quickstart HTTP/1.1
Accept: text/markdown

HTTP/1.1 200 OK
Content-Type: text/markdown; charset=utf-8
Vary: Accept
Link: </pages/quickstart.md>; rel="alternate"; type="text/markdown"

---
title: Quickstart
author: jacob
updated: 2026-04-14
---
# Quickstart
...
Empirical data from commercial CDNs in 2026 shows an approximately 80% reduction in agent token consumption when agents request markdown directly instead of parsing HTML. This is the single highest-impact low-cost pattern in this specification.
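The negotiation rule at the start of this section can be sketched server-side as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, wildcard media ranges such as `*/*` are ignored, and sending `Vary: Accept` on the HTML response as well is an assumption based on general HTTP caching practice rather than a requirement stated above.

```python
def prefers_markdown(accept_header: str) -> bool:
    """Return True when text/markdown is acceptable and ranks at least as high as text/html."""

    def quality(media_type: str) -> float:
        best = 0.0
        for item in accept_header.split(","):
            parts = [p.strip() for p in item.split(";")]
            if parts[0].lower() != media_type:
                continue
            q = 1.0  # a media type without an explicit q parameter has q=1
            for param in parts[1:]:
                if param.lower().startswith("q="):
                    try:
                        q = float(param[2:])
                    except ValueError:
                        q = 0.0
            best = max(best, q)
        return best

    markdown_q = quality("text/markdown")
    return markdown_q > 0 and markdown_q >= quality("text/html")


def negotiate(path: str, accept_header: str) -> dict:
    """Choose response headers for a page request (hypothetical helper)."""
    if path.endswith(".md"):
        # The suffix form always returns markdown, whatever Accept says.
        return {"Content-Type": "text/markdown; charset=utf-8"}
    if prefers_markdown(accept_header):
        return {"Content-Type": "text/markdown; charset=utf-8", "Vary": "Accept"}
    return {
        "Content-Type": "text/html; charset=utf-8",
        "Vary": "Accept",  # assumption: the HTML variant also depends on Accept
        "Link": f'<{path}.md>; rel="alternate"; type="text/markdown"',
    }
```

A browser that never sends `text/markdown` keeps receiving HTML, while an agent sending `Accept: text/markdown` gets the markdown variant from the same URL.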
4.2 AGENTS.md at site root
A site conforming to environment E2 MUST publish a
markdown document at /AGENTS.md containing, at minimum:
- A one-sentence description of the site.
- The base URL for the API (if any).
- The signup endpoint and request shape.
- The authentication scheme (e.g., Authorization: Bearer <PAT>).
- The URL of llms.txt (if present).
- The URL of the MCP server card (if present).
The file MAY contain additional information including
CLI installation instructions, example workflows, credential-file conventions, and per-endpoint examples.
A site MAY publish additional AGENTS.md documents
at subpaths (e.g., /@user/project/AGENTS.md) to federate instructions for
sub-sections of the site.
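One way to keep the minimum fields listed above from drifting is to generate /AGENTS.md from configuration. The sketch below is illustrative only: the function name, the bullet labels, and the example values are assumptions, and this section does not prescribe any particular layout for the file.

```python
from typing import Optional


def render_agents_md(description: str, signup: str, auth_scheme: str,
                     api_base: Optional[str] = None,
                     llms_txt_url: Optional[str] = None,
                     mcp_card_url: Optional[str] = None) -> str:
    """Render a minimal AGENTS.md body; optional fields are omitted when unset."""
    lines = ["# AGENTS.md", "", description, ""]
    if api_base:
        lines.append(f"- API base URL: {api_base}")
    lines.append(f"- Signup: {signup}")
    lines.append(f"- Authentication: {auth_scheme}")
    if llms_txt_url:
        lines.append(f"- llms.txt: {llms_txt_url}")
    if mcp_card_url:
        lines.append(f"- MCP server card: {mcp_card_url}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_agents_md(
        "Example is a demo wiki.",
        'POST /api/v1/accounts with {"username": "<name>"}',
        "Authorization: Bearer <PAT>",
        api_base="https://example.com/api/v1",
    ))
```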
4.3 Programmatic account registration
A site conforming to environment E2 MUST provide at least one endpoint that creates a new account and returns a credential usable for subsequent authenticated requests, without requiring:
- Confirmation by email or SMS;
- Completion of a CAPTCHA;
- Human review.
If a site requires proof-of-work or rate-limiting at the signup boundary, it SHOULD accept a hashcash-style token computed by the client as an alternative to interactive challenges.
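A hashcash-style token can be sketched as below. This is an assumption-laden illustration: the original hashcash scheme uses SHA-1 and a specific stamp format, whereas this sketch uses SHA-256 and a hypothetical `challenge:nonce` format; the challenge string and the difficulty value are likewise placeholders the server would choose.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(challenge: str, difficulty: int) -> str:
    """Client side: search for a nonce whose hash has `difficulty` leading zero bits."""
    for nonce in count():
        token = f"{challenge}:{nonce}"
        if leading_zero_bits(hashlib.sha256(token.encode()).digest()) >= difficulty:
            return token


def verify(token: str, challenge: str, difficulty: int) -> bool:
    """Server side: one hash to check the work, however long minting took."""
    prefix, _, nonce = token.rpartition(":")
    if prefix != challenge or not nonce.isdigit():
        return False
    return leading_zero_bits(hashlib.sha256(token.encode()).digest()) >= difficulty
```

Each additional bit of difficulty roughly doubles the client's expected work while verification stays at a single hash, which is what makes the token usable as a non-interactive alternative to a CAPTCHA. A real deployment would also need to bind the challenge to a time window and reject reused tokens.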
Email, if collected at all, SHOULD be treated as an optional affiliation field and MUST NOT block initial use of the API.
POST /api/v1/accounts HTTP/1.1
Content-Type: application/json

{"username": "jacob-agent"}

HTTP/1.1 201 Created
Content-Type: application/json

{
  "user_id": "u_01HNKV...",
  "username": "jacob-agent",
  "api_key": "pat_live_abc123...",
  "client_config": {
    "credential_path": "~/.yourapp/credentials.json",
    "mode": "0600"
  }
}
4.4 Re-retrievable personal access tokens
A site conforming to environment E2 SHOULD provide a mechanism by which an authenticated user (via password or other long-lived credential) can retrieve a current valid PAT without creating a new account.
Rationale: coding agents frequently lose ephemeral credentials due to workspace resets or session boundaries. Forcing re-signup pollutes the username space and splits one human user's activity across multiple unrelated identities.
POST /api/v1/auth/key HTTP/1.1
Content-Type: application/json

{"username": "jacob-agent", "password": "..."}

HTTP/1.1 200 OK

{"api_key": "pat_live_abc123..."}
A site implementing this pattern SHOULD document a
canonical filesystem location and file mode in the signup response's client_config,
and MAY provide shell or language-specific snippets.
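On the client side, honoring the client_config convention might look like the following sketch. It assumes a POSIX filesystem, treats the mode as an octal string as in the §4.3 example, and takes the home directory as a parameter so the behavior is easy to test; the function name and the stored field names are hypothetical.

```python
import json
import os
from pathlib import Path


def save_credentials(signup_response: dict, home: Path) -> Path:
    """Write the returned API key to the path and mode the server advertised."""
    config = signup_response["client_config"]
    # "~/..." is resolved against the supplied home directory.
    relative = config["credential_path"].removeprefix("~/")
    path = home / relative
    mode = int(config["mode"], 8)  # "0600" -> 0o600

    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with restrictive permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    with os.fdopen(fd, "w") as handle:
        json.dump(
            {
                "username": signup_response["username"],
                "api_key": signup_response["api_key"],
            },
            handle,
        )
    # Re-apply the mode in case the file already existed or umask altered it.
    os.chmod(path, mode)
    return path
```

A later session that has lost its in-memory key can read this file first and fall back to the re-retrieval endpoint only when the file is missing.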
4.5 Magic login links for agent-to-human handoff
A site conforming to environment E2 MAY expose an endpoint that, given a valid PAT, returns a short-lived single-use URL that logs the user into the browser without exposing the PAT in the URL path or query string.
This primitive enables an agent-to-human handoff pattern: an agent completes background work, then surfaces a URL the human can click to review, confirm, or continue interactively.
POST /api/v1/auth/magic-link HTTP/1.1
Authorization: Bearer pat_live_abc123...
Content-Type: application/json

{"next": "/settings", "ttl": 600}

HTTP/1.1 200 OK

{
  "login_url": "https://example.com/auth/magic/xyz...",
  "expires_at": "2026-04-16T17:24:00Z"
}
The returned URL MUST be single-use and MUST expire within 24 hours; the default TTL SHOULD be 10 minutes. The server MUST NOT accept the link after redemption.
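The single-use and expiry rules above can be sketched with an in-memory store. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, a production system would use shared storage rather than a dict, and `revoke_user` reflects the password-change requirement from §6.

```python
import secrets
import time


class MagicLinkStore:
    MAX_TTL = 24 * 60 * 60  # hard ceiling: 24 hours
    DEFAULT_TTL = 600       # default: 10 minutes

    def __init__(self, clock=time.time):
        self._links = {}  # token -> (user_id, next_path, expires_at)
        self._clock = clock

    def issue(self, user_id: str, next_path: str = "/", ttl: int = DEFAULT_TTL) -> str:
        """Create an opaque token; the PAT itself never appears in the URL."""
        ttl = max(1, min(ttl, self.MAX_TTL))
        token = secrets.token_urlsafe(32)
        self._links[token] = (user_id, next_path, self._clock() + ttl)
        return token

    def redeem(self, token: str):
        """Return (user_id, next_path) once; any later attempt returns None."""
        entry = self._links.pop(token, None)  # pop makes the link single-use
        if entry is None:
            return None
        user_id, next_path, expires_at = entry
        if self._clock() >= expires_at:
            return None
        return user_id, next_path

    def revoke_user(self, user_id: str) -> None:
        """Drop every outstanding link for a user, e.g. on password change."""
        self._links = {t: e for t, e in self._links.items() if e[0] != user_id}
```

Removing the entry before checking expiry means an expired token is also consumed, so it cannot be retried.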
4.6 llms.txt and llms-full.txt
A site SHOULD publish a site description at
/llms.txt following the conventions of
llmstxt.org: an H1 title, a blockquote summary,
followed by one or more H2 sections each containing a markdown link list. A site
MAY additionally publish a full-content expansion at
/llms-full.txt.
Adoption of llms.txt by major LLM answer engines is measured at
under 1%. Implementers are advised not to expect referral traffic from this surface.
Its primary value is credibility, single-shot agent onboarding, and implementation
discipline.
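The /llms.txt layout described at the start of this section (H1 title, blockquote summary, H2 sections of markdown links) is simple enough to generate. The sketch below is illustrative; the function name and the sample values are assumptions.

```python
def render_llms_txt(title: str, summary: str,
                    sections: dict[str, list[tuple[str, str]]]) -> str:
    """Render an llms.txt body: H1, blockquote summary, then H2 link lists."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for label, url in links:
            lines.append(f"- [{label}]({url})")
        lines.append("")
    return "\n".join(lines).rstrip("\n") + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Example",
        "A demo site.",
        {"Docs": [("Quickstart", "https://example.com/pages/quickstart.md")]},
    ))
```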
4.7 MCP discovery surfaces
A site conforming to environment E3 MUST publish at least one of:
- /.well-known/mcp/server-card.json following the draft conventions of SEP-1649 (name, description, tools, auth).
- /.well-known/mcp following the draft conventions of SEP-1960 (endpoint enumeration).
Sites SHOULD publish both; neither proposal has merged into the core MCP specification, and clients vary in which they probe.
Both SEPs remain in flight as of publication. A site should treat these surfaces as forward-compatible shims: the paths may be renamed when the proposals merge or the specification changes. Implementers SHOULD monitor spec.modelcontextprotocol.io.
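Because clients vary in which path they probe, a tolerant client can try both. The sketch below assumes a caller-supplied `fetch` callable that returns the response body for a 200 response and `None` otherwise; that callable, the function name, and the returned shape are hypothetical, and no particular document schema is assumed beyond the body being JSON.

```python
import json
from typing import Callable, Optional

WELL_KNOWN_PATHS = (
    "/.well-known/mcp/server-card.json",  # SEP-1649 draft
    "/.well-known/mcp",                   # SEP-1960 draft
)


def discover_mcp(origin: str, fetch: Callable[[str], Optional[str]]) -> dict:
    """Probe both draft discovery paths and return whichever parse as JSON."""
    found = {}
    for path in WELL_KNOWN_PATHS:
        body = fetch(origin.rstrip("/") + path)
        if body is None:
            continue
        try:
            found[path] = json.loads(body)
        except ValueError:
            # Not JSON (for example an HTML fallback page): treat as absent.
            continue
    return found
```

Passing the fetcher in keeps the probing logic independent of any HTTP library and makes the path list the only thing to update if the drafts rename their endpoints.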
4.8 WebMCP tool registration
A site conforming to environment E1 SHOULD register
typed tools on relevant pages via the W3C WebMCP draft API
(navigator.modelContext.registerTool()), gated on feature detection.
Registration MUST be idempotent across navigations.
Tools that depend on page context SHOULD auto-register
on the pages where they apply and deregister elsewhere. Tool execute
handlers SHOULD use credentials: 'include'
when calling same-origin API routes so the user's existing session handles authentication.
Tools that mutate server state SHOULD require user confirmation in the agent's UI surface. Tools MUST NOT perform destructive writes silently.
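<script>
  // Feature-detect the WebMCP draft API before touching it.
  if ("modelContext" in window.navigator) {
    window.navigator.modelContext.registerTool({
      name: "search_pages",
      description: "Search pages in the current site.",
      inputSchema: {
        type: "object",
        properties: { query: { type: "string" } },
        required: ["query"]
      },
      // Read-only tool: same-origin search using the user's session cookies.
      execute: async ({ query }) => {
        const r = await fetch("/api/v1/search?q=" + encodeURIComponent(query), {
          credentials: "include"
        });
        if (!r.ok) {
          // Report the HTTP failure instead of returning an error body as results.
          throw new Error("search failed with HTTP " + r.status);
        }
        const data = await r.json();
        return { content: [{ type: "text", text: JSON.stringify(data) }] };
      }
    });
  }
</script>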
4.9 Coding-agent-with-filesystem pattern (informative)
Many sites that organize their canonical content as files in a git repository can offer agents a higher-level interface than RPC: a clone of the repository, a set of filesystem tools, and a push token. The agent operates on files and commits back.
This pattern is informative rather than normative because it does not define a wire protocol; it reuses existing git infrastructure. A site wishing to offer it SHOULD:
- Advertise the git clone URL for each user-visible resource (e.g., in the HTML link header, in AGENTS.md, or in a /.well-known/-style manifest).
- Document the branch/committer conventions expected for agent-authored commits.
- Accept agent pushes via HTTP Basic auth using a PAT as the password.
- Document whether agent commits trigger asynchronous site updates (webhook, poll, etc.).
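For the Basic-auth item in the list above, the header an HTTP client sends is the base64 encoding of `username:password`, with the PAT in the password position. A minimal sketch follows; the function name is hypothetical, and which username a site expects alongside the PAT is left to that site's documentation.

```python
import base64


def basic_auth_header(username: str, pat: str) -> str:
    """Build an HTTP Basic Authorization header value with the PAT as password."""
    raw = f"{username}:{pat}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")
```

In practice git clients build this header themselves from a credential helper or prompt; the sketch only shows what travels on the wire, which is why such pushes need HTTPS.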
4.10 Errors, CORS, and audit headers
Every 4xx response (including 429) on an agent-facing endpoint MUST
have a Content-Type of application/json (or application/problem+json
per RFC 7807) and MUST include a machine-readable error body
with at least the following fields:
{
  "error": "rate_limited",
  "message": "Per-key quota exceeded.",
  "retry_after_seconds": 60,
  "docs_url": "https://example.com/api/docs#rate-limits"
}
HTTP 429 responses MUST include a Retry-After
header. Responses from endpoints subject to quotas SHOULD include
X-RateLimit-Remaining and X-RateLimit-Reset headers.
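Putting the error body and rate-limit headers together, a 429 response could be assembled as in the sketch below. The function name is hypothetical, and expressing X-RateLimit-Reset as Unix epoch seconds is an assumption: this section does not fix the format, and deployed APIs differ.

```python
import json


def rate_limited_response(retry_after_seconds: int, remaining: int,
                          reset_epoch: int, docs_url: str):
    """Return (status, headers, body) for a rate-limit rejection."""
    body = {
        "error": "rate_limited",
        "message": "Per-key quota exceeded.",
        "retry_after_seconds": retry_after_seconds,
        "docs_url": docs_url,
    }
    headers = {
        "Content-Type": "application/json",
        "Retry-After": str(retry_after_seconds),  # delay in seconds
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(reset_epoch),    # assumption: epoch seconds
    }
    return 429, headers, json.dumps(body)
```

Keeping the header and the `retry_after_seconds` field in sync lets agents that only parse the JSON body and agents that only read headers arrive at the same wait time.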
Agent-facing endpoints (those documented in AGENTS.md, llms.txt,
or the MCP server card) MUST send
Access-Control-Allow-Origin: * on GET responses, or reflect the requesting
origin if credentials are required.
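The choice between the wildcard and a reflected origin can be sketched as follows. Browsers reject `Access-Control-Allow-Origin: *` on credentialed requests, which is why reflection is needed there. The allowlist parameter is an assumption added for safety and is not required by the text above: reflecting arbitrary origins together with credentials would let any site issue authenticated requests. The function name is hypothetical.

```python
from typing import Optional


def cors_headers(request_origin: Optional[str], needs_credentials: bool,
                 allowed_origins: set) -> dict:
    """Pick CORS response headers for an agent-facing GET endpoint."""
    if not needs_credentials:
        # Public data: any origin may read the response.
        return {"Access-Control-Allow-Origin": "*"}
    if request_origin is None or request_origin not in allowed_origins:
        # No CORS headers: the browser blocks cross-origin reads.
        return {}
    return {
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Credentials": "true",
        "Vary": "Origin",  # caches must not reuse this across origins
    }
```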
Agent-facing endpoints SHOULD accept and log (but not
enforce) a soft header X-Agent-Name identifying the agent software.
Implementations SHOULD retain this value in audit logs
and MAY expose it to users of the resource being acted upon.
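"Accept and log, but not enforce" can be sketched as a lookup that never rejects a request. In the illustration below the length cap, the stripping of non-printable characters (to keep log lines intact), and the record shape are all assumptions; only the header name comes from this section.

```python
from typing import Optional

MAX_AGENT_NAME_LENGTH = 128  # hypothetical cap to bound log size


def agent_name_from(headers: dict) -> Optional[str]:
    """Extract X-Agent-Name case-insensitively; a missing header is not an error."""
    for name, value in headers.items():
        if name.lower() == "x-agent-name":
            cleaned = "".join(ch for ch in value if ch.isprintable()).strip()
            return cleaned[:MAX_AGENT_NAME_LENGTH] or None
    return None


def audit_record(method: str, path: str, user_id: str, headers: dict) -> dict:
    """Build an audit-log entry; the agent name is informational only."""
    return {
        "method": method,
        "path": path,
        "user_id": user_id,
        "agent_name": agent_name_from(headers),
    }
```

Because the value is self-reported and unauthenticated, it belongs in the audit trail and not in any authorization decision.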
5 Badges
A conforming site MAY display per-environment badges reflecting the surfaces it has implemented. Badges are awarded per environment, not cumulatively:
| Badge | Criteria (all MUST be met) |
|---|---|
| Markdown Native | §4.1 |
| Headless Ready | §4.2, §4.3, §4.10 |
| MCP Server | §4.7 (at least one SEP) |
| WebMCP Ready | §4.8, §4.1, §4.10 |
| Agent UX | E4 minimum, plus a published "For Agents" nav entry and a hand-off primitive |
A future revision of this specification will define an automated verifier
(/check?url=…) to award badges based on crawled evidence. Until that verifier
ships, badges are self-asserted and implementations SHOULD
link to the specific URLs demonstrating each badge's criteria.
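Until a verifier exists, a site can at least check its own claims against the table above. The sketch below encodes the section-based criteria as sets; the Agent UX badge is omitted because its criteria are not section numbers, and the function name is hypothetical.

```python
BADGE_CRITERIA = {
    "Markdown Native": {"4.1"},
    "Headless Ready": {"4.2", "4.3", "4.10"},
    "MCP Server": {"4.7"},
    "WebMCP Ready": {"4.8", "4.1", "4.10"},
}


def earned_badges(implemented: set) -> list:
    """Return badges whose criteria are all contained in the implemented sections."""
    return [
        badge
        for badge, required in BADGE_CRITERIA.items()
        if required <= implemented  # subset test: every criterion is met
    ]
```

For example, a site implementing §4.1, §4.2, §4.3, and §4.10 qualifies for Markdown Native and Headless Ready but not WebMCP Ready, since §4.8 is missing.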
6 Security considerations
Signup abuse. Programmatic signup (§4.3) expands the abuse surface. Sites
SHOULD rate-limit signup by IP, log X-Agent-Name,
and reject obviously automated burst patterns. A hashcash token offers a cost gradient
that humans and agents handle equivalently.
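Per-IP signup limiting can be sketched with a sliding window. The class name, the limit, and the window length are hypothetical, and the in-memory dict stands in for whatever shared store a multi-process deployment would need.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # key (e.g. client IP) -> timestamps

    def allow(self, key: str) -> bool:
        """Record and permit a request unless `limit` hits fall inside the window."""
        now = self._clock()
        hits = self._hits[key]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True
```

The same structure works for the per-session limit on WebMCP-originated requests by keying on the session identifier instead of the IP address.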
PAT exposure. Re-retrieval (§4.4) trades one-shot secrecy for operational resilience. Sites SHOULD allow users to rotate all PATs on demand and SHOULD scope PATs narrowly.
Magic link attacks. The endpoint in §4.5 is an account-takeover vector if TTL or single-use semantics are not enforced. Implementations MUST invalidate a link on redemption and MUST expire all outstanding links on password change.
WebMCP write amplification. Browser-registered tools inherit the user's session (§4.8). Malicious pages can register tools; the browser SHOULD surface the tool provenance to the user. Servers SHOULD rate-limit per session even when the request appears human-origin.
Forward compatibility with agent identity. This specification does
not define an agent-identity protocol. A future version is expected to profile either
RFC 8693 token exchange or an equivalent primitive. Sites adopting §4.10
(audit-logged X-Agent-Name) are positioned to migrate without
breaking clients.
7 Example conforming sites
The following are self-described conforming implementations. They are listed as examples, not endorsements; inclusion does not imply automated verification.
| Site | Environments | Notes |
|---|---|---|
| WikiHub | E1 (partial), E2, E3, E4 | Reference implementation. Ships all ten patterns. Source of the coding-agent pattern in §4.9 via its Curator feature. |
| ListHub | E1, E2, E3, E4 | Earlier sibling project. WebMCP (§4.8) implementation with page-context tools. /AGENTS.md, /llms.txt, /.well-known/mcp/server-card.json. |
Implementers who wish to be listed as conforming examples may open a pull request against github.com/tmad4000/agentfirst-web.
8 Normative references
- RFC 2119: Key words for requirement levels.
- RFC 6454: The Web Origin Concept.
- RFC 7807: Problem Details for HTTP APIs.
- RFC 8615: Well-Known URIs.
Informative references
- llmstxt.org: Original llms.txt proposal (Jeremy Howard, AnswerDotAI).
- agents.md: AGENTS.md adoption registry.
- Model Context Protocol: upstream spec for §4.7 and §4.8.
- SEP-1649 · SEP-1960: draft MCP discovery proposals (see MCP repo).
- RFC 8693: OAuth 2.0 Token Exchange (candidate for future agent-identity profile).
Editorial note. This is a living draft. Comments and pull requests are welcome at the spec repository. A companion non-normative overview with a reference implementation is available at the landing page.