PRIVATE BETA SURFACE
Flock MCP Server
Give coding agents direct access to evidence-backed UX findings so they can plan and implement real fixes in context.
The server is available in private beta with scoped authentication and evidence-first tool contracts. Tool boundaries and defaults are still being refined based on customer feedback.
Evidence-Backed by Default
MCP tools prioritize grounded retrieval. Findings are expected to include artifact links and explicit evidence fields, so agents can cite what they read or saw instead of guessing.
Quickstart shape
Target setup follows standard MCP client configuration with scoped credentials and project isolation.
{
  "mcpServers": {
    "flock": {
      "command": "npx",
      "args": ["-y", "@flock/mcp-server"],
      "env": {
        "FLOCK_API_KEY": "${FLOCK_API_KEY}",
        "FLOCK_PROJECT_ID": "proj_123"
      }
    }
  }
}
What agents can query
Tool contract example
Tool: flock.list_findings
Input:
{
  "run_id": "run_013",
  "severity_gte": "major",
  "requires_evidence": true
}
Output:
{
  "findings": [
    {
      "id": "friction-07",
      "summary": "Shipping options are visually ambiguous",
      "severity": "major",
      "evidence": {
        "screenshot": "artifacts://runs/run_013/step-07.png",
        "dom_excerpt": "<label>Standard</label><label>Priority</label>",
        "trace_event": "timeline://run_013/events/183"
      },
      "suggested_change": "Update option labels with delivery-time helper text"
    }
  ]
}
Prompt patterns to support
- Show all critical findings for run_013 that include screenshot evidence.
- Generate an addressable fix prompt for friction-07 and include linked artifacts.
- List findings that are unsupported by evidence so I can ignore them.
- Compare run_013 vs run_014 and show only new, evidence-backed regressions.
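As a rough illustration, prompts like these might translate into tool inputs plus a small client-side comparison. This is a sketch, not the shipped contract: the input fields run_id, severity_gte, and requires_evidence come from the tool contract example, while the helper names (list_findings_input, new_regressions) and the idea that finding ids stay stable across runs are assumptions made for illustration.

```python
from typing import Any

Finding = dict[str, Any]


def list_findings_input(run_id: str, severity_gte: str = "major") -> dict[str, Any]:
    """Build an input payload for flock.list_findings.

    Field names are taken from the tool contract example; this helper
    itself is hypothetical and not part of any Flock client library.
    """
    return {
        "run_id": run_id,
        "severity_gte": severity_gte,
        "requires_evidence": True,  # only ask for grounded findings
    }


def new_regressions(baseline: list[Finding], candidate: list[Finding]) -> list[Finding]:
    """Return findings that appear in `candidate` but not in `baseline`.

    Assumption: finding ids are stable across runs. If they are not,
    another matching key (for example the summary text) would be needed.
    """
    seen_ids = {finding["id"] for finding in baseline}
    return [finding for finding in candidate if finding["id"] not in seen_ids]
```

An agent answering the run_013 vs run_014 prompt could request both runs with list_findings_input and pass the two result lists to new_regressions. Whether "critical" is an accepted value for severity_gte is not confirmed by the example, which only shows "major".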
Safety and evidence guarantees
Evidence required
Tools can enforce requires_evidence=true so agents only consume grounded findings.
Scoped access
Project and environment scoping limit what each agent instance can read.
Read-first posture
Default behavior is retrieval and explanation; write actions stay explicit and auditable.
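A client can also double-check the evidence guarantee on its own side before acting on a finding. The sketch below assumes the evidence object has the shape shown in the tool contract example (screenshot, dom_excerpt, trace_event). The function names, and the rule that one non-empty evidence field is enough to count as grounded, are illustrative assumptions rather than documented behavior.

```python
from typing import Any

# Evidence field names taken from the tool contract example; the real
# schema may define others.
EVIDENCE_FIELDS = ("screenshot", "dom_excerpt", "trace_event")


def is_grounded(finding: dict[str, Any]) -> bool:
    """True if the finding carries at least one non-empty evidence field."""
    evidence = finding.get("evidence") or {}
    return any(evidence.get(field) for field in EVIDENCE_FIELDS)


def split_by_evidence(
    findings: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Partition findings into (grounded, ungrounded), preserving order."""
    grounded = [f for f in findings if is_grounded(f)]
    ungrounded = [f for f in findings if not is_grounded(f)]
    return grounded, ungrounded
```

Keeping the ungrounded list, rather than silently dropping it, lets an agent report which findings it skipped and why.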
Help shape the MCP boundary
We are deciding which tools should be read-only, which evidence fields are mandatory, and how strict the default filters should be. Feedback given during the beta directly shapes the shipped behavior.