PRIVATE BETA SURFACE
GitHub PR Review App
Post evidence-backed UX concerns directly on pull requests so teams can merge with fewer surprises.
This surface is available in private beta. Teams use it in active PR workflows while we refine defaults and controls from real feedback.
Evidence-Backed by Default
The app posts a concern only when the issue is observable through readable UI copy, a visible state mismatch, screenshot evidence, a console/runtime error, or a network failure. No evidence means no blocking claim.
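As a rough illustration of this rule, the sketch below drops any concern that has no recognized evidence attached. The `Concern` type, the evidence kind strings, and the `postable` helper are hypothetical names chosen for this example; they are not Flock's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical evidence kinds, mirroring the observable signals listed above.
EVIDENCE_KINDS = {
    "ui_copy",
    "state_mismatch",
    "screenshot",
    "console_error",
    "network_failure",
}


@dataclass
class Concern:
    claim: str
    evidence: list[str] = field(default_factory=list)  # kinds of attached artifacts


def postable(concerns: list[Concern]) -> list[Concern]:
    """Keep only concerns backed by at least one recognized evidence kind."""
    return [
        c for c in concerns
        if any(kind in EVIDENCE_KINDS for kind in c.evidence)
    ]


concerns = [
    Concern("Shipping labels are ambiguous", ["screenshot", "ui_copy"]),
    Concern("Checkout feels slow"),  # no evidence, so it is never posted
]
print([c.claim for c in postable(concerns)])
```

Filtering before posting, rather than posting and then downgrading, keeps unsupported claims out of the PR thread entirely.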
How the PR loop works
Install + scope
Install the app on selected repos and map it to staging or preview environments first.
Run on PR events
On opened/synchronized PRs, Flock runs synthetic journeys focused on changed flows.
Post grounded feedback
The app posts inline comments and a summary only when claims have attached evidence artifacts.
Gate with clarity
A check run reports pass/fail with severity thresholds and direct links to artifacts.
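The loop above can be sketched in a few lines, assuming a handler that receives the `action` field of a GitHub `pull_request` webhook and a list of findings from a run. The `Finding` and `Rule` types, the helper names, and the placeholder artifact URL are hypothetical; the thresholds are borrowed from the sample configuration on this page. Only the `opened` and `synchronize` action names and the `success`/`failure` conclusion values come from GitHub itself.

```python
from dataclasses import dataclass
from typing import Optional

# pull_request webhook actions that should start a run in this sketch.
RUN_ACTIONS = {"opened", "synchronize"}


@dataclass
class Finding:
    title: str
    severity: str                       # "critical", "major", or "moderate"
    confidence: float                   # 0.0 to 1.0
    artifact_url: Optional[str] = None  # link to the supporting evidence


@dataclass
class Rule:
    severity: str
    confidence_gte: float = 0.0  # a rule with no threshold matches any confidence


def should_run(action: str) -> bool:
    """Run journeys only when a PR is opened or receives new commits."""
    return action in RUN_ACTIONS


def blocking(findings: list[Finding], rules: list[Rule]) -> list[Finding]:
    """Return findings that carry an artifact and match at least one rule."""
    return [
        f for f in findings
        if f.artifact_url
        and any(
            f.severity == r.severity and f.confidence >= r.confidence_gte
            for r in rules
        )
    ]


def conclusion(findings: list[Finding], rules: list[Rule]) -> str:
    """Map blocking findings to a check-run conclusion."""
    return "failure" if blocking(findings, rules) else "success"


rules = [Rule("critical"), Rule("major", confidence_gte=0.85)]
findings = [
    Finding("Shipping labels ambiguous", "major", 0.92,
            "artifacts://runs/run_013/step-07.png"),
    # Placeholder URL for illustration only.
    Finding("Low-contrast footer link", "moderate", 0.95,
            "artifacts://example/placeholder.png"),
]
if should_run("synchronize"):
    print(conclusion(findings, rules))
```

In this sketch a finding without an artifact link can never block, regardless of severity, which matches the evidence-first policy described earlier.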
Artifacts in pull requests
Inline concern comment (example)
Flock UX Concern (Major, 92% confidence)
Claim:
The shipping method labels are visually identical, and novice personas selected the wrong option 3/4 times.
Evidence:
- Screenshot: artifacts://runs/run_013/step-07.png
- DOM text: "Standard" and "Priority" appear without delivery-time context
- Replay event: timeline://run_013/events/183
Suggested change:
Add helper text under each method (e.g., "3-5 days" vs "1-2 days") and increase spacing between radio cards.
Check run summary (example)
check: flock/ux-evidence
status: failure
critical: 0
major: 2
moderate: 3
blocking_findings:
- checkout/shipping-method labels ambiguous (evidence: screenshot + DOM)
- payment tab-switch drops field state (evidence: console error + replay trace)
What ships to GitHub
- Inline comments anchored to changed files
- PR summary with severity totals and top blockers
- Check run status for merge gating
- Artifact deep links (screenshots, DOM snippets, errors, traces)
Configuration shape
Configure this at the repo level with explicit evidence requirements and merge-gate thresholds.
# .flock/github-app.yml
mode: pull_request
trigger:
  on_opened: true
  on_synchronize: true
policy:
  minimum_evidence_per_claim: 1
  require_artifact_link: true
  block_merge_on:
    - severity: critical
    - severity: major
      confidence_gte: 0.85
evidence:
  include:
    - screenshot
    - dom_snapshot
    - console_error
    - network_failure
output:
  inline_comments: true
  summary_comment: true
  check_run: true
Tell us what should block a merge
We are deciding which evidence types and severity levels should block merges by default. Feedback you send during the beta directly shapes those defaults.
Get Started