How we work.
An adversarial discovery engagement, validated by a four-model AI-Alliance, signed off by a single human auditor.
From scope to attestation in five business days.
The flow is deliberately short. Labs is not a months-long red-team program or a compliance readiness retainer. It is a focused external-surface assurance engagement that moves from authorization to evidence, remediation, retest, and signed closure.
Scope agreement
T-0
Adversarial discovery
T+1 to T+3
AI-Alliance Challenge
T+3
Remediation drafting
T+3 to T+4
Report assembly
T+4
Delivery + retest
T+5
01
30-minute call. We agree on the surface, sign a mutual NDA, and put written authorization for OSINT-only discovery in place.
02
The founder runs OSINT against public Docker registries, NPM packages, GitHub artifacts, CI/CD logs, and adjacent public sources.
03
Each candidate finding is independently re-evaluated by Claude, Gemini, Codex, and Mimo against raw evidence.
04
The founder authors exact remediation steps. The same alliance challenges whether the fix closes the exposure.
05
Executive summary, evidence pack, business impact analysis, remediation runbook, and challenge log are bound.
06
The founder reviews the report with you. After fixes, we retest within 14 days and issue a signed attestation.
The surface is written down before discovery starts.
A clean scope protects both sides. It tells your legal team what was authorized. It tells the auditor where to stop. It tells the report reader what the attestation does and does not cover.
01
Root domain
The canonical company domain and explicitly related domains named in the statement of work.
02
GitHub organization
Public repositories, releases, Actions artifacts that remain public, and organization-level metadata.
03
Docker namespace
Public image tags, manifests, layers, build metadata, and registry naming patterns.
04
NPM scope
Public packages, tarballs, package metadata, install scripts, and accidental bundled configuration.
05
Adjacent public artifacts
CT logs, archived pages, public package mirrors, exposed docs, and historical public records.
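A written scope of this kind can also be expressed as data, so that every discovered asset is checked against it before anyone investigates further. The sketch below is illustrative only: the `Scope` class, the `triage` helper, and the example values (`example.com`, `example-org`) are hypothetical, and treating subdomains of a named domain as covered is an assumption that a real statement of work would have to confirm.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Scope:
    """Hypothetical written scope: one set per authorized surface."""
    domains: frozenset = field(default_factory=frozenset)
    github_orgs: frozenset = field(default_factory=frozenset)
    docker_namespaces: frozenset = field(default_factory=frozenset)
    npm_scopes: frozenset = field(default_factory=frozenset)

    def covers(self, kind: str, name: str) -> bool:
        """Return True only if the asset is named in the scope."""
        name = name.lower().rstrip(".")
        if kind == "domain":
            # Assumption: subdomains of an authorized domain are covered.
            return any(name == d or name.endswith("." + d) for d in self.domains)
        allowed = {
            "github": self.github_orgs,
            "docker": self.docker_namespaces,
            "npm": self.npm_scopes,
        }.get(kind, frozenset())
        return name in allowed


def triage(scope: Scope, kind: str, name: str) -> str:
    """Out-of-scope assets become a scoping question, not a lead."""
    return "in-scope" if scope.covers(kind, name) else "scoping-question"


scope = Scope(
    domains=frozenset({"example.com"}),
    github_orgs=frozenset({"example-org"}),
)
print(triage(scope, "domain", "api.example.com"))  # in-scope
print(triage(scope, "github", "other-org"))        # scoping-question
```

Anything that returns `scoping-question` is written down for the client to decide on rather than investigated.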
Four models. Independent context. Forced convergence.
Each frontier LLM has a different blind spot. Run them independently against the same evidence, and the disagreements become signal.
For each finding, we run a structured three-pass protocol: independent judgment, steel-man pass, and documented founder decision if non-convergence persists.
The same protocol applies to remediation. Each model proposes a fix; each fix is challenged by the others; the simplest convergent fix wins.
Convergence is not a majority vote. A 3-1 disagreement on severity or scope is documented; a 4-0 false-positive verdict kills the finding.
[finding F-2026-12, candidate verdict: CRITICAL]
claude  verdict=CRITICAL  conf=0.92  "AWS_ACCESS_KEY in /etc layer, prod-tag, scope=*"
gemini  verdict=CRITICAL  conf=0.88  "concur; suggest verify via STS"
codex   verdict=HIGH      conf=0.80  "scope wildcard ambiguous"
mimo    verdict=CRITICAL  conf=0.85  "concur; impact = full S3+IAM"
[steel-man pass]
codex defends HIGH: "no production telemetry visible"
claude rebuts: "tag 'prod' + manifest digest matches CI"
verdict converges: CRITICAL conf=0.91
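The convergence rule can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Verdict` type, the `FALSE_POSITIVE` label, and the outcome strings are hypothetical names, and the sketch only classifies one round of verdicts. It does not model the steel-man exchange itself, the founder decision, or how a converged confidence value is derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    model: str
    label: str         # e.g. "CRITICAL", "HIGH", "FALSE_POSITIVE"
    confidence: float


def evaluate(verdicts):
    """Classify one round of independent verdicts.

    Only unanimity settles a round. Any split, including 3-1, is
    returned with every position preserved so it can be logged and
    sent to the steel-man pass.
    """
    if not verdicts:
        raise ValueError("at least one verdict is required")
    by_label = {}
    for v in verdicts:
        by_label.setdefault(v.label, []).append(v.model)
    if len(by_label) == 1:
        (label,) = by_label
        outcome = "rejected" if label == "FALSE_POSITIVE" else "converged"
        return outcome, by_label
    return "steel_man_required", by_label


# First round, using the labels from the sample log.
round_one = [
    Verdict("claude", "CRITICAL", 0.92),
    Verdict("gemini", "CRITICAL", 0.88),
    Verdict("codex", "HIGH", 0.80),
    Verdict("mimo", "CRITICAL", 0.85),
]
print(evaluate(round_one))
# ('steel_man_required', {'CRITICAL': ['claude', 'gemini', 'mimo'], 'HIGH': ['codex']})
```

Returning the full split instead of a winner keeps the dissenting position visible, which is what distinguishes this from a majority vote.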
Claude
Strong at narrative coherence and impact synthesis. We challenge it for confirmation bias and over-complete explanations.
Gemini
Strong at broad pattern recognition. We challenge it for regex-driven false positives and over-reading weak signals.
Codex
Strong at implementation detail and remediation diffs. We challenge it for under-rating business impact outside code paths.
Mimo
Strong as a stabilizing reviewer. We challenge it for averaging toward consensus when the outlier might be right.
Every finding is rated on its evidence.
Only findings at evidence level 4 or 5 appear in the executive summary. Lower-level signals are documented in the appendix as watch-list items and are not actioned.
Level 1: Suspicion
Pattern matched in a public artifact, not yet challenged.
Level 2: Static corroboration
Pattern, entropy, variable name, and structural context align.
Level 3: Context-validated
At least one frontier LLM has reviewed the artifact in context and confirmed the finding.
Level 4: Cross-validated
The AI-Alliance has converged on a verdict.
Level 5: Externally grounded
Independently verified against an external source of truth.
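The evidence ladder and the level 4 or 5 cutoff map naturally onto an ordered enum. In this sketch the numeric values 1 to 5 are inferred from the order in which the levels are listed, and `EvidenceLevel` and `placement` are hypothetical names used for illustration.

```python
from enum import IntEnum


class EvidenceLevel(IntEnum):
    # Assumption: levels are numbered 1-5 in the order listed above.
    SUSPICION = 1
    STATIC_CORROBORATION = 2
    CONTEXT_VALIDATED = 3
    CROSS_VALIDATED = 4
    EXTERNALLY_GROUNDED = 5


def placement(level: EvidenceLevel) -> str:
    """Levels 4 and 5 go to the executive summary; the rest to the appendix."""
    if level >= EvidenceLevel.CROSS_VALIDATED:
        return "executive_summary"
    return "appendix_watch_list"


print(placement(EvidenceLevel.CONTEXT_VALIDATED))  # appendix_watch_list
print(placement(EvidenceLevel.CROSS_VALIDATED))    # executive_summary
```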
What you get.
01_Executive_Risk_Memo.pdf — 2 pages, board-ready
02_Findings_Detail.pdf — full finding-by-finding, with evidence pack
03_Business_Impact_Analysis.pdf — breach scenario, GDPR/NIS2/DORA exposure, monetary range
04_Remediation_Runbook.md — step-by-step, code-ready, version-controlled
05_AI_Alliance_Challenge_Log.json + .pdf — verbatim model-to-model challenges
06_Retest_Attestation_Template.pdf — signed on closure
Sample anonymized deliverable available on request after the discovery call.
Mock report preview.
PAGE 1
Cover memo
PAGE 2
Evidence pack
PAGE 3
Retest attestation
Every executive claim must trace to evidence in the appendix.
Every evidence item must include path, timestamp, hash, and collection context where available.
Every remediation step must be specific enough for an engineer to ship without a second discovery meeting.
Every AI-Alliance disagreement that changes severity, scope, or remediation must remain visible in the log.
Every closure attestation must reference the retest date, scoped surface, and residual limits.
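The evidence-item rule (path, timestamp, hash, collection context) can be pictured as a small record type. The sketch below is an assumption about shape, not the actual evidence-pack format: `EvidenceItem`, `record_evidence`, the field names, the choice of SHA-256, and the example path are all hypothetical.

```python
import hashlib
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class EvidenceItem:
    path: str                # where the artifact was found
    collected_at: str        # ISO 8601 timestamp in UTC
    sha256: str              # hash of the artifact bytes as collected
    collection_context: str  # how and why it was collected


def record_evidence(path, content: bytes, context, now=None) -> EvidenceItem:
    """Build an evidence record; the artifact bytes are hashed, not stored."""
    now = now or datetime.now(timezone.utc)
    return EvidenceItem(
        path=path,
        collected_at=now.isoformat(),
        sha256=hashlib.sha256(content).hexdigest(),
        collection_context=context,
    )


item = record_evidence(
    "registry/example-org/app:prod/layer-3/etc/app.env",  # hypothetical path
    b"placeholder artifact bytes",
    "public image layer pulled anonymously",
    now=datetime(2026, 1, 5, tzinfo=timezone.utc),
)
print(asdict(item)["collected_at"])  # 2026-01-05T00:00:00+00:00
```

Recording the hash at collection time lets a later reader confirm that the artifact in the appendix is the one the finding was built on.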
What we will not do.
- We never probe, exploit, or test credentials against client infrastructure. OSINT-only.
- We never store, share, or republish discovered secrets. Hashes and truncated previews only.
- We never scope-creep. The engagement contract specifies the surface.
- We never publish a finding with the client's identity attached without explicit written permission.
- We never withhold a finding to upsell. Every confirmed finding is in the report.
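The "hashes and truncated previews only" rule can be illustrated with a short redaction helper. This is a sketch under assumptions: `redact_secret`, the four-character preview length, and the use of SHA-256 are illustrative choices, and the example value is a dummy string, not a real credential.

```python
import hashlib


def redact_secret(secret: str, keep: int = 4) -> dict:
    """Return a hash and a truncated preview; the raw value is never kept."""
    digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    # Show a prefix only when the secret is long enough that the
    # preview does not reveal a meaningful share of it.
    prefix = secret[:keep] if len(secret) > keep * 2 else ""
    return {"sha256": digest, "preview": prefix + "...", "length": len(secret)}


record = redact_secret("DUMMY-VALUE-NOT-A-REAL-SECRET")
print(record["preview"])  # DUMM...
print(record["length"])   # 29
```

The hash lets the client confirm which secret is meant by hashing their own copy, without the report ever carrying the value itself.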
Credential handling
We do not test discovered credentials. We prove exposure by origin, context, structure, and safe corroboration.
No opportunistic expansion
If we find an adjacent asset that looks relevant but is not authorized, we document the scoping question instead of investigating it.
No public attribution
Client identity, sector, timing, and technical fingerprint stay confidential unless the client explicitly authorizes publication.
No scanner theater
We do not pad the report with low-confidence findings, dependency noise, or generic best-practice checklists.
How findings move in or out of the report.
The report is not a dumping ground for every interesting signal. The founder uses explicit acceptance criteria so the final artifact stays short, defensible, and useful to engineering teams.
Finding accepted
The evidence reaches level 4 or 5, the business impact is defensible, and remediation can be stated precisely.
Finding rejected
The AI-Alliance converges on false positive, the evidence cannot be reproduced safely, or the impact depends on an unproven assumption.
Finding downgraded
The exposure exists, but scope, privilege, or reachable impact does not support executive-summary severity.
Watch-list item
The signal is useful for future monitoring but too weak to ask engineers to remediate under the current engagement.
Closure accepted
Retest shows the public exposure path is gone, and the attestation names the exact surface and date checked.
Closure deferred
If the client needs more time to remediate, the report records the remaining exposure and the retest window.
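The acceptance criteria above can be read as a small decision function. The sketch below is one possible encoding: the `Candidate` fields and the order of the checks are assumptions, and the `held` state for a finding whose remediation is not yet precise is a placeholder that the criteria above do not name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    evidence_level: int            # 1 to 5
    alliance_false_positive: bool  # AI-Alliance converged on false positive
    safely_reproducible: bool      # evidence can be reproduced safely
    impact_proven: bool            # impact does not rest on an unproven assumption
    severity_supported: bool       # scope, privilege, and reach support the severity
    remediation_precise: bool      # fix can be stated precisely


def disposition(c: Candidate) -> str:
    """Map a candidate finding to its place in, or out of, the report."""
    if c.alliance_false_positive or not c.safely_reproducible or not c.impact_proven:
        return "rejected"
    if c.evidence_level < 4:
        return "watch_list"
    if not c.severity_supported:
        return "downgraded"
    if not c.remediation_precise:
        return "held"  # hypothetical state, not named in the criteria
    return "accepted"


print(disposition(Candidate(5, False, True, True, True, True)))   # accepted
print(disposition(Candidate(4, False, True, True, False, True)))  # downgraded
print(disposition(Candidate(2, False, True, True, True, True)))   # watch_list
```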
Ready to talk scope?
Book a 30-min discovery call
Every engagement signed by the founder.
BleedWatch Labs