Give your AI agent a real debugger
Empowering agents with structured debugging
25 MCP tools for GitHub Copilot — trace variables, validate hypotheses, safety-check fixes — before touching a single line of code. Across JS, Python, and Java.
See It in Action
Watch AgentProbe guide Copilot through a real multi-bug diagnosis
What it actually does
Six diagnostic capabilities that add structured verification to how Copilot investigates bugs — each one filling a specific gap.
Evidence Before Action
validate_hypothesis returns a hard true/false with file+line evidence tokens before a single character is changed.
Pre-flight Safety Check
check_suggestion scans each proposed fix against logic guards and known violation patterns before the edit lands.
Exact Variable Tracing
trace_value_flow traces a variable from assignment through every read/write — surfacing the exact race window.
Team-Searchable Audit Trail
Sessions saved with root cause, fix, evidence, and keywords. Searchable by any teammate months later.
Failing Fast on Scope
detect_error_pattern returns explicit "no match" signals — no time wasted chasing patterns that do not apply.
You Stay in Control
Direct the tools — "trace this, validate that, don't fix yet" — and decide when to act on the evidence.
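The evidence-then-edit flow described above can be pictured as a simple gate. The sketch below is illustrative only: `EditGate`, `Verdict`, and `SafetyReport` are hypothetical types invented for this example, not AgentProbe's actual response schema.

```java
import java.util.List;

// Hypothetical shapes for illustration; AgentProbe's real payloads may differ.
final class EditGate {
    /** Outcome of a hypothesis check: a verdict plus file:line evidence. */
    record Verdict(boolean hypothesisValid, List<String> evidence) {}

    /** Outcome of a pre-flight check on a proposed fix. */
    record SafetyReport(boolean safe, List<String> violations) {}

    /** Allows an edit only for a confirmed, evidenced hypothesis and a clean safety report. */
    static boolean mayEdit(Verdict verdict, SafetyReport report) {
        return verdict.hypothesisValid()
                && !verdict.evidence().isEmpty()
                && report.safe()
                && report.violations().isEmpty();
    }

    public static void main(String[] args) {
        Verdict verdict = new Verdict(true, List.of("MetricRegistry.java:20"));
        SafetyReport report = new SafetyReport(true, List.of());
        System.out.println("edit allowed: " + mayEdit(verdict, report));
    }
}
```

In this sketch a refuted hypothesis or any flagged violation keeps the file untouched, which mirrors the "don't fix yet" control described above.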
Just tell Copilot what you need
AgentProbe maps plain-language requests to the right tool chain automatically — no syntax to remember.
Discovers running Node/Python/Java processes with open debug ports, attaches to the matching one, and reads live variable state. Process auto-resumes.
Finds the already-running debug process, parses the stack trace to pick the 3 most useful inspection points, then walks the call chain collecting variable snapshots.
Extracts the exact crash file and line from the stack trace, then attaches and reads only the `user` variable at that point — nothing else touched.
Sends SIGUSR1 to the running process — no restart needed. Falls back to returning the exact restart command if the process isn't debuggable yet.
Finds the assignment source and every subsequent read and write — surfacing exactly where the value changes before it reaches the crash site.
Validates the root cause claim against codebase evidence first, then scans the proposed edit for null-check removal, broken guards, or version-incompatible APIs.
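To make the "extracts the exact crash file and line from the stack trace" step concrete, here is a minimal sketch for JVM-style frames. It is an assumption about how such parsing could work, not AgentProbe's parser; `CrashSiteParser` and `CrashSite` are hypothetical names, and the regular expression only covers frames of the form `at pkg.Class.method(File.java:42)`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; handles JVM-style frames only.
final class CrashSiteParser {
    /** File name and line number taken from one stack frame. */
    record CrashSite(String file, int line) {}

    // Matches frames such as "at com.acme.UserService.load(UserService.java:57)".
    private static final Pattern JVM_FRAME =
            Pattern.compile("\\bat\\s+[\\w.$<>/]+\\(([^():]+):(\\d+)\\)");

    /** Returns the topmost frame that carries a file and line, if any. */
    static Optional<CrashSite> topFrame(String stackTrace) {
        Matcher m = JVM_FRAME.matcher(stackTrace);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new CrashSite(m.group(1), Integer.parseInt(m.group(2))));
    }

    public static void main(String[] args) {
        String trace = "java.lang.NullPointerException: user is null\n"
                + "\tat com.acme.UserService.load(UserService.java:57)\n"
                + "\tat com.acme.Main.main(Main.java:12)";
        System.out.println(topFrame(trace));
    }
}
```

Frames without a file and line, such as `(Native Method)`, are skipped, so the result points at the first location a debugger could actually stop on.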
Language & debug mode support
Three languages with full rule sets, version-aware context, and a dedicated debug protocol adapter for each runtime.
JavaScript & TypeScript
Debugger: CDP — Chrome DevTools Protocol
Frameworks
100+ error patterns
- Null / undefined access
- Async / Promise errors
- Type mismatches
- Module resolution
- Memory & stack
- Syntax errors
Python
Debugger: DAP — Debug Adapter Protocol
Frameworks
30+ error patterns
- None / NoneType errors
- AttributeError / KeyError
- ImportError / NameError
- Async event loop errors
- Django / ORM exceptions
- Pandas / NumPy type errors
Java
Debugger: JDWP — Java Debug Wire Protocol
Frameworks
15+ error patterns
- NullPointerException
- ClassNotFoundException
- LazyInitializationException
- OutOfMemoryError (heap + metaspace)
- ClassCastException
- EJB / container errors
Installation Guide
Get AgentProbe running in your editor in under 2 minutes
GitHub Copilot with MCP support required
AgentProbe is accessed exclusively through GitHub Copilot’s MCP integration — available in both VS Code and IntelliJ. Copilot acts as the AI layer; AgentProbe provides the 25 debugging tools Copilot calls via MCP.
Pre-requisites
GitHub Copilot with MCP support
Required — AgentProbe is accessed exclusively through Copilot’s MCP integration
Node.js 18+
Required for VS Code extension & CLI
Java 17+
Required for IntelliJ plugin
VS Code 1.85+
With GitHub Copilot extension installed
IntelliJ IDEA 2024.3+
With GitHub Copilot plugin installed
Get Started in 2 Steps
1. Install from Marketplace
Open VS Code → Extensions (Ctrl+Shift+X, or Cmd+Shift+X on macOS) → Search “AgentProbe” → Click Install
2. Start the MCP Server
Open the Command Palette (Ctrl+Shift+P, or Cmd+Shift+P on macOS) → Run “AgentProbe: Start MCP Server”
That’s it. AgentProbe is now available as a tool inside GitHub Copilot Chat.
Quick Reference
Take the MetricRegistry bug from the demo. Every timer reported the same average — 8.23ms — regardless of how long each pipeline step actually took. Here is what finding it looks like, with and without AgentProbe.
// Copilot opens MetricRegistry.java
// and reads through the logic manually.
void recordTimer(String key, double ms) {
    DoubleSummaryStatistics stats =
        timers.computeIfAbsent(
            key,
            k -> sharedTimerStats // suspects aliasing
        );
    stats.accept(ms);
}
// Forms a hypothesis from reading alone.
// No verification: edits immediately.

// Step 1: trace_value_flow("sharedTimerStats")
→ assigned : MetricRegistry.java line 12
→ aliased read : line 20 (every key)
→ all keys map to the same object
// Step 2: validate_hypothesis(
// "sharedTimerStats aliases all timers")
→ hypothesis_valid : true
→ evidence : MetricRegistry.java:20
// Step 3: check_suggestion(fix)
→ safe : true, no violations
// Now edit with hard evidence, not a guess.

The fix is identical either way: replace sharedTimerStats with new DoubleSummaryStatistics() per key. The difference is certainty. AgentProbe gives machine-verified evidence and a safety check before a single line is changed.
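For readers who want to see the aliasing bug in isolation, the following is a minimal reproduction, assuming a map-backed registry. It is not the demo's actual source: `TimerRegistry`, `aliased()`, and `perKey()` are hypothetical names, and only the `computeIfAbsent` pattern and the `DoubleSummaryStatistics` type come from the walkthrough above.

```java
import java.util.DoubleSummaryStatistics;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Minimal reproduction sketch; names are hypothetical.
final class TimerRegistry {
    private final Map<String, DoubleSummaryStatistics> timers = new HashMap<>();
    private final Supplier<DoubleSummaryStatistics> statsFactory;

    private TimerRegistry(Supplier<DoubleSummaryStatistics> statsFactory) {
        this.statsFactory = statsFactory;
    }

    /** Buggy variant: every key receives the same shared statistics object. */
    static TimerRegistry aliased() {
        DoubleSummaryStatistics shared = new DoubleSummaryStatistics();
        return new TimerRegistry(() -> shared);
    }

    /** Fixed variant: each key gets its own statistics object. */
    static TimerRegistry perKey() {
        return new TimerRegistry(DoubleSummaryStatistics::new);
    }

    void recordTimer(String key, double ms) {
        timers.computeIfAbsent(key, k -> statsFactory.get()).accept(ms);
    }

    /** Average for a key that has at least one recorded sample. */
    double average(String key) {
        return timers.get(key).getAverage();
    }

    public static void main(String[] args) {
        TimerRegistry buggy = aliased();
        buggy.recordTimer("parse", 2.0);
        buggy.recordTimer("render", 14.0);
        // Both keys report the pooled average because they share one object.
        System.out.println(buggy.average("parse") + " / " + buggy.average("render"));

        TimerRegistry fixed = perKey();
        fixed.recordTimer("parse", 2.0);
        fixed.recordTimer("render", 14.0);
        // Each key now reports its own average.
        System.out.println(fixed.average("parse") + " / " + fixed.average("render"));
    }
}
```

With the shared object, every key reports the average of all samples ever recorded, which is the "identical average" symptom; with one object per key the averages diverge as expected.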
Measurable debugging advantage
Seven metrics. Same bug, different outcome.
Debug sessions that don’t disappear
Root cause, fix, evidence, keywords — saved as structured JSON, searchable by anyone on the team
{
  "session_id": "sess_1774397529015_8f9d58",
  "date": "2026-03-28",
  "error": "All timers report identical average 8.23ms",
  "root_cause": "sharedTimerStats singleton aliased to all timer keys",
  "fix_applied": "new DoubleSummaryStatistics() per key",
  "files_involved": ["MetricRegistry.java"],
  "keywords": ["DoubleSummaryStatistics", "alias", "metrics", "timer"],
  "timestamp": "2026-03-28T14:22:00Z"
}

Where to find everything
.agentprobe/sessions/
Every debug session saved as structured JSON: error, root cause, fix, files involved, and questions asked.
.agentprobe/published/
Published records with keywords, tags, version context, and redaction metadata, ready for team-wide search.
.agentprobe/sessions.json
Token budget tracker: every debug attempt logged with what was tried, so repeat investigations are caught early.
All data stays local by default. Sensitive values (API keys, JWTs, paths, emails, IPs) are automatically redacted before any export. Add .agentprobe/ to .gitignore to keep sessions private.
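As a rough picture of what pattern-based redaction can look like, here is a small sketch. The patterns and the `[REDACTED_*]` placeholders are assumptions made for illustration; AgentProbe's actual redaction rules are not documented on this page, and these deliberately naive expressions will miss some secrets and over-match others (for example, the IP rule also matches dotted version strings).

```java
import java.util.List;
import java.util.regex.Pattern;

// Illustrative redaction sketch; rules and placeholders are assumptions.
final class ExportRedactor {
    /** One masking rule: a pattern and the placeholder that replaces each match. */
    record Rule(Pattern pattern, String placeholder) {}

    // Order matters: JWTs are masked before the broader email and IP rules run.
    private static final List<Rule> RULES = List.of(
            new Rule(Pattern.compile("eyJ[A-Za-z0-9_-]+\\.[A-Za-z0-9_-]+\\.[A-Za-z0-9_-]+"),
                    "[REDACTED_JWT]"),
            new Rule(Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}"),
                    "[REDACTED_EMAIL]"),
            new Rule(Pattern.compile("\\b(?:\\d{1,3}\\.){3}\\d{1,3}\\b"),
                    "[REDACTED_IP]"));

    /** Applies every rule in order and returns the masked text. */
    static String redact(String text) {
        String result = text;
        for (Rule rule : RULES) {
            result = rule.pattern().matcher(result).replaceAll(rule.placeholder());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(redact("login by bob@example.com from 10.0.0.12"));
    }
}
```

Running redaction before export, rather than at read time, means the unmasked values never leave the local `.agentprobe/` directory in this model.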
Searchable Knowledge Base
A new engineer hits the same bug and gets the full diagnosis in seconds. Jaccard similarity matches past sessions by error signature.
Built-in Audit Trail
Every attempt tracked — what was tried, what evidence supported it, and what was verified safe. Full incident timeline out of the box.
Faster Incident Response
On-call engineers search sessions by symptom or file path. A 5-minute investigation becomes 30 seconds with search_past_sessions.
Junior Dev Acceleration
Sessions written in plain language with file and line references — learn the debugging pattern, not just the fix.
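The Jaccard matching mentioned under Searchable Knowledge Base compares two error signatures as sets of tokens: the size of their intersection divided by the size of their union. The sketch below shows the calculation itself; the tokenization (lowercase, split on non-alphanumeric characters) is an assumption, since the page does not say how AgentProbe builds an error signature.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch; the tokenization scheme is an assumption.
final class ErrorSignatureSimilarity {
    /** Lowercases the text and splits it into a set of alphanumeric tokens. */
    static Set<String> tokens(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toSet());
    }

    /** Jaccard similarity |A ∩ B| / |A ∪ B|, defined here as 0 when both sets are empty. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        Set<String> saved = tokens("All timers report identical average 8.23ms");
        Set<String> incoming = tokens("Timers report identical average");
        System.out.println(jaccard(saved, incoming));
    }
}
```

A score of 1.0 means the token sets are identical and 0.0 means they share nothing, so a search could rank saved sessions by this score and apply a cut-off of the team's choosing.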
Side-by-Side Comparison
Copilot working alone vs. Copilot with AgentProbe
| Dimension | Copilot Alone | Copilot + AgentProbe |
|---|---|---|
| How bugs are found | × Read code, reason manually, form hypothesis | ✓ trace_value_flow pinpoints write→read windows with line numbers |
| Hypothesis confidence | × Based on expertise alone; could be wrong | ✓ validate_hypothesis returns true/false + evidence before editing |
| Safety before edit | × None; hypothesis immediately becomes a code change | ✓ check_suggestion flags violations before any file is touched |
| Post-fix artifact | × Just a diff; no record of why or what was ruled out | ✓ Structured session: root cause, fix, evidence, keywords, timestamp |
| Team reuse | × Next developer starts from scratch | ✓ Searchable by keyword; same bug surfaces the full diagnostic in seconds |
| Speed | ✓ About 2 min (fewer round-trips) | ~ About 5 min (tool round-trips add latency but add certainty) |
| Token cost | ✓ About 3,500–4,000 tokens | ~ About 4,800–5,500 tokens (higher due to tool payloads) |
| Best for | × Known codebase, expert who understands the system | ✓ Unknown codebase, onboarding, production incidents, audit requirements |
All 25 MCP Tools
The full arsenal. Most sessions use 3–5 tools. Reach for the right one at the right depth.
gather_context (Investigate)
Scans runtime, dependencies, git history, and project type to orient the investigation.
+ Copilot reads files you point to. AgentProbe discovers runtime, dependency graph, and git changes automatically.
detect_error_pattern (Investigate)
Matches error message against known failure patterns with remediation guidance.
+ Copilot always produces a suggestion. AgentProbe gives confidence-bounded "no match" signals too.
decompose_error (Investigate)
Breaks a compound error into atomic sub-problems with individual investigation paths.
+ Produces a structured decomposition you can step through one sub-problem at a time.
trace_call_chain (Investigate)
Walks the full call stack annotating each frame with what it mutates or reads.
+ Tracks across thread boundaries and interface dispatch where manual reading loses the trail.
trace_value_flow (Investigate)
Traces a variable from assignment through every read/write site with file+line evidence.
+ Returns structured evidence: "assigned line 27, read in lambda line 30", so the race window is explicit.
explain_logic (Investigate)
Produces a structured logic map: preconditions, postconditions, branches, side effects.
+ Enumerable branches, flagged side effects, and targeted Socratic questions.
find_breaking_change (Investigate)
Scans recent commits to find the specific change that introduced the regression.
+ Fetches and correlates diffs automatically, often surfacing the culprit commit in seconds.
validate_hypothesis (Validate)
Takes a plain-English hypothesis and returns a hard true/false with file+line citations.
+ Returns a structured verdict with explicit evidence tokens, refutable and persistable.
check_logic_guards (Validate)
Scans a file for missing input guards, unchecked nulls, and invariant gaps.
+ Proactively enumerates every guard gap without prompting.
check_suggestion (Validate)
Evaluates a proposed code change against known violation patterns before the file is touched.
+ Pre-screens every fix: the file is never edited unless the safety check passes.
generate_logic_guard (Validate)
Generates defensive guard code tailored to specific identified risk points.
+ Guards target specific identified gaps, not generic templates.
capture_value_snapshot (Validate)
Records the value of a variable at a specific execution point for comparison across runs.
+ Bridges static reasoning and live behaviour; Copilot has no runtime access.
plan_breakpoints (Debug)
Produces a prioritised breakpoint list with conditional expressions ready for your debugger.
+ Returns a ranked list with full conditional expressions; no further reasoning needed.
agentprobe_debug (Debug)
Full AI-assisted debug cycle: hypothesis generation, evidence search, structured finding.
+ Structures the cycle into discrete inspectable phases: gather, hypothesise, validate, conclude.
attach_and_inspect (Debug)
Attaches to a running process and inspects heap, thread state, and live variables.
+ Live runtime introspection that no static LLM can replicate from source alone.
list_debuggable_processes (Debug)
Enumerates all running processes that expose a debug port.
+ Surfaces debug context automatically; no manual jps or ps needed.
request_debug_session (Debug)
Opens a structured session with problem statement, scope, and expected outcome.
+ Forces a crisp problem statement: what is in scope, who owns it, and what "done" looks like.
create_repro (Debug)
Generates a minimal reproducible test case for the identified bug.
+ Derives the repro from the structured diagnosis, so it tests the exact failure path that was confirmed.
save_debug_session (Knowledge)
Persists the full session as structured JSON to .agentprobe/sessions/.
+ When chat closes, the diagnosis survives: searchable, auditable, reusable.
search_past_sessions (Knowledge)
Full-text and keyword search across all saved sessions.
+ Teams accumulate institutional debugging knowledge that compounds over time.
publish_summary (Knowledge)
Pushes a sanitised session summary to a shared team index.
+ Broadcasts structured knowledge to the whole team automatically.
post_to_fixlore (Knowledge)
Submits the fix pattern to the FixLore community knowledge base.
+ Your solution actively prevents the same bug for someone else tomorrow.
summarize_session (Knowledge)
Produces a consistent human-readable summary for incident reports or PR descriptions.
+ Derived from the session schema, so it is complete and consistent across different authors.
should_i_keep_trying (Guidance)
Evaluates investigation state and advises whether to keep digging, pivot, or escalate.
+ Gives an explicit evidence-based inflection point instead of cycling endlessly.
search_solutions (Guidance)
Searches a curated index of validated fixes filtered by language, framework, and error type.
+ Curated, validated fixes from real production incidents, not undated training data.
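Under the hood, an MCP client invokes any of the tools above by sending a JSON-RPC 2.0 request with the method `tools/call`, a tool `name`, and an `arguments` object. The sketch below builds such a request by hand to show the shape of the message. The argument name `hypothesis` is an assumption made for illustration, since the tools' input schemas are not listed on this page, and `McpToolCall` is a hypothetical helper; in practice Copilot constructs these requests for you.

```java
// Illustrative sketch of an MCP "tools/call" request; argument names are assumptions.
final class McpToolCall {
    /** Wraps a value in quotes, escaping backslashes, quotes, and control characters for JSON. */
    static String jsonString(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
                }
            }
        }
        return sb.append('"').toString();
    }

    /** Builds a tools/call request that passes a single string argument to the named tool. */
    static String toolsCall(int id, String tool, String argName, String argValue) {
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id
                + ",\"method\":\"tools/call\",\"params\":{\"name\":" + jsonString(tool)
                + ",\"arguments\":{" + jsonString(argName) + ":" + jsonString(argValue) + "}}}";
    }

    public static void main(String[] args) {
        System.out.println(toolsCall(1, "validate_hypothesis",
                "hypothesis", "sharedTimerStats aliases all timers"));
    }
}
```

A real client would use a JSON library and the tool's published input schema rather than string concatenation; the hand-built version is only meant to make the wire format visible.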
Try it on your next stuck bug
Two-minute install. Works inside VS Code with GitHub Copilot.