VS Code Extension Live on Marketplace

Give your AI agent
a real debugger

Empowering agents with structured debugging

25 MCP tools for GitHub Copilot: trace variables, validate hypotheses, and safety-check fixes before touching a single line of code, across JavaScript/TypeScript, Python, and Java.

25 MCP Tools
30+ Error Patterns
80% Rule Match Rate
0 Extra API Cost

See It in Action

Watch AgentProbe guide Copilot through a real multi-bug diagnosis

[Demo video: AgentProbeDemo.mp4]

What it actually does

Six diagnostic capabilities that add structured verification to how Copilot investigates bugs — each one filling a specific gap.

Evidence Before Action

validate_hypothesis returns a hard true/false with file+line evidence tokens before a single character is changed.

Pre-flight Safety Check

check_suggestion scans each proposed fix against logic guards and known violation patterns before the edit lands.

Exact Variable Tracing

trace_value_flow traces a variable from assignment through every read/write — surfacing the exact race window.

Team-Searchable Audit Trail

Sessions saved with root cause, fix, evidence, and keywords. Searchable by any teammate months later.

Failing Fast on Scope

detect_error_pattern returns explicit "no match" signals — no time wasted chasing patterns that do not apply.

You Stay in Control

Direct the tools — "trace this, validate that, don't fix yet" — and decide when to act on the evidence.

Language & debug mode support

Three languages with full rule sets, version-aware context, and a dedicated debug protocol adapter for each runtime.

JS/TS
Full support

JavaScript & TypeScript

Debugger: CDP (Chrome DevTools Protocol)

Frameworks

Node.js, React, Express, Next.js

100+ error patterns

  • Null / undefined access
  • Async / Promise errors
  • Type mismatches
  • Module resolution
  • Memory & stack
  • Syntax errors
PY
Full support

Python

Debugger: DAP (Debug Adapter Protocol)

Frameworks

Django, Flask, FastAPI, Pandas, NumPy

30+ error patterns

  • None / NoneType errors
  • AttributeError / KeyError
  • ImportError / NameError
  • Async event loop errors
  • Django / ORM exceptions
  • Pandas / NumPy type errors
JV
Full support

Java

Debugger: JDWP (Java Debug Wire Protocol)

Frameworks

Spring Boot, JPA / Hibernate, EJB, JBoss / WildFly

15+ error patterns

  • NullPointerException
  • ClassNotFoundException
  • LazyInitializationException
  • OutOfMemoryError (heap + metaspace)
  • ClassCastException
  • EJB / container errors

On the roadmap: Go, .NET / C#, Rust

Installation Guide

Get AgentProbe running in your editor in under 2 minutes

Prerequisites

Node.js 18+

Required for VS Code extension & CLI

Java 17+

Required for IntelliJ plugin

VS Code 1.85+

With GitHub Copilot enabled

IntelliJ IDEA 2024.3+

Community or Ultimate edition

Live on VS Code Marketplace

Get Started in 2 Steps

1. Install from Marketplace

Open VS Code → Extensions (Ctrl+Shift+X, or Cmd+Shift+X on macOS) → Search “AgentProbe” → Click Install

2. Start the MCP Server

Open the Command Palette (Ctrl+Shift+P, or Cmd+Shift+P on macOS) → Run “AgentProbe: Start MCP Server”

That’s it. AgentProbe is now available as a tool inside GitHub Copilot Chat.

Quick Reference

Ctrl+Shift+D: Debug selected text
Alt+Shift+D: Analyze clipboard
Ctrl+Shift+P (Cmd+Shift+P on macOS): Command palette
Sidebar: AgentProbe panel

Take the MetricRegistry bug from the demo. Every timer reported the same average — 8.23ms — regardless of how long each pipeline step actually took. Here is what finding it looks like, with and without AgentProbe.

Copilot alone: reads and reasons
// Copilot opens MetricRegistry.java
// and reads through the logic manually.

// Parameter types (String, double) are assumed; the demo omits them.
void recordTimer(String key, double ms) {
  DoubleSummaryStatistics stats =
    timers.computeIfAbsent(
      key,
      k -> sharedTimerStats  // suspects aliasing
    );
  stats.accept(ms);
}

// Forms a hypothesis from reading alone.
// No verification; edits immediately.

Copilot + AgentProbe: trace then verify
// Step 1: trace_value_flow("sharedTimerStats")
→ assigned : MetricRegistry.java line 12
→ aliased read : line 20 (every key)
→ all keys map to the same object

// Step 2: validate_hypothesis(
//   "sharedTimerStats aliases all timers")
→ hypothesis_valid : true
→ evidence : MetricRegistry.java:20

// Step 3: check_suggestion(fix)
→ safe : true, no violations

// Now edit with hard evidence, not a guess.

The fix is identical either way — replace sharedTimerStats with new DoubleSummaryStatistics() per key. The difference is certainty. AgentProbe gives machine-verified evidence and a safety check before a single line is changed.
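As a rough illustration of that fix, here is a minimal sketch of a registry that creates one DoubleSummaryStatistics per key. The class name FixedMetricRegistry, the averageMs helper, the parameter types, and the choice of ConcurrentHashMap are assumptions made for this example; only recordTimer, computeIfAbsent, and DoubleSummaryStatistics come from the demo snippet.

```java
import java.util.DoubleSummaryStatistics;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical reconstruction of the demo registry with the fix applied.
class FixedMetricRegistry {
    private final Map<String, DoubleSummaryStatistics> timers = new ConcurrentHashMap<>();

    void recordTimer(String key, double ms) {
        // Each key gets its own statistics object on first use,
        // instead of every key sharing sharedTimerStats.
        DoubleSummaryStatistics stats =
            timers.computeIfAbsent(key, k -> new DoubleSummaryStatistics());
        // DoubleSummaryStatistics is not thread-safe, so guard the update.
        synchronized (stats) {
            stats.accept(ms);
        }
    }

    // Assumed helper for reading a timer back; returns 0.0 for unknown keys.
    double averageMs(String key) {
        DoubleSummaryStatistics stats = timers.get(key);
        if (stats == null) {
            return 0.0;
        }
        synchronized (stats) {
            return stats.getAverage();
        }
    }
}

class FixedMetricRegistryDemo {
    public static void main(String[] args) {
        FixedMetricRegistry registry = new FixedMetricRegistry();
        registry.recordTimer("parse", 2.0);
        registry.recordTimer("parse", 4.0);
        registry.recordTimer("render", 10.0);
        System.out.println(registry.averageMs("parse"));  // 3.0
        System.out.println(registry.averageMs("render")); // 10.0
    }
}
```

With per-key instances, each timer reports its own average rather than one shared figure.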

Debug sessions that don’t disappear

Root cause, fix, evidence, keywords — saved as structured JSON, searchable by anyone on the team

{
  "root_cause": "sharedTimerStats singleton aliased to all timer keys",
  "file":       "MetricRegistry.java",
  "evidence_line": 20,
  "fix_applied": "new DoubleSummaryStatistics() per key",
  "keywords":   ["DoubleSummaryStatistics", "alias", "metrics", "timer"],
  "timestamp":  "2026-03-28T14:22:00Z"
}
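
To make "searchable" concrete, the following sketch shows one way a keyword lookup over records shaped like the JSON above could work. It is not AgentProbe's implementation: the DebugSession record, the SessionSearch class, and the case-insensitive substring matching are all assumptions for illustration; only the field names come from the JSON.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical in-memory shape; field names mirror the JSON session above.
record DebugSession(String rootCause, String file, int evidenceLine,
                    String fixApplied, List<String> keywords, String timestamp) {}

class SessionSearch {
    // Returns sessions whose root cause or keywords contain the query,
    // compared case-insensitively.
    static List<DebugSession> search(List<DebugSession> sessions, String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        return sessions.stream()
            .filter(s -> s.rootCause().toLowerCase(Locale.ROOT).contains(needle)
                || s.keywords().stream()
                       .anyMatch(k -> k.toLowerCase(Locale.ROOT).contains(needle)))
            .toList();
    }

    public static void main(String[] args) {
        DebugSession session = new DebugSession(
            "sharedTimerStats singleton aliased to all timer keys",
            "MetricRegistry.java", 20,
            "new DoubleSummaryStatistics() per key",
            List.of("DoubleSummaryStatistics", "alias", "metrics", "timer"),
            "2026-03-28T14:22:00Z");
        for (DebugSession hit : search(List.of(session), "alias")) {
            System.out.println(hit.file() + ":" + hit.evidenceLine() + " " + hit.rootCause());
        }
    }
}
```

A query such as "alias" or "timer" returns the saved session along with its file and evidence line.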
🔍

Searchable Knowledge Base

A new engineer hits the same bug and gets the full diagnosis in seconds.

📋

Built-in Audit Trail

Auditable record of what was found, what evidence supported it, and what was verified safe.

🚀

Faster Incident Response

On-call engineers search sessions by symptom. A 5-minute investigation becomes a 30-second lookup.

🎓

Junior Dev Acceleration

Sessions written in plain language with file and line references — learn the pattern, not just the fix.

Side-by-Side Comparison

Copilot working alone vs. Copilot with AgentProbe

Dimension | Copilot Alone | Copilot + AgentProbe
How bugs are found | Read code, reason manually, form hypothesis | trace_value_flow pinpoints write→read windows with line numbers
Hypothesis confidence | Based on expertise alone; could be wrong | validate_hypothesis returns true/false + evidence before editing
Safety before edit | None; hypothesis immediately becomes a code change | check_suggestion flags violations before any file is touched
Post-fix artifact | Just a diff; no record of why or what was ruled out | Structured session: root cause, fix, evidence, keywords, timestamp
Team reuse | Next developer starts from scratch | Searchable by keyword; same bug surfaces the full diagnostic in seconds
Speed | ~2 min; fewer round-trips | ~5 min; tool round-trips add latency but add certainty
Token cost | ~3,500–4,000 tokens | ~4,800–5,500 tokens; higher due to tool payloads
Best for | Known codebase, expert who understands the system | Unknown codebase, onboarding, production incidents, audit requirements

All 25 MCP Tools

The full arsenal. Most sessions use 3–5 tools. Reach for the right one at the right depth.

gather_context (Investigate)

Scans runtime, dependencies, git history, and project type to orient the investigation.

+ Copilot reads files you point to. AgentProbe discovers runtime, dependency graph, and git changes automatically.

detect_error_pattern (Investigate)

Matches error message against known failure patterns with remediation guidance.

+ Copilot always produces a suggestion. AgentProbe gives confidence-bounded "no match" signals too.
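
The idea of an explicit "no match" can be sketched with a small rule table that returns an empty Optional when nothing applies. This is an illustrative sketch only: the ErrorPatternMatcher class, its three regex rules, and the Optional return type are assumptions, not AgentProbe's actual rule set or response format.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

class ErrorPatternMatcher {
    // Hypothetical rule table: pattern name -> regex over the error message.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("NullPointerException",
            Pattern.compile("java\\.lang\\.NullPointerException"));
        RULES.put("LazyInitializationException",
            Pattern.compile("LazyInitializationException"));
        RULES.put("OutOfMemoryError",
            Pattern.compile("OutOfMemoryError: (Java heap space|Metaspace)"));
    }

    // Returns the first matching rule name; Optional.empty() is the
    // explicit "no match" signal rather than a best-effort guess.
    static Optional<String> detect(String errorMessage) {
        return RULES.entrySet().stream()
            .filter(rule -> rule.getValue().matcher(errorMessage).find())
            .map(Map.Entry::getKey)
            .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(detect("java.lang.OutOfMemoryError: Metaspace").orElse("no match"));
        System.out.println(detect("Connection reset by peer").orElse("no match"));
    }
}
```

The first call prints the matched rule name and the second prints "no match", so the caller can stop chasing a pattern that does not apply.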

decompose_error (Investigate)

Breaks a compound error into atomic sub-problems with individual investigation paths.

+ Produces a structured decomposition you can step through one sub-problem at a time.

trace_call_chain (Investigate)

Walks the full call stack annotating each frame with what it mutates or reads.

+ Tracks across thread boundaries and interface dispatch where manual reading loses the trail.

trace_value_flow (Investigate)

Traces a variable from assignment through every read/write site with file+line evidence.

+ Returns structured evidence: "assigned line 27, read in lambda line 30" — the race window is explicit.

explain_logic (Investigate)

Produces a structured logic map: preconditions, postconditions, branches, side effects.

+ Enumerable branches, flagged side effects, and targeted Socratic questions.

find_breaking_change (Investigate)

Scans recent commits to find the specific change that introduced the regression.

+ Fetches and correlates diffs automatically — often surfacing the culprit commit in seconds.

validate_hypothesis (Validate)

Takes a plain-English hypothesis and returns a hard true/false with file+line citations.

+ Returns a structured verdict with explicit evidence tokens — refutable and persistable.

check_logic_guards (Validate)

Scans a file for missing input guards, unchecked nulls, and invariant gaps.

+ Proactively enumerates every guard gap without prompting.

check_suggestion (Validate)

Evaluates a proposed code change against known violation patterns before the file is touched.

+ Pre-screens every fix — the file is never edited unless the safety check passes.
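
The "validate, then check, then edit" ordering can be pictured as a simple gate. The sketch below is an assumption about how a caller might combine the two results: the HypothesisVerdict and SuggestionCheck records and the EditGate class are hypothetical, with field names loosely following the transcript output shown earlier (hypothesis_valid, evidence, safe).

```java
import java.util.List;

// Hypothetical result shapes; not AgentProbe's actual response schema.
record HypothesisVerdict(boolean hypothesisValid, String evidence) {}
record SuggestionCheck(boolean safe, List<String> violations) {}

class EditGate {
    // An edit proceeds only when the hypothesis is confirmed with evidence
    // and the proposed fix passed the safety check without violations.
    static boolean mayEdit(HypothesisVerdict verdict, SuggestionCheck check) {
        boolean hasEvidence = verdict.evidence() != null && !verdict.evidence().isBlank();
        return verdict.hypothesisValid() && hasEvidence
            && check.safe() && check.violations().isEmpty();
    }

    public static void main(String[] args) {
        HypothesisVerdict verdict = new HypothesisVerdict(true, "MetricRegistry.java:20");
        SuggestionCheck clean = new SuggestionCheck(true, List.of());
        SuggestionCheck flagged = new SuggestionCheck(false, List.of("hypothetical violation"));
        System.out.println(mayEdit(verdict, clean));   // true
        System.out.println(mayEdit(verdict, flagged)); // false
    }
}
```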

generate_logic_guard (Validate)

Generates defensive guard code tailored to specific identified risk points.

+ Guards target specific identified gaps, not generic templates.

capture_value_snapshot (Validate)

Records the value of a variable at a specific execution point for comparison across runs.

+ Bridges static reasoning and live behaviour — Copilot has no runtime access.

plan_breakpoints (Debug)

Produces a prioritised breakpoint list with conditional expressions ready for your debugger.

+ Returns a ranked list with full conditional expressions, so no further reasoning is needed.

agentprobe_debug (Debug)

Full AI-assisted debug cycle: hypothesis generation, evidence search, structured finding.

+ Structures the cycle into discrete inspectable phases: gather, hypothesise, validate, conclude.

attach_and_inspect (Debug)

Attaches to a running process and inspects heap, thread state, and live variables.

+ Live runtime introspection that no static LLM can replicate from source alone.

list_debuggable_processes (Debug)

Enumerates all running processes that expose a debug port.

+ Surfaces debug context automatically — no manual jps or ps needed.

request_debug_session (Debug)

Opens a structured session with problem statement, scope, and expected outcome.

+ Forces a crisp problem statement: what is in scope, who owns it, and what "done" looks like.

create_repro (Debug)

Generates a minimal reproducible test case for the identified bug.

+ Derives repro from structured diagnosis — tests the exact failure path that was confirmed.
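
For the MetricRegistry bug from the demo, a minimal repro might look like the sketch below. It is hand-written for illustration, not output from create_repro: the AliasedRegistry class, the averageMs and sameInstance helpers, and the parameter types are assumptions; the shared sharedTimerStats instance mirrors the buggy snippet.

```java
import java.util.DoubleSummaryStatistics;
import java.util.HashMap;
import java.util.Map;

// Hypothetical repro of the aliasing bug: one shared statistics object
// is handed to every key, as in the demo snippet.
class AliasedRegistry {
    private final DoubleSummaryStatistics sharedTimerStats = new DoubleSummaryStatistics();
    private final Map<String, DoubleSummaryStatistics> timers = new HashMap<>();

    void recordTimer(String key, double ms) {
        timers.computeIfAbsent(key, k -> sharedTimerStats).accept(ms);
    }

    double averageMs(String key) {
        return timers.get(key).getAverage();
    }

    boolean sameInstance(String a, String b) {
        return timers.get(a) == timers.get(b);
    }
}

class AliasedRegistryRepro {
    public static void main(String[] args) {
        AliasedRegistry registry = new AliasedRegistry();
        registry.recordTimer("parse", 2.0);
        registry.recordTimer("render", 10.0);
        // Both keys point at the same object, so both report the same average.
        System.out.println(registry.sameInstance("parse", "render")); // true
        System.out.println(registry.averageMs("parse"));              // 6.0
        System.out.println(registry.averageMs("render"));             // 6.0
    }
}
```

The repro fails in the same way the demo describes: two timers with different inputs report an identical average.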

save_debug_session (Knowledge)

Persists the full session as structured JSON to .agentprobe/sessions/.

+ When the chat closes, the diagnosis survives: searchable, auditable, reusable.

search_past_sessions (Knowledge)

Full-text and keyword search across all saved sessions.

+ Teams accumulate institutional debugging knowledge that compounds over time.

publish_summary (Knowledge)

Pushes a sanitised session summary to a shared team index.

+ Broadcasts structured knowledge to the whole team automatically.

post_to_fixlore (Knowledge)

Submits the fix pattern to the FixLore community knowledge base.

+ Your solution actively prevents the same bug for someone else tomorrow.

summarize_session (Knowledge)

Produces a consistent, human-readable summary for incident reports or PR descriptions.

+ Derived from session schema — complete and consistent across different authors.

should_i_keep_trying (Guidance)

Evaluates investigation state and advises whether to keep digging, pivot, or escalate.

+ Gives an explicit, evidence-based inflection point instead of cycling endlessly.

search_solutions (Guidance)

Searches a curated index of validated fixes filtered by language, framework, and error type.

+ Curated validated fixes from real production incidents — not undated training data.

Try it on your next stuck bug

Two-minute install. Works inside VS Code with GitHub Copilot.