POST /api/compare
Runs the same threat-classification prompt against two independent LLMs (Groq Llama 3.1 and Tencent Hunyuan) in parallel. Returns both verdicts side by side with agreement metrics so operators can assess model consensus before acting.
Endpoint
POST /api/compare
Runtime: Node.js
Auth: Not required
Request
Body
{
  "txHash": "0x8f2a9aac...",
  "protocol": "MantleSwap",
  "threatType": "Reentrancy"
}
| Parameter | Type | Required | Description |
|---|---|---|---|
| txHash | string | No | Transaction hash for context |
| protocol | string | No | Protocol name |
| threatType | string | No | Detection signature |
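Because all three fields are optional, a handler may want to tolerate missing or non-string values before building the prompt. The following is a minimal sketch of that idea, not the endpoint's actual validation; `CompareRequest` and `parseCompareRequest` are hypothetical names introduced for illustration.

```typescript
// Hypothetical request shape, mirroring the parameter table above.
interface CompareRequest {
  txHash?: string;
  protocol?: string;
  threatType?: string;
}

// Keep only fields that are non-empty strings; drop everything else.
function parseCompareRequest(body: unknown): CompareRequest {
  const out: CompareRequest = {};
  if (typeof body !== 'object' || body === null) return out;
  const src = body as Record<string, unknown>;
  const fields: Array<keyof CompareRequest> = ['txHash', 'protocol', 'threatType'];
  for (const field of fields) {
    const value = src[field];
    if (typeof value === 'string' && value.trim() !== '') {
      out[field] = value.trim();
    }
  }
  return out;
}
```

Unknown keys and wrongly typed values are ignored rather than rejected, which matches the "No" in the Required column; a stricter service could return a 400 instead.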
Response
Success (200)
{
  "groq": {
    "model": "llama-3.1-8b-instant",
    "provider": "Groq",
    "confidence": 0.92,
    "severity": "CRITICAL",
    "recommendation": "Pause protocol",
    "reasoning": "Transaction exhibits recursive external call pattern with state mutation after transfer, consistent with reentrancy.",
    "latencyMs": 387,
    "source": "live"
  },
  "hunyuan": {
    "model": "hunyuan-lite",
    "provider": "Hunyuan",
    "confidence": 0.88,
    "severity": "CRITICAL",
    "recommendation": "Pause protocol",
    "reasoning": "The transaction shows a reentrancy attack pattern where the contract state is modified after an external call.",
    "latencyMs": 742,
    "source": "live"
  },
  "agreement": true,
  "consensusConfidence": 0.9
}
Response Fields
| Field | Type | Description |
|---|---|---|
| groq | ModelVerdict | Groq (Llama 3.1-8B-Instant) verdict |
| hunyuan | ModelVerdict | Hunyuan (hunyuan-lite) verdict |
| agreement | boolean | Whether both models returned the same severity level |
| consensusConfidence | number | Average of both confidence scores |
ModelVerdict Fields
| Field | Type | Description |
|---|---|---|
| model | string | Model identifier |
| provider | string | Provider name (Groq or Hunyuan) |
| confidence | number | 0.0 – 1.0 |
| severity | string | CRITICAL, HIGH, or MEDIUM |
| recommendation | string | Pause protocol, Alert operators, Monitor only, or Multisig review |
| reasoning | string | 1-2 sentence explanation |
| latencyMs | number | Model response time in milliseconds |
| source | string | live (real AI response) or fallback (API key missing or call failed) |
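The two field tables can be written down as TypeScript types, which is useful for clients that consume the response. This is a sketch derived from the tables above: only the name `ModelVerdict` appears in this document, while `Severity`, `Source`, `CompareResponse`, and the `isModelVerdict` guard are assumed names.

```typescript
type Severity = 'CRITICAL' | 'HIGH' | 'MEDIUM';
type Source = 'live' | 'fallback';

interface ModelVerdict {
  model: string;
  provider: string;
  confidence: number; // 0.0 to 1.0
  severity: Severity;
  recommendation: string;
  reasoning: string;
  latencyMs: number;
  source: Source;
}

interface CompareResponse {
  groq: ModelVerdict;
  hunyuan: ModelVerdict;
  agreement: boolean;
  consensusConfidence: number;
}

// Runtime check for clients that receive a verdict as untyped JSON.
function isModelVerdict(value: unknown): value is ModelVerdict {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const severities = ['CRITICAL', 'HIGH', 'MEDIUM'];
  return (
    typeof v.model === 'string' &&
    typeof v.provider === 'string' &&
    typeof v.confidence === 'number' && v.confidence >= 0 && v.confidence <= 1 &&
    typeof v.severity === 'string' && severities.indexOf(v.severity) !== -1 &&
    typeof v.recommendation === 'string' &&
    typeof v.reasoning === 'string' &&
    typeof v.latencyMs === 'number' &&
    (v.source === 'live' || v.source === 'fallback')
  );
}
```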
How It Works
Parallel Execution
Both models are called simultaneously using Promise.all:
import { NextResponse } from 'next/server';

export async function POST(request: Request) {
  const { txHash, protocol, threatType } = await request.json();
  const prompt = buildUserPrompt(protocol, txHash, threatType);

  // Parallel -- total latency is max(groq, hunyuan), not sum
  const [groq, hunyuan] = await Promise.all([
    callGroq(prompt),
    callHunyuan(prompt),
  ]);

  // Agreement compares severity only; recommendations may still differ
  const agreement = groq.severity === hunyuan.severity;
  const consensusConfidence = (groq.confidence + hunyuan.confidence) / 2;

  return NextResponse.json({ groq, hunyuan, agreement, consensusConfidence });
}
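The agreement and consensus arithmetic in the handler can be isolated as a pure function, which makes it easy to unit-test without calling either provider. This is a sketch only; `VerdictSummary` and `combineVerdicts` are hypothetical names, and the inputs are taken from the sample response earlier on this page.

```typescript
// Only the two fields the comparison needs; the full ModelVerdict has more.
interface VerdictSummary {
  severity: string;
  confidence: number;
}

// Hypothetical helper: the same arithmetic as the handler, with no I/O.
function combineVerdicts(groq: VerdictSummary, hunyuan: VerdictSummary) {
  return {
    agreement: groq.severity === hunyuan.severity,
    consensusConfidence: (groq.confidence + hunyuan.confidence) / 2,
  };
}

const combined = combineVerdicts(
  { severity: 'CRITICAL', confidence: 0.92 },
  { severity: 'CRITICAL', confidence: 0.88 },
);
console.log(combined.agreement, combined.consensusConfidence.toFixed(2)); // prints: true 0.90
```

Because agreement looks at severity alone, two verdicts with the same severity but different recommendations still count as agreeing.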
Groq Call
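async function callGroq(prompt: string): Promise<ModelVerdict> {
  const start = Date.now();
  const key = process.env.GROQ_API_KEY; // env var name is an assumption
  // No key configured: return the fallback verdict without calling the API
  if (!key) return fallbackVerdict('llama-3.1-8b-instant', 'Groq', Date.now() - start);
  try {
    const res = await fetch('https://api.groq.com/openai/v1/chat/completions', {
      method: 'POST',
      headers: { Authorization: `Bearer ${key}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: 'llama-3.1-8b-instant',
        messages: [
          { role: 'system', content: SYSTEM_PROMPT },
          { role: 'user', content: prompt },
        ],
        temperature: 0.2,
        max_tokens: 250,
        response_format: { type: 'json_object' }, // Groq supports structured output
      }),
    });
    if (!res.ok) throw new Error(`Groq returned HTTP ${res.status}`);
    const data = await res.json();
    const parsed = safeParse(data.choices?.[0]?.message?.content ?? '{}');
    return normalize('llama-3.1-8b-instant', 'Groq', parsed, Date.now() - start);
  } catch {
    // Network error or non-2xx response: fall back so Promise.all never rejects
    return fallbackVerdict('llama-3.1-8b-instant', 'Groq', Date.now() - start);
  }
}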
Hunyuan Call
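async function callHunyuan(prompt: string): Promise<ModelVerdict> {
  const start = Date.now();
  const key = process.env.HUNYUAN_API_KEY; // env var name is an assumption
  // No key configured: return the fallback verdict without calling the API
  if (!key) return fallbackVerdict('hunyuan-lite', 'Hunyuan', Date.now() - start);
  try {
    const res = await fetch('https://api.hunyuan.cloud.tencent.com/v1/chat/completions', {
      method: 'POST',
      headers: { Authorization: `Bearer ${key}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: 'hunyuan-lite',
        messages: [
          { role: 'system', content: SYSTEM_PROMPT },
          { role: 'user', content: prompt },
        ],
        temperature: 0.2,
        max_tokens: 250,
        // No response_format -- Hunyuan may wrap output in ```json fences
      }),
    });
    if (!res.ok) throw new Error(`Hunyuan returned HTTP ${res.status}`);
    const data = await res.json();
    const parsed = safeParse(data.choices?.[0]?.message?.content ?? '{}');
    return normalize('hunyuan-lite', 'Hunyuan', parsed, Date.now() - start);
  } catch {
    // Network error or non-2xx response: fall back so Promise.all never rejects
    return fallbackVerdict('hunyuan-lite', 'Hunyuan', Date.now() - start);
  }
}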
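Both call functions pass the parsed object to `normalize`, which is not shown on this page. The sketch below is one plausible shape for it, not the actual implementation. It assumes that missing or out-of-range fields are coerced to defaults, and the defaults chosen here (`MEDIUM`, `Monitor only`, confidence 0.5) are invented for illustration.

```typescript
interface ModelVerdict {
  model: string;
  provider: string;
  confidence: number;
  severity: string;
  recommendation: string;
  reasoning: string;
  latencyMs: number;
  source: string;
}

const SEVERITIES = ['CRITICAL', 'HIGH', 'MEDIUM'];
const RECOMMENDATIONS = ['Pause protocol', 'Alert operators', 'Monitor only', 'Multisig review'];

// Return value if it is one of the allowed strings, otherwise the default.
function pick(value: unknown, allowed: string[], fallback: string): string {
  return typeof value === 'string' && allowed.indexOf(value) !== -1 ? value : fallback;
}

// Hypothetical normalize: coerce untrusted model output into a ModelVerdict.
function normalize(
  model: string,
  provider: string,
  parsed: Record<string, unknown>,
  latencyMs: number,
): ModelVerdict {
  const raw =
    typeof parsed.confidence === 'number' && isFinite(parsed.confidence)
      ? parsed.confidence
      : 0.5; // assumed default when the model omits confidence
  return {
    model,
    provider,
    confidence: Math.min(1, Math.max(0, raw)), // clamp into 0.0 to 1.0
    severity: pick(parsed.severity, SEVERITIES, 'MEDIUM'),
    recommendation: pick(parsed.recommendation, RECOMMENDATIONS, 'Monitor only'),
    reasoning: typeof parsed.reasoning === 'string' ? parsed.reasoning : '',
    latencyMs,
    source: 'live',
  };
}
```

Validating against the documented value sets matters most for Hunyuan, whose output is not constrained by `response_format`.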
Safe JSON Parsing
Hunyuan does not support response_format: { type: 'json_object' } and may wrap output in markdown fences. The safeParse function handles this:
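function safeParse(content: string): Record<string, unknown> {
  try {
    return JSON.parse(content);
  } catch {
    // Not valid JSON as-is: take everything from the first "{" to the last "}",
    // which drops markdown code fences or prose around the object
    const match = content.match(/\{[\s\S]*\}/);
    if (match) {
      try {
        return JSON.parse(match[0]);
      } catch {
        // Still unparseable: fall through to the empty object
      }
    }
    return {};
  }
}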
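To make the behavior concrete, the snippet below runs `safeParse` (reproduced from above so the example is self-contained) on three inputs: fenced JSON, JSON surrounded by prose, and text with no JSON at all. The input strings are made up for illustration.

```typescript
function safeParse(content: string): Record<string, unknown> {
  try {
    return JSON.parse(content);
  } catch {
    // Fall back to the span from the first "{" to the last "}"
    const match = content.match(/\{[\s\S]*\}/);
    if (match) {
      try {
        return JSON.parse(match[0]);
      } catch {
        // Still unparseable: fall through to the empty object
      }
    }
    return {};
  }
}

// Three backticks, assembled piecewise so this example stays fence-safe.
const fence = '`' + '`' + '`';
const fenced = fence + 'json\n{"severity":"HIGH","confidence":0.7}\n' + fence;
const chatty = 'Sure! Here is the verdict: {"severity":"MEDIUM"} Hope that helps.';
const garbage = 'I cannot classify this transaction.';

console.log(safeParse(fenced).severity);              // HIGH
console.log(safeParse(chatty).severity);              // MEDIUM
console.log(Object.keys(safeParse(garbage)).length);  // 0
```

The regex is greedy, so a reply containing a second pair of braces after the JSON object would not parse and would yield `{}`.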
Fallback
If a model's API key is not configured or the call fails, that model returns a fallback verdict:
function fallbackVerdict(model: string, provider: string, latencyMs: number): ModelVerdict {
  return {
    model,
    provider,
    confidence: 0.9,
    severity: 'CRITICAL',
    recommendation: 'Pause protocol',
    reasoning: 'Transaction exhibits a recursive external-call pattern with state mutation after transfer, consistent with reentrancy.',
    latencyMs,
    source: 'fallback',
  };
}
If both models fall back, agreement is true and consensusConfidence is 0.9.
Interpreting Results
| Scenario | agreement | Interpretation |
|---|---|---|
| Both CRITICAL, high confidence | true | Strong signal -- both AIs independently agree |
| One CRITICAL, one HIGH | false | Models disagree on severity -- investigate further |
| One CRITICAL, one MEDIUM | false | Significant disagreement -- likely false positive |
| Both fallback | true | No AI keys configured -- fallback data only |
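Guideline: Only act on incidents where both models agree on CRITICAL severity with confidence > 0.85. Check that both verdicts report source: live first, because two fallback verdicts also agree on CRITICAL at confidence 0.9. When models disagree, treat the incident as requiring manual investigation.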
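The guideline can be sketched as a predicate over the response. Two assumptions are baked in and are not stated by this page: the 0.85 threshold is applied to each model's own confidence rather than to `consensusConfidence`, and fallback verdicts are excluded because two fallbacks agree on CRITICAL by construction. `shouldAct` and the type names are hypothetical.

```typescript
interface Verdict {
  severity: string;
  confidence: number;
  source: string;
}

interface CompareResult {
  groq: Verdict;
  hunyuan: Verdict;
  agreement: boolean;
  consensusConfidence: number;
}

const MIN_CONFIDENCE = 0.85;

// True only when both live models independently report a confident CRITICAL.
function shouldAct(r: CompareResult): boolean {
  const bothLive = r.groq.source === 'live' && r.hunyuan.source === 'live';
  const bothCritical =
    r.agreement && r.groq.severity === 'CRITICAL' && r.hunyuan.severity === 'CRITICAL';
  const bothConfident =
    r.groq.confidence > MIN_CONFIDENCE && r.hunyuan.confidence > MIN_CONFIDENCE;
  return bothLive && bothCritical && bothConfident;
}
```

Anything that returns `false` here falls into the manual-investigation path rather than being dismissed.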
Example: cURL
curl -X POST http://localhost:3000/api/compare \
  -H "Content-Type: application/json" \
  -d '{
    "txHash": "0x8f2a9aac9e3a4d5b6c7d8e9f0a1b2c3d4e5f6a7b8",
    "protocol": "TargetVault",
    "threatType": "Reentrancy"
  }'
Performance
| Stage | Typical Latency |
|---|---|
| Groq call | 200-600ms |
| Hunyuan call | 500-1200ms |
| Parallel execution | 500-1200ms (max of the two) |
| Fallback (no keys) | <1ms |
| Total (with both AIs) | 500-1200ms |
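As a worked example of the "max of the two" row, the arithmetic below uses the latencies from the sample response earlier on this page (387 ms and 742 ms). It ignores request parsing and serialization overhead, so it is an approximation rather than a measurement.

```typescript
// Sample latencies from the example response (milliseconds).
const groqMs = 387;
const hunyuanMs = 742;

const parallelMs = Math.max(groqMs, hunyuanMs); // both calls in flight at once
const sequentialMs = groqMs + hunyuanMs;        // one call after the other

console.log(parallelMs, sequentialMs, sequentialMs - parallelMs); // prints: 742 1129 387
```

The saving from running in parallel equals the latency of the faster model, so the slower provider sets the floor for the endpoint.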
Next Steps
- Analyze Threat -- Single-model classification (faster, simpler)
- Audit Contract -- Bytecode-level security analysis