POST /api/compare

Runs the same threat-classification prompt against two independent LLMs (Groq Llama 3.1 and Tencent Hunyuan) in parallel. Returns both verdicts side by side with agreement metrics so operators can assess model consensus before acting.


Endpoint

POST /api/compare

Runtime: Node.js
Auth: Not required


Request

Body

{
  "txHash": "0x8f2a9aac...",
  "protocol": "MantleSwap",
  "threatType": "Reentrancy"
}

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| txHash | string | No | Transaction hash for context |
| protocol | string | No | Protocol name |
| threatType | string | No | Detection signature |
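
Because every field is optional, a handler has to tolerate missing or non-string values. The following is a minimal sketch of one way to sanitize the body; `CompareRequest` and `parseCompareRequest` are hypothetical names introduced for illustration and are not part of the documented API.

```typescript
// Hypothetical request shape: all three documented fields are optional strings.
interface CompareRequest {
  txHash?: string;
  protocol?: string;
  threatType?: string;
}

// Keep only the documented fields, and only when they are strings.
// Anything else (wrong types, unknown keys, non-object bodies) is dropped.
function parseCompareRequest(body: unknown): CompareRequest {
  const out: CompareRequest = {};
  if (typeof body !== 'object' || body === null) return out;
  const record = body as Record<string, unknown>;
  for (const field of ['txHash', 'protocol', 'threatType'] as const) {
    const value = record[field];
    if (typeof value === 'string') out[field] = value;
  }
  return out;
}
```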

Response

Success (200)

{
  "groq": {
    "model": "llama-3.1-8b-instant",
    "provider": "Groq",
    "confidence": 0.92,
    "severity": "CRITICAL",
    "recommendation": "Pause protocol",
    "reasoning": "Transaction exhibits recursive external call pattern with state mutation after transfer, consistent with reentrancy.",
    "latencyMs": 387,
    "source": "live"
  },
  "hunyuan": {
    "model": "hunyuan-lite",
    "provider": "Hunyuan",
    "confidence": 0.88,
    "severity": "CRITICAL",
    "recommendation": "Pause protocol",
    "reasoning": "The transaction shows a reentrancy attack pattern where the contract state is modified after an external call.",
    "latencyMs": 742,
    "source": "live"
  },
  "agreement": true,
  "consensusConfidence": 0.9
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| groq | ModelVerdict | Groq (Llama 3.1-8B-Instant) verdict |
| hunyuan | ModelVerdict | Hunyuan (hunyuan-lite) verdict |
| agreement | boolean | Whether both models agree on severity level |
| consensusConfidence | number | Average of both confidence scores |

ModelVerdict Fields

| Field | Type | Description |
| --- | --- | --- |
| model | string | Model identifier |
| provider | string | Provider name (Groq or Hunyuan) |
| confidence | number | 0.0 to 1.0 |
| severity | string | CRITICAL, HIGH, or MEDIUM |
| recommendation | string | Pause protocol, Alert operators, Monitor only, or Multisig review |
| reasoning | string | One- to two-sentence explanation |
| latencyMs | number | Model response time in milliseconds |
| source | string | live (real model response) or fallback (API key not configured, or the call failed) |
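
The two field tables above can be written down as TypeScript types. This is a sketch derived from the tables, not the service's actual type definitions; the `isModelVerdict` guard is a hypothetical helper that a client might use to check a response before trusting it.

```typescript
interface ModelVerdict {
  model: string;
  provider: string;
  confidence: number; // 0.0 to 1.0
  severity: 'CRITICAL' | 'HIGH' | 'MEDIUM';
  recommendation: string;
  reasoning: string;
  latencyMs: number;
  source: 'live' | 'fallback';
}

interface CompareResponse {
  groq: ModelVerdict;
  hunyuan: ModelVerdict;
  agreement: boolean;
  consensusConfidence: number;
}

const SEVERITIES: readonly string[] = ['CRITICAL', 'HIGH', 'MEDIUM'];

// Runtime check that an unknown value has the documented ModelVerdict shape.
function isModelVerdict(value: unknown): value is ModelVerdict {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const c = v['confidence'];
  const s = v['severity'];
  return (
    typeof v['model'] === 'string' &&
    typeof v['provider'] === 'string' &&
    typeof c === 'number' && c >= 0 && c <= 1 &&
    typeof s === 'string' && SEVERITIES.indexOf(s) !== -1 &&
    typeof v['recommendation'] === 'string' &&
    typeof v['reasoning'] === 'string' &&
    typeof v['latencyMs'] === 'number' &&
    (v['source'] === 'live' || v['source'] === 'fallback')
  );
}
```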

How It Works

Parallel Execution

Both models are called simultaneously using Promise.all:

import { NextResponse } from 'next/server';

export async function POST(request: Request) {
  // All three fields are optional, so any of them may be undefined
  const { txHash, protocol, threatType } = await request.json();
  const prompt = buildUserPrompt(protocol, txHash, threatType);

  // Parallel: total latency is max(groq, hunyuan), not the sum
  const [groq, hunyuan] = await Promise.all([
    callGroq(prompt),
    callHunyuan(prompt),
  ]);

  // Agreement compares severity only; consensus is the mean of the two confidences
  const agreement = groq.severity === hunyuan.severity;
  const consensusConfidence = (groq.confidence + hunyuan.confidence) / 2;

  return NextResponse.json({ groq, hunyuan, agreement, consensusConfidence });
}
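
To illustrate why the handler's latency tracks the slower call rather than the sum, here is a small self-contained sketch that replaces the two model calls with timer-based stubs. `stubCall` and `runBoth` are hypothetical names, and the 10 ms and 20 ms delays are arbitrary stand-ins for real network latency.

```typescript
// Resolve with `name` after `ms` milliseconds, recording start and end events.
function stubCall(name: string, ms: number, log: string[]): Promise<string> {
  log.push(`${name}:start`);
  return new Promise((resolve) => {
    setTimeout(() => {
      log.push(`${name}:end`);
      resolve(name);
    }, ms);
  });
}

// Both stubs are started before either result is awaited, so their delays overlap.
// Promise.all preserves input order in its result array regardless of finish order.
async function runBoth(log: string[]): Promise<string[]> {
  return Promise.all([
    stubCall('groq', 10, log),
    stubCall('hunyuan', 20, log),
  ]);
}
```

With this arrangement the event log begins with both start events before any end event, which is what makes the total wait roughly equal to the longer delay.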

Groq Call

async function callGroq(prompt: string): Promise<ModelVerdict> {
  // `key` is the Groq API key, loaded from configuration (not shown here)
  const start = Date.now();
  const res = await fetch('https://api.groq.com/openai/v1/chat/completions', {
    method: 'POST',
    headers: { Authorization: `Bearer ${key}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'llama-3.1-8b-instant',
      messages: [
        { role: 'system', content: SYSTEM_PROMPT },
        { role: 'user', content: prompt },
      ],
      temperature: 0.2,
      max_tokens: 250,
      response_format: { type: 'json_object' }, // Groq supports structured output
    }),
  });
  // Decode the JSON response body before reading the first choice
  const data = await res.json();
  const parsed = safeParse(data.choices?.[0]?.message?.content ?? '{}');
  return normalize('llama-3.1-8b-instant', 'Groq', parsed, Date.now() - start);
}

Hunyuan Call

async function callHunyuan(prompt: string): Promise<ModelVerdict> {
  // `key` is the Hunyuan API key, loaded from configuration (not shown here)
  const start = Date.now();
  const res = await fetch('https://api.hunyuan.cloud.tencent.com/v1/chat/completions', {
    method: 'POST',
    headers: { Authorization: `Bearer ${key}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'hunyuan-lite',
      messages: [
        { role: 'system', content: SYSTEM_PROMPT },
        { role: 'user', content: prompt },
      ],
      temperature: 0.2,
      max_tokens: 250,
      // No response_format: Hunyuan may wrap output in ```json fences
    }),
  });
  // Decode the JSON response body before reading the first choice
  const data = await res.json();
  const parsed = safeParse(data.choices?.[0]?.message?.content ?? '{}');
  return normalize('hunyuan-lite', 'Hunyuan', parsed, Date.now() - start);
}
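
Both call functions pass the parsed object through `normalize`, which this page does not show. The sketch below is one plausible shape for such a function, assuming its job is to coerce untrusted model output into a `ModelVerdict`. The defaults chosen here (0 confidence, MEDIUM, Monitor only, empty reasoning) are illustrative assumptions, not the service's actual behavior.

```typescript
type Severity = 'CRITICAL' | 'HIGH' | 'MEDIUM';

interface ModelVerdict {
  model: string;
  provider: string;
  confidence: number;
  severity: Severity;
  recommendation: string;
  reasoning: string;
  latencyMs: number;
  source: 'live' | 'fallback';
}

const SEVERITIES: Severity[] = ['CRITICAL', 'HIGH', 'MEDIUM'];
const RECOMMENDATIONS = ['Pause protocol', 'Alert operators', 'Monitor only', 'Multisig review'];

// Hypothetical normalize: every field is validated and given a conservative default.
function normalize(
  model: string,
  provider: string,
  parsed: Record<string, unknown>,
  latencyMs: number,
): ModelVerdict {
  // Clamp confidence into [0, 1]; anything non-numeric becomes 0 (assumed default).
  const c = parsed['confidence'];
  const confidence = typeof c === 'number' && Number.isFinite(c) ? Math.min(1, Math.max(0, c)) : 0;

  // Accept severity case-insensitively; unknown values become MEDIUM (assumed default).
  const s = String(parsed['severity']).toUpperCase();
  const severity = SEVERITIES.find((x) => x === s) ?? 'MEDIUM';

  // Only the four documented recommendations are accepted.
  const r = parsed['recommendation'];
  const recommendation = typeof r === 'string' && RECOMMENDATIONS.indexOf(r) !== -1 ? r : 'Monitor only';

  const why = parsed['reasoning'];
  const reasoning = typeof why === 'string' ? why : '';

  return { model, provider, confidence, severity, recommendation, reasoning, latencyMs, source: 'live' };
}
```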

Safe JSON Parsing

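Hunyuan does not support response_format: { type: 'json_object' } and may wrap output in markdown fences. The safeParse function handles this by first attempting a direct JSON.parse and, if that fails, retrying on the span between the first { and the last } in the content. If both attempts fail, it returns an empty object: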

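function safeParse(content: string): Record<string, unknown> {
  try {
    return JSON.parse(content);
  } catch {
    // Retry on the span from the first '{' to the last '}',
    // which drops markdown code fences and surrounding prose
    const match = content.match(/\{[\s\S]*\}/);
    if (match) {
      try { return JSON.parse(match[0]); } catch {}
    }
    // Nothing parseable: return an empty object for the caller to normalize
    return {};
  }
}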
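
The following sketch exercises safeParse on the three kinds of input it is meant to handle. The function is repeated so the example is self-contained; the sample strings are made up for illustration, and the fence marker is built with repeat() only to avoid embedding a literal code fence in the example.

```typescript
function safeParse(content: string): Record<string, unknown> {
  try {
    return JSON.parse(content);
  } catch {
    const match = content.match(/\{[\s\S]*\}/);
    if (match) {
      try { return JSON.parse(match[0]); } catch {}
    }
    return {};
  }
}

const FENCE = '`'.repeat(3); // three backticks

// 1. Clean JSON parses on the first attempt.
const clean = safeParse('{"severity":"HIGH"}');

// 2. Fenced JSON fails the first attempt, then the brace-to-brace span parses.
const fenced = safeParse(FENCE + 'json\n{"severity":"HIGH"}\n' + FENCE);

// 3. Content with no JSON object at all yields an empty object.
const garbage = safeParse('I cannot classify this transaction.');
```

One limitation of this approach: the regular expression is greedy, so if the surrounding prose itself contains braces, the extracted span is not valid JSON and the result is an empty object.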

Fallback

If a model's API key is not configured or the call fails, that model returns a fallback verdict:

function fallbackVerdict(model: string, provider: string, latencyMs: number): ModelVerdict {
  // Canned verdict: the values below are fixed and do not reflect the input
  return {
    model,
    provider,
    confidence: 0.9,
    severity: 'CRITICAL',
    recommendation: 'Pause protocol',
    reasoning: 'Transaction exhibits a recursive external-call pattern with state mutation after transfer, consistent with reentrancy.',
    latencyMs,
    source: 'fallback',
  };
}

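If both models fall back, the two verdicts carry identical values, so agreement is true and consensusConfidence is 0.9. Check the source field of each verdict to distinguish this case from genuine model consensus.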
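
Because Promise.all rejects as soon as any input promise rejects, each model call has to resolve to a fallback verdict rather than throw. A minimal sketch of one way to arrange that follows; `withFallback` is a hypothetical helper, not a function from the service.

```typescript
// Run `call`; if it throws or rejects, resolve with `makeFallback(elapsedMs)` instead.
// The returned promise never rejects unless `makeFallback` itself throws.
async function withFallback<T>(
  call: () => Promise<T>,
  makeFallback: (latencyMs: number) => T,
): Promise<T> {
  const start = Date.now();
  try {
    return await call();
  } catch {
    return makeFallback(Date.now() - start);
  }
}

// Usage (hypothetical), where fetchGroqVerdict stands for the network part of callGroq:
//   withFallback(
//     () => fetchGroqVerdict(prompt),
//     (ms) => fallbackVerdict('llama-3.1-8b-instant', 'Groq', ms),
//   );
```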


Interpreting Results

| Scenario | agreement | Interpretation |
| --- | --- | --- |
| Both CRITICAL, high confidence | true | Strong signal: both models independently agree |
| One CRITICAL, one HIGH | false | Models disagree on severity; investigate further |
| One CRITICAL, one MEDIUM | false | Significant disagreement; likely false positive |
| Both fallback | true | No API keys configured or both calls failed; fallback data only |

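Guideline: Only act on incidents where both models agree on CRITICAL severity with confidence > 0.85 and both verdicts report source: live. Fallback verdicts also report CRITICAL at 0.9 confidence, so severity and confidence alone do not prove that a model evaluated the transaction. When models disagree, treat the incident as requiring manual investigation.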
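
The guideline can be expressed as a small decision function. This is a sketch under two assumptions that the guideline does not spell out: the 0.85 threshold is applied to each model's confidence individually, and fallback verdicts never qualify (they report CRITICAL at 0.9 confidence without any model having run). `decide`, `VerdictSummary`, and `CompareResult` are hypothetical names.

```typescript
// Only the fields the decision needs; a full ModelVerdict also satisfies this shape.
interface VerdictSummary {
  severity: string;
  confidence: number;
  source: string;
}

interface CompareResult {
  groq: VerdictSummary;
  hunyuan: VerdictSummary;
  agreement: boolean;
}

type Decision = 'act' | 'investigate';

function decide(result: CompareResult, threshold = 0.85): Decision {
  // Assumption: fallback data must never trigger an automated action.
  const bothLive = result.groq.source === 'live' && result.hunyuan.source === 'live';
  const bothCritical =
    result.agreement &&
    result.groq.severity === 'CRITICAL' &&
    result.hunyuan.severity === 'CRITICAL';
  // Assumption: each model must clear the threshold on its own.
  const bothConfident =
    result.groq.confidence > threshold && result.hunyuan.confidence > threshold;
  return bothLive && bothCritical && bothConfident ? 'act' : 'investigate';
}
```

Applied to the sample response earlier on this page (0.92 and 0.88, both CRITICAL, both live), this returns 'act'; a response built from two fallback verdicts returns 'investigate'.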


Example: cURL

curl -X POST http://localhost:3000/api/compare \
  -H "Content-Type: application/json" \
  -d '{
    "txHash": "0x8f2a9aac9e3a4d5b6c7d8e9f0a1b2c3d4e5f6a7b8",
    "protocol": "TargetVault",
    "threatType": "Reentrancy"
  }'
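
The same request can be made from TypeScript. The sketch below takes the fetch implementation as a parameter so it can be exercised with a stub; `compareIncident`, `FetchLike`, and `CompareRequest` are hypothetical names, and the error handling for non-2xx responses is an assumption, since this page documents only the 200 case.

```typescript
interface CompareRequest {
  txHash?: string;
  protocol?: string;
  threatType?: string;
}

// Minimal structural type covering the parts of fetch this sketch uses.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// POST the incident to /api/compare and return the decoded JSON body.
async function compareIncident(
  fetchImpl: FetchLike,
  baseUrl: string,
  body: CompareRequest,
): Promise<unknown> {
  const res = await fetchImpl(`${baseUrl}/api/compare`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`POST /api/compare failed with status ${res.status}`);
  }
  return res.json();
}

// Usage, in a runtime with a global fetch (for example Node.js 18 or later):
//   compareIncident(fetch, 'http://localhost:3000', {
//     protocol: 'TargetVault',
//     threatType: 'Reentrancy',
//   }).then(console.log);
```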

Performance

| Stage | Typical Latency |
| --- | --- |
| Groq call | 200-600 ms |
| Hunyuan call | 500-1200 ms |
| Parallel execution | 500-1200 ms (max of the two) |
| Fallback (no keys) | <1 ms |
| Total (with both AIs) | 500-1200 ms |

Next Steps