POST /api/compare

Runs the same threat-classification prompt against two independent LLMs (Groq Llama 3.1 and Tencent Hunyuan) in parallel. Returns both verdicts side by side with agreement metrics so operators can assess model consensus before acting.

Endpoint

POST /api/compare

Runtime: Node.js
Auth: Not required

Request

Body

{
  "txHash": "0x8f2a9aac...",
  "protocol": "MantleSwap",
  "threatType": "Reentrancy"
}

Parameter	Type	Required	Description
`txHash`	`string`	No	Transaction hash for context
`protocol`	`string`	No	Protocol name
`threatType`	`string`	No	Detection signature
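
Because every field is optional, a caller or handler may want to tolerate missing or malformed values. The following is a minimal sketch of one way to do that; `CompareRequest` and `parseCompareRequest` are hypothetical names, not part of the documented API.

```typescript
// Hypothetical request shape, mirroring the parameter table above.
interface CompareRequest {
  txHash?: string;
  protocol?: string;
  threatType?: string;
}

// Keep only the documented fields, and only when they are non-empty strings.
function parseCompareRequest(body: unknown): CompareRequest {
  const out: CompareRequest = {};
  if (typeof body !== 'object' || body === null) return out;
  const record = body as Record<string, unknown>;
  for (const field of ['txHash', 'protocol', 'threatType'] as const) {
    const value = record[field];
    if (typeof value === 'string' && value.trim() !== '') {
      out[field] = value.trim();
    }
  }
  return out;
}
```

Unknown keys and non-string values are dropped silently in this sketch; a stricter handler could reject them instead.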

Response

Success (200)

{
  "groq": {
    "model": "llama-3.1-8b-instant",
    "provider": "Groq",
    "confidence": 0.92,
    "severity": "CRITICAL",
    "recommendation": "Pause protocol",
    "reasoning": "Transaction exhibits recursive external call pattern with state mutation after transfer, consistent with reentrancy.",
    "latencyMs": 387,
    "source": "live"
  },
  "hunyuan": {
    "model": "hunyuan-lite",
    "provider": "Hunyuan",
    "confidence": 0.88,
    "severity": "CRITICAL",
    "recommendation": "Pause protocol",
    "reasoning": "The transaction shows a reentrancy attack pattern where the contract state is modified after an external call.",
    "latencyMs": 742,
    "source": "live"
  },
  "agreement": true,
  "consensusConfidence": 0.9
}

Response Fields

Field	Type	Description
`groq`	`ModelVerdict`	Groq (Llama 3.1-8B-Instant) verdict
`hunyuan`	`ModelVerdict`	Hunyuan (hunyuan-lite) verdict
`agreement`	`boolean`	Whether both models agree on severity level
`consensusConfidence`	`number`	Average of both confidence scores

ModelVerdict Fields

Field	Type	Description
`model`	`string`	Model identifier
`provider`	`string`	Provider name (`Groq` or `Hunyuan`)
`confidence`	`number`	0.0 – 1.0
`severity`	`string`	`CRITICAL`, `HIGH`, or `MEDIUM`
`recommendation`	`string`	`Pause protocol`, `Alert operators`, `Monitor only`, or `Multisig review`
`reasoning`	`string`	1-2 sentence explanation
`latencyMs`	`number`	Model response time in milliseconds
`source`	`string`	`live` (real model response) or `fallback` (API key missing or call failed)
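
The two field tables above can be expressed as TypeScript types. This is a sketch derived from the tables only; the narrowed union types and the `isModelVerdict` guard are assumptions added for illustration, and the actual source may type these fields more loosely (for example as plain `string`).

```typescript
const SEVERITIES = ['CRITICAL', 'HIGH', 'MEDIUM'] as const;
const RECOMMENDATIONS = [
  'Pause protocol',
  'Alert operators',
  'Monitor only',
  'Multisig review',
] as const;

interface ModelVerdict {
  model: string;
  provider: 'Groq' | 'Hunyuan';
  confidence: number; // 0.0 to 1.0
  severity: (typeof SEVERITIES)[number];
  recommendation: (typeof RECOMMENDATIONS)[number];
  reasoning: string;
  latencyMs: number;
  source: 'live' | 'fallback';
}

interface CompareResponse {
  groq: ModelVerdict;
  hunyuan: ModelVerdict;
  agreement: boolean;
  consensusConfidence: number;
}

// Hypothetical runtime guard for clients that do not trust the payload shape.
function isModelVerdict(value: unknown): value is ModelVerdict {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.model === 'string' &&
    (v.provider === 'Groq' || v.provider === 'Hunyuan') &&
    typeof v.confidence === 'number' &&
    v.confidence >= 0 &&
    v.confidence <= 1 &&
    SEVERITIES.some((s) => s === v.severity) &&
    RECOMMENDATIONS.some((r) => r === v.recommendation) &&
    typeof v.reasoning === 'string' &&
    typeof v.latencyMs === 'number' &&
    (v.source === 'live' || v.source === 'fallback')
  );
}
```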

How It Works

Parallel Execution

Both models are called simultaneously using Promise.all:

import { NextResponse } from 'next/server';

export async function POST(request: Request) {
  const { txHash, protocol, threatType } = await request.json();
  const prompt = buildUserPrompt(protocol, txHash, threatType);

  // Parallel -- total latency is max(groq, hunyuan), not sum
  const [groq, hunyuan] = await Promise.all([
    callGroq(prompt),
    callHunyuan(prompt)
  ]);

  // Agreement compares severity only; confidence is averaged separately
  const agreement = groq.severity === hunyuan.severity;
  const consensusConfidence = (groq.confidence + hunyuan.confidence) / 2;

  return NextResponse.json({ groq, hunyuan, agreement, consensusConfidence });
}
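
To see why the total latency tracks the slower call rather than the sum, the following sketch replaces the two model calls with timers. `fakeCall` and `timeParallel` are hypothetical stand-ins and the delays are arbitrary.

```typescript
// Stand-in for a model call: resolves with `label` after `ms` milliseconds.
function fakeCall(label: string, ms: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve(label), ms));
}

async function timeParallel(): Promise<{ results: string[]; elapsedMs: number }> {
  const start = Date.now();
  // Both timers start immediately; Promise.all resolves once the slower one finishes.
  const results = await Promise.all([
    fakeCall('groq', 40),
    fakeCall('hunyuan', 100),
  ]);
  return { results, elapsedMs: Date.now() - start };
}
```

`Promise.all` returns results in the order the promises were passed, not the order they completed, so the destructuring into `[groq, hunyuan]` is safe even when Hunyuan responds first. It also rejects as soon as either promise rejects, which is why each call needs to handle its own failures.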

Groq Call

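async function callGroq(prompt: string): Promise<ModelVerdict> {
  const start = Date.now();
  const key = process.env.GROQ_API_KEY; // env var name is an assumption
  // No key configured: return the fallback verdict (see Fallback)
  if (!key) return fallbackVerdict('llama-3.1-8b-instant', 'Groq', Date.now() - start);
  try {
    const res = await fetch('https://api.groq.com/openai/v1/chat/completions', {
      method: 'POST',
      headers: { Authorization: `Bearer ${key}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: 'llama-3.1-8b-instant',
        messages: [
          { role: 'system', content: SYSTEM_PROMPT },
          { role: 'user', content: prompt },
        ],
        temperature: 0.2,
        max_tokens: 250,
        response_format: { type: 'json_object' },  // Groq supports structured output
      }),
    });
    if (!res.ok) throw new Error(`Groq HTTP ${res.status}`);
    const data = await res.json();
    const parsed = safeParse(data.choices?.[0]?.message?.content ?? '{}');
    return normalize('llama-3.1-8b-instant', 'Groq', parsed, Date.now() - start);
  } catch {
    // Network or HTTP failure: fall back instead of failing the whole comparison
    return fallbackVerdict('llama-3.1-8b-instant', 'Groq', Date.now() - start);
  }
}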

Hunyuan Call

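async function callHunyuan(prompt: string): Promise<ModelVerdict> {
  const start = Date.now();
  const key = process.env.HUNYUAN_API_KEY; // env var name is an assumption
  // No key configured: return the fallback verdict (see Fallback)
  if (!key) return fallbackVerdict('hunyuan-lite', 'Hunyuan', Date.now() - start);
  try {
    const res = await fetch('https://api.hunyuan.cloud.tencent.com/v1/chat/completions', {
      method: 'POST',
      headers: { Authorization: `Bearer ${key}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: 'hunyuan-lite',
        messages: [
          { role: 'system', content: SYSTEM_PROMPT },
          { role: 'user', content: prompt },
        ],
        temperature: 0.2,
        max_tokens: 250,
        // No response_format -- Hunyuan may wrap output in markdown code fences
      }),
    });
    if (!res.ok) throw new Error(`Hunyuan HTTP ${res.status}`);
    const data = await res.json();
    const parsed = safeParse(data.choices?.[0]?.message?.content ?? '{}');
    return normalize('hunyuan-lite', 'Hunyuan', parsed, Date.now() - start);
  } catch {
    // Network or HTTP failure: fall back instead of failing the whole comparison
    return fallbackVerdict('hunyuan-lite', 'Hunyuan', Date.now() - start);
  }
}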
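
Both calls pass the parsed model output through `normalize`, which this page does not show. The sketch below is one plausible shape for it, not the actual implementation: it clamps `confidence` to the documented 0.0 to 1.0 range and rejects values outside the documented enums. The defaults chosen for invalid input (`MEDIUM`, `Monitor only`, confidence `0`) are assumptions.

```typescript
const SEVERITIES = ['CRITICAL', 'HIGH', 'MEDIUM'] as const;
const RECOMMENDATIONS = [
  'Pause protocol',
  'Alert operators',
  'Monitor only',
  'Multisig review',
] as const;

interface ModelVerdict {
  model: string;
  provider: string;
  confidence: number;
  severity: (typeof SEVERITIES)[number];
  recommendation: (typeof RECOMMENDATIONS)[number];
  reasoning: string;
  latencyMs: number;
  source: 'live' | 'fallback';
}

// Return `value` if it is one of the allowed strings, otherwise `fallback`.
function pick<T extends string>(allowed: readonly T[], value: unknown, fallback: T): T {
  return allowed.some((a) => a === value) ? (value as T) : fallback;
}

// Hypothetical normalize(): coerce untrusted model output into a ModelVerdict.
function normalize(
  model: string,
  provider: string,
  parsed: Record<string, unknown>,
  latencyMs: number,
): ModelVerdict {
  const raw = Number(parsed.confidence); // NaN for undefined or non-numeric input
  const confidence = isFinite(raw) ? Math.min(1, Math.max(0, raw)) : 0;
  return {
    model,
    provider,
    confidence,
    severity: pick(SEVERITIES, parsed.severity, 'MEDIUM'), // assumed default
    recommendation: pick(RECOMMENDATIONS, parsed.recommendation, 'Monitor only'), // assumed default
    reasoning: typeof parsed.reasoning === 'string' ? parsed.reasoning : '',
    latencyMs,
    source: 'live',
  };
}
```

Defaulting to the least severe values means a malformed response cannot trigger a pause on its own. Matching is case-sensitive here, so a model that returns `critical` is treated as invalid.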

Safe JSON Parsing

Hunyuan does not support response_format: { type: 'json_object' } and may wrap output in markdown fences. The safeParse function handles this:

function safeParse(content: string): Record<string, unknown> {
  try {
    return JSON.parse(content);
  } catch {
    // Retry on the span from the first "{" to the last "}", which drops
    // markdown code fences and any prose around the JSON object
    const match = content.match(/\{[\s\S]*\}/);
    if (match) {
      try { return JSON.parse(match[0]); } catch {}
    }
    return {};
  }
}
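
The following usage sketch runs `safeParse` (repeated here so the example is self-contained) against three kinds of model output: plain JSON, JSON wrapped in a markdown fence, and a reply with no JSON at all. The sample strings are made up for illustration.

```typescript
function safeParse(content: string): Record<string, unknown> {
  try {
    return JSON.parse(content);
  } catch {
    // Retry on the span from the first "{" to the last "}"
    const match = content.match(/\{[\s\S]*\}/);
    if (match) {
      try { return JSON.parse(match[0]); } catch {}
    }
    return {};
  }
}

// Three backticks, written as escapes so this sample does not break its own fence.
const fence = '\u0060\u0060\u0060';
const fencedReply = `${fence}json\n{"severity":"HIGH","confidence":0.7}\n${fence}`;

const fromPlain = safeParse('{"severity":"CRITICAL"}');
const fromFenced = safeParse(fencedReply);
const fromProse = safeParse('Sorry, I cannot help with that.');

console.log(fromPlain.severity);            // CRITICAL
console.log(fromFenced.severity);           // HIGH
console.log(Object.keys(fromProse).length); // 0
```

The pattern is greedy, so it spans from the first `{` to the last `}` in the reply. If the model appends prose containing a stray `}` after the JSON object, the second parse fails and the function returns an empty object.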

Fallback

If a model's API key is not configured or the call fails, that model returns a fallback verdict:

function fallbackVerdict(model: string, provider: string, latencyMs: number): ModelVerdict {
  return {
    model,
    provider,
    confidence: 0.9,
    severity: 'CRITICAL',
    recommendation: 'Pause protocol',
    reasoning: 'Transaction exhibits a recursive external-call pattern with state mutation after transfer, consistent with reentrancy.',
    latencyMs,
    source: 'fallback',
  };
}

If both models fall back, agreement is true and consensusConfidence is 0.9.

Interpreting Results

Scenario	`agreement`	Interpretation
Both CRITICAL, high confidence	`true`	Strong signal -- both AIs independently agree
One CRITICAL, one HIGH	`false`	Models disagree on severity -- investigate further
One CRITICAL, one MEDIUM	`false`	Significant disagreement -- likely false positive
Both fallback	`true`	Keys missing or both calls failed -- canned fallback data, not a model signal

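Guideline: Only act on incidents where both models agree on CRITICAL severity with confidence > 0.85. Fallback verdicts are hard-coded to CRITICAL at 0.9 confidence and would pass that check, so also confirm that both verdicts report `source: live`. When models disagree, treat the incident as requiring manual investigation.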
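
The guideline can be encoded as a small predicate. This is a sketch under two assumptions: that "confidence > 0.85" applies to each model individually rather than to `consensusConfidence`, and that fallback verdicts should never count as a signal, since they are hard-coded to CRITICAL at 0.9. `shouldAct` and the trimmed-down types are hypothetical.

```typescript
// Only the fields the guideline needs.
interface VerdictLike {
  confidence: number;
  severity: string;
  source: string;
}

interface CompareLike {
  groq: VerdictLike;
  hunyuan: VerdictLike;
  agreement: boolean;
}

// True only when both live models independently report CRITICAL above the threshold.
function shouldAct(result: CompareLike, threshold = 0.85): boolean {
  const verdicts = [result.groq, result.hunyuan];
  return (
    result.agreement &&
    verdicts.every(
      (v) =>
        v.source === 'live' &&
        v.severity === 'CRITICAL' &&
        v.confidence > threshold,
    )
  );
}
```

Anything that returns `false` here, including disagreement and fallback data, falls under manual investigation rather than automatic action.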

Example: cURL

curl -X POST http://localhost:3000/api/compare \
  -H "Content-Type: application/json" \
  -d '{
    "txHash": "0x8f2a9aac9e3a4d5b6c7d8e9f0a1b2c3d4e5f6a7b8",
    "protocol": "TargetVault",
    "threatType": "Reentrancy"
  }'
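
The same request can be made from TypeScript. The sketch below takes the fetch implementation as a parameter so it can be exercised without a network; `compareThreat` and `FetchLike` are hypothetical names, and the error handling is generic because this page documents only the 200 response.

```typescript
// Minimal structural type covering the parts of fetch used here.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface CompareRequest {
  txHash?: string;
  protocol?: string;
  threatType?: string;
}

// POST the request body to /api/compare and return the decoded JSON response.
async function compareThreat(
  fetchFn: FetchLike,
  baseUrl: string,
  req: CompareRequest,
): Promise<unknown> {
  const res = await fetchFn(`${baseUrl}/api/compare`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(req),
  });
  if (!res.ok) throw new Error(`compare failed: HTTP ${res.status}`);
  return res.json();
}
```

In a runtime with a global `fetch`, a call would look like `compareThreat(fetch, 'http://localhost:3000', { protocol: 'TargetVault', threatType: 'Reentrancy' })`.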

Performance

Stage	Typical Latency
Groq call	200-600ms
Hunyuan call	500-1200ms
Parallel execution	500-1200ms (max of the two)
Fallback (no keys)	<1ms
Total (with both AIs)	500-1200ms

Next Steps

Analyze Threat -- Single-model classification (faster, simpler)
Audit Contract -- Bytecode-level security analysis
