Benchmarks

Nella is tested across three areas: prompt injection defense, codebase search quality, and context tracking accuracy.

Prompt Injection Defense

Nella reduces the prompt injection attack success rate to 4.4%, compared with 26.7% without protection.

Metric              | With Nella | Without Protection
Attack Success Rate | 4.4%       | 26.7%
Detection Rate      | 95.6%      |

Tested against 750 injection samples across multiple attack categories including instruction override, role manipulation, and data exfiltration attempts.
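
To make the two headline metrics concrete, here is a minimal sketch of how an attack success rate and a detection rate could be computed from per-sample outcomes. The `InjectionResult` type, the `make_results` helper, and the synthetic data are hypothetical and are not part of the Nella benchmark package. The counts of 33 and 200 are back-calculated from the reported percentages on the assumption that both rates are taken over all 750 samples. Because the reported detection rate and attack success rate sum to 100%, the sketch also assumes a sample counts as detected exactly when the attack did not succeed.

```python
from dataclasses import dataclass


@dataclass
class InjectionResult:
    """Outcome of one injection sample (hypothetical structure)."""
    category: str
    attack_succeeded: bool


def attack_success_rate(results):
    """Fraction of samples where the injected instruction took effect."""
    if not results:
        raise ValueError("no results to score")
    return sum(r.attack_succeeded for r in results) / len(results)


def make_results(total, successes):
    """Build synthetic outcomes: `successes` successful attacks out of `total`."""
    return [InjectionResult("synthetic", i < successes) for i in range(total)]


# Counts back-calculated from the reported percentages (an assumption):
# 33 / 750 = 4.4% and 200 / 750 = 26.7% after rounding.
with_nella = make_results(750, 33)
without_protection = make_results(750, 200)

asr_protected = attack_success_rate(with_nella)
asr_baseline = attack_success_rate(without_protection)
detection_rate = 1 - asr_protected  # assumes detected == attack did not succeed
relative_reduction = 1 - asr_protected / asr_baseline

print(f"ASR with Nella:     {asr_protected:.1%}")
print(f"ASR without:        {asr_baseline:.1%}")
print(f"Detection rate:     {detection_rate:.1%}")
print(f"Relative reduction: {relative_reduction:.1%}")
```

Under these assumptions the drop from 26.7% to 4.4% is a relative reduction of roughly 83.5% in successful attacks.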

Codebase Search Quality

Metric               | Result
Overall Recall       | 86.2%
Mean Reciprocal Rank | 0.85
Median Latency       | 1.3 ms

Recall by query type:

Query Type      | Recall
Function Lookup | 100%
Concept Search  | 89%
Bug Description | 85%
Cross-File      | 62%
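
For readers unfamiliar with the search metrics above, the following sketch shows the standard definitions of recall at k and mean reciprocal rank, plus a per-query-type breakdown. It is an illustration only: the toy queries, file names, and the cutoff `K = 3` are made up, and the page does not say which cutoff or relevance labels the Nella benchmark uses.

```python
from collections import defaultdict
from statistics import mean


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is returned."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant_ids:
        raise ValueError("query has no relevant items")
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


# Toy queries, made up for illustration: (query type, ranked results, relevant set)
queries = [
    ("Function Lookup", ["a.py", "b.py", "c.py"], {"a.py"}),
    ("Concept Search", ["d.py", "e.py", "f.py"], {"e.py"}),
    ("Cross-File", ["g.py", "h.py", "i.py"], {"h.py", "z.py"}),
    ("Cross-File", ["j.py", "k.py", "l.py"], {"m.py"}),
]

K = 3  # hypothetical cutoff
by_type = defaultdict(list)
for query_type, ranked, relevant in queries:
    by_type[query_type].append(recall_at_k(ranked, relevant, K))

recall_by_type = {t: mean(values) for t, values in by_type.items()}
overall_recall = mean(recall_at_k(r, rel, K) for _, r, rel in queries)
mrr = mean(reciprocal_rank(r, rel) for _, r, rel in queries)

print(recall_by_type)
print(f"overall recall = {overall_recall:.3f}, MRR = {mrr:.3f}")
```

In this toy data the per-query average (0.625) differs from the unweighted mean of the per-type values (0.75). The reported figures show the same kind of gap: the plain average of the four per-type recalls is 84.0%, so the 86.2% overall figure presumably weights each type by its number of queries.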

Context Tracking

Metric                 | Result
Assumption Accuracy    | 100%
Invalidation Detection | 100%
False Positive Rate    | 0%

Tested across 30 scenarios including file changes, dependency updates, schema modifications, and configuration changes.
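
As a rough illustration of how metrics like these can be scored, the sketch below treats each scenario as a binary decision: should the tracked assumption have been invalidated, and did the tracker flag it? The `Scenario` type, the `score` function, and the metric definitions (accuracy as agreement with ground truth, invalidation detection as recall on true invalidations, false positive rate as wrongly flagged valid assumptions) are assumptions made for this example and are not taken from the Nella benchmark package.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """One context-tracking scenario (hypothetical structure)."""
    name: str
    should_invalidate: bool  # ground truth: did the change break the assumption?
    flagged_invalid: bool    # what the tracker reported


def score(scenarios):
    """Score tracker verdicts against ground truth using a confusion matrix."""
    tp = sum(s.should_invalidate and s.flagged_invalid for s in scenarios)
    fn = sum(s.should_invalidate and not s.flagged_invalid for s in scenarios)
    fp = sum(not s.should_invalidate and s.flagged_invalid for s in scenarios)
    tn = sum(not s.should_invalidate and not s.flagged_invalid for s in scenarios)
    return {
        "accuracy": (tp + tn) / len(scenarios),
        "invalidation_detection": tp / (tp + fn) if tp + fn else float("nan"),
        "false_positive_rate": fp / (fp + tn) if fp + tn else float("nan"),
    }


# Made-up scenarios in which the tracker gets every verdict right.
scenarios = [
    Scenario("file change", should_invalidate=True, flagged_invalid=True),
    Scenario("dependency update", should_invalidate=True, flagged_invalid=True),
    Scenario("schema modification", should_invalidate=True, flagged_invalid=True),
    Scenario("unrelated configuration change", should_invalidate=False, flagged_invalid=False),
]

metrics = score(scenarios)
print(metrics)
```

With every verdict correct, the sketch yields accuracy and invalidation detection of 1.0 and a false positive rate of 0.0, which is the pattern the reported results describe.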

Methodology

Benchmarks use the Nella benchmark suite with controlled test scenarios. Results are reproducible using the open-source benchmark package.