TwinBench Results

Nullalis local openended race — 2026-03-25 — TwinBench v0.2
76/100
Production-Grade
Verified Reference Artifact
Verified raw: 90.9 | Coverage: 84% | Projected: 87.6

Interpretation

This artifact is strong enough to compare publicly. Use the verified score for evidence-backed comparison and the projected score only as a clearly labeled estimate.

Dimension Breakdown

| Dimension | Status | Reason Code | Weight | Verified | Projected | Coverage | V Weighted | P Weighted |
|---|---|---|---|---|---|---|---|---|
| Autonomy Control | measured | | 0.15 | 95 | 95 | 100% | 14.25 | 14.25 |
| Memory Persistence | partially_measured | | 0.15 | 100 | 100 | 70% | 10.50 | 15.00 |
| Functional Capability | measured | | 0.15 | 100 | 100 | 100% | 15.00 | 15.00 |
| Autonomous Execution | measured | | 0.12 | 100 | 100 | 100% | 12.00 | 12.00 |
| Cross-Channel Consistency | partially_measured | | 0.12 | 93 | 93 | 70% | 7.80 | 11.15 |
| Integration Breadth | measured | | 0.08 | 54 | 54 | 100% | 4.32 | 4.32 |
| Security & Privacy | partially_measured | | 0.08 | 75 | 81 | 60% | 3.60 | 6.48 |
| Scale & Cost Efficiency | partially_measured | | 0.05 | 10 | 8 | 20% | 0.10 | 0.38 |
| Operational Resilience | partially_measured | | 0.05 | 100 | 90 | 75% | 3.75 | 4.50 |
| Latency Profile | measured | | 0.05 | 91 | 91 | 100% | 4.56 | 4.56 |
| Verified Composite | | | 1.00 | 90.9 | | 84% | 75.9 | |
| Projected Composite | | | 1.00 | | 87.6 | | | 87.6 |
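
The composite figures can be re-derived from the per-dimension rows. The sketch below is a reader-side check, not the TwinBench scoring code. It assumes formulas inferred from the published numbers rather than stated by the benchmark: V Weighted = weight × verified × coverage, P Weighted = weight × projected, overall coverage = the weight-averaged coverage, and verified raw = verified composite ÷ overall coverage. The names `ROWS` and `composites` are hypothetical.

```python
# Rows transcribed from the Dimension Breakdown table:
# (dimension, weight, verified, projected, coverage as a fraction)
ROWS = [
    ("Autonomy Control",          0.15,  95,  95, 1.00),
    ("Memory Persistence",        0.15, 100, 100, 0.70),
    ("Functional Capability",     0.15, 100, 100, 1.00),
    ("Autonomous Execution",      0.12, 100, 100, 1.00),
    ("Cross-Channel Consistency", 0.12,  93,  93, 0.70),
    ("Integration Breadth",       0.08,  54,  54, 1.00),
    ("Security & Privacy",        0.08,  75,  81, 0.60),
    ("Scale & Cost Efficiency",   0.05,  10,   8, 0.20),
    ("Operational Resilience",    0.05, 100,  90, 0.75),
    ("Latency Profile",           0.05,  91,  91, 1.00),
]


def composites(rows):
    """Return (verified, projected, coverage, verified_raw).

    The formulas are assumptions inferred from the table, not a
    documented TwinBench specification.
    """
    # Verified score is discounted by how much of the dimension was measured.
    verified = sum(w * v * c for _, w, v, _, c in rows)
    # Projected score takes the projected value at full weight.
    projected = sum(w * p for _, w, _, p, _ in rows)
    # Overall coverage is the weight-averaged per-dimension coverage.
    coverage = sum(w * c for _, w, _, _, c in rows)
    # Verified raw normalizes the verified score by the covered weight.
    verified_raw = verified / coverage
    return verified, projected, coverage, verified_raw


verified, projected, coverage, verified_raw = composites(ROWS)
print(f"verified composite:  {verified:.2f}")
print(f"projected composite: {projected:.2f}")
print(f"overall coverage:    {coverage:.1%}")
print(f"verified raw:        {verified_raw:.2f}")
```

Under these assumptions the results land within about 0.1 point of the published 75.9, 87.6, and 90.9, and within one percentage point of the published 84% coverage. The small gaps are consistent with the table showing per-dimension scores rounded to whole numbers.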