TwinBench Results

TwinBench Demo Runtime — 2026-03-25 — TwinBench v0.2
54/100
Emerging
Partial Reference Artifact
Verified raw: 79.0 | Coverage: 69% | Projected: 70.6

Interpretation

This artifact is strong enough to compare publicly. Use the verified score for evidence-backed comparison and the projected score only as a clearly labeled estimate.

Dimension Breakdown

| Dimension | Status | Reason Code | Weight | Verified | Projected | Coverage | V Weighted | P Weighted |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Autonomy Control | measured | | 0.15 | 95 | 95 | 100% | 14.25 | 14.25 |
| Memory Persistence | partially_measured | | 0.15 | 57 | 65 | 70% | 6.00 | 9.75 |
| Functional Capability | measured | | 0.15 | 77 | 77 | 100% | 11.55 | 11.55 |
| Autonomous Execution | partially_measured | | 0.12 | 77 | 65 | 65% | 6.00 | 7.80 |
| Cross-Channel Consistency | partially_measured | | 0.12 | 70 | 70 | 50% | 4.20 | 8.40 |
| Integration Breadth | unavailable | | 0.08 | 0 | 0 | 0% | 0.00 | 0.00 |
| Security & Privacy | partially_measured | | 0.08 | 92 | 91 | 60% | 4.40 | 7.28 |
| Scale & Cost Efficiency | partially_measured | multi_user_scale_measured_with_provisioned_subset | 0.05 | 100 | 77 | 20% | 1.00 | 3.85 |
| Operational Resilience | partially_measured | | 0.05 | 53 | 55 | 75% | 2.00 | 2.75 |
| Latency Profile | measured | | 0.05 | 100 | 100 | 100% | 5.00 | 5.00 |
| Verified Composite | | | 1.00 | 79.0 | | 69% | 54.4 | |
| Projected Composite | | | 1.00 | | 70.6 | | | 70.6 |
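
The composite figures can be reproduced from the per-dimension rows. The sketch below is a reconstruction, not TwinBench's documented method: the formulas (V Weighted = weight × verified × coverage, P Weighted = weight × projected, verified raw = verified composite ÷ weighted coverage) are inferred from the published numbers, and the names `ROWS` and `composites` are illustrative. A few per-row values differ slightly from the table (for example, 0.08 × 92 × 0.60 = 4.416 against the listed 4.40), which suggests the report computes from unrounded scores.

```python
# Rows copied from the Dimension Breakdown table:
# (dimension, weight, verified, projected, coverage as a fraction)
ROWS = [
    ("Autonomy Control",          0.15,  95,  95, 1.00),
    ("Memory Persistence",        0.15,  57,  65, 0.70),
    ("Functional Capability",     0.15,  77,  77, 1.00),
    ("Autonomous Execution",      0.12,  77,  65, 0.65),
    ("Cross-Channel Consistency", 0.12,  70,  70, 0.50),
    ("Integration Breadth",       0.08,   0,   0, 0.00),
    ("Security & Privacy",        0.08,  92,  91, 0.60),
    ("Scale & Cost Efficiency",   0.05, 100,  77, 0.20),
    ("Operational Resilience",    0.05,  53,  55, 0.75),
    ("Latency Profile",           0.05, 100, 100, 1.00),
]


def composites(rows):
    """Recompute the composite scores under the inferred formulas."""
    # Weighted coverage: how much of the total weight is backed by evidence.
    coverage = sum(w * c for _, w, _, _, c in rows)
    # Verified composite: each verified score is discounted by its coverage.
    verified = sum(w * v * c for _, w, v, _, c in rows)
    # Projected composite: projected scores count at full weight.
    projected = sum(w * p for _, w, _, p, _ in rows)
    # Verified raw: verified composite normalized by the covered weight.
    verified_raw = verified / coverage if coverage else 0.0
    return {
        "coverage": coverage,
        "verified_composite": verified,
        "projected_composite": projected,
        "verified_raw": verified_raw,
    }


if __name__ == "__main__":
    result = composites(ROWS)
    print(f"Coverage:            {result['coverage']:.0%}")             # 69%
    print(f"Verified composite:  {result['verified_composite']:.1f}")   # 54.4
    print(f"Projected composite: {result['projected_composite']:.1f}")  # 70.6
    print(f"Verified raw:        {result['verified_raw']:.1f}")         # 79.0
```

Under these assumptions the headline 54/100 is the verified composite rounded to a whole number, while 79.0 is the average verified score over only the portion of the benchmark that was actually measured.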
