Nullalis Reference Runtime

Current public reference result. Scale interpretation is conservative relative to the later provisioning-aware scale fix.

Coverage-adjusted verified: 75.9
Verified raw: 90.9
Measured coverage: 84%
Interpretation: Production-Grade

What stands out

Result interpretation

Strongest dimensions: Memory Persistence, Functional Capability, Autonomous Execution

Main limitation: Scale & Cost Efficiency

Why it matters: This is the strongest public proof in the repo that the personal AI assistant category is real and measurable.

Evidence

How to read it

Use the coverage-adjusted verified score for public comparison, the verified raw score for direct measurement strength, and measured coverage to see how much of the benchmark was actually exercised.
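The page does not state how the three headline numbers relate to each other. As a rough sanity check only, the sketch below assumes the coverage-adjusted score is approximately the verified raw score multiplied by measured coverage. That formula and the function name `coverage_adjusted` are assumptions, not something the benchmark confirms; the three constants are the published values.

```python
# Published headline values from this result page.
VERIFIED_RAW = 90.9
MEASURED_COVERAGE = 0.84      # shown on the page as 84%
PUBLISHED_ADJUSTED = 75.9


def coverage_adjusted(raw: float, coverage: float) -> float:
    """Hypothetical formula (an assumption): scale the raw score by coverage."""
    return raw * coverage


# Estimate under the assumed formula, to compare against the published 75.9.
estimate = coverage_adjusted(VERIFIED_RAW, MEASURED_COVERAGE)

# Coverage that would reproduce the published score exactly under the same assumption.
implied_coverage = PUBLISHED_ADJUSTED / VERIFIED_RAW

print(f"estimated adjusted score: {estimate:.1f}")
print(f"implied coverage: {implied_coverage:.3f}")
```

Under this assumption the estimate lands within about half a point of the published score. The small gap would be consistent with coverage being rounded to a whole percent for display, but the real scoring method may differ.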

Dimension tiles

Autonomy Control: 95.0 (measured)
Memory Persistence: 100.0 (measured)
Functional Capability: 100.0 (measured)
Autonomous Execution: 100.0 (measured)
Cross-Channel Consistency: 92.9 (measured)
Integration Breadth: 54.0 (measured)
Security & Privacy: 75.0 (measured)
Scale & Cost Efficiency: 9.7 (measured)
Operational Resilience: 100.0 (measured)
Latency Profile: 91.2 (measured)
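
The strongest dimensions and the main limitation can be read straight off the tiles. The sketch below ranks the published scores; the names `DIMENSION_SCORES` and `rank_dimensions` are illustrative, and breaking ties by page order is an assumption about how the summary picks its top three.

```python
# Scores copied from the dimension tiles, in the order they appear on the page.
DIMENSION_SCORES = [
    ("Autonomy Control", 95.0),
    ("Memory Persistence", 100.0),
    ("Functional Capability", 100.0),
    ("Autonomous Execution", 100.0),
    ("Cross-Channel Consistency", 92.9),
    ("Integration Breadth", 54.0),
    ("Security & Privacy", 75.0),
    ("Scale & Cost Efficiency", 9.7),
    ("Operational Resilience", 100.0),
    ("Latency Profile", 91.2),
]


def rank_dimensions(scores):
    """Sort highest first. Python's sort is stable, so ties keep page order."""
    return sorted(scores, key=lambda item: item[1], reverse=True)


ranked = rank_dimensions(DIMENSION_SCORES)
strongest = [name for name, _ in ranked[:3]]   # top three by score
weakest = ranked[-1][0]                        # lowest-scoring dimension

print("Strongest:", ", ".join(strongest))
print("Main limitation:", weakest)
```

With ties broken by page order, the top three match the dimensions named in the result interpretation. Operational Resilience also scores 100.0 and ranks fourth only because of that tie-break.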