Benchmark Tool

Execution Runner

Execute benchmark test cases against the MindSurf API and capture responses

Running: 0
Pool Available: 0
Pool In Use: 0
Total Runs: 0

Start New Execution

Configure and start a benchmark execution against the MindSurf API

User ID

Locale

Provider

Uses MindSurf API with session management

Metrics to Execute

Safety Critical (0/3 selected)

CDR	0	0s
RPR	0	0s
HRR	0	0s

Conversational Quality (0/7 selected)

BERTScore	0	0s
Length_Ratio	0	0s
Diversity	0	0s
Conversation_Relevancy	0	0s
Role_Adherence	0	0s
Context_Retention	0	0s
Conversation_Completeness	0	0s

Therapeutic Value (0/3 selected)

CAS	0	0s
Empathy	0	0s
TAS	0	0s

Selected: 0 metrics | ~0 entries | ~0s

Please select at least one metric to execute
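The form fields and metric groups above could be modeled roughly as follows. This is only a sketch: `ExecutionConfig`, `validate`, and `selection_counts` are hypothetical names, the field types are assumptions, and nothing here reflects the actual MindSurf API or its request format. Only the metric names, group names, and the validation message are taken from the page.

```python
from dataclasses import dataclass, field

# Metric groups and names exactly as listed in the form.
METRIC_GROUPS = {
    "Safety Critical": ["CDR", "RPR", "HRR"],
    "Conversational Quality": [
        "BERTScore", "Length_Ratio", "Diversity", "Conversation_Relevancy",
        "Role_Adherence", "Context_Retention", "Conversation_Completeness",
    ],
    "Therapeutic Value": ["CAS", "Empathy", "TAS"],
}


@dataclass
class ExecutionConfig:
    """Hypothetical container for the 'Start New Execution' form fields."""
    user_id: str
    locale: str
    provider: str
    metrics: list[str] = field(default_factory=list)
    use_mock_api: bool = False  # mirrors the 'Use Mock API (for testing)' option


def validate(config: ExecutionConfig) -> list[str]:
    """Return a list of validation errors; an empty list means the config is usable."""
    if not config.metrics:
        return ["Please select at least one metric to execute"]
    known = {name for names in METRIC_GROUPS.values() for name in names}
    return [f"Unknown metric: {m}" for m in config.metrics if m not in known]


def selection_counts(config: ExecutionConfig) -> dict[str, str]:
    """Build the per-group 'n/total selected' labels shown next to each group."""
    return {
        group: f"{sum(name in config.metrics for name in names)}/{len(names)} selected"
        for group, names in METRIC_GROUPS.items()
    }
```

With no metrics chosen, `validate` returns the same message the form shows; once at least one known metric is added, it returns an empty list and `selection_counts` reports the per-group totals.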

Use Mock API (for testing)

Generate Random User Profile

Running Executions

Currently active benchmark runs

No executions running

MindSurf AI Benchmark Tool - Internal Use Only