Create Model Evaluation Task
Entry Point
On the model details page, click the Model Evaluation button to navigate to the evaluation task creation page.
Note
Only some models support creating evaluation tasks. If the desired model does not have a "Model Evaluation" option, contact the platform administrator.
Configuration Parameters
On the model evaluation task creation page, fill in the following configuration, then click Create Evaluation:
| Parameter | Description |
|---|---|
| Task Name | Custom evaluation task name |
| Model ID | The model identifier on the platform |
| Evaluation Framework | Select the framework: OpenCompass, EvalScope, or lm-evaluation-harness |
| Dataset Selection | Select one or more benchmark datasets from the dataset list |
| Resource Type | Shared Resources: uses public compute and requires queuing. Dedicated Resources: uses exclusive compute and is billed by time. |
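To sanity-check a set of values before filling in the form, the parameters in the table above can be mirrored in a small data structure. The following is only a sketch: the class `EvaluationTaskConfig`, its field names, the `shared`/`dedicated` identifiers, and the sample values are all hypothetical and do not correspond to any documented platform API. Only the three framework names come from the table.

```python
from dataclasses import dataclass, field

# Framework names as listed in the parameter table.
FRAMEWORKS = {"OpenCompass", "EvalScope", "lm-evaluation-harness"}
# Hypothetical identifiers for the two resource types in the table.
RESOURCE_TYPES = {"shared", "dedicated"}


@dataclass
class EvaluationTaskConfig:
    """Hypothetical mirror of the creation form; not a platform API."""
    task_name: str
    model_id: str
    framework: str
    datasets: list[str] = field(default_factory=list)
    resource_type: str = "shared"

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the form is complete."""
        problems = []
        if not self.task_name.strip():
            problems.append("Task Name is empty")
        if not self.model_id.strip():
            problems.append("Model ID is empty")
        if self.framework not in FRAMEWORKS:
            problems.append(f"Unknown framework: {self.framework}")
        if not self.datasets:
            problems.append("Select at least one dataset")
        if self.resource_type not in RESOURCE_TYPES:
            problems.append(f"Unknown resource type: {self.resource_type}")
        return problems


# Placeholder values for illustration only.
config = EvaluationTaskConfig(
    task_name="demo-eval",
    model_id="example-model-id",
    framework="EvalScope",
    datasets=["example-benchmark"],
    resource_type="shared",
)
print(config.validate())
```

Returning a list of problems rather than raising on the first one makes it possible to report every missing field at once, similar to how a form would highlight all invalid inputs together.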
View Evaluation Results
After creation, use the top navigation to open Model Training & Evaluation → Model Evaluation to view the running status and results of all evaluation tasks. You can also view them centrally in Resource Management.