Model Evaluation FAQ

The evaluation button is grayed out with the message "Framework does not support this model"

Cause: The current evaluation framework does not yet support this model.

Solution: Contact the platform administrator and provide the model name and any relevant details. The administrator will review the request and add support as soon as possible.

The evaluation task stays in the "Pending" state for a long time

Cause: The task was submitted with shared resources, and the public compute queue is busy.

Solution:

Wait for the queue to clear. Evaluation tasks on shared resources run in submission order.
If the task must run immediately, switch to Dedicated Resources (billed by time).

Evaluation results appear abnormally low

Possible Causes:

The selected evaluation dataset does not match the model's training language (e.g., using a Chinese dataset to evaluate an English model).
The model lacks the instruction-following capability the task requires (e.g., a base model is evaluated on a task designed for instruction-tuned models).
Evaluation framework parameters are not configured appropriately.

Solution:

Choose evaluation datasets that match the model's language and task type.
For base models, use evaluation methods suitable for pre-trained models (e.g., perplexity evaluation).
Refer to the Evaluation Framework Overview to check which scenarios each framework is suited to, and adjust the framework parameters accordingly.
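
For reference, the perplexity evaluation mentioned above for base models can be sketched as follows. This is a minimal illustration of the standard definition (the exponential of the average negative log-likelihood per token), not the platform's implementation. The function name and the sample log-probabilities are hypothetical, and it assumes you already have natural-log token probabilities from the model.

```python
import math


def perplexity(token_logprobs):
    """Return exp of the mean negative log-likelihood per token.

    token_logprobs: natural-log probabilities the model assigned to each
    reference token (hypothetical input; how these are obtained depends
    on the evaluation framework in use).
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)


# Illustrative values only: a model that assigns probability 0.25 to every
# token is as uncertain as a uniform choice among 4 options, so its
# perplexity is approximately 4.
uniform_guess = [math.log(0.25)] * 4
print(perplexity(uniform_guess))
```

Lower perplexity means the model finds the text more predictable. A base model that scores poorly on an instruction-style benchmark may still show reasonable perplexity on text in its training language.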

How to use a custom dataset for evaluation

Refer to the Custom Evaluation Datasets documentation for detailed steps.