Install Kolena to set up rigorous and repeatable model testing in minutes.

In this quickstart guide, we'll use the age_estimation example integration to demonstrate how to curate test data and test models in Kolena.

Install kolena

Install the kolena Python package to programmatically interact with Kolena:

pip install kolena

Or, if your project uses Poetry:

poetry add kolena
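To confirm that the package landed in the active environment, a quick check along these lines can help. This is a minimal sketch using only the Python standard library; the helper name installed_version is our own and is not part of kolena.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report whether the kolena distribution is visible to this interpreter.
version = installed_version("kolena")
if version is None:
    print("kolena is not installed in this environment")
else:
    print(f"kolena {version} is installed")
```

If the check reports that the package is missing, the interpreter running the script is probably not the one that pip or Poetry installed into.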

Clone the Examples

The kolenaIO/kolena repository contains a number of example integrations that can be cloned and run directly.

To get started, clone the kolena repository:

git clone

With the repository cloned, let's set up the age_estimation example:

cd kolena/examples/age_estimation
poetry update && poetry install

Now we're up and running and can start creating test suites and testing models.

Create Test Suites

Each of the example integrations comes with scripts for two flows:

  1. Create test cases and test suite(s) from a source dataset
  2. Test model(s) on the created test suites

Before running these scripts, let's configure our environment by setting the KOLENA_TOKEN environment variable. Visit the Developer page to generate an API token, then paste the provided code snippet into your shell:

export KOLENA_TOKEN="********"
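Because a missing token is an easy mistake to make, it can be worth checking for it before launching the scripts. The sketch below only inspects the environment and does not contact Kolena; read_token is a hypothetical helper of our own, not a kolena function.

```python
import os
from typing import Mapping, Optional


def read_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return KOLENA_TOKEN with surrounding whitespace stripped, or None if unset or blank."""
    token = env.get("KOLENA_TOKEN", "").strip()
    return token or None


# Print a status message without ever echoing the token itself.
if read_token() is None:
    print("KOLENA_TOKEN is not set; export it before running the example scripts")
else:
    print("KOLENA_TOKEN is set")
```

Passing the environment in as a mapping keeps the helper easy to exercise with a plain dictionary.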

We can now create test suites using the provided seeding script:

poetry run python3 age_estimation/

After this script has completed, we can visit the Test Suites page to view our newly created test suites.

In this age_estimation example, we've created test suites stratifying the LFW dataset (which is stored as a CSV in S3) into test cases by age, estimated race, and estimated gender.
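To make the idea of stratification concrete, here is a minimal sketch of grouping CSV rows into named subsets, one per prospective test case. It is not the example's seeding script: the column names, sample rows, bucket paths, and age brackets below are all assumptions for illustration, and the data is inlined rather than read from S3.

```python
import csv
import io
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

# Hypothetical sample rows; the real dataset's columns may differ.
SAMPLE_CSV = """locator,age,gender
s3://example-bucket/img_001.jpg,23,woman
s3://example-bucket/img_002.jpg,41,man
s3://example-bucket/img_003.jpg,67,woman
s3://example-bucket/img_004.jpg,16,man
"""

# Hypothetical brackets: (label, inclusive lower bound, exclusive upper bound).
AGE_BRACKETS = [
    ("under 18", 0, 18),
    ("18-34", 18, 35),
    ("35-54", 35, 55),
    ("55 and over", 55, 200),
]


def age_bracket(age: float) -> str:
    """Map an age in years to the label of the bracket containing it."""
    for label, low, high in AGE_BRACKETS:
        if low <= age < high:
            return label
    raise ValueError(f"age out of range: {age}")


def stratify(rows: Iterable[dict], key_fn: Callable[[dict], str]) -> Dict[str, List[dict]]:
    """Group rows by the label that key_fn assigns to each row."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        groups[key_fn(row)].append(row)
    return dict(groups)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_age = stratify(rows, lambda r: age_bracket(float(r["age"])))
by_gender = stratify(rows, lambda r: r["gender"])

for name, members in sorted(by_age.items()):
    print(f"{name}: {len(members)} image(s)")
```

Each key in by_age or by_gender corresponds to one candidate test case, and each dictionary as a whole plays the role of a test suite over the same underlying images.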

Test a Model

With test suites created, the final step is to test models on them. The age_estimation example provides the ssrnet model for this step:

poetry run python3 age_estimation/ \
  "ssrnet" \
  "age :: labeled-faces-in-the-wild [age estimation]" \
  "race :: labeled-faces-in-the-wild [age estimation]" \
  "gender :: labeled-faces-in-the-wild [age estimation]"

Note: Testing additional models

In this example, model results have already been extracted and are stored in CSV files in S3. To run a new model, plug it into the infer method in

Once this script has completed, click the results link in your console or visit Results to view the test results for this newly tested model.


In this quickstart, we used an example integration from kolenaIO/kolena to create test suites from the Labeled Faces in the Wild (LFW) dataset and test the open-source ssrnet model on these test suites.

This example shows us how to define an ML problem as a workflow for testing in Kolena, and can be arbitrarily extended with additional metrics, plots, visualizations, and data.
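As an illustration of what an additional metric might look like, the sketch below computes mean absolute error per test case, a common measure for age estimation. It is plain Python rather than Kolena's metrics interface, and the test case names and (true age, predicted age) pairs are made up for the example.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical (true age, predicted age) pairs grouped by test case.
RESULTS: Dict[str, List[Tuple[float, float]]] = {
    "18-34": [(23.0, 26.0), (30.0, 27.0)],
    "35-54": [(41.0, 45.0)],
}


def mean_absolute_error(pairs: List[Tuple[float, float]]) -> float:
    """Average absolute difference between true and predicted ages."""
    if not pairs:
        raise ValueError("cannot compute a metric over an empty test case")
    return mean(abs(predicted - actual) for actual, predicted in pairs)


def per_test_case(results: Dict[str, List[Tuple[float, float]]]) -> Dict[str, float]:
    """Compute the metric separately for every test case."""
    return {name: mean_absolute_error(pairs) for name, pairs in results.items()}


for name, error in sorted(per_test_case(RESULTS).items()):
    print(f"{name}: mean absolute error = {error:.2f} years")
```

Reporting the metric per test case rather than over the whole dataset is what lets differences between subsets, such as age brackets, become visible.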