
Quickstart

Install Kolena to set up rigorous and repeatable model testing in minutes.

In this quickstart guide, we'll use the age_estimation example integration to demonstrate how to curate test data and test models in Kolena.

Install kolena

Install the kolena Python package to programmatically interact with Kolena, using either pip or Poetry:

# with pip
pip install kolena

# or, with Poetry
poetry add kolena
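To confirm the install landed in the environment you expect, a quick check along these lines may help. This is a minimal sketch that relies only on the standard library's importlib.metadata; the helper name installed_version is hypothetical and not part of Kolena.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("kolena")
    if found is None:
        print("kolena is not installed in this environment")
    else:
        print(f"kolena {found} is installed")
```

When using Poetry, run the check through poetry run so that it inspects the project's virtual environment rather than the system interpreter.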

Clone the Examples

The kolenaIO/kolena repository contains a number of example integrations that can be cloned and run directly.

To get started, clone the kolena repository:

git clone https://github.com/kolenaIO/kolena.git

With the repository cloned, let's set up the age_estimation example:

cd kolena/examples/age_estimation
poetry update && poetry install

Now we're up and running and can start creating test suites and testing models.

Create Test Suites

Each of the example integrations comes with scripts for two flows:

  1. seed_test_suite.py: Create test cases and test suite(s) from a source dataset
  2. seed_test_run.py: Test model(s) on the created test suites

Before running seed_test_suite.py, let's first configure our environment by populating the KOLENA_TOKEN environment variable. Visit the Developer page to generate an API token and copy and paste the code snippet into your environment:

export KOLENA_TOKEN="********"
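A missing or empty token is an easy mistake to make when switching shells, so it can be worth checking for it before launching the seeding scripts. The following is a small sketch, not part of the example integration: check_token is a hypothetical helper, and the only thing it assumes from this guide is the KOLENA_TOKEN variable name.

```python
import os
from typing import Mapping


def check_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the KOLENA_TOKEN value from `env`, raising if it is missing or blank."""
    token = env.get("KOLENA_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "KOLENA_TOKEN is not set; generate a token on the Developer page and export it"
        )
    return token


if __name__ == "__main__":
    try:
        check_token()
        print("KOLENA_TOKEN is set")
    except RuntimeError as error:
        print(error)
```

Passing the environment in as a mapping keeps the check easy to exercise without touching the real shell environment.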

We can now create test suites using the provided seeding script:

poetry run python3 age_estimation/seed_test_suite.py

After this script has completed, we can visit the Test Suites page to view our newly created test suites.

In this age_estimation example, we've created test suites stratifying the LFW dataset (which is stored as a CSV in S3) into test cases by age, estimated race, and estimated gender.
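To make the idea of stratification concrete, here is a rough, standalone illustration of grouping records into age-based buckets. It does not use the Kolena API or reproduce what seed_test_suite.py does: the bucket boundaries, the record fields ("image", "age"), and the sample data are all assumptions made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical age ranges (lower bound inclusive, upper bound exclusive).
AGE_BUCKETS: List[Tuple[str, int, int]] = [
    ("under 18", 0, 18),
    ("18-34", 18, 35),
    ("35-54", 35, 55),
    ("55 and over", 55, 200),
]


def age_bucket(age: float) -> str:
    """Return the name of the bucket that contains `age`."""
    for name, low, high in AGE_BUCKETS:
        if low <= age < high:
            return name
    raise ValueError(f"age {age} falls outside every bucket")


def stratify_by_age(records: List[dict]) -> Dict[str, List[dict]]:
    """Group records by age bucket; each group plays the role of one test case."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        groups[age_bucket(record["age"])].append(record)
    return dict(groups)


# Made-up records standing in for rows of a dataset CSV.
SAMPLE_RECORDS = [
    {"image": "img_001.jpg", "age": 12},
    {"image": "img_002.jpg", "age": 18},
    {"image": "img_003.jpg", "age": 40},
    {"image": "img_004.jpg", "age": 41},
    {"image": "img_005.jpg", "age": 70},
]

if __name__ == "__main__":
    for name, members in stratify_by_age(SAMPLE_RECORDS).items():
        print(f"{name}: {len(members)} sample(s)")
```

The same grouping pattern applies to any other attribute in the dataset: swap the bucketing function and each resulting group becomes a candidate test case.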

Test a Model

After we've created test suites, the final step is to test models on these test suites. The age_estimation example provides the ssrnet model for this step:

poetry run python3 age_estimation/seed_test_run.py \
  "ssrnet" \
  "age :: labeled-faces-in-the-wild [age estimation]" \
  "race :: labeled-faces-in-the-wild [age estimation]" \
  "gender :: labeled-faces-in-the-wild [age estimation]"

Note: Testing additional models

In this example, model results have already been extracted and are stored in CSV files in S3. To run a new model, plug it into the infer method in seed_test_run.py.
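The difference between replaying stored results and running a live model can be sketched as two interchangeable inference functions. This is an illustration only: the actual infer method in seed_test_run.py works with Kolena's own types and is not reproduced here, and every name below (AgeModel, make_lookup_infer, make_model_infer, the image paths) is hypothetical.

```python
from typing import Callable, Dict

# Hypothetical: an age model is any callable mapping an image path to a predicted age.
AgeModel = Callable[[str], float]


def make_lookup_infer(precomputed: Dict[str, float]) -> AgeModel:
    """Build an infer function that replays results extracted ahead of time."""

    def infer(image: str) -> float:
        return precomputed[image]

    return infer


def make_model_infer(model: AgeModel) -> AgeModel:
    """Build an infer function that calls a live model for every image."""

    def infer(image: str) -> float:
        return float(model(image))

    return infer


if __name__ == "__main__":
    # Stored results, standing in for rows of a results CSV.
    replay = make_lookup_infer({"img_001.jpg": 31.5})
    # A stand-in "model" that always predicts the same age.
    live = make_model_infer(lambda image: 30)
    print(replay("img_001.jpg"), live("img_001.jpg"))
```

Because both functions share one signature, the code that consumes predictions does not need to know whether they come from a file or from a model.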

Once this script has completed, click the results link in your console or visit Results to view the test results for this newly tested model.

Conclusion

In this quickstart, we used an example integration from kolenaIO/kolena to create test suites from the Labeled Faces in the Wild (LFW) dataset and test the open-source ssrnet model on these test suites.

This example shows us how to define an ML problem as a workflow for testing in Kolena, and can be arbitrarily extended with additional metrics, plots, visualizations, and data.