Install Kolena to set up rigorous and repeatable model testing in minutes.
In this quickstart guide, we'll use the age_estimation example integration to
demonstrate how to curate test data and test models in Kolena.
Install the kolena Python package to programmatically interact with Kolena.
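Once the package is installed (for instance with pip install kolena, assuming the distribution is published under that name), a quick check can confirm that the active Python environment actually sees it. The sketch below uses only the standard library; installed_version is a hypothetical helper name, not part of the kolena package.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "kolena" is assumed to be the distribution name used at install time.
    version = installed_version("kolena")
    if version is None:
        print("kolena is not installed in this environment")
    else:
        print(f"kolena {version} is installed")
```

Running this inside the same virtual environment that will run the example scripts helps catch the common case where the package was installed into a different interpreter.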
Clone the Examples
The kolenaIO/kolena repository contains a number of example integrations to clone and run directly:
Age Estimation using the Labeled Faces in the Wild (LFW) dataset
Facial Keypoint Detection using the 300 Faces in the Wild (300-W) dataset
2D Object Detection using the COCO dataset
3D Object Detection using the KITTI dataset
Binary Classification of class "Dog" using the Dogs vs. Cats dataset
Multiclass Classification using the CIFAR-10 dataset
Semantic Textual Similarity using the STS benchmark dataset
Question Answering using the Conversational Question Answering (CoQA) dataset
Semantic Segmentation on class Person using the COCO-Stuff 10K dataset
To get started, clone the kolenaIO/kolena repository.
With the repository cloned, let's set up the age_estimation example.
Create Test Suites
Each of the example integrations comes with scripts for two flows:
seed_test_suite.py: Create test cases and test suite(s) from a source dataset
seed_test_run.py: Test model(s) on the created test suites
Let's first configure our environment by populating the API token environment variable. Visit the
Developer page to generate an API token, then copy and paste the code snippet into your environment.
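Before running the scripts, it can help to confirm that the variable is actually visible to Python. The following is a minimal sketch: require_env is a hypothetical helper, and the variable name is passed in as an argument so that it can match whatever name the Developer page snippet uses.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, failing loudly if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"environment variable {name} is not set; export it before running the scripts"
        )
    return value
```

Calling require_env with the variable name from the Developer page snippet at the top of a script turns a missing token into an immediate, readable error rather than an authentication failure later on.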
We can now create test suites using the provided seeding script, seed_test_suite.py.
After this script has completed, we can visit the Test Suites page to view our newly created test suites.
In the age_estimation example, we've created test suites stratifying the LFW dataset (which is stored as a CSV in
S3) into test cases by age, estimated race, and estimated gender.
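To make the idea of stratification concrete, here is a standard-library sketch of grouping CSV rows into named buckets. It does not use the Kolena API and is not the seeding script's implementation; the column names (locator, age, gender), the bucket width, and the sample rows are all made up for illustration.

```python
import csv
import io
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def age_bucket(age: int, width: int = 20) -> str:
    """Name the fixed-width age range that contains the given age."""
    lower = (age // width) * width
    return f"age {lower}-{lower + width - 1}"


def stratify(rows: Iterable[dict], key: Callable[[dict], str]) -> Dict[str, List[dict]]:
    """Group rows into candidate test cases, one group per key value."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return dict(groups)


# Illustrative rows only; a real run would read the dataset CSV instead.
SAMPLE_CSV = """locator,age,gender
s3://example-bucket/faces/a.jpg,23,woman
s3://example-bucket/faces/b.jpg,41,man
s3://example-bucket/faces/c.jpg,35,man
s3://example-bucket/faces/d.jpg,67,woman
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_age = stratify(rows, lambda r: age_bucket(int(r["age"])))
by_gender = stratify(rows, lambda r: r["gender"])

for name, members in sorted(by_age.items()):
    print(name, len(members))
```

Each resulting group corresponds to one candidate test case, and a set of groups produced by the same key function corresponds to one test suite.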
Test a Model
After we've created test suites, the final step is to test models on these test suites. The
age_estimation example uses the ssrnet model for this step.
Note: Testing additional models
In this example, model results have already been extracted and are stored in CSV files in S3. To run a new model,
plug it into the example's infer method.
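The difference between serving precomputed results and running a new model can be sketched as two interchangeable inference functions. The actual signature of infer in the example is not shown here, so the shape below (an image locator in, a predicted age out) is an assumption, and precomputed_infer, live_infer, predict, and load are hypothetical names.

```python
from typing import Callable, Dict

# Assumed shape: an inference function maps an image locator to a predicted age.
InferFn = Callable[[str], float]


def precomputed_infer(results: Dict[str, float]) -> InferFn:
    """Serve predictions that were extracted ahead of time, e.g. loaded from a CSV."""

    def infer(locator: str) -> float:
        return results[locator]

    return infer


def live_infer(predict: Callable[[bytes], float], load: Callable[[str], bytes]) -> InferFn:
    """Run a new model on demand: load the image bytes, then call the model."""

    def infer(locator: str) -> float:
        return predict(load(locator))

    return infer
```

Because both factories return a function with the same shape, the rest of a test run would not need to change when swapping stored results for a live model.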
Once this script has completed, click the results link in your console or visit Results to view the test results for this newly tested model.
In this quickstart, we used an example integration from kolenaIO/kolena to create
test suites from the Labeled Faces in the Wild (LFW) dataset and test the
ssrnet model on these test suites.
This example shows us how to define an ML problem as a workflow for testing in Kolena, and can be arbitrarily extended with additional metrics, plots, visualizations, and data.
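As an illustration of what an additional metric might look like, the sketch below computes mean absolute error per test case for an age estimation problem. It is plain Python rather than the Kolena API, it is not necessarily a metric the example already reports, and the function names and input layout (pairs of actual and predicted age, keyed by test case name) are assumptions.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Each pair is (actual age, predicted age).
AgePair = Tuple[float, float]


def mean_absolute_error(pairs: List[AgePair]) -> float:
    """Average absolute difference between predicted and actual age."""
    if not pairs:
        raise ValueError("cannot compute a metric over an empty test case")
    return mean(abs(predicted - actual) for actual, predicted in pairs)


def metrics_by_test_case(test_cases: Dict[str, List[AgePair]]) -> Dict[str, float]:
    """Compute the metric separately for every test case in a suite."""
    return {name: mean_absolute_error(pairs) for name, pairs in test_cases.items()}
```

Reporting the metric per test case, rather than once over the whole dataset, is what makes it possible to see whether a model performs worse on a particular stratum.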