Quickstart: Classification

In this quickstart tutorial we'll use the Dogs vs. Cats dataset and the open-source pre-trained YOLOX model to create and run tests for the Classification workflow.

Getting Started

With the kolena-client Python client installed, first let's initialize a client session:
import os
import kolena
kolena.initialize(os.environ["KOLENA_TOKEN"], verbose=True)
The data used in this tutorial is publicly available in the kolena-public-datasets S3 bucket in a metadata.csv file:
import pandas as pd
DATASET = "dogs-vs-cats"
BUCKET = "s3://kolena-public-datasets"
df = pd.read_csv(f"{BUCKET}/{DATASET}/meta/metadata.csv.gz")
To load CSVs directly from S3, make sure to install the s3fs Python module (pip3 install s3fs[boto3]) and set up AWS credentials.
This metadata.csv file describes a classification dataset with the following columns:
  • locator: location of the image in S3
  • label: label annotating if this image contains a dog or a cat
  • width: width of the image in pixels
  • height: height of the image in pixels
Each locator appears exactly once, and each image contains either a dog or a cat.
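The uniqueness and label guarantees described above can be checked before any tests are built. The following is a minimal sketch using only the standard library; the validate_records helper, the ALLOWED_LABELS set, and the sample rows are hypothetical and not part of kolena-client. It assumes labels are stored as the exact strings "dog" and "cat".

```python
from collections import Counter
from typing import Dict, Iterable, List

# Assumption: labels are stored as these exact strings in the metadata.
ALLOWED_LABELS = {"dog", "cat"}


def validate_records(records: Iterable[Dict[str, str]]) -> List[str]:
    """Return a list of problems found; an empty list means the rows look consistent."""
    rows = list(records)
    problems = []
    # Every locator should appear exactly once.
    counts = Counter(row["locator"] for row in rows)
    for locator, n in counts.items():
        if n > 1:
            problems.append(f"duplicate locator ({n}x): {locator}")
    # Every label should be one of the two expected classes.
    for row in rows:
        if row["label"] not in ALLOWED_LABELS:
            problems.append(f"unexpected label {row['label']!r}: {row['locator']}")
    return problems


# Hypothetical rows for illustration only; real locators come from the metadata file.
sample = [
    {"locator": "s3://example-bucket/a.jpg", "label": "dog"},
    {"locator": "s3://example-bucket/b.jpg", "label": "cat"},
]
print(validate_records(sample))
```

With the DataFrame loaded earlier, the same check could be run as validate_records(df.to_dict("records")), under the same assumption about label strings.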

Step 1: Creating Tests

With our data already in an S3 bucket and metadata loaded into memory, we can start creating test cases!
For the purposes of this tutorial, let's simplify this problem by treating it as a binary classification problem. Rather than predicting both dog and cat labels, our model will predict only dog labels, and any prediction below the confidence threshold will be considered a cat prediction.
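The binary simplification described above can be sketched in a few lines. This is an illustration only: the predicted_label helper is hypothetical, the default threshold of 0.5 is an arbitrary placeholder, and treating a score exactly at the threshold as a dog prediction is an assumption.

```python
def predicted_label(dog_confidence: float, threshold: float = 0.5) -> str:
    """Map a single 'dog' confidence score to a binary label.

    Scores below the threshold are treated as cat predictions.
    Assumption: a score exactly equal to the threshold counts as dog.
    """
    return "dog" if dog_confidence >= threshold else "cat"


# Hypothetical confidence scores for illustration.
for score in (0.91, 0.50, 0.12):
    print(score, predicted_label(score))
```

Raising the threshold trades dog recall for dog precision, so the value chosen here directly affects the metrics reported for a test case.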
Let's create a simple test case containing the entire dataset:
from kolena.classification import TestCase, TestImage
# one TestImage per metadata row, labeled with that row's ground truth
complete_test_case = TestCase(f"complete {DATASET}", images=[
    TestImage(record.locator, dataset=DATASET, labels=[record.label])
    for record in df.itertuples()
])
In this tutorial we created only a single simple test case, but more advanced test cases can be generated in a variety of fast and scalable ways. See Creating Test Cases for details.
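As one illustration of a more targeted split, the metadata rows could be grouped by ground-truth label so that each class gets its own test case. The sketch below covers only the grouping step, using the standard library; group_locators_by_label and the sample rows are hypothetical, and it makes no claim about how Kolena recommends stratifying tests.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_locators_by_label(records: Iterable[Dict[str, str]]) -> Dict[str, List[str]]:
    """Collect image locators under their ground-truth label, preserving input order."""
    groups = defaultdict(list)
    for row in records:
        groups[row["label"]].append(row["locator"])
    return dict(groups)


# Hypothetical rows for illustration only.
rows = [
    {"locator": "s3://example-bucket/a.jpg", "label": "dog"},
    {"locator": "s3://example-bucket/b.jpg", "label": "cat"},
    {"locator": "s3://example-bucket/c.jpg", "label": "dog"},
]
for label, locators in group_locators_by_label(rows).items():
    print(label, len(locators))
```

Each group could then be passed to the TestCase constructor in the same way as the complete test case above.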
Now that we have a basic test case for our entire dataset, let's create a barebones test suite with just this single test case:
from kolena.classification import TestSuite
test_suite = TestSuite(f"complete {DATASET}", test_cases=[complete_test_case])
This test suite represents a basic starting point for testing on Kolena.

Step 2: Running Tests

With basic tests defined for the Dogs vs. Cats dataset, we can start testing our models.
To start testing, we create an InferenceModel object describing the model being tested:
from typing import List, Tuple
from kolena.classification import InferenceModel, TestImage
def infer(test_image: TestImage) -> List[Tuple[str, float]]:
    """Load image, perform inference, and return (class, confidence) pairs"""
    return []  # TODO: your model implementation goes here
model = InferenceModel("example-classifier", infer=infer, metadata=dict(
    description="Example Dogs vs. Cats model from quickstart tutorial",
))
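To make the expected return shape of infer concrete, here is a minimal sketch that wraps a scoring function into the list of (class, confidence) pairs. Everything in it is hypothetical: make_infer and score_dog are illustrative names, the wrapper takes a plain locator string rather than a TestImage so that it runs without kolena-client, and returning only a dog entry follows the binary simplification from Step 1.

```python
from typing import Callable, List, Tuple

# (class, confidence), matching the return type annotated on infer above
Inference = Tuple[str, float]


def make_infer(score_dog: Callable[[str], float]) -> Callable[[str], List[Inference]]:
    """Wrap a hypothetical scorer (locator -> dog confidence) into the pair-list shape."""

    def infer_from_locator(locator: str) -> List[Inference]:
        confidence = float(score_dog(locator))
        # Guard against scorers that return logits or percentages by mistake.
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence out of range [0, 1]: {confidence}")
        return [("dog", confidence)]

    return infer_from_locator


# Stand-in scorer for illustration; a real one would load the image and run the model.
fake_infer = make_infer(lambda locator: 0.8)
print(fake_infer("s3://example-bucket/a.jpg"))
```

In a real infer implementation the scorer would presumably be called with the image's location, assuming TestImage exposes the locator it was constructed with.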
Finally, let's test:
from kolena.classification import test
test(model, test_suite)
That's it! We can now visit the web platform to analyze and debug our model's performance on this test suite.


In this quickstart tutorial we learned how to create new tests for a simple binary classification dataset and how to test classification models on Kolena.
What we learned here only scratches the surface of what's possible with Kolena and covers a fraction of the kolena-client API. Now that we're up and running, we can think about ways to create more detailed tests, improve existing tests, and dive deep into model behaviors.