Quickstart: Instance Segmentation

In this quickstart tutorial we'll use the COCO 2014 Validation dataset and a stubbed-out example model to create and run tests for the Instance Segmentation workflow.

Getting Started

With the kolena-client Python client installed, first let's initialize a client session:
import os
import kolena
kolena.initialize(os.environ["KOLENA_TOKEN"], verbose=True)
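If KOLENA_TOKEN is unset, the os.environ lookup above fails with a bare KeyError. As a minimal sketch, a small helper can surface a clearer message first. Note that read_token is a hypothetical name introduced here for illustration and is not part of kolena-client:

```python
import os
from typing import Mapping, Optional


def read_token(env: Optional[Mapping[str, str]] = None, name: str = "KOLENA_TOKEN") -> str:
    """Return the API token from the environment, or fail with a clear message.

    Hypothetical helper for illustration; not part of kolena-client.
    """
    env = os.environ if env is None else env
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"{name} is not set; export it before initializing the client")
    return token
```

The returned value could then be passed to kolena.initialize in place of os.environ["KOLENA_TOKEN"].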
The data used in this tutorial is publicly available in the kolena-public-datasets S3 bucket in an instance_segmentation/metadata.csv file:
import pandas as pd
DATASET = "coco-2014-val"
BUCKET = "s3://kolena-public-datasets"
df = pd.read_csv(f"{BUCKET}/{DATASET}/meta/instance_segmentation/metadata.csv")
To load files directly from S3, make sure to install the s3fs Python module:
pip3 install s3fs[boto3]
This metadata.csv file describes an instance segmentation dataset with the following columns:
  • locator: location of the image in S3
  • label: label of the object described by this record's segmentation mask
  • points: stringified list of the [x, y] coordinates of the segmentation mask's vertices
There is one record in this table for each ground truth segmentation mask in the dataset, meaning a given locator may be present multiple times.
For brevity, the COCO dataset has been pared down to only 14 classes.
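To make the record layout concrete, here is a minimal sketch that parses a few rows in this format using only the standard library. The locators, labels, and coordinates in SAMPLE_CSV are made up for illustration and are not taken from the real metadata.csv; masks_by_locator is likewise a hypothetical helper name.

```python
import csv
import io
import json
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical rows in the metadata.csv layout (locator, label, points).
SAMPLE_CSV = '''locator,label,points
s3://example-bucket/images/0001.jpg,cat,"[[10.0, 10.0], [50.0, 10.0], [50.0, 40.0]]"
s3://example-bucket/images/0001.jpg,dog,"[[5.0, 5.0], [20.0, 5.0], [20.0, 30.0], [5.0, 30.0]]"
s3://example-bucket/images/0002.jpg,cat,"[[0.0, 0.0], [8.0, 0.0], [8.0, 8.0]]"
'''

Mask = Tuple[str, List[Tuple[float, float]]]  # (label, polygon vertices)


def masks_by_locator(csv_text: str) -> Dict[str, List[Mask]]:
    """Group ground truth masks by image locator, decoding the stringified points."""
    grouped: Dict[str, List[Mask]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The points column is a JSON-style string such as "[[x1, y1], [x2, y2]]".
        vertices = [(float(x), float(y)) for x, y in json.loads(row["points"])]
        grouped[row["locator"]].append((row["label"], vertices))
    return dict(grouped)


masks = masks_by_locator(SAMPLE_CSV)
for locator, image_masks in masks.items():
    print(locator, [label for label, _ in image_masks])
```

Because there is one row per mask, the first sample locator appears twice and ends up with two masks after grouping.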

Step 1: Creating Tests

With our data already in an S3 bucket and metadata loaded into memory, we can start creating test cases!
Let's create a simple test case containing the entire dataset:
import json
from typing import List, Tuple
from kolena.detection import TestCase, TestImage
from kolena.detection.ground_truth import SegmentationMask
# Converts points from the [[x1, y1], [x2, y2], ...] format in the CSV file to
# the [(x1, y1), (x2, y2), ...] format.
def as_point_tuples(points: List[List[float]]) -> List[Tuple[float, float]]:
    return [(point[0], point[1]) for point in points]
# One TestImage per locator, carrying every ground truth mask for that image.
complete_test_case = TestCase(f"complete {DATASET}", images=[
    TestImage(locator, dataset=DATASET, ground_truths=[
        SegmentationMask(record.label, as_point_tuples(json.loads(record.points)))
        for record in df_locator.itertuples()
    ]) for locator, df_locator in df.groupby("locator")
])
This dataset-sized test case is a good place to start, but let's drill a little deeper to create test cases for each class in the dataset:
complete_test_case_images = complete_test_case.load_images()
# One test case per class: every image is kept, but only that class's ground truths remain.
class_test_cases = [
    TestCase(f"{label} ({DATASET})", images=[
        image.filter(lambda gt: gt.label == label)
        for image in complete_test_case_images
    ]) for label, df_label in df.groupby("label")
]
Note that we're including every image in each class's test case so that each test case contains a sizable number of true negative images.
See test case best practices for more information on balancing positive and negative examples in your test cases.
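The effect of keeping every image in each class's test case can be sketched without the Kolena client. In the example below, the Image dataclass and filter_ground_truths function are hypothetical stand-ins for TestImage and its filter method, and the three sample images are made up; the sketch only counts positive and negative images per class.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Image:
    """Hypothetical stand-in for a test image: a locator plus ground truth labels."""
    locator: str
    labels: Tuple[str, ...] = ()


def filter_ground_truths(image: Image, label: str) -> Image:
    """Keep the image but drop every ground truth whose label differs from `label`."""
    return Image(image.locator, tuple(name for name in image.labels if name == label))


def class_balance(images: List[Image]) -> Dict[str, Tuple[int, int]]:
    """Return {label: (positive image count, negative image count)}."""
    all_labels = sorted({name for image in images for name in image.labels})
    balance = {}
    for label in all_labels:
        filtered = [filter_ground_truths(image, label) for image in images]
        # An image with no remaining ground truths is a negative example for this class.
        positives = sum(1 for image in filtered if image.labels)
        balance[label] = (positives, len(filtered) - positives)
    return balance


images = [
    Image("0001.jpg", ("cat", "dog")),
    Image("0002.jpg", ("cat",)),
    Image("0003.jpg", ()),
]
balance = class_balance(images)
```

Every class sees all three images, so a class that appears in only one image still gets two negative examples.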
In this tutorial we created only a few simple test cases, but more advanced test cases can be generated in a variety of fast and scalable ways. See Creating Test Cases for details.
Now that we have basic test cases for our entire dataset and for each class within the dataset, let's create a test suite to group them together:
from kolena.detection import TestSuite
test_suite = TestSuite(f"complete {DATASET}", test_cases=[
    complete_test_case, *class_test_cases
])
This test suite represents a basic starting point for testing on Kolena.

Step 2: Running Tests

With basic tests defined for the COCO dataset, we can start testing our models.
To start testing, we create an InferenceModel object describing the model being tested:
from kolena.detection import InferenceModel, TestImage
from kolena.detection.ground_truth import SegmentationMask
def infer(test_image: TestImage) -> List[SegmentationMask]:
    # Stubbed out: replace this with your model's inference logic.
    return []
model = InferenceModel("example-model", infer=infer, metadata=dict(
    description="Example model from quickstart tutorial",
))
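The shape of a stubbed-out infer function can also be sketched independently of the Kolena client. In the sketch below, StubImage and StubMask are hypothetical stand-ins for TestImage and SegmentationMask, and the fixed square polygon is an arbitrary placeholder for real model output:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class StubImage:
    """Hypothetical stand-in for TestImage: only the locator is modeled."""
    locator: str


@dataclass(frozen=True)
class StubMask:
    """Hypothetical stand-in for SegmentationMask: a label and polygon vertices."""
    label: str
    points: Tuple[Tuple[float, float], ...]


def stub_infer(image: StubImage) -> List[StubMask]:
    """Return the same placeholder mask for every image, ignoring its contents.

    A real implementation would load the image at `image.locator` and run a model.
    """
    square = ((0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0))
    return [StubMask("placeholder", square)]


predictions = stub_infer(StubImage("s3://example-bucket/images/0001.jpg"))
```

The important part is the contract: one image in, a list of masks out, with one entry per predicted object.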
Finally, let's test:
from kolena.detection import test
test(model, test_suite)
That's it! We can now visit the web platform to analyze and debug our model's performance on this test suite.


In this quickstart tutorial we learned how to create new tests for instance segmentation datasets and how to test instance segmentation models on Kolena.
What we learned here just scratches the surface of what's possible with Kolena and covers only a fraction of the kolena-client API. Now that we're up and running, we can think about ways to create more detailed tests, improve existing tests, and dive deep into model behaviors.