Quickstart: Instance Segmentation
In this quickstart tutorial we'll use the COCO 2014 Validation dataset and a stubbed out example model to create and run tests for the Instance Segmentation workflow.
import os
import kolena
kolena.initialize(os.environ["KOLENA_TOKEN"], verbose=True)
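Indexing os.environ directly raises a bare KeyError when KOLENA_TOKEN is not set. A minimal sketch of a friendlier check follows; require_env is a hypothetical helper introduced here for illustration and is not part of kolena-client.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


# Usage (assumes kolena is imported as in the snippet above):
# kolena.initialize(require_env("KOLENA_TOKEN"), verbose=True)
```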
The data used in this tutorial is publicly available in the kolena-public-datasets S3 bucket in an instance_segmentation/metadata.csv file:
import pandas as pd
DATASET = "coco-2014-val"
BUCKET = "s3://kolena-public-datasets"
df = pd.read_csv(f"{BUCKET}/{DATASET}/meta/instance_segmentation/metadata.csv")
To load CSVs directly from S3, install the s3fs Python module and set up AWS credentials:
pip3 install s3fs[boto3]
This metadata.csv file describes an image segmentation dataset with the following columns:
locator: location of the image in S3
label: label corresponding to the object described by this record's segmentation mask
points: stringified list of the [x, y] coordinates of the segmentation mask vertices
There is one record in this table for each ground truth segmentation mask in the dataset, meaning a given locator may be present multiple times.
For brevity, the COCO dataset has been pared down to only 14 classes.
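To make the one-record-per-mask layout concrete, here is a small sketch that groups records by locator. The rows are made-up sample data in the same three-column layout (not taken from the real file), and the standard library csv module stands in for pandas so that the snippet runs without S3 access.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical rows in the same three-column layout as metadata.csv.
SAMPLE_CSV = '''locator,label,points
s3://example-bucket/images/0001.jpg,cat,"[[10.0, 10.0], [50.0, 10.0], [50.0, 40.0]]"
s3://example-bucket/images/0001.jpg,dog,"[[60.0, 20.0], [90.0, 20.0], [90.0, 70.0], [60.0, 70.0]]"
s3://example-bucket/images/0002.jpg,cat,"[[5.0, 5.0], [25.0, 5.0], [15.0, 30.0]]"
'''

# Collect every (label, vertices) pair under the image it belongs to.
masks_by_locator = defaultdict(list)
for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    vertices = json.loads(row["points"])  # stringified list of [x, y] pairs
    masks_by_locator[row["locator"]].append((row["label"], vertices))

for locator, masks in masks_by_locator.items():
    print(locator, [(label, len(vertices)) for label, vertices in masks])
```

The first sample image appears in two records, one per mask, which is why the tutorial groups by locator before building test images.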
With our data already in an S3 bucket and metadata loaded into memory, we can start creating test cases!
Let's create a simple test case containing the entire dataset:
import json
from typing import List, Tuple
from kolena.detection import TestCase, TestImage
from kolena.detection.ground_truth import SegmentationMask
# Converts points from the [[x1, y1], [x2, y2], ...] format in the CSV file to
# the [(x1, y1), (x2, y2), ...] format.
def as_point_tuples(points: List[List[float]]) -> List[Tuple[float, float]]:
    return [(point[0], point[1]) for point in points]
complete_test_case = TestCase(f"complete {DATASET}", images=[
    TestImage(locator, dataset=DATASET, ground_truths=[
        SegmentationMask(record.label, as_point_tuples(json.loads(record.points)))
        for record in df_locator.itertuples()
    ]) for locator, df_locator in df.groupby("locator")
])
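As a quick check of the conversion step on its own, the sketch below applies as_point_tuples to a single stringified points value. The value is a made-up example, not a record from the dataset.

```python
import json
from typing import List, Tuple


def as_point_tuples(points: List[List[float]]) -> List[Tuple[float, float]]:
    return [(point[0], point[1]) for point in points]


# Hypothetical contents of one `points` cell.
raw = "[[10.0, 10.0], [50.0, 10.0], [50.0, 40.0]]"
vertices = as_point_tuples(json.loads(raw))
print(vertices)  # [(10.0, 10.0), (50.0, 10.0), (50.0, 40.0)]
```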
This dataset-sized test case is a good place to start, but let's drill a little deeper to create test cases for each class in the dataset:
complete_test_case_images = complete_test_case.load_images()
class_test_cases = [
    TestCase(f"{label} ({DATASET})", images=[
        image.filter(lambda gt: gt.label == label)
        for image in complete_test_case_images
    ]) for label, df_label in df.groupby("label")
]
Note that we're including every image in each class's test case so that each test case contains a sizable number of true negative images.
See test case best practices for more information on balancing positive and negative examples in your test cases.
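The effect of the per-class filter can be sketched with stand-in types. The Mask and Image classes below are hypothetical simplifications, not the kolena classes, and the sketch assumes that filter keeps the image while dropping ground truths that do not match the predicate, which is consistent with the note above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Mask:
    """Stand-in for a ground truth segmentation mask (label only)."""
    label: str


@dataclass(frozen=True)
class Image:
    """Stand-in for a test image; not the kolena TestImage class."""
    locator: str
    ground_truths: List[Mask] = field(default_factory=list)

    def filter(self, predicate: Callable[[Mask], bool]) -> "Image":
        # Keep the image itself, drop ground truths that fail the predicate.
        return Image(self.locator, [gt for gt in self.ground_truths if predicate(gt)])


images = [
    Image("a.jpg", [Mask("cat"), Mask("dog")]),
    Image("b.jpg", [Mask("dog")]),
]

# Build the image list for a hypothetical "cat" test case.
cat_images = [image.filter(lambda gt: gt.label == "cat") for image in images]
positives = [image.locator for image in cat_images if image.ground_truths]
negatives = [image.locator for image in cat_images if not image.ground_truths]
print(positives, negatives)  # ['a.jpg'] ['b.jpg']
```

Under this assumption, b.jpg stays in the "cat" test case with no ground truths, so any "cat" prediction on it counts against the model.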
In this tutorial we created only a few simple test cases, but more advanced test cases can be generated in a variety of fast and scalable ways. See Creating Test Cases for details.
Now that we have basic test cases for our entire dataset and for each class within the dataset, let's create a test suite to group them together:
from kolena.detection import TestSuite
test_suite = TestSuite(f"complete {DATASET}", test_cases=[
    complete_test_case, *class_test_cases
])
This test suite represents a basic starting point for testing on Kolena.
With basic tests defined for the COCO dataset, we can start testing our models.
To start testing, we create an InferenceModel object describing the model being tested:
from kolena.detection import InferenceModel, TestImage
from kolena.detection.ground_truth import SegmentationMask
def infer(test_image: TestImage) -> List[SegmentationMask]:
    ...
model = InferenceModel("example-model", infer=infer, metadata=dict(
    description="Example model from quickstart tutorial",
))
Finally, let's test:
from kolena.detection import test
test(model, test_suite)
That's it! We can now visit the web platform to analyze and debug our model's performance on this test suite.
In this quickstart tutorial we learned how to create new tests for instance segmentation datasets and how to test instance segmentation models on Kolena.
What we learned here just scratches the surface of what's possible with Kolena and covers only a fraction of the kolena-client API. Now that we're up and running, we can think about ways to create more detailed tests, improve existing tests, and dive deep into model behaviors.