Upload ZOD data
This tutorial guides you through uploading different scene types using the Zenseact Open Dataset (ZOD). It shows some of the steps that may be needed to convert recordings into Kognic scenes.
Prerequisites & Dependencies
- To follow along with this guide you need to download the data from the Zenseact Open Dataset. The data should be structured like this:
```
zod
├── sequences
│   ├── 000000
│   ├── 000002
│   ├── ...
└── trainval-sequences-mini.json
```
- You will also need to install the ZOD Python package from PyPI, which provides some abstractions for reading the data:
```bash
pip install zod
```
- You need to have a Kognic account and the Kognic Python client installed. If you have not done this yet, read the quickstart guide.
This guide follows the process of uploading scenes using ZOD data, based on the example code from the Kognic IO ZOD examples repository, which contains the complete source files for all of the snippets on this page. The examples are runnable if you have the data available and Kognic authentication set up. Three scene types are covered:
- Cameras Sequence
- Lidars and Cameras Sequence
- Aggregated Lidars and Cameras Sequence
Cameras Sequence

Our example code initialises a Kognic IO client at the top level, then uses a function to create scenes from ZOD data, potentially several at once.
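As an illustration, a minimal sketch of that structure might look like the following; `create_scenes` and the `zod_conversion` module are hypothetical stand-ins for the create function in the examples repository.

```python
from kognic.io.client import KognicIOClient

# Hypothetical helper standing in for the create function in the examples
# repository; it converts and uploads one scene per ZOD sequence.
from zod_conversion import create_scenes

client = KognicIOClient()  # authenticates with your Kognic credentials
create_scenes(client, dataset_root="/data/zod", max_scenes=3, dry_run=True)
```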
The first step in creating the scene is to load and iterate ZOD sequences, picking as many as we are interested in.
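A sketch of that loading step, assuming the `zod` package's `ZodSequences` loader (class and method names may differ between versions):

```python
from itertools import islice

from zod import ZodSequences

# Load the mini trainval split downloaded earlier.
zod_sequences = ZodSequences(dataset_root="/data/zod", version="mini")

# Pick as many sequences as we want to upload.
for seq_id in islice(zod_sequences.get_all_ids(), 3):
    sequence = zod_sequences[seq_id]
    ...
```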
Then we must convert each sequence into a Kognic scene. Once the ZOD frames are converted, creating a single camera sequence is very easy.
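A sketch of that construction, assuming the `CamerasSequence` model from kognic-io (import path and exact fields may differ; the examples repository has the authoritative version):

```python
from kognic.io.model.scene.cameras_sequence import CamerasSequence

# `frames` comes from the frame conversion described below.
scene = CamerasSequence(
    external_id=f"zod-cameras-sequence-{seq_id}",
    frames=frames,
)
```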
Converting the frames themselves is more complex. We need to add all the sensors that we are interested in: in this case only the `FRONT` camera. We must also convert timestamps to a different precision as we go:
- ZOD frame start timestamps are in fractional seconds.
- Kognic frame relative timestamps are in milliseconds.
- We use integer nanoseconds as an intermediate.
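A sketch of the frame conversion under those assumptions; the sensor name, the ZOD attribute access and the kognic-io `Frame` fields are all assumptions to be checked against the real example code:

```python
from kognic.io.model.scene.cameras_sequence import Frame

FRONT = "FRONT"  # sensor name we use for the camera in the Kognic scene

def convert_frames(sequence) -> list[Frame]:
    # Assumes the ZOD sequence exposes camera frames with a datetime `time`.
    camera_frames = sequence.info.get_camera_frames()
    start_ns = int(camera_frames[0].time.timestamp() * 1e9)  # fractional s -> integer ns

    frames = []
    for idx, camera_frame in enumerate(camera_frames):
        frame_ns = int(camera_frame.time.timestamp() * 1e9)
        relative_ms = (frame_ns - start_ns) // 1_000_000  # ns -> ms, relative to scene start
        frames.append(
            Frame(
                frame_id=str(idx),
                relative_timestamp=relative_ms,
                images=[convert_image(camera_frame, sensor=FRONT)],
            )
        )
    return frames
```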
Converting the camera frame to an image is a simple mapping in this case, which we have abstracted into a helper. Note that we do not know the shutter timing of the ZOD frames, so we set it to 1 ns in this example. This is not a problem in this case, since there is no 3D data.
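A sketch of that mapping, assuming a `convert_image` helper like the one abstracted out in the example code:

```python
from kognic.io.model import Image

def convert_image(camera_frame, sensor: str) -> Image:
    # A plain image resource pointing at the ZOD image file. The real example
    # also attaches shutter start/end timestamps (a 1 ns exposure) via the image
    # metadata; we leave the exact field names to the examples repository.
    return Image(filename=str(camera_frame.filepath), sensor_name=sensor)
```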
Going back to the main create function, we move on to creating the scene. Here we simply hand the scene (a `CamerasSequence`) to Kognic IO to create for us. If it is not a dry run, we get back the UUID of the created scene; if it is a dry run, expect `None`.
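A sketch of that call; the `scene_uuid` attribute name on the response is an assumption:

```python
# With dry_run=True the scene is only validated and None is returned.
response = client.cameras_sequence.create(scene, dry_run=dry_run)
if response is not None:
    print(f"Created scene {response.scene_uuid}")
```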
Lidars and Cameras Sequence

This example follows the same broad structure as the cameras-only sequence, with the addition of:
- A LiDAR sensor.
- A calibration, to allow projection between 2D and 3D coordinate systems.
- Conversion of point clouds from ZOD's packed NumPy arrays.
- Conversion of ego poses for each frame.
As before, we initialise a Kognic IO client at the top level, then use a function to create scenes from ZOD data, potentially several at once.
When using both camera and LiDAR sensors we need a calibration to relate them to each other in space. The process for converting and creating the scene thus gains a step: calibration conversion.
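A sketch of that extra step, assuming kognic-io's `SensorCalibration` model and calibration endpoint (exact import path and response fields may differ), with hypothetical `convert_camera_calibration` and `convert_lidar_calibration` helpers like the utilities in the examples repository:

```python
from kognic.io.model.calibration import SensorCalibration

VELODYNE = "VELODYNE"  # sensor name we use for the lidar in the Kognic scene

# `zod_calibration` is taken from the ZOD sequence; the hypothetical helpers
# below unpack it into Kognic camera and lidar calibrations.
sensor_calibration = SensorCalibration(
    external_id=f"zod-sequence-{seq_id}-calibration",
    calibration={
        FRONT: convert_camera_calibration(zod_calibration),
        VELODYNE: convert_lidar_calibration(zod_calibration),
    },
)
calibration_id = client.calibration.create_calibration(sensor_calibration).id
```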
ZOD provides calibrations that we need to map to Kognic's format. We've provided some utility functions in the examples repository that do this work for the sensors used in this example: the `FRONT` camera and the `VELODYNE` lidar.
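For instance, the LiDAR part boils down to splitting a 4x4 lidar-to-reference transform into a position and a rotation quaternion. A sketch, assuming kognic-io's `LidarCalibration`, `Position` and `RotationQuaternion` models (the `convert_lidar_calibration` helper above would extract the transform from the ZOD calibration object and do something like this):

```python
import numpy as np
from scipy.spatial.transform import Rotation

from kognic.io.model.calibration import LidarCalibration, Position, RotationQuaternion

def lidar_calibration_from_transform(transform: np.ndarray) -> LidarCalibration:
    """Build a Kognic lidar calibration from a 4x4 lidar-to-reference transform."""
    qx, qy, qz, qw = Rotation.from_matrix(transform[:3, :3]).as_quat()
    x, y, z = transform[:3, 3]
    return LidarCalibration(
        position=Position(x=x, y=y, z=z),
        rotation_quaternion=RotationQuaternion(w=qw, x=qx, y=qy, z=qz),
    )
```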
Take a look at the example code for the full LiDAR and camera calibration conversion details; suffice it to say we must unpack the intrinsics and extrinsics from the ZOD format and plug them into the Kognic format.
Next we need to create the frames. As with camera-only sequences, each frame contains an image for each camera sensor, but now also a point cloud per LiDAR. We also specify the ego vehicle pose for each frame, which describes how the vehicle has moved through the world. This is optional but valuable: it allows very accurate interpolation across frames, which simplifies annotation of static objects that do not move in world space even though they do move in the reference coordinate system.
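A sketch of building one such frame, assuming the kognic-io `Frame`, `PointCloud` and `EgoVehiclePose` models (import paths and field names should be checked against the example code):

```python
from scipy.spatial.transform import Rotation

from kognic.io.model import PointCloud
from kognic.io.model.calibration import Position, RotationQuaternion
from kognic.io.model.ego import EgoVehiclePose
from kognic.io.model.scene.lidars_and_cameras_sequence import Frame

def convert_frame(idx, relative_ms, camera_frame, pcd_path, pose) -> Frame:
    """Build one frame; `pose` is the 4x4 ego pose (see the coordinate-system note below)."""
    qx, qy, qz, qw = Rotation.from_matrix(pose[:3, :3]).as_quat()
    return Frame(
        frame_id=str(idx),
        relative_timestamp=relative_ms,
        images=[convert_image(camera_frame, sensor=FRONT)],
        point_clouds=[PointCloud(filename=pcd_path, sensor_name=VELODYNE)],
        ego_vehicle_pose=EgoVehiclePose(
            position=Position(x=pose[0, 3], y=pose[1, 3], z=pose[2, 3]),
            rotation=RotationQuaternion(w=qw, x=qx, y=qy, z=qz),
        ),
    )
```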
We convert the pointclouds from ZOD's packed NumPy arrays (`.npy`) to one of the formats supported by Kognic IO. Refer to the linked `conversion.py` for details of exactly how the data is read and reformatted.
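As one possible conversion (not necessarily the one in `conversion.py`), the sketch below repacks x, y, z and intensity into an ASCII `.pcd` file; the field names assumed for the ZOD structured array should be checked against the real data:

```python
import numpy as np

def npy_to_ascii_pcd(npy_path: str, pcd_path: str) -> None:
    """Repack a ZOD lidar .npy file into an ASCII .pcd file."""
    points = np.load(npy_path)
    # Field names in the structured array are assumptions about the ZOD layout.
    xyzi = np.stack(
        [points["x"], points["y"], points["z"], points["intensity"]], axis=1
    ).astype(np.float32)

    header = "\n".join([
        "# .PCD v0.7 - Point Cloud Data file format",
        "VERSION 0.7",
        "FIELDS x y z intensity",
        "SIZE 4 4 4 4",
        "TYPE F F F F",
        "COUNT 1 1 1 1",
        f"WIDTH {len(xyzi)}",
        "HEIGHT 1",
        "VIEWPOINT 0 0 0 1 0 0 0",
        f"POINTS {len(xyzi)}",
        "DATA ascii",
    ])
    with open(pcd_path, "w") as f:
        f.write(header + "\n")
        np.savetxt(f, xyzi, fmt="%.6f")
```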
In the Kognic platform, single-LiDAR scenes should have their ego motion data expressed in the LiDAR coordinate system, while multi-LiDAR scenes use the reference coordinate system. Since ZOD uses the reference coordinate system (which moves with the vehicle), we convert to the LiDAR's coordinate system by applying the calibration transform:
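In terms of 4x4 transforms, that conversion is a single matrix product; a sketch:

```python
import numpy as np

def ego_pose_in_lidar_frame(ego_pose: np.ndarray, lidar_extrinsics: np.ndarray) -> np.ndarray:
    """Express a per-frame ego pose as the pose of the LiDAR sensor instead.

    ego_pose: 4x4 pose of the reference (ego) frame in world coordinates, from ZOD.
    lidar_extrinsics: 4x4 transform from the LiDAR frame to the reference frame,
    i.e. the calibration converted earlier.
    """
    return ego_pose @ lidar_extrinsics
```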
Once we have created the frames, the main create function proceeds as it did in the cameras example: build the scene object (now a `LidarsAndCamerasSequence` referencing the calibration created above) and post it to Kognic.
Aggregated Lidars and Cameras Sequence

This example follows the same structure as the LiDAR-and-camera sequence example.
Aggregated scenes are a special case of LiDAR + camera sequence scenes where the LiDAR data is aggregated across frames into a single pointcloud. This gives a dense, static pointcloud that represents the entire scene across all frames.
Aggregated scenes may be created by providing a pointcloud on every frame and allowing the Kognic platform to handle aggregation, or they may be pre-aggregated and uploaded by specifying a pointcloud on the first frame and nothing on subsequent frames.
In the case of ZOD data, we only have per-frame pointclouds, so the example uploads a pointcloud on every frame and leaves aggregation to the platform. As such it is very similar to the LiDAR-and-camera sequence example, except that:
- The scene type is different: `AggregatedLidarsAndCamerasSequence` instead of `LidarsAndCamerasSequence` (see the sketch after this list).
- The `Frame`s are of an aggregated-scene-specific type (also shown in the sketch below).
- Ego pose data is required.
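A sketch of those differences; the import path, fields and endpoint name are assumptions based on kognic-io's naming scheme, so check the examples repository for the exact ones:

```python
# The aggregated-specific Frame class lives in the same module as the scene type.
from kognic.io.model.scene.aggregated_lidars_and_cameras_seq import (
    AggregatedLidarsAndCamerasSequence,
)

# `frames` is built as in the previous example, but using that module's Frame
# class, and every frame carries an ego_vehicle_pose.
scene = AggregatedLidarsAndCamerasSequence(
    external_id=f"zod-sequence-{seq_id}-aggregated",
    calibration_id=calibration_id,
    frames=frames,
)
response = client.aggregated_lidars_and_cameras_seq.create(scene, dry_run=dry_run)
```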
Otherwise the two approaches are very similar; refer to the Lidars and Cameras Sequence section above.