Upload ZOD data
This tutorial guides you through uploading different scene types using the Zenseact Open Dataset (ZOD). It shows some of the steps that may be needed to convert recordings into Kognic scenes.
Prerequisites & Dependencies
- To follow along with this guide you need to download the data from the Zenseact Open Dataset. The data should be structured like this:
```
zod
├── sequences
│   ├── 000000
│   ├── 000002
│   ├── ...
└── trainval-sequences-mini.json
```
- You will also need to install the ZOD Python package from PyPI, which provides some abstractions for reading the data:
```bash
pip install zod
```
- You need to have a Kognic account and the Kognic Python client installed. If you have not done this yet, read the quickstart guide.
This guide follows the process of uploading scenes using ZOD data, based on the example code from the Kognic IO ZOD examples repository, which contains the complete source files for all of the snippets on this page. The examples are runnable if you have the data available and Kognic authentication set up. Three scene types are covered:
- Cameras Sequence
- Lidars and Cameras Sequence
- Aggregated Lidars and Cameras Sequence
Cameras Sequence

Our example code initialises a Kognic IO client at the top level, then uses a function to create scenes from ZOD data, potentially several at once.
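As an illustration, a minimal sketch of that structure might look like the following; `create_scenes` and the `zod_conversion` module are hypothetical stand-ins for the create function in the examples repository.

```python
from kognic.io.client import KognicIOClient

# Hypothetical helper standing in for the create function in the examples
# repository; it converts and uploads one scene per ZOD sequence.
from zod_conversion import create_scenes

client = KognicIOClient()  # authenticates with your Kognic credentials
create_scenes(client, dataset_root="/data/zod", max_scenes=3, dry_run=True)
```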
The first step in creating the scene is to load and iterate ZOD sequences, picking as many as we are interested in.
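A sketch of that loading step, assuming the `zod` package's `ZodSequences` loader (class and method names may differ between versions):

```python
from itertools import islice

from zod import ZodSequences

# Load the mini trainval split downloaded earlier.
zod_sequences = ZodSequences(dataset_root="/data/zod", version="mini")

# Pick as many sequences as we want to upload.
for seq_id in islice(zod_sequences.get_all_ids(), 3):
    sequence = zod_sequences[seq_id]
    ...
```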
Then we must convert each sequence into a Kognic scene. Once the ZOD frames are converted, creating a single camera sequence is very easy.
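A sketch of that construction, assuming the `CamerasSequence` model from kognic-io (import path and exact fields may differ; the examples repository has the authoritative version):

```python
from kognic.io.model.scene.cameras_sequence import CamerasSequence

# `frames` comes from the frame conversion described below.
scene = CamerasSequence(
    external_id=f"zod-cameras-sequence-{seq_id}",
    frames=frames,
)
```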
Converting the frames themselves is more complex. We need to add all the sensors that we are interested in: in this case only the `FRONT` camera. We must also convert timestamps to a different precision as we go:
- ZOD frame start timestamps are in fractional seconds.
- Kognic frame relative timestamps are in milliseconds.
- We use integer nanoseconds as an intermediate.
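A sketch of the frame conversion under those assumptions; the sensor name, the ZOD attribute access and the kognic-io `Frame` fields are all assumptions to be checked against the real example code:

```python
from kognic.io.model.scene.cameras_sequence import Frame

FRONT = "FRONT"  # sensor name we use for the camera in the Kognic scene

def convert_frames(sequence) -> list[Frame]:
    # Assumes the ZOD sequence exposes camera frames with a datetime `time`.
    camera_frames = sequence.info.get_camera_frames()
    start_ns = int(camera_frames[0].time.timestamp() * 1e9)  # fractional s -> integer ns

    frames = []
    for idx, camera_frame in enumerate(camera_frames):
        frame_ns = int(camera_frame.time.timestamp() * 1e9)
        relative_ms = (frame_ns - start_ns) // 1_000_000  # ns -> ms, relative to scene start
        frames.append(
            Frame(
                frame_id=str(idx),
                relative_timestamp=relative_ms,
                images=[convert_image(camera_frame, sensor=FRONT)],
            )
        )
    return frames
```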
Converting the camera frame to an image is a simple mapping in this case, which we have abstracted into a helper. Note that we do not know the shutter timing of the ZOD frames, so we set it to 1 ns in this example. This is not a problem in this case, since there is no 3D data.
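A sketch of that mapping, assuming a `convert_image` helper like the one abstracted out in the example code:

```python
from kognic.io.model import Image

def convert_image(camera_frame, sensor: str) -> Image:
    # A plain image resource pointing at the ZOD image file. The real example
    # also attaches shutter start/end timestamps (a 1 ns exposure) via the image
    # metadata; we leave the exact field names to the examples repository.
    return Image(filename=str(camera_frame.filepath), sensor_name=sensor)
```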
Going back to the main create function, we move on to creating the scene. Here we simply hand the scene (a `CamerasSequence`) to Kognic IO to create for us. If it is not a dry run, we get back the UUID of the created scene; if it is a dry run, expect `None`.
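A sketch of that call; the `scene_uuid` attribute name on the response is an assumption:

```python
# With dry_run=True the scene is only validated and None is returned.
response = client.cameras_sequence.create(scene, dry_run=dry_run)
if response is not None:
    print(f"Created scene {response.scene_uuid}")
```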
Lidars and Cameras Sequence

This example follows the same broad structure as the cameras-only sequence, with the addition of:
- A LiDAR sensor.
- A calibration, to allow projection between 2D and 3D coordinate systems.
- Conversion of point clouds from ZOD's packed NumPy arrays.
- Conversion of ego poses for each frame.
As before, we initialise a Kognic IO client at the top level, then use a function to create scenes from ZOD data, potentially several at once.
When using both camera and LiDAR sensors we need a calibration to relate them to each other in space. The process for converting and creating the scene thus gains a step: calibration conversion.
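A sketch of that extra step, assuming kognic-io's `SensorCalibration` model and calibration endpoint (exact import path and response fields may differ), with hypothetical `convert_camera_calibration` and `convert_lidar_calibration` helpers like the utilities in the examples repository:

```python
from kognic.io.model.calibration import SensorCalibration

VELODYNE = "VELODYNE"  # sensor name we use for the lidar in the Kognic scene

# `zod_calibration` is taken from the ZOD sequence; the hypothetical helpers
# below unpack it into Kognic camera and lidar calibrations.
sensor_calibration = SensorCalibration(
    external_id=f"zod-sequence-{seq_id}-calibration",
    calibration={
        FRONT: convert_camera_calibration(zod_calibration),
        VELODYNE: convert_lidar_calibration(zod_calibration),
    },
)
calibration_id = client.calibration.create_calibration(sensor_calibration).id
```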
ZOD provides calibrations that we need to map to Kognic's format. We've provided some utility functions in the examples repository that do this work for the sensors used in this example: the `FRONT` camera and the `VELODYNE` lidar.
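For instance, the LiDAR part boils down to splitting a 4x4 lidar-to-reference transform into a position and a rotation quaternion. A sketch, assuming kognic-io's `LidarCalibration`, `Position` and `RotationQuaternion` models (the `convert_lidar_calibration` helper above would extract the transform from the ZOD calibration object and do something like this):

```python
import numpy as np
from scipy.spatial.transform import Rotation

from kognic.io.model.calibration import LidarCalibration, Position, RotationQuaternion

def lidar_calibration_from_transform(transform: np.ndarray) -> LidarCalibration:
    """Build a Kognic lidar calibration from a 4x4 lidar-to-reference transform."""
    qx, qy, qz, qw = Rotation.from_matrix(transform[:3, :3]).as_quat()
    x, y, z = transform[:3, 3]
    return LidarCalibration(
        position=Position(x=x, y=y, z=z),
        rotation_quaternion=RotationQuaternion(w=qw, x=qx, y=qy, z=qz),
    )
```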
Take a look at the example code for the full LiDAR and camera calibration conversion details; suffice it to say we must unpack the intrinsics and extrinsics from the ZOD format and plug them into the Kognic format.
Next we need to create the frames. As with camera-only sequences, each frame contains an image for each camera sensor, but now also a point cloud per LiDAR. We also specify the ego vehicle pose for each frame, which describes how the vehicle has moved through the world. This is optional but valuable: it allows very accurate interpolation across frames, which simplifies annotation of static objects that do not move in world space even though they do move in the reference coordinate system.
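A sketch of building one such frame, assuming the kognic-io `Frame`, `PointCloud` and `EgoVehiclePose` models (import paths and field names should be checked against the example code):

```python
from scipy.spatial.transform import Rotation

from kognic.io.model import PointCloud
from kognic.io.model.calibration import Position, RotationQuaternion
from kognic.io.model.ego import EgoVehiclePose
from kognic.io.model.scene.lidars_and_cameras_sequence import Frame

def convert_frame(idx, relative_ms, camera_frame, pcd_path, pose) -> Frame:
    """Build one frame; `pose` is the 4x4 ego pose (see the coordinate-system note below)."""
    qx, qy, qz, qw = Rotation.from_matrix(pose[:3, :3]).as_quat()
    return Frame(
        frame_id=str(idx),
        relative_timestamp=relative_ms,
        images=[convert_image(camera_frame, sensor=FRONT)],
        point_clouds=[PointCloud(filename=pcd_path, sensor_name=VELODYNE)],
        ego_vehicle_pose=EgoVehiclePose(
            position=Position(x=pose[0, 3], y=pose[1, 3], z=pose[2, 3]),
            rotation=RotationQuaternion(w=qw, x=qx, y=qy, z=qz),
        ),
    )
```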
We convert the pointclouds from ZOD's packed NumPy arrays (`.npy`) to one of the formats supported by Kognic IO. Refer to the linked `conversion.py` for details of exactly how the data is read and reformatted.
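As one possible conversion (not necessarily the one in `conversion.py`), the sketch below repacks x, y, z and intensity into an ASCII `.pcd` file; the field names assumed for the ZOD structured array should be checked against the real data:

```python
import numpy as np

def npy_to_ascii_pcd(npy_path: str, pcd_path: str) -> None:
    """Repack a ZOD lidar .npy file into an ASCII .pcd file."""
    points = np.load(npy_path)
    # Field names in the structured array are assumptions about the ZOD layout.
    xyzi = np.stack(
        [points["x"], points["y"], points["z"], points["intensity"]], axis=1
    ).astype(np.float32)

    header = "\n".join([
        "# .PCD v0.7 - Point Cloud Data file format",
        "VERSION 0.7",
        "FIELDS x y z intensity",
        "SIZE 4 4 4 4",
        "TYPE F F F F",
        "COUNT 1 1 1 1",
        f"WIDTH {len(xyzi)}",
        "HEIGHT 1",
        "VIEWPOINT 0 0 0 1 0 0 0",
        f"POINTS {len(xyzi)}",
        "DATA ascii",
    ])
    with open(pcd_path, "w") as f:
        f.write(header + "\n")
        np.savetxt(f, xyzi, fmt="%.6f")
```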
In the Kognic platform, single-LiDAR scenes should have their ego motion data expressed in the LiDAR coordinate system, while multi-LiDAR scenes use the reference coordinate system. Since ZOD uses the reference coordinate system (which moves with the vehicle), we convert to the LiDAR's coordinate system by applying the calibration transform:
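In terms of 4x4 transforms, that conversion is a single matrix product; a sketch:

```python
import numpy as np

def ego_pose_in_lidar_frame(ego_pose: np.ndarray, lidar_extrinsics: np.ndarray) -> np.ndarray:
    """Express a per-frame ego pose as the pose of the LiDAR sensor instead.

    ego_pose: 4x4 pose of the reference (ego) frame in world coordinates, from ZOD.
    lidar_extrinsics: 4x4 transform from the LiDAR frame to the reference frame,
    i.e. the calibration converted earlier.
    """
    return ego_pose @ lidar_extrinsics
```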
Once we have created the frames, the main create function proceeds as it did in the cameras example: build the scene object (now a `LidarsAndCamerasSequence` referencing the calibration created above) and post it to Kognic.
Aggregated Lidars and Cameras Sequence

This example follows the same structure as the LiDAR-and-camera sequence example.
Aggregated scenes are a special case of LiDAR + camera sequence scenes where the LiDAR data is aggregated across frames into a single pointcloud. This gives a dense, static pointcloud that represents the entire scene across all frames.
Aggregated scenes may be created by providing a pointcloud on every frame and allowing the Kognic platform to handle aggregation, or they may be pre-aggregated and uploaded by specifying a pointcloud on the first frame and nothing on subsequent frames.
In the case of ZOD data, we only have per-frame pointclouds, so the example uploads a pointcloud on every frame and leaves aggregation to the platform. As such it is very similar to the LiDAR-and-camera sequence example, except that:
- The scene type is different: `AggregatedLidarsAndCamerasSequence` instead of `LidarsAndCamerasSequence` (see the sketch after this list).
- The `Frame`s are of an aggregated-scene-specific type (also shown in the sketch below).
- Ego pose data is required.
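A sketch of those differences; the import path, fields and endpoint name are assumptions based on kognic-io's naming scheme, so check the examples repository for the exact ones:

```python
# The aggregated-specific Frame class lives in the same module as the scene type.
from kognic.io.model.scene.aggregated_lidars_and_cameras_seq import (
    AggregatedLidarsAndCamerasSequence,
)

# `frames` is built as in the previous example, but using that module's Frame
# class, and every frame carries an ego_vehicle_pose.
scene = AggregatedLidarsAndCamerasSequence(
    external_id=f"zod-sequence-{seq_id}-aggregated",
    calibration_id=calibration_id,
    frames=frames,
)
response = client.aggregated_lidars_and_cameras_seq.create(scene, dry_run=dry_run)
```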
Otherwise the two approaches are very similar; refer to the Lidars and Cameras Sequence section above.