vision3d.transforms.functional

Functional form of the 3D transforms in vision3d.transforms.

Functions

`accumulate_sweeps`(sweeps, transforms, ...)	Accumulate and time-stamp a set of lidar sweeps.
`center_crop_camera_intrinsics`(inpt, output_size)	Update `CameraIntrinsics` for a center crop of the corresponding image.
`crop_camera_intrinsics`(inpt, top, left, ...)	Update `CameraIntrinsics` for a crop of the corresponding image.
`flip_3d`(inpt, *, axis)	Flip a tensor along a 3D spatial axis.
`flip_3d_bounding_boxes`(boxes, *, format, axis)	Flip 3D bounding boxes along `axis`.
`flip_3d_point_cloud`(points, *, axis)	Flip point cloud coordinates along `axis`.
`horizontal_flip_bounding_boxes_3d`(inpt)	Flip `BoundingBoxes3D` to match a horizontal image flip.
`horizontal_flip_camera_extrinsics`(inpt)	Update `CameraExtrinsics` for a horizontal image flip.
`horizontal_flip_camera_intrinsics`(inpt)	Update `CameraIntrinsics` for a horizontal flip of the corresponding image.
`horizontal_flip_point_cloud_3d`(inpt)	Flip a `PointCloud3D` to match a horizontal image flip.
`jitter_points`(inpt, *, noise)	Dispatcher entry point for point jittering.
`jitter_points_point_cloud`(points, *, noise)	Add noise to point xyz coordinates.
`pad_camera_intrinsics`(inpt, padding, **kwargs)	Update `CameraIntrinsics` for a pad of the corresponding image.
`register_kernel`(functional, tv_tensor_cls, *)	Register a kernel for a functional and TVTensor type.
`resize_camera_intrinsics`(inpt, size[, max_size])	Update `CameraIntrinsics` for a resize of the corresponding image.
`resized_crop_camera_intrinsics`(inpt, top, ...)	Update `CameraIntrinsics` for a crop followed by a resize.
`rotate_3d`(inpt, *, rotation_matrix)	Rotate a tensor by a 3x3 rotation matrix.
`rotate_3d_bounding_boxes`(boxes, *, format, ...)	Rotate 3D bounding boxes by `rotation_matrix`.
`rotate_3d_camera_extrinsics`(extrinsics, *, ...)	Update camera extrinsics after rotating the lidar frame.
`rotate_3d_point_cloud`(points, *, rotation_matrix)	Rotate point cloud coordinates by `rotation_matrix`.
`sample_points`(inpt, *, indices)	Dispatcher entry point for point sampling.
`sample_points_point_cloud`(points, *, indices)	Select points by index.
`scale_3d`(inpt, *, factor)	Scale a tensor by a uniform factor.
`scale_3d_bounding_boxes`(boxes, *, format, factor)	Scale 3D bounding boxes by `factor`.
`scale_3d_camera_extrinsics`(extrinsics, *, factor)	Update camera extrinsics after scaling the lidar frame.
`scale_3d_point_cloud`(points, *, factor)	Scale point cloud coordinates by `factor`.
`shuffle_points`(inpt, *, perm)	Dispatcher entry point for point shuffling.
`shuffle_points_point_cloud`(points, *, perm)	Permute point order.
`translate_3d`(inpt, *, offset)	Translate a tensor by a 3D offset.
`translate_3d_bounding_boxes`(boxes, *, ...)	Translate 3D bounding boxes by `offset`.
`translate_3d_camera_extrinsics`(extrinsics, ...)	Update camera extrinsics after translating the lidar frame.
`translate_3d_point_cloud`(points, *, offset)	Translate point cloud coordinates by `offset`.
`vertical_flip_bounding_boxes_3d`(inpt)	Flip `BoundingBoxes3D` to match a vertical image flip.
`vertical_flip_camera_extrinsics`(inpt)	Update `CameraExtrinsics` for a vertical image flip.
`vertical_flip_camera_intrinsics`(inpt)	Update `CameraIntrinsics` for a vertical flip of the corresponding image.
`vertical_flip_point_cloud_3d`(inpt)	Flip a `PointCloud3D` to match a vertical image flip.

vision3d.transforms.functional.accumulate_sweeps(sweeps, transforms, time_offsets)

Accumulate and time-stamp a set of lidar sweeps.

Each sweep is mapped into a common target frame by its own rigid transform, then all sweeps are concatenated into a single point cloud with a new trailing column holding the per-point time offset. This densifies a sparse lidar frame by folding in neighbouring sweeps while recording when each point was captured.

The transform is applied to the (x, y, z) coordinates only. Feature columns (e.g. intensity) pass through unchanged and the time offset is appended after them, so a sweep of shape [N, 3+C] contributes rows of shape [N, 3+C+1].

Parameters:

sweeps (Sequence[Tensor]) – Sweeps to aggregate, each a [N_i, 3+C] point cloud whose first three columns are (x, y, z) in that sweep’s own frame, followed by C feature columns. Every sweep must share the same number of feature columns.
transforms (Tensor) – Rigid [S, 4, 4] homogeneous transforms, one per sweep, mapping that sweep’s coordinates into the target frame.
time_offsets (Tensor) – [S] per-sweep time offsets (e.g. seconds relative to the target frame) broadcast into the appended column.

Returns:

Aggregated [sum(N_i), 3+C+1] point cloud in the target frame, with the time offset as the last column. Rows follow the order of sweeps.

Raises:

ValueError – If sweeps is empty, or if transforms or time_offsets do not have exactly one entry per sweep.

Return type:

Tensor
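
The aggregation described above can be sketched without any dependencies. This is an illustration on nested Python lists standing in for tensors, not the library's implementation; the helper name `accumulate_sweeps_sketch` and the sample values are hypothetical.

```python
def accumulate_sweeps_sketch(sweeps, transforms, time_offsets):
    """Illustrative re-implementation on nested lists (hypothetical helper)."""
    if not sweeps:
        raise ValueError("sweeps must not be empty")
    if len(transforms) != len(sweeps) or len(time_offsets) != len(sweeps):
        raise ValueError("need exactly one transform and one time offset per sweep")

    out = []
    for sweep, T, dt in zip(sweeps, transforms, time_offsets):
        for row in sweep:
            x, y, z = row[:3]
            # Apply the rigid 4x4 transform to (x, y, z, 1); only xyz changes.
            xyz = [T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3)]
            # Feature columns pass through; the time offset becomes the last column.
            out.append(xyz + list(row[3:]) + [dt])
    return out


identity = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
# Second sweep's frame sits 1 unit along +x of the target frame.
shift_x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

sweeps = [
    [[0.0, 0.0, 0.0, 0.5], [1.0, 2.0, 3.0, 0.7]],  # N_0 = 2, one feature column
    [[4.0, 5.0, 6.0, 0.9]],                        # N_1 = 1
]
merged = accumulate_sweeps_sketch(sweeps, [identity, shift_x], [0.0, -0.1])
```

Here `merged` has `2 + 1 = 3` rows of width `3 + 1 + 1 = 5`, ordered sweep by sweep.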

vision3d.transforms.functional.center_crop_camera_intrinsics(inpt, output_size)

Update CameraIntrinsics for a center crop of the corresponding image.

Parameters:

inpt (CameraIntrinsics) – The intrinsics to update.
output_size (list[int]) – Target (h, w) after the center crop.

Returns:

Updated intrinsics with image_size set to output_size.

Return type:

CameraIntrinsics

vision3d.transforms.functional.crop_camera_intrinsics(inpt, top, left, height, width)

Update CameraIntrinsics for a crop of the corresponding image.

Shifts the principal point so projection through the updated intrinsics matches projection through the original intrinsics on the cropped image.

Parameters:

inpt (CameraIntrinsics) – The intrinsics to update.
top (int) – Top edge of the crop in pixels.
left (int) – Left edge of the crop in pixels.
height (int) – Crop height in pixels.
width (int) – Crop width in pixels.

Returns:

Updated intrinsics with image_size set to (height, width).

Return type:

CameraIntrinsics
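
A small numeric check of the principal-point shift, assuming a pinhole matrix laid out as `[[fx, s, cx], [0, fy, cy], [0, 0, 1]]`. The helpers `project` and `crop_intrinsics_sketch` are hypothetical and work on a plain 3x3 list rather than a `CameraIntrinsics` object.

```python
def project(K, point):
    """Project a camera-frame point (X, Y, Z) to pixel coordinates (u, v)."""
    X, Y, Z = point
    u = (K[0][0] * X + K[0][1] * Y) / Z + K[0][2]
    v = (K[1][1] * Y) / Z + K[1][2]
    return u, v


def crop_intrinsics_sketch(K, top, left):
    """Shift the principal point by the crop origin (hypothetical helper)."""
    K_new = [row[:] for row in K]
    K_new[0][2] -= left  # cx' = cx - left
    K_new[1][2] -= top   # cy' = cy - top
    return K_new


K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
point = (1.0, -0.5, 10.0)

u, v = project(K, point)                  # pixel in the full image
K_crop = crop_intrinsics_sketch(K, top=100, left=200)
u_crop, v_crop = project(K_crop, point)   # pixel in the cropped image
```

The projected pixel moves by exactly `(left, top)`, which is where the same scene point lands after cropping.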

vision3d.transforms.functional.flip_3d(inpt, *, axis)

Flip a tensor along a 3D spatial axis.

This is the dispatcher entry point. Type-specific kernels are registered below.

Parameters:

inpt (Tensor) – Input tensor.
axis (str) – One of "x", "y", "z".

Returns:

Flipped tensor.

Return type:

Tensor

vision3d.transforms.functional.flip_3d_bounding_boxes(boxes, *, format, axis)

Flip 3D bounding boxes along axis.

Parameters:

boxes (Tensor) – Bounding box tensor [..., K].
format (BoundingBox3DFormat) – Format of the boxes.
axis (str) – One of "x", "y", "z".

Returns:

Flipped bounding boxes with the same shape.

Return type:

Tensor

vision3d.transforms.functional.flip_3d_point_cloud(points, *, axis)

Flip point cloud coordinates along axis.

Parameters:

points (Tensor) – Point cloud tensor [..., 3+C].
axis (str) – One of "x", "y", "z".

Returns:

Flipped point cloud with the same shape.

Return type:

Tensor

vision3d.transforms.functional.horizontal_flip_bounding_boxes_3d(inpt)

Flip BoundingBoxes3D to match a horizontal image flip.

Reflects the source frame’s Y axis following the fixed world-axis convention for a horizontal flip. The paired extrinsics kernel applies the matching camera-frame reflection, so projection stays consistent for any camera pose.

Parameters:

inpt (BoundingBoxes3D) – The boxes to flip.

Returns:

The flipped boxes with the same format.

Return type:

BoundingBoxes3D

vision3d.transforms.functional.horizontal_flip_camera_extrinsics(inpt)

Update CameraExtrinsics for a horizontal image flip.

Reflects the source frame about its Y axis (paired with a camera-frame X reflection) so the source-to-camera mapping stays consistent with the horizontally flipped image.

Parameters:

inpt (CameraExtrinsics) – The extrinsics to update.

Returns:

Updated extrinsics with the same shape.

Return type:

CameraExtrinsics
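
One way to read the pairing of the two reflections is `E' = F_cam @ E @ F_src`, with `F_src = diag(1, -1, 1, 1)` and `F_cam = diag(-1, 1, 1, 1)`. That formula is an interpretation of the description above, not something confirmed by the source. The sketch below uses plain 4x4 lists, and the sample extrinsic `E` is made up for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def apply(M, p):
    """Apply a 4x4 matrix to a homogeneous point [x, y, z, 1]."""
    return [sum(m * c for m, c in zip(row, p)) for row in M]


def diag(values):
    """Build a diagonal matrix from a list of values."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


F_src = diag([1.0, -1.0, 1.0, 1.0])  # reflect the source frame's Y coordinate
F_cam = diag([-1.0, 1.0, 1.0, 1.0])  # reflect the camera frame's X coordinate

# Made-up source-to-camera extrinsic (camera x = -src y, y = -src z, z = src x).
E = [[0.0, -1.0, 0.0, 0.1], [0.0, 0.0, -1.0, 0.2], [1.0, 0.0, 0.0, 0.3], [0.0, 0.0, 0.0, 1.0]]
E_new = matmul(matmul(F_cam, E), F_src)

p = [5.0, 2.0, 1.0, 1.0]
p_flipped = apply(F_src, p)          # the point after the paired source-frame flip
cam_before = apply(E, p)
cam_after = apply(E_new, p_flipped)  # equals cam_before with X negated
```

Under this reading, a flipped source point always lands at the mirror image of its original camera-frame position, whatever the camera pose.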

vision3d.transforms.functional.horizontal_flip_camera_intrinsics(inpt)

Update CameraIntrinsics for a horizontal flip of the corresponding image.

Mirrors the principal point about the image’s vertical center line and negates the skew so projection through the updated intrinsics matches projection through the original intrinsics on the flipped image.

Parameters:

inpt (CameraIntrinsics) – The intrinsics to update.

Returns:

Updated intrinsics with the same image_size.

Return type:

CameraIntrinsics
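
The mirroring and skew negation can be checked numerically. This sketch assumes a pinhole matrix `[[fx, s, cx], [0, fy, cy], [0, 0, 1]]` and the continuous pixel convention `u' = W - u`; the library may instead use `W - 1 - u`, which is not stated here. The helper names are hypothetical.

```python
def project(K, point):
    """Project a camera-frame point (X, Y, Z) to pixel coordinates (u, v)."""
    X, Y, Z = point
    u = (K[0][0] * X + K[0][1] * Y) / Z + K[0][2]
    v = (K[1][1] * Y) / Z + K[1][2]
    return u, v


def hflip_intrinsics_sketch(K, width):
    """Mirror cx about the vertical center line and negate the skew."""
    K_new = [row[:] for row in K]
    K_new[0][1] = -K[0][1]         # s' = -s
    K_new[0][2] = width - K[0][2]  # cx' = W - cx (assumed convention)
    return K_new


K = [[1000.0, 2.0, 600.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
width = 1280
X, Y, Z = 1.0, -0.5, 10.0

u, v = project(K, (X, Y, Z))
# The paired extrinsics update reflects camera-frame X, hence (-X, Y, Z).
u_flip, v_flip = project(hflip_intrinsics_sketch(K, width), (-X, Y, Z))
```

The flipped projection lands at `(W - u, v)`, the mirrored pixel, with the focal lengths left unchanged.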

vision3d.transforms.functional.horizontal_flip_point_cloud_3d(inpt)

Flip a PointCloud3D to match a horizontal image flip.

Parameters:

inpt (PointCloud3D) – The point cloud to flip.

Returns:

The flipped point cloud.

Return type:

PointCloud3D

vision3d.transforms.functional.jitter_points(inpt, *, noise)

Dispatcher entry point for point jittering.

Parameters:

inpt (Tensor) – Input tensor.
noise (Tensor) – Additive noise.

Returns:

Input unchanged (passthrough for non-point types).

Return type:

Tensor

vision3d.transforms.functional.jitter_points_point_cloud(points, *, noise)

Add noise to point xyz coordinates.

Parameters:

points (Tensor) – Point cloud [N, 3+C].
noise (Tensor) – Additive noise [N, 3].

Returns:

Jittered point cloud with the same shape. Non-xyz features are unchanged.

Return type:

Tensor

vision3d.transforms.functional.pad_camera_intrinsics(inpt, padding, **kwargs)

Update CameraIntrinsics for a pad of the corresponding image.

Shifts the principal point by the top-left pad and grows image_size to include the padded borders.

Parameters:

inpt (CameraIntrinsics) – The intrinsics to update.
padding (int | list[int]) – Padding spec as accepted by torchvision.transforms.v2.functional.pad().
kwargs (Any) – Unused; accepted for signature compatibility with torchvision.transforms.v2.functional.pad().

Returns:

Updated intrinsics with the padded image_size.

Return type:

CameraIntrinsics
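
A sketch of the update, assuming the padding spec follows the torchvision convention: an int pads all four sides, two values mean (left/right, top/bottom), and four values mean (left, top, right, bottom). The helper names are hypothetical, and the intrinsics are a plain 3x3 list `[[fx, s, cx], [0, fy, cy], [0, 0, 1]]` with a separate `(h, w)` tuple.

```python
def parse_padding(padding):
    """Normalize a torchvision-style padding spec to (left, top, right, bottom)."""
    if isinstance(padding, int):
        return padding, padding, padding, padding
    if len(padding) == 1:
        return padding[0], padding[0], padding[0], padding[0]
    if len(padding) == 2:
        return padding[0], padding[1], padding[0], padding[1]
    if len(padding) == 4:
        return tuple(padding)
    raise ValueError("padding must be an int or a sequence of length 1, 2, or 4")


def pad_intrinsics_sketch(K, image_size, padding):
    """Shift the principal point by the top-left pad and grow the image size."""
    left, top, right, bottom = parse_padding(padding)
    h, w = image_size
    K_new = [row[:] for row in K]
    K_new[0][2] += left  # cx' = cx + left
    K_new[1][2] += top   # cy' = cy + top
    return K_new, (h + top + bottom, w + left + right)


K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
K_pad, size_pad = pad_intrinsics_sketch(K, (720, 1280), [10, 20, 30, 40])
```

Only the left and top pads move the principal point; the right and bottom pads affect the image size alone.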

vision3d.transforms.functional.register_kernel(functional, tv_tensor_cls, *, tv_tensor_wrapper=True)

Register a kernel for a functional and TVTensor type.

Parameters:

functional (Callable[[...], Any]) – The functional to register a kernel for.
tv_tensor_cls (type[TVTensor]) – The TVTensor subclass this kernel handles.
tv_tensor_wrapper (bool) – If True (default), the kernel receives an unwrapped pure tensor and its output is automatically re-wrapped. If False, the kernel receives the full TVTensor and must handle re-wrapping itself.

Returns:

Decorator that registers the kernel.

Return type:

Callable[[Callable[[...], Any]], Callable[[...], Any]]

vision3d.transforms.functional.resize_camera_intrinsics(inpt, size, max_size=None, **kwargs)

Update CameraIntrinsics for a resize of the corresponding image.

Scales the focal lengths, skew, and principal point so projection through the updated intrinsics matches projection through the original intrinsics on the resized image.

Parameters:

inpt (CameraIntrinsics) – The intrinsics to update.
size (list[int] | None) – Target (h, w) after resize.
max_size (int | None) – Optional cap on the longer edge.
kwargs (Any) – Unused; accepted for signature compatibility with torchvision.transforms.v2.functional.resize().

Returns:

Updated intrinsics with the new image_size.

Return type:

CameraIntrinsics
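
The scaling can be written out for a pinhole matrix `[[fx, s, cx], [0, fy, cy], [0, 0, 1]]`: the first row scales with the width ratio and the second with the height ratio. This is a sketch under the assumption that pixel coordinates scale as `u' = sx * u`, `v' = sy * v`; it ignores `max_size`, and the helper name is hypothetical.

```python
def resize_intrinsics_sketch(K, old_size, new_size):
    """Scale focal lengths, skew, and principal point for a resize."""
    old_h, old_w = old_size
    new_h, new_w = new_size
    sx, sy = new_w / old_w, new_h / old_h
    return [
        [K[0][0] * sx, K[0][1] * sx, K[0][2] * sx],  # fx, s, cx follow the width
        [0.0, K[1][1] * sy, K[1][2] * sy],           # fy, cy follow the height
        [0.0, 0.0, 1.0],
    ]


K = [[1000.0, 2.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
K_half = resize_intrinsics_sketch(K, old_size=(720, 1280), new_size=(360, 640))
```

Halving both edges halves every entry in the first two rows, so projected pixels are halved as well.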

vision3d.transforms.functional.resized_crop_camera_intrinsics(inpt, top, left, height, width, size, **kwargs)

Update CameraIntrinsics for a crop followed by a resize.

Parameters:

inpt (CameraIntrinsics) – The intrinsics to update.
top (int) – Top edge of the crop in pixels.
left (int) – Left edge of the crop in pixels.
height (int) – Crop height in pixels.
width (int) – Crop width in pixels.
size (list[int]) – Target (h, w) after the resize.
kwargs (Any) – Unused; accepted for signature compatibility with torchvision.transforms.v2.functional.resized_crop().

Returns:

Updated intrinsics with image_size set to size.

Return type:

CameraIntrinsics

vision3d.transforms.functional.rotate_3d(inpt, *, rotation_matrix)

Rotate a tensor by a 3x3 rotation matrix.

Dispatcher entry point. Type-specific kernels are registered below.

Parameters:

inpt (Tensor) – Input tensor.
rotation_matrix (Tensor) – [3, 3] rotation matrix.

Returns:

Rotated tensor.

Return type:

Tensor

vision3d.transforms.functional.rotate_3d_bounding_boxes(boxes, *, format, rotation_matrix)

Rotate 3D bounding boxes by rotation_matrix.

Only rotated formats are supported:

XYZLWHY: only Z-axis rotations (pure yaw).
XYZLWHYPR: arbitrary rotations.

Axis-aligned formats (XYZXYZ, XYZLWH) cannot represent rotation and will raise NotImplementedError.

Parameters:

boxes (Tensor) – Bounding box tensor [..., K].
format (BoundingBox3DFormat) – Format of the boxes.
rotation_matrix (Tensor) – [3, 3] rotation matrix.

Returns:

Rotated bounding boxes with the same shape.

Raises:

NotImplementedError – If format is axis-aligned.
ValueError – If format is XYZLWHY and rotation is not pure yaw.

Return type:

Tensor
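
The yaw-only case can be sketched as follows, assuming that XYZLWHY stores `(x, y, z, l, w, h, yaw)` with yaw measured about the Z axis (inferred from the format name, not confirmed here). The helper is hypothetical, handles a single box as a plain list, and does not wrap the resulting yaw into any particular range.

```python
import math


def rotate_yaw_box_sketch(box, R, tol=1e-6):
    """Rotate one (x, y, z, l, w, h, yaw) box by a pure-yaw rotation matrix."""
    is_pure_yaw = abs(R[2][2] - 1.0) < tol and all(
        abs(R[i][2]) < tol and abs(R[2][i]) < tol for i in range(2)
    )
    if not is_pure_yaw:
        raise ValueError("a yaw-only box format supports only rotations about Z")
    x, y, z, l, w, h, yaw = box
    angle = math.atan2(R[1][0], R[0][0])  # rotation angle about Z
    x_new = R[0][0] * x + R[0][1] * y
    y_new = R[1][0] * x + R[1][1] * y
    # The center rotates in the XY plane; dimensions stay; yaw advances by the angle.
    return [x_new, y_new, z, l, w, h, yaw + angle]


theta = math.pi / 2
R = [
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
]
rotated = rotate_yaw_box_sketch([2.0, 0.0, 1.0, 4.0, 2.0, 1.5, 0.0], R)
```

A quarter turn moves the center from the +x axis to the +y axis and adds `pi / 2` to the heading. A rotation with any pitch or roll component cannot be expressed by a single yaw angle, which is why the sketch rejects it.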

vision3d.transforms.functional.rotate_3d_camera_extrinsics(extrinsics, *, rotation_matrix)

Update camera extrinsics after rotating the lidar frame.

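The lidar-to-camera extrinsic E satisfies p_cam = E @ p_lidar. Rotating the lidar frame by R maps points to p' = R @ p, so the updated extrinsic is E' = E @ R_inv, where R_inv is the inverse (transpose) of R embedded in a 4x4 homogeneous matrix. This keeps p_cam = E' @ p'.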

Parameters:

extrinsics (Tensor) – Extrinsic matrices [..., 4, 4].
rotation_matrix (Tensor) – [3, 3] rotation matrix.

Returns:

Updated extrinsics with the same shape.

Return type:

Tensor
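
The identity `p_cam = E' @ p'` can be verified numerically. The sketch below uses plain nested lists, embeds the transpose of `R` in a 4x4 homogeneous matrix, and uses a made-up extrinsic `E`; the helper names are hypothetical.

```python
import math


def matmul(A, B):
    """Multiply two matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def apply(M, p):
    """Apply a 4x4 matrix to a homogeneous point [x, y, z, 1]."""
    return [sum(m * c for m, c in zip(row, p)) for row in M]


def rotate_extrinsics_sketch(E, R):
    """Return E @ R_inv with R_inv = R^T embedded in a 4x4 matrix."""
    R_inv_h = [[R[j][i] for j in range(3)] + [0.0] for i in range(3)]
    R_inv_h.append([0.0, 0.0, 0.0, 1.0])
    return matmul(E, R_inv_h)


theta = math.radians(30.0)
R = [
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
]
# Made-up lidar-to-camera extrinsic.
E = [[0.0, -1.0, 0.0, 0.1], [0.0, 0.0, -1.0, 0.2], [1.0, 0.0, 0.0, 0.3], [0.0, 0.0, 0.0, 1.0]]

p = [5.0, 2.0, 1.0, 1.0]
p_rotated = [sum(R[i][j] * p[j] for j in range(3)) for i in range(3)] + [1.0]

cam_before = apply(E, p)
cam_after = apply(rotate_extrinsics_sketch(E, R), p_rotated)
```

Both projections give the same camera-frame point, so images stay aligned with the rotated point cloud.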

vision3d.transforms.functional.rotate_3d_point_cloud(points, *, rotation_matrix)

Rotate point cloud coordinates by rotation_matrix.

Parameters:

points (Tensor) – Point cloud tensor [..., 3+C].
rotation_matrix (Tensor) – [3, 3] rotation matrix.

Returns:

Rotated point cloud with the same shape.

Return type:

Tensor

vision3d.transforms.functional.sample_points(inpt, *, indices)

Dispatcher entry point for point sampling.

Parameters:

inpt (Tensor) – Input tensor.
indices (Tensor) – Selection indices.

Returns:

Input unchanged (passthrough for non-point types).

Return type:

Tensor

vision3d.transforms.functional.sample_points_point_cloud(points, *, indices)

Select points by index.

Parameters:

points (Tensor) – Point cloud [N, 3+C].
indices (Tensor) – Selection indices [M]. May contain repeats for oversampling.

Returns:

Selected point cloud [M, 3+C].

Return type:

Tensor
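
Index-based selection with repeats amounts to a row gather. A minimal sketch on nested lists, with a hypothetical helper name, shows how oversampling yields `M > N` rows:

```python
def sample_points_sketch(points, indices):
    """Gather rows of an [N, 3+C] point list by index (hypothetical helper)."""
    return [list(points[i]) for i in indices]


points = [
    [0.0, 0.0, 0.0, 0.1],
    [1.0, 0.0, 0.0, 0.2],
    [0.0, 1.0, 0.0, 0.3],
]
# M = 5 indices into N = 3 points; index 2 is drawn three times.
oversampled = sample_points_sketch(points, [2, 0, 2, 1, 2])
```

Each output row keeps all `3+C` columns of the point it was drawn from, so features stay attached to their coordinates.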

vision3d.transforms.functional.scale_3d(inpt, *, factor)

Scale a tensor by a uniform factor.

Dispatcher entry point. Type-specific kernels are registered below.

Parameters:

inpt (Tensor) – Input tensor.
factor (float) – Scale factor.

Returns:

Scaled tensor.

Return type:

Tensor

vision3d.transforms.functional.scale_3d_bounding_boxes(boxes, *, format, factor)

Scale 3D bounding boxes by factor.

Scales both position and dimensions. Rotation angles are unchanged.

Parameters:

boxes (Tensor) – Bounding box tensor [..., K].
format (BoundingBox3DFormat) – Format of the boxes.
factor (float) – Scale factor.

Returns:

Scaled bounding boxes with the same shape.

Return type:

Tensor
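
For a yaw-parameterized box, the rule reads as follows. The sketch assumes an `(x, y, z, l, w, h, yaw)` layout, inferred from the XYZLWHY format name, and uses a hypothetical helper on a plain list.

```python
def scale_box_sketch(box, factor):
    """Scale an (x, y, z, l, w, h, yaw) box; yaw is left untouched."""
    x, y, z, l, w, h, yaw = box
    return [x * factor, y * factor, z * factor, l * factor, w * factor, h * factor, yaw]


scaled = scale_box_sketch([2.0, -1.0, 0.5, 4.0, 2.0, 1.5, 0.3], factor=2.0)
```

Scaling the center together with the dimensions keeps each box covering the same points after the point cloud is scaled by the same factor.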

vision3d.transforms.functional.scale_3d_camera_extrinsics(extrinsics, *, factor)

Update camera extrinsics after scaling the lidar frame.

Parameters:

extrinsics (Tensor) – Extrinsic matrices [..., 4, 4].
factor (float) – Scale factor applied to the lidar frame.

Returns:

Updated extrinsics with the same shape.

Return type:

Tensor

vision3d.transforms.functional.scale_3d_point_cloud(points, *, factor)

Scale point cloud coordinates by factor.

Parameters:

points (Tensor) – Point cloud tensor [..., 3+C].
factor (float) – Scale factor.

Returns:

Scaled point cloud with the same shape.

Return type:

Tensor

vision3d.transforms.functional.shuffle_points(inpt, *, perm)

Dispatcher entry point for point shuffling.

Parameters:

inpt (Tensor) – Input tensor.
perm (Tensor) – Permutation indices.

Returns:

Input unchanged (passthrough for non-point types).

Return type:

Tensor

vision3d.transforms.functional.shuffle_points_point_cloud(points, *, perm)

Permute point order.

Parameters:

points (Tensor) – Point cloud [N, 3+C].
perm (Tensor) – Permutation indices [N].

Returns:

Permuted point cloud with the same shape.

Return type:

Tensor

vision3d.transforms.functional.translate_3d(inpt, *, offset)

Translate a tensor by a 3D offset.

Dispatcher entry point. Type-specific kernels are registered below.

Parameters:

inpt (Tensor) – Input tensor.
offset (Tensor) – Translation [3] as (tx, ty, tz).

Returns:

Translated tensor.

Return type:

Tensor

vision3d.transforms.functional.translate_3d_bounding_boxes(boxes, *, format, offset)

Translate 3D bounding boxes by offset.

Parameters:

boxes (Tensor) – Bounding box tensor [..., K].
format (BoundingBox3DFormat) – Format of the boxes.
offset (Tensor) – Translation [3] as (tx, ty, tz).

Returns:

Translated bounding boxes with the same shape.

Return type:

Tensor

vision3d.transforms.functional.translate_3d_camera_extrinsics(extrinsics, *, offset)

Update camera extrinsics after translating the lidar frame.

The lidar-to-camera extrinsic translation changes because the lidar origin moved by offset in the lidar frame.

Parameters:

extrinsics (Tensor) – Extrinsic matrices [..., 4, 4].
offset (Tensor) – Translation [3] as (tx, ty, tz) in lidar frame.

Returns:

Updated extrinsics with the same shape.

Return type:

Tensor
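
If lidar points move as `p' = p + offset`, keeping `p_cam = E' @ p'` requires `E' = E @ T(-offset)`: the rotation block is unchanged and the translation becomes `t - R @ offset`. This derivation is an assumption about what the kernel computes, not taken from its source. The sketch uses nested lists, a hypothetical helper name, and a made-up extrinsic.

```python
def apply(M, p):
    """Apply a 4x4 matrix to a homogeneous point [x, y, z, 1]."""
    return [sum(m * c for m, c in zip(row, p)) for row in M]


def translate_extrinsics_sketch(E, offset):
    """Return E @ T(-offset): same rotation block, translation t - R @ offset."""
    E_new = [row[:] for row in E]
    for i in range(3):
        E_new[i][3] = E[i][3] - sum(E[i][j] * offset[j] for j in range(3))
    return E_new


# Made-up lidar-to-camera extrinsic.
E = [[0.0, -1.0, 0.0, 0.1], [0.0, 0.0, -1.0, 0.2], [1.0, 0.0, 0.0, 0.3], [0.0, 0.0, 0.0, 1.0]]
offset = [1.0, -2.0, 0.5]

p = [5.0, 2.0, 1.0, 1.0]
p_moved = [p[0] + offset[0], p[1] + offset[1], p[2] + offset[2], 1.0]

cam_before = apply(E, p)
cam_after = apply(translate_extrinsics_sketch(E, offset), p_moved)
```

Both calls produce the same camera-frame point, which is the consistency condition the update is meant to preserve.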

vision3d.transforms.functional.translate_3d_point_cloud(points, *, offset)

Translate point cloud coordinates by offset.

Parameters:

points (Tensor) – Point cloud tensor [..., 3+C].
offset (Tensor) – Translation [3] as (tx, ty, tz).

Returns:

Translated point cloud with the same shape.

Return type:

Tensor

vision3d.transforms.functional.vertical_flip_bounding_boxes_3d(inpt)

Flip BoundingBoxes3D to match a vertical image flip.

Reflects the source frame’s Z axis following the fixed world-axis convention for a vertical flip. The paired extrinsics kernel applies the matching camera-frame reflection, so projection stays consistent for any camera pose.

Parameters:

inpt (BoundingBoxes3D) – The boxes to flip.

Returns:

The flipped boxes with the same format.

Return type:

BoundingBoxes3D

vision3d.transforms.functional.vertical_flip_camera_extrinsics(inpt)

Update CameraExtrinsics for a vertical image flip.

Reflects the source frame about its Z axis (paired with a camera-frame Y reflection) so the source-to-camera mapping stays consistent with the vertically flipped image.

Parameters:

inpt (CameraExtrinsics) – The extrinsics to update.

Returns:

Updated extrinsics with the same shape.

Return type:

CameraExtrinsics

vision3d.transforms.functional.vertical_flip_camera_intrinsics(inpt)

Update CameraIntrinsics for a vertical flip of the corresponding image.

Mirrors the principal point about the image’s horizontal center line and negates the skew so projection through the updated intrinsics matches projection through the original intrinsics on the flipped image.

Parameters:

inpt (CameraIntrinsics) – The intrinsics to update.

Returns:

Updated intrinsics with the same image_size.

Return type:

CameraIntrinsics

vision3d.transforms.functional.vertical_flip_point_cloud_3d(inpt)

Flip a PointCloud3D to match a vertical image flip.

Parameters:

inpt (PointCloud3D) – The point cloud to flip.

Returns:

The flipped point cloud.

Return type:

PointCloud3D