matx.vision package¶

class matx.vision.AutoContrastOp(device: Any)[source]¶

Bases: object

Apply auto contrast on input images, i.e. remap each image so that the darkest pixel becomes black (0) and the lightest becomes white (255).

__call__(images: List[NDArray], sync: int = 0) → List[NDArray][source]¶

Apply auto contrast on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import AutoContrastOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> op = AutoContrastOp(device)
>>> ret = op(nds)

__init__(device: Any) → None[source]¶

Initialize AutoContrastOp

Parameters:

device (Any) – the matx device used for the operation
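
The remapping described for AutoContrastOp can be illustrated without matx. The following is a minimal pure-Python sketch, not the library's implementation: `auto_contrast` is a hypothetical helper that works on a flat list of pixel values, and whether the real op stretches each channel separately or the whole image at once is not stated here.

```python
def auto_contrast(values):
    """Stretch values linearly so the minimum maps to 0 and the maximum to 255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat image has no range to stretch; return it unchanged (assumption).
        return list(values)
    return [round((v - lo) * 255 / (hi - lo)) for v in values]


stretched = auto_contrast([20, 40, 60, 105])
print(stretched)  # [0, 60, 120, 255]
```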

class matx.vision.AverageBlurOp(device: Any, pad_type: str = 'BORDER_DEFAULT')[source]¶

Bases: object

Apply average blur on input images.

__call__(images: List[NDArray], ksizes: List[Tuple[int, int]], anchors: List[Tuple[int, int]] = [], sync: int = 0) → List[NDArray][source]¶

Apply average blur on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
ksizes (List[Tuple[int, int]]) – conv kernel size for each image, each item in this list should be a 2 element tuple (x, y).
anchors (List[Tuple[int, int]], optional) – anchors of each kernel, each item in this list should be a 2 element tuple (x, y). If not given, -1 is used by default, which places the anchor at the kernel center.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import AverageBlurOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> ksizes = [(3, 3), (3, 5), (5, 5)]

>>> op = AverageBlurOp(device)
>>> ret = op(nds, ksizes)

__init__(device: Any, pad_type: str = 'BORDER_DEFAULT') → None[source]¶

Initialize AverageBlurOp

Parameters:

device (Any) – the matx device used for the operation
pad_type (str, optional) – pixel extrapolation method. If pad_type is BORDER_CONSTANT, 0 is used as the border value.
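
An average blur is a convolution with a kernel whose entries are all equal and sum to 1. The sketch below builds such a kernel for a given `ksize`. It is an illustration only: `box_kernel` and `center_anchor` are hypothetical helpers, and treating `(x, y)` as (width, height) with the center at `size // 2` is an assumption borrowed from common image-library conventions.

```python
def box_kernel(ksize):
    """Return a ky-by-kx kernel whose entries are all 1 / (kx * ky)."""
    kx, ky = ksize  # assumed order: x = kernel width, y = kernel height
    weight = 1.0 / (kx * ky)
    return [[weight] * kx for _ in range(ky)]


def center_anchor(ksize):
    """Anchor assumed to be implied by the default value -1: the kernel center."""
    kx, ky = ksize
    return (kx // 2, ky // 2)


kernel = box_kernel((3, 5))
print(len(kernel), len(kernel[0]))  # 5 rows, 3 columns
print(center_anchor((3, 5)))        # (1, 2)
```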

class matx.vision.CastOp(device: Any)[source]¶

Bases: object

Cast image data type to target type, e.g. uint8 to float32

__call__(images: NDArray, dtype: str, alpha: float = 1.0, beta: float = 0.0, sync: int = 0) → NDArray[source]¶

Cast image data type to target type. Could apply factor scale and shift at the same time.

Parameters:

images (matx.runtime.NDArray) – target images.
dtype (str) – target data type to convert to, e.g. uint8, float32, etc.
alpha (float, optional) – scale factor applied when casting the data type, e.g. when casting an image from uint8 to float32, set alpha to 1.0/255 to change the value range from [0, 255] to [0, 1].
beta (float, optional) – shift value applied when casting the data type.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

matx.runtime.NDArray

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import CastOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> dtype = "float32"
>>> alpha = 1.0 / 255
>>> beta = 0.0

>>> op = CastOp(device)
>>> ret = op(nds, dtype, alpha, beta)

__init__(device: Any) → None[source]¶

Initialize CastOp

Parameters:

device (Any) – the matx device used for the operation
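
The effect of alpha and beta can be sketched in plain Python. This assumes the cast computes `v * alpha + beta` per value (scale first, then shift), which is the usual convention but is not confirmed by the text above; `cast_to_float` is a hypothetical helper.

```python
def cast_to_float(values, alpha=1.0, beta=0.0):
    """Convert values to float, applying the assumed formula v * alpha + beta."""
    return [float(v) * alpha + beta for v in values]


# Map uint8-style values from [0, 255] to roughly [0, 1].
scaled = cast_to_float([0, 51, 255], alpha=1.0 / 255)
print(scaled)
```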

class matx.vision.CenterCropOp(device: Any, sizes: Tuple[int, int])[source]¶

Bases: object

Center crop the given images

__call__(images: List[NDArray], sync: int = 0) → List[NDArray][source]¶

CenterCrop images

Parameters:

images (List[matx.runtime.NDArray]) – input images.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

center-cropped images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import CenterCropOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> op = CenterCropOp(device=device,
...                   sizes=(224, 224))
>>> ret = op(nds)

__init__(device: Any, sizes: Tuple[int, int]) → None[source]¶

Initialize CenterCropOp

Parameters:

device (Any) – the matx device used for the operation.
sizes (Tuple[int, int]) – output size for all images, must be a 2 element tuple.
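
A center crop reduces to choosing the top-left corner so that the crop window is centered. The sketch below shows that arithmetic only. `center_crop_box` is a hypothetical helper, the (width, height) ordering of both tuples is an assumption (the ordering of `sizes` is not stated above), and flooring odd remainders is also an assumption.

```python
def center_crop_box(image_size, crop_size):
    """Return (x, y, width, height) of a crop window centered in the image."""
    width, height = image_size      # assumed order: (width, height)
    crop_w, crop_h = crop_size      # assumed order: (width, height)
    x = (width - crop_w) // 2       # left edge of the window
    y = (height - crop_h) // 2      # top edge of the window
    return x, y, crop_w, crop_h


print(center_crop_box((640, 480), (224, 224)))  # (208, 128, 224, 224)
```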

class matx.vision.ChannelReorderOp(device: Any)[source]¶

Bases: object

Apply channel reorder on input images.

__call__(images: List[NDArray], orders: List[List[int]], sync: int = 0) → List[NDArray][source]¶

Apply channel reorder on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
orders (List[List[int]]) – index order of the new channels for each image, e.g. to change a BGR image to an RGB image, the order could be [2,1,0].
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import ChannelReorderOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> orders = [[2,1,0], [1,0,1], [2,2,2]]

>>> op = ChannelReorderOp(device)
>>> ret = op(nds, orders)

__init__(device: Any) → None[source]¶

Initialize ChannelReorderOp

Parameters:

device (Any) – the matx device used for the operation
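
What an order list does to a single pixel can be shown in a few lines of plain Python. `reorder_channels` is a hypothetical helper used only for illustration. It applies the three orders from the example above to one BGR pixel.

```python
def reorder_channels(pixel, order):
    """Build a new pixel whose i-th channel is pixel[order[i]]."""
    return [pixel[index] for index in order]


bgr_pixel = [10, 20, 30]  # blue, green, red
print(reorder_channels(bgr_pixel, [2, 1, 0]))  # [30, 20, 10], i.e. RGB
print(reorder_channels(bgr_pixel, [1, 0, 1]))  # [20, 10, 20]
print(reorder_channels(bgr_pixel, [2, 2, 2]))  # [30, 30, 30]
```

As the last two calls show, an order may repeat an index, so a channel can be duplicated or dropped.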

class matx.vision.ColorLinearAdjustOp(device: Any, prob: float = 1.1, per_channel: bool = False)[source]¶

Bases: object

Apply linear adjust on pixels of input images, i.e. apply a * v + b for each pixel v in image/channel.

__call__(images: List[NDArray], factors: List[float], shifts: List[float], sync: int = 0) → List[NDArray][source]¶

Apply linear adjust on pixels of input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
factors (List[float]) – factor for linear adjustment.
shifts (List[float]) – shift for linear adjustment.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images. The output values stay within the range of the original data type, e.g. [0, 255] for uint8.

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import ColorLinearAdjustOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> # create parameters for linear adjustment
>>> factors = [1.1, 1.2, 1.3]
>>> shifts = [-10, -20, -30]

>>> op = ColorLinearAdjustOp(device, per_channel=False)
>>> ret = op(nds, factors, shifts)

__init__(device: Any, prob: float = 1.1, per_channel: bool = False) → None[source]¶

Initialize ColorLinearAdjustOp

Parameters:

device (Any) – the matx device used for the operation
prob (float, optional) – probability of applying the linear adjustment to each image. The default of 1.1 is greater than 1, so the adjustment is applied to all images by default.
per_channel (bool, optional) – if False, all channels of a single image use the same linear parameters; if True, each channel can be given its own linear adjustment.
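
The linear adjustment `a * v + b`, followed by keeping the result inside the uint8 range, can be sketched as follows. `linear_adjust` is a hypothetical helper. Rounding to the nearest integer before clamping is an assumption, since the text above only says the output stays in the original data type range.

```python
def linear_adjust(values, factor, shift):
    """Apply factor * v + shift to each value and clamp the result to [0, 255]."""
    adjusted = []
    for v in values:
        result = round(factor * v + shift)        # rounding mode is an assumption
        adjusted.append(min(255, max(0, result)))  # stay inside the uint8 range
    return adjusted


print(linear_adjust([0, 100, 250], factor=1.2, shift=-20))  # [0, 100, 255]
```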

class matx.vision.Conv2dOp(device: Any, pad_type: str = 'BORDER_DEFAULT')[source]¶

Bases: object

Apply conv kernels on input images.

__call__(images: List[NDArray], kernels: List[List[List[float]]], anchors: List[Tuple[int, int]] = [], sync: int = 0) → List[NDArray][source]¶

Apply conv kernels on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
kernels (List[List[List[float]]]) – conv kernels for each image.
anchors (List[Tuple[int, int]], optional) – anchors of each kernel, each item in this list should be a 2 element tuple (x, y). If not given, -1 is used by default, which places the anchor at the kernel center.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import Conv2dOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> # create parameters for conv2d
>>> kernel = [[1.0/25] * 5 for _ in range(5)]
>>> kernels = [kernel, kernel, kernel]

>>> op = Conv2dOp(device)
>>> ret = op(nds, kernels)

__init__(device: Any, pad_type: str = 'BORDER_DEFAULT') → None[source]¶

Initialize Conv2dOp

Parameters:

device (Any) – the matx device used for the operation
pad_type (str, optional) – pixel extrapolation method. If pad_type is BORDER_CONSTANT, 0 is used as the border value.
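
To make the kernel application concrete, here is a small pure-Python sketch that slides a kernel over a single-channel image. It is a simplification under stated assumptions: `filter2d_valid` is a hypothetical helper, it computes a correlation (the kernel is not flipped), and it skips border padding entirely, so the output is smaller than the input, whereas the real op extrapolates border pixels according to pad_type.

```python
def filter2d_valid(image, kernel):
    """Correlate a 2-D image with a kernel, keeping only fully covered positions."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            acc = 0.0
            # Weighted sum of the window whose top-left corner is (row, col).
            for i in range(kernel_h):
                for j in range(kernel_w):
                    acc += image[row + i][col + j] * kernel[i][j]
            out_row.append(acc)
        output.append(out_row)
    return output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
box = [[1.0 / 9] * 3 for _ in range(3)]
result = filter2d_valid(image, box)
print(result)  # a single value close to 5.0, the mean of the 3x3 window
```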

class matx.vision.CropOp(device: Any)[source]¶

Bases: object

Crop images in batch on GPU with customized parameters.

__call__(images: List[NDArray], x: List[int], y: List[int], width: List[int], height: List[int], sync: int = 0) → List[NDArray][source]¶

Crop images

Parameters:

images (List[matx.runtime.NDArray]) – source/input image
x (List[int]) – the x coordinates of the top_left corner of the cropped region.
y (List[int]) – the y coordinates of the top_left corner of the cropped region.
width (List[int]) – desired width for each cropped image.
height (List[int]) – desired height for each cropped image.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

cropped images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import CropOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> x = [10, 20, 30]
>>> y = [50, 35, 20]
>>> widths = [224, 224, 224]
>>> heights = [224, 224, 224]
>>> op = CropOp(device=device)
>>> ret = op(nds, x, y, widths, heights)

__init__(device: Any) → None[source]¶

Initialize CropOp

Parameters:

device (Any) – the matx device used for the operation.
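
How the four crop parameters select a region can be illustrated on a nested list standing in for a single-channel image. `crop` is a hypothetical helper. It assumes rows are indexed by y and columns by x, with (x, y) as the top-left corner as described above.

```python
def crop(image, x, y, width, height):
    """Return the region with top-left corner (x, y) and the given width and height."""
    return [row[x:x + width] for row in image[y:y + height]]


# A 4x4 image whose value encodes its position as row * 10 + column.
image = [[row * 10 + col for col in range(4)] for row in range(4)]
print(crop(image, x=1, y=2, width=2, height=2))  # [[21, 22], [31, 32]]
```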

class matx.vision.CvtColorOp(device: Any, color_code: str)[source]¶

Bases: object

Color conversion for input images.

__call__(images: List[NDArray], sync: int = 0) → List[NDArray][source]¶

Apply color conversion to input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import CvtColorOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> color_code = matx.vision.COLOR_BGR2RGB

>>> op = CvtColorOp(device, color_code)
>>> ret = op(nds)

__init__(device: Any, color_code: str) → None[source]¶

Initialize CvtColorOp

Parameters:

device (Any) – the matx device used for the operation
color_code (str) – color conversion code, e.g. matx.vision.COLOR_BGR2RGB

class matx.vision.EdgeDetectOp(device: Any, alpha: float = 1.0, pad_type: str = 'BORDER_DEFAULT')[source]¶

Bases: object

Generate a black & white edge image and alpha-blend it with the input image. Edge detect kernel is [[0, 1, 0], [1, -4, 1], [0, 1, 0]].

__call__(images: List[NDArray], alpha: List[float] = [], sync: int = 0) → List[NDArray][source]¶

Generate an edge image and alpha-blend it with the input image.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
alpha (List[float], optional) – blending factor for each image. If omitted, the alpha set in op initialization is used for all images.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import EdgeDetectOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> # create parameters for edge detection
>>> alpha = [0.1, 0.5, 0.9]

>>> op = EdgeDetectOp(device)
>>> ret = op(nds, alpha)

__init__(device: Any, alpha: float = 1.0, pad_type: str = 'BORDER_DEFAULT') → None[source]¶

Initialize EdgeDetectOp

Parameters:

device (Any) – the matx device used for the operation
alpha (float, optional) – alpha-blend factor, 1.0 by default, which means only the edge image is kept.
pad_type (str, optional) – pixel extrapolation method. If pad_type is BORDER_CONSTANT, 0 is used as the border value.
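
The two steps of EdgeDetectOp, filtering with the edge kernel and alpha-blending, can be sketched for a single pixel. The blend formula `alpha * filtered + (1 - alpha) * original` is an assumption chosen to agree with the statement that alpha 1.0 keeps only the edge image. `edge_response` and `alpha_blend` are hypothetical helpers.

```python
EDGE_KERNEL = [[0, 1, 0],
               [1, -4, 1],
               [0, 1, 0]]


def edge_response(patch):
    """Weighted sum of a 3x3 patch with the edge-detect kernel."""
    return sum(patch[i][j] * EDGE_KERNEL[i][j] for i in range(3) for j in range(3))


def alpha_blend(original, filtered, alpha):
    """Assumed blend: alpha weights the filtered value, (1 - alpha) the original."""
    return alpha * filtered + (1.0 - alpha) * original


flat_patch = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]
print(edge_response(flat_patch))     # 0, a uniform region has no edges
print(alpha_blend(100, 20, 0.5))     # 60.0
print(alpha_blend(100, 20, 1.0))     # 20.0, only the filtered value remains
```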

class matx.vision.EmbossOp(device: Any, alpha: float = 1.0, strength: float = 0.0, pad_type: str = 'BORDER_DEFAULT')[source]¶

Bases: object

Emboss images and alpha-blend the result with the original input images. Emboss kernel is [[-1-s, -s, 0], [-s, 1, s], [0, s, 1+s]], emboss strength is controlled by s here.

__call__(images: List[NDArray], alpha: List[float] = [], strength: List[float] = [], sync: int = 0) → List[NDArray][source]¶

Emboss images and alpha-blend the result with the original input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
alpha (List[float], optional) – blending factor for each image. If omitted, the alpha set in op initialization would be used for all images.
strength (List[float], optional) – parameter that controls the strength of the emboss. If omitted, the strength set in op initialization would be used for all images.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import EmbossOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> # create parameters for emboss
>>> alpha = [0.1, 0.5, 0.9]
>>> strength = [0, 1, 2]

>>> op = EmbossOp(device)
>>> ret = op(nds, alpha, strength)

__init__(device: Any, alpha: float = 1.0, strength: float = 0.0, pad_type: str = 'BORDER_DEFAULT') → None[source]¶

Initialize EmbossOp

Parameters:

device (Any) – the matx device used for the operation
alpha (float, optional) – alpha-blend factor, 1.0 by default, which means only the embossed image is kept.
strength (float, optional) – strength of the emboss, 0.0 by default.
pad_type (str, optional) – pixel extrapolation method. If pad_type is BORDER_CONSTANT, 0 is used as the border value.
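
The emboss kernel given in the class description depends only on the strength s, so it is easy to build and inspect. `emboss_kernel` below is a hypothetical helper that transcribes that formula. It also shows that the entries sum to 1 for any s, which means a uniform region keeps its brightness after filtering.

```python
def emboss_kernel(strength):
    """Build [[-1-s, -s, 0], [-s, 1, s], [0, s, 1+s]] for strength s."""
    s = strength
    return [[-1 - s, -s, 0],
            [-s, 1, s],
            [0, s, 1 + s]]


for s in (0, 1, 2):
    kernel = emboss_kernel(s)
    total = sum(sum(row) for row in kernel)
    print(s, kernel, total)  # total is 1 for every strength
```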

class matx.vision.FlipOp(device: Any, flip_code: int = 1, prob: float = 1.1)[source]¶

Bases: object

Flip the given images along specified directions.

__call__(images: List[NDArray], flip_code: List[int] = [], sync: int = 0) → List[NDArray][source]¶

Flip images with specified directions.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
flip_code (List[int], optional) – flip type for each image in the batch. HORIZONTAL_FLIP – flip horizontally, VERTICAL_FLIP – flip vertically, DIAGONAL_FLIP – flip horizontally and vertically, FLIP_NOT_APPLY – keep the original. If omitted, the value set in the op initialization is used for all images.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import FlipOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of matx.runtime.NDArrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> flip_code = matx.vision.HORIZONTAL_FLIP

>>> op = FlipOp(device, flip_code)
>>> ret = op(nds)

__init__(device: Any, flip_code: int = 1, prob: float = 1.1) → None[source]¶

Initialize FlipOp

Parameters:

device (Any) – the matx device used for the operation.
flip_code (int, optional) – flip type. HORIZONTAL_FLIP – flip horizontally, VERTICAL_FLIP – flip vertically, DIAGONAL_FLIP – flip horizontally and vertically, FLIP_NOT_APPLY – keep the original. HORIZONTAL_FLIP by default. Can be overridden at runtime to set the flip type for each image in the batch.
prob (float, optional) – probability of flipping each image. By default, all images are flipped with the given flip code.
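
The four flip types can be illustrated on a nested list standing in for a single-channel image. The sketch uses plain strings for the modes rather than the matx.vision constants, because the integer values of those constants are not given above. `flip` is a hypothetical helper.

```python
def flip(image, mode):
    """Flip a 2-D image; mode is 'horizontal', 'vertical', 'diagonal' or 'none'."""
    if mode == "horizontal":
        return [row[::-1] for row in image]        # mirror left to right
    if mode == "vertical":
        return [row[:] for row in image[::-1]]     # mirror top to bottom
    if mode == "diagonal":
        return [row[::-1] for row in image[::-1]]  # both of the above
    return [row[:] for row in image]               # keep the original


image = [[1, 2],
         [3, 4]]
print(flip(image, "horizontal"))  # [[2, 1], [4, 3]]
print(flip(image, "vertical"))    # [[3, 4], [1, 2]]
print(flip(image, "diagonal"))    # [[4, 3], [2, 1]]
```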

class matx.vision.GammaContrastOp(device: Any, per_channel: bool = False)[source]¶

Bases: object

Apply gamma contrast on input images, i.e. for each pixel value v: 255*((v/255)**gamma)

__call__(images: List[NDArray], gammas: List[float], sync: int = 0) → List[NDArray][source]¶

Apply gamma contrast on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
gammas (List[float]) – gamma value for each image / channel. If per_channel is False, the list should have the same size as batch size. If per_channel is True, the list should contain channel * batch_size elements.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import GammaContrastOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> gammas = [0.5, 0.9, 1.2]

>>> op = GammaContrastOp(device)
>>> ret = op(nds, gammas)

__init__(device: Any, per_channel: bool = False) → None[source]¶

Initialize GammaContrastOp

Parameters:

device (Any) – the matx device used for the operation
per_channel (bool, optional) – For each pixel, whether to apply the gamma contrast with a different gamma value per channel (True), or with the same gamma value throughout the channels (False). False by default.
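
The gamma formula from the class description and the sizing rule for `gammas` can both be checked with a few lines of plain Python. `gamma_contrast` and `expected_gamma_count` are hypothetical helpers. The sketch leaves the result as floats and does not model how the real op rounds back to the image data type.

```python
def gamma_contrast(values, gamma):
    """Apply 255 * ((v / 255) ** gamma) to each value."""
    return [255.0 * (v / 255.0) ** gamma for v in values]


def expected_gamma_count(batch_size, channels, per_channel):
    """Number of entries the gammas list should hold, per the rule above."""
    return batch_size * channels if per_channel else batch_size


print(gamma_contrast([0, 64, 255], 0.5))   # gamma < 1 brightens mid-tones
print(gamma_contrast([0, 64, 255], 1.2))   # gamma > 1 darkens mid-tones
print(expected_gamma_count(3, 3, per_channel=True))  # 9
```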

class matx.vision.GaussNoiseOp(device: Any, batch_size: int, mu: float = 0.0, sigma: float = 1.0, per_channel: bool = False)[source]¶

Bases: object

Apply gaussian noise on input images.

__call__(images: List[NDArray], mus: List[float] = [], sigmas: List[float] = [], sync: int = 0) → List[NDArray][source]¶

Apply gaussian noise on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
mus (List[float], optional) – mu value for each image. If omitted, the mu value set during the op initialization would be used for all images.
sigmas (List[float], optional) – sigma value for each image. If omitted, the sigma value set during the op initialization would be used for all images.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import GaussNoiseOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> mus = [0.0, 5.0, 10.0]
>>> sigmas = [0.01, 0.1, 1]

>>> op = GaussNoiseOp(device, batch_size)
>>> ret = op(nds, mus, sigmas)

__init__(device: Any, batch_size: int, mu: float = 0.0, sigma: float = 1.0, per_channel: bool = False) → None[source]¶

Initialize GaussNoiseOp

Parameters:

device (Any) – the matx device used for the operation
batch_size (int) – max batch size for gaussian noise op. It is required for cuda randomness initialization. When actually calling this op, the input batch size should be equal to or less than this value.
mu (float, optional) – mu for gaussian noise. It is a global value for all images and can be overridden at call time. 0.0 by default.
sigma (float, optional) – sigma for gaussian noise. It is a global value for all images and can be overridden at call time. 1.0 by default.
per_channel (bool, optional) – For each pixel, whether to add a different noise value per channel (True), or the same noise value throughout the channels (False). False by default.
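
The difference between per_channel=False and per_channel=True can be sketched with the standard library's `random.gauss`. This is only an illustration of the idea: `add_gauss_noise` is a hypothetical helper, the clamp to [0, 255] is an assumption, and the real op draws its random numbers on the device rather than with Python's generator.

```python
import random


def add_gauss_noise(pixels, mu=0.0, sigma=1.0, per_channel=False, rng=None):
    """Add N(mu, sigma) noise to each pixel, given as a list of channel values."""
    if rng is None:
        rng = random.Random()
    noisy = []
    for pixel in pixels:
        if per_channel:
            # Draw an independent noise value for every channel.
            noise = [rng.gauss(mu, sigma) for _ in pixel]
        else:
            # Draw once and reuse the same value for all channels of the pixel.
            shared = rng.gauss(mu, sigma)
            noise = [shared] * len(pixel)
        # Clamp to the uint8 range (assumption).
        noisy.append([min(255.0, max(0.0, c + n)) for c, n in zip(pixel, noise)])
    return noisy


print(add_gauss_noise([[100.0, 110.0, 120.0]], rng=random.Random(0)))
```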

class matx.vision.GaussianBlurOp(device: Any, pad_type: str = 'BORDER_DEFAULT')[source]¶

Bases: object

Apply gaussian blur on input images.

__call__(images: List[NDArray], ksizes: List[Tuple[int, int]], sigmas: List[Tuple[float, float]], sync: int = 0) → List[NDArray][source]¶

Apply gaussian blur on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
ksizes (List[Tuple[int, int]]) – conv kernel size for each image, each item in this list should be a 2 element tuple (x, y).
sigmas (List[Tuple[float, float]]) – sigma for gaussian blur, each item in this list should be a 2 element tuple (x, y).
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import GaussianBlurOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> ksizes = [(3, 3), (3, 5), (5, 5)]
>>> sigmas = [(0.1, 0.1), (0.01, 0.01), (0.01, 0.1)]

>>> op = GaussianBlurOp(device)
>>> ret = op(nds, ksizes, sigmas)

__init__(device: Any, pad_type: str = 'BORDER_DEFAULT') → None[source]¶

Initialize GaussianBlurOp

Parameters:

device (Any) – the matx device used for the operation
pad_type (str, optional) – pixel extrapolation method. If pad_type is BORDER_CONSTANT, 0 is used as the border value.
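
How a kernel size and a sigma pair translate into blur weights can be sketched with the textbook Gaussian formula. This is a minimal illustration, not the library's kernel construction: `gaussian_kernel_1d` and `gaussian_kernel_2d` are hypothetical helpers, and reading both tuples as (x, y) = (horizontal, vertical) is an assumption.

```python
import math


def gaussian_kernel_1d(size, sigma):
    """Sample exp(-d^2 / (2 sigma^2)) around the center and normalize to sum 1."""
    center = (size - 1) / 2.0
    weights = [math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2)) for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_kernel_2d(ksize, sigma):
    """Outer product of a vertical and a horizontal 1-D kernel (separable blur)."""
    kx, ky = ksize          # assumed order: (horizontal, vertical)
    sigma_x, sigma_y = sigma
    horizontal = gaussian_kernel_1d(kx, sigma_x)
    vertical = gaussian_kernel_1d(ky, sigma_y)
    return [[v * h for h in horizontal] for v in vertical]


print(gaussian_kernel_1d(5, 1.0))
kernel = gaussian_kernel_2d((3, 5), (1.0, 1.0))
print(len(kernel), len(kernel[0]))  # 5 rows, 3 columns
```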

class matx.vision.HistEqualizeOp(device: Any)[source]¶

Bases: object

Apply histogram equalization on input images. Please refer to https://en.wikipedia.org/wiki/Histogram_equalization for more information.

__call__(images: List[NDArray], sync: int = 0) → List[NDArray][source]¶

Apply histogram equalization on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
sync (int, optional) –

Sync mode applied after the output is computed. When the device is CPU, this parameter makes no difference.
ASYNC – If the device is GPU, the whole calculation is asynchronous.
SYNC – If the device is GPU, the call blocks until this operation has finished.
SYNC_CPU – If the device is GPU, the call blocks until this operation has finished, then the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import HistEqualizeOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> op = HistEqualizeOp(device)
>>> ret = op(nds)

__init__(device: Any) → None[source]¶

Initialize HistEqualizeOp

Parameters:: device (Any) – the matx device used for the operation

class matx.vision.ImdecodeNoExceptionOp(device: Any, fmt: str, pool_size: int = 8)[source]¶

Bases: object

Decode binary images without raising an exception when an invalid image is encountered.

__call__(images: List[bytes], sync: int = 0) → Tuple[List[NDArray], List[int]][source]¶

Parameters:

images (List[bytes]) – list of binary images
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will block until the computation is completed. SYNC_CPU – If device is GPU, the whole calculation will block until the computation is completed, and the CUDA data is then copied to CPU.

Defaults to ASYNC.

Returns:

decoded images, together with a List[int] of status flags: 1 means the operation was successful, otherwise 0

Return type:

Tuple[List[matx.runtime.NDArray], List[int]]
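Callers will usually want to separate good results from failures. The sketch below assumes the two returned lists are aligned by index, which the signature suggests but this page does not state outright. keep_successful is a hypothetical helper, and plain strings stand in for decoded NDArrays.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def keep_successful(images: Sequence[T], flags: Sequence[int]) -> Tuple[List[T], List[int]]:
    """Return the images whose flag is 1, plus the indices of the failed inputs."""
    kept = [img for img, ok in zip(images, flags) if ok == 1]
    failed = [i for i, ok in enumerate(flags) if ok != 1]
    return kept, failed


# Placeholder values; in real use these would be the two lists returned by the op.
kept, failed = keep_successful(["img0", "img1", "img2"], [1, 0, 1])
```

Keeping the failed indices makes it possible to log or retry the corresponding inputs.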

__init__(device: Any, fmt: str, pool_size: int = 8) → None[source]¶

Initialize ImdecodeNoExceptionOp

Parameters:

device (matx.Device) – device used for the operation
fmt (str) – the color type of the output image; “RGB” and “BGR” are supported.
pool_size (int, optional) – concurrency of the decode operation, GPU only. Defaults to 8.

class matx.vision.ImdecodeNoExceptionRandomCropOp(device: Any, fmt: str, scale: List, ratio: List, pool_size: int = 8)[source]¶

Bases: object

__call__(images: List[bytes], sync: int = 0) → Tuple[List[NDArray], List[int]][source]¶: Call self as a function.

__init__(device: Any, fmt: str, scale: List, ratio: List, pool_size: int = 8) → None[source]¶

class matx.vision.ImdecodeOp(device: Any, fmt: str, pool_size: int = 8)[source]¶

Bases: object

Decode binary image

__call__(images: List[bytes], sync: int = 0) → List[NDArray][source]¶

Parameters:

images (List[bytes]) – list of binary images
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will block until the computation is completed. SYNC_CPU – If device is GPU, the whole calculation will block until the computation is completed, and the CUDA data is then copied to CPU.

Defaults to ASYNC.

Returns:

decoded images

Return type:

List[matx.runtime.NDArray]

Examples:

>>> import matx
>>> from matx.vision import ImdecodeOp
>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> fd = open("./origin_image.jpeg", "rb")
>>> content = fd.read()
>>> fd.close()
>>> device = matx.Device("gpu:0")
>>> decode_op = ImdecodeOp(device, "BGR")
>>> r = decode_op([content])
>>> r[0].shape()
[360, 640, 3]

__init__(device: Any, fmt: str, pool_size: int = 8) → None[source]¶

Initialize ImdecodeOp

Parameters:

device (matx.Device) – device used for the operation
fmt (str) – the color type of the output image; “RGB” and “BGR” are supported.
pool_size (int, optional) – concurrency of the decode operation, GPU only. Defaults to 8.

class matx.vision.ImdecodeRandomCropOp(device: Any, fmt: str, scale: List, ratio: List, pool_size: int = 8)[source]¶

Bases: object

Decode binary images and apply a random crop.

__call__(images: List[bytes], sync: int = 0) → List[NDArray][source]¶

Parameters:

images (List[bytes]) – list of binary images
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will block until the computation is completed. SYNC_CPU – If device is GPU, the whole calculation will block until the computation is completed, and the CUDA data is then copied to CPU.

Defaults to ASYNC.

Returns:

decoded images

Return type:

List[matx.runtime.NDArray]

Examples:

>>> import matx
>>> from matx.vision import ImdecodeRandomCropOp
>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> fd = open("./origin_image.jpeg", "rb")
>>> content = fd.read()
>>> fd.close()
>>> device = matx.Device("gpu:0")
>>> decode_op = ImdecodeRandomCropOp(device, "BGR", [0.08, 1.0], [3/4, 4/3])
>>> ret = decode_op([content])
>>> ret[0].shape()
[225, 292, 3]

__init__(device: Any, fmt: str, scale: List, ratio: List, pool_size: int = 8) → None[source]¶

Parameters:

device (matx.Device) – device used for the operation
fmt (str) – the color type of the output image; “RGB” and “BGR” are supported.
scale (List) – Specifies the lower and upper bounds for the random area of the crop, before resizing. The scale is defined with respect to the area of the original image.
ratio (List) – lower and upper bounds for the random aspect ratio of the crop, before resizing.
pool_size (int, optional) – concurrency of the decode operation, GPU only. Defaults to 8.
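How scale and ratio interact may be easier to see in code. The following is a minimal sketch, not matx's implementation. It assumes the area fraction is drawn uniformly from scale, the aspect ratio (taken here as width / height) is drawn log-uniformly from ratio, and the result is clamped to the image. sample_crop_size is a hypothetical helper.

```python
import math
import random
from typing import Sequence, Tuple


def sample_crop_size(height: int, width: int,
                     scale: Sequence[float], ratio: Sequence[float],
                     rng: random.Random) -> Tuple[int, int]:
    """Draw a (crop_h, crop_w) pair from area and aspect-ratio bounds (illustrative only)."""
    # Target area as a fraction of the original image area.
    area = height * width * rng.uniform(scale[0], scale[1])
    # Assumption: sample the aspect ratio in log space so that r and 1/r are equally likely.
    aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
    crop_w = int(round(math.sqrt(area * aspect)))
    crop_h = int(round(math.sqrt(area / aspect)))
    # Clamp so the crop always fits inside the image.
    return min(max(crop_h, 1), height), min(max(crop_w, 1), width)


rng = random.Random(0)
crop_h, crop_w = sample_crop_size(360, 640, [0.08, 1.0], [3 / 4, 4 / 3], rng)
```

The crop size varies from call to call, which is why the shape printed in the example above is not fixed.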

class matx.vision.ImencodeNoExceptionOp(device: Any, fmt: str, quality: int, optimized_Huffman: bool, pool_size: int = 8)[source]¶

Bases: object

Encode images to jpg binary without raising an exception when an invalid image is encountered.

__call__(images: Any) → Tuple[List[bytes], List[int]][source]¶

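Parameters:: images (List[matx.runtime.NDArray]) – list of images on GPU
Returns:: jpg encoded images, plus the List[int] given in the signature (by analogy with ImdecodeNoExceptionOp, likely 1 for success and 0 otherwise)
Return type:: Tuple[List[bytes], List[int]]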

Examples:

>>> import cv2
>>> import matx
>>> from matx.vision import ImencodeNoExceptionOp
>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_str = "gpu:0"
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> device = matx.Device(device_str)
>>> # quality=90 and optimized_Huffman=False are illustrative values for the required arguments
>>> encode_op = ImencodeNoExceptionOp(device, "BGR", 90, False)
>>> r, flags = encode_op(nds)

__init__(device: Any, fmt: str, quality: int, optimized_Huffman: bool, pool_size: int = 8) → None[source]¶

Initialize ImencodeNoExceptionOp

Parameters:

device (matx.Device) – device used for the operation
fmt (str) – the color type of the input image; “RGB” and “BGR” are supported.
quality (int) – the jpeg quality, valid between [1, 100]. 100 means no loss.
optimized_Huffman (bool) – whether an optimized Huffman tree is used. Enabling it usually means slower encoding but smaller binary size.
pool_size (int, optional) – concurrency of the encode operation, GPU only. Defaults to 8.

class matx.vision.ImencodeOp(device: Any, fmt: str, quality: int, optimized_Huffman: bool, pool_size: int = 8)[source]¶

Bases: object

Encode image to jpg binary

__call__(images: Any) → List[bytes][source]¶

There is no sync mode, because all data is on the CPU before the call returns.

Parameters:: images (List[matx.runtime.NDArray]) – list of images on GPU
Returns:: jpg encoded images
Return type:: List[bytes]

Examples:

>>> import cv2
>>> import matx
>>> from matx.vision import ImencodeOp
>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_str = "gpu:0"
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> device = matx.Device(device_str)
>>> # quality=90 and optimized_Huffman=False are illustrative values for the required arguments
>>> encode_op = ImencodeOp(device, "BGR", 90, False)
>>> r = encode_op(nds)

__init__(device: Any, fmt: str, quality: int, optimized_Huffman: bool, pool_size: int = 8) → None[source]¶

Initialize ImencodeOp

Parameters:

device (matx.Device) – device used for the operation
fmt (str) – the color type of the input image; “RGB” and “BGR” are supported.
quality (int) – the jpeg quality, valid between [1, 100]. 100 means no loss.
optimized_Huffman (bool) – whether an optimized Huffman tree is used. Enabling it usually means slower encoding but smaller binary size.
pool_size (int, optional) – concurrency of the encode operation, GPU only. Defaults to 8.

class matx.vision.InvertOp(device: Any, prob: float = 1.1, per_channel: bool = False, cap_value: float = 255.0)[source]¶

Bases: object

Invert all values in images. e.g. turn 20 into 255-20=235

__call__(images: List[NDArray], sync: int = 0) → List[NDArray][source]¶

Invert image pixels by subtracting each pixel value from the given cap value.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import InvertOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> op = InvertOp(device)
>>> ret = op(nds)

__init__(device: Any, prob: float = 1.1, per_channel: bool = False, cap_value: float = 255.0) → None[source]¶

Initialize InvertOp

Parameters:

device (Any) – the matx device used for the operation
prob (float, optional) – probability for inversion. Invert all by default.
per_channel (bool, optional) – whether to apply the inversion probability on each image or each channel.
cap_value (float, optional) – the minuend for inversion, 255.0 by default.
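The arithmetic itself is simply cap_value minus the pixel value. A minimal NumPy sketch follows, ignoring prob and per_channel. invert is a hypothetical helper and not part of matx.

```python
import numpy as np


def invert(image: np.ndarray, cap_value: float = 255.0) -> np.ndarray:
    """Subtract every pixel from cap_value (illustrative only)."""
    # Compute in float32 to avoid unsigned wrap-around, then cast back to the input dtype.
    return (cap_value - image.astype(np.float32)).astype(image.dtype)


pixels = np.array([0, 20, 255], dtype=np.uint8)
inverted = invert(pixels)  # 20 becomes 255 - 20 = 235
```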

class matx.vision.LaplacianBlurOp(device: Any, pad_type: str = 'BORDER_DEFAULT')[source]¶

Bases: object

Apply laplacian blur on input images.

__call__(images: List[NDArray], ksizes: List[int], scales: List[float], sync: int = 0) → List[NDArray][source]¶

Apply laplacian blur on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
ksizes (List[int]) – conv kernel size for each image, laplacian kernel is a square shaped kernel, so each item in this list is an integer.
scales (List[float]) – scale factor for laplacian blur
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import LaplacianBlurOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> ksizes = [3, 5, 3]
>>> scales = [1.0, 0.5, 0.5]

>>> op = LaplacianBlurOp(device)
>>> ret = op(nds, ksizes, scales)

__init__(device: Any, pad_type: str = 'BORDER_DEFAULT') → None[source]¶

Initialize LaplacianBlurOp

Parameters:

device (Any) – the matx device used for the operation
pad_type (str, optional) – pixel extrapolation method. If pad_type is BORDER_CONSTANT, 0 is used as the border value.

class matx.vision.MeanOp(device: Any, per_channel: bool = False)[source]¶

Bases: object

Calculate mean over each image.

__call__(images: List[NDArray], sync: int = 0) → NDArray[source]¶

Calculate mean over each image.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Returns:

mean result. For N images, the result would be shape Nx1 if per_channel is False, otherwise NxC where C is the image channel size.

Return type:

matx.runtime.NDArray

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import MeanOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> op = MeanOp(device, per_channel = False)
>>> ret = op(nds)

__init__(device: Any, per_channel: bool = False) → None[source]¶

Initialize MeanOp

Parameters:

device (Any) – the matx device used for the operation.
per_channel (bool, optional) – if True, calculate mean over each channel; if False, calculate mean over the whole image.
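The two result shapes can be illustrated with plain NumPy. This is only a sketch of the documented Nx1 and NxC shapes, assuming HxWxC images. batch_mean is a hypothetical helper, and the float32 output dtype is an assumption.

```python
import numpy as np
from typing import List


def batch_mean(images: List[np.ndarray], per_channel: bool = False) -> np.ndarray:
    """Return an Nx1 (whole image) or NxC (per channel) array of means (illustrative only)."""
    if per_channel:
        # Flatten the spatial dimensions, then average each channel separately.
        rows = [img.reshape(-1, img.shape[-1]).mean(axis=0) for img in images]
    else:
        # One scalar per image, kept as a length-1 row.
        rows = [[img.mean()] for img in images]
    return np.asarray(rows, dtype=np.float32)


images = [np.full((2, 3, 3), v, dtype=np.uint8) for v in (10, 20)]
whole = batch_mean(images)                      # shape (2, 1)
per_ch = batch_mean(images, per_channel=True)   # shape (2, 3)
```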

class matx.vision.MedianBlurOp(device: Any)[source]¶

Bases: object

Apply median blur on input images.

__call__(images: List[NDArray], ksizes: List[Tuple[int, int]], sync: int = 0) → List[NDArray][source]¶

Apply median blur on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
ksizes (List[Tuple[int, int]]) – conv kernel size for each image, each item in this list should be a 2 element tuple (x, y).
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import MedianBlurOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> ksizes = [(3, 3), (3, 5), (5, 5)]

>>> op = MedianBlurOp(device)
>>> ret = op(nds, ksizes)

__init__(device: Any) → None[source]¶

Initialize MedianBlurOp

Parameters:: device (Any) – the matx device used for the operation

class matx.vision.MixupImagesOp(device: Any)[source]¶

Bases: object

Compute the weighted sum of two images, i.e. a * img1 + b * img2. img2 must have the same width and height as img1, and must either have the same number of channels as img1 or contain only 1 channel.
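The channel rule amounts to broadcasting a single-channel img2 across the channels of img1. A minimal NumPy sketch follows. mixup is a hypothetical helper, and the rounding and clipping to uint8 are assumptions rather than documented matx behaviour.

```python
import numpy as np


def mixup(img1: np.ndarray, img2: np.ndarray, a: float, b: float) -> np.ndarray:
    """Compute a * img1 + b * img2 for HxWxC img1 and HxWxC or HxW img2 (illustrative only)."""
    x1 = img1.astype(np.float32)
    x2 = img2.astype(np.float32)
    if x2.ndim == 2:
        # HxW -> HxWx1 so the single channel broadcasts over all channels of img1.
        x2 = x2[:, :, np.newaxis]
    mixed = a * x1 + b * x2
    # Assumption: round and clip back into the uint8 range.
    return np.clip(np.round(mixed), 0, 255).astype(np.uint8)


color = np.full((2, 2, 3), 100, dtype=np.uint8)
gray = np.full((2, 2), 200, dtype=np.uint8)
blended = mixup(color, gray, 0.5, 0.5)
```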

__call__(images1: List[NDArray], images2: List[NDArray], factor1: List[float], factor2: List[float], sync: int = 0) → List[NDArray][source]¶

Weighted add up two images.

Parameters:

images1 (List[matx.runtime.NDArray]) – augend images.
images2 (List[matx.runtime.NDArray]) – addend images.
factor1 (List[float]) – weight factors for images1.
factor2 (List[float]) – weight factors for images2.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import MixupImagesOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> image_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds1 = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> nds2 = [matx.array.from_numpy(image_gray, device_str) for _ in range(batch_size)]
>>> factor1 = [0.5, 0.4, 0.3]
>>> factor2 = [1 - f for f in factor1]

>>> op = MixupImagesOp(device)
>>> ret = op(nds1, nds2, factor1, factor2)

__init__(device: Any) → None[source]¶

Initialize MixupImagesOp

Parameters:: device (Any) – the matx device used for the operation

class matx.vision.NormalizeOp(device: Any, mean: List[float], std: List[float], dtype: str = 'float32', global_shift: float = 0.0, global_scale: float = 1.0)[source]¶

Bases: object

Normalize images with mean and std, and cast the image data type to target type.

__call__(images: List[NDArray], sync: int = 0) → List[NDArray][source]¶

Normalize images with mean and std, and cast the image data type to target type.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import NormalizeOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> mean = [0.485 * 255, 0.456 * 255, 0.406 * 255]
>>> std = [0.229 * 255, 0.224 * 255, 0.225 * 255]

>>> op = NormalizeOp(device, mean, std)
>>> ret = op(nds)

__init__(device: Any, mean: List[float], std: List[float], dtype: str = 'float32', global_shift: float = 0.0, global_scale: float = 1.0) → None[source]¶

Initialize NormalizeOp

Parameters:

device (Any) – the matx device used for the operation
mean (List[float]) – mean for normalize
std (List[float]) – std for normalize
dtype (str, optional) – output data type when normalize finished, float32 by default.
global_shift (float, optional) – shift value for all pixels after the normalization, 0.0 by default.
global_scale (float, optional) – scale factor value for all pixels after the normalization, 1.0 by default.
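As a rough illustration, the conventional per-channel normalization is (x - mean) / std followed by a dtype cast. The NumPy sketch below assumes that formula and deliberately leaves out global_shift and global_scale, because the order in which they are applied is not specified here. normalize is a hypothetical helper.

```python
import numpy as np
from typing import Sequence


def normalize(image: np.ndarray, mean: Sequence[float], std: Sequence[float],
              dtype: str = "float32") -> np.ndarray:
    """Apply (x - mean) / std per channel to an HxWxC image (illustrative only)."""
    x = image.astype(np.float32)
    m = np.asarray(mean, dtype=np.float32)  # broadcasts over the channel axis
    s = np.asarray(std, dtype=np.float32)
    return ((x - m) / s).astype(dtype)


image = np.full((2, 2, 3), 255, dtype=np.uint8)
mean = [0.485 * 255, 0.456 * 255, 0.406 * 255]
std = [0.229 * 255, 0.224 * 255, 0.225 * 255]
normalized = normalize(image, mean, std)
```

Multiplying mean and std by 255, as in the example above, keeps them in the same 0 to 255 range as uint8 pixel values.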

class matx.vision.PadOp(device: Any, size: Tuple[int, int], pad_values: Tuple[int, int, int] = (0, 0, 0), pad_type: str = 'BORDER_CONSTANT', with_corner: bool = False)[source]¶

Bases: object

Forms a border around given image.

__call__(images: List[NDArray], sync: int = 0) → List[NDArray][source]¶

Pad input images.

Parameters:

images (List[matx.runtime.NDArray]) – input images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Returns:

padded images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import PadOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> op = PadOp(device=device,
               size=(224, 224),
               pad_values=(0, 0, 0),
               pad_type=matx.vision.BORDER_CONSTANT)
>>> ret = op(nds)

__init__(device: Any, size: Tuple[int, int], pad_values: Tuple[int, int, int] = (0, 0, 0), pad_type: str = 'BORDER_CONSTANT', with_corner: bool = False) → None[source]¶

Initialize PadOp

Parameters:

device (Any) – the matx device used for the operation.
size (Tuple[int, int]) – output size for all images, must be 2 dim tuple.
pad_values (Tuple[int, int, int], optional) – Border value used when pad_type is BORDER_CONSTANT. It is a 3-element tuple, one padding value per channel. Defaults to (0, 0, 0).
pad_type (str, optional) – pad mode, chosen from BORDER_CONSTANT, BORDER_REPLICATE, BORDER_REFLECT and BORDER_WRAP; see cv_border_types for further pad types. Defaults to BORDER_CONSTANT.
with_corner (bool, optional) – If True, forms a border in lower right of the image. Defaults to False.

class matx.vision.PadWithBorderOp(device: Any, pad_values: Tuple[int, int, int] = (0, 0, 0), pad_type: str = 'BORDER_CONSTANT')[source]¶

Bases: object

Forms a border around given image.

__call__(images: List[NDArray], top_pads: List[int], bottom_pads: List[int], left_pads: List[int], right_pads: List[int], sync: int = 0) → List[NDArray][source]¶

Pad input images with border.

Parameters:

images (List[matx.runtime.NDArray]) – input images.
top_pads (List[int]) – The number of pixels to pad above each image.
bottom_pads (List[int]) – The number of pixels to pad below each image.
left_pads (List[int]) – The number of pixels to pad to the left of each image.
right_pads (List[int]) – The number of pixels to pad to the right of each image.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Returns:

padded images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import PadWithBorderOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> op = PadWithBorderOp(device=device,
                         pad_values=(0, 0, 0),
                         pad_type=matx.vision.BORDER_CONSTANT)
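>>> # Illustrative pad widths, one entry per image in the batch
>>> top_pads = [10, 20, 30]
>>> bottom_pads = [10, 20, 30]
>>> left_pads = [5, 5, 5]
>>> right_pads = [5, 5, 5]
>>> ret = op(nds, top_pads, bottom_pads, left_pads, right_pads)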

__init__(device: Any, pad_values: Tuple[int, int, int] = (0, 0, 0), pad_type: str = 'BORDER_CONSTANT') → None[source]¶

Initialize PadWithBorderOp

Parameters:

device (Any) – the matx device used for the operation.
pad_values (Tuple[int, int, int], optional) – Border value used when pad_type is BORDER_CONSTANT. It is a 3-element tuple, one padding value per channel. Defaults to (0, 0, 0).
pad_type (str, optional) – pad mode, chosen from BORDER_CONSTANT, BORDER_REPLICATE, BORDER_REFLECT and BORDER_WRAP; see cv_border_types for further pad types. Defaults to BORDER_CONSTANT.
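For the BORDER_CONSTANT case, the effect of the four pad widths on one image can be sketched in NumPy. pad_with_border is a hypothetical helper that assumes an HxWx3 input. It covers constant padding only and is not the matx implementation.

```python
import numpy as np
from typing import Tuple


def pad_with_border(image: np.ndarray, top: int, bottom: int, left: int, right: int,
                    pad_values: Tuple[int, int, int] = (0, 0, 0)) -> np.ndarray:
    """Surround an HxWx3 image with a constant-colored border (illustrative only)."""
    h, w, c = image.shape
    out = np.empty((h + top + bottom, w + left + right, c), dtype=image.dtype)
    out[...] = np.asarray(pad_values, dtype=image.dtype)  # fill with the per-channel value
    out[top:top + h, left:left + w] = image               # paste the original image
    return out


image = np.full((2, 3, 3), 9, dtype=np.uint8)
padded = pad_with_border(image, top=1, bottom=2, left=3, right=4)
```

The output height is h + top + bottom and the output width is w + left + right.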

class matx.vision.PosterizeOp(device: Any, bit: int = 4, prob: float = 1.1)[source]¶

Bases: object

Apply posterization on images. i.e. remove certain bits for each pixel value, e.g. with bit=4, pixel 77 would become 64 (the last 4 bits are set to 0).
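The bit arithmetic in the example above can be reproduced with a mask. Here is a minimal NumPy sketch for uint8 data. posterize is a hypothetical helper and not the matx implementation.

```python
import numpy as np


def posterize(image: np.ndarray, bit: int) -> np.ndarray:
    """Set the lowest `bit` bits of every uint8 pixel to 0 (illustrative only)."""
    mask = (0xFF << bit) & 0xFF  # e.g. bit=4 gives 0b11110000
    return image & np.uint8(mask)


pixels = np.array([77, 255, 15], dtype=np.uint8)
result = posterize(pixels, 4)  # 77 = 0b01001101 becomes 0b01000000 = 64
```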

__call__(images: List[NDArray], bits: List[int] = [], sync: int = 0) → List[NDArray][source]¶

Apply posterization on images. Only support uint8 images

Parameters:

images (List[matx.runtime.NDArray]) – target images.
bits (List[int]) – posterization bit for each image. If not given, the bit for op initialization would be used.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import PosterizeOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> bits = [1, 4, 7]

>>> op = PosterizeOp(device)
>>> ret = op(nds, bits)

__init__(device: Any, bit: int = 4, prob: float = 1.1) → None[source]¶

Initialize PosterizeOp

Parameters:

device (Any) – the matx device used for the operation
bit (int, optional) – bit for posterization for all images, range from [0, 8], set to 4 by default.
prob (float, optional) – probability for posterization on each image. Apply on all by default.

class matx.vision.RandomDropoutOp(device: Any, batch_size: int, prob: float = 0.01, per_channel: bool = False)[source]¶

Bases: object

Randomly drop out some pixels (set to 0) for input images.

__call__(images: List[NDArray], probs: List[float] = [], sync: int = 0) → List[NDArray][source]¶

Randomly drop out some pixels (set to 0) for input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
probs (List[float], optional) – drop out probability for each image. If omitted, the value set during the op initialization would be used for all images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import RandomDropoutOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> probs = [0.1, 0.01, 0.5]

>>> op = RandomDropoutOp(device, batch_size)
>>> ret = op(nds, probs)

__init__(device: Any, batch_size: int, prob: float = 0.01, per_channel: bool = False) → None[source]¶

Initialize RandomDropoutOp

Parameters:

device (Any) – the matx device used for the operation
batch_size (int) – max batch size for this op. It is required for CUDA randomness initialization. When actually calling this op, the input batch size should be equal to or less than this value.
prob (float, optional) – the probability for each pixel to be dropped out, range from 0 to 1, 0.01 by default, can be overridden in runtime.
per_channel (bool, optional) – For each pixel, whether to drop out the value differently for each channel (True), or drop out the value through out all the channels (False). False by default.
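The difference between the two per_channel settings comes down to the shape of the random mask. A minimal NumPy sketch follows, assuming an HxWxC image. random_dropout is a hypothetical helper, and the CPU random generator here is unrelated to the CUDA randomness the op uses.

```python
import numpy as np


def random_dropout(image: np.ndarray, prob: float, per_channel: bool,
                   rng: np.random.Generator) -> np.ndarray:
    """Set pixels of an HxWxC image to 0 with probability `prob` (illustrative only)."""
    h, w, c = image.shape
    if per_channel:
        # Independent decision for every channel of every pixel.
        drop = rng.random((h, w, c)) < prob
    else:
        # One decision per pixel, shared by all of its channels.
        drop = rng.random((h, w, 1)) < prob
    return np.where(drop, 0, image).astype(image.dtype)


rng = np.random.default_rng(0)
image = np.full((4, 4, 3), 200, dtype=np.uint8)
dropped = random_dropout(image, 0.5, False, rng)
untouched = random_dropout(image, 0.0, True, rng)
```

With per_channel=False, every pixel is either fully kept or zeroed in all channels.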

class matx.vision.RandomResizedCropOp(device: Any, size: Tuple[int, int], scale: List[float], ratio: List[float], interp: str = 'INTER_LINEAR')[source]¶

Bases: object

Apply a random resized crop to the given images on GPU.

__call__(images: List[NDArray], sync: int = 0) → List[NDArray][source]¶

Crop and resize images according to scale and ratio.

Parameters:

images (List[matx.runtime.NDArray]) – input images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – If device is GPU, the whole calculation process is asynchronous. SYNC – If device is GPU, the whole calculation will be blocked until this operation is finished. SYNC_CPU – If device is GPU, the whole calculation will be blocked until this operation is finished, and the corresponding CPU array would be created and returned.

Defaults to ASYNC.

Returns:

RandomResizedCrop images.

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import RandomResizedCropOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> op = RandomResizedCropOp(device=device,
                             size=(224, 224),
                             scale=[0.8, 1.0],
                             ratio=[0.8, 1.25],
                             interp=matx.vision.INTER_LINEAR)
>>> ret = op(nds)

__init__(device: Any, size: Tuple[int, int], scale: List[float], ratio: List[float], interp: str = 'INTER_LINEAR') → None[source]¶

Initialize RandomResizedCropOp

Parameters:

device (Any) – the matx device used for the operation.
size (Tuple[int, int]) – output size for all images, must be 2 dim tuple.
scale (List[float]) – Specifies the lower and upper bounds for the random area of the crop, before resizing. The scale is defined with respect to the area of the original image.
ratio (List[float]) – lower and upper bounds for the random aspect ratio of the crop, before resizing.
interp (str, optional) – Desired interpolation. INTER_NEAREST – a nearest-neighbor interpolation; INTER_LINEAR – a bilinear interpolation (used by default); INTER_CUBIC – a bicubic interpolation over 4x4 pixel neighborhood; PILLOW_INTER_LINEAR – a bilinear interpolation, similar to Pillow (GPU only). Defaults to INTER_LINEAR.

class matx.vision.ResizeOp(device: Any, size: Tuple[int, int] = (-1, -1), max_size: int = 0, interp: str = 'INTER_LINEAR', mode: str = 'default')[source]¶

Bases: object

Resize input images.

__call__(images: List[NDArray], size: List[Tuple[int, int]] = [], sync: int = 0) → List[NDArray][source]¶

Resize input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
size (List[Tuple[int, int]], optional) – target size for each image, each a 2-element tuple (h, w). If omitted, the target size set at op initialization is used for all images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import ResizeOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> op = ResizeOp(device, size=(224, 224), mode=matx.vision.RESIZE_NOT_SMALLER)
>>> ret = op(nds)

__init__(device: Any, size: Tuple[int, int] = (-1, -1), max_size: int = 0, interp: str = 'INTER_LINEAR', mode: str = 'default') → None[source]¶

Initialize ResizeOp

Parameters:

device (Any) – the matx device used for the operation.
size (Tuple[int, int], optional) – output size for all images, must be a 2-element tuple. If omitted, the size must be given when calling.
max_size (int, optional) – used in RESIZE_NOT_SMALLER mode to make sure output size is not too large.
interp (str, optional) – desired interpolation method. INTER_NEAREST – nearest-neighbor interpolation; INTER_LINEAR – bilinear interpolation; INTER_CUBIC – bicubic interpolation over a 4x4 pixel neighborhood; PILLOW_INTER_LINEAR – bilinear interpolation similar to Pillow (GPU only). Defaults to INTER_LINEAR.
mode (str, optional) – resize mode, one of RESIZE_DEFAULT, RESIZE_NOT_LARGER, and RESIZE_NOT_SMALLER. Defaults to RESIZE_DEFAULT.
RESIZE_DEFAULT – resize to the target output size.
RESIZE_NOT_LARGER – keep the width/height ratio; one dimension of the output equals the target and the other is smaller. E.g. original image shape (360, 240), target size (480, 360), output size (480, 320).
RESIZE_NOT_SMALLER – keep the width/height ratio; one dimension of the output equals the target and the other is larger. E.g. original image shape (360, 240), target size (480, 360), output size (540, 360).
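The output sizes quoted for the three modes can be reproduced with a small helper. This is a sketch, not the matx implementation: resized_shape and the plain mode strings are hypothetical stand-ins for the matx.vision.RESIZE_* constants, and max_size is deliberately left out because its exact clamping rule is not described here.

```python
def resized_shape(src, target, mode="default"):
    """Return the (h, w) output size for one image.

    `mode` uses hypothetical strings; the real constants are
    matx.vision.RESIZE_DEFAULT / RESIZE_NOT_LARGER / RESIZE_NOT_SMALLER.
    """
    (h, w), (th, tw) = src, target
    if mode == "default":
        return th, tw
    # Compare th / h with tw / w without floating point.
    height_scale_is_smaller = th * w <= tw * h
    if mode == "not_larger":
        fit_height = height_scale_is_smaller      # use the smaller scale
    elif mode == "not_smaller":
        fit_height = not height_scale_is_smaller  # use the larger scale
    else:
        raise ValueError("unknown mode: " + mode)
    if fit_height:
        return th, round(w * th / h)
    return round(h * tw / w), tw


print(resized_shape((360, 240), (480, 360), "not_larger"))   # (480, 320)
print(resized_shape((360, 240), (480, 360), "not_smaller"))  # (540, 360)
```

Both printed sizes match the examples given for RESIZE_NOT_LARGER and RESIZE_NOT_SMALLER.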

class matx.vision.RotateOp(device: Any, pad_type: str = 'BORDER_CONSTANT', pad_values: Tuple[int, int, int] = (0, 0, 0), interp: str = 'INTER_LINEAR', expand: bool = False)[source]¶

Bases: object

Apply image rotation.

__call__(images: List[NDArray], angles: List[float], center: List[Tuple[int, int]] = [], sync: int = 0) → List[NDArray][source]¶

Apply rotation on images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
angles (List[float]) – rotation angle for each image
center (List[Tuple[int, int]], optional) – rotation center (y, x) for each image. If omitted, the image center is used as the rotation center.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import RotateOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> angles = [10, 20, 30]

>>> op = RotateOp(device, expand = True)
>>> ret = op(nds, angles)

__init__(device: Any, pad_type: str = 'BORDER_CONSTANT', pad_values: Tuple[int, int, int] = (0, 0, 0), interp: str = 'INTER_LINEAR', expand: bool = False) → None[source]¶

Initialize RotateOp

Parameters:

device (Any) – the matx device used for the operation
pad_type (str, optional) – border type to fill the target image, use constant value by default.
pad_values (Tuple[int, int, int], optional) – the border value to fill the target image if pad_type is BORDER_CONSTANT, (0, 0, 0) by default.
interp (str, optional) – desired interpolation method. INTER_NEAREST – a nearest-neighbor interpolation; INTER_LINEAR – a bilinear interpolation (used by default); INTER_CUBIC – a bicubic interpolation over 4x4 pixel neighborhood; INTER_LINEAR by default.
expand (bool, optional) – control the shape of rotated image. If False, the rotated images would be center cropped into the original size; if True, expand the output to make it large enough to hold the entire rotated image.
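For expand=True, the size needed to hold the entire rotated image follows from the rotated bounding box. The helper below is a sketch of that geometry only; expanded_size is a hypothetical name, and the exact rounding used by matx is an assumption.

```python
import math


def expanded_size(height, width, angle_degrees):
    """Smallest (h, w) that holds the whole image after rotation."""
    theta = math.radians(angle_degrees)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    new_w = width * cos_t + height * sin_t
    new_h = width * sin_t + height * cos_t
    # The small epsilon keeps float noise (e.g. cos(90 deg) != 0 exactly)
    # from bumping the result up by one pixel.
    return math.ceil(new_h - 1e-9), math.ceil(new_w - 1e-9)


for angle in (10, 20, 30):
    print(angle, expanded_size(360, 640, angle))
```

With expand=False the output keeps the original (h, w), so corners that rotate outside that frame are cropped.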

class matx.vision.SaltAndPepperOp(device: Any, batch_size: int, noise_prob: float = 0.01, salt_prob: float = 0.5, per_channel: bool = False)[source]¶

Bases: object

Apply salt and pepper noise on input images.

__call__(images: List[NDArray], noise_probs: List[float] = [], salt_probs: List[float] = [], sync: int = 0) → List[NDArray][source]¶

Apply salt and pepper noise on input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
noise_probs (List[float], optional) – probability of adding salt and pepper noise for each image. If omitted, the value set at op initialization is used for all images.
salt_probs (List[float], optional) – probability of the noise being salt (rather than pepper) for each image. If omitted, the value set at op initialization is used for all images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import SaltAndPepperOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> noise_probs = [0.1, 0.01, 0.5]
>>> salt_probs = [0.1, 0.5, 0.9]

>>> op = SaltAndPepperOp(device, batch_size)
>>> ret = op(nds, noise_probs, salt_probs)

__init__(device: Any, batch_size: int, noise_prob: float = 0.01, salt_prob: float = 0.5, per_channel: bool = False) → None[source]¶

Initialize SaltAndPepperOp

Parameters:

device (Any) – the matx device used for the operation
batch_size (int) – max batch size for the salt and pepper noise op. It is required for CUDA randomness initialization. When calling this op, the input batch size must be less than or equal to this value.
noise_prob (float, optional) – the probability of adding salt and pepper noise to each pixel, in the range 0 to 1. 0.01 by default; can be overridden at runtime.
salt_prob (float, optional) – for pixels that receive salt and pepper noise, the probability that the noise is salt, in the range 0 to 1. The pepper probability is then (1 - salt_prob). 0.5 by default; can be overridden at runtime.
per_channel (bool, optional) – for each pixel, whether to add the noise per channel with different values (True), or throughout the channels using the same value (False). False by default.
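How noise_prob, salt_prob and per_channel interact can be shown on nested lists in plain Python. This is a sketch under assumptions: salt_and_pepper is a hypothetical helper, salt is taken to be 255 and pepper 0 for uint8 data, and with per_channel=True the decision to add noise is assumed to be made independently for every channel.

```python
import random


def salt_and_pepper(image, noise_prob, salt_prob, per_channel, rng):
    """image: rows of pixels, each pixel a list of uint8 channel values."""
    def noisy_value():
        # Salt (255) with probability salt_prob, pepper (0) otherwise.
        return 255 if rng.random() < salt_prob else 0

    out = []
    for row in image:
        new_row = []
        for pixel in row:
            if per_channel:
                # Assumed: decide and draw independently for every channel.
                new_pixel = [
                    noisy_value() if rng.random() < noise_prob else value
                    for value in pixel
                ]
            elif rng.random() < noise_prob:
                # One draw shared by all channels of this pixel.
                new_pixel = [noisy_value()] * len(pixel)
            else:
                new_pixel = list(pixel)
            new_row.append(new_pixel)
        out.append(new_row)
    return out


image = [[[10, 20, 30], [40, 50, 60]]]
print(salt_and_pepper(image, 0.5, 0.5, False, random.Random(0)))
```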

class matx.vision.SharpenOp(device: Any, alpha: float = 1.0, lightness: float = 1.0, pad_type: str = 'BORDER_DEFAULT')[source]¶

Bases: object

Sharpen images and alpha-blend the result with the original input images. The sharpen kernel is [[-1, -1, -1], [-1, 8+l, -1], [-1, -1, -1]], where l controls the lightness.

__call__(images: List[NDArray], alpha: List[float] = [], lightness: List[float] = [], sync: int = 0) → List[NDArray][source]¶

Sharpen images and alpha-blend the result with the original input images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
alpha (List[float], optional) – blending factor for each image. If omitted, the alpha set in op initialization would be used for all images.
lightness (List[float], optional) – lightness/brightness for each image. If omitted, the lightness set in op initialization would be used for all images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import SharpenOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> # create parameters for sharpen
>>> alpha = [0.1, 0.5, 0.9]
>>> lightness = [0, 1, 2]

>>> op = SharpenOp(device)
>>> ret = op(nds, alpha, lightness)

__init__(device: Any, alpha: float = 1.0, lightness: float = 1.0, pad_type: str = 'BORDER_DEFAULT') → None[source]¶

Initialize SharpenOp

Parameters:

device (Any) – the matx device used for the operation
alpha (float, optional) – alpha-blend factor, 1.0 by default, which means only keep the sharpened image.
lightness (float, optional) – lightness/brightness of the sharpened image, 1.0 by default.
pad_type (str, optional) – pixel extrapolation method, BORDER_DEFAULT by default. If pad_type is BORDER_CONSTANT, 0 is used as the border value.
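Because convolution is linear, alpha-blending the sharpened image with the original is equivalent (ignoring clipping to the uint8 range) to convolving once with a blended kernel. The sketch below builds that kernel from the documented one; sharpen_kernel is a hypothetical helper, and whether matx blends kernels or images internally is an assumption.

```python
def sharpen_kernel(alpha, lightness):
    """Blend the identity kernel with the documented sharpen kernel."""
    identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    sharpen = [[-1, -1, -1], [-1, 8 + lightness, -1], [-1, -1, -1]]
    return [
        [(1 - alpha) * identity[r][c] + alpha * sharpen[r][c] for c in range(3)]
        for r in range(3)
    ]


for row in sharpen_kernel(alpha=0.5, lightness=1.0):
    print(row)
```

The documented kernel sums to l, so with alpha = 1 and lightness = 1 a flat region keeps its brightness, while larger lightness values brighten the result.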

class matx.vision.SolarizeOp(device: Any, threshold: float = 128.0, prob: float = 1.1)[source]¶

Bases: object

Apply solarization on images, i.e. invert each pixel value that is above the given threshold.

__call__(images: List[NDArray], threshold: List[float] = [], sync: int = 0) → List[NDArray][source]¶

Apply solarization on images. Only uint8 images are supported.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
threshold (List[float], optional) – solarization threshold for each image. If omitted, the threshold set at op initialization is used for all images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import SolarizeOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> threshold = [80, 160, 240]

>>> op = SolarizeOp(device)
>>> ret = op(nds, threshold)

__init__(device: Any, threshold: float = 128.0, prob: float = 1.1) → None[source]¶

Initialize SolarizeOp

Parameters:

device (Any) – the matx device used for the operation
threshold (float, optional) – solarization threshold for all images, 128 by default.
prob (float, optional) – probability of applying solarization to each image. The default of 1.1 means it is applied to all images.
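The per-pixel rule can be written in one line of plain Python. This sketch assumes that "above the threshold" is a strict comparison and that inversion means 255 - v for uint8 data; solarize here is a hypothetical helper operating on a flat list of values, not the op itself.

```python
def solarize(pixels, threshold=128.0):
    """Invert uint8 values that are strictly above `threshold` (assumed)."""
    return [255 - v if v > threshold else v for v in pixels]


print(solarize([0, 100, 128, 200, 255]))  # [0, 100, 128, 55, 0]
```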

class matx.vision.SplitOp(device: Any)[source]¶

Bases: object

Split the input image along the channel dimension. The input is a single image.

__call__(image: NDArray, sync: int = 0) → List[NDArray][source]¶

Split the input image along the channel dimension.

Parameters:

image (matx.runtime.NDArray) – target image.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import SplitOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> nd = matx.array.from_numpy(image, device_str)

>>> op = SplitOp(device)
>>> ret = op(nd)

__init__(device: Any) → None[source]¶

Initialize SplitOp

Parameters:

device (Any) – the matx device used for the operation

class matx.vision.StackOp(device: Any)[source]¶

Bases: object

Stack images along the first dimension.

__call__(images: List[NDArray], sync: int = 0) → NDArray[source]¶

Parameters:

images (List[matx.runtime.NDArray]) – input images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until the computation is completed.
SYNC_CPU – if the device is GPU, the call blocks until the computation is completed, then the CUDA data is copied to CPU.

Defaults to ASYNC.

Returns:

matx.runtime.NDArray

Examples:

>>> import matx
>>> from matx.vision import ImdecodeOp, StackOp
>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> fd = open("./origin_image.jpeg", "rb")
>>> content = fd.read()
>>> fd.close()
>>> device = matx.Device("gpu:0")
>>> decode_op = ImdecodeOp(device, "BGR")
>>> images = decode_op([content, content])
>>> stack_op = StackOp(device)
>>> r = stack_op(images, sync = matx.vision.SYNC)
>>> r.shape()
[2, 360, 640, 3]

__init__(device: Any) → None[source]¶

Parameters:

device (matx.Device) – device used for the operation

class matx.vision.SumOp(device: Any, per_channel: bool = False)[source]¶

Bases: object

Sum over each image.

__call__(images: List[NDArray], sync: int = 0) → NDArray[source]¶

Sum over each image.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

summation result. For N images, the result would be shape Nx1 if per_channel is False, otherwise NxC where C is the image channel size.

Return type:

matx.runtime.NDArray

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import SumOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]

>>> op = SumOp(device, per_channel = False)
>>> ret = op(nds)

__init__(device: Any, per_channel: bool = False) → None[source]¶

Initialize SumOp

Parameters:

device (Any) – the matx device used for the operation.
per_channel (bool, optional) – if True, sum over each channel; if False, sum over the whole image.
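The result shapes for the two per_channel settings can be checked with nested lists in plain Python. This is a sketch of the reduction only; sum_images is a hypothetical helper and says nothing about the output dtype that matx uses.

```python
def sum_images(images, per_channel=False):
    """images: list of H x W x C nested lists. Returns N x 1 or N x C."""
    result = []
    for image in images:
        channels = len(image[0][0])
        totals = [0] * channels
        for row in image:
            for pixel in row:
                for c, value in enumerate(pixel):
                    totals[c] += value
        result.append(totals if per_channel else [sum(totals)])
    return result


image = [[[1, 2, 3], [4, 5, 6]]]  # one 1 x 2 image with 3 channels
print(sum_images([image, image]))                    # [[21], [21]]
print(sum_images([image, image], per_channel=True))  # [[5, 7, 9], [5, 7, 9]]
```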

class matx.vision.TransposeNormalizeOp(device: Any, mean: List[float], std: List[float], input_layout: str, output_layout: str, dtype: str = 'float32', global_shift: float = 0.0, global_scale: float = 1.0)[source]¶

Bases: object

Normalize images with mean and std, cast the image data type to target type, stack the images into a single array, and then update the array format (e.g. NHWC or NCHW).

__call__(images: List[NDArray], sync: int = 0) → NDArray[source]¶

Normalize images with mean and std, cast the image data type to target type, stack the images into a single array, and then update the array format (e.g. NHWC or NCHW).

Parameters:

images (List[matx.runtime.NDArray]) – target images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

matx.runtime.NDArray

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import TransposeNormalizeOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> mean = [0.485 * 255, 0.456 * 255, 0.406 * 255]
>>> std = [0.229 * 255, 0.224 * 255, 0.225 * 255]
>>> input_layout = matx.vision.NHWC
>>> output_layout = matx.vision.NCHW

>>> op = TransposeNormalizeOp(device, mean, std, input_layout, output_layout)
>>> ret = op(nds)

__init__(device: Any, mean: List[float], std: List[float], input_layout: str, output_layout: str, dtype: str = 'float32', global_shift: float = 0.0, global_scale: float = 1.0) → None[source]¶

Initialize TransposeNormalizeOp

Parameters:

device (Any) – the matx device used for the operation
mean (List[float]) – mean used for normalization.
std (List[float]) – std used for normalization.
input_layout (str) – the data layout of the stacked input images, e.g. NHWC.
output_layout (str) – the target data layout, e.g. NCHW.
dtype (str, optional) – output data type after normalization, float32 by default.
global_shift (float, optional) – shift value applied to all pixels after the normalization, 0.0 by default.
global_scale (float, optional) – scale factor applied to all pixels after the normalization, 1.0 by default.
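The combined normalize, stack and layout change can be sketched on nested lists. Two points are assumptions rather than documented behavior: mean and std are applied per channel, and the global terms are applied as value * global_scale + global_shift after normalization. transpose_normalize is a hypothetical helper that only covers the NHWC to NCHW case.

```python
def transpose_normalize(images, mean, std, global_shift=0.0, global_scale=1.0):
    """images: N nested lists of H x W x C. Returns N x C x H x W."""
    batch = []
    for image in images:
        height, width, channels = len(image), len(image[0]), len(image[0][0])
        planes = []
        for c in range(channels):
            plane = [
                [
                    # Assumed order: normalize, then scale, then shift.
                    (image[y][x][c] - mean[c]) / std[c] * global_scale + global_shift
                    for x in range(width)
                ]
                for y in range(height)
            ]
            planes.append(plane)
        batch.append(planes)
    return batch


image = [[[10, 20, 30], [12, 24, 36]]]  # 1 x 2 image, 3 channels
out = transpose_normalize([image], mean=[10, 20, 30], std=[2, 4, 6])
print(out)
```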

class matx.vision.TransposeOp(device: Any, input_layout: str, output_layout: str)[source]¶

Bases: object

Convert the image tensor layout. This operator only supports the GPU backend.

__call__(images: NDArray, sync: int = 0) → NDArray[source]¶

Transpose image tensor.

Parameters:

images (matx.runtime.NDArray) – input images.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

transposed images.

Return type:

matx.runtime.NDArray

Example:

>>> import cv2
>>> import matx
>>> import numpy as np
>>> from matx.vision import TransposeOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> batch_image = np.stack([image, image, image, image])
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a NHWC image tensor
>>> nds = matx.array.from_numpy(batch_image, device_str)

>>> op = TransposeOp(device=device,
...                  input_layout=matx.vision.NHWC,
...                  output_layout=matx.vision.NCHW)
>>> ret = op(nds)

__init__(device: Any, input_layout: str, output_layout: str) → None[source]¶

Initialize TransposeOp

Parameters:

device (Any) – the matx device used for the operation.
input_layout (str) – the input image tensor layout. Only NCHW and NHWC are supported.
output_layout (str) – the desired image tensor layout. Only NCHW and NHWC are supported.
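A layout change is an axis permutation, which can be derived from the two layout names. In this sketch the plain strings "NHWC" and "NCHW" stand in for the matx.vision layout constants, and layout_permutation and permuted_shape are hypothetical helpers; the 360 x 640 image size is only an example.

```python
def layout_permutation(input_layout, output_layout):
    """Axis order that turns `input_layout` into `output_layout`."""
    return tuple(input_layout.index(axis) for axis in output_layout)


def permuted_shape(shape, input_layout, output_layout):
    """Shape of the tensor after the layout change."""
    perm = layout_permutation(input_layout, output_layout)
    return [shape[i] for i in perm]


print(layout_permutation("NHWC", "NCHW"))                # (0, 3, 1, 2)
print(permuted_shape([4, 360, 640, 3], "NHWC", "NCHW"))  # [4, 3, 360, 640]
```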

class matx.vision.WarpAffineOp(device: Any, pad_type: str = 'BORDER_CONSTANT', pad_values: Tuple[int, int, int] = (0, 0, 0), interp: str = 'INTER_LINEAR')[source]¶

Bases: object

Apply warp affine on images.

__call__(images: List[NDArray], affine_matrix: List[List[List[float]]], dsize: List[Tuple[int, int]] = [], sync: int = 0) → List[NDArray][source]¶

Apply warp affine on images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
affine_matrix (List[List[List[float]]]) – affine matrix for each image, each matrix should be of shape 2x3.
dsize (List[Tuple[int, int]], optional) – target output size (h, w) for affine transformation. If omitted, the image original shape would be used.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import WarpAffineOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> affine_matrix1 = [[0, 1, 0], [-1, 0, 0]] # rotate
>>> affine_matrix2 = [[1, 0, 10], [0, 1, 10]] # shift
>>> affine_matrix3 = [[1, 0, 0], [0.15, 1, 0]] # shear
>>> affine_matrix = [affine_matrix1, affine_matrix2, affine_matrix3]

>>> op = WarpAffineOp(device)
>>> ret = op(nds, affine_matrix)

__init__(device: Any, pad_type: str = 'BORDER_CONSTANT', pad_values: Tuple[int, int, int] = (0, 0, 0), interp: str = 'INTER_LINEAR') → None[source]¶

Initialize WarpAffineOp

Parameters:

device (Any) – the matx device used for the operation
pad_type (str, optional) – border type to fill the target image, use constant value by default.
pad_values (Tuple[int, int, int], optional) – the border value to fill the target image if pad_type is BORDER_CONSTANT, (0, 0, 0) by default.
interp (str, optional) – desired interpolation method. INTER_NEAREST – a nearest-neighbor interpolation; INTER_LINEAR – a bilinear interpolation (used by default); INTER_CUBIC – a bicubic interpolation over 4x4 pixel neighborhood; INTER_LINEAR by default.
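What a 2x3 affine matrix does to a single coordinate can be checked by hand. The sketch below applies the matrix as a forward map on (x, y); whether matx interprets the matrix as source-to-destination or the inverse is not stated here, so treat the direction as an assumption. apply_affine is a hypothetical helper.

```python
def apply_affine(matrix, point):
    """Map (x, y) through a 2x3 affine matrix [[a, b, tx], [c, d, ty]]."""
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return a * x + b * y + tx, c * x + d * y + ty


rotate = [[0, 1, 0], [-1, 0, 0]]
shift = [[1, 0, 10], [0, 1, 10]]

print(apply_affine(rotate, (5, 7)))  # (7, -5)
print(apply_affine(shift, (5, 7)))   # (15, 17)
```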

class matx.vision.WarpPerspectiveOp(device: Any, pad_type: str = 'BORDER_CONSTANT', pad_values: Tuple[int, int, int] = (0, 0, 0), interp: str = 'INTER_LINEAR')[source]¶

Bases: object

Apply warp perspective on images.

__call__(images: List[NDArray], pts: List[List[List[Tuple[float, float]]]], dsize: List[Tuple[int, int]] = [], sync: int = 0) → List[NDArray][source]¶

Apply warp perspective on images.

Parameters:

images (List[matx.runtime.NDArray]) – target images.
pts (List[List[List[Tuple[float, float]]]]) – coordinate pairs of src and dst points. The shape of pts is Nx2xMx2, where N is the batch size, the first 2 selects the src or dst points, M is the number of points for src/dst, and the last 2 holds the coordinates of each point as a 2-element tuple (x, y). See the example below for usage.
dsize (List[Tuple[int, int]], optional) – target output size (h, w) for perspective transformation. If omitted, the image original shape would be used.
sync (int, optional) –

sync mode after calculating the output. When the device is CPU, this parameter makes no difference.
ASYNC – if the device is GPU, the whole calculation is asynchronous.
SYNC – if the device is GPU, the call blocks until this operation is finished.
SYNC_CPU – if the device is GPU, the call blocks until this operation is finished, and the corresponding CPU array is created and returned.

Defaults to ASYNC.

Returns:

converted images

Return type:

List[matx.runtime.NDArray]

Example:

>>> import cv2
>>> import matx
>>> from matx.vision import WarpPerspectiveOp

>>> # Get origin_image.jpeg from https://github.com/bytedance/matxscript/tree/main/test/data/origin_image.jpeg
>>> image = cv2.imread("./origin_image.jpeg")
>>> height, width = image.shape[:2]
>>> device_id = 0
>>> device_str = "gpu:{}".format(device_id)
>>> device = matx.Device(device_str)
>>> # Create a list of ndarrays for batch images
>>> batch_size = 3
>>> nds = [matx.array.from_numpy(image, device_str) for _ in range(batch_size)]
>>> src1_ptrs = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
>>> dst1_ptrs = [(0, height * 0.13), (width * 0.9, 0),
...              (width * 0.2, height * 0.7), (width * 0.8, height - 1)]
>>> src2_ptrs = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
>>> dst2_ptrs = [(0, height * 0.03), (width * 0.93, 0),
...              (width * 0.23, height * 0.73), (width * 0.83, height - 1)]

>>> src3_ptrs = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
>>> dst3_ptrs = [(0, height * 0.33), (width * 0.73, 0),
...              (width * 0.23, height * 0.83), (width * 0.63, height - 1)]
>>> pts = [[src1_ptrs, dst1_ptrs], [src2_ptrs, dst2_ptrs], [src3_ptrs, dst3_ptrs]]

>>> op = WarpPerspectiveOp(device)
>>> ret = op(nds, pts)

__init__(device: Any, pad_type: str = 'BORDER_CONSTANT', pad_values: Tuple[int, int, int] = (0, 0, 0), interp: str = 'INTER_LINEAR') → None[source]¶

Initialize WarpPerspectiveOp

Parameters:

device (Any) – the matx device used for the operation
pad_type (str, optional) – border type to fill the target image, use constant value by default.
pad_values (Tuple[int, int, int], optional) – the border value to fill the target image if pad_type is BORDER_CONSTANT, (0, 0, 0) by default.
interp (str, optional) – desired interpolation method. INTER_NEAREST – a nearest-neighbor interpolation; INTER_LINEAR – a bilinear interpolation (used by default); INTER_CUBIC – a bicubic interpolation over 4x4 pixel neighborhood; INTER_LINEAR by default.
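Four src/dst point pairs determine a 3x3 perspective matrix. The sketch below solves the standard four-point formulation in plain Python to show what one [src, dst] entry of pts encodes. It is an illustration only: homography, project and solve_linear are hypothetical helpers, the 640 x 360 image size is an example, and how matx derives the transform internally is not stated here.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def homography(src, dst):
    """3x3 matrix mapping four src points (x, y) onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(h, point):
    """Apply the perspective matrix to one (x, y) point."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


width, height = 640, 360
src = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
dst = [(0, height * 0.13), (width * 0.9, 0),
       (width * 0.2, height * 0.7), (width * 0.8, height - 1)]
matrix = homography(src, dst)
print(project(matrix, src[1]))  # close to dst[1]
```

The solver does not guard against degenerate input, such as three collinear points, which would make the system singular.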

Subpackages¶

matx.vision.tv_transforms package