sketchkit.vectorization.LineDrawer.CNNVE.network package

Submodules

sketchkit.vectorization.LineDrawer.CNNVE.network.model module

sketchkit.vectorization.LineDrawer.CNNVE.network.model.pad_back(t_big_img, t_mask, rect)[source]

Pastes a small mask tensor back into a full-sized tensor at a given rectangle.

The mask is placed according to the rect coordinates. If the rect is partially outside the t_big_img boundaries, it’s clipped. The final mask is also filtered so that it’s zero wherever t_big_img is zero.

Parameters:
  • t_big_img (torch.Tensor) – The original full-sized image tensor.

  • t_mask (torch.Tensor) – The small mask tensor to paste.

  • rect (tuple) – The (x, y, w, h) coordinates where the t_mask was originally extracted from.

Returns:

A full-sized mask tensor with t_mask pasted into it.

Return type:

torch.Tensor
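The paste-with-clipping-and-filtering behaviour described above can be sketched in NumPy (names here are illustrative; the real function operates on torch.Tensors and uses rect_boudary for the clipping):

```python
import numpy as np

def pad_back_sketch(big_img, mask, rect):
    """Paste `mask` into a zero array shaped like `big_img` at `rect`,
    clipping the rectangle to the image bounds and zeroing the result
    wherever `big_img` is zero."""
    H, W = big_img.shape
    x, y, w, h = rect
    # Clip the rectangle to the image boundaries.
    nx, ny = max(x, 0), max(y, 0)
    nw = min(x + w, W) - nx
    nh = min(y + h, H) - ny
    out = np.zeros_like(big_img)
    # Offset into the mask if the rect started outside the image.
    out[ny:ny + nh, nx:nx + nw] = mask[ny - y:ny - y + nh, nx - x:nx - x + nw]
    # Filter: the pasted mask is zero wherever the original image is zero.
    out[big_img == 0] = 0
    return out
```
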

sketchkit.vectorization.LineDrawer.CNNVE.network.model.pad_back_lst(t_big_img, t_masks, rect_lst)[source]

Pads back a batch of masks to their original full-size locations.

Applies the pad_back function to each mask and rectangle in the provided lists and concatenates the results into a single batch tensor.

Parameters:
  • t_big_img (torch.Tensor) – The original full-sized image tensor.

  • t_masks (list[torch.Tensor]) – A list of small mask tensors.

  • rect_lst (list[tuple]) – A list of (x, y, w, h) rectangles corresponding to each mask.

Returns:

A batch of full-sized masks (concatenated on dim=0).

Return type:

torch.Tensor

sketchkit.vectorization.LineDrawer.CNNVE.network.model.rect_boudary(t_big_img, rect)[source]

Clips a rectangle to the boundaries of a given image tensor.

Parameters:
  • t_big_img (torch.Tensor) – The reference image tensor, used to determine boundaries. Shape is expected to be (B, C, H, W).

  • rect (tuple) – A tuple (x, y, w, h) representing the rectangle.

Returns:

A tuple (nx, ny, nw, nh) representing the clipped rectangle, ensuring it does not go out of the image bounds.

Return type:

tuple
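The clipping logic can be sketched in plain Python (an illustrative stand-in for the torch implementation; the function name in the source, rect_boudary, is kept as-is):

```python
def clip_rect(rect, height, width):
    """Clip an (x, y, w, h) rectangle so it lies inside a
    `height` x `width` image; returns (nx, ny, nw, nh)."""
    x, y, w, h = rect
    nx, ny = max(x, 0), max(y, 0)          # move the origin inside the image
    nw = max(min(x + w, width) - nx, 0)    # shrink width to fit
    nh = max(min(y + h, height) - ny, 0)   # shrink height to fit
    return nx, ny, nw, nh
```

For example, a rectangle hanging off the top-left of an 8×8 image, `clip_rect((-2, 3, 10, 10), 8, 8)`, comes back as `(0, 3, 8, 5)`.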

class sketchkit.vectorization.LineDrawer.CNNVE.network.model.stNet(DEVICE='cpu')[source]

Bases: Module

A single-stage convolutional neural network.

This network is an encoder-decoder: strided convolutions downsample the input, and PixelShuffle layers upsample it back to the original resolution.

DEVICE

The device to run the model on (‘cpu’ or ‘cuda’).

Type:

str

preprocess

Image preprocessing transforms.

Type:

transforms.Compose

conv

The main convolutional network structure.

Type:

nn.Sequential

center_crop(input_c2, rect)[source]

Crops a tensor to a specified rectangle.

Wrapper for torchvision.transforms.functional.crop.

Parameters:
  • input_c2 (torch.Tensor) – The input tensor to crop.

  • rect (tuple) – A tuple (i, j, w, h) defining the crop box. Note: i is x (width-wise) and j is y (height-wise).

Returns:

The cropped tensor.

Return type:

torch.Tensor
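The (i, j, w, h) convention is easy to get wrong: i indexes columns (x) and j indexes rows (y). A NumPy equivalent of the crop (illustrative only; the real method wraps torchvision.transforms.functional.crop):

```python
import numpy as np

def center_crop_sketch(img, rect):
    """Crop `img` (H, W) to rect = (i, j, w, h), where i is the
    x offset (columns) and j is the y offset (rows)."""
    i, j, w, h = rect
    return img[j:j + h, i:i + w]  # rows first, then columns
```
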

forward(x)[source]

Defines the standard forward pass for the network.

Parameters:

x (torch.Tensor) – The input tensor.

Returns:

The network’s output tensor.

Return type:

torch.Tensor

forward_no_grad(x)[source]

Defines a forward pass without gradient computation.

Note: This method is redundant as the class’s forward method is also decorated with @torch.no_grad().

Parameters:

x (torch.Tensor) – The input tensor.

Returns:

The network’s output tensor.

Return type:

torch.Tensor

load(model_name)[source]

Loads pre-trained weights into the model from a URL.

Constructs the URL from WEIGHTS_URL_BASE and model_name.

Parameters:

model_name (str) – The base name of the model file (e.g., “model”). It expects to find a file named ‘{model_name}.pth’ at the base URL.
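The URL construction described above can be sketched as follows. The base URL here is hypothetical; the real WEIGHTS_URL_BASE is defined in the module:

```python
# Hypothetical base URL -- the real WEIGHTS_URL_BASE is defined in the module.
WEIGHTS_URL_BASE = "https://example.com/weights/"

def weights_url(model_name):
    """Build the checkpoint URL the way load() is described to:
    '{model_name}.pth' appended to the base URL."""
    return f"{WEIGHTS_URL_BASE}{model_name}.pth"
```

A typical loading call would then be `torch.hub.load_state_dict_from_url(weights_url("model"), map_location="cpu")`, though the exact call used in the source may differ.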

class sketchkit.vectorization.LineDrawer.CNNVE.network.model.stNet_double(DEVICE='cuda')[source]

Bases: Module

A two-stage refinement network using stNet as the base model.

This model contains two stNet instances: one for a “rough” pass and one for a “refine” pass. The refinement is iterative.

DEVICE

The device to run the models on.

Type:

str

net_rough

The network for the initial “rough” pass.

Type:

stNet

net_refine

The network for the “refine” passes.

Type:

stNet

preprocess

Image preprocessing transforms.

Type:

transforms.Compose

center_crop(input_c2, rect)[source]

Crops a tensor to a specified rectangle.

Wrapper for torchvision.transforms.functional.crop.

Parameters:
  • input_c2 (torch.Tensor) – The input tensor to crop.

  • rect (tuple) – A tuple (i, j, w, h) defining the crop box. Note: i is x (width-wise) and j is y (height-wise).

Returns:

The cropped tensor.

Return type:

torch.Tensor

forward(x)[source]

Defines the standard forward pass for the two-stage network.

The process is:

1. y1 = net_rough(x)
2. y2 = net_refine([img, y1])
3. y3 = net_refine([img, y2])

Parameters:

x (torch.Tensor) – Input tensor (B, 2, H, W), expected to be concatenated [image, mask].

Returns:

  • The output of the rough network (y1).

  • The output of the second refinement pass (y3).

Return type:

tuple[torch.Tensor, torch.Tensor]
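The rough-then-refine-twice flow can be sketched with plain callables standing in for the two sub-networks (a NumPy illustration of the data flow described above, not the torch implementation):

```python
import numpy as np

def forward_sketch(x, net_rough, net_refine):
    """x is (B, 2, H, W): channel 0 is the image, channel 1 the mask.
    Returns (rough output y1, second refinement output y3)."""
    img = x[:, :1]                                  # (B, 1, H, W) image channel
    y1 = net_rough(x)                               # rough pass on [image, mask]
    y2 = net_refine(np.concatenate([img, y1], 1))   # first refinement
    y3 = net_refine(np.concatenate([img, y2], 1))   # second refinement
    return y1, y3
```
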

get_result_batch(original_img_c1, point_imgs, rect_lst)[source]

Processes a batch of cropped image patches using the full (rough+refine) model.

This function takes a single large original image, a list of smaller mask images (point_imgs), and their corresponding locations (rect_lst). It crops the inputs, runs them through the full forward (rough+refine) model in batches, and pastes the results back to full size.

Parameters:
  • original_img_c1 (np.ndarray) – The full-size original image (H, W, 1).

  • point_imgs (list[np.ndarray]) – A list of numpy arrays, each representing a mask (H, W, 1).

  • rect_lst (list[tuple]) – A list of (x, y, w, h) rectangles corresponding to each mask.

Returns:

  • im_res0 (np.ndarray): Batch of rough pass results (N, H, W, 1).

  • im_res1 (np.ndarray): Batch of final refinement results (N, H, W, 1).

  • im_mask (np.ndarray): Batch of original input masks (N, H, W, 1).

Return type:

tuple[np.ndarray, np.ndarray, np.ndarray]
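End to end, the batch helper is described as crop → run model → paste back. A minimal NumPy sketch of that loop, assuming in-bounds rectangles (the real code clips via rect_boudary and filters via pad_back, and the model here is a stand-in):

```python
import numpy as np

def get_result_batch_sketch(original, masks, rects, model):
    """original: (H, W) image; masks: list of (H, W) arrays;
    rects: list of (x, y, w, h). Runs `model` on each cropped
    [image, mask] pair and pastes the result back to full size."""
    outs = []
    for mask, (x, y, w, h) in zip(masks, rects):
        img_patch = original[y:y + h, x:x + w]      # crop the image
        msk_patch = mask[y:y + h, x:x + w]          # crop the mask
        res = model(img_patch, msk_patch)           # run the network stand-in
        full = np.zeros_like(original)              # paste back to full size
        full[y:y + h, x:x + w] = res
        outs.append(full)
    return np.stack(outs)                           # (N, H, W)
```
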

get_result_batch_s2(original_img_c1, point_imgs, rect_lst)[source]

Processes a batch of cropped image patches using the stage-2 (refine) model.

This function takes a single large original image, a list of smaller mask images (point_imgs), and their corresponding locations (rect_lst). It crops the inputs, runs them through the get_result_s2 (refine-only) model in batches, and pastes the results back to full size.

Parameters:
  • original_img_c1 (np.ndarray) – The full-size original image (H, W, 1).

  • point_imgs (list[np.ndarray]) – A list of numpy arrays, each representing a mask (H, W, 1).

  • rect_lst (list[tuple]) – A list of (x, y, w, h) rectangles corresponding to each mask.

Returns:

  • im_res0 (np.ndarray): Batch of first-pass refinement results (N, H, W, 1).

  • im_res1 (np.ndarray): Batch of second-pass refinement results (N, H, W, 1).

  • im_mask (np.ndarray): Batch of original input masks (N, H, W, 1).

Return type:

tuple[np.ndarray, np.ndarray, np.ndarray]

get_result_s2(x)[source]

Performs a two-step refinement pass using only the refine network.

The process is:

1. y1 = net_refine(x)
2. y2 = net_refine([img, y1])

Parameters:

x (torch.Tensor) – Input tensor (B, 2, H, W), expected to be concatenated [image, mask].

Returns:

  • The output of the first refinement pass (y1).

  • The output of the second refinement pass (y2).

Return type:

tuple[torch.Tensor, torch.Tensor]

load(model_name)[source]

Loads pre-trained weights for both rough and refine networks from URLs.

Constructs URLs for ‘{model_name}_rough.pth’ and ‘{model_name}_ref.pth’.

Parameters:

model_name (str) – The base name for the model files.

Module contents

class sketchkit.vectorization.LineDrawer.CNNVE.network.stNet(DEVICE='cpu')[source]

Bases: Module

A single-stage convolutional neural network.

This network is an encoder-decoder: strided convolutions downsample the input, and PixelShuffle layers upsample it back to the original resolution.

DEVICE

The device to run the model on (‘cpu’ or ‘cuda’).

Type:

str

preprocess

Image preprocessing transforms.

Type:

transforms.Compose

conv

The main convolutional network structure.

Type:

nn.Sequential

center_crop(input_c2, rect)[source]

Crops a tensor to a specified rectangle.

Wrapper for torchvision.transforms.functional.crop.

Parameters:
  • input_c2 (torch.Tensor) – The input tensor to crop.

  • rect (tuple) – A tuple (i, j, w, h) defining the crop box. Note: i is x (width-wise) and j is y (height-wise).

Returns:

The cropped tensor.

Return type:

torch.Tensor

forward(x)[source]

Defines the standard forward pass for the network.

Parameters:

x (torch.Tensor) – The input tensor.

Returns:

The network’s output tensor.

Return type:

torch.Tensor

forward_no_grad(x)[source]

Defines a forward pass without gradient computation.

Note: This method is redundant as the class’s forward method is also decorated with @torch.no_grad().

Parameters:

x (torch.Tensor) – The input tensor.

Returns:

The network’s output tensor.

Return type:

torch.Tensor

load(model_name)[source]

Loads pre-trained weights into the model from a URL.

Constructs the URL from WEIGHTS_URL_BASE and model_name.

Parameters:

model_name (str) – The base name of the model file (e.g., “model”). It expects to find a file named ‘{model_name}.pth’ at the base URL.

class sketchkit.vectorization.LineDrawer.CNNVE.network.stNet_double(DEVICE='cuda')[source]

Bases: Module

A two-stage refinement network using stNet as the base model.

This model contains two stNet instances: one for a “rough” pass and one for a “refine” pass. The refinement is iterative.

DEVICE

The device to run the models on.

Type:

str

net_rough

The network for the initial “rough” pass.

Type:

stNet

net_refine

The network for the “refine” passes.

Type:

stNet

preprocess

Image preprocessing transforms.

Type:

transforms.Compose

center_crop(input_c2, rect)[source]

Crops a tensor to a specified rectangle.

Wrapper for torchvision.transforms.functional.crop.

Parameters:
  • input_c2 (torch.Tensor) – The input tensor to crop.

  • rect (tuple) – A tuple (i, j, w, h) defining the crop box. Note: i is x (width-wise) and j is y (height-wise).

Returns:

The cropped tensor.

Return type:

torch.Tensor

forward(x)[source]

Defines the standard forward pass for the two-stage network.

The process is:

1. y1 = net_rough(x)
2. y2 = net_refine([img, y1])
3. y3 = net_refine([img, y2])

Parameters:

x (torch.Tensor) – Input tensor (B, 2, H, W), expected to be concatenated [image, mask].

Returns:

  • The output of the rough network (y1).

  • The output of the second refinement pass (y3).

Return type:

tuple[torch.Tensor, torch.Tensor]

get_result_batch(original_img_c1, point_imgs, rect_lst)[source]

Processes a batch of cropped image patches using the full (rough+refine) model.

This function takes a single large original image, a list of smaller mask images (point_imgs), and their corresponding locations (rect_lst). It crops the inputs, runs them through the full forward (rough+refine) model in batches, and pastes the results back to full size.

Parameters:
  • original_img_c1 (np.ndarray) – The full-size original image (H, W, 1).

  • point_imgs (list[np.ndarray]) – A list of numpy arrays, each representing a mask (H, W, 1).

  • rect_lst (list[tuple]) – A list of (x, y, w, h) rectangles corresponding to each mask.

Returns:

  • im_res0 (np.ndarray): Batch of rough pass results (N, H, W, 1).

  • im_res1 (np.ndarray): Batch of final refinement results (N, H, W, 1).

  • im_mask (np.ndarray): Batch of original input masks (N, H, W, 1).

Return type:

tuple[np.ndarray, np.ndarray, np.ndarray]

get_result_batch_s2(original_img_c1, point_imgs, rect_lst)[source]

Processes a batch of cropped image patches using the stage-2 (refine) model.

This function takes a single large original image, a list of smaller mask images (point_imgs), and their corresponding locations (rect_lst). It crops the inputs, runs them through the get_result_s2 (refine-only) model in batches, and pastes the results back to full size.

Parameters:
  • original_img_c1 (np.ndarray) – The full-size original image (H, W, 1).

  • point_imgs (list[np.ndarray]) – A list of numpy arrays, each representing a mask (H, W, 1).

  • rect_lst (list[tuple]) – A list of (x, y, w, h) rectangles corresponding to each mask.

Returns:

  • im_res0 (np.ndarray): Batch of first-pass refinement results (N, H, W, 1).

  • im_res1 (np.ndarray): Batch of second-pass refinement results (N, H, W, 1).

  • im_mask (np.ndarray): Batch of original input masks (N, H, W, 1).

Return type:

tuple[np.ndarray, np.ndarray, np.ndarray]

get_result_s2(x)[source]

Performs a two-step refinement pass using only the refine network.

The process is:

1. y1 = net_refine(x)
2. y2 = net_refine([img, y1])

Parameters:

x (torch.Tensor) – Input tensor (B, 2, H, W), expected to be concatenated [image, mask].

Returns:

  • The output of the first refinement pass (y1).

  • The output of the second refinement pass (y2).

Return type:

tuple[torch.Tensor, torch.Tensor]

load(model_name)[source]

Loads pre-trained weights for both rough and refine networks from URLs.

Constructs URLs for ‘{model_name}_rough.pth’ and ‘{model_name}_ref.pth’.

Parameters:

model_name (str) – The base name for the model files.