lab4d.dataloader package

lab4d.dataloader.data_utils

class lab4d.dataloader.data_utils.FrameInfo(ref_list)

Bases: object

Metadata about the frames in a dataset

Parameters:

ref_list (list(str)) – List of paths to all filtered RGB frames in this video

num_frames

Number of frames after filtering out static frames.

Type:

int

num_frames_raw

Total number of frames.

Type:

int

frame_map

Mapping from JPEGImages (filtered frames) to JPEGImagesRaw (all frames).

Type:

list(int)
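
A minimal usage sketch (the frame directory below is hypothetical; FrameInfo only needs the list of filtered RGB frame paths):

    import glob
    from lab4d.dataloader.data_utils import FrameInfo

    # Hypothetical location of the filtered RGB frames of one video
    ref_list = sorted(glob.glob("database/processed/JPEGImages/Full-Resolution/cat-0/*.jpg"))

    info = FrameInfo(ref_list)
    print(info.num_frames)      # frames kept after static-frame filtering
    print(info.num_frames_raw)  # total frames before filtering
    print(info.frame_map[:5])   # raw-frame indices of the first 5 kept frames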

lab4d.dataloader.data_utils.config_to_dataset(opts, is_eval=False, gpuid=[])

Construct a PyTorch dataset that includes all videos in a sequence.

Parameters:
  • opts (Dict) – Defined in Trainer::construct_dataset_opts()

  • is_eval (bool) – Unused

  • gpuid (List(int)) – Select a subset based on gpuid for npy generation

Returns:

dataset (torch.utils.data.Dataset) – Concatenation of datasets for each video in the sequence opts["seqname"]
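
A minimal sketch, assuming opts comes from Trainer::construct_dataset_opts() as documented above:

    from lab4d.dataloader.data_utils import config_to_dataset

    # opts is assumed to be the dict built by Trainer::construct_dataset_opts()
    dataset = config_to_dataset(opts)
    print(len(dataset))  # total samples across all videos in opts["seqname"]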

lab4d.dataloader.data_utils.duplicate_dataset(opts, datalist)

Duplicate a list of per-video datasets, so that the length matches the desired number of iterations per round during training.

Parameters:
  • opts (Dict) – Defined in Trainer::construct_dataset_opts()

  • datalist (List(VidDataset)) – A list of per-video datasets

Returns:

datalist_mul (List(VidDataset)) – Duplicated dataset list

lab4d.dataloader.data_utils.eval_loader(opts_dict)

Construct the evaluation dataloader.

Parameters:

opts_dict (Dict) – Defined in Trainer::construct_dataset_opts()

Returns:

dataloader (torch.utils.data.DataLoader) – Evaluation dataloader

lab4d.dataloader.data_utils.get_data_info(loader)

Extract dataset metadata from a dataloader

Parameters:

loader (torch.utils.data.DataLoader) – Evaluation dataloader

Returns:

data_info (Dict) – Dataset metadata
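
A minimal sketch chaining eval_loader() and get_data_info(); opts_dict is assumed to come from Trainer::construct_dataset_opts():

    from lab4d.dataloader.data_utils import eval_loader, get_data_info

    loader = eval_loader(opts_dict)    # evaluation dataloader for the sequence
    data_info = get_data_info(loader)  # dataset metadata dict
    print(sorted(data_info.keys()))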

lab4d.dataloader.data_utils.get_vid_length(inst_id, data_info)

Compute the length of a video

Parameters:
  • inst_id (int) – Video to check

  • data_info (Dict) – Dataset metadata

lab4d.dataloader.data_utils.load_config(config, dataname, current_dict=None)

Load a section from a .config metadata file

Parameters:
  • config (RawConfigParser) – Config parser object

  • dataname (str) – Name of section to load

  • current_dict (Dict) – If given, load into an existing dict. Otherwise return a new dict
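
A minimal sketch of load_config(); the .config path and section names below are illustrative only:

    from configparser import RawConfigParser
    from lab4d.dataloader.data_utils import load_config

    config = RawConfigParser()
    config.read("database/configs/cat-0.config")  # hypothetical metadata file

    meta = load_config(config, "data_0")                     # returns a new dict
    meta = load_config(config, "data_1", current_dict=meta)  # merges into the existing dict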

lab4d.dataloader.data_utils.load_small_files(data_path_dict)

For a sequence of videos, load small dataset files into memory

Parameters:

data_path_dict (Dict(str, List(str))) – Maps each annotation type to a list of .npy/.txt paths for that type

Returns:

data_info (Dict) – Dataset metadata

lab4d.dataloader.data_utils.merge_dict_list(loader)

For a sequence of videos, construct a dict of .npy/.txt paths that contain all the frame data and annotations from the whole sequence

Parameters:

loader (torch.utils.data.DataLoader) – Dataloader for a video sequence

Returns:

dict_list (Dict(str, List(str))) – Maps each frame/annotation type to a list of .npy/.txt paths for that type
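
A minimal sketch chaining merge_dict_list() and load_small_files(); loader is assumed to come from train_loader() or eval_loader():

    from lab4d.dataloader.data_utils import load_small_files, merge_dict_list

    dict_list = merge_dict_list(loader)      # annotation type -> list of .npy/.txt paths
    data_info = load_small_files(dict_list)  # small files loaded into memory as metadata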

lab4d.dataloader.data_utils.section_to_dataset(opts, config, vidid)

Construct a PyTorch dataset for a single video in a sequence using opts["dataset_constructor"]

Parameters:
  • opts (Dict) – Defined in Trainer::construct_dataset_opts()

  • config (RawConfigParser) – Config parser object

  • vidid (int) – Which video in the sequence

Returns:

dataset (torch.utils.data.Dataset) – Dataset for the video
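
A minimal sketch, assuming opts and config are set up as in the entries above; vidid selects the second video of the sequence:

    from lab4d.dataloader.data_utils import section_to_dataset

    dataset = section_to_dataset(opts, config, vidid=1)
    print(len(dataset))  # number of samples for this video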

lab4d.dataloader.data_utils.train_loader(opts_dict)

Construct the training dataloader.

Parameters:

opts_dict (Dict) – Defined in Trainer::construct_dataset_opts()

Returns:

dataloader (torch.utils.data.DataLoader) – Training dataloader
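
A minimal sketch of iterating the training dataloader; opts_dict is assumed to come from Trainer::construct_dataset_opts():

    from lab4d.dataloader.data_utils import train_loader

    dataloader = train_loader(opts_dict)
    for batch in dataloader:
        # each batch collates the per-frame-pair dicts produced by VidDataset.load_data()
        print({k: v.shape for k, v in batch.items() if hasattr(v, "shape")})
        break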

lab4d.dataloader.vidloader

class lab4d.dataloader.vidloader.RangeSampler(num_elems)

Bases: object

Sample efficiently without replacement from the range [0, num_elems).

Parameters:

num_elems (int) – Upper bound of sample range

init_queue()

Compute the next set of samples by permuting the sample range

sample(num_samples)

Return a set of samples from [0, num_elems) without replacement.

Parameters:

num_samples (int) – Number of samples to return

Returns:

rand_idx – (num_samples,) Output samples
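
A minimal usage sketch of RangeSampler:

    from lab4d.dataloader.vidloader import RangeSampler

    sampler = RangeSampler(num_elems=1000)
    idx_a = sampler.sample(64)  # 64 distinct indices from [0, 1000)
    idx_b = sampler.sample(64)  # another 64 samples, drawn without replacement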

class lab4d.dataloader.vidloader.VidDataset(opts, rgblist, dataid, ks, raw_size)

Bases: Dataset

Frame data and annotations for a single video in a sequence. Uses np.memmap internally to load larger-than-memory frame data from disk.

Parameters:
  • opts (Dict) – Defined in Trainer::construct_dataset_opts()

  • rgblist (List(str)) – List of paths to all RGB frames in this video

  • dataid (int) – Video ID

  • ks (List(int)) – Camera intrinsics: [fx, fy, cx, cy]

  • raw_size (List(int)) – Shape of the raw frames, [H, W]

construct_data_list(reflist, prefix, feature_type)

Construct a dict of .npy/.txt paths that contain all the frame data and annotations for a particular video

Parameters:
  • reflist (List(str)) – List of paths to all RGB frames in the video

  • prefix (str) – Type of data to load ("crop-256" or "full-256")

  • feature_type (str) – Type of image features to use ("cse" or "dino")

Returns:

dict_list (Dict(str, List(str))) – Maps each frame/annotation type to a list of .npy/.txt paths for that type

load_data(im0idx)

Sample pixels from a pair of frames

Parameters:

im0idx (int) – First frame id in the pair

Returns:

data_dict (Dict) – Maps keys to (2, …) data

load_data_list(dict_list)

Load all the frame data and annotations in this dataset

Parameters:

dict_list (Dict(str, List(str))) – From construct_data_list()

Returns:

mmap_list (Dict) – Maps each key to a numpy array of frame data or a list of annotations

read_depth(im0idx, rand_xy=None)

Read depth map for a single frame

Parameters:
  • im0idx (int) – Frame id to load

  • rand_xy (np.array or None) – (N,2) Pixels to load, if given

Returns:

depth (np.array) – (H,W,1) or (N,1) Depth map, float16

read_feature(im0idx, rand_xy=None)

Read feature map for a single frame

Parameters:
  • im0idx (int) – Frame id to load

  • rand_xy (np.array or None) – (N,2) Pixels to load, if given

Returns:

feat (np.array) – (112,112,16) or (N,16) Feature map, float32

read_flow(im0idx, delta, rand_xy=None)

Read flow map for a single frame

Parameters:
  • im0idx (int) – Frame id of flow source

  • delta (int) – Number of frames from flow source to flow target

  • rand_xy (np.array or None) – (N,2) Pixels to load, if given

Returns:

flow (np.array) – (H,W,3) or (N,3) Dense flow map, float32

read_mask(im0idx, rand_xy=None)

Read segmentation and object-centric bounding box for a single frame

Parameters:
  • im0idx (int) – Frame id to load

  • rand_xy (np.array or None) – (N,2) Pixels to load, if given

Returns:
  • mask (np.array) – (H,W,1) or (N,1) Segmentation mask, bool

  • vis2d (np.array) – (H,W,1) or (N,1) Mask of whether each pixel is part of the original frame, bool. For full frames, the entire mask is True

  • crop2raw (np.array) – (4,) Camera-intrinsics-style transformation from cropped (H,W) image to raw image, (fx, fy, cx, cy)

read_raw(im0idx, delta, rand_xy=None)

Read video data for a single frame within a pair

Parameters:
  • im0idx (int) – Frame id to load

  • delta (int) – Distance to the other frame id in the pair

  • rand_xy (np.array or None) – (N,2) Pixels to load, if given

Returns:

data_dict (Dict) – Dict with keys "rgb", "mask", "depth", "feature", "flow", "vis2d", "crop2raw", "dataid", "frameid_sub", "hxy"

read_rgb(im0idx, rand_xy=None)

Read RGB data for a single frame

Parameters:
  • im0idx (int) – Frame id to load

  • rand_xy (np.array or None) – (N, 2) Pixels to load, if given

Returns:

rgb (np.array) – (H,W,3) or (N, 3) Pixels, 0 to 1, float16

sample_delta(index)

Sample random delta frame

Parameters:

index (int) – First index in the pair

Returns:

delta (int) – Delta between first and second index

sample_xy()

Sample random pixels from an image

Returns:

xy – (N, 2) Sampled pixels
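
A minimal sketch of pulling one training pair from a dataset built by config_to_dataset(); indexing is assumed to return the pair dict described by load_data():

    from lab4d.dataloader.data_utils import config_to_dataset

    dataset = config_to_dataset(opts)  # opts from Trainer::construct_dataset_opts()
    data_dict = dataset[0]             # pixels sampled from a pair of frames
    for key, value in data_dict.items():
        # each entry stacks the two frames of the pair along the leading axis
        print(key, getattr(value, "shape", type(value)))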


© Copyright 2023, Gengshan Yang, Jeff Tan, Alex Lyons, Neehar Peri, Carnegie Mellon University.
