
lab4d.dataloader package

lab4d.dataloader.data_utils

class lab4d.dataloader.data_utils.FrameInfo(ref_list)

Bases: object

Metadata about the frames in a dataset

Parameters:

ref_list (list(str)) – List of paths to all filtered RGB frames in this video

num_frames

Number of frames after filtering out static frames.

Type:

int

num_frames_raw

Total number of frames.

Type:

int

frame_map

Mapping from JPEGImages (filtered frames) to JPEGImagesRaw (all frames).

Type:

list(int)
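
As a rough illustration of how frame_map can relate filtered frames to raw frames, the sketch below derives such a mapping from file names. It assumes each filtered frame keeps its raw frame index as the file name stem (for example 00003.jpg), which this page does not state. build_frame_map and the example paths are hypothetical and are not part of lab4d.

```python
import os


def build_frame_map(ref_list):
    """Hypothetical helper: map each filtered frame to a raw frame index.

    Assumes the file name stem is the raw index, e.g. "00003.jpg" -> 3.
    """
    return [int(os.path.splitext(os.path.basename(p))[0]) for p in ref_list]


# Made-up paths: frames 1 and 2 were dropped as static.
ref_list = [
    "JPEGImages/00000.jpg",
    "JPEGImages/00003.jpg",
    "JPEGImages/00004.jpg",
]
frame_map = build_frame_map(ref_list)
num_frames = len(frame_map)  # number of frames left after filtering
```

Under this assumption, frame_map[i] gives the position in JPEGImagesRaw of the i-th frame in JPEGImages.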

lab4d.dataloader.data_utils.config_to_dataset(opts, is_eval=False, gpuid=[])

Construct a PyTorch dataset that includes all videos in a sequence.

Parameters:
  • opts (Dict) – Defined in Trainer::construct_dataset_opts()

  • is_eval (bool) – Unused

  • gpuid (List(int)) – Select a subset based on gpuid for npy generation

Returns:

dataset (torch.utils.data.Dataset) – Concatenation of datasets for each video in the sequence opts["seqname"]

lab4d.dataloader.data_utils.duplicate_dataset(opts, datalist)

Duplicate a list of per-video datasets, so that the length matches the desired number of iterations per round during training.

Parameters:

datalist (List(VidDataset)) – A list of per-video datasets

Returns:

datalist_mul (List(VidDataset)) – Duplicated dataset list
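
The idea of repeating a dataset list until it is long enough can be sketched as follows. This is only an illustration: how the real function derives the repeat count from opts is not described here, so the target_len argument and the duplicate_to_length name are assumptions.

```python
import math


def duplicate_to_length(datalist, target_len):
    """Hypothetical helper: repeat the whole list until it has at least
    target_len entries."""
    if not datalist:
        raise ValueError("datalist must not be empty")
    repeats = max(1, math.ceil(target_len / len(datalist)))
    return datalist * repeats


# Strings stand in for per-video datasets.
datalist_mul = duplicate_to_length(["vid0", "vid1", "vid2"], target_len=7)
```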

lab4d.dataloader.data_utils.eval_loader(opts_dict)

Construct the evaluation dataloader.

Parameters:

opts_dict (Dict) – Defined in Trainer::construct_dataset_opts()

Returns:

dataloader (torch.utils.data.DataLoader) – Evaluation dataloader

lab4d.dataloader.data_utils.get_data_info(loader)

Extract dataset metadata from a dataloader

Parameters:

loader (torch.utils.data.DataLoader) – Evaluation dataloader

Returns:

data_info (Dict) – Dataset metadata

lab4d.dataloader.data_utils.get_vid_length(inst_id, data_info)

Compute the length of a video

Parameters:
  • inst_id (int) – Video to check

  • data_info (Dict) – Dataset metadata

lab4d.dataloader.data_utils.load_config(config, dataname, current_dict=None)

Load a section from a .config metadata file

Parameters:
  • config (RawConfigParser) – Config parser object

  • dataname (str) – Name of section to load

  • current_dict (Dict) – If given, load into an existing dict. Otherwise return a new dict
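
A minimal sketch of loading one section of a config file into a dict, using only the standard library configparser module. The load_section helper, the section name, and the keys are made up for illustration; the real load_config may treat existing keys or value types differently.

```python
import configparser


def load_section(config, dataname, current_dict=None):
    """Hypothetical helper: copy one config section into a dict.

    Writes into current_dict if given, otherwise into a new dict.
    """
    out = {} if current_dict is None else current_dict
    for key, value in config.items(dataname):
        out[key] = value  # values stay strings, as configparser returns them
    return out


config = configparser.RawConfigParser()
config.read_string("""
[data_0]
img_path = path/to/frames/
ks = 100 100 50 50
""")

section = load_section(config, "data_0")
merged = load_section(config, "data_0", current_dict={"extra": "1"})
```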

lab4d.dataloader.data_utils.load_small_files(data_path_dict)

For a sequence of videos, load small dataset files into memory

Parameters:

data_path_dict (Dict(str, List(str))) – Maps each annotation type to a list of .npy/.txt paths for that type

Returns:

data_info (Dict) – Dataset metadata

lab4d.dataloader.data_utils.merge_dict_list(loader)

For a sequence of videos, construct a dict of .npy/.txt paths that contain all the frame data and annotations from the whole sequence

Parameters:

loader (torch.utils.data.DataLoader) – Dataloader for a video sequence

Returns:

dict_list (Dict(str, List(str))) – Maps each frame/annotation type to a list of .npy/.txt paths for that type
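
Merging per-video path dicts into one dict for the whole sequence might look like the sketch below. It starts from a plain list of dicts rather than a dataloader, and the merge_path_dicts name, keys, and paths are hypothetical.

```python
from collections import defaultdict


def merge_path_dicts(per_video_dicts):
    """Hypothetical helper: concatenate the path lists of each key
    across videos, keeping video order."""
    merged = defaultdict(list)
    for dict_list in per_video_dicts:
        for key, paths in dict_list.items():
            merged[key].extend(paths)
    return dict(merged)


video0 = {"rgb": ["v0/rgb.npy"], "mask": ["v0/mask.npy"]}
video1 = {"rgb": ["v1/rgb.npy"], "mask": ["v1/mask.npy"]}
dict_list = merge_path_dicts([video0, video1])
```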

lab4d.dataloader.data_utils.section_to_dataset(opts, config, vidid)

Construct a PyTorch dataset for a single video in a sequence using opts["dataset_constructor"]

Parameters:
  • opts (Dict) – Defined in Trainer::construct_dataset_opts()

  • config (RawConfigParser) – Config parser object

  • vidid (int) – Which video in the sequence

Returns:

dataset (torch.utils.data.Dataset) – Dataset for the video

lab4d.dataloader.data_utils.train_loader(opts_dict)

Construct the training dataloader.

Parameters:

opts_dict (Dict) – Defined in Trainer::construct_dataset_opts()

Returns:

dataloader (torch.utils.data.DataLoader) – Training dataloader

lab4d.dataloader.vidloader

class lab4d.dataloader.vidloader.RangeSampler(num_elems)

Bases: object

Sample efficiently without replacement from the range [0, num_elems).

Parameters:

num_elems (int) – Upper bound of sample range

init_queue()

Compute the next set of samples by permuting the sample range

sample(num_samples)

Return a set of samples from [0, num_elems) without replacement.

Parameters:

num_samples (int) – Number of samples to return

Returns:

rand_idx – (num_samples,) Output samples
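
One common way to sample without replacement from a range is to shuffle it once and hand out consecutive slices, reshuffling when the queue runs low. The sketch below shows that pattern; it is not the lab4d implementation, and the PermutationSampler class, its seed argument, and its reshuffle policy are assumptions.

```python
import numpy as np


class PermutationSampler:
    """Sketch of a queue-based sampler over [0, num_elems)."""

    def __init__(self, num_elems, seed=None):
        self.num_elems = num_elems
        self.rng = np.random.default_rng(seed)
        self.init_queue()

    def init_queue(self):
        # A fresh random permutation of the whole range.
        self.queue = self.rng.permutation(self.num_elems)
        self.pos = 0

    def sample(self, num_samples):
        if num_samples > self.num_elems:
            raise ValueError("num_samples exceeds the sample range")
        # Reshuffle if the remaining queue is too short for this request.
        if self.pos + num_samples > self.num_elems:
            self.init_queue()
        rand_idx = self.queue[self.pos:self.pos + num_samples]
        self.pos += num_samples
        return rand_idx


sampler = PermutationSampler(10, seed=0)
first = sampler.sample(4)
second = sampler.sample(4)  # disjoint from `first`: same permutation
third = sampler.sample(4)   # triggers a reshuffle
```

Each call costs a slice rather than a fresh shuffle, which is where the efficiency comes from. Samples are unique within a call, and across calls until the next reshuffle.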

class lab4d.dataloader.vidloader.VidDataset(opts, rgblist, dataid, ks, raw_size)

Bases: Dataset

Frame data and annotations for a single video in a sequence. Uses NumPy memory mapping (np.memmap) internally to load larger-than-memory frame data from disk.

Parameters:
  • opts (Dict) – Defined in Trainer::construct_dataset_opts()

  • rgblist (List(str)) – List of paths to all RGB frames in this video

  • dataid (int) – Video ID

  • ks (List(int)) – Camera intrinsics: [fx, fy, cx, cy]

  • raw_size (List(int)) – Shape of the raw frames, [H, W]

construct_data_list(reflist, prefix, feature_type)

Construct a dict of .npy/.txt paths that contain all the frame data and annotations for a particular video

Parameters:
  • reflist (List(str)) – List of paths to all RGB frames in the video

  • prefix (str) – Type of data to load (“crop-256” or “full-256”)

  • feature_type (str) – Type of image features to use (“cse” or “dino”)

Returns:

dict_list (Dict(str, List(str))) – Maps each frame/annotation type to a list of .npy/.txt paths for that type

load_data(im0idx)

Sample pixels from a pair of frames

Parameters:

im0idx (int) – First frame id in the pair

Returns:

data_dict (Dict) – Maps keys to (2, …) data
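
The (2, …) layout of the returned values can be pictured as stacking the per-frame data of both frames along a new leading axis. The sketch below assumes every key holds an array with the same shape in both frames; stack_pair and the example shapes are hypothetical.

```python
import numpy as np


def stack_pair(first, second):
    """Hypothetical helper: stack two per-frame dicts into (2, ...) arrays."""
    return {key: np.stack([first[key], second[key]], axis=0) for key in first}


# Made-up per-frame samples: 5 pixels with 3 colour channels each.
frame0 = {"rgb": np.zeros((5, 3), dtype=np.float16)}
frame1 = {"rgb": np.ones((5, 3), dtype=np.float16)}
data_dict = stack_pair(frame0, frame1)
```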

load_data_list(dict_list)

Load all the frame data and annotations in this dataset

Parameters:

dict_list (Dict(str, List(str))) – From construct_data_list()

Returns:

mmap_list (Dict) – Maps each key to a numpy array of frame data or a list of annotations
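
Memory mapping lets NumPy read individual frames from a large .npy file without loading the whole array. The sketch below uses np.load with mmap_mode="r", which is one standard way to do this; whether lab4d calls it in exactly this form is an assumption, and the file name and array shape are made up.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "rgb.npy")
    # Stand-in for a preprocessed video: 8 frames of 4x4 RGB.
    np.save(path, np.zeros((8, 4, 4, 3), dtype=np.float16))

    # Open lazily: data is read from disk only when it is indexed.
    frames = np.load(path, mmap_mode="r")
    is_memmap = isinstance(frames, np.memmap)

    # Copy a single frame into memory.
    frame3 = np.array(frames[3])

    # Release the mapping before the temporary file is removed.
    del frames
```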

read_depth(im0idx, rand_xy=None)

Read depth map for a single frame

Parameters:
  • im0idx (int) – Frame id to load

  • rand_xy (np.array or None) – (N,2) Pixels to load, if given

Returns:

depth (np.array) – (H,W,1) or (N,1) Depth map, float16
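
The two output shapes, (H,W,1) and (N,1), correspond to returning the full map or only the requested pixels. A minimal sketch of that pixel lookup follows. It assumes rand_xy holds integer (x, y) coordinates, so the array is indexed as [y, x]; gather_pixels is a hypothetical helper.

```python
import numpy as np


def gather_pixels(image, rand_xy=None):
    """Hypothetical helper: return the full (H, W, C) map, or (N, C)
    values at the given (x, y) pixels."""
    if rand_xy is None:
        return image
    xy = np.asarray(rand_xy, dtype=np.int64)
    return image[xy[:, 1], xy[:, 0]]  # row index is y, column index is x


# Toy 3x4 depth map whose value encodes the flat pixel index.
depth = np.arange(12, dtype=np.float16).reshape(3, 4, 1)
rand_xy = np.array([[0, 0], [3, 2]])  # (x, y) pairs
samples = gather_pixels(depth, rand_xy)
full = gather_pixels(depth)
```

The same pattern would apply to the other read_* methods that accept rand_xy.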

read_feature(im0idx, rand_xy=None)

Read feature map for a single frame

Parameters:
  • im0idx (int) – Frame id to load

  • rand_xy (np.array or None) – (N,2) Pixels to load, if given

Returns:

feat (np.array) – (112,112,16) or (N,16) Feature map, float32

read_flow(im0idx, delta, rand_xy=None)

Read flow map for a single frame

Parameters:
  • im0idx (int) – Frame id of flow source

  • delta (int) – Number of frames from flow source to flow target

  • rand_xy (np.array or None) – (N,2) Pixels to load, if given

Returns:

flow (np.array) – (H,W,3) or (N,3) Dense flow map, float32

read_mask(im0idx, rand_xy=None)

Read segmentation and object-centric bounding box for a single frame

Parameters:
  • im0idx (int) – Frame id to load

  • rand_xy (np.array or None) – (N,2) Pixels to load, if given

Returns:
  • mask (np.array) – (H,W,1) or (N,1) Segmentation mask, bool

  • vis2d (np.array) – (H,W,1) or (N,1) Mask of whether each pixel is part of the original frame, bool. For full frames, the entire mask is True

  • crop2raw (np.array) – (4,) Camera-intrinsics-style transformation from cropped (H,W) image to raw image, (fx, fy, cx, cy)
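
One plausible reading of an intrinsics-style (fx, fy, cx, cy) transformation is a per-axis scale followed by an offset. The sketch below applies it to pixel coordinates under that assumption. The convention x_raw = fx * x_crop + cx is not confirmed by this page, and crop_to_raw and the example values are hypothetical.

```python
import numpy as np


def crop_to_raw(xy_crop, crop2raw):
    """Hypothetical helper: map (N, 2) cropped-image pixels to raw-image
    pixels, assuming x_raw = fx * x + cx and y_raw = fy * y + cy."""
    fx, fy, cx, cy = crop2raw
    xy_crop = np.asarray(xy_crop, dtype=np.float64)
    return xy_crop * np.array([fx, fy]) + np.array([cx, cy])


# Made-up transform: the crop is a 2x downscale of a window at (100, 50).
xy_raw = crop_to_raw([[0, 0], [255, 255]], crop2raw=[2.0, 2.0, 100.0, 50.0])
```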

read_raw(im0idx, delta, rand_xy=None)

Read video data for a single frame within a pair

Parameters:
  • im0idx (int) – Frame id to load

  • delta (int) – Distance to other frame id in the pair

  • rand_xy (np.array or None) – (N,2) Pixels to load, if given

Returns:

data_dict (Dict) – Dict with keys “rgb”, “mask”, “depth”, “feature”, “flow”, “vis2d”, “crop2raw”, “dataid”, “frameid_sub”, “hxy”

read_rgb(im0idx, rand_xy=None)

Read RGB data for a single frame

Parameters:
  • im0idx (int) – Frame id to load

  • rand_xy (np.array or None) – (N, 2) Pixels to load, if given

Returns:

rgb (np.array) – (H,W,3) or (N, 3) Pixels, 0 to 1, float16

sample_delta(index)

Sample random delta frame

Parameters:

index (int) – First index in the pair

Returns:

delta (int) – Delta between first and second index

sample_xy()

Sample random pixels from an image

Returns:

xy – (N, 2) Sampled pixels
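
Sampling N distinct pixels from an image can be sketched by drawing flat indices without replacement and converting them to coordinates. The image size, the number of pixels, the (x, y) column order, and the sample_pixel_coords name are all assumptions here; the real method takes no arguments and presumably reads these from the dataset.

```python
import numpy as np


def sample_pixel_coords(height, width, num_pixels, rng=None):
    """Hypothetical helper: return (N, 2) distinct (x, y) pixel coordinates."""
    rng = np.random.default_rng() if rng is None else rng
    flat = rng.choice(height * width, size=num_pixels, replace=False)
    ys, xs = np.divmod(flat, width)  # flat index = y * width + x
    return np.stack([xs, ys], axis=1)


xy = sample_pixel_coords(4, 6, 10, rng=np.random.default_rng(0))
```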


© Copyright 2023, Gengshan Yang, Jeff Tan, Alex Lyons, Neehar Peri, Deva Ramanan, Carnegie Mellon University.
