
3. Reconstruct a cat from multiple videos

In the previous tutorial, we reconstructed a cat from a single video. In this example, we improve the reconstruction by feeding more videos of the same cat to the pipeline, similar to the setup of BANMo.

Get pre-processed data

First, download the pre-processed data:

bash scripts/download_unzip.sh "https://www.dropbox.com/s/3w0vhh05olzwwn4/cat-pikachu.zip"

To use custom videos, see the preprocessing tutorial.

Training

To train the dynamic neural fields:

# Args: training script, gpu id, input args
bash scripts/train.sh lab4d/train.py 0,1 --seqname cat-pikachu --logname fg-bob --fg_motion bob --reg_gauss_skin_wt 0.01

Note

In this setup, we follow BANMo and use neural blend skinning with 25 bones (--fg_motion bob). We also use a larger weight for the Gaussian bone regularization (--reg_gauss_skin_wt 0.01) to encourage the bones to stay inside the object.

Note

Since there are more video frames than in the previous example, we want more ray samples in each (mini)batch. This can be achieved by specifying a larger per-GPU batch size (e.g., --imgs_per_gpu 224) or using more GPUs.

The number of rays per (mini)batch is computed as num_gpus x imgs_per_gpu x pixels_per_image.
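
For example, a minimal sketch of the same training run with a larger per-GPU batch size (only --imgs_per_gpu is changed; the pixel count in the comment is illustrative, since the default pixels_per_image depends on the config):

# Same training command as above, with a larger per-GPU batch size.
# With 2 GPUs, 224 images per GPU, and, say, 16 pixels sampled per image,
# each minibatch contains 2 x 224 x 16 = 7168 rays.
bash scripts/train.sh lab4d/train.py 0,1 --seqname cat-pikachu --logname fg-bob --fg_motion bob --reg_gauss_skin_wt 0.01 --imgs_per_gpu 224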

Note

Training takes around 20 minutes on two RTX 3090 GPUs. The full list of flags can be found in lab4d/config.py. The rendering results on this page assume 120 rounds of training.

Visualization during training

Please use tensorboard to monitor losses and intermediate renderings.
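
For example, assuming TensorBoard is installed and the training logs are written under logdir/:

# monitor losses and intermediate renderings in the browser
tensorboard --logdir logdir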

Here we show the final bone locations (first visualization), and the camera transformations and geometry (second visualization).

The camera transformations are sub-sampled to 200 frames to speed up the visualization.

Rendering after training

After training, we can check the reconstruction quality by rendering the reference view and novel views. Pre-trained checkpoints are provided here.

To render the reference view of a video (e.g., video 8), run:

# reference view
python lab4d/render.py --flagfile=logdir/$logname/opts.log --load_suffix latest --inst_id 8 --render_res 256

Note

Some frames with small motion are not rendered (this is determined during preprocessing).

To render novel views, run:

# turntable views, --viewpoint rot-elevation-angles
python lab4d/render.py --flagfile=logdir/$logname/opts.log --load_suffix latest --inst_id 8 --viewpoint rot-0-360 --render_res 256
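
Per the rot-elevation-angles format above, the elevation can also be varied; here is a sketch with an illustrative 30-degree elevation:

# turntable at 30-degree elevation (the elevation value is illustrative)
python lab4d/render.py --flagfile=logdir/$logname/opts.log --load_suffix latest --inst_id 8 --viewpoint rot-30-360 --render_res 256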

Exporting meshes and motion parameters after training

To export meshes and motion parameters, run:

python lab4d/export.py --flagfile=logdir/$logname/opts.log --load_suffix latest

Note

The default setting may produce broken meshes. To get a better mesh, as shown above, train for more iterations by adding --num_rounds 120. Also see this for an explanation.
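
Concretely, the longer run might look like the following (the same training command as before, with --num_rounds 120 appended):

# train for 120 rounds to get cleaner exported meshes
bash scripts/train.sh lab4d/train.py 0,1 --seqname cat-pikachu --logname fg-bob --fg_motion bob --reg_gauss_skin_wt 0.01 --num_rounds 120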

Visit other tutorials.

