2. Reconstruct a cat from a single video
==========================================

Previously, we reconstructed a rigid body (a car). In this example, we show how to reconstruct a deformable object (a cat!).

.. raw:: html
Get pre-processed data
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First, download and extract the pre-processed data::

  bash scripts/download_unzip.sh "https://www.dropbox.com/s/mb7zgk73oomix4s/cat-pikachu-0.zip"

To use custom videos, see the `preprocessing tutorial `_.

Training
^^^^^^^^^^^

To optimize the dynamic neural fields::

  # Args: training script, gpu id, input args
  bash scripts/train.sh lab4d/train.py 0 --seqname cat-pikachu-0 --logname fg-skel --fg_motion skel-quad

The difference from the previous example is that we model the object motion with a skeleton-based deformation field, instead of treating it as a rigid body. You may choose `fg_motion` from one of the following motion fields:

- rigid: rigid motion field (i.e., root body motion only, no deformation)
- dense: dense motion field (similar to `D-NeRF `_)
- bob: bag-of-bones motion field (neural blend skinning in `BANMo `_)
- skel-human/quad: human or quadruped skeleton motion field (in `RAC `_)
- comp_skel-human/quad_dense: composed motion field (with skeleton-based deformation and soft deformation in `RAC `_)

.. note::

  The optimization uses 13G of GPU memory and takes around 21 minutes on a 3090 GPU. You may find the list of flags at `lab4d/config.py `_. To get higher quality, train for more iterations by adding `--num_rounds 120`. To run on a machine with less GPU memory, reduce `--imgs_per_gpu`.

Visualization during training
------------------------------------------

Please use tensorboard to monitor losses and intermediate renderings. Here we show the final bone locations (1st), and the camera transformations and geometry (2nd).

.. raw:: html
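The bag-of-bones option above warps points by neural blend skinning: each point is transformed by every bone and the results are blended with learned skinning weights. A minimal linear-blend-skinning sketch (illustrative only; not Lab4D's implementation, which predicts the weights with a network):

```python
import numpy as np

def blend_skinning(points, weights, rotations, translations):
    """Warp points with linear blend skinning.

    points: (N, 3); weights: (N, B) skinning weights summing to 1 per point;
    rotations: (B, 3, 3); translations: (B, 3).
    """
    # Transform every point by every bone: (B, N, 3)
    per_bone = np.einsum("bij,nj->bni", rotations, points) + translations[:, None, :]
    # Blend the per-bone results with the skinning weights: (N, 3)
    return np.einsum("nb,bni->ni", weights, per_bone)

# A single point rigidly attached to bone 0, which translates by (1, 0, 0)
points = np.array([[0.0, 0.0, 0.0]])
weights = np.array([[1.0, 0.0]])  # two bones
rotations = np.stack([np.eye(3), np.eye(3)])
translations = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
print(blend_skinning(points, weights, rotations, translations))  # [[1. 0. 0.]]
```

The skeleton-based fields (`skel-quad`, `skel-human`) additionally constrain the bone transforms to a kinematic tree, which regularizes the motion compared to free-floating bones.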
Rendering after training
----------------------------

After training, we can check the reconstruction quality by rendering the reference view and novel views. Pre-trained checkpoints are provided `here `_.

To render reference views of the input video, run::

  # reference view
  python lab4d/render.py --flagfile=logdir/$logname/opts.log --load_suffix latest --render_res 256

.. note::

  Some of the frames are skipped during preprocessing (according to static-frame filtering). Those filtered frames are not used for training and are not rendered here.

.. raw:: html
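Static-frame filtering drops frames where nothing moves, since they add no new constraints. A hypothetical sketch of such a filter based on mean optical-flow magnitude (the actual preprocessing criterion may differ):

```python
import numpy as np

def filter_static_frames(flows, threshold=0.5):
    """Keep indices of frames whose mean optical-flow magnitude exceeds threshold.

    flows: (T, H, W, 2) per-frame flow fields. Returns a list of kept frame indices.
    """
    magnitudes = np.linalg.norm(flows, axis=-1).mean(axis=(1, 2))  # (T,)
    return [t for t, m in enumerate(magnitudes) if m > threshold]

# Frame 0 is static (zero flow); frame 1 moves one pixel to the right.
flows = np.zeros((2, 4, 4, 2))
flows[1, ..., 0] = 1.0
print(filter_static_frames(flows))  # [1]
```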
To render novel views, run::

  # turntable views, --viewpoint rot-elevation-angles --freeze_id frame-id-to-freeze
  python lab4d/render.py --flagfile=logdir/$logname/opts.log --load_suffix latest --viewpoint rot-0-360 --render_res 256 --freeze_id 50

.. note::

  The `freeze_id` is set to 50 to freeze the time at the 50-th frame while rotating the camera around the object.

.. raw:: html
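A `rot-0-360` turntable sweeps the camera around the frozen object. A sketch of how such an orbit could be parameterized as rotations about the up axis (hypothetical; not the internals of `render.py`):

```python
import numpy as np

def turntable_rotations(start_deg=0.0, end_deg=360.0, num_views=8):
    """Yield camera rotations orbiting the object about the y (up) axis."""
    angles = np.linspace(np.deg2rad(start_deg), np.deg2rad(end_deg),
                         num_views, endpoint=False)
    for angle in angles:
        c, s = np.cos(angle), np.sin(angle)
        yield np.array([[c, 0.0, s],
                        [0.0, 1.0, 0.0],
                        [-s, 0.0, c]])

views = list(turntable_rotations(num_views=4))
print(len(views))  # 4 views at 0, 90, 180, 270 degrees
print(np.round(views[1], 3))  # the 90-degree rotation
```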
To render a video of the proxy geometry and cameras over training iterations, run::

  python scripts/render_intermediate.py --testdir logdir/$logname/

.. raw:: html
Exporting meshes and motion parameters after training
--------------------------------------------------------

To export meshes and motion parameters, run::

  python lab4d/export.py --flagfile=logdir/$logname/opts.log --load_suffix latest

.. raw:: html
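The exported meshes can be inspected in any 3D viewer, or loaded programmatically. A minimal dependency-free sketch of reading vertices and faces from a Wavefront `.obj` file (the exact filenames and formats written by `export.py` depend on your run, so treat this as illustrative):

```python
def load_obj(path):
    """Parse vertex and triangle lines from a Wavefront .obj file."""
    vertices, faces = [], []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "v":
                vertices.append([float(x) for x in parts[1:4]])
            elif parts[0] == "f":
                # Face entries may be "v", "v/vt", or "v/vt/vn"; keep the vertex index.
                faces.append([int(p.split("/")[0]) - 1 for p in parts[1:4]])
    return vertices, faces

# Example: a single triangle
with open("triangle.obj", "w") as f:
    f.write("v 0 0 0\nv 1 0 0\nv 0 1 0\nf 1 2 3\n")
vertices, faces = load_obj("triangle.obj")
print(len(vertices), faces)  # 3 [[0, 1, 2]]
```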
Reconstruct the total scene
------------------------------------------------------------

Now that we have reconstructed the cat, can we put it back in the scene? To do so, we train compositional neural fields with a foreground and a background component. Run the following to load the pre-trained foreground field and train the composed fields::

  # Args: training script, gpu id, input args
  bash scripts/train.sh lab4d/train.py 0 --seqname cat-pikachu-0 --logname comp-comp-s2 --field_type comp --fg_motion comp_skel-quad_dense --data_prefix full --num_rounds 120 --load_path logdir/cat-pikachu-0-fg-skel/ckpt_latest.pth

.. note::

  The `field_type` is changed to `comp` to compose the background field with the foreground field during differentiable rendering. The `fg_motion` is changed to `comp_skel-quad_dense` to use the composed warping field (with skeleton-based deformation and soft deformation) for the foreground object. To reconstruct the background, the `data_prefix` is changed to `full` to load the full frames instead of frames cropped around the object.

.. note::

  We load the pre-trained foreground model `logdir/cat-pikachu-0-fg-skel/ckpt_latest.pth` to initialize the optimization. The optimization of 120 rounds (24k iterations) takes around 3.5 hours on a 3090 GPU.

To render videos from the bird's eye view::

  # bird's eye view, elevation angle=20 degrees
  python lab4d/render.py --flagfile=logdir/cat-pikachu-0-comp-comp-s2/opts.log --load_suffix latest --render_res 256 --viewpoint bev-20

.. raw:: html
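Composing a foreground and a background field during differentiable rendering can be understood as volume rendering the union of their samples, sorted jointly by depth so the two components occlude each other correctly. A toy one-ray sketch of this compositing (illustrative only; Lab4D's renderer is more involved):

```python
import numpy as np

def composite_ray(densities, colors, deltas):
    """Volume-render one ray from depth-sorted samples of all fields.

    densities: (S,); colors: (S, 3); deltas: (S,) sample spacings.
    """
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: probability the ray reaches each sample unoccluded.
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = transmittance * alphas
    return (weights[:, None] * colors).sum(axis=0)

# An opaque red foreground sample in front of an opaque blue background sample:
# the foreground occludes the background, so the ray renders red.
densities = np.array([50.0, 50.0])
colors = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
deltas = np.array([1.0, 1.0])
print(np.round(composite_ray(densities, colors, deltas), 3))
```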
Visit other `tutorials `_.