Mip-NeRF 360 Dataset
Static NeRF methods such as Zip-NeRF (b) rely on poses from a preprocessing step such as COLMAP to produce good results. Even these poses may be imperfect, leading to blurring and artifacts. The treehill scene in the Mip-NeRF 360 dataset is one example of a scene with noisy camera estimates. Adding a state-of-the-art camera optimization method such as SCNeRF (c) ameliorates this slightly, but artifacts in distant parts of the scene remain. Our method enhances SCNeRF by preconditioning its parameterization, thereby achieving significantly sharper results.
Side-by-Side Comparisons on Perturbed Mip-NeRF 360 Dataset
Next, we show a side-by-side comparison of SCNeRF with and without preconditioning on the perturbed Mip-NeRF 360 dataset. Preconditioning allows the optimization to converge to better estimates of the camera poses, resulting in sharper details and fewer artifacts.
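To make the idea of preconditioning a camera parameterization concrete, here is a minimal NumPy sketch on a toy problem. It is an illustration under stated assumptions, not the implementation used for these results: the four-parameter pinhole camera (focal length plus translation, no rotation), the finite-difference Jacobian, the damping value, and all function names are hypothetical choices made for this example. The sketch whitens the parameters with the inverse matrix square root of J^T J, so that every direction in the reparameterized space has a comparable effect on the projected pixels.

```python
import numpy as np

def project(params, points):
    """Toy pinhole camera (assumed for illustration): params = [focal, tx, ty, tz]."""
    focal, t = params[0], params[1:4]
    cam = points + t  # translate world points into the camera frame (no rotation here)
    return (focal * cam[:, :2] / cam[:, 2:3]).ravel()  # flattened (2N,) pixel coordinates

def jacobian(fn, params, eps=1e-5):
    """Central finite-difference Jacobian of fn w.r.t. params, shape (outputs, len(params))."""
    cols = []
    for i in range(len(params)):
        step = np.zeros_like(params)
        step[i] = eps
        cols.append((fn(params + step) - fn(params - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def inverse_sqrt_preconditioner(J, damping=1e-8):
    """Return Sigma^{-1/2} for Sigma = J^T J + damping * I, via an eigendecomposition."""
    sigma = J.T @ J + damping * np.eye(J.shape[1])
    eigvals, eigvecs = np.linalg.eigh(sigma)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(64, 3))  # sample points in front of the camera
params = np.array([500.0, 0.0, 0.0, 4.0])      # focal in pixels, camera 4 units away

J = jacobian(lambda p: project(p, points), params)
P_inv = inverse_sqrt_preconditioner(J)

# Optimizing whitened parameters means the effective Jacobian becomes J @ P_inv.
raw_gram = J.T @ J
white_gram = (J @ P_inv).T @ (J @ P_inv)
cond_raw = np.linalg.cond(raw_gram)
cond_white = np.linalg.cond(white_gram)
print(f"condition number before: {cond_raw:.3g}, after: {cond_white:.3g}")
```

In this toy setup the focal length and the translation act on the pixels at very different scales, so the raw Gram matrix is badly conditioned, while the whitened one is close to the identity. A gradient-based optimizer would then take steps of similar effective size in every parameter direction.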
NeRF-Synthetic Dataset
CamP can converge quickly even when the intrinsics of a scene are unknown. We evaluate on a more challenging version of the perturbed NeRF-Synthetic benchmark proposed in BARF, in which we also perturb the focal length and perspective of the cameras.
We compare our method to an improved version of BARF that has been adapted to the Instant NGP setting (BARF-NGP) for faster convergence. BARF is unable to converge to the correct camera positions, leading to visual artifacts and floaters in the reconstruction.
Convergence
(a) BARF-NGP
(b) Ours
When the focal lengths of the cameras are unknown, the BARF formulation fails to find the correct camera poses due to perspective ambiguities. Our preconditioned camera optimization is able to quickly converge to the correct solution.
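One common source of perspective ambiguity is the trade-off between focal length and camera distance: scaling both together changes the image of a compact object only slightly. The following NumPy sketch illustrates this general effect on a toy setup. The point cloud, focal lengths, distances, and the `project` helper are all assumptions chosen for illustration and are not taken from the experiments above.

```python
import numpy as np

def project(points, focal, distance):
    """Pinhole projection of (N, 3) points seen by a camera `distance` units away along z."""
    z = points[:, 2] + distance
    return focal * points[:, :2] / z[:, None]

rng = np.random.default_rng(0)
points = rng.uniform(-0.5, 0.5, size=(1000, 3))  # a compact object inside a unit cube

reference = project(points, focal=1000.0, distance=8.0)
# Double the focal length and the distance together (a "dolly zoom").
ambiguous = project(points, focal=2000.0, distance=16.0)
# Double only the focal length, leaving the camera in place.
wrong_focal = project(points, focal=2000.0, distance=8.0)

# Largest per-coordinate pixel displacement relative to the reference image.
err_ambiguous = np.abs(ambiguous - reference).max()
err_wrong_focal = np.abs(wrong_focal - reference).max()
print(f"focal and distance scaled together: {err_ambiguous:.2f} px")
print(f"focal scaled alone: {err_wrong_focal:.2f} px")
```

In this toy case the jointly scaled camera moves the projected points by only a few pixels, while changing the focal length alone moves them by tens of pixels. A photometric loss therefore gives a weak signal for telling the two jointly scaled cameras apart.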
Even in the presence of perspective ambiguities the reconstruction may look correct at first glance. However, compared to the reconstruction with the correct cameras there are significantly more artifacts around the boundaries. Floaters caused by inaccurate boundaries can be clearly seen in the depth.