HyperNeRF
- Presentation for 未踏Jr videos:
SIGGRAPH_Asia_2021
https://hypernerf.github.io/ https://arxiv.org/pdf/2106.13228.pdf
-
It seems I need to start by studying prerequisite topics like NeRF.
- I was interested after listening to the presentation, so it’s a good opportunity to research.
-
I want to summarize this for a 7-minute video in the 未踏Jr YouTube project.
-
Reasons for choosing this:
- I found the idea of dealing with high dimensions interesting.
- Treating the spatial dimensions and other dimensions in a unified way, generalizing by focusing on dimensions themselves.
- Tendency to Think about High-level Things
- It seems interesting, like topology (although I don’t fully understand it).
- It seems interesting in many ways, but I don’t fully understand it yet, so I want to.
-
It’s nice that the content of the paper is visualized interactively on https://hypernerf.github.io/.
- It feels like they are doing things that cannot be done with paper or PDF.
-
I want to read it while being aware of Ochiai-sensei’s Format for Reading Papers Quickly.
-
DL Reading Group A Higher-Dimensional Representation for Topologically Varying …
-
https://twitter.com/KeunhongP/status/1436505902387843072
- There is a demo on Google Colaboratory; it’s impressive.
Below is a summary of the paper I read.
-
Abstract
-
There are various extensions of NeRF for dynamic scenes (scenes where objects move).
- However, they struggle with changes in topology.
- HyperNeRF solves this problem by lifting NeRF into a higher dimension.
- By adding extra dimensions, shapes with different topology can be handled together.
- This extra-dimensional space is called the hyper-space.
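- To make the hyper-space idea concrete, here is a minimal sketch I put together (not the authors' code; the network sizes, the number of ambient dimensions, and all values are placeholders): a NeRF-style MLP whose input is the 3D point and view direction plus a few extra ambient ("hyper") coordinates.
```python
import jax
import jax.numpy as jnp

# Minimal sketch of a NeRF-style field extended with ambient ("hyper")
# coordinates. Sizes and values are placeholders, not the paper's settings.
def init_mlp(key, sizes):
    params = []
    for i in range(len(sizes) - 1):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (sizes[i], sizes[i + 1])) * 0.1,
                       jnp.zeros(sizes[i + 1])))
    return params

def mlp(params, x):
    for w, b in params[:-1]:
        x = jax.nn.relu(x @ w + b)
    w, b = params[-1]
    return x @ w + b

W_AMBIENT = 2                       # number of extra "hyper" dimensions (assumed)
key = jax.random.PRNGKey(0)
params = init_mlp(key, [3 + 3 + W_AMBIENT, 64, 64, 4])  # (x, d, w) -> (rgb, sigma)

x = jnp.array([0.1, 0.2, 0.3])      # 3D sample point along a ray
d = jnp.array([0.0, 0.0, 1.0])      # view direction
w = jnp.array([0.5, -0.2])          # ambient coordinates: which "slice" of hyper-space
out = mlp(params, jnp.concatenate([x, d, w]))
rgb, sigma = jax.nn.sigmoid(out[:3]), jax.nn.relu(out[3])
```
- Changing w while keeping x and d fixed corresponds to moving between different "slices" (scene states) of the higher-dimensional object.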
-
Goals (tasks for evaluation)
- Interpolation across time (plain NeRF cannot do this).
- Novel-view synthesis (plain NeRF can also do this).
- In short, the goal is to generate (interpolate) images that never existed in the input, whether we move along the t-axis or change the x/y/z camera position / viewpoint (see the sketch below).
-
And HyperNeRF achieves lower error rates and better results than the earlier Nerfies method.
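- As a rough illustration of the interpolation goal (my own sketch; `render`, `params`, `camera_pose_new`, and the values are hypothetical, not the paper's API): per-frame latent codes learned during training could be blended and then rendered from a camera pose never observed at that moment.
```python
import jax.numpy as jnp

# Hypothetical sketch of the interpolation task: blend two learned per-frame
# codes and render from a new viewpoint. `render`, `params`, and
# `camera_pose_new` are placeholders for a trained model, not a real API.
def interpolate_codes(omega_a, omega_b, alpha):
    return (1.0 - alpha) * omega_a + alpha * omega_b   # simple linear blend

omega_a = jnp.array([0.3, -0.1])    # learned code for observed frame t_a (made-up values)
omega_b = jnp.array([-0.4, 0.8])    # learned code for observed frame t_b
omega_mid = interpolate_codes(omega_a, omega_b, 0.5)   # an in-between moment
# image = render(params, camera_pose_new, omega_mid)   # novel time x novel view
```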
-
Introduction
- Real-world objects undergo changes, including topological changes.
- For example, objects breaking.
- Changes in facial expressions (such as opening and closing the mouth) also involve topological changes if you think about it.
- These changes are not continuous (there are discontinuous timings where the topology changes).
- So it was difficult to handle them with existing algorithms for interpolating between scenes.
- As a solution to this kind of problem, there is an idea called the Level Set Method.
- In a nutshell, this paper is NeRF × the Level Set Method (a concrete slicing example is sketched after this list).
- Differences from classical Level Set Method
- Classical methods only increase one dimension, but HyperNeRF can go to any number of dimensions.
- They talk about increasing the ambient dimension.
- What is the ambient space?
- It doesn’t seem to be limited to Euclidean space (I don’t really understand the topology side of this).
- They represent such non-Euclidean things with neural networks.
- A hyper-dimensional NeRF, hence the name HyperNeRF.
- Instead of regularization, they use an optimization strategy.
- I don’t understand it well.
- Maybe it means there is less manual intervention?
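- A concrete level-set example I made up (not from the paper): a fixed 3D torus whose horizontal slices change topology as the slice height w varies, even though the 3D object itself never changes.
```latex
% Self-made illustration of level-set slicing (not from the paper).
% Torus with tube radius r and center radius R, 0 < r < R:
\[
  F(x, y, w) = \left(\sqrt{x^2 + y^2} - R\right)^2 + w^2 - r^2 .
\]
% Its slice at height w_0, i.e. the set {(x, y) : F(x, y, w_0) = 0}, is
\[
  \begin{cases}
    \text{two circles of radii } R \pm \sqrt{r^2 - w_0^2}, & |w_0| < r,\\[2pt]
    \text{one circle of radius } R, & |w_0| = r,\\[2pt]
    \emptyset, & |w_0| > r.
  \end{cases}
\]
```
- So a shape whose topology changes as w moves can still be just the slices of one fixed higher-dimensional shape; HyperNeRF applies the same trick to radiance fields.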
-
Related Works
- Non-rigid reconstruction
- There are methods that use multi-camera rigs or depth sensors (LiDAR, etc.), but the hardware setup is cumbersome.
- They also mention existing methods for the single-camera case, but I don’t understand the mechanisms, so I can’t follow the problems either (blu3mo).
- It seems similar to 62026b6d79e11300004e3006.
- I’ll skip it for now and be happy if I can understand it later.
- HyperNeRF solves the problem of topology changes that were present in Nerfies.
- Neural Rendering
- Around 2019, there were various studies on training neural networks to generate images from images.
- However, there was a problem of inconsistency when generating images from various viewpoints.
- After that, neural networks like NeRF emerged to represent scenes themselves.
- This can maintain geometric consistency.
- One problem with NeRF is that it struggles with representing moving objects.
- That makes sense.
- There are two approaches to solve this problem:
- Deformation-based approach:
- Represents moving objects using a continuous deformation field.
- Like the radiance field, the deformation field is also approximated by a neural network.
- It has a weakness in that it cannot represent topological changes or transient effects like fire.
- Modulation-based approach:
- Uses a latent code.
- I don’t understand the mechanism well.
- Can cover topological changes and other effects.
- HyperNeRF is an approach that combines both of these (see the sketch after this list).
- It models scene motion with a deformation field, and handles topology changes through the extra hyper-space dimensions.
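- Here is a minimal sketch I wrote (not the authors' implementation; sizes and values are placeholders) of the deformation-based idea: a deformation MLP, conditioned on a per-frame latent code, warps an observed point back into a canonical frame, and a single canonical radiance field is queried there. For the modulation-based idea, the code would instead be concatenated directly onto the radiance field's input.
```python
import jax
import jax.numpy as jnp

# Sketch of the deformation-based approach (placeholder sizes/values).
def init_mlp(key, sizes):
    params = []
    for i in range(len(sizes) - 1):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (sizes[i], sizes[i + 1])) * 0.1,
                       jnp.zeros(sizes[i + 1])))
    return params

def mlp(params, x):
    for w, b in params[:-1]:
        x = jax.nn.relu(x @ w + b)
    w, b = params[-1]
    return x @ w + b

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
LATENT = 8                                          # per-frame code size (assumed)
deform_params = init_mlp(k1, [3 + LATENT, 64, 3])   # T(x, omega_i) -> 3D offset
canon_params = init_mlp(k2, [3 + 3, 64, 4])         # canonical NeRF: (x', d) -> (rgb, sigma)

x = jnp.array([0.1, 0.2, 0.3])      # point sampled along a camera ray at frame i
d = jnp.array([0.0, 0.0, 1.0])      # view direction
omega_i = jnp.zeros(LATENT)         # learned latent code for frame i

x_canon = x + mlp(deform_params, jnp.concatenate([x, omega_i]))  # warp to canonical frame
out = mlp(canon_params, jnp.concatenate([x_canon, d]))
rgb, sigma = jax.nn.sigmoid(out[:3]), jax.nn.relu(out[3])
```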
Summary from the website:
- Motivation:
- Inspired by the concept of the Level Set Method.
- This method treats something that changes (e.g., a 2D shape) as slices of a higher-dimensional object (e.g., a 3D shape).
- Link to video
- Architecture: