HawkI: Homography Mutual Information Guidance for 3D-free Single Image to Aerial View
arXiv (2023)
Abstract
We present HawkI, a method for synthesizing aerial-view images from text and an
exemplar image, without any additional multi-view or 3D information for
finetuning or at inference. HawkI uses techniques from classical computer
vision and information theory. It seamlessly blends the visual features from
the input image within a pretrained text-to-2D-image stable diffusion model with
a test-time optimization process for a careful bias-variance trade-off, which
uses an Inverse Perspective Mapping (IPM) homography transformation to provide
subtle cues for aerial-view synthesis. At inference, HawkI employs a unique
mutual information guidance formulation to steer the generated image towards
faithfully replicating the semantic details of the input image, while
maintaining a realistic aerial perspective. Mutual information guidance
maximizes the semantic consistency between the generated image and the input
image, without enforcing pixel-level correspondence between vastly different
viewpoints. Through extensive qualitative and quantitative comparisons against
text + exemplar-image based methods and 3D/multi-view based novel-view
synthesis methods on proposed synthetic and real datasets, we demonstrate that
our method achieves a significantly better bias-variance trade-off towards
generating high-fidelity aerial-view images. Code and data are available at
https://github.com/divyakraman/HawkI2024.
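
To make the idea of mutual information as a similarity signal concrete, the following is a minimal sketch of a generic histogram-based estimator between two equally sized sequences of values (for example, flattened image or feature arrays). It is an illustration only and is not taken from the HawkI code; the function name `mutual_information` and the `bins` parameter are assumptions made for this example. The point it shows is that mutual information rewards statistical dependence between two signals rather than equality of their values.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=16):
    """Histogram-based estimate of I(A; B) in nats (illustrative, hypothetical helper)."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    def quantize(xs):
        # Map each value to one of `bins` equal-width bins over the data range.
        lo, hi = min(xs), max(xs)
        width = (hi - lo) / bins or 1.0  # constant input: fall back to one bin
        return [min(int((x - lo) / width), bins - 1) for x in xs]

    qa, qb = quantize(a), quantize(b)
    n = len(qa)
    joint = Counter(zip(qa, qb))      # joint bin counts
    ca, cb = Counter(qa), Counter(qb)  # marginal bin counts

    # I(A; B) = sum_ij p(i, j) * log( p(i, j) / (p(i) * p(j)) )
    mi = 0.0
    for (i, j), c in joint.items():
        mi += (c / n) * math.log(c * n / (ca[i] * cb[j]))
    return mi
```

Under this estimator, a signal compared with a remapped copy of itself scores high even though the values differ, while two independent signals score near zero (up to the small positive bias of finite-sample histograms).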