Characterising harmful data sources when constructing multi-fidelity surrogate models
arXiv (2024)
Abstract
Surrogate modelling techniques have seen growing attention in recent years
when applied to both modelling and optimisation of industrial design problems.
These techniques are highly relevant when assessing the performance of a
particular design carries a high cost, as the overall cost can be mitigated via
the construction of a model to be queried in lieu of the available high-cost
source. The construction of these models can sometimes employ other sources of
information which are both cheaper and less accurate. The existence of these
sources, however, raises the question of which sources should be used when
constructing a model. Recent studies have attempted to characterise harmful
data sources to guide practitioners in choosing when to ignore a certain
source. These studies have done so in a synthetic setting, characterising
sources using a large amount of data that is not available in practice. Some of
these studies have also been shown to potentially suffer from bias in the
benchmarks used in the analysis. In this study, we present a characterisation
of harmful low-fidelity sources using only the limited data available to train
a surrogate model. We employ recently developed benchmark filtering techniques
to conduct a bias-free assessment, providing objectively varied benchmark
suites of different sizes for future research. Analysing one of these benchmark
suites with the technique known as Instance Space Analysis, we provide an
intuitive visualisation of when a low-fidelity source should be used and use
this analysis to provide guidelines that can be used in an applied industrial
setting.