Random thoughts about Complexity, Data and Models

arxiv(2020)

Abstract
Data Science and Machine Learning have been growing rapidly for the past decade. We argue that, to make the most of this exciting field, we should resist the temptation of assuming that forecasting can be reduced to brute-force data analytics. This is because modelling, as we illustrate below, requires mastering the art of selecting relevant variables. More specifically, we investigate the subtle relation between "data and models" by focussing on the role played by algorithmic complexity, which contributed to making mathematically rigorous the long-standing idea that to understand empirical phenomena is to describe the rules which generate the data in terms which are "simpler" than the data itself. A key issue in appraising the relation between algorithmic complexity and algorithmic learning is a much-needed clarification of the related but distinct concepts of compressibility, determinism and predictability. To this end we illustrate that the evolution law of a chaotic system is compressible, but a generic initial condition for it is not, making the time series generated by chaotic systems incompressible in general. Hence knowledge of the rules which govern an empirical phenomenon is not sufficient for predicting its outcomes. In turn this implies that there is more to understanding phenomena than learning such rules -- even from data alone. This can be achieved only in those cases when we are capable of "good modelling". Clearly, the very idea of algorithmic complexity rests on Turing's seminal analysis of computation. This motivates our remarks on this extremely telling example of analogy-based abstract modelling which is nonetheless heavily informed by empirical facts.
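The abstract's central distinction -- a compressible evolution law versus an incompressible trajectory -- can be made concrete with a standard chaotic system. The sketch below uses the logistic map at r = 4, chosen here purely as an illustration (the paper does not single out this system): the rule fits in one line, yet two trajectories started from initial conditions differing by 10⁻¹⁰ diverge after a few dozen iterations, so knowing the rule alone does not suffice to predict outcomes.

```python
def logistic(x, r=4.0):
    """One step of the logistic map: a one-line, fully 'compressible' evolution law."""
    return r * x * (1.0 - x)

def trajectory(x0, steps, r=4.0):
    """Iterate the map from initial condition x0 for the given number of steps."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic(xs[-1], r))
    return xs

# Two initial conditions that agree to 10 decimal places.
a = trajectory(0.2, 50)
b = trajectory(0.2 + 1e-10, 50)

# The tiny initial discrepancy is roughly doubled at each step (positive
# Lyapunov exponent), so within ~40 iterations the trajectories differ
# at order one -- the generic initial condition carries information the
# short rule cannot supply.
max_gap = max(abs(x - y) for x, y in zip(a, b))
```

Since the divergence rate is exponential, compressing the *rule* saves almost nothing when predicting the *series*: each extra step of prediction horizon demands roughly one extra bit of precision in the initial condition.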