Physics Inspired Models in Artificial Intelligence

Muhammad Aurangzeb Ahmad

Department of Computer Science, University of Washington Tacoma
Seattle, WA, USA
maahmad@uw.edu

Şener Özönder

Electrical and Electronics Engineering Department, Istinye University
Istanbul, Turkey
sener.ozonder@istinye.edu.tr

ABSTRACT

Ideas originating in physics have informed progress in artificial intelligence and machine learning for many decades. However, the pedigree of many such ideas is often neglected in the computer science community. This tutorial focuses on current and past ideas from physics that have helped further AI and machine learning. Recent advances in physics-inspired ideas in AI are also explored, especially how insights from physics may hold the promise of opening the black box of deep learning. Lastly, we discuss current and future trends in this area and outline a research agenda on how physics-inspired models can benefit AI and machine learning.

CCS CONCEPTS

• Computing methodologies → Artificial intelligence; Machine learning; Learning to rank; Machine learning algorithms; Philosophical/theoretical foundations of artificial intelligence.

KEYWORDS

artificial intelligence, physics, ai and physics, physics inspired models, machine learning and physics

ACM Reference Format:

Muhammad Aurangzeb Ahmad and Şener Özönder. 2020. Physics Inspired Models in Artificial Intelligence. In 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '20), August 23–27, 2020, Virtual Event, USA. ACM, New York, NY, USA, 2 pages. https://doi.org/10.1145/3394486.3406464

1 INTRODUCTION

Artificial intelligence and machine learning have a long history of cross-fertilization with other domains, e.g., Shapley value models from game theory, generalized linear models from statistics, Bayesian rule lists from frequent pattern mining, functional analysis from social networks, etc. Ideas from physics have time and again provided fodder for conceptual developments in AI and machine learning. The history of borrowing ideas from physics shows that mapping a computer science problem formulation onto a class of physical models with already known behavior and solutions can be used to explain the inner workings of AI models [2].

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

KDD ’20, August 23–27, 2020, Virtual Event, USA © 2020 Copyright held by the owner/author(s). ACM ISBN 978-1-4503-7998-4/20/08. https://doi.org/10.1145/3394486.3406464

2 STATISTICAL PHYSICS

Machine learning has a long history of interaction with and influence from statistical physics, going back to the 1970s. Statistical physics is a branch of physics that mainly employs methods of probability theory and statistics, especially for dealing with large populations and approximations, in solving physical problems. We start with a brief, high-level overview of statistical physics and some well-known early examples of using physics to inform machine learning models, e.g., Valiant's theory of the learnable [16] and Hopfield's neural network model of associative memory [1]. These involved the application of concepts from spin glass theory to neural network models. We describe how analytic statistical physics models have also been used to demonstrate that the learning dynamics of some machine learning models can be better explored using methods from physics than via worst-case PAC bounds.
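To make the associative-memory picture concrete, here is a minimal, self-contained sketch (not from the tutorial itself; all sizes and names are illustrative) of a Hopfield network that stores random binary patterns with the Hebbian rule and recalls one of them from a corrupted cue. The update rule is exactly zero-temperature dynamics of an Ising spin system with couplings W.

```python
import numpy as np

rng = np.random.default_rng(0)

# Store random binary (+/-1) patterns with the Hebbian rule W = (1/N) sum_p x_p x_p^T.
N, P = 100, 5                      # neurons, stored patterns (illustrative sizes)
patterns = rng.choice([-1, 1], size=(P, N))
W = patterns.T @ patterns / N
np.fill_diagonal(W, 0)             # no self-coupling, as in the Ising analogy

def recall(state, steps=20):
    """Asynchronous updates: each neuron aligns with its local field,
    i.e. zero-temperature Glauber dynamics of an Ising spin system."""
    state = state.copy()
    for _ in range(steps):
        for i in rng.permutation(N):
            state[i] = 1 if W[i] @ state >= 0 else -1
    return state

# Corrupt a stored pattern by flipping 10% of its spins, then recall it.
probe = patterns[0].copy()
flip = rng.choice(N, size=N // 10, replace=False)
probe[flip] *= -1
print("overlap with stored pattern:", recall(probe) @ patterns[0] / N)
```

An overlap near 1.0 means the corrupted cue fell into the basin of attraction of the stored pattern; capacity results such as [1] quantify how many patterns can be stored before these basins break down.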

3 INFORMATION BOTTLENECK

The concept of the information bottleneck [13] has its origins in statistical physics and has been greatly influential in the understanding of deep learning. The theory of the information bottleneck for deep learning [12] aims to quantify the notion that the layers of a neural network trade off between keeping enough information about the input so that the output labels can be predicted, while forgetting as much of the unnecessary information as possible in order to keep the learned representation concise. One interesting consequence of this information-theoretic analysis is that the traditional capacity, or expressivity, dimension of the network, such as the VC dimension, is replaced by the exponent of the mutual information between the input and the compressed hidden layer representation. This implies that every bit of representation compression is equivalent to doubling the training data in its impact on the generalization error.
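As a concrete illustration of the central quantity, the following toy sketch (our own example, assuming the simple binning estimator used in analyses like [12]; the variables x, h, t and bin counts are illustrative) computes a plug-in estimate of the mutual information I(X;T) between a discrete input and a binned hidden representation.

```python
import numpy as np

def mutual_information(x_ids, t_ids):
    """Plug-in estimate of I(X;T) in bits from paired discrete samples."""
    joint = {}
    for pair in zip(x_ids, t_ids):
        joint[pair] = joint.get(pair, 0) + 1
    n = len(x_ids)
    px = np.bincount(x_ids) / n
    pt = np.bincount(t_ids) / n
    mi = 0.0
    for (x, t), c in joint.items():
        p = c / n
        mi += p * np.log2(p / (px[x] * pt[t]))
    return mi

# Discretize hidden activations into bins, as in binning-based IB analyses.
rng = np.random.default_rng(0)
x = rng.integers(0, 16, size=5000)                  # input symbol ids
h = np.tanh(0.3 * x + rng.normal(0, 0.5, 5000))     # a noisy "hidden layer"
t = np.digitize(h, np.linspace(-1, 1, 30))          # binned representation T
print(f"I(X;T) = {mutual_information(x, t):.2f} bits")
```

Tracking this estimate per layer over training epochs is what produces the compression curves in the information plane.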

4 AUTO-ENCODERS

In deep learning, auto-encoders have been applied to a variety of applications with phenomenal success. We describe how variational autoencoders (VAEs) [7] are variants of auto-encoders with a direct analogue to physical models, where the autoencoder can be represented via a graphical model. It should be noted that VAEs with a single hidden layer are closely related to techniques widely used in signal processing in physics, such as dictionary learning and sparse coding.
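A minimal sketch of the VAE objective from [7], written in PyTorch (a framework choice of ours; the dimensions are illustrative), showing the reparameterization trick and the negative evidence lower bound (ELBO) that couples a reconstruction term to a KL regularizer.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=8):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)   # outputs mean and log-variance of q(z|x)
        self.dec = nn.Linear(z_dim, x_dim)       # parameterizes p(x|z)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        # Reparameterization trick: differentiable samples z = mu + sigma * eps
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return torch.sigmoid(self.dec(z)), mu, logvar

def neg_elbo(x_hat, x, mu, logvar):
    # Reconstruction term plus KL(q(z|x) || N(0, I)), both in closed form.
    rec = nn.functional.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

model = VAE()
x = torch.rand(32, 784)                 # stand-in for a batch of binarized images
x_hat, mu, logvar = model(x)
loss = neg_elbo(x_hat, x, mu, logvar)
loss.backward()
```

Read as a graphical model, the decoder is the generative arrow z → x and the encoder is an amortized approximation to the intractable posterior p(z|x), which is what makes the physical/graphical-model analogy direct.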

5 COMMITTEE MACHINES

Committee machines are a special type of fully-connected neural network in which only the weights of the first layer are learned and the weights of the subsequent layers are fixed. These models have been extensively studied in physics [4]: when the size of the input data is small, a weight configuration that is the same for every hidden unit can lead to minimal error, equivalent to implementing simple regression; when the number of hidden nodes exceeds a threshold, the units in the hidden layer learn different weights and the generalization error decreases. Committee machines have been used to analyze the consequences of over-parametrization in neural networks [6].
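A toy sketch of this setup (our own; dimensions, learning rate and step counts are illustrative), mirroring the teacher-student analyses of [6]: labels come from a teacher committee machine, and only the first-layer weights of the student are trained while its second layer is fixed to a plain sum.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, P = 50, 4, 2000        # input dim, hidden units, training examples

# Teacher-student setup: labels come from a fixed teacher committee machine.
W_teacher = rng.normal(size=(K, N))
X = rng.normal(size=(P, N))
y = np.sum(np.tanh(X @ W_teacher.T / np.sqrt(N)), axis=1)

# Student: only the first-layer weights W are learned; second layer is fixed to 1s.
W = rng.normal(size=(K, N))
lr = 0.05
for _ in range(200):
    H = np.tanh(X @ W.T / np.sqrt(N))          # hidden activations
    err = H.sum(axis=1) - y                    # fixed second layer: plain sum
    # Gradient of the mean squared error with respect to W only
    grad = ((err[:, None] * (1 - H**2)).T @ X) / (P * np.sqrt(N))
    W -= lr * grad
print("train MSE:", np.mean((np.tanh(X @ W.T / np.sqrt(N)).sum(1) - y) ** 2))
```

Because the student's output is permutation-symmetric in its hidden units, this is exactly the setting where the specialization transition described above (identical weights vs. differentiated units) can be observed and analyzed.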

6 PHYSICS AND DEEP LEARNING

Physics-inspired models of GANs are an emerging area of research; e.g., the work on a solvable model of GANs [17] is in fact a generalization of earlier statistical physics work on online learning in perceptrons. There are multiple synergies between physics and deep learning; e.g., the training phase of many machine learning algorithms is carried out via stochastic gradient descent, which has direct analogies in the study of complex energy landscapes [9]. In multilayer neural nets, early layers learn to represent the input data at a finer scale than later layers. In physics, this can be mapped to the renormalization group, which is used to extract macroscopic behavior from microscopic rules [10].
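The energy-landscape analogy can be made concrete with a toy Langevin simulation (our own illustration; the double-well potential, step size and temperature are arbitrary choices): gradient descent plus Gaussian noise hops between minima and samples a Boltzmann distribution, much as minibatch noise lets SGD escape basins of a loss landscape [9].

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_energy(w):
    """Gradient of a double-well 'loss landscape' E(w) = (w^2 - 1)^2."""
    return 4 * w * (w**2 - 1)

# Langevin dynamics: gradient descent plus thermal noise, the physics
# analogue of minibatch noise in stochastic gradient descent.
w, lr, temperature = 1.0, 0.01, 0.4
trajectory = []
for _ in range(20000):
    noise = np.sqrt(2 * lr * temperature) * rng.normal()
    w = w - lr * grad_energy(w) + noise
    trajectory.append(w)

# With enough noise the walker hops between the two minima at w = +/-1,
# sampling exp(-E(w)/T) rather than staying in one basin.
print("fraction of time near w = -1:", np.mean(np.array(trajectory) < 0))
```

Setting the temperature to zero recovers plain gradient descent stuck in whichever basin it started in, which is the simplest statement of why noise matters for exploration.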

7 STOCHASTIC BLOCK MODELS

Phenotyping and clustering employ stochastic block models. In unsupervised learning, work on the stochastic block model for the detection of clusters/communities in sparse networks was built on work from physics on low-rank matrix decomposition. The problem of community detection has been studied extensively by the physics community [5]. It is important to note that the exact solution and the understanding of algorithmic limitations in the stochastic block model came from spin glass theory in physics [3]. Additionally, a conjecture about the belief propagation algorithm that came from physics was the foundation for the discovery of a new class of spectral algorithms for sparse data [8].
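As an illustration, a small sketch (our own; node counts and edge probabilities are illustrative) that plants a two-community stochastic block model and recovers the communities with a naive adjacency-spectrum method. In the truly sparse regime this naive method fails, which is precisely what motivates the non-backtracking spectral methods of [8].

```python
import numpy as np

rng = np.random.default_rng(0)
n, p_in, p_out = 200, 0.10, 0.02   # nodes, within- and between-community edge probs

# Planted two-block SBM: first half of nodes in community 0, second half in 1.
labels = np.repeat([0, 1], n // 2)
probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
A = (rng.random((n, n)) < probs).astype(float)
A = np.triu(A, 1)
A = A + A.T                        # symmetric adjacency matrix, no self-loops

# Naive spectral recovery: the sign of the second-largest eigenvector of A
# splits the two communities when the graph is dense enough.
vals, vecs = np.linalg.eigh(A)
guess = (vecs[:, -2] > 0).astype(int)
acc = max(np.mean(guess == labels), np.mean(guess != labels))  # labels defined up to swap
print("recovery accuracy:", acc)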

8 BOLTZMANN MACHINES

Restricted Boltzmann Machines are algorithms used for unsupervised learning that are directly inspired by physics; in fact, the Boltzmann machine is often called the inverse Ising model in the physics literature. We describe how the idea of Restricted Boltzmann Machines came about, how these models represent data, and recent advances in physics where they are used to model protein families from their sequence information [14].
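A minimal sketch of CD-1 (one-step contrastive divergence) training for a binary RBM, showing the data-minus-model correlation update that approximates the likelihood gradient of the inverse Ising problem. This is our own toy version: biases are omitted for brevity and all sizes and rates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vis, n_hid, lr = 16, 8, 0.05
W = 0.01 * rng.normal(size=(n_vis, n_hid))   # couplings of the "inverse Ising" model

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def cd1_update(v0):
    """One step of contrastive divergence (CD-1) on a batch of binary data."""
    ph0 = sigmoid(v0 @ W)                        # p(h=1 | v) under the Boltzmann energy
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T)                      # reconstruct the visible units
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W)
    # Approximate likelihood gradient: data correlations minus model correlations
    return (v0.T @ ph0 - v1.T @ ph1) / len(v0)

data = (rng.random((500, n_vis)) < 0.5).astype(float)   # stand-in binary dataset
for _ in range(100):
    batch = data[rng.choice(len(data), 64, replace=False)]
    W += lr * cd1_update(batch)
```

In protein applications such as [14], the visible units are sequence positions and the learned couplings W play the role of inferred interactions, which is why the inverse-Ising reading is more than a metaphor.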

9 SYMBOLIC REGRESSION

Symbolic regression is an area of AI that focuses on finding a symbolic expression that accurately matches a given data set. This area has seen cross-fertilization from physics. A pioneering work in this area is that of Schmidt and Lipson [11] on distilling natural laws from experimental data; more recent work includes the AI Feynman approach of Udrescu and Tegmark [15].
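As a toy illustration of the underlying search problem (real systems such as [11] use genetic programming over far larger expression grammars, and [15] adds physics-motivated pruning; this brute-force grammar is entirely our own), the sketch below rediscovers y = x·sin(x) by scoring a tiny space of candidate expressions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.5, 3.0, size=200)
y = x * np.sin(x)                    # the "unknown" law to rediscover

# Tiny hypothesis space: binary combinations of unary building blocks,
# scored by mean squared error on the data.
unary = {"x": lambda a: a, "sin(x)": np.sin, "cos(x)": np.cos,
         "log(x)": np.log, "x^2": lambda a: a**2}
binary = {"+": np.add, "*": np.multiply}

best = (np.inf, None)
for (n1, f), (n2, g) in itertools.product(unary.items(), repeat=2):
    for op_name, op in binary.items():
        pred = op(f(x), g(x))
        mse = np.mean((pred - y) ** 2)
        if mse < best[0]:
            best = (mse, f"{n1} {op_name} {n2}")
print("best expression:", best[1], "mse:", best[0])
```

The combinatorial explosion of this space as the grammar grows is exactly why the field leans on evolutionary search and on physics priors such as symmetry and separability.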

10 CONCLUSION

While physics-inspired AI models hold great promise, the current models in this domain have limitations; e.g., many physics-based models employ simplified models of the world which do not generalize. Many models from physics are solvable and can be computed in closed form, which contrasts with the aims of traditional learning theory, which generally focuses on worst-case error bounds [2]. Another complementary approach that is fast emerging in this area is to use machine learning to find approximate solutions to problems in physics and then use those approaches from physics to help advance the field of machine learning. We describe how physics-based models can help further the cause of explainable AI in general and what the future may hold for this area. Lastly, we extrapolate from current trends and propose what a research programme for physics-inspired machine learning would look like and what the likely future trends in this field are.

REFERENCES

[1] Yaser Abu-Mostafa and J St Jacques. 1985. Information capacity of the Hopfield model. IEEE Transactions on Information Theory 31, 4 (1985), 461–464.
[2] Giuseppe Carleo, Ignacio Cirac, Kyle Cranmer, Laurent Daudet, Maria Schuld, Naftali Tishby, Leslie Vogt-Maranto, and Lenka Zdeborová. 2019. Machine learning and the physical sciences. Reviews of Modern Physics 91, 4 (2019), 045002.
[3] Aurelien Decelle, Florent Krzakala, Cristopher Moore, and Lenka Zdeborová. 2011. Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Physical Review E 84, 6 (2011), 066106.
[4] Andreas Engel and Christian Van den Broeck. 2001. Statistical Mechanics of Learning. Cambridge University Press.
[5] Santo Fortunato. 2010. Community detection in graphs. Physics Reports 486, 3-5 (2010), 75–174.
[6] Sebastian Goldt, Madhu Advani, Andrew M Saxe, Florent Krzakala, and Lenka Zdeborová. 2019. Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup. In Advances in Neural Information Processing Systems. 6981–6991.
[7] Diederik P Kingma and Max Welling. 2013. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114 (2013).
[8] Florent Krzakala, Cristopher Moore, Elchanan Mossel, Joe Neeman, Allan Sly, Lenka Zdeborová, and Pan Zhang. 2013. Spectral redemption in clustering sparse networks. Proceedings of the National Academy of Sciences 110, 52 (2013), 20935–20940.
[9] Chunyuan Li, Changyou Chen, David E Carlson, and Lawrence Carin. 2016. Preconditioned stochastic gradient Langevin dynamics for deep neural networks. In AAAI, Vol. 2. 4.
[10] Pankaj Mehta and David J Schwab. 2014. An exact mapping between the variational renormalization group and deep learning. arXiv preprint arXiv:1410.3831 (2014).
[11] Michael Schmidt and Hod Lipson. 2009. Distilling free-form natural laws from experimental data. Science 324, 5923 (2009), 81–85.
[12] Ravid Shwartz-Ziv and Naftali Tishby. 2017. Opening the black box of deep neural networks via information. arXiv preprint arXiv:1703.00810 (2017).
[13] Naftali Tishby, Fernando C Pereira, and William Bialek. 2000. The information bottleneck method. arXiv preprint physics/0004057 (2000).
[14] Jérôme Tubiana, Simona Cocco, and Rémi Monasson. 2019. Learning protein constitutive motifs from sequence data. eLife 8 (2019), e39397.
[15] Silviu-Marian Udrescu and Max Tegmark. 2020. AI Feynman: A physics-inspired method for symbolic regression. Science Advances 6, 16 (2020), eaay2631.
[16] Leslie G Valiant. 1984. A theory of the learnable. Commun. ACM 27, 11 (1984), 1134–1142.
[17] Chuang Wang, Hong Hu, and Yue Lu. 2019. A solvable high-dimensional model of GAN. In Advances in Neural Information Processing Systems. 13782–13791.
