Page 1: 2010. 5. 25 박    한    샘

Soft Computing Lab., Dept. of Computer Science

Yonsei Univ., Korea

Distilling Free-Form Natural Laws from Experimental Data

Michael Schmidt and Hod Lipson, Science, vol. 324, no. 5923, pp. 81-85, April 2009

2010. 5. 25 박 한 샘

Page 2:

Outline

• Overview of this paper

• Background & Motivation

• Algorithm

• Experiments

• Conclusion


Page 3:

Overview of This Paper

• Mining physical systems
  – Capture the angles and angular velocities over time using motion tracking
  – Search for equations that describe a single natural law relating these variables, without any prior knowledge of physics or geometry
  – The result turns out to be the double pendulum's Hamiltonian

• The proposed approach is demonstrated
  – Using a simple harmonic oscillator and a chaotic double pendulum


Actual pendulum, data and results

Page 4:

Symbolic Regression

• Symbolic regression
  – Searches both the parameters and the form of equations, unlike traditional linear and nonlinear regression methods

• Process (evolutionary computation)
  – Initial expressions are formed by randomly combining mathematical building blocks such as algebraic operators {+, -, ×, /}, analytical functions (for example, sine and cosine), constants, and state variables
  – New equations are formed by recombining previous equations and probabilistically varying their sub-expressions
  – The algorithm retains equations that model the experimental data better than others and abandons unpromising solutions
  – After equations reach a desired level of accuracy, the algorithm terminates, returning a set of equations that are most likely to correspond to the intrinsic mechanisms underlying the observed system
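The process above can be sketched in a few dozen lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the tuple-based expression trees, the protected division, the size cap, the fixed number of generations (instead of an accuracy threshold), and all function names (`random_expr`, `graft`, `evolve`, and so on) are hypothetical choices made for this example, and it fits a plain y = f(x) relation rather than a conservation law.

```python
import math
import random

# Building blocks named on the slide: algebraic operators and analytical functions.
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,  # protected division (assumption)
}
UNARY = {"sin": math.sin, "cos": math.cos}


def random_expr(rng, depth=3):
    """Randomly combine building blocks into an expression tree (nested tuples)."""
    if depth == 0 or rng.random() < 0.3:
        # Terminal: the state variable "x" or a random constant.
        return "x" if rng.random() < 0.5 else round(rng.uniform(-2, 2), 2)
    if rng.random() < 0.3:
        return (rng.choice(sorted(UNARY)), random_expr(rng, depth - 1))
    return (rng.choice(sorted(BINARY)),
            random_expr(rng, depth - 1), random_expr(rng, depth - 1))


def evaluate(expr, x):
    """Evaluate an expression tree at the value x."""
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return float(expr)
    if len(expr) == 2:
        return UNARY[expr[0]](evaluate(expr[1], x))
    return BINARY[expr[0]](evaluate(expr[1], x), evaluate(expr[2], x))


def error(expr, data):
    """Mean absolute error on (x, y) samples; inf if evaluation breaks down."""
    try:
        total = sum(abs(evaluate(expr, x) - y) for x, y in data)
    except (ValueError, OverflowError, ZeroDivisionError):
        return math.inf
    return total / len(data) if math.isfinite(total) else math.inf


def size(expr):
    """Number of nodes in the tree, a crude complexity measure."""
    return 1 + sum(size(c) for c in expr[1:]) if isinstance(expr, tuple) else 1


def random_subtree(expr, rng):
    """Walk down the tree at random and return the sub-expression reached."""
    while isinstance(expr, tuple) and rng.random() < 0.7:
        expr = expr[rng.randrange(1, len(expr))]
    return expr


def graft(expr, donor, rng):
    """Copy expr with one randomly chosen sub-expression replaced by donor."""
    if not isinstance(expr, tuple) or rng.random() < 0.3:
        return donor
    i = rng.randrange(1, len(expr))
    return expr[:i] + (graft(expr[i], donor, rng),) + expr[i + 1:]


def evolve(data, generations=40, pop_size=60, max_size=40, seed=0):
    """Evolve expressions toward low error; returns the best one found."""
    rng = random.Random(seed)
    pop = [random_expr(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda e: error(e, data))
        survivors = pop[: pop_size // 2]          # abandon the unpromising half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = graft(a, random_subtree(b, rng), rng)              # recombination
            if rng.random() < 0.5:
                child = graft(child, random_expr(rng, depth=2), rng)   # random variation
            children.append(child if size(child) <= max_size else a)
        pop = survivors + children
    return min(pop, key=lambda e: error(e, data))
```

Calling `evolve([(i / 10, 2 * (i / 10) + 1) for i in range(20)])` returns a nested tuple such as `("+", "x", ...)` that can be scored with `error`. Because the better half of the population is always retained, the best error never gets worse from one generation to the next.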



Page 5:

Challenge

• It is a major challenge even for human scientists to identify nontrivial relations

• A nontrivial conservation equation should be able to predict connections among derivatives of groups of variables over time, relations that we can also calculate from new experimental data

• One such metric is the set of partial derivatives between pairs of variables
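A small numerical sketch of the partial-derivative idea, assuming a frictionless harmonic oscillator with synthetic data x = cos t, v = -sin t standing in for measurements. For a candidate law f(x, v) = const, implicit differentiation gives dx/dv = -(∂f/∂v)/(∂f/∂x), which can be compared with the ratio (dx/dt)/(dv/dt) estimated from the data alone. The mean log error used here, the masking thresholds, and the name `pair_error` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Synthetic harmonic-oscillator data (an assumption standing in for measurements).
t = np.linspace(0.1, 6.0, 600)
x = np.cos(t)           # position
v = -np.sin(t)          # velocity

# Derivative ratio estimated from the data alone: dx/dv = (dx/dt) / (dv/dt).
dx_dt = np.gradient(x, t)
dv_dt = np.gradient(v, t)


def pair_error(df_dx, df_dv):
    """Mean log error between the data-derived and candidate-derived dx/dv."""
    # Skip points where either ratio is close to singular.
    mask = (np.abs(dv_dt) > 0.1) & (np.abs(df_dx) > 0.1)
    from_data = dx_dt[mask] / dv_dt[mask]
    # Implicit differentiation of f(x, v) = const.
    from_candidate = -df_dv[mask] / df_dx[mask]
    return float(np.mean(np.log1p(np.abs(from_data - from_candidate))))


# Candidate 1: f = x^2 + v^2, conserved for this oscillator.
good = pair_error(df_dx=2 * x, df_dv=2 * v)

# Candidate 2: f = x^2 - v^2, not conserved.
bad = pair_error(df_dx=2 * x, df_dv=-2 * v)
```

The conserved candidate yields an error close to zero, while the non-conserved one yields a clearly larger error, so the metric separates a real invariant from an expression that merely looks similar.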



Page 6:

Algorithm to Detect Conservation Laws

• One can control the type of law, to an extent, by choosing what variables to provide to the algorithm

• If we provide velocities, the algorithm is biased to find energy laws

• If we additionally supply accelerations, the algorithm is biased to find force identities and equations of motion

• Given other types of variables, other or previously unknown analytical laws may exist



Page 7:

Data Collection

• This paper collected data from typical systems: an air-track oscillator and a double pendulum

• Motion tracking cameras and software were used
  – Infrared markers are placed on the experimental device
  – Its dynamics are captured
  – Motion tracking software produces time-series data of 3-dimensional Euclidean position coordinates for each infrared marker
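As a rough sketch of how marker coordinates could be turned into the angles and angular velocities used later, the following assumes the pendulum swings in a single plane, so only two of the three tracked coordinates are needed, and that the pivot position is known. The function name `angle_and_angular_velocity` and the plain finite-difference derivative are assumptions for illustration; the slides do not describe the preprocessing that was actually used.

```python
import numpy as np


def angle_and_angular_velocity(t, pivot_xy, marker_xy):
    """Convert tracked marker positions to a pendulum angle and angular velocity.

    t         : (N,) sample times
    pivot_xy  : (2,) pivot position in the plane of swing (assumed known)
    marker_xy : (N, 2) marker positions in the same plane
    """
    rel = marker_xy - pivot_xy                      # marker relative to the pivot
    # Angle measured from the downward vertical; unwrap removes 2*pi jumps.
    theta = np.unwrap(np.arctan2(rel[:, 0], -rel[:, 1]))
    omega = np.gradient(theta, t)                   # finite-difference derivative
    return theta, omega


# Synthetic check: a marker on a rod of length 0.3 swinging as theta = 0.5 cos(2t).
t = np.linspace(0.0, 5.0, 5001)
theta_true = 0.5 * np.cos(2 * t)
marker = 0.3 * np.column_stack([np.sin(theta_true), -np.cos(theta_true)])
theta, omega = angle_and_angular_velocity(t, np.zeros(2), marker)
```

With real, noisy measurements the finite-difference step amplifies noise, so some smoothing before differentiation would likely be needed.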



Page 8:

Setting

• Two configurations of the air track
  – Two-spring single-mass
    • Minimal noise
  – Three-spring double-mass
    • Considerable noise

• Two configurations of a pendulum
  – A pendulum
  – A double pendulum
    • Higher measurement noise



Page 9:

Summary of Laws Inferred



Page 10:

Summary of Laws Inferred

• Given position and velocity data over time
  – The algorithm converged on the energy laws of each system (Hamiltonian and Lagrangian equations)

• Given acceleration data as well
  – It produced the differential equation of motion corresponding to Newton's second law for the harmonic oscillator and pendulum systems

• Given only position data for the pendulum
  – The algorithm converged on the equation of a circle, indicating that the pendulum is confined to a circle

• In the absence of appropriate building blocks, the algorithm developed approximations
  – For example, eliminating cosine but not sine drove the algorithm to converge on the equality cos(θ) = sin(θ + π/2) or more complex equivalences

One can control the type of law
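Two of the results above are easy to check numerically. The sketch below verifies that sine with a phase shift reproduces cosine exactly, and that position-only data from an ideal pendulum satisfies the equation of a circle. The rod length `L = 0.3` and the sampled angles are arbitrary assumptions for the example.

```python
import numpy as np

theta = np.linspace(-np.pi, np.pi, 201)

# With cosine removed from the building blocks, a phase-shifted sine is an exact substitute.
identity_gap = np.max(np.abs(np.cos(theta) - np.sin(theta + np.pi / 2)))

# Position-only data: a bob on a rigid rod of (assumed) length L satisfies x^2 + y^2 = L^2.
L = 0.3
x, y = L * np.sin(theta), -L * np.cos(theta)
circle_gap = np.max(np.abs(x**2 + y**2 - L**2))
```

Both gaps are zero up to floating-point rounding, which is why a search restricted to sine can still express a law that is normally written with cosine.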



Page 11:

Accuracy/Complexity Tradeoff

• Consider the relationship between equation complexity and accuracy
  – Extremely complex equations with near-perfect accuracy
    • Taylor series, neural networks, and Fourier series
  – Simple, single-parameter models with baseline accuracy

• The Pareto front for the double pendulum
  – The equation at the cliff corresponds to the exact energy conservation law
  – A dramatic jump in accuracy means that the equation captures a significant relationship of the system
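The accuracy/complexity tradeoff can be illustrated with a short sketch that extracts a Pareto front from candidate equations and locates the "cliff", taken here to be the front point reached by the largest single drop in error. The candidate list is made-up illustrative data, not results from the paper, and the names `pareto_front` and `cliff` are hypothetical.

```python
def pareto_front(candidates):
    """Keep candidates not dominated in (complexity, error); lower is better for both."""
    front = []
    best_error = float("inf")
    # Walk from simplest to most complex; keep a candidate only if it improves the error.
    for complexity, err, label in sorted(candidates):
        if err < best_error:
            front.append((complexity, err, label))
            best_error = err
    return front


def cliff(front):
    """Front point reached by the largest single drop in error."""
    drops = [(front[i - 1][1] - front[i][1], front[i]) for i in range(1, len(front))]
    return max(drops, key=lambda d: d[0])[1]


# Illustrative (complexity, error, label) triples; the numbers are invented.
candidates = [
    (1, 0.90, "constant"),
    (3, 0.85, "linear"),
    (5, 0.84, "A"),
    (9, 0.10, "B"),
    (12, 0.12, "C"),   # dominated by B: more complex and less accurate
    (20, 0.09, "D"),
]
front = pareto_front(candidates)
knee = cliff(front)
```

Here `front` drops candidate "C", and `knee` is candidate "B": a modest increase in complexity buys a large gain in accuracy, which is the signature described on the slide.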



Page 12:

Time to Detect Solutions

• The computation time increases with the dimensionality (number of variables), law equation complexity, and noise
  – In the worst case, the time to converge on the law equations
    • Depends exponentially on the complexity of the law expression itself, and
    • Depends roughly quadratically on the system dimensionality
    • The bootstrapped double pendulum is an exception
  – In a 32-core implementation, the time required ranged from a few minutes (the harmonic oscillator) to 30 hours (the double pendulum)
  – Noise substantially reduces the ability to find accurate law equations
    • It either simply requires more time to compute, or
    • It obscures the law equation entirely, depending on the noise strength



Page 13:

Bootstrapping

• Bootstrapping reduced the search time from 30~40 hours of computation to 7~8 hours

• It uses terms from simpler systems as a seed for the search

• This suggests that bootstrapping may be critical for detecting laws in higher-order systems that are veiled in complexity



Page 14:

Conclusion

• Summary
  – This paper demonstrated the discovery of physical laws directly from experimentally captured data with the use of a computational search
  – The search detected nonlinear energy conservation laws, Newtonian force laws, geometric invariants, and system manifolds

• Discussion
  – The concise analytical expressions that were found
    • Are amenable to human interpretation, and
    • Help to reveal the physics underlying the observed phenomenon
  – This process will not diminish the role of future scientists, but will help them focus on interesting phenomena more rapidly

