Natural Evolution Strategies (NES)

A family of numerical optimization algorithms for black-box problems that iteratively update a search distribution using an estimated gradient with respect to the distribution's parameters.

The core idea is to update the parameters of the search distribution by ascending the search gradient, i.e., the sampled gradient of expected fitness. Let θ be the parameters of the search distribution π(x|θ) and f(x) the fitness function evaluated at a candidate x. The objective is to maximize the expected fitness under the search distribution:

$$J(\theta) = \mathbb{E}_\theta[f(x)] = \int f(x)\, \pi(x|\theta)\, dx$$

Using the log-likelihood trick, the gradient can be written as an expectation:

$$\nabla_\theta J(\theta) = \mathbb{E}_\theta\big[f(z)\, \nabla_\theta \log \pi(z|\theta)\big]$$

The Monte Carlo estimate from samples $z_1, \dots, z_\lambda$ is:

$$\nabla_\theta J(\theta) \approx \frac{1}{\lambda} \sum_{k=1}^{\lambda} f(z_k)\, \nabla_\theta \log \pi(z_k|\theta)$$

The parameters are then updated with a standard ascent scheme:

$$\theta \leftarrow \theta + \eta\, \nabla_\theta J(\theta)$$
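As a concrete illustration, here is a minimal NumPy sketch of this plain search-gradient ascent for an isotropic Gaussian search distribution with a mean-only update and fixed σ. The toy fitness f(x) = −‖x‖² and all parameter values are assumptions chosen for the example, not part of the original notes.

```python
import numpy as np

def search_gradient_step(mu, sigma, f, lam, eta, rng):
    """One plain search-gradient update of the Gaussian mean (sigma held fixed)."""
    # Draw lam samples z_k ~ pi(.|theta) = N(mu, sigma^2 I)
    z = mu + sigma * rng.standard_normal((lam, mu.size))
    fitness = np.array([f(zk) for zk in z])
    # For a Gaussian with fixed sigma: grad_mu log pi(z|theta) = (z - mu) / sigma^2
    grad_log = (z - mu) / sigma**2
    # Monte Carlo search gradient: (1/lam) * sum_k f(z_k) * grad_log_k
    grad_J = (fitness[:, None] * grad_log).mean(axis=0)
    # Plain ascent step: theta <- theta + eta * grad_J
    return mu + eta * grad_J

f = lambda x: -np.sum(x**2)        # toy fitness, maximum at the origin
rng = np.random.default_rng(0)
mu = np.array([2.0, -1.5])
for _ in range(300):
    mu = search_gradient_step(mu, sigma=0.3, f=f, lam=100, eta=0.05, rng=rng)
```

After a few hundred iterations the mean of the search distribution drifts toward the optimum at the origin, up to a noise floor set by the Monte Carlo gradient estimate.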

The natural gradient removes the dependence of the update on the particular parameterization of the distribution: steps are measured by the KL divergence between distributions rather than by Euclidean distance in parameter space, and the Fisher information matrix defines the local curvature of that distribution space. It can be computed exactly, or estimated from the same samples:

$$F = \int \pi(z|\theta)\, \nabla_\theta \log \pi(z|\theta)\, \nabla_\theta \log \pi(z|\theta)^\top \, dz \tag{1}$$

$$F \approx \frac{1}{\lambda} \sum_{k=1}^{\lambda} \nabla_\theta \log \pi(z_k|\theta)\, \nabla_\theta \log \pi(z_k|\theta)^\top \tag{2}$$

The ascent scheme then becomes:

$$\theta \leftarrow \theta + \eta\, F^{-1} \nabla_\theta J(\theta)$$
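The natural-gradient step can be sketched in the same setting: a minimal NumPy example, again assuming an isotropic Gaussian with a mean-only update, where the Fisher matrix is estimated as the sample average of outer products of the score vectors and the plain gradient is preconditioned by its inverse. The toy fitness, the small ridge term added for invertibility, and all parameter values are assumptions for illustration.

```python
import numpy as np

def natural_gradient_step(mu, sigma, f, lam, eta, rng):
    """One natural-gradient update of the Gaussian mean (sigma held fixed)."""
    d = mu.size
    z = mu + sigma * rng.standard_normal((lam, d))       # z_k ~ pi(.|theta)
    fitness = np.array([f(zk) for zk in z])
    grad_log = (z - mu) / sigma**2                       # score vectors
    grad_J = (fitness[:, None] * grad_log).mean(axis=0)  # plain search gradient
    # Fisher estimate: average of outer products of the score vectors
    F = (grad_log[:, :, None] * grad_log[:, None, :]).mean(axis=0)
    F += 1e-6 * np.eye(d)                                # small ridge for invertibility
    # Natural-gradient ascent: theta <- theta + eta * F^{-1} grad_J
    return mu + eta * np.linalg.solve(F, grad_J)

f = lambda x: -np.sum(x**2)        # toy fitness, maximum at the origin
rng = np.random.default_rng(1)
mu = np.array([2.0, -1.5])
for _ in range(400):
    mu = natural_gradient_step(mu, sigma=0.3, f=f, lam=100, eta=0.5, rng=rng)
```

For this Gaussian mean parameter the exact Fisher matrix is I/σ², so the natural gradient effectively rescales the plain gradient by σ²; the sample-based estimate recovers this without needing the closed form.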