it's only fitting
Here's a quick-n-dirty Gaussian curve fitter, which I seem to reinvent about twice a year. There are a number of canned solutions that I can never remember how to use, but I also frequently find myself wanting to fit weird functions.
Below in a minimum working example. But the short-short version is
def func(x, *params): ... x,y = read_some_data() guess_params = [...] from scipy import optimize better_params, covariance = optimize.curve_fit( func, x, y, p0=guess_params)
First, we need to define the function to fit. I think you can't have
keyword arguments, but I suppose you could kludge around that with a
wrapper or with functools.partial
. Here, the gaussian, with
normalizations for area and also normalizations for fitting to the
peak:
def gaussian_norm(x, mean, std_dev, integral, offset): import numpy as np mu,sigma = mean, std_dev # standard symbols bell = np.exp( -(x-mu)**2 / (2*sigma**2) ) norm = integral / np.sqrt(2*np.pi*sigma**2) return norm*bell + offset def gaussian_peak(x, mean, std_dev, peak, offset): import numpy as np integral = peak * np.sqrt(2*np.pi*std_dev**2) return gaussian_norm(x, mean, std_dev, integral, offset)
Next, generate some noisy data (perhaps by doing an experiment):
import numpy as np import matplotlib.pyplot as plt # Use messy bins because data always sucks frequency = np.linspace(2900,3100, 1253) # Pick some default parameters mean, std_dev, amplitude, offset = 3027, 15, 20, 5 # Generate some noisy data nice_data = gaussian_peak(frequency, mean, std_dev, amplitude, offset) data = nice_data + np.random.randn(*nice_data.shape) plt.plot(frequency, data)
Because this is quick-n-dirty, it's necessary to guess at some parameters:
from scipy import optimize guess = (3000, 50, 30, 0) better_guess, covariance = optimize.curve_fit( gaussian_peak, frequency, data, p0 = guess) plt.plot(frequency, data, alpha=0.5, label='data') plt.plot(frequency, gaussian_peak(frequency, *guess), label='intial parameters') plt.plot(frequency, gaussian_peak(frequency, *better_guess), label='fit result', color='red')
The fit results seem to be pretty good:
specified | fitted |
---|---|
3027.00 | 3026.91 |
15.00 | 15.01 |
20.00 | 19.97 |
5.00 | 5.01 |
Note that gaussian_norm
, while its integral is better-defined,
develops a huge covariance between the integral and the width of the
distribution. It's easier to read the parameters for gaussian_peak
off of the graph, and the parameters seem to be much more orthogonal
to each other.