打开APP
userphoto
未登录

开通VIP,畅享免费电子书等14项超值服

开通VIP
Fitting data

This page shows you how to fit experimental data and plots the resultsusing matplotlib.

Fit examples with sinusoidal functions

Generating the data

Using real data is much more fun, but, just so that you can reproducethis example I will generate data to fit

In [1]:
import numpy as npfrom numpy import pi, r_import matplotlib.pyplot as pltfrom scipy import optimize# Generate data points with noisenum_points = 150Tx = np.linspace(5., 8., num_points)Ty = TxtX = 11.86*np.cos(2*pi/0.81*Tx-1.32) + 0.64*Tx+4*((0.5-np.random.rand(num_points))*np.exp(2*np.random.rand(num_points)**2))tY = -32.14*np.cos(2*np.pi/0.8*Ty-1.94) + 0.15*Ty+7*((0.5-np.random.rand(num_points))*np.exp(2*np.random.rand(num_points)**2))

Fitting the data

We now have two sets of data: Tx and Ty, the time series, and tX and tY,sinusoidal data with noise. We are interested in finding the frequencyof the sine wave.

In [2]:
# Fit the first setfitfunc = lambda p, x: p[0]*np.cos(2*np.pi/p[1]*x+p[2]) + p[3]*x # Target functionerrfunc = lambda p, x, y: fitfunc(p, x) - y # Distance to the target functionp0 = [-15., 0.8, 0., -1.] # Initial guess for the parametersp1, success = optimize.leastsq(errfunc, p0[:], args=(Tx, tX))time = np.linspace(Tx.min(), Tx.max(), 100)plt.plot(Tx, tX, "ro", time, fitfunc(p1, time), "r-") # Plot of the data and the fit# Fit the second setp0 = [-15., 0.8, 0., -1.]p2,success = optimize.leastsq(errfunc, p0[:], args=(Ty, tY))time = np.linspace(Ty.min(), Ty.max(), 100)plt.plot(Ty, tY, "b^", time, fitfunc(p2, time), "b-")# Legend the plotplt.title("Oscillations in the compressed trap")plt.xlabel("time [ms]")plt.ylabel("displacement [um]")plt.legend(('x position', 'x fit', 'y position', 'y fit'))ax = plt.axes()plt.text(0.8, 0.07,         'x freq :  %.3f kHz \n y freq :  %.3f kHz' % (1/p1[1],1/p2[1]),         fontsize=16,         horizontalalignment='center',         verticalalignment='center',         transform=ax.transAxes)plt.show()

A clever use of the cost function

Suppose that you have the same data set: two time-series of oscillatingphenomena, but that you know that the frequency of the two oscillationsis the same. A clever use of the cost function can allow you to fit bothset of data in one fit, using the same frequency. The idea is that youreturn, as a "cost" array, the concatenation of the costs of your twodata sets for one choice of parameters. Thus the leastsq routine isoptimizing both data sets at the same time.

In [3]:
# Target functionfitfunc = lambda T, p, x: p[0]*np.cos(2*np.pi/T*x+p[1]) + p[2]*x# Initial guess for the first set's parametersp1 = r_[-15., 0., -1.]# Initial guess for the second set's parametersp2 = r_[-15., 0., -1.]# Initial guess for the common periodT = 0.8# Vector of the parameters to fit, it contains all the parameters of the problem, and the period of the oscillation is not there twice !p = r_[T, p1, p2]# Cost function of the fit, compare it to the previous example.errfunc = lambda p, x1, y1, x2, y2: r_[                fitfunc(p[0], p[1:4], x1) - y1,                fitfunc(p[0], p[4:7], x2) - y2            ]# This time we need to pass the two sets of data, there are thus four "args".p,success = optimize.leastsq(errfunc, p, args=(Tx, tX, Ty, tY))time = np.linspace(Tx.min(), Tx.max(), 100) # Plot of the first data and the fitplt.plot(Tx, tX, "ro", time, fitfunc(p[0], p[1:4], time),"r-")# Plot of the second data and the fittime = np.linspace(Ty.min(), Ty.max(),100)plt.plot(Ty, tY, "b^", time, fitfunc(p[0], p[4:7], time),"b-")# Legend the plotplt.title("Oscillations in the compressed trap")plt.xlabel("time [ms]")plt.ylabel("displacement [um]")plt.legend(('x position', 'x fit', 'y position', 'y fit'))ax = plt.axes()plt.text(0.8, 0.07,         'x freq :  %.3f kHz' % (1/p[0]),         fontsize=16,         horizontalalignment='center',         verticalalignment='center',         transform=ax.transAxes)
Out[3]:
<matplotlib.text.Text at 0x7fab996e5b90>

Simplifying the syntax

Especially when using fits for interactive use, the standard syntax foroptimize.leastsq can get really long. Using the following script cansimplify your life:

In [4]:
import numpy as npfrom scipy import optimizeclass Parameter:    def __init__(self, value):            self.value = value    def set(self, value):            self.value = value    def __call__(self):            return self.valuedef fit(function, parameters, y, x = None):    def f(params):        i = 0        for p in parameters:            p.set(params[i])            i += 1        return y - function(x)    if x is None: x = np.arange(y.shape[0])    p = [param() for param in parameters]    return optimize.leastsq(f, p)

Now fitting becomes really easy, for example fitting to a gaussian:

In [5]:
# giving initial parametersmu = Parameter(7)sigma = Parameter(3)height = Parameter(5)# define your function:def f(x): return height() * np.exp(-((x-mu())/sigma())**2)# fit! (given that data is an array with the data to fit)data = 10*np.exp(-np.linspace(0, 10, 100)**2) + np.random.rand(100)print fit(f, [mu, sigma, height], data)
(array([ -1.7202343 ,  12.29906459,  10.74194291]), 1)

Fitting gaussian-shaped data

Calculating the moments of the distribution

Fitting gaussian-shaped data does not require an optimization routine.Just calculating the moments of the distribution is enough, and this ismuch faster.

However this works only if the gaussian is not cut out too much, and ifit is not too small.

In [6]:
gaussian = lambda x: 3*np.exp(-(30-x)**2/20.)data = gaussian(np.arange(100))plt.plot(data, '.')X = np.arange(data.size)x = np.sum(X*data)/np.sum(data)width = np.sqrt(np.abs(np.sum((X-x)**2*data)/np.sum(data)))max = data.max()fit = lambda t : max*np.exp(-(t-x)**2/(2*width**2))plt.plot(fit(X), '-')
Out[6]:
[<matplotlib.lines.Line2D at 0x7fab9977d990>]

Fitting a 2D gaussian

Here is robust code to fit a 2D gaussian. It calculates the moments ofthe data to guess the initial parameters for an optimization routine.For a more complete gaussian, one with an optional additive constant androtation, seehttp://code.google.com/p/agpy/source/browse/trunk/agpy/gaussfitter.py.It also allows the specification of a known error.

In [7]:
def gaussian(height, center_x, center_y, width_x, width_y):    """Returns a gaussian function with the given parameters"""    width_x = float(width_x)    width_y = float(width_y)    return lambda x,y: height*np.exp(                -(((center_x-x)/width_x)**2+((center_y-y)/width_y)**2)/2)def moments(data):    """Returns (height, x, y, width_x, width_y)    the gaussian parameters of a 2D distribution by calculating its    moments """    total = data.sum()    X, Y = np.indices(data.shape)    x = (X*data).sum()/total    y = (Y*data).sum()/total    col = data[:, int(y)]    width_x = np.sqrt(np.abs((np.arange(col.size)-y)**2*col).sum()/col.sum())    row = data[int(x), :]    width_y = np.sqrt(np.abs((np.arange(row.size)-x)**2*row).sum()/row.sum())    height = data.max()    return height, x, y, width_x, width_ydef fitgaussian(data):    """Returns (height, x, y, width_x, width_y)    the gaussian parameters of a 2D distribution found by a fit"""    params = moments(data)    errorfunction = lambda p: np.ravel(gaussian(*p)(*np.indices(data.shape)) -                                 data)    p, success = optimize.leastsq(errorfunction, params)    return p

And here is an example using it:

In [8]:
# Create the gaussian dataXin, Yin = np.mgrid[0:201, 0:201]data = gaussian(3, 100, 100, 20, 40)(Xin, Yin) + np.random.random(Xin.shape)plt.matshow(data, cmap=plt.cm.gist_earth_r)params = fitgaussian(data)fit = gaussian(*params)plt.contour(fit(*np.indices(data.shape)), cmap=plt.cm.copper)ax = plt.gca()(height, x, y, width_x, width_y) = paramsplt.text(0.95, 0.05, """x : %.1fy : %.1fwidth_x : %.1fwidth_y : %.1f""" %(x, y, width_x, width_y),        fontsize=16, horizontalalignment='right',        verticalalignment='bottom', transform=ax.transAxes)
Out[8]:
<matplotlib.text.Text at 0x7fab9d8a4dd0>

Fitting a power-law to data with errors

Generating the data

Generate some data with noise to demonstrate the fitting procedure. Datais generated with an amplitude of 10 and a power-law index of -2.0.Notice that all of our data is well-behaved when the log is taken... youmay have to be more careful of this for real data.

In [9]:
# Define function for calculating a power lawpowerlaw = lambda x, amp, index: amp * (x**index)########### Generate data points with noise##########num_points = 20# Note: all positive, non-zero dataxdata = np.linspace(1.1, 10.1, num_points)ydata = powerlaw(xdata, 10.0, -2.0)     # simulated perfect datayerr = 0.2 * ydata                      # simulated errors (10%)ydata += np.random.randn(num_points) * yerr       # simulated noisy data

Fitting the data

If your data is well-behaved, you can fit a power-law function by firstconverting to a linear equation by using the logarithm. Then use theoptimize function to fit a straight line. Notice that we are weightingby positional uncertainties during the fit. Also, the best-fitparameters uncertainties are estimated from the variance-covariancematrix. You should read up on when it may not be appropriate to use thisform of error estimation. If you are trying to fit a power-lawdistribution, thissolutionis more appropriate.

In [10]:
########### Fitting the data -- Least Squares Method########### Power-law fitting is best done by first converting# to a linear equation and then fitting to a straight line.# Note that the `logyerr` term here is ignoring a constant prefactor.##  y = a * x^b#  log(y) = log(a) + b*log(x)#logx = np.log10(xdata)logy = np.log10(ydata)logyerr = yerr / ydata# define our (line) fitting functionfitfunc = lambda p, x: p[0] + p[1] * xerrfunc = lambda p, x, y, err: (y - fitfunc(p, x)) / errpinit = [1.0, -1.0]out = optimize.leastsq(errfunc, pinit,                       args=(logx, logy, logyerr), full_output=1)pfinal = out[0]covar = out[1]print pfinalprint covarindex = pfinal[1]amp = 10.0**pfinal[0]indexErr = np.sqrt( covar[1][1] )ampErr = np.sqrt( covar[0][0] ) * amp########### Plotting data##########plt.clf()plt.subplot(2, 1, 1)plt.plot(xdata, powerlaw(xdata, amp, index))     # Fitplt.errorbar(xdata, ydata, yerr=yerr, fmt='k.')  # Dataplt.text(5, 6.5, 'Ampli = %5.2f +/- %5.2f' % (amp, ampErr))plt.text(5, 5.5, 'Index = %5.2f +/- %5.2f' % (index, indexErr))plt.title('Best Fit Power Law')plt.xlabel('X')plt.ylabel('Y')plt.xlim(1, 11)plt.subplot(2, 1, 2)plt.loglog(xdata, powerlaw(xdata, amp, index))plt.errorbar(xdata, ydata, yerr=yerr, fmt='k.')  # Dataplt.xlabel('X (log scale)')plt.ylabel('Y (log scale)')plt.xlim(1.0, 11)
[ 1.00341313 -2.00447676][[ 0.01592265 -0.0204523 ] [-0.0204523   0.03027352]]
Out[10]:
(1.0, 11)
本站仅提供存储服务,所有内容均由用户发布,如发现有害或侵权内容,请点击举报
打开APP,阅读全文并永久保存 查看更多类似文章
猜你喜欢
类似文章
【热】打开小程序,算一算2024你的财运
使用Python进行曲线拟合
确定连续变量的cutoff值 三次样条插值法
数据挖掘流程
matplotlib 必知的 15 个图
ML之NB:基于news新闻文本数据集利用纯统计法、kNN、朴素贝叶斯(高斯/多元伯努利/多项式)、线性判别分析LDA、感知器等算法实现文本分类预测
数量经济学的Python实现(六)|初探卡尔曼滤波器
更多类似文章 >>
生活服务
热点新闻
分享 收藏 导长图 关注 下载文章
绑定账号成功
后续可登录账号畅享VIP特权!
如果VIP功能使用有故障,
可点击这里联系客服!

联系客服