===========================
This version involves significant internal change relative to the last
version, much of which will be invisible to most users. Significant pieces
of lsqfit and gvar were refactored for simplicity, with replacements for a
number of awkward constructions that reflected earlier but now obsolete
ideas about how the code would be used. A somewhat inconvenient change is
renaming the gdev module to gvar (for "gaussian variable"): every
instance of 'gdev' is now replaced by 'gvar', as is every 'GDev' by 'GVar'.
The old names were wrong and therefore misleading. (A tiny 'gdev.py' file
is included that aliases the new names with the old names, for use with old
code.) More usefully, the interfaces for many functions in lsqfit and
especially gvar were made more uniform: for example, almost any gvar
function that took an array of GVars as an argument can now also accept a
single GVar or a dictionary whose values are single GVars or arrays of
GVars. This is motivated by the overall design notion that multidimensional
distributions should be represented by collections of GVars: either as
arrays, or as dictionaries containing GVars and/or arrays of GVars, the
latter providing a much more flexible interface. These changes should make
the modules easier to learn and use, and they certainly make them easier to
maintain.
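
The uniform-interface idea described above can be sketched in a few lines.
This is only an illustration of the dispatch pattern, not gvar's code:
Gaussian is a hypothetical stand-in for GVar, and mean() here is a toy
function that mimics how a function might accept a single object, an array,
or a dictionary of such objects.

```python
from collections import namedtuple

# Hypothetical stand-in for illustration only; this is NOT gvar.GVar.
Gaussian = namedtuple("Gaussian", ["mean", "sdev"])


def mean(g):
    """Return means for a single Gaussian, a (nested) list, or a dictionary."""
    # Check for a single object first: a namedtuple is itself iterable.
    if isinstance(g, Gaussian):
        return g.mean
    # Dictionary values may be single objects or (nested) lists of them.
    if isinstance(g, dict):
        return {key: mean(value) for key, value in g.items()}
    # Anything else is treated as a sequence and handled element by element.
    return [mean(item) for item in g]


a = Gaussian(1.0, 0.1)
b = Gaussian(2.0, 0.5)

single = mean(a)                         # one object in, one number out
array = mean([a, b])                     # list in, list out
mapping = mean({"a": a, "ab": [a, b]})   # dictionary in, dictionary out
```

The result has the same layout as the argument in each case, which is what
lets a dictionary stand in for an array wherever that is more convenient.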
The bigger changes include:
- The names gdev and GDev are everywhere replaced by gvar and GVar (for
"gaussian variable"). A new gdev.py module is included that aliases the
new names to the old names, for use with old code. gdev.py is not
installed with the rest of the code; if you need it (for old code),
install it using, for example, "make install-gdev", or copy it to the
directory containing the old code. Obviously, a better solution is to get
rid of the old names.
- Correctly handles situations where priors are correlated with the fit
data. Previously such correlations were ignored. This is the most
significant change in functionality. Such situations arise rather rarely,
but older versions mishandle them.
- Removed minor bug in lsqfit.wavg (used to ignore svdcut<0).
- Fit functions that depend only on the fit parameters (that is, have no
dependence on an independent "x" variable) are now supported. This is
signaled either by setting x=False in the fit data (data=(x,y)) or by
leaving x out altogether (data=y) in nonlinear_fit.
- Rearranged gvar and lsqfit into packages instead of simple modules. This
makes maintenance easier. It also reduces the number of names added to
the module space.
- Relocated BufferDict into gvar. BufferDicts can still be constructed from
dictionaries but no longer directly from arrays. This makes for a cleaner
data type. BufferDicts are used internally in several of gvar's functions
as the standard dictionary class (the standard array class is a numpy
array). Unlike regular dictionaries, BufferDicts can be pickled even when
filled with GVars; this is currently the only way to pickle GVars.
- Removed class GPrior from lsqfit. It isn't really needed any more since a
dictionary works just as well. (GPrior is now an alias to
gvar.BufferDict, which should allow older code to continue working,
mostly.) Also removed classes BasePrior and NullPrior.
- svdcut and svdnum in nonlinear_fit still specify svd cuts for the fit
data, but now can also specify svd cuts for the prior (there is no other
easy way to do this now that GPriors are effectively gone). To specify a
cut for the prior, make svdcut and/or svdnum into 2-tuples, where the first
entry is for the data and the second is for the prior.
- fit.svdcorrection is a list with one or two elements. Either element can
be a (1-d) vector or None. It can now be used directly as an input to
fmt_errorbudget() (there is no need to put [ ] around it).
- Merged class LSQFit and function nonlinear_fit from lsqfit into a new
class called nonlinear_fit. nonlinear_fit is used as before, but a call to
it now actually initializes an object of the class while doing the fit.
Given standard usage, there was no reason to keep these two separate. (The
old LSQFit class was originally meant to represent a fitter, but was mostly
used to hold the results of a single fit; the new nonlinear_fit class
represents the result of a fit.)
- Redefined gvar.mean, gvar.sdev, gvar.var, gvar.evalcov, gvar.raniter, etc.
so that they all work with dictionaries as well as arrays. The
dictionaries are converted to BufferDicts internally and results are
returned as BufferDicts.
- Renamed fmt_partialsdev to the more understandable fmt_errorbudget. It is
now also a function in module gvar, as well as being a method of
nonlinear_fit objects. The name fmt_partialsdev is retained as an alias,
to benefit older code.
- Allow arguments to GVar.partialvar and GVar.partialsdev to be None or
single GVars or arrays/dictionaries of GVars. Arguments to
gvar.fmt_errorbudget are also now allowed to be None, single GVars or
lists of arrays/dictionaries of GVars. Previously each of these routines
was more restrictive.
- Added a bootstrap_iter function to gvar to create bootstrap copies of
collections of GVars (arrays or dictionaries).
- lsqfit's nonlinear_fit.bootstrap_iter does bootstrap fits on a list of
bootstrap copies of the fit data. Now the list of bootstrapped data can
be omitted and bootstrap copies are generated internally, from the means
and covariance matrix of the data set. This is useful if the data has
small errors (i.e., is gaussian), which is often the case even if the fit
parameters turn out to be non-gaussian (and therefore require
bootstrapping).
- Created new options for gvar.gvar arguments: e.g.,
gvar.gvar(["0(1)",(2,1)]) returns array [gvar(0,1),gvar(2,1)].
- Added new tools in gvar.dataset for handling random samples from
distributions. These include functions avg_data(data),
bootstrap_iter(data), and bin_data(data,binsize), as well as class
Dataset for collecting random samples (in a dictionary). These additions
are meant to supplant the old dataset.py module.
- Internal changes to how the data and covariance matrices are inverted
could lead to small differences in results, due to roundoff error.
- nonlinear_fit.check_roundoff() now issues a warning, rather than an
error, if large roundoff errors are suspected.
- svd analysis is handled by function gvar.svd which is now applied to a
dictionary or array of GVars. It uses class gvar.SVD which is applied to
a covariance matrix.
- nonlinear_fit.kappa no longer exists. It can be obtained using gvar.SVD.
- Renamed nonlinear_fit.dump_parameters to nonlinear_fit.dump_pmean. Also
added nonlinear_fit.dump_p and nonlinear_fit.load_parameters.
- Documentation streamlined. The Overview and Tutorial section was
simplified a little, and has a new section on Troubleshooting.
- Speed is about the same except in cases where there are correlations
between the priors and the fit data (where it is somewhat slower now,
because it is doing the right thing).
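
The bootstrap items above mention generating bootstrap copies from the means
and covariance matrix of a data set. The sketch below shows one standard way
to do this, using a Cholesky factor of the covariance matrix and only the
Python standard library. It is an illustration under stated assumptions, not
the code used by gvar or lsqfit: the names cholesky and bootstrap_copies, and
the sample numbers, are hypothetical.

```python
import math
import random


def cholesky(cov):
    """Return lower-triangular L with L L^T = cov (cov positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def bootstrap_copies(mean, cov, n, rng):
    """Yield n gaussian draws with the given means and covariance matrix."""
    low = cholesky(cov)
    for _ in range(n):
        # Independent unit gaussians, correlated by multiplying with L.
        z = [rng.gauss(0.0, 1.0) for _ in mean]
        yield [m + sum(low[i][k] * z[k] for k in range(i + 1))
               for i, m in enumerate(mean)]


# Hypothetical two-point data set: means and covariance matrix.
rng = random.Random(12345)
y_mean = [1.0, 2.0]
y_cov = [[0.04, 0.01], [0.01, 0.09]]
copies = list(bootstrap_copies(y_mean, y_cov, 2000, rng))
```

Each element of copies is a new set of data values that could be refit; the
spread of the resulting fit parameters then estimates their distribution.
This is appropriate only when the data itself is well described by a
gaussian, as the item above notes.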
Created by G. Peter Lepage (Cornell University) on 2012-04-29.
Copyright (c) 2008-2017 G. Peter Lepage.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
any later version (see <http://www.gnu.org/licenses/>).
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.