Formulaic

Latest version: v1.0.2


Page 3 of 4

0.5.0

This is a major new release with some minor API changes, some ergonomic
improvements, and a few bug fixes.

**Breaking changes:**

* Accessing named substructures of `Formula` objects (e.g. `formula.lhs`) no
longer returns a list of terms. It now returns a `Formula` object, so that the
helper methods remain accessible. You can access the raw terms by
iterating over the formula (`list(formula)`) or by looking up the root node
(`formula.root`).

**New features and improvements:**

* The `ModelSpec` object is now the source of truth in all `ModelMatrix`
generations, and can be constructed directly from any supported specification
using `ModelSpec.from_spec(...)`. Supported specifications include formula
strings, parsed formulae, model matrices and prior model specs.
* The `.get_model_matrix()` helper methods across the `Formula`,
`FormulaMaterializer`, and `ModelSpec` objects and the `model_matrix` helper
function are now consistent, and all use `ModelSpec` directly under the hood.
* When accessing substructures of `Formula` objects (e.g. `formula.lhs`), the
term lists will be wrapped as trivial `Formula` instances rather than returned
as raw lists (so that the helper methods like `.get_model_matrix()` can still
be used).
* `FormulaSpec` is now exported from the top-level module.

**Bugfixes and cleanups:**

* Fixed `ModelSpec` specifications being overridden by default arguments to
`FormulaMaterializer.get_model_matrix`.
* `Structured._flatten()` now correctly flattens unnamed substructures.

0.4.0

This is a major new release with some new features, greatly improved ergonomics
for structured formulae, matrices and specs, and a few small breaking changes
(most with backward compatibility shims). All users are encouraged to upgrade.

**Breaking changes:**

* `include_intercept` is no longer an argument to `FormulaParser.get_terms`,
and is instead an argument of the `DefaultFormulaParser` constructor. If you
want to modify the `include_intercept` behaviour, please use:
```python
Formula("y ~ x", _parser=DefaultFormulaParser(include_intercept=False))
```

* Accessing terms via `Formula.terms` is deprecated now that `Formula` is a
subclass of `Structured[List[Terms]]`. You can directly iterate over, and/or
access nested structure on, the `Formula` instance itself. `Formula.terms`
remains available as a deprecated property that returns a reference to the
`Formula` instance itself in order to support legacy use-cases. It will be
removed in 1.0.0.
* `ModelSpec.feature_names` and `ModelSpec.feature_columns` are deprecated in
favour of `ModelSpec.column_names` and `ModelSpec.column_indices`. Deprecated
properties remain in-place to support legacy use-cases. These will be removed
in 1.0.0.
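
The backward compatibility shims described above follow a common pattern: the old attribute stays in place, emits a deprecation warning, and forwards to the new access path. The following is a minimal sketch of that pattern using only the standard library. `TermCollection` is a hypothetical stand-in, not a class from `formulaic`, and the actual shim in the library may be implemented differently.

```python
import warnings


class TermCollection(list):
    """Hypothetical container that used to expose its items via `.terms`."""

    @property
    def terms(self):
        # Legacy accessor: warn, then hand back the container itself so
        # that old code such as `obj.terms[0]` keeps working.
        warnings.warn(
            "`.terms` is deprecated; use the container directly.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self


collection = TermCollection(["1", "x", "x:z"])

# Record warnings so that the deprecation notice can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_view = collection.terms

print(legacy_view is collection)
print(caught[0].category.__name__)
```

Returning the object itself keeps legacy call sites working without copying any data, while the warning gives users time to migrate before the attribute is removed.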

**New features and enhancements:**

* Structured formulae (and their derived matrices and specs) are now mutable.
Internally `Formula` has been refactored as a subclass of
`Structured[List[Terms]]`, and can be incrementally built and modified. The
matrix and spec outputs now have explicit subclasses of `Structured`
(`ModelMatrices` and `ModelSpecs` respectively) to expose convenience methods
that allow these objects to be largely used interchangeably with their
singular counterparts.
* `ModelMatrices` and `ModelSpecs` are now surfaced as top-level exports of the
`formulaic` module.
* `Structured` (and its subclasses) gained improved integration of nested tuple
structure, as well as support for flattened iteration, explicit mapping
output types, and lots of cleanups.
* `ModelSpec` was made into a dataclass, and gained several new
properties/methods to support better introspection and mutation of the model
spec.
* `FormulaParser` was renamed `DefaultFormulaParser`, and made a subclass of the
new formula parser interface `FormulaParser`. In this process
`include_intercept` was removed from the API, and made an instance attribute
of the default parser implementation.

**Bugfixes and cleanups:**

* Fixed AST evaluation for large formulae that caused the evaluation to hit the
recursion limit.
* Fixed sparse categorical encoding when the dataframe index is not the standard
range index.
* Fixed a bug in the linear constraints parser when more than two constraints
were specified in a comma-separated string.
* Avoided implicitly changing the sparsity structure of CSC matrices.
* If manually constructed `ModelSpec`s are provided by the user during
materialization, they are updated to reflect the output-type chosen by the
user, as well as whether to ensure full rank/etc.
* Allowed use of older pandas versions. All versions >=1.0.0 are now supported.
* Various linting cleanups as `pylint` was added to the CI testing.
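
To illustrate why large formulae can hit the recursion limit, and one general way around it, here is a small standard-library sketch. A long sum such as `x0 + x1 + ...` naturally parses into a left-deep tree, so a recursive walker needs one stack frame per term. Walking the tree with an explicit stack avoids that. The tuple-based tree and the function names are assumptions made for this example; this is not the code used inside `formulaic`.

```python
import sys


def build_left_deep_sum(names):
    """Build the tree for `a + b + c + ...` as nested ("+", left, right) tuples."""
    tree = names[0]
    for name in names[1:]:
        tree = ("+", tree, name)
    return tree


def collect_terms(tree):
    """Collect leaf names left-to-right using an explicit stack, not recursion."""
    terms = []
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):
            _, left, right = node
            # Push the right operand first so the left one is handled first.
            stack.append(right)
            stack.append(left)
        else:
            terms.append(node)
    return terms


# Use more terms than the interpreter's recursion limit allows frames.
names = [f"x{i}" for i in range(sys.getrecursionlimit() * 2)]
tree = build_left_deep_sum(names)
print(collect_terms(tree) == names)
```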

**Documentation:**

* Apart from the `.materializer` submodule, most code now has inline
documentation and annotations.

0.3.4

This is a backward compatible major release that adds several new features.

**New features and enhancements:**

* Added support for customizing the contrasts generated for categorical
features, including treatment, sum, deviation, helmert and custom contrasts.
* Added support for the generation of linear constraints for `ModelMatrix`
instances (see `ModelMatrix.model_spec.get_linear_constraints`).
* Added support for passing `ModelMatrix`, `ModelSpec` and other formula-like
objects to the `model_matrix` sugar method so that pre-processed formulae can
be used.
* Improved the way tokens are manipulated for the right-hand-side intercept and
substitutions of `0` with `-1` to avoid substitutions in quoted contexts.
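
For readers unfamiliar with contrast coding, the sketch below builds three of the coding schemes named above for a categorical feature with `k` levels, in plain Python. The function names are hypothetical, and the matrices follow common textbook conventions (Helmert as in R's `contr.helmert`). The defaults, scaling, and reference levels used by `formulaic` may differ.

```python
def treatment_contrasts(k, reference=0):
    """k x (k-1) matrix: the reference level is all zeros, others get one indicator."""
    others = [level for level in range(k) if level != reference]
    return [[1 if level == other else 0 for other in others] for level in range(k)]


def sum_contrasts(k):
    """k x (k-1) matrix: identity rows, with the last level coded as -1 throughout."""
    rows = [[1 if i == j else 0 for j in range(k - 1)] for i in range(k - 1)]
    rows.append([-1] * (k - 1))
    return rows


def helmert_contrasts(k):
    """k x (k-1) matrix comparing each level with the levels before it."""
    rows = []
    for i in range(k):
        row = []
        for j in range(k - 1):
            if i <= j:
                row.append(-1)
            elif i == j + 1:
                row.append(j + 1)
            else:
                row.append(0)
        rows.append(row)
    return rows


# Each row is the encoding of one level; each column is one model-matrix column.
for name, builder in [
    ("treatment", treatment_contrasts),
    ("sum", sum_contrasts),
    ("helmert", helmert_contrasts),
]:
    print(name, builder(3))
```

In the sum and Helmert schemes every column sums to zero across levels, which is what distinguishes them from treatment coding with its all-zero reference row.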

**Bugfixes and cleanups:**

* Fixed variable sanitization during evaluation, allowing variables with
special characters to be used in Python transforms; for example:
`` bs(`my|feature%is^cool`) ``.
* Fixed the parsing of dictionaries and sets within Python expressions in the
formula; for example: `C(x, {"a": [1,2,3]})`.
* Bumped requirement on `astor` to >=0.8 to fix issues with AST generation in
Python 3.8+ when numerical constants are present in the parsed Python
expression (e.g. `bs(x, df=10)`).
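
As background for the last two fixes, the sketch below uses Python's standard `ast` module to show how the two example expressions look once parsed: the dictionary argument becomes an `ast.Dict` node, and on Python 3.8+ the numeric keyword value is an `ast.Constant` node. This only inspects the standard library's view of the expressions. It does not go through `formulaic` or `astor`, and the variable names are chosen for illustration.

```python
import ast

# Parse the categorical example and pick out the dictionary argument.
call = ast.parse('C(x, {"a": [1, 2, 3]})', mode="eval").body
dict_arg = call.args[1]
print(type(dict_arg).__name__)
print(ast.literal_eval(dict_arg))

# Parse the spline example and pick out the numeric keyword argument.
spline = ast.parse("bs(x, df=10)", mode="eval").body
df_keyword = spline.keywords[0]
print(df_keyword.arg, type(df_keyword.value).__name__, df_keyword.value.value)
```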

0.3.3

This is a minor patch release that migrates the package tooling to [poetry](https://python-poetry.org/), which resolves a version inconsistency when packaging for conda.

0.3.2

This is a minor patch release that fixes an attempt to import `numpy.typing` when the installed numpy is older than version 1.20. (Thanks to bashtage for noticing and fixing this.)

0.3.1

This is a minor patch release that fixes how output types, NA handling, and full-rank enforcement are preserved for factors that evaluate to pre-encoded columns when constructing a model matrix from a pre-defined `ModelSpec`. The benchmarks were also updated.


© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.