Formulaic

Latest version: v1.1.1


1.1.1

**New features and enhancements:**

* `Formula.differentiate()` is now considered stable, with
`ModelMatrix.differentiate()` to follow in a future release. (236)

**Bugfixes and cleanups:**

* Fixed a regression introduced in v1.1.0 regarding ordering of terms in a
differentiated formula. (236)

1.1.0

This is a major feature release that was motivated in many respects by the migration of `statsmodels` from `patsy` to `formulaic`. Many thanks to bashtage for driving those invasive changes forward. There are some semantic breaking changes, but unless you are deep in the internals of `formulaic` (which I do not believe to be the case for any external library) these are not expected to break common usage.

**Breaking changes:**

- `Formula` is no longer always "structured" with special cases to handle the
case where it has no structure. Legacy shims have been added to support old
patterns, with `DeprecationWarning`s raised when they are used. It is not
expected to break anyone not explicitly checking whether the `Formula.root` is
a list instance (which formerly should have been simply assumed) [it is now a
`SimpleFormula` instance that acts like an ordered sequence of `Term`
instances].
- The column names associated with categorical factors have changed. Previously,
a prefix was unconditionally added to the level in the column name like
`feature[T.A]`, whether or not the encoding would result in that term acting
as a contrast. Now, in keeping with `patsy`, we only add the prefix if the
categorical factor is encoded with reduced rank. Otherwise, `feature[A]` will
be used instead.
- `formulaic.parsers.types.structured` has been promoted to
`formulaic.utils.structured`.
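
The new naming rule for categorical columns can be illustrated with a small sketch. This is not `formulaic`'s implementation; the helper `categorical_column_name` and its `reduced_rank` flag are hypothetical names used only to show the rule as described above.

```python
def categorical_column_name(factor: str, level: str, reduced_rank: bool) -> str:
    """Hypothetical helper mirroring the naming rule in the note above.

    The `T.` prefix is added only when the factor is encoded with reduced
    rank, i.e. when the column acts as a contrast.
    """
    prefix = "T." if reduced_rank else ""
    return f"{factor}[{prefix}{level}]"


# Reduced-rank encoding: the columns are contrasts, so they carry the prefix.
print(categorical_column_name("feature", "A", reduced_rank=True))

# Full-rank encoding: plain indicator columns, no prefix.
print(categorical_column_name("feature", "A", reduced_rank=False))
```

Code that matches on column names such as `feature[T.A]` may need to handle both forms after upgrading.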

**New features and enhancements:**

- `Formula` now instantiates to `SimpleFormula` or `StructuredFormula`, the
latter being a tree-structure of `SimpleFormula` instances (as compared to
`List[Term]` previously). This simplifies various internal logic and makes the
propagation of formula metadata more explicit. (222)
- Added support for restricting the set of features used by the default formula
parser so that libraries can more easily restrict the structure of output
formulae. (207)
- `dict` and `recarray` types are now associated with the `pandas` materializer
by default (rather than raising), simplifying some user workflows. (225)
- Added support for the `.` operator (which is replaced with all variables not
used on the left-hand-side of formulae). (216)
- Added **experimental** support for nested formulae of form `[ ... ~ ... ]`.
This is useful for (e.g.) generating formulae for IV 2SLS. (108)
- Added support for subsetting `ModelSpec[s]` based on an arbitrary
strictly reduced `FormulaSpec`. (208)
- Added `Formula.required_variables` to more easily surface the expected data
requirements of the formula. (205)
- Added support for extracting rows dropped during materialization. (197)
- Added support for cyclic (`cc`) and natural (`cr`) cubic splines. See
`formulaic.materializers.transforms.cubic_spline.cubic_spline` for
more details.
- Added a `lag()` transform.
- Constructing `LinearConstraints` can now be done from a list of strings (for
increased parity with `patsy`). (201)
- Categorical factors are now preceded with (e.g.) `T.` when they actually
describe contrasts (i.e. when they are encoded with reduced rank). (220)
- Contrasts metadata is now added to the encoder state via `encode_categorical`,
which is surfaced via `ModelSpec.factor_contrasts`. (204)
- `Operator` instances now receive `context` which is optionally specified by
the user during formula parsing, and updated by the parser. This is what makes
the `.` implementation possible. (216)
- Given the generic usefulness of `Structured`, it has been promoted to
`formulaic.utils`. (223)
- Added explicit support and testing for Python 3.13. (202)
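
As an illustration of the `.` operator described in the list above, here is a minimal sketch of the substitution it performs. It only models the stated semantics (all variables not used on the left-hand side). The helper `expand_dot` is a hypothetical name, and how `formulaic` itself discovers the available variables is not shown here.

```python
from typing import Iterable, List


def expand_dot(lhs_variables: Iterable[str], available: Iterable[str]) -> List[str]:
    """Hypothetical helper: return the variables that `.` stands for.

    Keeps every available variable that is not used on the left-hand side,
    preserving the order in which the variables were supplied.
    """
    used = set(lhs_variables)
    return [name for name in available if name not in used]


columns = ["y", "a", "b", "c"]
rhs = " + ".join(expand_dot(["y"], columns))
print(f"y ~ {rhs}")  # prints "y ~ a + b + c"
```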

**Bugfixes and cleanups:**

- Fixed nested ordering of `Formula` instance. (200)
- Allow Python tokens to contain multiple chained parentheses and brackets without using
quotes as long as the parentheses are balanced. (214, 218)
- Reduced the number of redundant initialisation operations in `Structured`
instances. (200)
- Fixed pickling `ModelMatrix` and `FactorValues` instances (whenever wrapped
objects are picklable). (209; thanks bashtage)
- `basis_spline`: Fixed evaluation involving datasets with null values, and
disallow out-of-bounds knots. (217; thanks bashtage)
- Improved robustness of data contexts involving PyArrow datasets.
- We now use the same sentinels throughout the code-base, rather than having
module-specific sentinels in some places.
- Migrated to `ruff` for linting, and updated `mypy` and `pre-commit` tooling.
- Fixes from `ruff` are now applied automatically when using
`hatch run lint:format`.

**Documentation:**

- Fixed and updated docsite build, as well as other minor tweaks.

1.0.2

**Bugfixes and cleanups:**

* Fix compatibility with `pandas` >=3.
* Fix `mypy` type inference in materializer subclasses.

**Documentation:**

* Add column name extraction to `sklearn` integration example.
* Add section to allow users to indicate their usage of formulaic.

1.0.1

This is identical to v1.0.0, but with the package status marked as production/stable rather than beta [**facepalm**].

1.0.0

This is the first officially stable release of formulaic, with a relatively small diff from the 0.6.x series.

**Breaking changes:**

* Python tokens are now canonically formatted (see below).
* Methods deprecated during the 0.x series have been removed: `Formula.terms`,
`ModelSpec.feature_names`, and `ModelSpec.feature_indices`.

**New features and enhancements:**

* Python tokens are now sanitized and canonically formatted to prevent
ambiguities and better align with `patsy`.
* Added official support for Python 3.12 (no code changes were necessary).
* Added the `hashed` transform for categorically encoding deterministically
hashed representations of a dataset. [Contributed by rishi-kulkarni]
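
The idea behind deterministic hashing for categorical encoding can be sketched with the standard library alone. This is not the `hashed` transform's implementation or signature; the helper `hashed_level`, its `levels` parameter, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def hashed_level(value: object, levels: int = 8) -> int:
    """Hypothetical helper: map a value to one of `levels` buckets.

    A cryptographic digest is used instead of the built-in `hash()`, whose
    results for strings can differ between interpreter runs.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % levels


# Equal inputs always land in the same bucket, so the bucket index can be
# treated as a categorical level with a fixed, bounded number of columns.
buckets = [hashed_level(v, levels=4) for v in ["apple", "banana", "apple"]]
print(buckets)
```

The trade-off of any such scheme is that distinct values can collide in the same bucket, so the number of levels bounds how much information is preserved.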

**Bugfixes and cleanups:**

* Fixed transform state not propagating correctly when Python code tokens were
not canonically formatted.
* Literals in formulae will no longer be silently ignored, and feature scaling
is now fully supported.
* Improved code parsing and formatting utilities and dropped the requirement for
`astor` for Python 3.9 and newer.
* Fixed all warnings emitted during unit tests.

**Documentation:**

* Removed incompleteness warnings.
* Added some lightweight developer documents.
* Fixed some broken links.

0.6.6

This is a minor release with one important bugfix.

**Bugfixes and cleanups:**

* Fixes a regression introduced by 0.6.4 whereby missing variables would be
silently dropped from the formula, rather than raising an exception.
