Exhibit

Latest version: v0.9.9

0.9.9

Minor version release updating dependencies.

Notable version upgrades
- The default Python version used in automated tests is now 3.12
- pandas updated to version 2.2.2
- SQLAlchemy updated to the 2.x API
- NumPy updated to the 2.x API

0.9.8

Enhancements
- Experimental support for using SQL to generate anonymising sets of values. This feature is available for all column types except numerical.
- `make_distinct` custom action now works on date columns.
- You can now easily add a column with current date and time by using `'sysdate'` as a derived column.
- Pseudo-CHI numbers can now be generated by passing `pseudo_chi` as the anonymising set for UUID columns.
- Numerical column weights for categorical values are now optional. This should speed up the process of manually composing a specification.

Bug fixes
- Minor bugs fixed in `shift_distribution`, `make_outlier` and `make_distinct`.
- Fixed a bug in regex distribution where the target number of uniques wasn't respected.
- Fixed date column not being recognized if source data had missing values.

Package version upgrades
- Python version changed to 3.10
- pandas updated to version 2.x

0.9.7

Enhancements
- Using Exhibit as an importable library is now easier. Please see the scripting recipe for more details and examples.
- `anon.db` is now called `exhibit.db`. You can also now use third-party databases to store associated specification / aliasing data, as long as you have the required SQLAlchemy dialect installed. Set the `EXHIBIT_DB_SCHEMA` and `EXHIBIT_DB_URL` environment variables and Exhibit will use those instead of the local `exhibit.db`.
- New custom action & filter pairs: `shift_distribution_right / left` and `COLUMN_NAME with_high / low_frequency`
- You can now save probabilities for columns you marked as linked in the CLI.
- UUID columns can now be generated using incrementing integer values by setting `anonymising_set` to `range`. You can also set different seeds for each UUID column.
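
The two environment variables named above could be set from Python before Exhibit runs, for example in a script that uses it as a library. This is only a minimal sketch: the variable names come from the release note, while the URL, the schema name and the helper `point_exhibit_at_db` are hypothetical placeholders, and it assumes Exhibit reads the variables when it runs.

```python
import os


def point_exhibit_at_db(db_url: str, db_schema: str) -> None:
    """Hypothetical helper: export the two variables named in the release note."""
    os.environ["EXHIBIT_DB_URL"] = db_url
    os.environ["EXHIBIT_DB_SCHEMA"] = db_schema


# Placeholder values; substitute a SQLAlchemy-style URL for your own database.
point_exhibit_at_db(
    "postgresql://user:password@localhost:5432/mydb",
    "exhibit",
)
```

Setting the variables in the shell before launching Exhibit should have the same effect; the Python form is shown only because it fits scripted use.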

Bug fixes
- Improved the calculation of weights for numerical columns.
- Various other minor bug fixes and improvements to error messages.

Package version upgrades
- Added `scipy` and `sqlalchemy` as dependencies.

0.9.6

Enhancements
- Added experimental support for using pickled machine learning models as plug-ins. See the `Create Exhibit-compatible ML model.ipynb` recipe for details.
- Added an option to save probabilities of values in columns that are put into the DB.
- Added performance and memory benchmarking.
- You can now reference custom lookups you added to the DB directly in the specification as long as specification columns and DB columns match.
- Added a `make_almost_same` custom action.

Bug fixes
- Fixed a bug where generating a dataset without any categorical columns would give an error.
- Fixed a DB bug that gave missing data an equal chance to appear for columns where number of uniques exceeded the in-line limit.

Package version upgrades
- Added `dill` as a dependency.

0.9.5

Enhancements
- Added 4 new custom actions to manipulate timeseries: given a numerical column and a timeseries column, create artificial skew (left or right) or add peak / valley.
- `generate_as_sequence` custom action has a new variant that lets you generate repeated sequences of values in the order that they appear in the spec, regardless of the probability vector.
- You can now apply a single custom action to multiple columns by providing them as a comma-separated target string. The same applies to actions. The processing of custom constraints happens in the order in which column names / actions were specified.

Bug fixes
- Fixed an issue where custom constraints wouldn't always respect original column types (float or Int64).
- Fixed an issue where column values generated from a regular expression pattern were inadvertently repeated under certain conditions.
- Fixed a bug with missing values in user linked columns.
- Fixed a bug that could result in linked column groups being in different order when re-running the generation of the same specification.

Package version upgrades
- numpy bumped to 1.22.

0.9.4

Enhancements
- Added experimental support for generating geospatial data. You can now generate point geometry with latitude and longitude coordinates sampled from H3 hexagons.
- Additionally, you can create random, but geographically-valid regions to match the partitions in the data. This is done using a new custom action called `geo_make_regions`.
- You can now add noise to user linked column groups. Rather than mirroring the original relationships exactly, links can be formed between random column values based on a specified probability.
- Added a new custom action: `generate_as_sequence`. This action is useful when generated values must follow a specific order within a partition, such as vaccine doses administered to an individual: "full schedule" before "booster". This differs from sorting because sorting happens after the data has been generated, whereas `generate_as_sequence` ensures that "booster" is never generated by itself, only when preceded by "full schedule".
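
To illustrate how sequence generation differs from sorting, here is a minimal plain-Python sketch of the idea. It is not Exhibit's implementation: the function name, the partition sizes and the rule that a partition cannot hold more rows than the sequence has values are all assumptions made for this example.

```python
def generate_as_sequence_sketch(partition_sizes, ordered_values):
    """Fill each partition with the leading values of ordered_values, in order.

    partition_sizes maps a partition key (e.g. a person ID) to its row count.
    A later value is only emitted after every earlier value has been emitted.
    """
    rows = []
    for key, size in partition_sizes.items():
        if size > len(ordered_values):
            # Assumption for this sketch: no repetition of the sequence.
            raise ValueError(f"partition {key!r} has more rows than the sequence")
        for value in ordered_values[:size]:
            rows.append((key, value))
    return rows


doses = generate_as_sequence_sketch(
    {"person_A": 1, "person_B": 2},
    ["full schedule", "booster"],
)
```

Here `person_A`, with a single row, receives only "full schedule", while `person_B` receives "full schedule" followed by "booster"; no partition can contain "booster" on its own.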

Bug fixes
- Fixed an issue where missing values could be generated in the same rows for different columns rather than independently.
- Fixed a number of issues around the nullable integer type.
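
The first fix above concerns independence of missing values between columns. As a rough illustration of what "independently" means here, the sketch below draws a separate missing-value mask per column using only the standard library. The function name, column names and probability are hypothetical, and this is not how Exhibit itself generates missing data.

```python
import random


def independent_missing_masks(columns, n_rows, miss_probability, seed=0):
    """Draw a separate True/False 'is missing' mask for every column."""
    rng = random.Random(seed)
    return {
        # Each column gets its own sequence of draws, so rows are not shared.
        column: [rng.random() < miss_probability for _ in range(n_rows)]
        for column in columns
    }


masks = independent_missing_masks(["age", "sex"], n_rows=1000, miss_probability=0.1)
```

Reusing a single mask for every column would reproduce the behaviour described in the bug, with missing values always landing in the same rows.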

Package version upgrades
- pandas bumped to 1.4.2.
- numpy bumped to 1.21.5.
- PyYAML bumped to 6.0.
