Dataprofiler

Latest version: v0.13.3

0.6.0

Profiler
* Structured Profiler can now take in duplicate columns 315
- This is an API change to how data in the report is accessed: `data_stats` is now a list
* Categorical Profile now includes top 5 counts 299
* Add new categorical statistics: gini impurity and unalikeability 308, 320
* Unstructured Data Labeler profile now includes entity percentages 305
* Add Pearson's correlation to the Structured Profiler 281, 307, 317
* Unstructured Profiler Text vocab now outputs the top k vocab counts 304, 314
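
To make the new categorical statistics concrete, here is a minimal sketch of top-k counts, Gini impurity, and unalikeability computed from raw category values. It uses common textbook definitions and plain Python; the function names are hypothetical and this is not presented as the library's own implementation.

```python
from collections import Counter


def top_k_counts(values, k=5):
    """Return the k most frequent categories with their counts."""
    return Counter(values).most_common(k)


def gini_impurity(values):
    """Gini impurity: 1 - sum(p_i^2) over category proportions p_i."""
    counts = Counter(values)
    n = sum(counts.values())
    if n == 0:
        return None  # undefined for an empty column
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def unalikeability(values):
    """Share of ordered pairs of distinct observations whose categories differ."""
    counts = Counter(values)
    n = sum(counts.values())
    if n < 2:
        return 0.0  # assumption: no pairs means no unalikeability
    return sum(c * (n - c) for c in counts.values()) / (n * n - n)


column = ["a", "a", "b", "b"]
print(top_k_counts(column))
print(gini_impurity(column))
print(unalikeability(column))
```

Both measures are 0 when every value falls in one category and grow as values spread more evenly across categories.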

Runtime Changes
* Categorical Profiler keeps an internal count of categories 296
* Text in Unstructured Profiler now keeps a count of vocab 304
* Data Reader's `is_match` function can now take in StringIO/BytesIO 292, 306, 326

Bug fixes
* Bug fix to make sure samples being stored by UnstructuredProfiler save 313

Other Changes
* Documentation on contributions added 310, 311, 312, 333
* GitHub Pages updated 309, 316, 322, 323, 329, 330, 331, 334

0.5.3

Bug fixes
* Remove unused import causing profiler error 290

0.5.2

Profiler
* A library-level seed can now be set by the user with `dp.set_seed` to make sampling during profiling deterministic 271
* NumericalStats now include *skewness*, *kurtosis*, *Count Zeros*, and *Count Negatives* 266, 267, 272, 273
* User can turn off bias correction for variance, skewness, and kurtosis 269
* Sum is returned in NumericalStats Profiles 264
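
As an illustration of what turning bias correction on or off means for these statistics, the sketch below computes variance and skewness both ways using standard formulas. The function names and the `bias_correction` parameter are assumptions made for this example and are not taken from the library's API.

```python
import math


def variance(data, bias_correction=True):
    """Sample variance; divides by n - 1 when bias correction is on, else by n."""
    n = len(data)
    mean = sum(data) / n
    squared_dev = sum((x - mean) ** 2 for x in data)
    return squared_dev / (n - 1 if bias_correction else n)


def skewness(data, bias_correction=True):
    """Skewness g1 = m3 / m2^1.5, optionally scaled to the adjusted estimator G1."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    g1 = m3 / m2 ** 1.5
    if not bias_correction:
        return g1
    # Adjusted Fisher-Pearson coefficient; requires n > 2.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


data = [1, 2, 3, 4, 10]
print(variance(data, bias_correction=False), variance(data))
print(skewness(data, bias_correction=False), skewness(data))
```

With small samples the corrected and uncorrected values differ noticeably; the gap shrinks as the sample grows.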

Runtime Changes
* Warnings will be issued when invalid values are received by the NumericalStats profilers 280

Bug fixes
* Default values for variance, skewness, and kurtosis are `np.nan` 275
* Options no longer propagate to all levels when setting a single level property unless a wildcard is specified e.g. `*.is_enabled` 270
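
The wildcard behavior described above can be pictured with a toy model. This sketch uses a flat dictionary of dotted keys and `fnmatch` pattern matching; the option keys and the `set_options` helper are hypothetical and do not reflect how the library stores or applies options internally.

```python
from fnmatch import fnmatch


def set_options(options, updates):
    """Apply updates to a flat {dotted_key: value} dict.

    A key containing '*' is treated as a wildcard and applied to every
    matching option; any other key changes exactly one option.
    """
    for pattern, value in updates.items():
        if "*" in pattern:
            for key in options:
                if fnmatch(key, pattern):
                    options[key] = value
        else:
            if pattern not in options:
                raise KeyError(pattern)
            options[pattern] = value
    return options


# Hypothetical option keys, for illustration only.
options = {
    "int.is_enabled": True,
    "float.is_enabled": True,
    "text.is_enabled": True,
}

set_options(options, {"int.is_enabled": False})  # touches one level only
print(options)

set_options(options, {"*.is_enabled": False})  # wildcard touches all levels
print(options)
```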

Other Changes
* Documentation on contributions added 268
* GitHub Pages updated 284, 285, 287, 288

0.5.1

Bug fixes
* Fix merging UnstructuredProfiler 255
* Fix bug in saving profiles without a labeler 257

Other Changes
* Documentation: Add UnstructuredProfiler examples 252

0.5.0

Runtime Changes

Major release: unstructured profiles can now be generated

Profiler

* Unstructured Profiler enabled, profiles can be generated on the TextData class
* Factory Class automatically selects UnstructuredProfiler vs StructuredProfiler

0.4.6

Bug fixes

* Fix histogram index out of range 217
* Lock the TensorFlow requirement to < 2.5.0; TensorFlow 2.5.0 has an issue 220
* Remove deprecated AVRO file formats 220
* Fix padding issue related to numpy 225
* Remove pad in output of labeler 226

Other Changes

* Histogram utils now use the built-in NumPy functions 213