Dbldatagen

Latest version: v0.3.6



0.3.5

Changed
* Added formatting of generated code as HTML for script methods
* Allow use of inferred types on `withColumn` method when `expr` attribute is used
* Added `withStructColumn` method to allow simplified generation of struct and JSON columns
* Modified pipfile to use newer version of package specifications

0.3.4

Changed
* Modified option to allow a range when specifying `numFeatures` with `structType='array'`, allowing generation
of a varying number of columns
* When generating multi-column or array valued columns, compute random seed with different name for each column
* Additional build ordering enhancements to reduce circumstances where explicit base column must be specified

Added
* Scripting of data generation code from schema (Experimental)
* Scripting of data generation code from dataframe (Experimental)
* Added top level `random` attribute to data generator specification constructor

0.3.3post2

Changed
* Fixed use of logger in _version.py and in spark_singleton.py
* Fixed template issues
* Document reformatting and updates, related code comment changes

Fixed
* Apply pandas optimizations when generating multiple columns using same `withColumn` or `withColumnSpec`

Added
* Added use of prospector to build process to validate common code issues

0.3.2

Changed
* Adjusted column build phase separation (i.e. which select statement is used to build columns) so that a
column with a SQL expression can refer to previously created columns without use of a `baseColumn` attribute
* Changed build labelling to comply with PEP440
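
As an illustration of the PEP 440 labelling mentioned above, the following is a minimal sketch that normalizes and orders the release labels appearing in this changelog. It covers only plain releases and post-releases, not the full PEP 440 grammar, and the helper names `normalize` and `sort_key` are hypothetical, not part of dbldatagen.

```python
import re

# Simplified subset of PEP 440: optional "v" prefix, a dotted release segment,
# and an optional post-release segment. Pre-releases, dev releases, epochs and
# local versions are deliberately not handled in this sketch.
_VERSION_RE = re.compile(
    r"^v?(?P<release>\d+(?:\.\d+)*)(?:[.\-_]?post[.\-_]?(?P<post>\d+))?$"
)


def _parse(version):
    """Match a version label against the simplified pattern or raise ValueError."""
    match = _VERSION_RE.match(version.strip().lower())
    if match is None:
        raise ValueError(f"not a supported version label: {version!r}")
    return match


def normalize(version):
    """Return the normalized form, e.g. '0.3.3post2' becomes '0.3.3.post2'."""
    match = _parse(version)
    post = match.group("post")
    suffix = f".post{post}" if post is not None else ""
    return match.group("release") + suffix


def sort_key(version):
    """Key that orders a post-release directly after its base release."""
    match = _parse(version)
    release = tuple(int(part) for part in match.group("release").split("."))
    post = int(match.group("post")) if match.group("post") is not None else -1
    return release, post


labels = ["0.3.4", "0.3.3post2", "0.3.2", "0.3.5"]
ordered = sorted(labels, key=sort_key)
```

For real projects, the `packaging` library provides a complete PEP 440 implementation; the sketch above only shows why a label such as `0.3.3post2` sorts between `0.3.2` and `0.3.4`.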

Fixed
* Fixed compatibility of build with older versions of runtime that rely on `pyparsing` version 2.4.7

Added
* Parsing of SQL expressions to determine column dependencies
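
To make the idea of deriving column dependencies from a SQL expression concrete, here is a deliberately naive sketch. It is not the library's parser: it simply strips string literals and then looks for identifiers that match already-defined column names. The function name `find_column_dependencies` and the sample column names are assumptions made for illustration.

```python
import re

# Identifiers: a letter or underscore followed by letters, digits or underscores.
_IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Single-quoted SQL string literals, allowing '' as an escaped quote.
_STRING_LITERAL_RE = re.compile(r"'(?:[^']|'')*'")


def find_column_dependencies(expr, known_columns):
    """Return known column names referenced in expr, in order of first use."""
    # Remove string literals so that text inside quotes is not mistaken for a column.
    without_literals = _STRING_LITERAL_RE.sub(" ", expr)
    known = set(known_columns)
    found = []
    for token in _IDENTIFIER_RE.findall(without_literals):
        if token in known and token not in found:
            found.append(token)
    return found


deps = find_column_dependencies(
    "concat(first_name, ' ', last_name)",
    ["id", "first_name", "last_name"],
)
```

A real implementation would need to handle quoted identifiers, comments and nested field references, which this sketch ignores.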

Notes
* The enhancements to build ordering do not change the actual order of column building,
but adjust which phase columns are built in
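
One way to picture the note above is the following sketch, which groups columns into build phases while keeping their definition order. It assumes that each phase corresponds to one select statement and that a column must be built in a later phase than any column it depends on; the function `assign_build_phases` and this exact rule are illustrative assumptions, not the library's documented algorithm.

```python
def assign_build_phases(columns):
    """Group (name, dependencies) pairs into phases without reordering them.

    A new phase starts whenever a column depends on a column that is being
    built in the current phase, since it cannot be computed in the same select.
    """
    phases = [[]]
    current_phase_columns = set()
    for name, dependencies in columns:
        if current_phase_columns.intersection(dependencies):
            # The dependency is only available after the current phase completes.
            phases.append([])
            current_phase_columns = set()
        phases[-1].append(name)
        current_phase_columns.add(name)
    return phases


# Hypothetical column definitions: (column name, columns its expression refers to).
columns = [
    ("id", []),
    ("code", ["id"]),
    ("label", ["code"]),
    ("flag", ["id"]),
]
phases = assign_build_phases(columns)
```

Flattening the resulting phases yields the columns in their original order, which matches the note: only the phase boundaries move, not the build order.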

0.3.1

Changed
* Refactoring of template text generation for better performance via vectorized implementation
* Additional migration of tests to use of `pytest`

Fixed
* Added type parsing support for binary and constructs such as `nvarchar(10)`
* Fixed error occurring when schema contains map, array or struct

Added
* Ability to change name of seed column to custom name (defaults to `id`)
* Added type parsing support for structs, maps and arrays and combinations of the above

Notes
* Column definitions for map, struct or array must use the `expr` attribute to initialize the field. Defaults to `NULL`

0.3.0

Changed
* Validation for use in Delta Live Tables
* Documentation updates
* Minor bug fixes
* Changes to build and release process to improve performance
* Modified dependencies to base release on package versions used by Databricks Runtime 9.1 LTS
* Updated to Spark 3.2.1 or later
* Unit test updates - migration from `unittest` to `pytest` for many tests
