Bigquery-schema-generator

Latest version: v1.6.1


1.6.1

* 1.6.1 (2024-01-12)
    * **Bug Fix**: Prevent amnesia that causes multiple type mismatch
      warnings.
        * If a data set contained multiple records in which the types of a
          column did not match each other, the old code would *remove* the
          corresponding internal `schema_entry` for that column and print a
          warning message.
        * This meant that subsequent records would recreate the
          `schema_entry`, and a subsequent mismatch would print another
          warning message.
        * This also meant that if there was a second record after the most
          recent mismatch, the script would output a schema entry for the
          mismatching column, corresponding to the type of the last record
          that was not marked as a mismatch.
        * The fix is to use a tombstone entry for the offending column
          instead of deleting the `schema_entry` completely. Only a single
          warning message is printed, and the column is ignored for all
          subsequent records in the input data set.
        * See
          [Issue#98](https://github.com/bxparks/bigquery-schema-generator/issues/98),
          which identified this problem. The problem seems to have existed
          from the very beginning.
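
The tombstone idea described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the library's actual code: the `TOMBSTONE` sentinel and the `infer_type()` and `deduce()` helpers are hypothetical names, and the type inference is far coarser than what the real generator does.

```python
# Illustrative sketch only; not the library's implementation.
TOMBSTONE = object()  # hypothetical sentinel: column is permanently ignored


def infer_type(value):
    """Map a Python value to a coarse type name (hypothetical helper)."""
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "FLOAT"
    return "STRING"


def deduce(records):
    """Return (schema, warnings) for a list of dict records."""
    schema = {}    # column name -> type name, or TOMBSTONE
    warnings = []
    for line_no, record in enumerate(records, start=1):
        for column, value in record.items():
            new_type = infer_type(value)
            old_type = schema.get(column)
            if old_type is TOMBSTONE:
                continue  # mismatch already reported once; stay silent
            if old_type is None:
                schema[column] = new_type
            elif old_type != new_type:
                warnings.append(
                    f"line {line_no}: type mismatch for '{column}'; ignoring column"
                )
                # Keep a marker instead of deleting the entry, so later
                # records cannot recreate it.
                schema[column] = TOMBSTONE
    final = {c: t for c, t in schema.items() if t is not TOMBSTONE}
    return final, warnings


records = [{"a": 1, "b": "x"}, {"a": "s"}, {"a": 2}, {"a": "t"}]
schema, warnings = deduce(records)
print(schema)
print(warnings)
```

Had the entry for `a` been deleted rather than tombstoned, the third record would have recreated it and the fourth would have triggered a second warning, which is the behavior the fix removes.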

1.6.0

* 1.6.0 (2023-04-01)
    * Allow `null` fields to convert to `REPEATED` because `bq load` seems
      to interpret null fields to be equivalent to an empty array `[]`.
      See [#90](https://github.com/bxparks/bigquery-schema-generator/issues/90).
    * Add `input_format='csvdictreader'` option. Similar to `'dict'` but
      intended to be used with the `csv.DictReader` class to read CSV and
      TSV files with various options. More documentation and discussion at:
        * [`SchemaGenerator.deduce_schema()` from
          csv.DictReader](README.md#SchemaGeneratorDeduceSchemaFromCsvDictReader),
        * [Discussion#91](https://github.com/bxparks/bigquery-schema-generator/discussions/91).
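
As a rough illustration of the kind of input the `'csvdictreader'` format is meant for, the standard-library sketch below reads a small tab-separated sample with `csv.DictReader`. The sample data is made up, and the snippet deliberately stops short of calling the library: per the entry above, the `reader` object is what would be handed to `SchemaGenerator.deduce_schema()` on a generator created with `input_format='csvdictreader'`.

```python
import csv
import io

# Hypothetical tab-separated sample standing in for a file on disk.
tsv_text = "name\tage\nalice\t30\nbob\t\n"

# DictReader options such as `delimiter` select the dialect (TSV here).
reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")

# Each row is a dict keyed by the header line; every value is a string,
# and a missing trailing value arrives as an empty string.
rows = list(reader)
for row in rows:
    print(row)
```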

1.5.1

* 1.5.1 (2022-12-04)
    * Add `examples/*.py` to demonstrate how to use `SchemaGenerator` as a
      library.
    * Update README.md to state that `bq load --autodetect` uses the first
      500 records. Previously, it scanned only the first 100 records.
    * This is a maintenance release with no new features or bug fixes.

1.5

* 1.5 (2021-11-14)
    * Make the column order in the BQ schema file match the order of
      appearance in the JSON data file using the
      `--preserve_input_sort_order` flag. Thanks to kdeggelman in
      [PR#75](https://github.com/bxparks/bigquery-schema-generator/pull/75).
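
The difference between the two column orderings can be sketched with the standard library alone. The `column_order()` helper and the sample lines are hypothetical, and the sketch assumes that the alternative to input order is alphabetical sorting; it does not reproduce the script's own logic.

```python
import json

# Hypothetical newline-delimited JSON records.
lines = ['{"zeta": 1, "alpha": "x"}', '{"alpha": "y", "mid": 2.5}']


def column_order(json_lines, preserve_input_order):
    """Return column names in first-appearance order or sorted order."""
    seen = {}  # a dict remembers insertion order
    for line in json_lines:
        for key in json.loads(line):
            seen.setdefault(key, None)
    names = list(seen)
    return names if preserve_input_order else sorted(names)


print(column_order(lines, preserve_input_order=True))
print(column_order(lines, preserve_input_order=False))
```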

1.4.1

* 1.4.1 (2021-08-23)
    * Add documentation for the `input_format='dict'` option.
    * Add additional test cases for the 'json' and 'dict' input formats.
    * Maintenance release, no functional change in core code.

1.4

* 1.4 (2020-12-09)
    * Add 'dict' as a third `input_format` when `SchemaGenerator` is used as
      a library. This can be useful when the data has already been
      transformed into a list of native Python `dict` objects (see #58,
      thanks to ZiggerZZ).
    * Expand the pattern matchers for quoted integers and quoted floating
      point numbers to be more compatible with the patterns recognized by
      `bq load --autodetect`.
    * Add Table of Contents to README.md. Add usage info for the
      `schema_map=existing_schema_map` and the `input_format='dict'`
      parameters in the `SchemaGenerator()` constructor.
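
To make the idea of pattern matchers for quoted numbers concrete, here is a minimal sketch that classifies string values as integers, floats, or plain strings. The regular expressions and the `classify_quoted()` helper are assumptions chosen for illustration; the patterns the library actually uses, and those recognized by `bq load --autodetect`, may differ.

```python
import re

# Hypothetical patterns for illustration; the real matchers may differ.
INTEGER_RE = re.compile(r"[-+]?\d+")
FLOAT_RE = re.compile(r"[-+]?(\d+\.\d*|\.\d+|\d+)([eE][-+]?\d+)?")


def classify_quoted(text):
    """Classify a quoted (string) value as INTEGER, FLOAT, or STRING."""
    # fullmatch() requires the whole string to match the pattern.
    if INTEGER_RE.fullmatch(text):
        return "INTEGER"
    if FLOAT_RE.fullmatch(text):
        return "FLOAT"
    return "STRING"


for sample in ["42", "-7", "3.14", "1e5", ".5", "abc", "1.2.3"]:
    print(sample, classify_quoted(sample))
```

The integer pattern is tested first because every integer string would also match the float pattern.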
