Merlin-core

Latest version: v23.8.0



23.08.00

23.05.00

What’s Changed

⚠ Breaking Changes

- Adjust the `DaskExecutor` API methods to take `Dataset`s instead of ddfs karlhigley (299)

🐜 Bug Fixes

- Add some additional mutually exclusive tags to the collisions list karlhigley (316)
- Fix Pandas extension dtype mapping for newer versions of Pandas karlhigley (314)
- Provide better string alias support for dtypes, allow external types to resolve to unknown, fix cuDF struct support karlhigley (313)
- Make `ColumnSelector.all` a property instead of a manually set attribute karlhigley (296)

🚀 Features

- Add `Rename` to the set of core DAG ops that work in all DAGs karlhigley (312)
- Enable Compound Tag Selection and Removal to work with atomic tags and strings oliverholworthy (317)
- Add optional `schema` parameter to `from_df` method on `TensorTable` oliverholworthy (286)
- Add `as_tensor_type` method to `TensorTable` for framework column conversion oliverholworthy (285)
- Add support for cuDF's struct dtype karlhigley (309)

📄 Documentation

- Skip errors if branch tracking fails in `docs-sched-rebuild` oliverholworthy (327)
- Pin numpy version for docs build to ensure we can build the API docs for recent versions oliverholworthy (326)
- Create stable branch locally in `docs-sched-rebuild` to enable stable docs build oliverholworthy (325)
- Build docs for stable branch and make default oliverholworthy (322)

🔧 Maintenance

- remove cupy-cuda11x from tox test environment nv-alaiacano (323)
- Handle schema inference in Dataset with empty list col oliverholworthy (319)
- Convert data formats before executing each op in `LocalExecutor` karlhigley (280)
- Add problem matcher for actionlint to annotate errors oliverholworthy (315)
- Add `actionlint` to pre-commit-config to check for valid GitHub Workflow config oliverholworthy (290)
- Remove optional dependencies from Conda Recipe oliverholworthy (298)
- Remove warning about compound tags deprecation oliverholworthy (256)
- Add workflows to check base branch and set stable branch oliverholworthy (310)
- Update tag pattern in GitHub Workflows oliverholworthy (311)
- Skip package release jobs for dev tags oliverholworthy (305)
- Remove use of deprecated numpy aliases of builtin types oliverholworthy (308)
- don't re-run tests on closed PR nv-alaiacano (307)
- Revert "Adjust the `DaskExecutor` API methods to take `Dataset`s inst… karlhigley (306)
- Update packages workflow, separating PyPI from conda build oliverholworthy (300)
- Move build-docs to separate job in packages workflow with Python 3.9 oliverholworthy (302)
- Add Workflow to update the stable branch ref to the latest tag oliverholworthy (303)
- Adjust the `DaskExecutor` API methods to take `Dataset`s instead of ddfs karlhigley (299)
- CI: add quotes to workflow name nv-alaiacano (295)

23.04.00

What’s Changed

⚠ Breaking Changes

- Preserve original Dask partitions by default in `Dataset.to_parquet` rjzamora (254)
- Change the location and filename of schema.pbtxt to .merlin/schema.json edknv (249)

🐜 Bug Fixes

- Return a dataframe type that matches `reader` passed to `fetch_table_data` oliverholworthy (287)
- add hack to handle tf not recognizing bool dtype in dlpack jperez999 (276)
- update numpy version to handle dlpack jperez999 (275)
- fix cuda import logic from numba and device memsize jperez999 (274)
- change cpu conversion for tf to convert-to-tensor jperez999 (271)
- fix gpu numpy conversion offsets jperez999 (269)
- Disable strict dtype checking by default karlhigley (268)
- Propagate `_unsafe` flag through column constructors properly karlhigley (264)
- Propagate the `_unsafe` mode flag from `TensorTable` to `TensorColumn` karlhigley (260)
- add import pytest to file jperez999 (229)

🚀 Features

- Add `column_type` property to `TensorTable` karlhigley (283)
- Extend mapping of nullable types for pandas oliverholworthy (278)
- add 3d tensor support to creating tensor columns jperez999 (246)
- Run with import without gpu jperez999 (261)
- Check environment supports target device in Dataset constructor oliverholworthy (243)
- Support `Dataset` cpu-mode in environment with GPUs that have not been detected oliverholworthy (236)
- Allow casting a `Dimension` to an integer when min and max are the same karlhigley (252)
- Add predicate function argument to `select_by_tag` oliverholworthy (94)
- Add row_group_size argument to Dataset.to_parquet rjzamora (218)
- Enable Schema selection using `select_by_tag` with string representation of `Tags` enum. oliverholworthy (242)
- Add Schema `copy` method oliverholworthy (240)

🔧 Maintenance

- Update `pull_apart_list` to use `pd.concat` instead of deprecated `Series.append` oliverholworthy (291)
- Install protobuf version compatible with tensorflow 2.9 for Merlin Models tests oliverholworthy (289)
- Add support for from_dlpack with numpy 1.23.0 oliverholworthy (284)
- Save schema in old location for backwards compatibility oliverholworthy (267)
- Refactor `LocalExecutor` into more discrete steps that can be overridden karlhigley (279)
- Preserve type of shape dims as ints when re-loading schema from disk oliverholworthy (281)
- uses compat everywhere to allow container bypass when gpus not present jperez999 (277)
- update numpy version to handle dlpack jperez999 (275)
- fix cuda import logic from numba and device memsize jperez999 (274)
- migrate compat into a separate folder and separate tf and torch import jperez999 (272)
- change cpu conversion for tf to convert-to-tensor jperez999 (271)
- compat imports update jperez999 (270)
- fix gpu numpy conversion offsets jperez999 (269)
- fix configure tf function to id all gpus available jperez999 (266)
- migrate configure tensorflow to core, separate has_gpu from compat jperez999 (265)
- add 3d tensor support to creating tensor columns jperez999 (246)
- Revert 261 and 262 (`merlin.core.compat` changes) karlhigley (263)
- Run with import without gpu jperez999 (261)
- Update `merlin.core.compat` to use `HAS_GPU` and add add'l libraries karlhigley (262)
- Rework DLpack conversion dispatching to allow caching dispatched methods karlhigley (259)
- Add an `unsafe` mode to `TensorTable`/`TensorColumn` (for internal use) karlhigley (258)
- Make `TensorColumn` shape and dtype properties lazy but memoized karlhigley (257)
- Bump `dask`, `distributed`, `fsspec` versions karlhigley (201)
- Move common steps to run tox env into reusable workflow oliverholworthy (247)
- Improve check for array types in `is_list_dtype` oliverholworthy (253)
- Support cupy and numpy array types in `flatten_list_column_values` oliverholworthy (251)
- Update `is_list_dtype` to handle additional types oliverholworthy (250)
- Remove use of HAS_GPU from `dispatch` functions oliverholworthy (244)
- Change the location and filename of schema.pbtxt to .merlin/schema.json edknv (249)
- Add workflow for testing dataloader oliverholworthy (186)
- add import pytest to file jperez999 (229)
- Add correct job dependency for release in `cpu-packages` oliverholworthy (241)
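
The "lazy but memoized" properties mentioned in the maintenance list above (257) follow a common Python pattern: compute a value on first access, then reuse it. The sketch below shows that pattern with the standard library's `functools.cached_property`. `LazyColumn` and `shape_computations` are hypothetical names used for illustration only; this is not the Merlin `TensorColumn` implementation, which may work differently.

```python
from functools import cached_property


class LazyColumn:
    """Hypothetical stand-in for a column wrapper (not Merlin's TensorColumn)."""

    def __init__(self, values):
        self._values = values
        self.shape_computations = 0  # counts how often the shape is derived

    @cached_property
    def shape(self):
        # Runs on first access only; the result is then stored on the instance.
        self.shape_computations += 1
        return (len(self._values),)


col = LazyColumn([1, 2, 3])
first = col.shape   # computed here
second = col.shape  # served from the cached value
```

Nothing is computed at construction time, and the second access does not repeat the work.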

23.02.01

What's Changed

Patch release on top of [v23.02.00](https://github.com/NVIDIA-Merlin/core/releases/tag/v23.02.00)

🔧 Maintenance

- Add pynvml dependency oliverholworthy (237)

**Full Changelog**: https://github.com/NVIDIA-Merlin/core/compare/v23.02.00...v23.02.01

23.02.00

What’s Changed

⚠ Breaking Changes

- Remove use of `is_list`/`is_ragged` and replace with setting shapes karlhigley (215)
- Add a new `shape` field to `ColumnSchema` karlhigley (195)

🐜 Bug Fixes

- Save schema with consistent dtype when `dtypes` is used oliverholworthy (182)

🚀 Features

- Update HAS_GPU variable to account for `CUDA_VISIBLE_DEVICES` oliverholworthy (221)
- Clean up of make_df function jperez999 (205)
- separate cupy import from rapids jperez999 (211)
- Support partially specified value_count when used with `is_ragged=False` oliverholworthy (213)
- Fix for updated versions of cudf to parquet jperez999 (204)
- Create standard Merlin dtypes in the `merlin.dtypes` module karlhigley (170)

🔧 Maintenance

- Remove use of `is_list`/`is_ragged` and replace with setting shapes karlhigley (215)
- Reduce the overhead of using `LocalExecutor` (esp. dtype validation) karlhigley (219)
- Clean up of make_df function jperez999 (205)
- Add util functions for un/grouping column values/offsets in dicts karlhigley (216)
- Fill in some missing docstrings karlhigley (217)
- Serialize shapes to and from Merlin schema files karlhigley (214)
- Fix for updated versions of cudf to parquet jperez999 (204)
- add gcp label to jenkinsfile AyodeAwe (181)
- Add a new `shape` field to `ColumnSchema` karlhigley (195)
- Increase upper bound of `pandas` version from 1.4 to 1.6 oliverholworthy (210)
- Update pre-commit config with latest versions of repos oliverholworthy (208)
- Install latest version of NVTabular/dataloader with systems tests oliverholworthy (209)
- Add note on why we're using `device_get_count` instead of `cuda.gpus` oliverholworthy (207)
- Add Formatter (Prettier) for YAML and Markdown files karlhigley (199)
- Change the name of the package building action karlhigley (198)
- Split CPU tests and building packages for release into separate actions karlhigley (197)
- Simplify `ColumnSchema.with` methods using `dataclasses.replace()` karlhigley (194)
- Handle executor transform case when parent node provides no new columns oliverholworthy (226)
- Update Models/NVTabular test config oliverholworthy (185)
- skip notebook tests in models test edknv (193)
- add a build pandas column api for easier multihot column creation jperez999 (183)
- Use pre-commit for linting in GitHub Actions Workflow oliverholworthy (184)
- Convert to cudf.Series in create_multihot_col oliverholworthy (187)
- adding workflow for GPU CI on gha jperez999 (191)
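
One maintenance entry above (194) simplifies the `ColumnSchema.with` methods using `dataclasses.replace()`. As a rough illustration of that technique, the sketch below builds copy-and-modify methods on a frozen dataclass. `ColumnSketch`, `with_name`, and `with_tags` are hypothetical names chosen for this example and are assumptions, not Merlin's actual `ColumnSchema` API.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ColumnSketch:
    """Hypothetical, simplified column description (not Merlin's ColumnSchema)."""

    name: str
    tags: Tuple[str, ...] = ()

    def with_name(self, name: str) -> "ColumnSketch":
        # replace() copies every field and overrides only the ones passed in.
        return replace(self, name=name)

    def with_tags(self, tags) -> "ColumnSketch":
        # Merge new tags with existing ones; sorted for a deterministic order.
        return replace(self, tags=tuple(sorted(set(self.tags) | set(tags))))


original = ColumnSketch("user_id", tags=("categorical",))
renamed = original.with_name("uid")
tagged = original.with_tags(["id"])
```

Because the dataclass is frozen, each `with_*` call returns a new object and leaves `original` unchanged, which keeps every method a one-line call to `replace()`.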

0.10.0

What’s Changed

🐜 Bug Fixes

- Fix file-count warning in Dataset.to_parquet rjzamora (159)
- Remove the property annotation from Transformable.columns karlhigley (166)
- Update value_count serialization/deserialization to be consistent with original schema oliverholworthy (111)
- Fix feature.shape attribute in from_merlin_schema rjzamora (169)
- Add the schema to the output of the .repartition() method sararb (192)

🚀 Features

- Read parquet statistics to optimize len when they are missing rjzamora (178)
- Change is_ragged property based on value_count in with_properties oliverholworthy (172)
- add is_list detection for merlin columns jperez999 (180)
- Enable partial value count to be specified oliverholworthy (171)
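
Two of the features above concern how `is_ragged` relates to `value_count` (172) and partially specified value counts (171). The sketch below shows one plausible rule for deriving raggedness from a min/max pair. It is an assumption made for illustration, not the library's confirmed logic: `infer_is_ragged` is a hypothetical helper, and the choice to return `None` for a partial count is likewise assumed.

```python
from typing import Optional


def infer_is_ragged(value_count: dict) -> Optional[bool]:
    """Hypothetical helper: guess raggedness from a {"min": ..., "max": ...} dict."""
    lo = value_count.get("min")
    hi = value_count.get("max")
    if lo is None or hi is None:
        # Partially specified count: raggedness cannot be decided from it alone.
        return None
    # Equal bounds mean every row has the same list length (fixed, not ragged).
    return lo != hi


fixed = infer_is_ragged({"min": 2, "max": 2})
ragged = infer_is_ragged({"min": 0, "max": 5})
unknown = infer_is_ragged({"max": 5})
```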

📄 Documentation

- docs: Add temp semver to calver banner mikemckiernan (161)

🔧 Maintenance

- Remove specifying is_ragged in LocalExecutor _transform_data oliverholworthy (173)
- Add Jenkinsfile AyodeAwe (167)
- Fix concat_columns for DataFrames with list features oliverholworthy (165)
- update drafter to work on tags & update cpu ci to target branches jperez999 (174)
- Remove explicit DictArray reference from merlin.core.dispatch karlhigley (163)


© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.