ArcticDB

Latest version: v5.1.2


4.0.1

This is a patch release to version 4.0 that backports some fixes from master.

🚀 Features
- Allow ampersand in symbol names (952)
🐛 Fixes
- Backport minor fixes to 4.0.x (954), not user facing


---
> The wheels are on [PyPI](https://pypi.org/project/arcticdb/). Below are for debugging:

4.0.0

⚠️ API changes

`Library.get_description_batch`, `Library.read_metadata_batch` and `Library.write_batch` now return a `DataError` object in the result list at the position of each symbol/version pair that could not be read or written. Code that relied on the previous behaviour may need changes to support the **new error handling behaviour**, so this is treated as a breaking change.

- Get description batch method: method rationalisation (814)
- Read metadata batch method: method rationalisation (814)
- Write batch method: method rationalisation (814)
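
Under the new convention, callers inspect each element of a batch result instead of relying on an exception. The following is a minimal sketch of that check, written against plain Python objects so that it runs without a storage backend. `split_batch_results` and `StandInError` are hypothetical names used only for illustration; in real code the `error_type` argument would presumably be ArcticDB's `DataError` class.

```python
from typing import Any, List, Tuple


class StandInError:
    """Hypothetical placeholder for the library's DataError class."""

    def __init__(self, symbol: str, message: str) -> None:
        self.symbol = symbol
        self.message = message


def split_batch_results(
    results: List[Any], error_type: type
) -> Tuple[List[Tuple[int, Any]], List[Tuple[int, Any]]]:
    """Separate a batch result list into (successes, failures).

    Each entry keeps its position in the original list, because the
    position is what ties a result back to the requested symbol.
    """
    successes, failures = [], []
    for position, item in enumerate(results):
        if isinstance(item, error_type):
            failures.append((position, item))
        else:
            successes.append((position, item))
    return successes, failures


# Illustrative data only: strings stand in for successful results.
example = ["ok-a", StandInError("b", "write failed"), "ok-c"]
ok, failed = split_batch_results(example, StandInError)
```

Keeping the original index alongside each item makes it straightforward to map a failure back to the request that produced it.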

🚀 Features

- **Pandas 2.0 support** (343) (540) (804) (846)
  - Modifications have been made to the normalisation and denormalisation processes for `pandas.Series` and `pandas.DataFrame` to match the new defaults in pandas 2.0.
  - Handling of 0-row DataFrames for improved correctness and usability.
  - Empty columns are now properly handled, especially regarding the change of defaults for empty collections between pandas 1.x and pandas 2.x.
  - Extended the tests to reflect changes in behaviour due to pandas 2.0's new defaults.
  - Please note, **PyArrow remains unsupported in this integration**.
- conda-build: Bring support for Azure Blob Storage (840) (854) (853) (857)
- Add uri support for mongodb (761)
- Code coverage analysis and report workflow (783) (784)
- Add documentation with doxygen (736)

🐛 Fixes

- Update support status: Pandas DataFrame and Series backed by PyArrow are not supported (882)
- Added pymongo to the list of installation dependencies (891)
- Resolved dependency issues for the mergeability check step (822)
- Fixed issue where AWS authentication wasn't used, even though the option was enabled (843)
- Resolved issue of early read termination in 'has_symbol' (836)
- Test: Ensured that `QueryBuilder` is pickleable with all possible clauses (861)
- Fixed issue with the 'latest_only' option for the 'list_versions' method (839)
- Added the ability for users to specify LMDB map size in the Arctic URI (811)
- Fixed issue 767: segfault in batch write with string columns (827) (874)
- Renamed ArcticNativeNotYetImplemented in a way that maintains backward compatibility, to fix issue 774 (821)
- Modified Azure SDK to favour winhttp over libcurl on Windows for improved SSL verification (851)
- Updated the maximum batch size for Azure operations (878)
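
One of the fixes above lets users set the LMDB map size through the Arctic URI. A minimal sketch of composing such a URI follows. The `map_size` query parameter name and the `2GB` value format are assumptions to verify against the LMDB documentation for your installed version, the path is a placeholder, and `lmdb_uri` is a hypothetical helper.

```python
def lmdb_uri(path: str, map_size: str = "") -> str:
    """Build an LMDB-style Arctic URI, optionally with a map size.

    Assumption: the option is spelled `map_size` and accepts values
    such as "2GB"; check the documentation for your installed version.
    """
    uri = f"lmdb://{path}"
    if map_size:
        uri += f"?map_size={map_size}"
    return uri


# Placeholder path; the resulting string would be passed to Arctic().
uri = lmdb_uri("/tmp/arcticdb_example", map_size="2GB")
```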


<details>
<summary>Uncategorized</summary>

- Maintenance: Added a minimal Security Policy (823)
- Fixed documentation following an exception renaming (824)
- Resolved issues in the publish step (825)
- Added documentation for setting LMDB map size (826)
- Incorporated notebooks into the documentation (844)
- Maintenance: Removed unused definitions from protocol buffers (856)
- Enhanced error handling to fail document build on Sphinx errors (883)
- Maintenance: Replaced deprecated ZSTD_getDecompressedSize function (855)
- Refactored non-functional library manager, addressing Issue 812 (828)
- Made minor improvements to the documentation (841)
- Improved handling of the deprecated S3 option "force_uri_lib_config" (833)
- Corrected the release date of version 3.0.0 in README.md (858)

</details>

---

3.0.0

🔒 Security + Forwards Incompatible Change
- S3 and Azure: Do not save sensitive or ephemeral config in the config library (https://github.com/man-group/ArcticDB/pull/803)

This fixes a security issue in ArcticDB where credentials were kept in storage for:
- Azure
- AWS, if the access keys are supplied in the URI instead of using `aws_auth=True`.

[These instructions](https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/upgrade_storage.md) explain how to upgrade your storage to remove the credentials. See also issue https://github.com/man-group/ArcticDB/issues/802 .

Compatibility matrix

<table>
<tr>
<th>Storage</th>
<th>Library created with &lt; v3. <br>Library accessed with &gt;= v3.</th>
<th>Library created with or upgraded to &gt;= v3. <br>Library accessed with &lt; v3. </th>
</tr>
<tr>
<th>S3 with <code>aws_auth=True</code></th>
<td>Continues to work</td>
<td>
Raises <code>InternalException: E_INVALID_ARGUMENT S3 Endpoint must be specified</code>.<br>
Will work again if <code>access=_RBAC_&secret=_RBAC_&force_uri_lib_config=true</code> is in the URI passed to <code>Arctic()</code></td>
</tr>
<tr>
<th>S3 with <code>access</code> and <code>secret</code>.</th>
<td rowspan="2"><p>Will now use the credentials passed to <code>Arctic()</code>, but should continue to work if those credentials are sufficient.
<p>A future release might print a warning with instructions to upgrade.</td>
<td>Raises <code>InternalException: E_INVALID_ARGUMENT S3 Endpoint must be specified</code>.<br>
Will work if <code>force_uri_lib_config=true</code> is in the URI passed to <code>Arctic()</code></td>
</tr>
<tr>
<th>Azure</th>
<td>Operations on the library will fail with various internal error messages</td>
</tr>
</table>
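
The workaround in the right-hand column amounts to appending query options to the URI passed to `Arctic()`. A minimal sketch of that string manipulation follows. `add_uri_options` is a hypothetical helper, the endpoint and bucket are placeholders, and the overall `s3://endpoint:bucket?...` URI shape is an assumption to check against the documentation for the client version in use.

```python
def add_uri_options(uri: str, **options: str) -> str:
    """Append key=value query options to an Arctic connection URI."""
    extra = "&".join(f"{key}={value}" for key, value in options.items())
    if not extra:
        return uri
    # Use "&" if the URI already has a query string, otherwise start one.
    separator = "&" if "?" in uri else "?"
    return f"{uri}{separator}{extra}"


# Placeholder endpoint and bucket; option names come from the table above.
base = "s3://s3.example.com:my-bucket?aws_auth=true"
legacy_uri = add_uri_options(
    base,
    access="_RBAC_",
    secret="_RBAC_",
    force_uri_lib_config="true",
)
```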

Full details:

What's happened?
Whilst reviewing our codebase we discovered a way that access-keys for ArcticDB storage backends could be saved into the storage in clear text.

This behaviour was by design, but it may have affected some third-party users without them being aware of it.
This depends on the backend used and how you connect to the storage.

What is the exact scope of the issue?
If you created an ArcticDB library, either with an S3 bucket and passed the access-keys as part of the URI, or with Azure Blob Storage with the access-keys as part of the connection-string, then the credentials were saved into the storage account as part of the ArcticDB library config.
If you then shared that storage account with others using different roles or access-keys, then those users would in theory have been able to access the credentials used to create the library.

What have you done to address this?
We've updated ArcticDB so that all new libraries do not do this, even if the credentials are passed in with the URI/connection-string.
We've prepared a storage-update script which you can run to see if the credentials are there, and then remove them if they are.

What is the impact if I am affected?
If you have shared that storage account with anyone else using different roles/credentials, then your original credentials have also been accessible to those users.
It's possible those users recorded the credentials, and because those credentials must have had write-access to create the library, they could have made changes to the data or otherwise used those credentials.

What can I do to check if I'm affected?
See these [instructions](https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/upgrade_storage.md).

If needed, you can check on previous versions of ArcticDB using the code referenced on GitHub:
https://github.com/man-group/ArcticDB/issues/802#issuecomment-1697814768

What should I do if I am affected?
Follow these [instructions](https://github.com/man-group/ArcticDB/blob/master/docs/mkdocs/docs/technical/upgrade_storage.md).

This change is not forwards compatible, so users on earlier clients may need to upgrade:
- S3 libraries created with 3.0.0 will not be readable by earlier ArcticDB versions unless `force_uri_lib_config=True` is set in their connection string.
- Azure libraries created with 3.0.0 will not be readable by earlier ArcticDB versions.

Then,
- Rotate your credentials.
- If you've shared access to that storage account then please also check the integrity of your data and anything else accessible via those credentials.

What was the cause?
Previous use cases of ArcticDB had split storage accounts. One account was used to configure libraries and other accounts held the data for those libraries. Credentials to read those data-libraries were then stored into the configuration account and passed to users as needed for access to the data. This code was not caught during our review, and so was not disabled or removed when we made ArcticDB available to others. When we added Azure Blob storage support subsequently, the side-effect of saving anything in the connection-string to storage was not anticipated.

Having reviewed the codebase again we are confident that this was the only way that credentials could be saved into storage using our public API.

We plan to continue supporting our split storage solution for some users, but it should always be very clear when access-keys are being stored and what the risks are for that.

🚀 Features
- Conda-forge build now supports Azure Blob Storage
- Enhancement/728/make iclause responsible for processing structure (https://github.com/man-group/ArcticDB/pull/752)
- Add more info in the CI readme; Prepare var for real storage tests (https://github.com/man-group/ArcticDB/pull/663)
- **Enhancement 702: Add option to create library if it does not exist when calling get_library (https://github.com/man-group/ArcticDB/pull/775)**
- **Enhancement 714: Expose library methods to list symbols with staged data, and to delete staged data (https://github.com/man-group/ArcticDB/pull/778)**
- Enhancement 737: Support empty-type columns in QueryBuilder operations (https://github.com/man-group/ArcticDB/pull/794)
- conda-build: Adapt C++ test suite for Linux (https://github.com/man-group/ArcticDB/pull/713)
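
The staged-data enhancement above (listing symbols with staged data and deleting it) can be pictured as a small cleanup routine. The sketch below is deliberately written against a generic `lib` object so that it runs without ArcticDB installed. The method names `get_staged_symbols` and `delete_staged_data` are assumptions about how pull request 778 exposes this functionality, and `clear_staged_data` and `FakeLibrary` are hypothetical.

```python
from typing import List


def clear_staged_data(lib) -> List[str]:
    """Delete staged data for every symbol that has any.

    Returns the symbols that were cleared. `lib` is any object with
    `get_staged_symbols()` and `delete_staged_data(symbol)` methods
    (assumed names, not confirmed by these release notes).
    """
    staged = list(lib.get_staged_symbols())
    for symbol in staged:
        lib.delete_staged_data(symbol)
    return staged


class FakeLibrary:
    """Hypothetical in-memory stand-in used only to exercise the helper."""

    def __init__(self, staged):
        self._staged = set(staged)

    def get_staged_symbols(self):
        return sorted(self._staged)

    def delete_staged_data(self, symbol):
        self._staged.discard(symbol)


fake = FakeLibrary(["prices", "trades"])
cleared = clear_staged_data(fake)
```

With a real connection, `lib` would presumably come from `get_library`, optionally using the new create-if-missing option from pull request 775; the exact keyword for that option should be taken from the API documentation.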

🐛 Fixes
- conda-build: Use default compilers for macOS (https://github.com/man-group/ArcticDB/pull/662)
- Bugfix: NativeVersionStore write metadata batch should never return DataError objects (https://github.com/man-group/ArcticDB/pull/782)
- Add handling of unspecified ca path in azure uri (https://github.com/man-group/ArcticDB/pull/771)
- Add dep. on packaging (https://github.com/man-group/ArcticDB/pull/795)
- Fix get_num_rows for NativeVersionStore (https://github.com/man-group/ArcticDB/pull/800)

<details>
<summary>Uncategorized</summary>

- First version of AWS S3 setup guide (https://github.com/man-group/ArcticDB/pull/708)
- fix(docs): central docs URL from API docs homepage (https://github.com/man-group/ArcticDB/pull/755)
- Add none type (https://github.com/man-group/ArcticDB/pull/646)
- Azure getting started guide (https://github.com/man-group/ArcticDB/pull/749)
- Docs fixes (https://github.com/man-group/ArcticDB/pull/762)
- Decouple storage headers from implementations & storage.hpp (https://github.com/man-group/ArcticDB/pull/763)
- Bugfix 554: Remove unused argument from write_batch (https://github.com/man-group/ArcticDB/pull/769)
- Partially revert https://github.com/man-group/ArcticDB/pull/763 for consistency (https://github.com/man-group/ArcticDB/pull/766)
- Make it clear to not commit directly to ArcticDB feedstock but use PRs instead (https://github.com/man-group/ArcticDB/pull/741)
- maint: pandas 2.0 forward compatible changes (https://github.com/man-group/ArcticDB/pull/540)
- test: Test the absence of in-place modification on datetime64 normalization for pandas 2.0 (https://github.com/man-group/ArcticDB/pull/801)
- Update README.md (https://github.com/man-group/ArcticDB/pull/799)
- test: Remove test for fallback to pickle (https://github.com/man-group/ArcticDB/pull/805)
- Docs - update release number (https://github.com/man-group/ArcticDB/pull/816)
- conda-build: Pin cmake (https://github.com/man-group/ArcticDB/pull/815)
- Update releasing.md (https://github.com/man-group/ArcticDB/pull/817)
- ArcticDB 3.0.0 update BSL table (https://github.com/man-group/ArcticDB/pull/820)
</details>

2.0.0

Not secure
This version contains breaking changes to the ArcticDB API. As per the [SemVer versioning scheme](https://semver.org/), we have bumped the major version.

⚠️ API changes

- Write batch metadata method: method rationalisation (476)

- Append batch metadata method: method rationalisation (548)

For `Library.write_metadata_batch` and `Library.append_batch`, a `DataError` object is now returned in the result list at the position of each symbol/version pair that could not be written. Code that relied on the previous behaviour may need changes to support the new error handling behaviour, which is why this is considered a breaking change as described above.

See the [docs](https://docs.arcticdb.io/api/library#arcticdb.version_store.library.Library.read_batch) for `read_batch`, which uses the same exception return mechanism.

- (Minor) The internal protobuf field `arcticc.pb2.descriptors_pb2.TypeDescriptor.MICROS_UTC` has been renamed to `NANOSECONDS_UTC`. Through the `Arctic` API this is only visible as a string, via the `dtype` attributes returned by `get_description` and `get_description_batch`, so external users are only affected if they parse these strings.
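
For code that does parse these `dtype` strings, one defensive option is to accept both the old and the new name during a migration. A minimal sketch follows. `mentions_timestamp_type` is a hypothetical helper, and the sample strings are illustrative only, since the exact textual form of a `dtype` is not specified in these notes.

```python
# Old and new spellings of the timestamp type name from the note above.
TIMESTAMP_TYPE_NAMES = ("MICROS_UTC", "NANOSECONDS_UTC")


def mentions_timestamp_type(dtype_text: str) -> bool:
    """Return True if the text contains either timestamp type name."""
    return any(name in dtype_text for name in TIMESTAMP_TYPE_NAMES)


# Illustrative inputs; real dtype strings may be formatted differently.
old_style = "type=MICROS_UTC"
new_style = "type=NANOSECONDS_UTC"
other = "type=INT64"
```

A substring check is used rather than an exact match so that the helper does not depend on the surrounding formatting of the string.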

🚀 Features

- Projections, group-by, and aggregations added to the processing framework (712)
- Reduce memory footprint of head and tail methods (583)
- Per symbol parallelisation for write batch metadata method (476)
  - This can result in significant performance improvements when using this method over many symbols.
- Per symbol parallelisation for append batch metadata method (548)
  - This can result in significant performance improvements when using this method over many symbols.

🐛 Fixes

- Ensure content hash is copied during restore version + fixing timestamp-uniqueness-related flaky tests (600)
- Restrict supported string types to type equality rather than `isinstance` (704)
- Incorrect initialisation of LoadParameter::load_from_time_ (697)
- Ensure compact_incomplete and recursive normalization obey the library setting for pruning (705)

<details>
<summary>Uncategorized</summary>

- Update release process to detail the process for pre-releases (688)
- Unify release and pre-release hotfixing (725)
- Skip test_diff_long_stream_descriptor_mismatch on MacOS (693)
- maint: Remove `VariantStorage` (695)
- maint: Rename `datetime64[ns]`-related fields and datatypes (592)
- run C++ tests for conda build / ci (486)

</details>

---

1.6.2

Not secure
This release is a patch release, backporting bug fixes to v1.6.1.

🐛 Fixes
---
- Fixed a bug in key data retrieval, which could lead to incorrect behavior and segmentation faults (912)

---

1.6.1

Not secure
🐛 Fixes

- Add a more strict check for chars in the symbol names (627)
- Fix `as_of` with timestamp reading the entire version chain rather than just reading up to the required version (596)
  - `as_of=<timestamp>` reads will be significantly faster for symbols with many versions
- Fix to ensure batch prune previous methods clean up index and data keys as well as version keys (https://github.com/man-group/ArcticDB/pull/623)
- Only log ErrorCategory::INTERNAL errors (https://github.com/man-group/ArcticDB/pull/676)
- Enable importing DataError from arcticdb (https://github.com/man-group/ArcticDB/pull/657)
- Refactor underlying segment write scheduling (532)
  - This can result in a significant performance improvement for large writes

<details>
<summary>Uncategorized</summary>

- Remove SemVer version validation as it does not support version strings such as 1.6.1rc0 (678)
- Docs 672: Add docstring for DataError class (686)
- Black output diff rather than just erroring in build (685)
- Correct docs for read_batch return type (https://github.com/man-group/ArcticDB/pull/664)

</details>
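
The first item in the list above reflects a mismatch between versioning schemes: SemVer requires a hyphen before a pre-release tag (`1.6.1-rc0`), whereas Python packaging conventions write `1.6.1rc0`. A small sketch of the difference follows, using a deliberately simplified pattern rather than the full SemVer grammar. `SIMPLE_SEMVER` and `looks_like_semver` are hypothetical names, and this is not the validation code that was removed.

```python
import re

# Simplified: MAJOR.MINOR.PATCH with an optional "-prerelease" suffix.
# The official SemVer grammar is stricter (for example, about leading
# zeros) and also allows "+build" metadata.
SIMPLE_SEMVER = re.compile(r"^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$")


def looks_like_semver(version: str) -> bool:
    """Check a version string against the simplified pattern above."""
    return SIMPLE_SEMVER.match(version) is not None
```

Under this pattern, `1.6.1` and `1.6.1-rc0` are accepted while `1.6.1rc0` is rejected, which is the kind of failure the removed validation would have hit.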
