Th2-data-services

Latest version: v2.0.0

2.0.0

User impact and migration instructions

Installing the package no longer installs the RDP package.
If you want to use RDP, specify it as an extra dependency in square brackets `[ ]`.

1. [I] The Adapter interface now requires a `handle_stream` method.\
[M] Implement the new method in your adapters.

2. [I] It's no longer possible to import the Data object directly from the
th2_data_services package.\
[M] Change all imports from "from th2_data_services import Data"
to "from th2_data_services.data import Data".

3. [I] Provider module is removed.\
[M] You should use data source implementations, like th2-ds-source-lwdp.

4. [I] INTERACTIVE_MODE cannot be accessed like
th2_data_services.INTERACTIVE_MODE anymore.\
[M] It's now changed to th2_data_services.config.options.INTERACTIVE_MODE

5. [I] EventsTree was renamed to EventTree.\
[M] Replace all usages of EventsTree with EventTree.

6. [I] Message utils method `expand_message` moved into `MessageFieldResolver`.\
[M] Implement new method in your resolver.

7. [I] Data iteration logic has changed.\
Why? The previous behavior caused problems in some cases, e.g. when we don't want
to iterate objects inside the DataSet.

[I.1] Lists and tuples used in building Data objects are treated as a single
item, and the items inside them aren't iterated anymore.\
[M.1] Update Data objects initialized with lists or tuples.

[I.2] The change in iteration logic also changes how the `map` function behaves.
If the `map` function returns lists or tuples, their content won't be iterated
anymore.\
[M.2.1] If you need the previous `map` behavior, replace `map`
with `map_yield`.

[M.2.2] Update `data.map(mfr.expand_message)` to `data.map_yield(mfr.expand_message)`

[I.3] A Data object will not iterate over the contents of its stream if any of the
items are iterables (other than Data objects).\
It means that a Data object will not iterate lists and tuples inside the
provided DataSet and will return them as is.\
The only exception is when all of the items are Data objects themselves.\
[M.3] Update nested lists in Data object initializations to either Data
objects or switch to using the addition operator.\
[Examples]\
`d1 = Data(['a', 'b'])`\
a. `Data([1, 2, [3, 4], d1])` will yield 1,2,[3,4],d1. Previous behavior: 1,2,3,4,'a','b'\
b. `Data([d1, d2])` where d1 and d2 are Data objects. It will yield from d1, and after that yield from d2.\
c. You can update the example from `a` to `Data([1,2,3,4])` or to `new_data = Data([1,2]) + Data([3,4]) + d1`.\
d. You can also restore the previous behavior by doing the following: `new_data = Data([1, 2, [3, 4], d1]).map_yield(lambda r: r)`

8. [I] The new version of the `orjson` library requires Python 3.8+.\
[M] If you use Python 3.7, upgrade to Python 3.8+.
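
The iteration rules from item 7 can be modelled without the library. The sketch below uses a hypothetical `MiniData` class (a toy stand-in, not the real `Data` implementation). It assumes only the rules stated above: lists and tuples are kept as single items, a source made up entirely of `MiniData` objects is chained, `map` returns results as is, and `map_yield` iterates returned lists, tuples and `MiniData` objects (the last point is inferred from example `d`).

```python
class MiniData:
    """Toy stand-in for Data; models only the 2.0.0 iteration rules."""

    def __init__(self, source):
        # source: a list/tuple, or a zero-argument callable returning an iterator
        self._source = source

    def __iter__(self):
        if callable(self._source):
            # Lazily produced stream (used by map / map_yield below).
            yield from self._source()
            return
        items = list(self._source)
        if items and all(isinstance(item, MiniData) for item in items):
            # Only exception: every item is a MiniData, so chain them.
            for inner in items:
                yield from inner
        else:
            # Lists, tuples and stray MiniData objects are returned as is.
            yield from items

    def __add__(self, other):
        return MiniData([self, other])

    def map(self, func):
        # Results are yielded as is, even if they are lists or tuples.
        return MiniData(lambda: (func(record) for record in self))

    def map_yield(self, func):
        # Assumed to mimic the old `map`: iterable results are unpacked.
        def stream():
            for record in self:
                result = func(record)
                if isinstance(result, (list, tuple, MiniData)):
                    yield from result
                else:
                    yield result
        return MiniData(stream)


d1 = MiniData(["a", "b"])
mixed = MiniData([1, 2, [3, 4], d1])

as_is = list(mixed)                                  # [1, 2, [3, 4], d1]
added = list(MiniData([1, 2]) + MiniData([3, 4]) + d1)  # [1, 2, 3, 4, 'a', 'b']
unpacked = list(mixed.map_yield(lambda r: r))        # [1, 2, 3, 4, 'a', 'b']
```

The real `Data` class has more responsibilities (caching, metadata, filtering); the model only makes the difference between `map` and `map_yield` easy to check by hand.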

Features

1. [TH2-4128] pip no longer installs RDP by default
2. [TH2-4128][TH2-4738] extra dependencies can be installed using square brackets after
package name.
- Example: `pip install th2-data-services[lwdp]`

Available data sources implementations:

| dependency name | provider version |
|:-----------------:|---------------------------------------|
| lwdp | latest version of lwdp |
| lwdp2 | latest version of lwdp v2 |
| lwdp3 | latest version of lwdp v3 |
| utils-rpt-viewer | latest version of utils-rpt-viewer |
| utils-rpt-viewer5 | latest version of utils-rpt-viewer v5 |
| utils-advanced | latest version of ds-utils |

3. [TH2-4493] Adapter interface got handle_stream method.
4. [TH2-4490] Added `map_stream` method to Data.
- Almost same as `map`, except it's designed to handle a stream of data
rather than a single record.
- Method accepts a generator function or a class which implements
IStreamAdapter with generator function.
5. [TH2-4582] IAdapter interface removed.
- IStreamAdapter interface added to handle streams.
- IRecordAdapter interface added to handle single record.
- Method accepts Generator function or IStreamAdapter interface class with
Generator function.
6. [TH2-4609] Data.filter implementation changed to use `yield`.
7. [TH2-4491] metadata attribute added to Data. It will contain request urls.
8. [TH2-4577] map method now can take either Callable function or Adapter which
implements IRecordAdapter.
9. [TH2-4611] DatetimeConverter, ProtobufTimestampConverter converters added.
10. [TH2-4646]
- metadata gets carried when using Data methods.
- update_metadata method added to update metadata.
11. [TH2-4684] Tree names changed from plural to singular (e.g.
Event**s**Tree -> EventTree).
12. [TH2-4693] Implemented namespace packages structure, allowing other th2
libraries to be grouped together.
13. [TH2-4713] Added options module which enables user to tweak library
settings.
14. `DummyDataSource` added.
15. [TH2-4881] `Data.from_json` method was added.
16. [TH2-4919] `Data.from_any_file` method was added.
17. [TH2-4928] `Data.from_csv` method was added.
18. [TH2-4932] `Data.to_json` method was added. Puts your data to a valid json
object.
19. [TH2-4957] Added `gzip` option for `Data.to_json` method.
20. [TH2-4957] Added `decompress_gzip_file` function to utils.converters.
21. Added `to_csv` method to `PerfectTable` class.
22. `utils.converters.flatten_dict` converter added.
23. Added `Data.to_jsons` method that writes your Data object to a JSON Lines file
(a file where every line is a separate JSON document; the file as a whole is
not valid JSON).
The method was later renamed from `to_jsons` to `to_json_lines`.
- `to_jsons` is deprecated now.
24. [TH2-5049] Added ExpandedMessageFieldResolver
25. [TH2-5053] Added `pickle_version` to Data.from_cache_file method.
26. `decode_base64` function added to converter utils.
27. [TH2-5156] `UniversalDatetimeStringConverter` and `UnixTimestampConverter`
added.
28. [TH2-5167] `Data.is_sorted`, `event_utils.is_sorted`, `message_utils.is_sorted`
and `stream_utils.is_sorted` methods were added.
29. [TH2-5176] `to_th2_timestamp` method was added for converters.
30. [TH2-5081] Added `map_yield` function, that should behave similar to
old `map` method.
That means that `map_yield` will iterate lists and tuples if the user map
function returns them.
31. [TH2-5197] Added the function `read_all_pickle_files_from_the_folder`
to get Data object from the folder with pickle files.
32. [TH2-5213] Added `Data.to_csv` method, that converts data to valid csv.
33. [TH2-4900] Added `Data.sort` method, which also works with large amounts of data.
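
Items 18 to 23 distinguish a valid JSON document from a JSON Lines file, optionally gzip-compressed. The sketch below illustrates that difference with the standard library only. The function names `write_json`, `write_json_lines` and `read_json_lines` are hypothetical and are not the library's `Data.to_json` / `Data.to_json_lines` implementations.

```python
import gzip
import json
import os
import tempfile


def write_json(records, path, use_gzip=False):
    """Write all records as one JSON array: the whole file is valid JSON."""
    opener = gzip.open if use_gzip else open
    with opener(path, "wt", encoding="utf-8") as f:
        json.dump(list(records), f)


def write_json_lines(records, path, use_gzip=False):
    """Write one JSON document per line: each line is valid JSON, the file is not."""
    opener = gzip.open if use_gzip else open
    with opener(path, "wt", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_json_lines(path, use_gzip=False):
    """Lazily read a JSON Lines file back, one record at a time."""
    opener = gzip.open if use_gzip else open
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


workdir = tempfile.mkdtemp()
sample = [{"id": 1, "type": "event"}, {"id": 2, "type": "message"}]
write_json(sample, os.path.join(workdir, "data.json"))
write_json_lines(sample, os.path.join(workdir, "data.jsonl.gz"), use_gzip=True)
```

A JSON Lines file can be written and read record by record, which matters for streams that do not fit in memory; a single JSON array has to be parsed as a whole by `json.load`.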

BugFixes

1. [TH2-4711] The max_count parameter of the EventTreeCollection findall functions
worked incorrectly.
2. [TH2-4917] Readme duplicates removed.
3. [TH2-5083] Fixed comparison line formatting. Events in a block are no longer
all formatted as failed when the parent is failed.
4. [TH2-5081] Fixed iteration bug for the case where a Data object was made using
lists and tuples.
5. [TH2-5100] Fixed bug where a recursion exception was raised when too many
Data objects iterated each other.
6. [TH2-5190] Fixed Data.to_json
7. [TH2-5193] The orjson library versions 3.7.0 through 3.9.14 have a vulnerability:
https://devhub.checkmarx.com/cve-details/CVE-2024-27454/.
8. [TH2-5201] Fixed DatetimeStringConverter.to_th2_timestamp() bug which occurred for inputs not ending with 'Z'.
9. [TH2-5902] Fixed bug when cache file was removed after calling data.show().
10. [TH2-5220] Fixed bug when Data.update_metadata() would change a string into a list.
11. [TH2-5101] Fixed bug where merging Data objects via + or += overwrote the source file.
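
Item 5 describes a recursion error when many Data objects iterate each other. The general mechanism can be shown with plain generators. This is an illustrative sketch, not the library's fix, and the helper names `chain_pair`, `combine_nested` and `combine_flat` are hypothetical. Each pairwise combination adds one more generator layer, so in CPython a long enough chain usually exceeds the interpreter's recursion limit, while a flat chain does not nest at all.

```python
import itertools


def chain_pair(left, right):
    """Nested approach: wraps both inputs in one more generator layer."""
    yield from left
    yield from right


def combine_nested(parts):
    """Combine parts pairwise; nesting depth grows with the number of parts."""
    combined = iter(())
    for part in parts:
        combined = chain_pair(combined, part)
    return combined


def combine_flat(parts):
    """Flat approach: a single iterator walks every part in turn."""
    return itertools.chain.from_iterable(parts)


def nested_hits_recursion_limit(n_parts):
    """Return True if iterating the nested chain raises RecursionError."""
    try:
        for _ in combine_nested([[i] for i in range(n_parts)]):
            pass
    except RecursionError:
        return True
    return False


few_parts = list(combine_nested([[1], [2], [3]]))              # [1, 2, 3]
many_parts = list(combine_flat([[i] for i in range(5000)]))    # 0 .. 4999
```

In the nested form every item also has to pass through all the layers above it, which is one plausible reason long chains of additions iterate slowly.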

Improvements

1. Added vulnerabilities scanning
2. [TH2-4828] EventNotFound and MessageNotFound now report the error description
passed as an argument instead of a pre-written one.
3. [TH2-4775] Speed up `Data.build_cache` by disabling garbage collection while
the pickle file is being stored.
4. [TH2-4901] Added gap_mode and zero_anchor parameters for message and event
utils get_category_frequencies methods.
[See doc](documentation/frequencies.md)
5. [TH2-5048] Added typing hints for resolver methods.
6. [TH2-5172] Add faster implementations of the following
ProtobufTimestampConverter functions: to_microseconds, to_milliseconds,
to_nanoseconds.
7. [TH2-5081] `Data.__str__` was changed --> use `Data.show()` instead of `print(data)`
8. [TH2-5201] Performance improvements have been made to converters (see the benchmark below).
9. [TH2-5101] Data.update_metadata() now takes a `change_type` argument (values: `update` (default) or `change`),
which denotes whether to update or overwrite with new values.
10. [TH2-5099] Fixed slow iteration for Data objects created with many addition operators.

Benchmark for the converter improvements ([TH2-5201]).
- 1 million iterations per test
- input: 2022-03-05T23:56:44.123456789Z

| Converter | Method | Before (seconds) | After (seconds) | Improvement (rate) |
|----------------------------------|------------------|------------------|-----------------|--------------------|
| DatetimeStringConverter | parse_timestamp | 7.1721964 | 1.4974268 | x4.78 |
| | to_datetime | 8.9945099 | 0.1266325 | x71.02 |
| | to_seconds | 8.6180093 | 1.5360991 | x5.62 |
| | to_microseconds | 7.9066440 | 1.7628856 | x4.48 |
| | to_nanoseconds | 7.6787507 | 1.7114960 | x4.48 |
| | to_milliseconds | 7.6059985 | 1.7688387 | x4.29 |
| | to_datetime_str | 8.3861742 | 2.3781561 | x3.52 |
| | to_th2_timestamp | 7.7702033 | 1.5942235 | x4.87 |
| UniversalDatetimeStringConverter | parse_timestamp | 7.4161371 | 1.5752227 | x4.7 |
| | to_datetime | 8.2108218 | 0.1267797 | x64.76 |
| | to_seconds | 7.7745484 | 1.6453126 | x4.72 |
| | to_microseconds | 7.7569293 | 1.8240784 | x4.25 |
| | to_nanoseconds | 7.7879700 | 1.7930200 | x4.34 |
| | to_milliseconds | 7.8168710 | 1.8308856 | x4.26 |
| | to_datetime_str | 8.7388529 | 2.4592992 | x3.55 |
| | to_th2_timestamp | 7.8972679 | 1.6856898 | x4.68 |

Other converters also show minor speed improvements.
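
The benchmark input carries nanosecond precision, which `datetime` objects (microsecond resolution) cannot hold. The sketch below shows one way such a string could be converted to epoch nanoseconds with the standard library, plus a small `timeit` harness. The function `to_nanoseconds` here is a hypothetical standalone helper, not the library's converter, and the timing it prints is not comparable to the table above.

```python
import calendar
import time
import timeit

SAMPLE = "2022-03-05T23:56:44.123456789Z"


def to_nanoseconds(timestamp: str) -> int:
    """Convert 'YYYY-MM-DDTHH:MM:SS[.fraction][Z]' (UTC) to epoch nanoseconds."""
    body = timestamp.rstrip("Z")
    seconds_part, _, fraction = body.partition(".")
    parsed = time.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S")
    epoch_seconds = calendar.timegm(parsed)  # interprets the tuple as UTC
    # Pad or cut the fractional part to exactly 9 digits (nanoseconds).
    nanos = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return epoch_seconds * 1_000_000_000 + nanos


result = to_nanoseconds(SAMPLE)

# Far fewer iterations than the 1 million used in the benchmark above.
elapsed = timeit.timeit(lambda: to_nanoseconds(SAMPLE), number=10_000)
print(f"10,000 conversions took {elapsed:.4f} s")
```

Keeping the fractional part as an integer avoids the rounding that a float-based conversion would introduce at nanosecond scale.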

11. [TH2-5213] Extend cache_files_reading_speed with csv support.

1.3.1

Improvements

1.3.0

User impact and migration instructions
This release implements performance bug fixes and provides Data object cache file saving and loading.

1. [I] Logging was removed from the library. Only special builds will have logging.
Users cannot use the `add_stderr_logger` and `add_file_logger` logging functions.
[M] Remove any usage of DS lib logging.
2. [I] Since `v1.3.0`, the library doesn't provide data source dependencies.

[M] You should specify them manually during installation.
Add square brackets after the library name and put the dependency name inside them:


`pip install th2-data-services[dependency_name]`


**Dependencies list**

| dependency name | provider version |
|:--------:|:-------:|
| RDP5 | 5 |
| RDP6 | 6 |

**Example**


`pip install th2-data-services[rdp5]`


Features
1. [TH2-4289] Data.build_cache and Data.from_cache_file features were added.
2. Added `Data.cache_status` property
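
The idea behind a record cache file can be sketched with the standard library: records are pickled one after another and read back lazily. The functions `write_cache` and `read_cache` below are hypothetical helpers, not the library's `Data.build_cache` and `Data.from_cache_file`; the file layout is an assumption. Garbage collection is switched off during the write to mirror the speed-up described for [TH2-4775] in 2.0.0.

```python
import gc
import os
import pickle
import tempfile


def write_cache(records, path):
    """Pickle records one by one, with garbage collection paused meanwhile."""
    gc_was_enabled = gc.isenabled()
    gc.disable()
    try:
        with open(path, "wb") as f:
            for record in records:
                pickle.dump(record, f)
    finally:
        # Restore the collector only if it was running before.
        if gc_was_enabled:
            gc.enable()


def read_cache(path):
    """Yield records from the cache file until the end of the file."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


cache_path = os.path.join(tempfile.mkdtemp(), "records.pickle")
events = [{"eventId": "a", "successful": True}, {"eventId": "b", "successful": False}]
write_cache(events, cache_path)
restored = list(read_cache(cache_path))
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.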

Improvements
1. [TH2-4379] Speed improvements in json deserialization.
- StreamingSSEAdapter will now handle bytes from sse-stream into Dict objects.
- SSEAdapter is now deprecated class.
2. The Data object will generate a warning if you pass it a generator object.
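
A likely reason for the generator warning is that a generator can be consumed only once, so a second iteration silently yields nothing. The sketch below shows that behavior and a possible check. The `check_source` helper, its message and its warning category are hypothetical; the library's actual warning may differ.

```python
import inspect
import warnings


def check_source(source):
    """Warn when the source is a generator, which can be iterated only once."""
    if inspect.isgenerator(source):
        warnings.warn(
            "source is a generator and can be iterated only once",
            RuntimeWarning,
            stacklevel=2,
        )
    return source


squares = (n * n for n in range(3))
first_pass = list(squares)   # [0, 1, 4]
second_pass = list(squares)  # []: the generator is already exhausted
```

Passing a list, or a function that creates a fresh generator on every call, avoids the problem because the source can then be iterated repeatedly.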

BugFixes
1. [TH2-4385] Logging in the Data object slowed down the ds library significantly.
- Logging was removed.
- `add_stderr_logger` and `add_file_logger` are not available anymore.
2. [TH2-4380] Fixed apply_adapter feature for GetMessages / GetEvents / GetEventById / GetMessageById
3. [TH2-3767] Fixed bug with limit of Data object in Windows.
4. [TH2-4460] Fixed bug where GRPC omitted fields with None value in response.

1.2.3

BugFixes

1. [TH2-4234] The library can now be run on Windows.

1.2.2

BugFixes

1. [TH2-4195] EventsTree without parent raises `EventIdNotInTree` exception when trying to use `get_parent()` method

1.2.1

BugFixes
1. Added missing library importlib_metadata
