Daffodil

Latest version: v0.5.4

0.5.4

Add sort_by_colnames(self, colnames: T_ls, reverse: bool=False, length_priority: bool=False).
Add daf_utils.sort_lol_by_cols().
Add argument 'omit_nulls' to the .to_list() method.
Change references to klist.values to ._values to avoid ambiguity with property getters and setters.
Add annotate_daf(self, other_daf: 'Daf', my_to_other_dict: T_ds) to effectively join two tables.
Fix value_counts_daf() by adding .to_list() for the total. (See the usage sketch below.)
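
A minimal usage sketch of a few of the 0.5.4 additions above. The constructor form, import path, and exact return shapes are assumptions and may differ from the actual API:

    from daffodil.daf import Daf   # import path taken from the 0.4.2 packaging notes below

    # Hypothetical small table; column names and values are illustrative only.
    my_daf = Daf(cols=['name', 'age'], lol=[['Bea', 30], ['Al', None], ['Cy', 25]])

    # New in 0.5.4: sort rows by one or more column names.
    my_daf.sort_by_colnames(colnames=['age'], reverse=True)

    # New 'omit_nulls' argument: convert a column to a plain list, skipping nulls.
    ages = my_daf[:, 'age'].to_list(omit_nulls=True)

    # New annotate_daf(): effectively join columns from another table, mapping this
    # table's columns to the other's via my_to_other_dict (usage assumed from the signature).
    # my_daf.annotate_daf(other_daf=other_daf, my_to_other_dict={'name': 'full_name'})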

0.5.3

Added tests for:
flatten()
to_json() (not completely working)
from_json() (not completely working)
Added __format__ to allow use of {:,} and other f-string formatting; invokes .to_value().
Added value_counts() as an alias for valuecounts_for_colname() to match Pandas syntax. (See the sketch below.)
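
A short sketch of the new __format__ behavior and the value_counts() alias, under the assumption that a cell selection formats via .to_value() and that value_counts() takes the column name directly:

    from daffodil.daf import Daf

    my_daf = Daf(cols=['city', 'pop'],
                 lol=[['Oslo', 709000], ['Bergen', 291000], ['Oslo', 709000]])

    # __format__ invokes .to_value(), so a cell selection can be formatted directly
    # in an f-string, e.g. with a thousands separator (assumed behavior).
    print(f"{my_daf[0, 'pop']:,}")          # expected to print 709,000

    # value_counts() is the new Pandas-style alias for valuecounts_for_colname().
    city_counts = my_daf.value_counts('city')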

extend .iloc to support klist and list rtypes.
Added .to_klist() to return a record as KeyedList type.
extended .assign_col() to insert a column if the colname does not exist.
Enhanced KeyedList() to allow both args to be None, and thus initialize to empty KeyedList.
insert_col_in_lol_at_icol():
fixed a bug when icol resolves to adding a column ('>' changed to '>=').
allow an empty lol and create a lol with one column if col_la exists.
Add .iter_list() to allow iteration over just lol without cols defined.
Fixed __format__ so an unadorned daf name prints the summary; it takes more than {daf:} in an f-string to trigger value formatting.
Improved robustness of num_cols() to check first few rows.
TODO: It will probably be better to keep the number of columns as a stored value rather than calculating it every time.
changed name of values in KeyedList to _values and created accessors.
added support for Iterables passed for row and col selection.
Added method remove_dups(), which returns unique records and duplicated records based on the keyfield.
Changed operation of assign_col to append the column on the right if the colname does not exist. (See the sketch below.)
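
A sketch of a few of the items above (.assign_col() appending a new column, .to_klist(), and Iterable row/column selection); the argument names and order shown are assumptions:

    from daffodil.daf import Daf

    my_daf = Daf(cols=['a', 'b'], lol=[[1, 2], [3, 4], [5, 6]])

    # assign_col() now appends the column on the right if the colname does not exist
    # (the (colname, values) argument form is an assumption).
    my_daf.assign_col('c', [7, 8, 9])

    # .to_klist() returns one record as a KeyedList rather than a dict
    # (the keyword name 'irow' is an assumption).
    row_klist = my_daf.to_klist(irow=0)

    # Row and column selection now accept Iterables, e.g. a range object.
    subset_daf = my_daf[range(0, 2), ['a', 'c']]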

Worked around an error in the Pympler asizeof.asizeof() function used in daf_benchmarks; this appears to be resolved in later releases of pympler.

0.5.2

v0.5.1 (2024-05-25)
Changed dependencies in pyproject.toml to allow newer versions.
Upgraded to Python 3.11 and upgraded all libraries to the latest.
Using venv311.

v0.5.2 (2024-05-30)
Added .iter_dict() and .iter_klist() to force iteration to produce either dicts or KeyedLists.
Producing KeyedLists means each row list is not copied into a dict; mutating the KeyedList mutates the underlying lol. (See the sketch below.)
Corrected calculation of slice_len to fix column assignment from another column.
This may still be ambiguous when a nested list structure is meant to be assigned to a single cell:
collist = my_daf[:, 'colname'].to_list()    # returns a list, sometimes of only one value
my_daf[:, 'colname2'] = collist             # ambiguous whether a one-item list should be placed in the cell or just the value
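
A sketch of the two iteration modes, under the assumption that KeyedList rows are views that write through to the underlying lol while dict rows are copies:

    from daffodil.daf import Daf

    my_daf = Daf(cols=['a', 'b'], lol=[[1, 2], [3, 4]])

    # .iter_dict(): each row is copied into a dict, so edits affect only the copy.
    for row_dict in my_daf.iter_dict():
        row_dict['a'] += 10      # my_daf is unchanged

    # .iter_klist(): each row is a KeyedList over the original row list, so edits
    # write through to the underlying lol (assumed from the notes above).
    for row_klist in my_daf.iter_klist():
        row_klist['a'] += 10     # my_daf is mutated in place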

0.5.0

v0.5.0 (2024-05-23)
Added split_where(self, where: Callable), which makes a single pass and splits the daf array in two: true_daf and false_daf.
Added multi_groupby(), reduce_dodaf_to_daf(), and multi_groupby_reduce() to Daffodil.
Added class KeyedList() to provide a new data item that functions like a dict but is stored as an hd (header dict) plus a list;
this can result in much better performance by not redistributing values into the dict structure.
It is not yet fully integrated into daffodil. (See the sketch below.)
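
A minimal sketch of split_where() and KeyedList(), based on the descriptions above; the dict-like row argument passed to the callable, the KeyedList constructor arguments, and the KeyedList import path are assumptions:

    from daffodil.daf import Daf
    from daffodil.keyedlist import KeyedList   # import path is an assumption

    my_daf = Daf(cols=['name', 'score'], lol=[['Al', 80], ['Bea', 55], ['Cy', 95]])

    # Single pass over the array: rows where the callable is truthy go to true_daf,
    # the rest go to false_daf.
    true_daf, false_daf = my_daf.split_where(lambda row: row['score'] >= 60)

    # KeyedList acts like a dict but is stored as an hd (header dict) plus a list,
    # so values are not redistributed into dict slots.
    kl = KeyedList(['name', 'score'], ['Al', 80])
    kl['score'] = 85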

Removed '_da' from many Daffodil method names and keyword parameters, to allow a future upgrade to KeyedList.
select_record_da() -> select_record()
record_append()
_basic_get_record_da -> _basic_get_record
assign_record_da() -> assign_record()
assign_record_da_irow -> assign_record_irow
update_by_keylist()
update_record_da_irow -> update_record_irow
changed test_daf accordingly.

Added _build_hd() to consistently build header dict structure.
Added to_json() and from_json() methods to allow generation of custom JSONEncoder.
Changed nomenclature in KeyedList class from dex to hd.
Added from_json and to_json to KeyedList class to allow custom JSONEncoder to be developed.

select_record() silently returns {} if self is empty.

fixed _itermode vs. itermode.
Added .strip() method.
Corrected icols when providing a single str column name and when column names have more than one character each.
Added a 'flatten' option to the .to_list() method, which combines the lol into a single list.
Added .num_rows() which will more robustly calculate the number of rows in edge cases.
Fix unflattening issue discovered when running edge_test_utils.py.
Updated documentation to reflect new approach to dtypes and flattening.

0.4.2

v0.4.2 (2024-05-01)
Modified packaging for distribution on PyPI to hopefully make it compatible with installation into AWS Lambdas.
Tried to use pyproject.toml and flit, but flit seems to have poor toml parsing and could not find a suitable toml file.
Went back to setup.py and setuptools, but reorganized files into the daffodil/src folder, which is included in the distro.
To use --editable mode for local development, PYTHONPATH must be set to refer to the daffodil/src folder.
In that folder are daffodil/daf.py and daffodil/lib/daf_(name).py.
To import supporting .py files from lib, use, for example: import daffodil.lib.daf_utils as utils.

0.4.1

v0.4.0 (2024-04-30)
Better dtypes support: apply_dtypes(), flatten(), copy().
Added disabling of garbage collection during timing; this gives more consistent results but does not explain the anomaly.
Improved the philosophy of apply_dtypes() and flatten():
After loading a CSV file, set dtypes and then use my_daf.apply_dtypes().
Before writing, use my_daf.flatten() to flatten any list or dict types, if applicable. (See the sketch below.)
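
A sketch of that round trip; the parameter names follow the notes in this release (initially_all_str was renamed to from_str in 0.4.1), and the constructor form and dtypes specification are assumptions:

    from daffodil.daf import Daf

    # Suppose a CSV load produced an all-str array (constructor form is illustrative).
    my_daf = Daf(cols=['age', 'tags'], lol=[['42', '[1, 2]'], ['7', '[3]']])

    # Declare dtypes and convert the whole array in one pass; str columns are skipped
    # when initially_all_str is True.
    my_daf.apply_dtypes(dtypes={'age': int, 'tags': list}, initially_all_str=True)

    # Before writing back out, flatten any list or dict cells back to str form.
    my_daf.flatten()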

apply_dtypes() now handles the entire array, and will skip str entries if initially_all_str is True.

unflatten_cols() DEPRECATED. use apply_dtypes()
unflatten_by_dtypes() DEPRECATED. use apply_dtypes()
flatten_cols() DEPRECATED. use flatten()
flatten_by_dtypes() Renamed: use flatten()

Added an optional dtypes parameter to apply_dtypes(), which is used to initialize dtypes in the daf object and
to convert types within the array.
Changed from la type to Iterable in reduce().
added disabling of garbage collection in daf_benchmarks.py
deprecated functions dealing with hdlol type which was a precursor to daf.
added convert_type_value() to convert a single value to a desired type and unflatten if enabled.
Removed use of set_dict_dtypes() from apply_dtypes(); instead, conversion is done on the entire daf array for efficiency.
added in daf_utils.py unflatten_val(), json_decode(), validate_json_with_error_details, and safe_convert_json_to_obj.
Added .copy(deep:bool) method to match pandas syntax.
Added reference to 1994 workshop in flatten() method docstr.
Changed packaging code from the setup.py approach to pyproject, but still not able to import correctly in the Lambdas container.

v0.4.1 (2024-04-30)
fixed tests to reflect changes to type conversion paradigm.
Changed the apply_dtypes parameter 'initially_all_str' to 'from_str'.
Fixed set_dict_dtypes() in the case of dtypes = {}; changed the parameter to 'dtypes' for uniformity.
set_dict_dtypes() now also modifies types in-place.

Fixes: https://github.com/raylutz/daffodil/issues/10
