Dimcat

Latest version: v3.2.0

3.2.0

[3.2.0](https://github.com/DCMLab/dimcat/compare/v3.1.0...v3.2.0) (2024-01-30)

Highlights

* new property `DimcatResource.metadata`, populated by `Dataset`
* new `PrevalenceAnalyzer` and its result types:
  * PrevalenceMatrix
  * RelativePrevalenceMatrix
  * CulledPrevalenceMatrix
  * CulledRelativePrevalenceMatrix
  * GroupwisePrevalenceMatrix
* additional plot types:
  * make_line_plot
  * make_scatter_3d_plot

Features

* adds basis of PrevalenceAnalyzer and PrevalenceMatrix(Result) ([1f5469f](https://github.com/DCMLab/dimcat/commit/1f5469fc54fd49be678f9fe90a056f1d274d1c1d))
* adds DimcatResource.join_on_index() ([269c12c](https://github.com/DCMLab/dimcat/commit/269c12c10aca289f76c5149df32319c2975ff0c7))
* adds methods .get_culled_matrix() and .get_relative_matrix() to PrevalenceMatrix, together with the relevant subclasses as result types ([523418d](https://github.com/DCMLab/dimcat/commit/523418dce8b1f660078c88d61ca5afc0e2c873db))
* adds plotting.make_line_plot(), factoring out _make_plots() boilerplate that all plotting functions share ([a1ba183](https://github.com/DCMLab/dimcat/commit/a1ba1832ce71bdab085dfe92245ae66cd892170a))
* adds plotting.make_scatter_3d_plot() ([ec13ba8](https://github.com/DCMLab/dimcat/commit/ec13ba83072a2afe6a24ffe7f152df16e54bf476))
* adds property DimcatResource.metadata which the Dataset populates upon feature extraction but is not serialized ([c5e2eee](https://github.com/DCMLab/dimcat/commit/c5e2eeea3ffaf3f67f3aaebe6ccacbc6a97eafe5))
* adds relevant properties and methods to PrevalenceMatrix ([328fff6](https://github.com/DCMLab/dimcat/commit/328fff623bbb10264775925ef3c91bafd078621b))
* adds utils.str2pd_interval ([da24cb9](https://github.com/DCMLab/dimcat/commit/da24cb9853b347d77bc569512fb76d6e1fdc41d1))
* allows friendly comparison for FriendlyEnums, such as "desc" == SortOrder.DESCENDING -> True ([03d57f6](https://github.com/DCMLab/dimcat/commit/03d57f6315810ca6d74e65c10366fe9a0b1b6346))
* implements PrevalenceAnalyzer.compute() staticmethod, .__init__(), .groupby_apply() and Schema ([25d558b](https://github.com/DCMLab/dimcat/commit/25d558b376aa5fc3842da20ee1ca32c6636fc6ec))
* implements PrevalenceMatrix.combine_results() ([35fbb39](https://github.com/DCMLab/dimcat/commit/35fbb393b7b8fcc26783f1b142e7707fa6e9c8ec))
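
The "friendly comparison" entry above can be illustrated with a minimal sketch. It is not DiMCAT's actual `FriendlyEnum` implementation: the base class `FriendlySketch`, the member values, and the prefix-matching rule are all assumptions made for illustration. Only the behaviour `"desc" == SortOrder.DESCENDING -> True` comes from the changelog.

```python
from enum import Enum


class FriendlySketch(str, Enum):
    """Hypothetical stand-in for a 'friendly' enum (not DiMCAT's class)."""

    @classmethod
    def _missing_(cls, value):
        # Enum calls this hook when a lookup by value fails.
        # Assumption: accept any case-insensitive, unambiguous prefix.
        if not isinstance(value, str):
            return None
        needle = value.upper()
        matches = [member for member in cls if member.value.startswith(needle)]
        return matches[0] if len(matches) == 1 else None

    def __eq__(self, other):
        if isinstance(other, Enum):
            return self is other
        try:
            # Resolve the plain value through the enum lookup first.
            return type(self)(other) is self
        except ValueError:
            return False

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Defining __eq__ disables the inherited hash, so restore one.
        return hash(self._name_)


class SortOrder(FriendlySketch):
    # Member values are assumed here for the sake of the example.
    ASCENDING = "ASCENDING"
    DESCENDING = "DESCENDING"
    NONE = "NONE"


is_descending = "desc" == SortOrder.DESCENDING
```

Because the enum subclasses `str` and overrides `__eq__`, Python gives its comparison priority even when the plain string is the left operand, which is why the string-first spelling from the changelog works in this sketch.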

Bug Fixes

* catches pandas 2.2.0 warning(s) ([88e86bd](https://github.com/DCMLab/dimcat/commit/88e86bd60b2466afceb9dcaa3127f05b335568f2))
* enables DimcatResource.from_resource_path() by expecting a "corpus" and a "piece" column ([6c70e61](https://github.com/DCMLab/dimcat/commit/6c70e617485b09b382abc4aa68e327cc53c92981))
* import Self from typing_extensions (not typing) to maintain Python 3.10 compatibility ([a104358](https://github.com/DCMLab/dimcat/commit/a1043586a877e4d04b1ff35656f8b8748c8a271b))
* infer_schema_from_df() now can deal with column MultiIndex that involves integer values ([a43fdf7](https://github.com/DCMLab/dimcat/commit/a43fdf706f473d9e944b0ced0aeafcfee94ce6ee))
* plotting functions allow for a single string as argument to 'hover_data' ([57230df](https://github.com/DCMLab/dimcat/commit/57230df02c130475ba8e24c271e99c02c8638e64))
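
As a rough illustration of the `hover_data` fix above, an argument that may be either a single string or a collection of strings is commonly normalized before use. The helper name `normalize_hover_data` is hypothetical and does not appear in DiMCAT; this is only a sketch of the general pattern.

```python
from typing import Iterable, List, Optional, Union


def normalize_hover_data(
    hover_data: Optional[Union[str, Iterable[str]]]
) -> List[str]:
    """Return hover_data as a list of column names (hypothetical helper)."""
    if hover_data is None:
        return []
    if isinstance(hover_data, str):
        # A bare string would otherwise be iterated character by character.
        return [hover_data]
    return list(hover_data)


single = normalize_hover_data("piece")
several = normalize_hover_data(("corpus", "piece"))
```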

Documentation

* docstring for DimcatConfig ([fe35049](https://github.com/DCMLab/dimcat/commit/fe35049352d5de349d604ee6fb720f1040212876))

3.1.0

[3.1.0](https://github.com/DCMLab/dimcat/compare/v3.0.0...v3.1.0) (2024-01-16)

Features

* adds analyzers.PhraseDataAnalyzer() which takes features.PhraseAnnotations and produces results.PhraseData ([a4a7dd5](https://github.com/DCMLab/dimcat/commit/a4a7dd5daa27c3c9763fb4e9e33b898299065582))
* adds basic HarmonyLabelSlicer ([a9b48a8](https://github.com/DCMLab/dimcat/commit/a9b48a885009773ec67f851bde3d7d815df85171))
* adds convenience module `dimcat.enums` for easily importing any enum from DiMCAT. ([f626f90](https://github.com/DCMLab/dimcat/commit/f626f90a4a9a0ff9694122bc59827f6b4e0b9a8e))
* adds DimcatResource.store_resource() ([57a12f5](https://github.com/DCMLab/dimcat/commit/57a12f5cca7439b597109c3be955d887a5fb7b06))
* adds helper functions to resources.utils ([b6f5cd2](https://github.com/DCMLab/dimcat/commit/b6f5cd279d72ccef21bf5b596c973e8f8affc3d9))
* adds Metadata.get_corpus_names() to retrieve the names in chronological order ([a5c3988](https://github.com/DCMLab/dimcat/commit/a5c398861761cf3fe44784dd2958fcd1626da03f))
* adds methods .get_steps() and .get_last_step() to Pipeline and to Dataset ([61af067](https://github.com/DCMLab/dimcat/commit/61af067a6fe3ba83cac195196f1094c28faf8f61))
* adds plotting.make_box_plot() ([313fe12](https://github.com/DCMLab/dimcat/commit/313fe12316405868f5fe39b25519385edb311a0a))
* adds resources.utils.transpose_notes_to_c() ([7f37551](https://github.com/DCMLab/dimcat/commit/7f3755190b5b0d86e8e61b17ef2832076df4c500))
* adds SmallestUnit.CORPUS_GROUP member for completeness and streamlines .get_grouping_levels() methods ([2de9524](https://github.com/DCMLab/dimcat/commit/2de95249b0a929dcfc4110df4d6abbd74ea1fffe))
* enhances DimcatConfig.matches() with 'variant' and 'covariant' arguments; adds base.make_config_from_specs() to mirror base.make_object_from_specs() ([f097ed5](https://github.com/DCMLab/dimcat/commit/f097ed5523da2ec53d0bdd89f0a80e1f2fe7dc68))
* enhances utility functions and adds Resources.get_resource_name() ([9c59c6b](https://github.com/DCMLab/dimcat/commit/9c59c6b9e43dddd3ff61141595da58979953fd23))
* first version of make_phrase_selection_masks() ([72f03fd](https://github.com/DCMLab/dimcat/commit/72f03fd5d9f079a4305404410c2a2da77f739fc7))

Bug Fixes

* adds 'ignore_exceptions' argument to Dataset.extract_feature() which defaults to True (remedy for previously unprocessed features added to the Dataset in the case of exceptions) ([28e91fd](https://github.com/DCMLab/dimcat/commit/28e91fda43bdf6c838cd5331d5dada6960a7a142))

Documentation

* updates notebooks submodule ([12b4818](https://github.com/DCMLab/dimcat/commit/12b4818175e9a7a6d5d319ed51de749f07f943c9))

3.0.0

[3.0.0](https://github.com/DCMLab/dimcat/compare/v2.3.0...v3.0.0) (2023-12-13)

⚠ BREAKING CHANGES

* eliminates .apply_steps() in favour of a single .apply_step(*step), that is, with variadic argument. For backward compatibility, the method still accepts a single list or tuple
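
The calling convention described above (a variadic argument that still accepts one list or tuple) can be sketched as follows. This is a free-standing function written for illustration, not DiMCAT's method: the signature, the `data` parameter, and the use of plain callables as steps are assumptions.

```python
from typing import Any, Callable

Step = Callable[[Any], Any]


def apply_step(data: Any, *step: Step) -> Any:
    """Apply one or several steps in order (hypothetical sketch)."""
    # Backward compatibility: a single list or tuple is unpacked so that
    # apply_step(data, [a, b]) behaves like apply_step(data, a, b).
    if len(step) == 1 and isinstance(step[0], (list, tuple)):
        step = tuple(step[0])
    for single_step in step:
        data = single_step(data)
    return data


def increment(x: int) -> int:
    return x + 1


def double(x: int) -> int:
    return x * 2


variadic_result = apply_step(1, increment, double)
legacy_result = apply_step(1, [increment, double])
```

Both calls run the same two steps in the same order, so code written against the old list-based call keeps working.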

Features

* adds four additional columns to HarmonyLabels and BassNotes which contain the (main) chord tones expressed as scale degrees ([396dce9](https://github.com/DCMLab/dimcat/commit/396dce9ad3aa036d13f4e623a5d2954e98dbbdba))
* adds Result.compute_entropy() and Transitions.compute_information_gain() ([c1257a8](https://github.com/DCMLab/dimcat/commit/c1257a84e7073544ec252c6ae3e5a9fe47e8c4bc))
* AdjacencyGroupSlicers now process the required_feature during .fit_to_dataset(), store it as property .slice_metadata and join it onto any processed Metadata object. In the future, there could also be a mode where this metadata is joined onto any processed feature. ([cea586e](https://github.com/DCMLab/dimcat/commit/cea586e6c318da224382a7119baedfb6be0add22))
* empowers NgramTable to make_bi/ngram_tables and NgramTuples with components made up from different columns and with individual join_str and fillna settings ([c8488cf](https://github.com/DCMLab/dimcat/commit/c8488cf9d3bb22b3480af021c24af2add0b96cb6))
* enables adding context_columns for the NgramTable's methods .get_bi/ngram_tuples() and get_bi/ngram_table(). The NgramAnalyzer therefore adds the relevant column names in post-processing. ([fe7ee3a](https://github.com/DCMLab/dimcat/commit/fe7ee3a969712528849dfbedfa28f6914a6e2e6c))
* enables applying Slicers to Metadata by joining them on the SliceIntervals (DimcatIndex) ([a4c3929](https://github.com/DCMLab/dimcat/commit/a4c39291cdfc09f5d99e50d072eaa49ebb625274))
* enables dropping ngram rows which include/correspond to terminals ([89a2552](https://github.com/DCMLab/dimcat/commit/89a25522ff1d02e94e549ccedd1935c931ea2a8c))
* enables the detailed control of terminals which may differ for different n-gram components (except the first one). ([f6a807f](https://github.com/DCMLab/dimcat/commit/f6a807f4417ccb706343b832ef5ca764ac82d709))
* HarmonyLabels and BassNotes features now come with an intervals_over_bass and (for the former) with an intervals_over_root column ([be8d06d](https://github.com/DCMLab/dimcat/commit/be8d06d461dc1cabf9802b9af39cac8d845df1a1))
* includes "root" as auxiliary column for BassNotes ([3f7bd35](https://github.com/DCMLab/dimcat/commit/3f7bd35db460011cb362c812f12dbdc3c9026703))
* makes the 'data' argument to PipelineStep.process() a variadic one, too (concordant with .apply_step()), while still accepting a single argument that can be a list or tuple ([7a37aaa](https://github.com/DCMLab/dimcat/commit/7a37aaaf1bb7607035292efcc072a649b69c3f77))
* Metadata.get_composition_years() now with 'group_cols' parameter to compute composition year means of groups (e.g. corpora) ([fef9860](https://github.com/DCMLab/dimcat/commit/fef986000a4cbd2eaf37d89a61d05442fe64ee5b))
* methods .make_ngram_table() and .make_bigram_table() of NgramTable now actually return a new NgramTable, whereas the previous functions of that name (which returned dataframes) have been renamed to .make_bigram_df() and .make_ngram_df(). ([8dbff20](https://github.com/DCMLab/dimcat/commit/8dbff202f8d71c54ecbbf2bcc32e00d0363a45f4))
* NgramTable gets the convenience method .compute_information_gain() to skip an intermediate call to .get_transitions() ([5b37414](https://github.com/DCMLab/dimcat/commit/5b374141cdb55216fa86cbb668b7867fbead2a6f))
* NgramTable._get_transitions() is cached and now complete with the terminal_symbols argument ([bd12568](https://github.com/DCMLab/dimcat/commit/bd1256825f7ddc128459519068ce6d2955f73eac))
* reduces the amount of parentheses in n-grams by not turning 'single' components (with only one column) into tuples ([02f91d4](https://github.com/DCMLab/dimcat/commit/02f91d40d9935996e0e12625801ab1e646d54fb2))
* streamlines turning n-grams into strings and allows for doing it recursively (useful when columns making up n-gram components contain tuples themselves) ([745df2e](https://github.com/DCMLab/dimcat/commit/745df2ed8471ceafc8eb3d41028e205161724f20))

Bug Fixes

* adapts scipy.stats.entropy() to fix bug caused by pd.Float64Dtype ([4938170](https://github.com/DCMLab/dimcat/commit/4938170539e6072bcd9795797ef29c863cd63f60))
* allow DimcatResource.filter_index_level() to just drop the level without filtering rows ([5c07d97](https://github.com/DCMLab/dimcat/commit/5c07d9794ca1c5beed49aa7a282f0b9e5e9a8e04))
* applying a Grouper needs to be an inner join. Also, the index levels should come in systematic order, first the grouper levels, then the remaining ones ([8f80fc2](https://github.com/DCMLab/dimcat/commit/8f80fc2bc07582ea6584c2b494384daf5ac1199a))
* enables (de-)serialization for Filter objects ([976c179](https://github.com/DCMLab/dimcat/commit/976c1797561d1d2076fd1a1479da64815d247fa9))
* fills up missing 'quarterbeats_all_endings' column for older parts of the dataset ([390e0a5](https://github.com/DCMLab/dimcat/commit/390e0a5ce99c3b3f77ddd9d0d9be903fd854caad))
* Groupers that use metadata now should use Dataset.get_metadata(raw=True) ([5d35b20](https://github.com/DCMLab/dimcat/commit/5d35b20044a24a47cfa3848dc4d2dbd226ba4a8f))
* grouping by a single level that contains tuples resulted in several levels in the resulting MultiIndex; this fix applied for completeness before the whole function is simplified ([0ed6091](https://github.com/DCMLab/dimcat/commit/0ed6091e323074670281c7389603b4605412e494))
* NgramTable.get_default_analysis() returns Transitions ([b90f0ae](https://github.com/DCMLab/dimcat/commit/b90f0aea80240077085f2b85bafacc3124222be4))
* omit duplicate computation of 'proportions' by Transitions._sort_combined_result() ([2f09bbb](https://github.com/DCMLab/dimcat/commit/2f09bbb600919c7ccb80749306ea449dcfc2b238))
* raise NotImplementedError when trying to use convenience methods directly on Transitions object ([7cf61c4](https://github.com/DCMLab/dimcat/commit/7cf61c47b07e91e928c749836469f7e4d4b7e5f3))
* re-inserts missing import ([02bd96c](https://github.com/DCMLab/dimcat/commit/02bd96c7c26711e2fb634ac6cd34be2c42517ac6))
* singular ngram_components should also become strings (even if they are not joined on 'join_str') ([3987162](https://github.com/DCMLab/dimcat/commit/398716226ce3ca3e9b0c421a68d15583f25b909e))
* when an index level is dropped, make sure to remove it from the default_groupby ([260f8f1](https://github.com/DCMLab/dimcat/commit/260f8f1ae011a917026ea1f4d20e0c50cc8a7035))
* when applying a Filter with drop_level=True, do not turn a Dataset into a GroupedDataset (as per virtue of the respective parent Grouper) ([937002c](https://github.com/DCMLab/dimcat/commit/937002c9eed97418bbc2d75267e9626289eb1eb9))
* when Counter is used with smallest_unit=GROUP, it recurs to self.compute() ([737b6a6](https://github.com/DCMLab/dimcat/commit/737b6a6051f7256596adf8d7f0d665de13962e12))

Reverts

* eliminates .apply_steps() in favour of a single .apply_step(*step), that is, with variadic argument. For backward compatibility, the method still accepts a single list or tuple ([fab8e13](https://github.com/DCMLab/dimcat/commit/fab8e13fafc895ee6338aad2c499dfc497ebfabc))

2.3.0

[2.3.0](https://github.com/DCMLab/dimcat/compare/v2.2.0...v2.3.0) (2023-12-09)

Features

* all schemas retrieved via the .schema or .pickled_schema property allow for loading dicts without 'dtype' key by assuming their own dtype as default ([9ff060e](https://github.com/DCMLab/dimcat/commit/9ff060ea0718242ce7a3b05cad50163bd3c5dc58))
* new category of objects: Filters. They extend any Grouper by adding the init args 'keep_values', 'drop_values', and 'drop_level' to it. They use these arguments to post-process any resource first processed by the corresponding grouper. This required renaming the relatively new HasCadenceAnnotations and HasHarmonyLabels to HasCadenceAnnotationsGrouper and HasHarmonyLabelsGrouper, to differentiate them from the new HasCadenceAnnotationsFilter and HasHarmonyLabelsFilter. The other two filters that have been implemented so far are the CorpusFilter and the PieceFilter. As an aside, Groupers do not complain anymore when they are applied to a resource that has already been grouped by a Grouper of the same type. If the grouping level exists but isn't the first one, it is systematically made the first one. This applies, by extension, to the Filters (for now) ([ec3d1f7](https://github.com/DCMLab/dimcat/commit/ec3d1f7980b5a18a01b0cdb37e93823f1549236d))

Bug Fixes

* adapts NgramAnalyzer's init args & schema ([3e51f97](https://github.com/DCMLab/dimcat/commit/3e51f979d06ee275672ffba8b36e4ad9cff51e61))
* align_with_grouping() did not work for NgramTables because pandas prevents merge with diverging column nlevels, even if one of the sides has no columns ([e51625f](https://github.com/DCMLab/dimcat/commit/e51625fd6b7fd004ff1eb7afeb93139f5d728adc))
* allows passing a list of list (instead of a list of tuples) to DimcatIndex.from_tuples(), useful for de-serializing from JSON ([0cff3c1](https://github.com/DCMLab/dimcat/commit/0cff3c1bd0405b9dda786cb1d8c14e1f0e4c7779))
* extends app_tests.test_analyze() to the actual plotting; warns about non-Analyzer PipelineSteps applied after an Analyzer ([72ef210](https://github.com/DCMLab/dimcat/commit/72ef210eb007db3ef5a5158d7890ddf3add6ee38))
* facet titles must be strings ([ed185f7](https://github.com/DCMLab/dimcat/commit/ed185f792115ecee571e0b24a903a8e2604d35cd))
* improves (de-)serialization of DimcatIndex and, by extension, the MappingGroupers' 'grouped_units' field ([f59673d](https://github.com/DCMLab/dimcat/commit/f59673da2e1a3075d8881d51cc5e3ce60c84e7db))
* parses music21.key.KeySignature the same way as music21.key.Key ([5aa7902](https://github.com/DCMLab/dimcat/commit/5aa790269d43c18f0ebfeebe1c0a097e5b1029a6))
* the frictionless workaround for copying a resource with no path specified is now complete ([5d1426d](https://github.com/DCMLab/dimcat/commit/5d1426d33dbe81f7f97dec2edb967895fe82883b))
* the frictionless workaround for copying a resource with no path specified is now complete ([98ee01d](https://github.com/DCMLab/dimcat/commit/98ee01deda6f6e1e92a88bea416fc4b210794e37))
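
The `DimcatIndex.from_tuples()` fix above concerns a general property of JSON: it has no tuple type, so tuples written out come back as lists. The sketch below uses only the standard library and made-up index values; it does not call DiMCAT.

```python
import json

# Made-up (corpus, piece) pairs standing in for index tuples.
index_tuples = [("corpus_a", "piece_1"), ("corpus_a", "piece_2")]

serialized = json.dumps(index_tuples)
loaded = json.loads(serialized)  # the round trip yields a list of lists

# Converting each inner list back restores the original tuples.
restored = [tuple(pair) for pair in loaded]
```

A constructor that accepts the list-of-lists form directly saves callers from doing this conversion by hand after de-serialization.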

2.2.0

[2.2.0](https://github.com/DCMLab/dimcat/compare/v2.1.0...v2.2.0) (2023-12-07)

Features

* adds HasHarmonyLabels grouper ([4fa92de](https://github.com/DCMLab/dimcat/commit/4fa92debd399a7649ed59c59c98bc334d1228b95))
* enables .get_feature("metadata") for Dataset and DimcatPackage which, in turn, enables Dataset.get_metadata(raw=False) (default), i.e. returning a processed Metadata feature (old behaviour, i.e. without processing, via Dataset.get_metadata(raw=True)) ([731c4d1](https://github.com/DCMLab/dimcat/commit/731c4d1385ba35beaad374c777274ab2331caf67))

Bug Fixes

* align_with_grouping() makes sure to be unpacking DimcatIndex ([a279691](https://github.com/DCMLab/dimcat/commit/a279691f940c2fd220386cec835e25f768009081))
* Analyzer.Schema() adapted ([57748ca](https://github.com/DCMLab/dimcat/commit/57748caacaaf7172aaaf1c8052e68c540fc5577c))
* DimcatResource.from_resource_and_dataframe() also detaches new resource from filepath, if necessary ([6665fdd](https://github.com/DCMLab/dimcat/commit/6665fddaf60c991be505df7b9bc86731e9b8fbd8))

2.1.0

[2.1.0](https://github.com/DCMLab/dimcat/compare/v2.0.0...v2.1.0) (2023-12-07)

Features

* adds 'dimension_column' as argument for all Analyzers; enables default_analyzer for Metadata ([d192a07](https://github.com/DCMLab/dimcat/commit/d192a074af0b1b4799aadd06e222ef8fd8380a8d))
* adds convenience module `dimcat.enums` for easily importing any enum from DiMCAT. ([7cd3a3f](https://github.com/DCMLab/dimcat/commit/7cd3a3fc11d07d37d3dd3a3de5437334798e25de))
* adds PieceGrouper ([55c6d54](https://github.com/DCMLab/dimcat/commit/55c6d546635440c7d623d0db23ad75e1c990bd37))
* enable .make_ranking_table() for NgramTable (convenience for calling .make_ngram_tuples() first) ([5a788d5](https://github.com/DCMLab/dimcat/commit/5a788d5ee96b1ad66a13944c49177a309d005dc8))
* enables group_cols and group_modes for bubble_plots, too ([0d8ba17](https://github.com/DCMLab/dimcat/commit/0d8ba177542e60499f029f4110022e96559c3b0c))
* includes the UnitOfAnalysis enum as 'group_cols' argument for Result's methods ([efd7fdb](https://github.com/DCMLab/dimcat/commit/efd7fdb625b0c7219718d7ea2b397e4231c6c836))
* introduces new HarmonyLabelsFormat "ROMAN_REDUCED" ([0a08952](https://github.com/DCMLab/dimcat/commit/0a089525379b8bf80e6da75daea9fb4a3d994828))
* NgramTable.get_transitions() returns new result type Transitions ([0723fe1](https://github.com/DCMLab/dimcat/commit/0723fe15bad6013f60b0415ae0db18053a984c68))
* NgramTable.make_ngram_tuples() now actually returns tuples, not tables (which are retrieved via .make_ngram_table()). They come as a new Result type, NgramTuples, which also allows for .make_ranking_table() ([3746437](https://github.com/DCMLab/dimcat/commit/3746437c7084fcc4f7705525f8a39443980804d4))
* NgramTable() uses the new Transitions for both .plot() and .plot_grouped() ([b12b130](https://github.com/DCMLab/dimcat/commit/b12b1302a0655f415ab77eabb72b0e18bbe67d12))
* PieceGrouper and CorpusGrouper move the respective index level to level 0 ([1b7f436](https://github.com/DCMLab/dimcat/commit/1b7f436746cfd5c709bbe7c619e0de3ff558e852))
* Transitions result type plots methods return Plotly heatmaps ([3df7880](https://github.com/DCMLab/dimcat/commit/3df78807b7c1de1be8b57534d460b918820915a9))

Bug Fixes

* base.resolve_object_spec() needs to check if config first, then if DimcatObject ([8c8c4d2](https://github.com/DCMLab/dimcat/commit/8c8c4d2955ae4eff23fac762b2d84799e5703e8a))
* do not convert "count" column to "Int64" by default (because of Plotly bug); instead convert integer columns when making ranking tables to prevent counts coming as floats ([59bd92a](https://github.com/DCMLab/dimcat/commit/59bd92a7ee96be0e6daa5f122dc4f470d66e9a97))
* Pipeline calls step.process_resource() instead of ._process_resource() because otherwise the call to .check_resource() is skipped ([024bf65](https://github.com/DCMLab/dimcat/commit/024bf65ec50587f02a9c83f28eb9855d3ccbf173))

Documentation

* moves error to dedicated errors.md notebook. fixes [61](https://github.com/DCMLab/dimcat/issues/61) ([390b76e](https://github.com/DCMLab/dimcat/commit/390b76ea3788206a666c147b36ebe9ff6c71b71c))
