Eda-toolkit

Latest version: v0.0.15


Page 3 of 5

0.0.8

We are excited to announce the release of version 0.0.8, which introduces significant enhancements and new features to improve the usability and functionality of our toolkit.

**New Features:**

1. **Optional `file_prefix` in `stacked_crosstab_plot` Function**
   - The `stacked_crosstab_plot` function has been updated to make the `file_prefix` argument optional. If the user does not provide a `file_prefix`, the function will now automatically generate a default prefix based on the `col` and `func_col` parameters. This change streamlines the process of generating plots by reducing the number of required arguments.
   - **Key Improvement:**
     - Users can now omit the `file_prefix` argument, and the function will still produce appropriately named plot files, enhancing ease of use.
     - Backward compatibility is maintained, allowing users who prefer to specify a custom `file_prefix` to continue doing so without any issues.

2. **Introduction of 3D and 2D Partial Dependence Plot Functions**
   - Two new functions, `plot_3d_pdp` and `plot_2d_pdp`, have been added to the toolkit, expanding the visualization capabilities for machine learning models.
   - **`plot_3d_pdp`:** Generates 3D partial dependence plots for two features, supporting both static visualizations (using Matplotlib) and interactive plots (using Plotly). The function offers extensive customization options, including labels, color maps, and saving formats.
   - **`plot_2d_pdp`:** Creates 2D partial dependence plots for specified features with flexible layout options (grid or individual plots) and customization of figure size, font size, and saving formats.
   - **Key Features:**
     - **Compatibility:** Both functions are compatible with various versions of scikit-learn, ensuring broad usability.
     - **Customization:** Extensive options for customizing visual elements, including figure size, font size, and color maps.
     - **Interactive 3D Plots:** The `plot_3d_pdp` function supports interactive visualizations, providing an enhanced user experience for exploring model predictions in 3D space.
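
For readers unfamiliar with the quantity these functions plot, the following is a minimal, library-free sketch of how a one-feature partial dependence curve can be computed: force the feature to each grid value for every row, then average the model's predictions. The `toy_model` function, the `partial_dependence_1d` helper, and the sample rows are invented for illustration; this is not the implementation used by `plot_2d_pdp` or `plot_3d_pdp`.

```python
from statistics import mean


def toy_model(row):
    """Hypothetical fitted model: the prediction depends on both features."""
    x0, x1 = row
    return 2.0 * x0 + x1 ** 2


def partial_dependence_1d(predict, rows, feature_idx, grid):
    """Average prediction over the data with one feature forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the chosen feature in every row, leave the others untouched.
        modified = [
            tuple(value if i == feature_idx else v for i, v in enumerate(row))
            for row in rows
        ]
        curve.append(mean(predict(r) for r in modified))
    return curve


rows = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]  # made-up data
grid = [0.0, 1.0, 2.0]
pd_curve = partial_dependence_1d(toy_model, rows, feature_idx=0, grid=grid)
print(pd_curve)
```

Because `toy_model` is linear in the first feature with slope 2, the curve rises by 2 per grid step. A 3D plot applies the same idea to a grid over two features at once.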

**Impact:**

- These updates improve the user experience by reducing the complexity of function calls and introducing powerful new tools for model interpretation.
- The optional `file_prefix` enhancement simplifies plot generation while maintaining the flexibility to define custom filenames.
- The new partial dependence plot functions offer robust visualization options, making it easier to analyze and interpret the influence of specific features in machine learning models.
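
The optional `file_prefix` behavior can be sketched as below. The helper name `resolve_file_prefix` and the underscore-joined naming scheme are assumptions made for illustration; the release note only states that the default is derived from `col` and `func_col`, not how.

```python
def resolve_file_prefix(col, func_col, file_prefix=None):
    """Return the caller's prefix if given, otherwise derive one (hypothetical scheme)."""
    if file_prefix is not None:
        return file_prefix
    # Assumption: func_col may be a single column name or a list of names.
    names = [func_col] if isinstance(func_col, str) else list(func_col)
    return "_".join([col, *names])


default_prefix = resolve_file_prefix("age_group", ["sex", "income"])
custom_prefix = resolve_file_prefix("age_group", ["sex"], file_prefix="custom")
print(default_prefix)  # age_group_sex_income
print(custom_prefix)   # custom
```

Callers that already pass `file_prefix` get it back unchanged, which is what keeps the change backward compatible.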

We encourage users to explore these new features and provide feedback on their experience. As always, we remain committed to continuous improvement and welcome suggestions for future updates.

0.0.8c

Summary of Changes:

1. New Features & Enhancements:
   - **`plot_3d_pdp` Function:**
     - **Added `show_modebar` Parameter:** Introduced a new boolean parameter, `show_modebar`, to allow users to toggle the visibility of the mode bar in Plotly interactive plots.
     - **Custom Margins and Layout Adjustments:**
       - Added parameters for `left_margin`, `right_margin`, and `top_margin` to provide users with more control over the plot layout in Plotly.
       - Adjusted default values and added options for better customization of the Plotly color bar (`cbar_x`, `cbar_thickness`) and title positioning (`title_x`, `title_y`).
     - **Plotly Configuration:**
       - Enhanced the configuration options to allow users to enable or disable zoom functionality (`enable_zoom`) in the interactive Plotly plots.
       - Updated the code to reflect these new parameters, allowing for greater flexibility in the appearance and interaction with the Plotly plots.
     - **Error Handling:**
       - Added input validation for `html_file_path` and `html_file_name` to ensure these are provided when necessary based on the selected `plot_type`.

   - **`plot_2d_pdp` Function:**
     - **Introduced `file_prefix` Parameter:**
       - Added a new `file_prefix` parameter to allow users to specify a prefix for filenames when saving grid plots. This change streamlines the naming process for saved plots and improves file organization.
     - **Enhanced Plot Type Flexibility:**
       - The `plot_type` parameter now includes an option to generate both grid and individual plots (`both`). This feature allows users to create a combination of both layout styles in one function call.
       - Updated input validation and logic to handle this new option effectively.
     - **Added `save_plots` Parameter:**
       - Introduced a new parameter, `save_plots`, to control the saving of plots. Users can specify whether to save all plots, only individual plots, only grid plots, or none.
     - **Input Validation:**
       - Included the `save_plots` parameter in the validation process to ensure paths are provided when needed for saving the plots.

2. Documentation Updates:
   - **Docstrings:**
     - Updated docstrings for both functions to reflect the new parameters and enhancements, providing clearer and more comprehensive guidance for users.
     - Detailed the use of new parameters such as `show_modebar`, `file_prefix`, `save_plots`, and others, ensuring that the function documentation is up-to-date with the latest changes.

3. Refactoring & Code Cleanup:
   - **Code Structure:**
     - Improved the code structure to maintain clarity and readability, particularly around the new functionality.
     - Consolidated the layout configuration settings for the Plotly plots into a more flexible and user-friendly format, making it easier for users to customize their plots.
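
As a rough illustration of what consolidated Plotly settings might look like, the sketch below groups the margin, title, mode-bar, and zoom options into two plain dictionaries. The helper name `build_plotly_settings`, the default values, and the mapping of `show_modebar` and `enable_zoom` onto Plotly's `displayModeBar` and `scrollZoom` config keys are assumptions, not the toolkit's actual code.

```python
def build_plotly_settings(
    show_modebar=True,
    enable_zoom=True,
    left_margin=20,
    right_margin=65,
    top_margin=100,
    title_x=0.5,
    title_y=0.95,
):
    """Group layout and config options into two dicts (defaults are illustrative)."""
    layout = {
        "margin": {"l": left_margin, "r": right_margin, "t": top_margin},
        "title": {"x": title_x, "y": title_y},
    }
    config = {
        "displayModeBar": show_modebar,  # Plotly config key for the mode bar
        "scrollZoom": enable_zoom,       # Plotly config key for scroll-wheel zoom
    }
    return layout, config


layout, config = build_plotly_settings(show_modebar=False)
print(layout)
print(config)
```

With a Plotly figure `fig`, dictionaries of this shape could be applied through `fig.update_layout(**layout)` and `fig.show(config=config)`.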

---

This version enhances the usability of the `plot_3d_pdp` and `plot_2d_pdp` functions, introduces new features for greater flexibility in plot customization, and ensures that the functions are well-documented and easy to use. The updates are backward-compatible and aim to provide a more seamless user experience in generating and saving both 3D and 2D partial dependence plots.
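
The save-related validation described for `plot_2d_pdp` might look roughly like the sketch below. The helper name, the spellings of the `save_plots` options (`"all"`, `"individual"`, `"grid"`, `None`), and the `image_path_png` / `image_path_svg` argument names are assumptions; only the `plot_type` values `grid`, `individual`, and `both` come from the notes above.

```python
VALID_PLOT_TYPES = {"grid", "individual", "both"}
VALID_SAVE_OPTIONS = {None, "all", "individual", "grid"}  # assumed spellings


def validate_pdp_options(
    plot_type, save_plots=None, image_path_png=None, image_path_svg=None
):
    """Raise ValueError for inconsistent options (hypothetical helper)."""
    if plot_type not in VALID_PLOT_TYPES:
        raise ValueError(
            f"plot_type must be one of {sorted(VALID_PLOT_TYPES)}, got {plot_type!r}."
        )
    if save_plots not in VALID_SAVE_OPTIONS:
        raise ValueError(f"Unsupported save_plots value: {save_plots!r}.")
    # Saving needs somewhere to write to.
    if save_plots is not None and not (image_path_png or image_path_svg):
        raise ValueError("A save path is required when save_plots is set.")


# A consistent combination passes silently.
validate_pdp_options("both", save_plots="all", image_path_png="plots/")

try:
    validate_pdp_options("grid", save_plots="grid")  # no path given
except ValueError as err:
    message = str(err)
    print(message)
```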

0.0.8b

We are excited to announce the release of version 0.0.8b, which introduces significant enhancements and new features to improve the usability and functionality of our toolkit.

**New Features:**

1. **Optional `file_prefix` in `stacked_crosstab_plot` Function**
   - The `stacked_crosstab_plot` function has been updated to make the `file_prefix` argument optional. If the user does not provide a `file_prefix`, the function will now automatically generate a default prefix based on the `col` and `func_col` parameters. This change streamlines the process of generating plots by reducing the number of required arguments.
   - **Key Improvement:**
     - Users can now omit the `file_prefix` argument, and the function will still produce appropriately named plot files, enhancing ease of use.
     - Backward compatibility is maintained, allowing users who prefer to specify a custom `file_prefix` to continue doing so without any issues.

2. **Introduction of 3D and 2D Partial Dependence Plot Functions**
   - Two new functions, `plot_3d_pdp` and `plot_2d_pdp`, have been added to the toolkit, expanding the visualization capabilities for machine learning models.
   - **`plot_3d_pdp`:** Generates 3D partial dependence plots for two features, supporting both static visualizations (using Matplotlib) and interactive plots (using Plotly). The function offers extensive customization options, including labels, color maps, and saving formats.
   - **`plot_2d_pdp`:** Creates 2D partial dependence plots for specified features with flexible layout options (grid or individual plots) and customization of figure size, font size, and saving formats.
   - **Key Features:**
     - **Compatibility:** Both functions are compatible with various versions of scikit-learn, ensuring broad usability.
     - **Customization:** Extensive options for customizing visual elements, including figure size, font size, and color maps.
     - **Interactive 3D Plots:** The `plot_3d_pdp` function supports interactive visualizations, providing an enhanced user experience for exploring model predictions in 3D space.

**Impact:**

- These updates improve the user experience by reducing the complexity of function calls and introducing powerful new tools for model interpretation.
- The optional `file_prefix` enhancement simplifies plot generation while maintaining the flexibility to define custom filenames.
- The new partial dependence plot functions offer robust visualization options, making it easier to analyze and interpret the influence of specific features in machine learning models.

We encourage users to explore these new features and provide feedback on their experience. As always, we remain committed to continuous improvement and welcome suggestions for future updates.

0.0.8a

0.0.7

Add `flex_corr_matrix` function for customizable correlation matrix visualization

This release introduces a new function, `flex_corr_matrix`, which allows users to generate both full and upper triangular correlation heatmaps with a high degree of customization. The function includes options to annotate the heatmap, save the plots, and pass additional parameters to `seaborn.heatmap()`.

Summary of Changes:
- **New Function**: `flex_corr_matrix`
- **Functionality**:
  - Generates a correlation heatmap for a given DataFrame.
  - Supports both full and upper triangular correlation matrices based on the `triangular` parameter.
  - Allows users to customize various aspects of the plot, including colormap, figure size, axis label rotation, and more.
  - Accepts additional keyword arguments via `**kwargs` to pass directly to `seaborn.heatmap()`.
  - Includes validation to ensure the `triangular`, `annot`, and `save_plots` parameters are boolean values.
  - Raises an exception if `save_plots=True` but neither `image_path_png` nor `image_path_svg` is specified.
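
One common way to draw only the upper triangle of a correlation matrix is to hide the remaining cells with a boolean mask. The sketch below builds such a mask in plain Python; the helper name `upper_triangle_mask` and the choice to keep the diagonal visible are assumptions, and the notes do not say whether `flex_corr_matrix` works this way internally.

```python
def upper_triangle_mask(n):
    """Return an n x n boolean mask: True marks cells to hide (below the diagonal)."""
    return [[col < row for col in range(n)] for row in range(n)]


mask = upper_triangle_mask(3)
for mask_row in mask:
    print(mask_row)
# [False, False, False]
# [True, False, False]
# [True, True, False]
```

`seaborn.heatmap()` accepts a `mask` argument and leaves cells blank where the mask is `True`; with NumPy, the same mask can be written as `np.tril(np.ones((n, n), dtype=bool), k=-1)`.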

Usage:

```python
# Full correlation matrix example
flex_corr_matrix(df=my_dataframe, triangular=False, cmap="coolwarm", annot=True)

# Upper triangular correlation matrix example
flex_corr_matrix(df=my_dataframe, triangular=True, cmap="coolwarm", annot=True)
```


Contingency table df to object type

Convert all columns in the DataFrame to strings (object dtype) to prevent issues with numerical columns.

```python
df = df.astype(str).fillna("")
```

0.0.6

Add validation for `plot_type` parameter in `kde_distributions` function

This release adds a validation step for the `plot_type` parameter in the `kde_distributions` function. The allowed values for `plot_type` are `"hist"`, `"kde"`, and `"both"`. If an invalid value is provided, the function will now raise a `ValueError` with a clear message indicating the accepted values. This change improves the robustness of the function and helps prevent potential errors due to incorrect parameter values.


```python
# Validate plot_type parameter
valid_plot_types = ["hist", "kde", "both"]
if plot_type.lower() not in valid_plot_types:
    raise ValueError(
        f"Invalid plot_type value. Expected one of {valid_plot_types}, "
        f"got '{plot_type}' instead."
    )
```
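
To see the check in action outside of `kde_distributions`, it can be wrapped in a small standalone function. The wrapper name `validate_plot_type` and its lowercased return value are additions made for this example only.

```python
def validate_plot_type(plot_type):
    """Hypothetical wrapper around the check shown in the release note."""
    valid_plot_types = ["hist", "kde", "both"]
    if plot_type.lower() not in valid_plot_types:
        raise ValueError(
            f"Invalid plot_type value. Expected one of {valid_plot_types}, "
            f"got '{plot_type}' instead."
        )
    return plot_type.lower()


normalized = validate_plot_type("KDE")  # mixed case is accepted
try:
    validate_plot_type("scatter")
except ValueError as err:
    message = str(err)

print(normalized)
print(message)
```

Because the comparison uses `plot_type.lower()`, values such as `"KDE"` or `"Both"` pass, while anything outside the three allowed names raises immediately.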

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.