Hypex

Latest version: v0.1.1.post1


0.1.1

HypEx Release 0.1.1 Summary

We're excited to announce the release of HypEx 0.1.1, which includes a range of updates aimed at improving functionality, enhancing usability, and fixing known issues. Here's what's new:

New Features and Enhancements

- **Added Support for Python 3.11 and 3.12:** HypEx has been tested and adjusted to work with Python 3.11 and 3.12, keeping it compatible with the latest Python versions.

- **Enhancements to `group_col` Handling:** Improved the flexibility and accuracy of `group_col` parameter handling within HypEx, allowing for more robust operation in sorting, null-value handling, and group concatenation.

- **Introduction of `fill_gaps` Parameter:** A new feature in the Matcher class that automatically fills NaN values in categorical columns used for grouping, streamlining data preparation.

- **New `max_categories` Parameter:** This update introduces a limit to the number of categories a column can have before being excluded from conversion into dummy variables, preventing memory issues with high-cardinality columns.

- **Performance Optimization in `abn_test`:** Addressed performance issues and corrected a bug in the formula used to determine hypothesis outcomes, improving both execution speed and accuracy.

- **Documentation Update - Code of Conduct:** Added a Code of Conduct to our repository to outline expectations for behavior and provide a process for handling misconduct, fostering a more inclusive and respectful community.

- **Moved** `limit_distribution` to `abn_test`.
- **Removed** multitarget matching and `validate_result` from `Matcher` for mathematical reasons. Both are planned to return in a later release.
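
The `fill_gaps` and `max_categories` behaviour described above can be illustrated with a small standard-library sketch. This is not HypEx's implementation: the function names `fill_category_gaps` and `columns_to_encode`, the placeholder value `"unknown"`, and the dict-of-lists data layout are all assumptions made for illustration.

```python
import math


def _is_missing(value):
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_category_gaps(columns, group_cols, placeholder="unknown"):
    """Return a copy of `columns` with gaps in the grouping columns filled.

    Hypothetical helper: mimics the idea behind `fill_gaps`.
    """
    filled = {name: list(values) for name, values in columns.items()}
    for name in group_cols:
        filled[name] = [placeholder if _is_missing(v) else v for v in filled[name]]
    return filled


def columns_to_encode(columns, categorical_cols, max_categories):
    """Keep only categorical columns whose cardinality fits the limit.

    Hypothetical helper: mimics the idea behind `max_categories`.
    """
    return [
        name
        for name in categorical_cols
        if len(set(columns[name])) <= max_categories
    ]


data = {
    "city": ["Moscow", None, "Kazan", float("nan")],
    "user_id": ["u1", "u2", "u3", "u4"],
}
filled = fill_category_gaps(data, group_cols=["city"])
# "user_id" has 4 distinct values and is skipped; "city" has 3 and is kept.
selected = columns_to_encode(filled, ["city", "user_id"], max_categories=3)
```

Skipping high-cardinality columns keeps the number of dummy variables bounded, which is the memory concern the release note mentions.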

Bug Fixes

- **Fixed `group_col` List Handling:** Addressed an issue where `group_col` as a list was not functioning correctly, ensuring proper operation across various use cases.

- **Speed and Hypothesis Selection in `limit_distribution`:** Optimized the function to reduce execution time and fixed a bug in hypothesis selection, ensuring reliable outcomes.

Documentation and Community

- **Enhanced Documentation:** Updated documentation to reflect new features, parameter introductions, and usage examples, making it easier for users to get started and utilize HypEx effectively.

- **Community Engagement:** Encouraged community feedback and contributions by clarifying contribution guidelines and promoting an open, collaborative environment.

This release represents a significant effort from the HypEx team to address user feedback, improve the library's functionality, and ensure it meets the community's needs. We thank our contributors for their invaluable input and look forward to continuing to develop HypEx together.

0.1.0

HypEx Release Summary

This release of HypEx introduces a range of new features, significant enhancements, and critical bug fixes aimed at improving the usability, functionality, and reliability of the framework. Below is an overview of the major changes:

New Features and Enhancements

1. **Feature Selection Integration**: Integrated feature importance algorithms based on CatBoost and LightGBM, which directly affect the matching tasks within HypEx.

2. **AATest Class Enhancements**: Refactored critical methods into the AATest class, simplifying user interaction and adding support for direct DataFrame inputs.

3. **TQDM Import Compatibility**: Addressed compatibility issues with older versions of tqdm, ensuring smoother operations across different environments.

4. **Validate Group Col Functionality**: Expanded the validate_group_col function to accept both strings and lists, enabling more complex data validation scenarios.

5. **Enhancements to Group Matching**: Introduced a mechanism to bypass categories preventing successful Cholesky decomposition, thereby enhancing the stability of the matching process.

6. **Automated Outlier Handling**: Automated detection and management of extreme outliers, with options for customizing how they are handled, improving the reliability of data analysis.

7. **Delta_t Attribute for Bias Quantification**: New attribute to quantify the bias contribution to ATT, offering insights into the impact of bias on the analysis results.

8. **Imbalanced Sample Size Calculation**: Added a new function for calculating necessary sample sizes for control and test groups in studies with imbalanced group sizes, supporting both binary and continuous outcomes.
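
As a rough illustration of the imbalanced sample size calculation in item 8, the sketch below uses the textbook normal approximation for a two-sided test. It is not taken from HypEx: the function names, the `ratio` parameter (defined here as test size divided by control size), and the default `alpha` and `power` values are assumptions.

```python
import math
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the critical values for the significance level and the power."""
    norm = NormalDist()
    return norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)


def sample_sizes_continuous(effect, std, ratio, alpha=0.05, power=0.8):
    """Control and test sizes for a continuous outcome (ratio = n_test / n_control)."""
    n_control = math.ceil((1 + 1 / ratio) * (_z_sum(alpha, power) * std / effect) ** 2)
    return n_control, math.ceil(ratio * n_control)


def sample_sizes_binary(p_control, p_test, ratio, alpha=0.05, power=0.8):
    """Control and test sizes for a binary outcome (ratio = n_test / n_control)."""
    variance = p_control * (1 - p_control) + p_test * (1 - p_test) / ratio
    n_control = math.ceil(
        _z_sum(alpha, power) ** 2 * variance / (p_test - p_control) ** 2
    )
    return n_control, math.ceil(ratio * n_control)


balanced = sample_sizes_continuous(effect=0.5, std=1.0, ratio=1.0)
imbalanced = sample_sizes_continuous(effect=0.5, std=1.0, ratio=0.5)
binary = sample_sizes_binary(p_control=0.10, p_test=0.12, ratio=1.0)
```

With the same effect size, the imbalanced design needs more observations in total than the balanced one, which is why the group ratio has to enter the calculation explicitly.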

Bug Fixes

1. **Fixed Cholesky Decomposition Issue**: Adjusted data processing to ensure Cholesky decomposition can always be performed, addressing failures due to non-positive definite matrices.

2. **Resolved Dataset Creation Bug**: Corrected an issue affecting the number of users in dataset creation, ensuring accuracy and reliability.

Documentation and Warnings

- Added warnings and guidelines regarding the alpha version functionalities like multitarget matching, feature selection in matching, and validation in matching.
- Enhanced documentation to include warnings about feature selection, providing a more comprehensive understanding of potential pitfalls and considerations.

Methodological Innovations

- Developed a method based on the theory of limit distributions, aimed at maintaining strict adherence to the predefined probabilities of Type I and Type II errors, thus improving the decision-making process in multiple hypothesis testing.

This release represents a significant step forward for the HypEx project, delivering robust solutions to complex data analysis challenges while maintaining high standards of accuracy and reliability.

0.0.4

This release marks a significant update to HypEx, introducing new features, enhancements to existing functionalities, and various optimizations and bug fixes. Below is a detailed breakdown of the updates and improvements included in this release.

New Features
- **MDE Calculation Function**: Added a function for calculating the Minimum Detectable Effect (MDE).
- **Sample Size Calculation with MDE**: Introduced a function to calculate the sample size required for a given MDE.
- **Test Power Calculation Function**: Implemented a function for calculating the power of the test.
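
The three quantities above are linked by one relationship, which a minimal sketch can make concrete. The code assumes two equal-sized groups, a common standard deviation, and a two-sided z-test. The function names and signatures are hypothetical and do not reflect the HypEx API.

```python
import math
from statistics import NormalDist

_NORM = NormalDist()


def minimum_detectable_effect(n_per_group, std, alpha=0.05, power=0.8):
    """Smallest effect detectable with the given group size."""
    z = _NORM.inv_cdf(1 - alpha / 2) + _NORM.inv_cdf(power)
    return z * std * math.sqrt(2 / n_per_group)


def sample_size_for_mde(mde, std, alpha=0.05, power=0.8):
    """Group size needed to detect an effect of size `mde`."""
    z = _NORM.inv_cdf(1 - alpha / 2) + _NORM.inv_cdf(power)
    return math.ceil(2 * (z * std / mde) ** 2)


def power_of_test(effect, n_per_group, std, alpha=0.05):
    """Approximate power; the negligible opposite-tail term is ignored."""
    standard_error = std * math.sqrt(2 / n_per_group)
    return _NORM.cdf(abs(effect) / standard_error - _NORM.inv_cdf(1 - alpha / 2))


n = sample_size_for_mde(mde=0.5, std=1.0)
mde = minimum_detectable_effect(n_per_group=n, std=1.0)
achieved_power = power_of_test(effect=0.5, n_per_group=n, std=1.0)
```

Because the sample size is rounded up, the resulting MDE is slightly below the requested effect and the achieved power is slightly above the target.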

Enhancements to AA Tests
- **Simplified Pipeline**: The classical pipeline can now be invoked with a single `process` function.
- **Stratification Optimization**: The process now allows for optimization in stratification.
- **Enhanced Test Interpretation**:
  - Built a table with test statistics for better analysis.
  - Created a summary table for AA test outcomes.
  - Developed visualizations (graphs) for AA test results.
- **Improved Split Analysis**:
  - Added graphs for distribution analysis to ensure group homogeneity.
  - Conducted statistical tests for homogeneity.
  - Calculated and presented group statistics.
  - Provided a brief summary of the test.

General Improvements
- **Bug Fixes**: Addressed and resolved known issues.
- **Code Refactoring**: Improved code structure for better maintainability and readability.
- **Process Optimization**: Enhanced overall process efficiency.

Documentation and Community Engagement
- **Updated README**: A new and improved README for better project understanding.
- **Issue Templates**: Introduced templates for streamlined issue reporting.
- **Pull Request Templates**: Added templates to facilitate consistent and structured pull requests.
- **Contributing Guidelines**: Updated `CONTRIBUTING.md` with new guidelines and templates.
- **New Tutorials**: Added tutorials to guide users through the new features and enhancements.

0.0.3.1

Fixed an issue in the outlier filter: it returned a set, which made subsequent matching impossible. It now deletes the outlier rows immediately and returns a DataFrame without them.

0.0.3

First release

- Faiss KNN Matching: Utilizes Faiss for efficient and precise nearest neighbor searches, aligning with RCM for optimal pair matching.
- Data Filters: Built-in outlier and Spearman filters ensure data quality for matching.
- Result Validation: Offers multiple validation methods, including random treatment, feature, and subset validations.
- Data Tests: Incorporates SMD, KS, PSI, and Repeats tests to affirm the robustness of effect estimations.
- Automated Feature Selection: Employs LGBM feature selection to pinpoint the most impactful features for causal analysis.
- AB Testing Suite: Features a suite of AB testing tools for comprehensive hypothesis evaluation.
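
Two of the data tests named above, SMD and PSI, follow standard definitions that fit in a few lines. The sketch below is a generic illustration of those definitions, not the HypEx implementation. The function names, the pre-binned share inputs for PSI, and the `eps` guard against empty bins are assumptions.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_std = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_std


def population_stability_index(expected_shares, actual_shares, eps=1e-6):
    """PSI over matching bins; each input is a list of bin proportions."""
    psi = 0.0
    for expected, actual in zip(expected_shares, actual_shares):
        expected, actual = max(expected, eps), max(actual, eps)
        psi += (actual - expected) * math.log(actual / expected)
    return psi


smd = standardized_mean_difference([2, 4, 6], [1, 3, 5])
psi_same = population_stability_index([0.5, 0.5], [0.5, 0.5])
psi_shifted = population_stability_index([0.5, 0.5], [0.6, 0.4])
```

Values close to zero for both metrics indicate that the compared groups have similar distributions for the feature in question.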
