Megaprofiler

Latest version: v1.0.0

Safety actively analyzes 722491 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

0.2.0

When working with large datasets, it’s often necessary to understand data types, distributions, and potential issues (e.g., missing values, outliers) before analysis. While libraries like pandas-profiling exist, there is still room for an extensible, easy-to-use, and highly customizable profiler that integrates data validation.


**Key Features:**

- Automatic Data Summaries: Provide insights like distribution, unique values, missing values, and more for each column.

- Anomaly Detection: Automatically flag columns or rows with unusual distributions, outliers, or inconsistent data.

- Data Validation: Set validation rules (e.g., no missing values in specific columns, data type constraints) and get alerts if the data violates these rules.

- Custom Reports: Generate visual reports (e.g., HTML, PDF) with configurable thresholds for what counts as an anomaly.

- Data Drift Detection: Track changes in data distributions over time to identify shifts in data quality or content.


**Benefits:**

megaprofiler could be invaluable to data scientists and engineers dealing with exploratory data analysis, data quality checks, and ETL pipelines, reducing manual data investigation and improving data quality oversight.

Links

Releases

Has known vulnerabilities

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.