## 🚀 Feature Enhancements
### Advanced Data Partitioning with `num_partitions` 🔄
We're excited to introduce the `num_partitions` argument for our `forecast`, `cross_validation`, and `detect_anomalies` methods, offering more control over data processing and parallelization:
- **Optimized Resource Utilization in Distributed Environments:** For Spark, Ray, or Dask dataframes, `num_partitions` lets you either leverage all available parallel resources or cap the number of parallel processes, ensuring efficient resource allocation and utilization across distributed computing environments (an end-to-end sketch follows the snippet below).
```python
# Utilize num_partitions in distributed environments (Spark, Ray, or Dask dataframe)
# h is the forecast horizon (example value)
fcst_df = timegpt.forecast(df, h=12, model='timegpt-1-long-horizon', num_partitions=10)
```
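As an end-to-end sketch (illustrative, not an official recipe), assuming `timegpt` is an initialized TimeGPT client: a pandas dataframe in the usual long format (`unique_id`, `ds`, `y`) can be converted to a Dask dataframe with `dd.from_pandas` and passed to `forecast` directly. The data, horizon, frequency, and partition counts below are example values.
```python
import pandas as pd
import dask.dataframe as dd

# Illustrative long-format data: one row per (unique_id, ds) with target column y.
pdf = pd.concat([
    pd.DataFrame({
        'unique_id': series_id,
        'ds': pd.date_range('2023-01-01', periods=60, freq='D'),
        'y': range(60),
    })
    for series_id in ['series_a', 'series_b']
])

# Convert to a Dask dataframe so the client treats the input as distributed.
distributed_df = dd.from_pandas(pdf, npartitions=2)

fcst_df = timegpt.forecast(
    distributed_df,
    h=12,                             # forecast horizon (example value)
    freq='D',                         # data frequency (example value)
    model='timegpt-1-long-horizon',
    num_partitions=2,                 # parallel processes used for the API calls
)
```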
- **Efficient Handling of Large Pandas Dataframes:** For Pandas dataframes, `num_partitions` groups the series into the specified number of partitions and calls the API sequentially for each one. This is particularly useful for large dataframes that are impractical to send over the internet in a single request, enhancing performance and efficiency.
```python
# Efficiently process large Pandas dataframes in sequential partitions
# h is the forecast horizon per validation window (example value)
cv_df = timegpt.cross_validation(df, h=12, model='timegpt-1', num_partitions=5)
```
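`num_partitions` works the same way for anomaly detection. A minimal sketch, again assuming an initialized `timegpt` client and a long-format Pandas dataframe `df`; the frequency and confidence level shown are example values:
```python
# Split the series into 5 partitions and call the API once per partition
anomalies_df = timegpt.detect_anomalies(
    df,
    freq='D',          # data frequency (example value)
    level=99,          # confidence level for the anomaly intervals (example value)
    num_partitions=5,
)
```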
This new feature provides a flexible approach to handling data across different environments, ensuring optimal performance and resource management.
*See full changelog [here](https://github.com/Nixtla/nixtla/releases/v0.1.19).*