Diive

Latest version: v0.84.2

Page 3 of 15

0.78.1.1

Additions

- Added CITATIONS file

0.78.1

Changes

- Added option to set different `n_sigma` for daytime and nighttime data
in `HampelDaytimeNighttime` (`diive.pkgs.outlierdetection.hampel.HampelDaytimeNighttime`)
- Updated `flag_outliers_hampel_dtnt_test` in step-wise outlier detection
- Updated `level32_flag_outliers_hampel_dtnt_test` in flux processing chain
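
To illustrate what separate `n_sigma` values for daytime and nighttime mean in practice, here is a minimal sketch of a Hampel-style filter written with the standard library only. It is not the `diive` implementation: the function name, the parameter names (`half_window`, `n_sigma_dt`, `n_sigma_nt`) and the centered-window handling are assumptions made for this example. The constant 1.4826 is the usual factor that scales the median absolute deviation (MAD) to a standard deviation for normally distributed data.

```python
from statistics import median

MAD_TO_SIGMA = 1.4826  # scales the MAD to a standard deviation for normal data


def hampel_flags(values, is_daytime, half_window=2, n_sigma_dt=4.0, n_sigma_nt=2.0):
    """Return one boolean per record, True where the record is flagged.

    Hypothetical helper: daytime and nighttime records are filtered
    separately, each subset with its own n_sigma.
    """
    flags = [False] * len(values)
    for want_daytime, n_sigma in ((True, n_sigma_dt), (False, n_sigma_nt)):
        # Positions of the records that belong to this subset
        idx = [i for i, dt in enumerate(is_daytime) if dt == want_daytime]
        subset = [values[i] for i in idx]
        for pos, i in enumerate(idx):
            # Centered window within the subset, truncated at the edges
            window = subset[max(0, pos - half_window):pos + half_window + 1]
            med = median(window)
            mad = median(abs(v - med) for v in window)
            # Flag if the deviation from the window median is too large
            flags[i] = abs(values[i] - med) > n_sigma * MAD_TO_SIGMA * mad
    return flags


values = [10, 11, 10, 50, 11, 10, 10, 12, 9, 15, 11, 10]
is_daytime = [True] * 6 + [False] * 6
strict_night = hampel_flags(values, is_daytime, n_sigma_dt=4.0, n_sigma_nt=2.0)
same_sigma = hampel_flags(values, is_daytime, n_sigma_dt=4.0, n_sigma_nt=4.0)
```

In this toy series the moderate nighttime spike (`15`) is only flagged when the nighttime threshold is stricter than the daytime one, which is the kind of situation a separate `n_sigma` is meant to handle.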

Notebooks

- Updated notebook `HampelDaytimeNighttime`
- Updated notebook `FluxProcessingChain`

Tests

- Updated unittest `test_hampel_filter_daytime_nighttime`
- 35/35 unittests ran successfully

0.78.0

New features

- Added new class for outlier removal, based on the rolling z-score. It can also be used in step-wise outlier detection
and during meteoscreening from the database (`diive.pkgs.outlierdetection.zscore.zScoreRolling`,
`diive.pkgs.outlierdetection.stepwiseoutlierdetection.StepwiseOutlierDetection`,
`diive.pkgs.qaqc.meteoscreening.StepwiseMeteoScreeningDb`)
- Added Hampel filter for outlier removal (`diive.pkgs.outlierdetection.hampel.Hampel`)
- Added Hampel filter (separate daytime, nighttime) for outlier
removal (`diive.pkgs.outlierdetection.hampel.HampelDaytimeNighttime`)
- Added function to plot daytime and nighttime outliers during outlier
tests (`diive.core.plotting.outlier_dtnt.outlier_daytime_nighttime`)
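
The idea behind a rolling z-score can be sketched as follows. This is an illustration under stated assumptions, not the code of `zScoreRolling`: the function name, the parameters `winsize` and `thres_zscore`, the centered window and the use of the sample standard deviation are all choices made for this example.

```python
from statistics import mean, stdev


def rolling_zscore_flags(values, winsize=7, thres_zscore=2.0):
    """Flag records whose z-score within a centered rolling window is large.

    Hypothetical helper: mean and standard deviation are calculated from
    the window around each record, truncated at the series edges.
    """
    half = winsize // 2
    flags = []
    for i, v in enumerate(values):
        window = values[max(0, i - half):i + half + 1]
        if len(window) < 2:
            flags.append(False)  # not enough data for a standard deviation
            continue
        sd = stdev(window)
        z = 0.0 if sd == 0 else abs(v - mean(window)) / sd
        flags.append(z > thres_zscore)
    return flags


series = [5, 6, 5, 6, 5, 30, 6, 5, 6, 5, 6]
flags = rolling_zscore_flags(series, winsize=7, thres_zscore=2.0)
```

Note that the window includes the tested record itself, so the largest z-score a single spike can reach is limited by the window size (about 2.27 for seven records with the sample standard deviation). A threshold that works for a global z-score can therefore be too high for a short rolling window.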

Changes

- Flux processing chain:
    - Several changes to the flux processing chain to make sure it can also work with data files not directly output by
      EddyPro. The class `FluxProcessingChain` can now handle files that have a different format than the two EddyPro
      output files `EDDYPRO-FLUXNET-CSV-30MIN` and `EDDYPRO-FULL-OUTPUT-CSV-30MIN`. See the following notes.
    - Removed option to process EddyPro `_full_output_` files, since it is an older format and its variables do not
      follow FLUXNET conventions.
    - Removed keyword `filetype` in class `FluxProcessingChain`. It is now assumed that the variable names follow the
      FLUXNET convention. Variables used in FLUXNET are
      listed [here](https://fluxnet.org/data/fluxnet2015-dataset/fullset-data-product/)
      (`diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain`)
- When detecting the base variable from which a flux variable was calculated, the variables defined for
filetype `EDDYPRO-FLUXNET-CSV-30MIN` are now assumed by default. (`diive.pkgs.flux.common.detect_basevar`)
- Renamed function that detects the base variable that was used to calculate the respective
flux (`diive.pkgs.flux.common.detect_fluxbasevar`)
- Renamed `gas` in functions related to completeness tests to `fluxbasevar` to better reflect that the completeness
test does not necessarily require a gas (e.g. `T_SONIC` is used to calculate the completeness for sensible heat
flux) (`flag_fluxbasevar_completeness_eddypro_test`)
- Removing the radiation offset now uses `0.001` (W m-2) instead of `50` as the threshold value to flag nighttime values
for the correction (`diive.pkgs.corrections.offsetcorrection.remove_radiation_zero_offset`)
- The database tag for meteo data screened with `diive` is
now `meteoscreening_diive` (`diive.pkgs.qaqc.meteoscreening.StepwiseMeteoScreeningDb.resample`)
- During noise generation, function now uses the absolute values of the min/max of a series to calculate minimum noise
and maximum noise (`diive.pkgs.createvar.noise.add_impulse_noise`)

Notebooks

- Added new notebook for outlier detection using class `zScore` (`notebooks/OutlierDetection/zScore.ipynb`)
- Added new notebook for outlier detection using
class `zScoreDaytimeNighttime` (`notebooks/OutlierDetection/zScoreDaytimeNighttime.ipynb`)
- Added new notebook for outlier removal using trimming (`notebooks/OutlierDetection/TrimLow.ipynb`)
- Updated notebook (`notebooks/MeteoScreening/StepwiseMeteoScreeningFromDatabase_v7.0.ipynb`)
- When uploading screened meteo data to the database using the notebook `StepwiseMeteoScreeningFromDatabase`, variables
with the same name, measurement and data version as the screened variable(s) are now deleted from the database before
the new data are uploaded. Implemented in the Python package `dbc-influxdb` to avoid duplicates in the database. Such
duplicates can occur when one of the tags of an otherwise identical variable changed, e.g., when one of the tags of
the originally uploaded data was wrong and needed correction. The database `InfluxDB` stores a new time series
alongside the previous time series when one of the tags is different in an otherwise identical time series.

Tests

- Added test case for `Hampel` filter (`tests.test_outlierdetection.TestOutlierDetection.test_hampel_filter`)
- Added test case for `HampelDaytimeNighttime`
filter (`tests.test_outlierdetection.TestOutlierDetection.test_hampel_filter_daytime_nighttime`)
- Added test case for `zScore` (`tests.test_outlierdetection.TestOutlierDetection.test_zscore`)
- Added test case for `TrimLow` (`tests.test_outlierdetection.TestOutlierDetection.test_trim_low_nt`)
- Added test case
for `zScoreDaytimeNighttime` (`tests.test_outlierdetection.TestOutlierDetection.test_zscore_daytime_nighttime`)
- 33/33 unittests ran successfully

Environment

- Added package [sktime](https://www.sktime.net/en/stable/index.html), a unified framework for machine learning with
time series.

0.77.0

Additions

- Plotting cumulatives with `CumulativeYear` now also shows the cumulative for the reference, i.e. for the mean over the
reference years (`diive.core.plotting.cumulative.CumulativeYear`)
- Plotting `DielCycle` now accepts `ylim` parameter (`diive.core.plotting.dielcycle.DielCycle`)
- Added long-term dataset for local testing purposes (internal
only) (`diive.configs.exampledata.load_exampledata_parquet_long`)
- Added several classes in preparation for long-term gap-filling for a future update

Changes

- Several updates and changes to the base class for regressor decision
trees (`diive.core.ml.common.MlRegressorGapFillingBase`):
    - The data are now split into a training set and a test set at the very start of regressor setup. The test set is
      used to evaluate models on unseen data. The default split is 80% training and 20% test data.
    - Plotting (scores, importances etc.) is now generally separated from the methods where they are calculated.
    - The same `random_state` is now used for all processing steps
    - Refactored code
    - Beautified console output
- When correcting for relative humidity values above 100%, the maximum of the corrected time series is now set to 100,
after the (daily) offset was removed (`diive.pkgs.corrections.offsetcorrection.remove_relativehumidity_offset`)
- During feature reduction in machine learning regressors, features with permutation importance < 0 are now always
removed (`diive.core.ml.common.MlRegressorGapFillingBase._remove_rejected_features`)
- Changed default parameters for quick random forest gap-filling (`diive.pkgs.gapfilling.randomforest_ts.QuickFillRFTS`)
- Improved the clarity of the console output for several functions and methods
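
Two of the changes above, the early 80/20 split with a fixed `random_state` and the removal of features with negative permutation importance, can be sketched with the standard library. This is a simplified illustration, not the code of `MlRegressorGapFillingBase`: the function names, the dictionary of importances and the feature names in it are hypothetical.

```python
import random


def split_train_test(records, test_size=0.2, random_state=42):
    """Split records into a training and a test set before any model setup.

    Hypothetical helper: the same random_state always yields the same split.
    """
    rng = random.Random(random_state)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    n_test = round(len(records) * test_size)
    test_idx = sorted(indices[:n_test])
    train_idx = sorted(indices[n_test:])
    return [records[i] for i in train_idx], [records[i] for i in test_idx]


def drop_negative_importance(importances):
    """Keep only features whose permutation importance is not below zero."""
    return [name for name, imp in importances.items() if imp >= 0]


records = list(range(10))
train, test = split_train_test(records, test_size=0.2, random_state=42)
train_again, test_again = split_train_test(records, test_size=0.2, random_state=42)

# Made-up importances for illustration only
kept = drop_negative_importance({"TA": 0.40, "SW_IN": 0.30, "RANDOM_VAR": -0.01})
```

Splitting before any other step means the test records never influence feature selection or model fitting, so scores calculated on them reflect performance on unseen data.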

Environment

- Added package [dtreeviz](https://github.com/parrt/dtreeviz?tab=readme-ov-file) to visualize decision trees

Notebooks

- Updated notebook (`notebooks/GapFilling/RandomForestGapFilling.ipynb`)
- Updated notebook (`notebooks/GapFilling/LinearInterpolation.ipynb`)
- Updated notebook (`notebooks/GapFilling/XGBoostGapFillingExtensive.ipynb`)
- Updated notebook (`notebooks/GapFilling/XGBoostGapFillingMinimal.ipynb`)
- Updated notebook (`notebooks/GapFilling/RandomForestParamOptimization.ipynb`)
- Updated notebook (`notebooks/GapFilling/QuickRandomForestGapFilling.ipynb`)

Tests

- Updated and fixed test case (`tests.test_outlierdetection.TestOutlierDetection.test_zscore_increments`)
- Updated and fixed test case (`tests.test_gapfilling.TestGapFilling.test_gapfilling_randomforest`)

0.76.2

Additions

- Added function to calculate absolute double differences of a time series, which is the sum of absolute differences
between a data record and its preceding and next record. Used in class `zScoreIncrements` for finding (isolated)
outliers that are distant from neighboring records. (`diive.core.dfun.stats.double_diff_absolute`)
- Added small function to calculate z-score stats of a time series (`diive.core.dfun.stats.sstats_zscore`)
- Added small function to calculate stats for absolute double differences of a time
series (`diive.core.dfun.stats.sstats_doublediff_abs`)
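
The absolute double difference described above can be written out in a few lines. The sketch below is an illustration, not the code of `double_diff_absolute`: the function name is hypothetical, and returning `None` for the first and last record (which lack one neighbor) is an assumption of this example.

```python
def abs_double_diff(values):
    """Sum of absolute differences to the preceding and the next record.

    Hypothetical helper: the first and last record have only one neighbor
    and are returned as None.
    """
    out = [None] * len(values)
    for i in range(1, len(values) - 1):
        out[i] = abs(values[i] - values[i - 1]) + abs(values[i] - values[i + 1])
    return out


result = abs_double_diff([1, 2, 10, 3, 3])
# result is [None, 9, 15, 7, None]
```

The isolated spike (`10`) gets the largest value because it is far from both of its neighbors, while its neighbors are each far from only one record.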

Changes

- Changed the algorithm for outlier detection when using `zScoreIncrements`. Data points are now flagged as outliers if
the z-scores of three absolute differences (previous record, next record and the sum of both) all exceed a specified
threshold. (`diive.pkgs.outlierdetection.incremental.zScoreIncrements`)
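
The three-condition rule can be sketched as follows. This is a simplified illustration of the description above, not the code of `zScoreIncrements`: the function names and the parameter `thres_zscore` are hypothetical, and calculating each z-score over the whole series with the population standard deviation is an assumption of this example.

```python
from statistics import mean, pstdev


def zscores(series):
    """z-score of each entry, ignoring (and passing through) None entries."""
    valid = [v for v in series if v is not None]
    mu, sd = mean(valid), pstdev(valid)
    return [None if v is None or sd == 0 else (v - mu) / sd for v in series]


def increment_outlier_flags(values, thres_zscore=1.5):
    """Flag a record only if all three increment z-scores exceed the threshold."""
    n = len(values)
    # Absolute difference to the previous and to the next record
    d_prev = [None] + [abs(values[i] - values[i - 1]) for i in range(1, n)]
    d_next = [abs(values[i] - values[i + 1]) for i in range(n - 1)] + [None]
    # Sum of both absolute differences
    d_sum = [None if p is None or q is None else p + q
             for p, q in zip(d_prev, d_next)]
    triples = zip(zscores(d_prev), zscores(d_next), zscores(d_sum))
    return [all(z is not None and z > thres_zscore for z in t) for t in triples]


series = [5, 6, 5, 6, 5, 30, 6, 5, 6, 5, 6]
flags = increment_outlier_flags(series, thres_zscore=1.5)
```

Requiring all three conditions keeps the neighbors of a spike from being flagged: the record before the spike has a large difference to its next record but not to its previous one, so it passes.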

Notebooks

- Added new notebook for outlier detection using
class `LocalOutlierFactorAllData` (`notebooks/OutlierDetection/LocalOutlierFactorAllData.ipynb`)

Tests

- Added new test case
for `LocalOutlierFactorAllData` (`tests.test_outlierdetection.TestOutlierDetection.test_lof_alldata`)

0.76.1

Additions

- It is now possible to set a fixed random seed when creating impulse
noise (`diive.pkgs.createvar.noise.add_impulse_noise`)

Changes

- In class `zScoreIncrements`, outliers are now detected by calculating the sum of the absolute differences between a
data point and its respective preceding and next data point. Previously, only the signed difference to the preceding
data point was considered. The sum of absolute differences is then used to calculate the z-score, which in turn is
used to flag outliers. (`diive.pkgs.outlierdetection.incremental.zScoreIncrements`)

Notebooks

- Added new notebook for outlier detection using
class `zScoreIncrements` (`notebooks/OutlierDetection/zScoreIncremental.ipynb`)
- Added new notebook for outlier detection using
class `LocalSD` (`notebooks/OutlierDetection/LocalSD.ipynb`)

Tests

- Added new test case for `zScoreIncrements` (`tests.test_outlierdetection.TestOutlierDetection.test_zscore_increments`)
- Added new test case for `LocalSD` (`tests.test_outlierdetection.TestOutlierDetection.test_localsd`)
