This release focuses on removing Nearest-Neighbor Smoothing (NNS) from the clustering phase and accounting for all of the downstream effects. Some of the highlights are:
- Removal of NNS! Finally! This was an old artifact from the Dalitz clustering method, and it could have dramatic and often unpredictable effects on the data, including the Particle ID.
- Removal of NNS revealed previously ignored artifacts in the z-coordinate of the point cloud data, caused by improper handling of the conversion from GET TimeBuckets to floating point values. Previously this was a simple cast to float, which resulted in "clumps" of data in z. Now the value is cast to float and a random smearing on the interval [0.0, 1.0), in units of time buckets, is added. This accurately represents the sampling behavior in the data.
- Removal of NNS, as well as user reports, revealed a bug where point cloud data could sometimes become NaN or Inf if points outside the legal detector volume were sent through the electric field correction. Illegal points are now pruned at the clustering phase.
- Removal of NNS necessitated a change to the clustering algorithm. We no longer cluster on charge; without NNS, charge is too diffuse to cluster on. This also simplified the scaling, as all remaining scales are in the same base units (mm). The only scaling applied now is to scale the z-axis to match the x- and y-axes, which avoids overemphasizing z separation in the data. These changes affect the recommended default value for epsilon in HDBSCAN: the recommended value for `cluster_selection_epsilon` is now 10.0.
- The smoothing factor in the estimation phase also needs to be increased. The new recommended value is 100.0 (verified with scipy).
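
The z-smearing and illegal-point pruning described in the list above can be sketched roughly as follows. This is a minimal illustration and not the code shipped in this release: the function names, the assumption of a cylindrical volume, and the dimensions used in the usage lines are all hypothetical.

```python
import math
import random


def smear_time_buckets(time_buckets, rng):
    """Convert integer time buckets to floats with uniform smearing.

    Each integer bucket tb becomes tb + u, with u drawn uniformly from
    [0.0, 1.0), so the values no longer pile up on integer boundaries.
    """
    return [float(tb) + rng.random() for tb in time_buckets]


def prune_illegal_points(points, max_radius_mm, max_z_mm):
    """Keep only finite (x, y, z) points inside a cylindrical volume.

    The cylinder (radius max_radius_mm, length max_z_mm) is a hypothetical
    stand-in for the legal detector volume.
    """
    kept = []
    for x, y, z in points:
        if not (math.isfinite(x) and math.isfinite(y) and math.isfinite(z)):
            continue  # drop NaN/Inf before any later correction sees them
        if math.hypot(x, y) > max_radius_mm:
            continue  # outside the cylinder radially
        if z < 0.0 or z > max_z_mm:
            continue  # outside the cylinder along the z-axis
        kept.append((x, y, z))
    return kept


# Usage with made-up values (dimensions are illustrative only)
rng = random.Random(0)
smeared = smear_time_buckets([10, 10, 11], rng)
legal = prune_illegal_points(
    [(0.0, 0.0, 500.0), (300.0, 0.0, 500.0), (0.0, 0.0, -1.0)],
    max_radius_mm=292.0,
    max_z_mm=1000.0,
)
```

Pruning before any field correction is what keeps out-of-volume points from turning into NaN or Inf further downstream.
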
The documentation and notebooks have been updated. Dependency versions have been bumped in this release, so please reinstall the dependencies from requirements.txt.
This is a big change to the analysis and can affect the data outcome. Please use caution when updating, and report any issues or unexpected behavior!