This release makes a number of changes to the underlying data. The default PWMs have been replaced with a slightly less stringent set. Most results should be relatively unchanged, but the new set handles some edge cases where the original PWMs over-penalized certain base positions because they were built from low-N samples. Other changes include:
- Default 3'SS region shortened to [-6, 4]
- By default, the human U2-type BPS PWM is now used instead of one generated on the fly. A per-run PWM can still be generated using `--generate_u2_bps_pwm`
- z-scores in the output have been adjusted to correspond to the entire dataset (previously, they were based on the training set only)
- Non-canonical introns now default to whichever PWM is closest to their terminal dinucleotides, if one is clearly closest (e.g. `AT-TC` introns use the `AT-AC` PWM). When there is no clear winner (e.g. for `AT-AG` introns, `GT-AG` and `AT-AC` are equally close in terms of edit distance), the terminal dinucleotides are ignored and the best PWM is selected based on the geometric mean of the component scores from each PWM. The old behavior can be restored using `--no_ignore_nc_dnts`
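
The PWM selection rule for non-canonical introns can be illustrated with a rough sketch. This is not the tool's actual code: the function names, the `PWM_KEYS` tuple (limited to the two PWMs named above), the use of Hamming distance as a stand-in for edit distance, and the shape of `component_scores` are all assumptions made for illustration.

```python
from math import prod

# Hypothetical set of available PWM keys; only the two named in these notes.
PWM_KEYS = ("GT-AG", "AT-AC")


def dnt_distance(a, b):
    """Count mismatched bases between two dinucleotide pairs like 'AT-AC'.

    Hamming distance is used here as a simple stand-in for edit distance.
    """
    return sum(x != y for x, y in zip(a.replace("-", ""), b.replace("-", "")))


def closest_pwm(dnts, keys=PWM_KEYS):
    """Return the single closest PWM key, or None when there is a tie."""
    dists = {k: dnt_distance(dnts, k) for k in keys}
    best = min(dists.values())
    winners = [k for k, d in dists.items() if d == best]
    return winners[0] if len(winners) == 1 else None


def geometric_mean(values):
    """Geometric mean of a non-empty sequence of positive scores."""
    return prod(values) ** (1 / len(values))


def pick_pwm(dnts, component_scores):
    """Choose a PWM by dinucleotide distance, else by geometric mean.

    component_scores maps each PWM key to its (assumed positive)
    component scores for the intron being classified.
    """
    key = closest_pwm(dnts)
    if key is not None:
        return key
    # No obvious match: ignore the dinucleotides and compare scores.
    return max(component_scores, key=lambda k: geometric_mean(component_scores[k]))
```

Under these assumptions, `closest_pwm("AT-TC")` resolves to `AT-AC` directly, while `closest_pwm("AT-AG")` returns `None` and `pick_pwm` falls back to comparing the per-PWM geometric means.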