Features:
- Added a parameter `use_cache`, which defaults to `True`. When enabled, the algorithm will skip re-evaluating solutions that have already been evaluated, retrieving the performance metrics from the cache instead.
If `use_cache` is set to `False`, the algorithm will always re-evaluate solutions, even if they have been seen before, to obtain fresh performance metrics.
- Added a parameter in `GAFeatureSelectionCV` named `warm_start_configs`, which defaults to `None`. This is a list of predefined hyperparameter configurations to seed the initial population. Each element in the list is a dictionary where the keys are the names of the hyperparameters, and the values are the corresponding hyperparameter values to be used for the individual.
**Example:**
python
warm_start_configs = [
{"min_weight_fraction_leaf": 0.02, "bootstrap": True, "max_depth": None, "n_estimators": 100},
{"min_weight_fraction_leaf": 0.4, "bootstrap": True, "max_depth": 5, "n_estimators": 200},
]
The genetic algorithm will initialize part of the population with these configurations to warm-start the optimization process. The remaining individuals in the population will be initialized randomly according to the defined hyperparameter space.
This parameter is useful when prior knowledge of good hyperparameter configurations exists, allowing the algorithm to focus on refining known good solutions while still exploring new areas of the hyperparameter space. If set to None, the entire population will be initialized randomly.
- Introduced a novelty search strategy to the GASearchCV class. This strategy rewards solutions that are more distinct from others in the population by incorporating a novelty score into the fitness evaluation. The novelty score encourages exploration and promotes diversity, reducing the risk of premature convergence to local optima.
* Novelty Score: Calculated based on the distance between an individual and its nearest neighbors in the population. Individuals with higher novelty scores are more distinct from the rest of the population.
* Fitness Evaluation: The overall fitness is now a combination of the traditional performance score and the novelty score, allowing the algorithm to balance between exploiting known good solutions and exploring new, diverse ones.
* Improved Exploration: This strategy helps explore new areas of the hyperparameter space, increasing the likelihood of discovering better solutions and avoiding local optima.
API Changes:
- Dropped support for Python 3.8