New Extension - Molecular Factories
The new `molecular_factories` extension includes two classes to assemble and derive molecules combinatorially. Documentation on the new extension can be found [here](https://biobuild.readthedocs.io/en/dev/buildamol.extensions.factories.html#factories).
The Assembler
The `Assembler` class can assemble fragments from a library into molecules. This is a slightly more generic implementation of the code presented in the automated ligand-design pipeline. The class can be used to sample molecular candidates from "candidate space" (i.e., produce random molecules by combining fragments) or create molecules explicitly from an array input (useful for optimization procedures).
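The two modes described above (random sampling and explicit construction from an array) can be illustrated with a minimal, library-independent sketch. Fragments are plain strings here and "assembly" is just string joining; the names `FRAGMENT_LIBRARY`, `sample_candidate`, and `assemble_from_indices` are hypothetical and do not reflect the actual `Assembler` API.

```python
import random

# Hypothetical fragment library: plain labels stand in for real fragment molecules.
FRAGMENT_LIBRARY = ["benzene", "pyridine", "cyclohexane", "furan"]


def assemble_from_indices(indices, library=FRAGMENT_LIBRARY):
    """Build a candidate explicitly from an array of library indices."""
    return "-".join(library[i] for i in indices)


def sample_candidate(n_fragments, rng, library=FRAGMENT_LIBRARY):
    """Draw a random candidate by picking fragments uniformly from the library."""
    indices = [rng.randrange(len(library)) for _ in range(n_fragments)]
    return indices, assemble_from_indices(indices, library)


rng = random.Random(0)  # seeded so repeated runs give the same draw
indices, candidate = sample_candidate(3, rng)
explicit = assemble_from_indices([0, 3, 1])
```

The array form is the one an optimizer would typically work with, since a vector of integers fully encodes a candidate.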
The Derivator
The `Derivator` class can create modified versions of a specified molecule, which we call _derivatives_, hence the name. There is also a new [Tutorial](https://biobuild.readthedocs.io/en/dev/examples/derivator_example.html) that exemplifies the usage of the Derivator.
Available modifications are:
- changing elements of atoms
- changing bond orders
- adding functional groups to specific positions
- globally modifying molecules by arbitrary functions
The Derivator can sample molecular candidates just like the Assembler but can also produce _all_ combinatorially possible derivatives for the desired modifications.
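The difference between sampling and exhaustive enumeration can be sketched without the library. In this toy example a "molecule" is a tuple of element symbols and the only modification is an element change at selected positions; `options`, `all_derivatives`, and `sample_derivative` are hypothetical names, not the `Derivator` API.

```python
import itertools
import random

# Toy "molecule": a tuple of element symbols, one per atom position.
parent = ("C", "C", "C")

# Hypothetical modification table: allowed elements at selected positions.
options = {
    0: ["C", "N"],       # position 0 may stay C or become N
    2: ["C", "O", "S"],  # position 2 may stay C or become O or S
}


def all_derivatives(parent, options):
    """Yield every combination of the allowed per-position choices."""
    positions = sorted(options)
    for choice in itertools.product(*(options[p] for p in positions)):
        atoms = list(parent)
        for pos, element in zip(positions, choice):
            atoms[pos] = element
        yield tuple(atoms)


def sample_derivative(parent, options, rng):
    """Draw one random derivative instead of enumerating all of them."""
    atoms = list(parent)
    for pos, choices in options.items():
        atoms[pos] = rng.choice(choices)
    return tuple(atoms)


derivatives = list(all_derivatives(parent, options))
one = sample_derivative(parent, options, random.Random(1))
```

The number of derivatives is the product of the number of choices per position (here 2 × 3 = 6), so exhaustive enumeration grows quickly as modifications are added.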
Other Notable Changes
Changes in Modifier Functions
- `amidate` was wrongly named (it was doing _amination_, after all). Now `amidate` will add an **amide** group and `aminate` (new function) will add an **amine** group!
- Residues added by `hydroxylate` are now named `OH` instead of `HOH` (the old name caused trouble with some PDB-reading software, which excluded hydroxyl groups as "water" because of the residue name)
- `carboxylate` now adheres to the standard naming scheme and calls the atoms `O`, `OXT`, and `HXT` (instead of the previous `O1`, `O2`, and `H2`)
Refactored Functional Groups
The functional groups (in `structural.groups`) have been significantly refactored. A `BaseFunctionalGroup` class now provides most of the generic features of functional group objects, such as `find_matches`, but does not implement the mechanics of finding these matches. This job is carried out by daughter classes. `FunctionalGroup` (now a daughter of `BaseFunctionalGroup`) serves as the base class for groups defined explicitly by a _single geometry_ around a central atom. The aromatic group is now handled by a separate class, `AromaticGroup`; this was the primary reason for the refactor.
What does this mean for users/devs? If you have defined a custom functional group, check if it can still inherit from `FunctionalGroup` (i.e. is defined via a single geometry) or if it should inherit directly from `BaseFunctionalGroup` (i.e. is a little more complex like an aromatic ring).
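The division of labor described above, with a generic base class that exposes `find_matches` and daughter classes that supply the matching logic, can be sketched as follows. Apart from the method name `find_matches`, everything here is hypothetical: the class names, the `_matches` hook, and the dictionary-based atoms are illustrations, not the classes in `structural.groups`.

```python
from abc import ABC, abstractmethod


class BaseGroupSketch(ABC):
    """Generic part: the public `find_matches` lives here."""

    def find_matches(self, atoms):
        # Shared bookkeeping; the matching rule itself is delegated.
        return [i for i, atom in enumerate(atoms) if self._matches(atom)]

    @abstractmethod
    def _matches(self, atom):
        """Subclasses decide what counts as a match."""


class SingleGeometryGroupSketch(BaseGroupSketch):
    """Group defined by one geometry around a central atom."""

    def __init__(self, element, n_neighbors):
        self.element = element
        self.n_neighbors = n_neighbors

    def _matches(self, atom):
        return atom["element"] == self.element and atom["neighbors"] == self.n_neighbors


class RingGroupSketch(BaseGroupSketch):
    """More complex group that is not described by a single central atom."""

    def _matches(self, atom):
        return atom.get("in_aromatic_ring", False)


atoms = [
    {"element": "C", "neighbors": 3, "in_aromatic_ring": True},
    {"element": "C", "neighbors": 4},
    {"element": "O", "neighbors": 2},
]
trigonal_carbons = SingleGeometryGroupSketch("C", 3).find_matches(atoms)
ring_atoms = RingGroupSketch().find_matches(atoms)
```

A custom group that fits the single-geometry pattern only needs to supply parameters, while anything more complex overrides the matching hook directly.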
Miscellaneous Changes
- In-place optimization with RDKit should now work properly
- `rdkit_optimize` now allows using either `mmff` or `uff` as the force field.
- New methods `Molecule.single(...)`, `Molecule.double(...)`, and `Molecule.triple(...)` to set bond orders more conveniently than just `Molecule.set_bond_order(...)`
- `Residue` objects can now use `name` as a synonym for `resname`
- `is_cis` and `is_trans` received an update that should make them more robust when it comes to symmetric molecules
- `infer_bond_orders` received an update that should make it a little faster
- other bug fixes and small performance enhancements
v.1.2.6
This release of BuildAMol features a small code change that makes BuildAMol compatible with several older Python versions.
Specifically, BuildAMol should be able to run on:
- Python 3.8 (tested on 3.8.19)
- Python 3.9 (tested on 3.9.19)
- Python 3.10 (tested on 3.10.14)
- Python 3.11 (main development on 3.11.0)
- Python 3.12 (tested on 3.12.2)
BuildAMol uses some string formatting features that make it unable to run on Python 3.7 and earlier. There are no plans to refactor the string formatting to add support for older Python versions.
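A script that depends on BuildAMol could guard against unsupported interpreters with a check along these lines. The helper `is_supported` and the constant `MIN_PYTHON` are hypothetical; only the 3.8 lower bound is taken from the list above.

```python
import sys

MIN_PYTHON = (3, 8)  # lowest version listed as supported above


def is_supported(version_info=None):
    """Return True if the (major, minor, ...) tuple is at least MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


supported_here = is_supported()
```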
v.1.2.5
This release of BuildAMol introduces new bioinformatics-related extensions, fixes a few bugs, and, again, slightly refactors the architecture to improve compatibility across systems (see below).
New Features
- Extended `bio` extension
  - The `bio.proteins` extension now has functions to compute `phi`, `psi` and `omega` angles of polypeptides.
  - New `bio.glycans` extension can model glycans from IUPAC string input
  - New `bio.lipids` extension offers functions to build:
    - fatty acids
    - triacylglycerols
    - phospholipids
    - sphingolipids
- Protonation status can now be changed automatically when changing bond orders and setting atom charges
- The use of Internal Coordinates (IC) when connecting molecules can now be globally disabled using `dont_use_ic()` (and re-enabled using `use_ic()`). This is useful if a given linkage with IC specifies relative coordinates for the wrong stereoisomer of a fragment molecule. Of course, `Molecule.attach(..., use_patch=False)` (or `Molecule.stitch_attach(...)`) can still be used to manually ensure that no ICs are involved in the process on an individual basis.
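The `phi`, `psi` and `omega` angles mentioned in the list above are dihedral (torsion) angles, each defined by four consecutive backbone atoms. The following is a generic sketch of how a dihedral angle can be computed from four points; the function `dihedral` and its helpers are hypothetical and are not the `bio.proteins` implementation.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Normalize the central bond vector.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
right = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

For a polypeptide, the four points would be the coordinates of the relevant backbone atoms; a planar trans peptide bond, for example, has an `omega` close to 180°.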
Technical changes
- The `base_classes` are no longer part of the `core` package but a stand-alone module.
  The accessibility of the defined objects (Atom, etc.) has not changed. But be sure to change any direct imports to the module from
  ```python
  import buildamol.core.base_classes as ...
  # or
  from buildamol.core import base_classes as ...
  ```
  to
  ```python
  import buildamol.base_classes as ...
  # or
  from buildamol import base_classes as ...
  ```
- Annoying warnings from repeated built-in dataset loading are now suppressed
- The utility progress bar `utils.auxiliary.progress_bar` can now use `tqdm` or `alive_progress` as backend.
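For code that has to run against releases both before and after the `base_classes` move noted above, one option is to try the new import path first and fall back to the old one. The helper `import_first` below is hypothetical and not part of BuildAMol; the demonstration uses standard-library modules so that it runs without BuildAMol installed.

```python
import importlib


def import_first(module_names):
    """Import and return the first module in `module_names` that is importable."""
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append(f"{name}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# Intended use (requires BuildAMol to be installed):
# base_classes = import_first(["buildamol.base_classes", "buildamol.core.base_classes"])

# Demonstration with standard-library modules so the sketch runs anywhere.
module = import_first(["module_that_does_not_exist_xyz", "json"])
```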
Notable bug fixes
- Coordinate bleeding when getting reference compounds from the built-in datasets is now fixed. Previously, molecules obtained by repeated calls to `get_compound` or `Molecule.from_compound` (or the `molecule` function), rather than through the `Molecule.copy` method, shared their coordinates.
- Fragment mirroring in `Molecule.stitch_attach` is now fixed. In some cases the rotation matrix computed in the `Stitcher` class had a negative determinant, which caused the `source` molecule to be mirrored. This should no longer happen.
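The mirroring issue comes down to a property of orthogonal matrices: a determinant of +1 means a pure rotation, while a determinant of -1 means the transformation includes a reflection and therefore inverts chirality. The check can be sketched in plain Python; `determinant3` and `is_proper_rotation` are hypothetical helpers, and the release notes do not say how the `Stitcher` class implements its fix.

```python
def determinant3(m):
    """Determinant of a 3x3 matrix given as nested sequences."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_proper_rotation(m, tol=1e-8):
    """True if the determinant is +1, i.e. no reflection is involved."""
    return abs(determinant3(m) - 1.0) < tol


# 90-degree rotation about the z-axis: determinant +1.
rot_z90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]

# Mirror through the xy-plane: determinant -1, would flip stereocenters.
mirror_xy = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]

rotation_ok = is_proper_rotation(rot_z90)
mirror_ok = is_proper_rotation(mirror_xy)
```

In SVD-based superposition methods such as the Kabsch algorithm, the usual remedy is to flip the sign of one singular vector whenever the determinant comes out negative.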