==================
- Add support for the ZODB 5 ``connection.prefetch(*args)`` API. This
takes either OIDs (``obj._p_oid``) or persistent ghost objects, or
an iterator of those things, and asks the storage to load them into
its cache for use in the future. In RelStorage, this uses the shared
cache and so may be useful for more than one thread. This can be
three or more times faster than loading objects on demand. See :issue:`239`.
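
A minimal sketch of how calling code might batch a prefetch request. The
helper ``prefetch_all`` and the ``FakeConnection`` stand-in are
hypothetical and exist only so the example runs without a database; with
a real ZODB 5 connection, only the ``connection.prefetch(...)`` call
matters.

.. code-block:: python

    def prefetch_all(connection, objects):
        """Ask the storage to preload *objects* (ghosts or OIDs) in one call."""
        # prefetch() accepts OIDs, persistent objects, or iterables of them,
        # so the whole list can be passed as a single argument.
        objects = list(objects)
        if objects:
            connection.prefetch(objects)
        return len(objects)


    class FakeConnection:
        """Hypothetical stand-in that records what was requested."""

        def __init__(self):
            self.requested = []

        def prefetch(self, *args):
            self.requested.extend(args)


    conn = FakeConnection()
    oids = [b"\x00" * 7 + b"\x01", b"\x00" * 7 + b"\x02"]  # 8-byte OIDs
    count = prefetch_all(conn, oids)

Batching the request lets the storage fetch everything in one round trip
instead of one load per object.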
- Stop chunking blob uploads on PostgreSQL. All supported PostgreSQL
versions natively handle blobs greater than 2GB in size, and the
server was already chunking the blobs for storage, so our layer of
extra chunking has become unnecessary.
.. important::
The first time a storage is opened with this version,
blobs that have multiple chunks will be collapsed into a single
chunk. If there are many blobs larger than 2GB, this could take
some time.
It is recommended you have a backup before installing this
version.
To verify that the blobs were correctly migrated, you should
clean or remove your configured blob-cache directory, forcing new
blobs to be downloaded.
- Fix a bug that left large objects behind if a PostgreSQL database
containing any blobs was ever zapped (with ``storage.zap_all()``).
The ``zodbconvert`` command, the ``zodbshootout`` command, and the
RelStorage test suite could all zap databases. Running the
``vacuumlo`` command included with PostgreSQL will free such
orphaned large objects, after which a regular ``vacuumdb`` command
can be used to reclaim space. See :issue:`260`.
- Conflict resolution can use data from the cache, thus potentially
eliminating a database hit during a very time-sensitive process.
Please file issues if you encounter any strange behaviour when
concurrently packing to the present time and also resolving
conflicts, in case there are corner cases.
- Packing a storage now invalidates the cached values that were packed
away. For the global caches this helps reduce memory pressure; for
the local cache this helps reduce memory pressure and ensure a more
useful persistent cache (this probably matters most when running on
a single machine).
- Make MySQL use ``ON DUPLICATE KEY UPDATE`` rather than ``REPLACE``.
This can be friendlier to the storage engine as it performs an
in-place ``UPDATE`` rather than a ``DELETE`` followed by an
``INSERT``. See :issue:`189`.
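
The difference between the two strategies can be illustrated with an
analogous experiment in SQLite, using only the Python standard library.
This is not MySQL and not RelStorage's schema: the ``demo`` table is
hypothetical, SQLite's ``ON CONFLICT ... DO UPDATE`` (SQLite 3.24 or
later) stands in for ``ON DUPLICATE KEY UPDATE``, and the hidden
``rowid`` is used to show whether the row was recreated.

.. code-block:: python

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (k TEXT PRIMARY KEY, v TEXT)")
    conn.executemany("INSERT INTO demo VALUES (?, ?)",
                     [("a", "old"), ("b", "old")])


    def rowid_of(key):
        row = conn.execute("SELECT rowid FROM demo WHERE k = ?", (key,))
        return row.fetchone()[0]


    before_a, before_b = rowid_of("a"), rowid_of("b")

    # REPLACE removes the conflicting row and inserts a brand new one.
    conn.execute("REPLACE INTO demo VALUES ('a', 'new')")

    # The upsert form updates the existing row where it already lives.
    conn.execute(
        "INSERT INTO demo VALUES ('b', 'new') "
        "ON CONFLICT(k) DO UPDATE SET v = excluded.v"
    )

    replaced_row_moved = rowid_of("a") != before_a
    upserted_row_kept = rowid_of("b") == before_b
    values = dict(conn.execute("SELECT k, v FROM demo"))

Both statements leave the same visible data; only the upsert avoids the
delete and reinsert.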
- Make PostgreSQL use an upsert query for moving rows into place on
history-preserving databases.
- Support ZODB 5's parallel commit feature. This means that the
database-wide commit lock is taken much later in the process, and
held for a much shorter time than before.
Previously, the commit lock was taken during the ``tpc_vote`` phase,
and held while we checked ``Connection.readCurrent`` values, and
checked for (and hopefully resolved) conflicts. Other transaction
resources (such as other ZODB databases in a multi-db setup) then
got to vote while we held this lock. Finally, in ``tpc_finish``,
objects were moved into place and the lock was released. This
prevented any other storage instances from checking for
``readCurrent`` or conflicts while we were doing that.
Now, ``tpc_vote`` is (usually) able to check
``Connection.readCurrent`` and check and resolve conflicts without
taking the commit lock. Only in ``tpc_finish``, when we need to
finally allocate the transaction ID, is the commit lock taken, and
only held for the duration needed to finally move objects into
place. This allows other storages for this database, and other
transaction resources for this transaction, to proceed with voting,
conflict resolution, etc, in parallel.
Consistent results are maintained by use of object-level row
locking. Thus, two transactions that attempt to modify the same
object will now block only each other, not unrelated transactions.
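
The locking order described above can be modelled with a toy, in-memory
sketch. It is not RelStorage's implementation: ``ToyStorage``, its
methods, and the use of ``threading.Lock`` in place of database row
locks are all assumptions made for illustration. A blocked vote here
simply times out, whereas a real storage would wait and then resolve or
report the conflict.

.. code-block:: python

    import threading


    class ToyStorage:
        """Toy model: per-object locks plus one short-lived commit lock."""

        def __init__(self):
            self._guard = threading.Lock()   # protects the lock table
            self._row_locks = {}             # oid -> threading.Lock
            self._commit_lock = threading.Lock()
            self._last_tid = 0
            self.data = {}

        def _row_lock(self, oid):
            with self._guard:
                return self._row_locks.setdefault(oid, threading.Lock())

        def vote(self, changes, timeout=0.1):
            """Lock only the objects being changed; no commit lock yet."""
            held = []
            for oid in sorted(changes):      # fixed order avoids deadlock
                lock = self._row_lock(oid)
                if not lock.acquire(timeout=timeout):
                    for taken in held:
                        taken.release()
                    return False
                held.append(lock)
            return True

        def finish(self, changes):
            """Hold the commit lock just long enough to allocate a TID."""
            with self._commit_lock:
                self._last_tid += 1
                tid = self._last_tid
                for oid, state in changes.items():
                    self.data[oid] = (tid, state)
            for oid in changes:
                self._row_lock(oid).release()
            return tid


    storage = ToyStorage()
    txn_a, txn_b, txn_c = {1: "a"}, {2: "b"}, {1: "c"}

    voted_a = storage.vote(txn_a)   # locks object 1
    voted_b = storage.vote(txn_b)   # different object, not blocked
    voted_c = storage.vote(txn_c)   # same object as txn_a, times out
    tid_a = storage.finish(txn_a)
    tid_b = storage.finish(txn_b)
    retry_c = storage.vote(txn_c)   # succeeds once txn_a has finished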
There are two exceptions. First, if the ``storage.restore()`` method
is used, the commit lock must be taken very early (before
``tpc_vote``). This is usually only done as part of copying one
database to another. Second, if the storage is configured with a
shared blob directory instead of a blob cache (meaning that blobs
are *only* stored on the filesystem) and the transaction has added
or mutated blobs, the commit lock must be taken somewhat early to
ensure blobs can be saved (after conflict resolution, etc, but
before the end of ``tpc_vote``). It is recommended to store blobs on
the RDBMS server and use a blob cache. The shared blob layout can be
considered deprecated for this reason.
In addition, the new locking scheme means that packing no longer
needs to acquire a commit lock and more work can proceed in parallel
with regular commits. (However, the deletion phase of packing may
have become slower on MySQL; this has not been benchmarked.)
.. note::
If the environment variable ``RELSTORAGE_LOCK_EARLY`` is
set when RelStorage is imported, then parallel commit will not be
enabled, and the commit lock will be taken at the beginning of
the ``tpc_vote`` phase, just like before: conflict resolution and
``readCurrent`` checks will all be handled with the lock held.
This is intended for use in diagnosing and temporarily working
around bugs, such as the database driver reporting a deadlock
error. If you find it necessary to use this setting, please
report an issue at https://github.com/zodb/relstorage/issues.
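
Because the variable is read at import time, it has to be set before the
first import of RelStorage in the process. A minimal sketch, assuming
that any non-empty value is sufficient (the value ``"1"`` is an
assumption; the text above says only that the variable must be set):

.. code-block:: python

    import os

    # Must run before RelStorage is first imported in this process.
    os.environ["RELSTORAGE_LOCK_EARLY"] = "1"

    try:
        import relstorage  # noqa: F401
    except ImportError:
        # Not installed here; the snippet still shows the required ordering.
        relstorage = None

Setting the variable in the shell or service definition that launches
the process achieves the same thing without touching application code.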
See :issue:`125`.
- Deprecate the option ``shared-blob-dir``. Shared blob dirs prevent
using parallel commits when blobs are part of a transaction.
- Remove the ``umysqldb`` driver option. This driver exhibited failures
with row-level locking used for parallel commits. See :issue:`264`.
- Migrate all remaining MySQL tables to InnoDB. This is primarily the
tables used during packing, but also the table used for allocating
new OIDs.
Tables will be converted the first time a storage is opened that is
allowed to create the schema (``create-schema`` in the
configuration; default is true). For large tables, this may take
some time, so it is recommended to finish any outstanding packs
before upgrading RelStorage.
If schema creation is not allowed, and required tables are not using
InnoDB, an exception will be raised. Please contact the RelStorage
maintainers on GitHub if you have a need to use a storage engine
besides InnoDB.
This allows for better error detection during packing with parallel
commits. It is also required for `MySQL Group Replication
<https://dev.mysql.com/doc/refman/8.0/en/group-replication-requirements.html>`_.
Benchmarking also shows that creating new objects can be up to 15%
faster due to faster OID allocation.
Things to be aware of:
- MySQL's `general conversion notes
<https://dev.mysql.com/doc/refman/8.0/en/converting-tables-to-innodb.html>`_
suggest that if you had tuned certain server parameters for
MyISAM tables (which RelStorage only used during packing) it
might be good to evaluate those parameters again.
- InnoDB tables may take more disk space than MyISAM tables.
- The ``new_oid`` table may temporarily have more rows in it at one
time than before. They will still be garbage collected
eventually. The change in strategy was necessary to handle
concurrent transactions better.
See :issue:`188`.
- Fix an ``OperationalError: database is locked`` that could occur on
startup if multiple processes were reading or writing the cache
database. See :issue:`266`.