pycassa Changelog

1.8.0

This release requires either Python 2.6 or 2.7. Python 2.4 and 2.5 are no
longer supported. There are no concrete plans for Python 3 compatibility
yet.

Features

* Add configurable socket_factory attribute and constructor parameter to
ConnectionPool and SystemManager.
* Add SSL support via the new socket_factory attribute.
* Add support for DynamicCompositeType
* Add mock support through a new pycassa.contrib.stubs module

Bug Fixes

* Don’t return closed connections to the pool. This was primarily a
problem when operations failed after retrying up to the limit,
resulting in a MaximumRetryException or AllServersUnavailable.
* Set keyspace for connection after logging in instead of before. This
fixes authentication against Cassandra 1.2, which requires logging in
prior to setting a keyspace.
* Specify correct UUID variant when creating v1 uuid.UUID objects from
datetimes or timestamps
* Add 900ns to v1 uuid.UUID timestamps when the “max” TimeUUID for a
specific datetime or timestamp is requested, such as a column slice end
* Also look at attributes of parent classes when creating columns from
attributes in ColumnFamilyMap
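
The two TimeUUID fixes above concern how a datetime maps onto the version 1 UUID layout. A v1 UUID timestamp counts 100 ns intervals since 1582-10-15, so one microsecond (the resolution of a Python datetime) spans ten ticks, and adding nine ticks (900 ns) reaches the last tick inside that microsecond. The sketch below uses only the standard library and is not pycassa's code; the helper name v1_uuid_for and the zeroed clock sequence and node are assumptions made for illustration.

```python
import calendar
import uuid
from datetime import datetime

# 100 ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def v1_uuid_for(dt, max_in_microsecond=False):
    """Hypothetical helper: build a v1 UUID whose timestamp matches dt.

    Naive datetimes are treated as UTC.
    """
    micros = calendar.timegm(dt.utctimetuple()) * 1000000 + dt.microsecond
    ticks = micros * 10 + UUID_EPOCH_OFFSET
    if max_in_microsecond:
        ticks += 9  # 9 * 100 ns = 900 ns, the last tick in this microsecond
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000  # version 1
    clock_seq_hi_variant = 0x80  # top bits "10" mark the RFC 4122 variant
    # Clock sequence and node are zeroed here purely for illustration.
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, 0x00, 0x000000000000))


moment = datetime(2013, 1, 1, 12, 0, 0, 123456)
low = v1_uuid_for(moment)
high = v1_uuid_for(moment, max_in_microsecond=True)
```

Both objects report version 1 and the RFC 4122 variant, and their time attributes differ by exactly nine ticks.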

Other

* Upgrade bundled Thrift-generated python to 19.35.0, generated with
Thrift 0.9.0.

1.7.2

This release fixes a minor bug and upgrades the bundled Cassandra
Thrift client interface to 19.34.0, matching Cassandra 1.2.0-beta1.
This doesn't affect any existing Thrift methods, only adds new ones
(that aren't yet utilized by pycassa), so there should not be any
breakage.

Bug Fixes

* Fix single-component composite packing
* Avoid cyclic imports during installation in setup.py

Other

* Travis CI integration

1.7.1

This release has few changes, and should make for a smooth upgrade
from 1.7.0.

Features

* Add support for DecimalType

Bug Fixes

* Fix bad slice ends when using xget() with composite columns and a
column_finish parameter
* Fix bad documentation paths in debian packaging scripts

Other

* Add __version__ and __version_info__ attributes to the pycassa module

1.7.0

This release has a few relatively large changes in it: a new connection
pool stats collector, compatibility with Cassandra 0.7 through 1.1, and a
change in timezone behavior for datetimes.

Before upgrading, take special care to make sure datetimes that you pass to
pycassa (for TimeUUIDType or DateType data) are in UTC, and make sure your code
expects to get UTC datetimes back in return.

Likewise, the SystemManager changes should be backwards compatible, but there
may be minor differences, mostly in create_column_family() and
alter_column_family(). Be sure to test any code that works programmatically
with these.

Features

* Added StatsLogger for tracking ConnectionPool metrics
* Full Cassandra 1.1 compatibility in SystemManager. To support this, all
column family or keyspace attributes that have existed since Cassandra 0.7 may
be used as keyword arguments for create_column_family() and
alter_column_family(). It is up to the user to know which attributes are
available and valid for their version of Cassandra. As part of this change, the
version-specific thrift-generated cassandra modules (pycassa.cassandra.c07,
pycassa.cassandra.c08, and pycassa.cassandra.c10) have been replaced by
pycassa.cassandra. A minor related change is that individual connections
no longer ask for the node’s API version, and that information is no longer
stored as an attribute of the ConnectionWrapper.

Bug Fixes

* Fix xget() paging for non-string comparators
* Add batch_insert() to ColumnFamilyMap
* Use setattr instead of directly updating the object’s __dict__ in
ColumnFamilyMap to avoid breaking descriptors
* Fix single-column counter increments with ColumnFamily.insert()
* Include AuthenticationException and AuthorizationException in the pycassa module
* Support counters in xget()
* Sort column families in pycassaShell for display
* Raise TypeError when bad keyword arguments are used when creating a ColumnFamily object
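
The setattr fix in the list above matters for mapped classes that define properties or other data descriptors. The following standalone sketch is not pycassa code; the User class and its name property are hypothetical.

```python
class User(object):
    """Hypothetical mapped class with a property that normalises its value."""

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value.strip().lower()


# Writing to __dict__ bypasses the property setter entirely, so the
# backing field that the getter relies on is never created.
bypassed = User()
bypassed.__dict__['name'] = '  Alice '
bypassed_has_backing_field = hasattr(bypassed, '_name')

# setattr() goes through the descriptor, so the setter runs.
respected = User()
setattr(respected, 'name', '  Alice ')
```

In the first case the property is silently skipped and reading bypassed.name would raise AttributeError; in the second, the setter runs as the class author intended.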

Other

All datetime objects created by pycassa now use UTC as their timezone
rather than the local timezone. Likewise, naive datetime objects that
are passed to pycassa are now assumed to be in UTC, but tzinfo is
respected if set.

Specifically, the types of data that you may need to make adjustments
for when upgrading are TimeUUIDType and DateType (including OldPycassaDateType
and IntermediateDateType).
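
The rule described above (naive datetimes are read as UTC, while tzinfo is respected when set) can be illustrated with the standard library alone. This is a sketch of the semantics, not pycassa's implementation: to_epoch_seconds is a hypothetical helper, and datetime.timezone requires Python 3.

```python
import calendar
from datetime import datetime, timedelta, timezone


def to_epoch_seconds(dt):
    """Hypothetical helper: naive datetimes are read as UTC, aware ones are converted."""
    # utctimetuple() leaves naive values untouched and shifts aware values to UTC.
    return calendar.timegm(dt.utctimetuple())


naive = datetime(2012, 6, 1, 12, 0, 0)  # assumed to already be UTC
aware_utc = datetime(2012, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
utc_minus_5 = timezone(timedelta(hours=-5))
aware_local = datetime(2012, 6, 1, 7, 0, 0, tzinfo=utc_minus_5)  # same instant

# Values read back are expressed in UTC rather than the local timezone.
round_trip = datetime.fromtimestamp(to_epoch_seconds(naive), timezone.utc)
```

All three inputs map to the same instant. Code that previously passed naive local-time datetimes would need to convert them to UTC (or attach a tzinfo) to keep the stored values unchanged.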

1.6.0

This release adds a few minor features and several important bug fixes.

The most important change to take note of if you are using composite
comparators is the change to the default inclusive/exclusive behavior for slice
ends.

Other than that, this should be a smooth upgrade from 1.5.x.

Features

* New script for easily building RPM packages
* Add request and parameter information to PoolListener callback
* Add ColumnFamily.xget(), a generator version of get() that automatically
pages over columns in reasonably sized chunks
* Add support for Int32Type, a 4-byte signed integer format
* Add constants for the highest and lowest possible TimeUUID values to
pycassa.util
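
The paging that xget() performs can be pictured as a generator that repeatedly fetches a bounded slice and resumes from the last column it saw. The sketch below runs against an in-memory sorted list rather than Cassandra; fetch_slice, paged_columns, and page_size are hypothetical names, and this is not pycassa's implementation.

```python
import bisect

# Stand-in for one wide row: sorted (column_name, value) pairs.
ROW = [(i, 'v%d' % i) for i in range(10)]
NAMES = [name for name, _ in ROW]


def fetch_slice(start, count):
    """Hypothetical backend call: up to `count` columns with name >= start."""
    begin = 0 if start is None else bisect.bisect_left(NAMES, start)
    return ROW[begin:begin + count]


def paged_columns(page_size=4):
    """Yield every column, fetching at most `page_size` columns per request."""
    if page_size < 2:
        # With an inclusive start, a page of one column could never advance.
        raise ValueError('page_size must be at least 2')
    start = None
    while True:
        page = fetch_slice(start, page_size)
        # Slice starts are inclusive, so every page after the first repeats
        # the previous page's last column; drop that duplicate.
        fresh = page if start is None else page[1:]
        for column in fresh:
            yield column
        if len(page) < page_size:
            return  # a short page means the row is exhausted
        start = page[-1][0]


columns = list(paged_columns())
```

Because the generator holds only one page at a time, memory use stays bounded even for very wide rows.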

Bug Fixes

* Fix various Python 2.4 syntax errors
* Raise AllServersUnavailable if server_list is empty
* Handle custom types inside of composites
* Don’t erase comment when updating column families
* Match Cassandra’s sorting of TimeUUIDType values when the timestamps
tie. The previous mismatch could result in some columns being erroneously
left off of the end of column slices when datetime objects or timestamps
were used for column_start or column_finish.
* Use gevent’s queue in place of the stdlib version when gevent
monkeypatching has been applied.
* Avoid sub-microsecond loss of precision with TimeUUID timestamps when
using pycassa.util.convert_time_to_uuid()
* Make default slice ends inclusive when using CompositeType comparator.
Previously, the end of the slice was exclusive by default (as was the
start of the slice when column_reversed was True).

1.5.1

This release only affects those of you using DateType data, which has been
supported since pycassa 1.2.0. If you are using DateType, it is very
important that you read this closely.

DateType data is internally stored as an 8 byte integer timestamp. Since
version 1.2.0 of pycassa, the timestamp stored has counted the number of
microseconds since the unix epoch. The actual format that Cassandra
standardizes on is milliseconds since the epoch.

If you are only using pycassa, you probably won’t have noticed any problems
with this. However, if you try to use cassandra-cli, sstable2json, Hector,
or any other client that supports DateType, DateType data written by pycassa
will appear to be far in the future. Similarly, DateType data written by
other clients will appear to be in the past when loaded by pycassa.
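
To make the mismatch concrete, the following standard-library sketch (not pycassa code) encodes one date both ways and shows what each kind of reader would see. The variable names are illustrative only.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
moment = datetime(2012, 3, 1, 0, 0, 0)
elapsed = moment - EPOCH

# What pycassa 1.2.0 to 1.5.0 stored (microseconds) versus the standard
# format (milliseconds) for the same moment.
as_micros = (elapsed.days * 86400 + elapsed.seconds) * 1000000 + elapsed.microseconds
as_millis = as_micros // 1000

# A millisecond-based reader treats the microsecond value as milliseconds.
# The result is too large for datetime, so estimate the year arithmetically.
misread_seconds = as_micros / 1000.0
misread_year = 1970 + misread_seconds / (365.25 * 86400)

# A microsecond-based reader treats the millisecond value as microseconds,
# which lands only a couple of weeks after the epoch.
misread_past = EPOCH + timedelta(microseconds=as_millis)
```

Here misread_year comes out tens of thousands of years in the future, while misread_past falls in January 1970.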

This release changes the default DateType behavior to comply with the
standard, millisecond-based format. If you use DateType, and you upgrade to
this release without making any modifications, you will have problems.
Unfortunately, this is a bit of a tricky situation to resolve, but the
appropriate actions to take are detailed below.

To temporarily continue using the old behavior, a new class has been
created: pycassa.types.OldPycassaDateType. This will read and write DateType
data exactly the same as pycassa 1.2.0 to 1.5.0 did.

If you want to convert your data to the new format, the other new class,
pycassa.types.IntermediateDateType, may be useful. It can read either the
new or old format correctly (unless you have used dates close to 1970 with
the new format) and will write only the new format. The best case for using
this is if you have DateType validated columns that don’t have a secondary
index on them.

To tell pycassa to use OldPycassaDateType or IntermediateDateType, use the
ColumnFamily attributes that control types: column_name_class,
key_validation_class, column_validators, and so on. Here’s an example:

from pycassa.types import OldPycassaDateType, IntermediateDateType
from pycassa.column_family import ColumnFamily
from pycassa.pool import ConnectionPool

pool = ConnectionPool('MyKeyspace', ['192.168.1.1'])

# Our tweet timeline has a comparator_type of DateType
tweet_timeline_cf = ColumnFamily(pool, 'tweets')
tweet_timeline_cf.column_name_class = OldPycassaDateType()

# The join_date column of our users has a DateType validator
users_cf = ColumnFamily(pool, 'users')
users_cf.column_validators['join_date'] = IntermediateDateType()

If you’re using DateType for the key_validation_class, column names, column
values with a secondary index on them, or are using the DateType validated
column as a non-indexed part of an index clause with get_indexed_slices()
(e.g. “where state = ‘TX’ and join_date > 2012”), you need to be more careful
about the conversion process, and IntermediateDateType probably isn’t a good
choice.

In most cases, if you want to switch to the new date format, a manual
migration script to convert all existing DateType data to the new format
will be needed. In particular, if you convert keys, column names, or indexed
columns on a live data set, be very careful how you go about it. If you need
any assistance or suggestions at all with migrating your data, please feel
free to send an email to tyler@datastax.com; I would be glad to help.
