Bitstring

Latest version: v4.3.0


2.2.0

---

June 18th 2011: version 2.2.0 released

This is a minor upgrade with a couple of new features.

New interleaved exponential-Golomb interpretations

New bit interpretations for interleaved exponential-Golomb (as used in the
Dirac video codec) are supplied via 'uie' and 'sie':

>>> s = BitArray(uie=41)
>>> s.uie
41
>>> s.bin
'0b00010001001'

These are pretty similar to the non-interleaved versions - see the manual
for more details. Credit goes to Paul Sargent for the patch.
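
To make the interleaved layout concrete, here is a minimal pure-Python sketch of the unsigned code, written independently of bitstring. The helper names encode_uie and decode_uie are hypothetical. The scheme shown (each bit after the leading 1 of value + 1 is preceded by a '0', with a final '1' as terminator) is an assumption about the Dirac-style code rather than the module's own implementation, although it does reproduce the bit pattern shown above for 41.

```python
def encode_uie(value):
    """Return the interleaved exp-Golomb code for a non-negative int as a '0'/'1' string."""
    if value < 0:
        raise ValueError("unsigned code needs a non-negative value")
    # Binary digits of value + 1 with the '0b' prefix and the leading '1' removed.
    data_bits = bin(value + 1)[3:]
    # Each data bit is preceded by a '0' ("more follows"); a lone '1' terminates.
    return ''.join('0' + bit for bit in data_bits) + '1'


def decode_uie(code):
    """Decode a single interleaved exp-Golomb code from the start of a '0'/'1' string."""
    value = 1          # the implicit leading 1 of value + 1
    pos = 0
    while code[pos] == '0':
        value = (value << 1) | int(code[pos + 1])
        pos += 2
    return value - 1


print(encode_uie(41))               # 00010001001
print(decode_uie('00010001001'))    # 41
```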

New package-level bytealigned variable

A number of methods take a 'bytealigned' parameter to indicate that they
should only work on byte boundaries (e.g. find, replace, split). Previously
this parameter defaulted to 'False'. Instead it now defaults to
'bitstring.bytealigned', which itself defaults to 'False', but can be changed
to modify the default behaviour of the methods. For example:

>>> a = BitArray('0x00 ff 0f ff')
>>> a.find('0x0f')
(4,)            # found first not on a byte boundary
>>> a.find('0x0f', bytealigned=True)
(16,)           # forced looking only on byte boundaries
>>> bitstring.bytealigned = True  # Change default behaviour
>>> a.find('0x0f')
(16,)
>>> a.find('0x0f', bytealigned=False)
(4,)

If you're only working with bytes then this can help avoid some errors and
save some typing!
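
The pattern behind this (a parameter whose default is looked up from a module-level variable at call time) can be sketched in plain Python. This is only a toy model working on strings of '0' and '1'; the names DEFAULT_BYTEALIGNED and find_bits are hypothetical and are not part of bitstring.

```python
DEFAULT_BYTEALIGNED = False   # hypothetical stand-in for bitstring.bytealigned


def find_bits(haystack, needle, bytealigned=None):
    """Return (pos,) for the first match of needle in haystack, or () if absent."""
    # None means "use whatever the module-level default is right now".
    if bytealigned is None:
        bytealigned = DEFAULT_BYTEALIGNED
    step = 8 if bytealigned else 1
    for pos in range(0, len(haystack) - len(needle) + 1, step):
        if haystack.startswith(needle, pos):
            return (pos,)
    return ()


bits = format(0x00ff0fff, '032b')      # same data as the example above
pattern = format(0x0f, '08b')
print(find_bits(bits, pattern))                     # (4,)
print(find_bits(bits, pattern, bytealigned=True))   # (16,)
```

Using None as the sentinel matters here: writing bytealigned=DEFAULT_BYTEALIGNED in the signature would freeze the value when the function is defined, so later changes to the module-level variable would be ignored.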

Other changes
- Fix for Python 3.2, correcting for a change to the binascii module.
- Fix for bool initialisation from 0 or 1.
- Efficiency improvements, including interning strategy.

2.1.1

---

February 23rd 2011: version 2.1.1 released

This is a release to fix a couple of bugs that were introduced in 2.1.0.
- Bug fix: Reading using the 'bytes' token had been broken (Issue 102).
- Fixed problem using some methods on ConstBitArrays.
- Better exception handling for tokens missing values.
- Some performance improvements.

2.1.0

---

January 23rd 2011: version 2.1.0 released

New class hierarchy introduced with simpler classes

Previously there were just two classes, the immutable Bits which was the base
class for the mutable BitString class. Both of these classes have the concept
of a bit position, from which reads etc. take place so that the bitstring could
be treated as if it were a file or stream.

Two simpler classes have now been added which are purely bit containers and
don't have a bit position. These are called ConstBitArray and BitArray. As you
can guess the former is an immutable version of the latter.

The other classes have also been renamed to better reflect their capabilities.
Instead of BitString you can use BitStream, and instead of Bits you can use
ConstBitStream. The old names are kept as aliases for backward compatibility.

The class hierarchy is:

        ConstBitArray
         /      \
        /        \
  BitArray   ConstBitStream (formerly Bits)
        \        /
         \      /
         BitStream (formerly BitString)
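
As a rough illustration of that diamond, the skeleton below uses empty placeholder classes with the same names. The order of the base classes for BitStream is an assumption made for the sketch, and none of this is the module's actual source.

```python
class ConstBitArray:
    """Immutable bit container, no bit position."""


class BitArray(ConstBitArray):
    """Mutable bit container, no bit position."""


class ConstBitStream(ConstBitArray):
    """Immutable, with a bit position for reading."""


class BitStream(ConstBitStream, BitArray):   # base order assumed
    """Mutable, with a bit position for reading."""


# The old names kept as aliases for backward compatibility.
Bits = ConstBitStream
BitString = BitStream

print([cls.__name__ for cls in BitStream.__mro__])
```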

Other changes

A lot of internal reorganisation has taken place since the previous version,
most of which won't be noticed by the end user. Some things you might see are:
- New package structure. Previous versions have been a single file for the
module and another for the unit tests. The module is now split into many
more files so it can't be used just by copying bitstring.py any more.
- To run the unit tests there is now a script called runtests.py in the test
directory.
- File-based bitstrings are now implemented in terms of an mmap. This should
be just an implementation detail, but unfortunately for 32-bit versions of
Python this creates a limit of 4GB on the files that can be used. The
workaround is either to get a 64-bit Python, or just stick with version 2.0.
- The ConstBitArray and ConstBitStream classes no longer copy byte data when
a slice or a read takes place, they just take a reference. This is mostly
a very nice optimisation, but there are occasions where it could have an
adverse effect. For example, if a very large bitstring is created, a small
slice taken and the original deleted, the byte data from the large
bitstring would still be retained in memory.
- Optimisations. Once again this version should be faster than the last.
The module is still pure Python but some of the reorganisation was to make
it more feasible to put some of the code into Cython or similar, so
hopefully more speed will be on the way.
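
The memory-retention effect of slices that reference rather than copy can be reproduced with the standard library alone. The sketch below uses a memoryview over a bytearray as an analogy; it says nothing about how bitstring stores its data internally.

```python
size = 1_000_000
big = bytearray(size)            # stands in for a large bitstring's byte data
small = memoryview(big)[:4]      # a "slice" that references the data, no copy

del big                          # the name is gone...
retained = len(small.obj)        # ...but the whole buffer is still alive
print(retained)                  # 1000000

independent = bytes(small)       # an explicit copy holds just the 4 bytes
small.release()                  # now the large buffer can be freed
print(len(independent))          # 4
```

If retention like this is a concern, making an explicit copy of the small piece is the usual way out.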

2.0.3

---

July 26th 2010: version 2.0.3 released
- Bug fix: Using peek and read for a single bit now returns a new bitstring
as was intended, rather than the old behaviour of returning a bool.
- Removed HTML docs from source archive - better to use the online version.

2.0.2

---

July 25th 2010: version 2.0.2 released

This is a major release, with a number of backwardly incompatible changes.
The main change is the removal of many methods, all of which have simple
alternatives. Other changes are quite minor but may need some recoding.

There are a few new features, most of which were added to help streamline
the API. As always there are performance improvements and
some API changes were made purely with future performance in mind.

The backwardly incompatible changes are:
- Methods removed.

About half of the class methods have been removed from the API. They all have
simple alternatives, so what remains is more powerful and easier to remember.
The removed methods are listed here on the left, with their equivalent
replacements on the right:

s.advancebit() -> s.pos += 1
s.advancebits(bits) -> s.pos += bits
s.advancebyte() -> s.pos += 8
s.advancebytes(bytes) -> s.pos += 8*bytes
s.allunset([a, b]) -> s.all(False, [a, b])
s.anyunset([a, b]) -> s.any(False, [a, b])
s.delete(bits, pos) -> del s[pos:pos+bits]
s.peekbit() -> s.peek(1)
s.peekbitlist(a, b) -> s.peeklist([a, b])
s.peekbits(bits) -> s.peek(bits)
s.peekbyte() -> s.peek(8)
s.peekbytelist(a, b) -> s.peeklist([8*a, 8*b])
s.peekbytes(bytes) -> s.peek(8*bytes)
s.readbit() -> s.read(1)
s.readbitlist(a, b) -> s.readlist([a, b])
s.readbits(bits) -> s.read(bits)
s.readbyte() -> s.read(8)
s.readbytelist(a, b) -> s.readlist([8*a, 8*b])
s.readbytes(bytes) -> s.read(8*bytes)
s.retreatbit() -> s.pos -= 1
s.retreatbits(bits) -> s.pos -= bits
s.retreatbyte() -> s.pos -= 8
s.retreatbytes(bytes) -> s.pos -= 8*bytes
s.reversebytes(start, end) -> s.byteswap(0, start, end)
s.seek(pos) -> s.pos = pos
s.seekbyte(bytepos) -> s.bytepos = bytepos
s.slice(start, end, step) -> s[start:end:step]
s.tell() -> s.pos
s.tellbyte() -> s.bytepos
s.truncateend(bits) -> del s[-bits:]
s.truncatestart(bits) -> del s[:bits]
s.unset([a, b]) -> s.set(False, [a, b])

Many of these methods have been deprecated for the last few releases, but
there are some new removals too. Any recoding needed should be quite
straightforward, so while I apologise for the hassle, I had to take the
opportunity to streamline and rationalise what was becoming a bit of an
overblown API.
- set / unset methods combined.

The set/unset methods have been combined in a single method, which now
takes a boolean as its first argument:

s.set([a, b]) -> s.set(1, [a, b])
s.unset([a, b]) -> s.set(0, [a, b])
s.allset([a, b]) -> s.all(1, [a, b])
s.allunset([a, b]) -> s.all(0, [a, b])
s.anyset([a, b]) -> s.any(1, [a, b])
s.anyunset([a, b]) -> s.any(0, [a, b])

- all / any only accept iterables.

The all and any methods (previously called allset, allunset, anyset and
anyunset) no longer accept a single bit position. The recommended way of
testing a single bit is just to index it, for example instead of:

>>> if s.all(True, i):

just use

>>> if s[i]:

If you really want to you can of course use an iterable with a single
element, such as 's.any(False, [i])', but it's clearer just to write
'not s[i]'.
- Exception raised on reading off end of bitstring.

If a read or peek goes beyond the end of the bitstring then a ReadError
will be raised. The previous behaviour was that the rest of the bitstring
would be returned and no exception raised.
- BitStringError renamed to Error.

The base class for errors in the bitstring module is now just Error, so
it will likely appear in your code as bitstring.Error instead of
the rather repetitive bitstring.BitStringError.
- Single bit slices return a bool.

A single index slice (such as s[5]) will now return a bool (i.e. True or
False) rather than a single bit bitstring. This is partly to reflect the
style of the bytearray type, which returns an integer for single items, but
mostly to avoid common errors like:

>>> if s[0]:
... do_something()

While the intent of this code snippet is quite clear (i.e. do_something if
the first bit of s is set) under the old rules s[0] would be true as long
as s wasn't empty. That's because any one-bit bitstring was true as it was a
non-empty container. Under the new rule s[0] is True if s starts with a '1'
bit and False if s starts with a '0' bit.

The change does not affect reads and peeks, so s.peek(1) will still return
a single bit bitstring, which leads on to the next item...
- Empty bitstrings or bitstrings with only zero bits are considered False.

Previously a bitstring was False if it had no elements, otherwise it was True.
This is standard behaviour for containers, but wasn't very useful for a container
of just 0s and 1s. The new behaviour means that the bitstring is False if it
has no 1 bits. This means that code like this:

>>> if s.peek(1):
... do_something()

should work as you'd expect. It also means that Bits(1000), Bits(0x00) and
Bits('uint:12=0') are all also False. If you need to check for the emptiness of
a bitstring then instead check the len property:

if s -> if s.len
if not s -> if not s.len
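
The two truthiness rules described above (a single index gives a bool, and a bitstring is only true if it contains a 1 bit) can be modelled with a few lines of Python. TinyBits is a hypothetical toy class for illustration only; it is written for Python 3, where the hook is __bool__ (Python 2 uses __nonzero__).

```python
class TinyBits:
    """Hypothetical toy container holding a string of '0' and '1' characters."""

    def __init__(self, bits=''):
        if set(bits) - {'0', '1'}:
            raise ValueError("only '0' and '1' are allowed")
        self._bits = bits

    def __len__(self):
        return len(self._bits)

    def __bool__(self):
        # True only if at least one bit is set; takes priority over __len__.
        return '1' in self._bits

    def __getitem__(self, index):
        # A single index returns a bool rather than a one-bit container.
        return self._bits[index] == '1'


zeros = TinyBits('000')
print(bool(zeros), len(zeros))      # False 3
print(bool(TinyBits('010')))        # True
print(TinyBits('010')[1])           # True
```

Because __bool__ no longer reports emptiness, the length has to be checked explicitly when that is what you mean.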

- Length and offset disallowed for some initialisers.

Previously you could create a bitstring using expressions like:

>>> s = Bits(hex='0xabcde', offset=4, length=13)

This has now been disallowed, and the offset and length parameters may only
be used when initialising with bytes or a file. To replace the old behaviour
you could instead use

>>> s = Bits(hex='0xabcde')[4:17]

- Renamed 'format' parameter 'fmt'.

Methods with a 'format' parameter have had it renamed to 'fmt', to prevent
hiding the built-in 'format'. Affects methods unpack, read, peek, readlist,
peeklist and byteswap and the pack function.
- Iterables instead of *format accepted for some methods.

This means that for the affected methods (unpack, readlist and peeklist) you
will need to use an iterable to specify multiple items. This is easier to
show than to describe, so instead of

>>> a, b, c = s.readlist('uint:12', 'hex:4', 'bin:7')

you would instead write

>>> a, b, c = s.readlist(['uint:12', 'hex:4', 'bin:7'])

Note that you could still use the single string 'uint:12, hex:4, bin:7' if
you preferred.
- Bool auto-initialisation removed.

You can no longer use True and False to initialise single bit bitstrings.
The reasoning behind this is that as bool is a subclass of int, it really is
bad practice to have Bits(False) be different to Bits(0) and to have Bits(True)
different to Bits(1).

If you have used bool auto-initialisation then you will have to be careful to
replace it as the bools will now be interpreted as ints, so Bits(False) will
be empty (a bitstring of length 0), and Bits(True) will be a single zero bit
(a bitstring of length 1). Sorry for the confusion, but I think this will
prevent bigger problems in the future.

There are a few alternatives for creating a single bit bitstring. My favourite
is to use a list with a single item:

Bits(False) -> Bits([0])
Bits(True) -> Bits([1])
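
The reasoning about bool being a subclass of int can be checked directly in Python. The two helper functions below are hypothetical models of "initialise from an integer" (that many zero bits) and "initialise from a list" (one bit per item), returning plain strings rather than bitstrings.

```python
def bits_from_int(n):
    """Model of integer initialisation: a run of n zero bits."""
    return '0' * n


def bits_from_list(items):
    """Model of list initialisation: one bit per item, by truthiness."""
    return ''.join('1' if item else '0' for item in items)


print(issubclass(bool, int))        # True
print(True == 1, False == 0)        # True True

# Passing a bool where an int is expected silently treats it as 1 or 0.
print(repr(bits_from_int(True)))    # '0'
print(repr(bits_from_int(False)))   # ''

# The list form says what is meant.
print(bits_from_list([True]), bits_from_list([False]))   # 1 0
```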

- New creation from file strategy

Previously if you created a bitstring from a file, either by auto-initialising
with a file object or using the filename parameter, the file would not be read
into memory unless you tried to modify it, at which point the whole file would
be read.

The new behaviour depends on whether you create a Bits or a BitString from the
file. If you create a Bits (which is immutable) then the file will never be
read into memory. This allows very large files to be opened for examination
even if they could never fit in memory.

If however you create a BitString, the whole of the referenced file will be read
to store in memory. If the file is very big this could take a long time, or fail,
but the idea is that in saying you want the mutable BitString you are implicitly
saying that you want to make changes and so (for now) we need to load it into
memory.

The new strategy is a bit more predictable in terms of performance than the old.
The main point to remember is that if you want to open a file and don't plan to
alter the bitstring then use the Bits class rather than BitString.

Just to be clear, in neither case will the contents of the file ever be changed -
if you want to output the modified BitString then use the tofile method, for
example.
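
The difference between the two strategies can be sketched with the standard library. Here a read-only mmap stands in for "examine the file without reading it into memory" and a bytearray stands in for "load everything so it can be modified". Both are assumptions chosen for illustration, since these notes do not say how the module does it.

```python
import mmap
import os
import tempfile

# Create a small throwaway file to stand in for a large data file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'\x00\xff\x0f\xff')
    path = tmp.name

try:
    with open(path, 'rb') as f:
        # Immutable-style access: map the file read-only and index into it.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            size = len(view)
            first_byte = view[0]

        # Mutable-style access: pull the whole file into a bytearray.
        f.seek(0)
        in_memory = bytearray(f.read())

    in_memory[0] = 0x12              # changes the in-memory copy only

    with open(path, 'rb') as f:
        on_disk = f.read()           # the file itself is untouched
finally:
    os.remove(path)

print(size, first_byte, on_disk == b'\x00\xff\x0f\xff')
```
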
- find and rfind return a tuple instead of a bool.

If a find is unsuccessful then an empty tuple is returned (which is False in a
boolean sense) otherwise a single item tuple with the bit position is returned
(which is True in a boolean sense). You shouldn't need to recode unless you
explicitly compared the result of a find to True or False, for example this
snippet doesn't need to be altered:

>>> if s.find('0x23'):
... print(s.bitpos)

but you could now instead use

>>> found = s.find('0x23')
>>> if found:
... print(found[0])

The reason for returning the bit position in a tuple is so that finding at
position zero can still be True - it's the tuple (0,) - whereas not found can
be False - the empty tuple ().
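
The problem the tuple solves is easy to show with ordinary strings. In the sketch below, find_as_int and find_as_tuple are hypothetical helpers wrapping str.find; they only illustrate the truthiness argument and are unrelated to the bitstring source.

```python
def find_as_int(text, sub):
    """Return the position as a bare int, or None if absent."""
    pos = text.find(sub)
    return None if pos == -1 else pos


def find_as_tuple(text, sub):
    """Return (pos,) if found, or () if absent."""
    pos = text.find(sub)
    return (pos,) if pos != -1 else ()


# A match at position zero looks like "not found" when tested as a bare int...
print(bool(find_as_int('abc', 'a')))      # False
# ...but is unambiguous when wrapped in a tuple.
print(bool(find_as_tuple('abc', 'a')))    # True
print(find_as_tuple('abc', 'z'))          # ()
```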

The new features in this release are:
- New count method.

This method just counts the number of 1 or 0 bits in the bitstring.

>>> s = Bits('0x31fff4')
>>> s.count(1)
16

- read and peek methods accept integers.

The read, readlist, peek and peeklist methods now accept integers as parameters
to mean "read this many bits and return a bitstring". This has allowed a number
of methods to be removed from this release, so for example instead of:

>>> a, b, c = s.readbits(5, 6, 7)
>>> if s.peekbit():
... do_something()

you should write:

>>> a, b, c = s.readlist([5, 6, 7])
>>> if s.peek(1):
... do_something()

- byteswap used to reverse all bytes.

The byteswap method now allows a format specifier of 0 (the default) to signify
that all of the whole bytes should be reversed. This means that calling just
byteswap() is almost equivalent to the now removed reversebytes() method (a small
difference is that byteswap won't raise an exception if the bitstring isn't a
whole number of bytes long).
- Auto initialise with bytearray or (for Python 3 only) bytes.

So rather than writing:

>>> a = Bits(bytes=some_bytearray)

you can just write

>>> a = Bits(some_bytearray)

This also works for the bytes type, but only if you're using Python 3.
For Python 2 it's not possible to distinguish between a bytes object and a
str. For this reason this method should be used with some caution as it will
make your code behave differently with the different major Python versions.

>>> b = Bits(b'abcd\x23\x00')  # Only Python 3!

- set, invert, all and any default to whole bitstring.

This means that you can for example write:

>>> a = BitString(100)  # 100 zero bits
>>> a.set(1)            # set all bits to 1
>>> a.all(1)            # are all bits set to 1?
True
>>> a.any(0)            # are any set to 0?
False
>>> a.invert()          # invert every bit

- New exception types.

As well as renaming BitStringError to just Error
there are also new exceptions which use Error as a base class.

These can be caught in preference to Error if you need finer control.
The new exceptions sometimes also derive from built-in exceptions:

ByteAlignError(Error) - whole byte position or length needed.

ReadError(Error, IndexError) - reading or peeking off the end of
the bitstring.

CreationError(Error, ValueError) - inappropriate argument during
bitstring creation.

InterpretError(Error, ValueError) - inappropriate interpretation of
binary data.
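
Deriving from the built-in exceptions as well as Error means existing handlers for IndexError or ValueError keep working. The sketch below re-declares the hierarchy as listed above with empty placeholder classes; read_bits is a hypothetical helper working on a plain string, used only to trigger the error.

```python
class Error(Exception):
    """Base class for the sketch's errors."""


class ByteAlignError(Error):
    """Whole byte position or length needed."""


class ReadError(Error, IndexError):
    """Reading or peeking off the end."""


class CreationError(Error, ValueError):
    """Inappropriate argument during creation."""


class InterpretError(Error, ValueError):
    """Inappropriate interpretation of binary data."""


def read_bits(bits, pos, length):
    """Hypothetical read that refuses to run off the end of the data."""
    if pos + length > len(bits):
        raise ReadError("cannot read %d bits at position %d" % (length, pos))
    return bits[pos:pos + length]


try:
    read_bits('1010', 2, 8)
except IndexError as exc:           # a handler written for the built-in type
    caught = type(exc).__name__

print(caught)                       # ReadError
```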

1.3.0

---

March 18th 2010: version 1.3.0 for Python 2.6 and 3.x released

New features:
- byteswap method for changing endianness.

Changes the endianness in-place according to a format string or
integer(s) giving the byte pattern. See the manual for details.

>>> s = BitString('0x00112233445566')
>>> s.byteswap(2)
3
>>> s
BitString('0x11003322554466')
>>> s.byteswap('h')
3
>>> s
BitString('0x00112233445566')
>>> s.byteswap([2, 5])
1
>>> s
BitString('0x11006655443322')
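
For readers who want to see the grouping logic spelled out, here is a minimal sketch operating on a bytearray. The function name byteswap_sketch is hypothetical. It handles only integers and lists of integers (not format strings, start or end), and its repeat behaviour is inferred from the return values in the session above rather than taken from the module.

```python
def byteswap_sketch(data, pattern=0):
    """Reverse bytes in place, group by group; return how many times the pattern fitted."""
    sizes = [pattern] if isinstance(pattern, int) else list(pattern)
    if sizes == [0]:
        sizes = [len(data)]          # 0 means "reverse all the bytes"
    total = sum(sizes)
    if total == 0:
        return 0
    repeats = len(data) // total     # leftover bytes at the end are untouched
    pos = 0
    for _ in range(repeats):
        for size in sizes:
            data[pos:pos + size] = data[pos:pos + size][::-1]
            pos += size
    return repeats


d = bytearray.fromhex('00112233445566')
print(byteswap_sketch(d, 2), d.hex())        # 3 11003322554466
print(byteswap_sketch(d, 2), d.hex())        # 3 00112233445566
print(byteswap_sketch(d, [2, 5]), d.hex())   # 1 11006655443322
```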

- Multiplicative factors in bitstring creation and reading.

For example:

>>> s = Bits('100*0x123')

- Token grouping using parentheses.

For example:

>>> s = Bits('3*(uint:6=3, 0b1)')

- Negative slice indices allowed.

The start and end parameters of many methods may now be negative, with the
same meaning as for negative slice indices. Affects all methods with these
parameters.
- Sequence ABCs used.

The Bits class now derives from collections.Sequence, while the BitString
class derives from collections.MutableSequence.
- Keywords allowed in readlist, peeklist and unpack.

Keywords for token lengths are now permitted when reading. So for example,
you can write

>>> s = bitstring.pack('4*(uint:n)', 2, 3, 4, 5, n=7)
>>> s.unpack('4*(uint:n)', n=7)
[2, 3, 4, 5]

- start and end parameters added to rol and ror.
- join function accepts other iterables.

Also its parameter has changed from 'bitstringlist' to 'sequence'. This is
technically a backward incompatibility in the unlikely event that you are
referring to the parameter by name.
- __init__ method accepts keywords.

Rather than a long list of initialisers the __init__ methods now use a
**kwargs dictionary for all initialisers except 'auto'. This should have no
effect, except that this is a small backward incompatibility if you use
positional arguments when initialising with anything other than auto
(which would be rather unusual).
- More optimisations.
- Bug fixed in replace method (it could fail if start != 0).
