Bitstring Manual


A Python module to help you manage your bits

by Scott Griffiths

version 3.0.0

November 20, 2011

python-bitstring.googlecode.com

CONTENTS

I User Manual

1 Walkthrough
    1.1 A Brief Introduction
        1.1.1 Prerequisites
        1.1.2 Getting started
        1.1.3 Modifying bitstrings
        1.1.4 Finding and Replacing
        1.1.5 Constructing a bitstring
        1.1.6 Parsing bitstreams
    1.2 Worked examples
        1.2.1 Hamming distance
        1.2.2 Sieve of Eratosthenes
2 Introduction
    2.1 Getting Started
3 Creation
    3.1 The bitstring classes
    3.2 Using the constructor
        3.2.1 From a hexadecimal string
        3.2.2 From a binary string
        3.2.3 From an octal string
        3.2.4 From an integer
        3.2.5 Big and little-endian integers
        3.2.6 From a floating point number
        3.2.7 Exponential-Golomb codes
        3.2.8 From raw byte data
        3.2.9 From a file
    3.3 The auto initialiser
4 Packing
    4.1 Compact format strings
5 Interpreting Bitstrings
    5.1 bin
    5.2 hex
    5.3 oct
    5.4 uint / uintbe / uintle / uintne
    5.5 int / intbe / intle / intne
    5.6 float / floatbe / floatle / floatne
    5.7 bytes
    5.8 ue
    5.9 se
    5.10 uie / sie
6 Slicing, Dicing and Splicing
    6.1 Slicing
        6.1.1 Stepping in slices
    6.2 Joining
    6.3 Truncating, inserting, deleting and overwriting
        6.3.1 Deleting and truncating
        6.3.2 insert
        6.3.3 overwrite
    6.4 The bitstring as a list
    6.5 Splitting
        6.5.1 split
        6.5.2 cut
7 Reading, Parsing and Unpacking
    7.1 Reading and parsing
        7.1.1 read / readlist
        7.1.2 Reading using format strings
        7.1.3 Peeking
    7.2 Unpacking
    7.3 Seeking
    7.4 Finding and replacing
        7.4.1 find / rfind
        7.4.2 findall
        7.4.3 replace
    7.5 Working with byte aligned data
8 Miscellany
    8.1 Other Functions
        8.1.1 bytealign
        8.1.2 reverse
        8.1.3 tobytes
        8.1.4 tofile
        8.1.5 startswith / endswith
        8.1.6 ror / rol
    8.2 Special Methods
        8.2.1 __len__
        8.2.2 __str__ / __repr__
        8.2.3 __eq__ / __ne__
        8.2.4 __invert__
        8.2.5 __lshift__ / __rshift__ / __ilshift__ / __irshift__
        8.2.6 __mul__ / __imul__ / __rmul__
        8.2.7 __copy__
        8.2.8 __and__ / __or__ / __xor__ / __iand__ / __ior__ / __ixor__

II Reference

9 Quick Reference
    9.1 Bits
        9.1.1 Methods
        9.1.2 Special methods
        9.1.3 Properties
    9.2 BitArray
        9.2.1 Additional methods
        9.2.2 Additional special methods
        9.2.3 Attributes
    9.3 ConstBitStream
        9.3.1 Additional methods
        9.3.2 Additional attributes
    9.4 BitStream
10 The bitstring module
    10.1 The auto initialiser
    10.2 Compact format strings
    10.3 Class properties
11 The Bits class
12 The BitArray class
13 The ConstBitStream class
14 The BitStream class
15 Functions
16 Exceptions

III Appendices

17 Examples
    17.1 Creation
    17.2 Manipulation
    17.3 Parsing
    17.4 Sieve of Eratosthenes
18 Exponential-Golomb Codes
    18.1 Interleaved exponential-Golomb codes
19 Optimisation Techniques
    19.1 Use combined read and interpretation
    19.2 Choose the simplest class you can
    19.3 Use dedicated functions for bit setting and checking
20 Release Notes
    20.1 Full Version History
        20.1.1 November 21st 2011: version 3.0.0 released
        20.1.2 Backwardly incompatible changes
        20.1.3 New features
        20.1.4 June 18th 2011: version 2.2.0 released
        20.1.5 February 23rd 2011: version 2.1.1 released
        20.1.6 January 23rd 2011: version 2.1.0 released
        20.1.7 New class hierarchy introduced with simpler classes
        20.1.8 July 26th 2010: version 2.0.3 released
        20.1.9 July 25th 2010: version 2.0.2 released
        20.1.10 The backwardly incompatible changes are:
        20.1.11 The new features in this release are:
        20.1.12 March 18th 2010: version 1.3.0 for Python 2.6 and 3.x released
        20.1.13 New features
        20.1.14 January 19th 2010: version 1.2.0 for Python 2.6 and 3.x released
        20.1.15 New ‘Bits’ class
        20.1.16 December 22nd 2009: version 1.1.3 for Python 2.6 and 3.x released
        20.1.17 December 18th 2009: version 1.1.2 for Python 2.6 and 3.x released
        20.1.18 November 24th 2009: version 1.1.0 for Python 2.6 and 3.x released
        20.1.19 New features
        20.1.20 October 10th 2009: version 1.0.1 for Python 3.x released
        20.1.21 October 9th 2009: version 1.0.0 for Python 2.x released
        20.1.22 API Changes
        20.1.23 New features
        20.1.24 September 11th 2009: version 0.5.2 for Python 2.x released
        20.1.25 August 26th 2009: version 0.5.1 for Python 2.x released
        20.1.26 July 19th 2009: version 0.5.0 for Python 2.x released
        20.1.27 June 15th 2009: version 0.4.3 for Python 2.x released
        20.1.28 May 25th 2009: version 0.4.2 for Python 2.x released
        20.1.29 April 23rd 2009: Python 3 only version 0.4.1 released
        20.1.30 April 11th 2009: version 0.4.0 released
        20.1.31 March 11th 2009: version 0.3.2 released
        20.1.32 February 26th 2009: version 0.3.1 released
        20.1.33 February 15th 2009: version 0.3.0 released
        20.1.34 January 13th 2009: version 0.2.0 released
        20.1.35 December 29th 2008: version 0.1.0 released

Part I

User Manual

CHAPTER

ONE

WALKTHROUGH

1.1 A Brief Introduction

The aim of the bitstring module is to make dealing with binary data in Python as easy as

possible. In this section I will take you through some of the features of the module to help you

get started using it.

Only a few of the module’s features will be covered in this walkthrough; the User Manual and

Reference provide a more thorough guide. The whole of this section can be safely skipped or

skimmed over if you prefer to start with the manual. If however you’d like a gentler introduction

then you might like to follow along the examples with a Python interpreter.

1.1.1 Prerequisites

• Python 2.6, 2.7 or 3.x. (If you’re using Python 2.4 or 2.5 then you can use bitstring

version 1.0, but that isn’t covered here.)

• An installed bitstring module.

• A rudimentary knowledge of binary concepts.

• A little free time.

If you haven’t yet downloaded and installed bitstring then please do so (it might

be as easy as typing “sudo easy_install bitstring” or “sudo pip install

bitstring” depending on your system). I’ll be going through some examples using the

interactive Python interpreter, so feel free to start up a Python session and follow along.

1.1.2 Getting started

>>> from bitstring import BitArray, BitStream

First things ﬁrst, we’re going to be typing ‘bitstring’ a lot, so importing directly saves us a lot of

bitstring.BitStream nonsense. The classes we have imported are BitArray which

is just a container for our binary data and BitStream which adds a bit position and reading

bitstring Documentation, Release 3.0.0

methods to treat the data as a stream. There are also immutable versions of both these classes

that we won’t be using here.

We can now create a couple of bitstrings:

>>> a = BitArray('0xff01')
>>> b = BitArray('0b110')

The ﬁrst of these we made from the hexadecimal string 0xff01 - the 0x preﬁx makes it

hexadecimal just as 0b means binary and 0o means octal. Each hex digit represents four bits,

so we have a bitstring of length 16 bits.

The second was created from a binary string. In this case it is just three bits long. Don’t worry

about it not being a whole number of bytes long, that’s all been taken care of internally.

Note: Be sure to remember the quotes around the hex and binary strings. If you forget

them you would just have an ordinary Python integer, which would instead create a bitstring

of that many ‘0’ bits. For example 0xff01 is the same as the base-10 number 65281, so

BitArray(0xff01) would consist of 65281 zero bits!

There are lots of things we can do with our new bitstrings, the simplest of which is just to print

them:

>>> print(a)

0xff01

>>> print(b)

0b110

Now you would be forgiven for thinking that the strings that we used to create the two bitstrings

had just been stored to be given back when printed, but that’s not the case. Every bitstring

should be considered just as a sequence of bits. As we’ll see there are lots of ways to create and

manipulate them, but they have no memory of how they were created. When they are printed

they just pick the simplest hex or binary representation of themselves. If you prefer you can

pick the representation that you want:

>>> a.bin
'1111111100000001'
>>> b.oct
'6'
>>> b.int
-2
>>> a.bytes
'\xff\x01'

There are a few things to note here:

• To get the different interpretations of the binary data we use properties such as bin, hex, oct, int and bytes. You can probably guess what these all mean, but you don’t need to know quite yet. The properties are calculated when you ask for them rather than being stored as part of the object itself.

• The bytes property returns a bytes object. This is slightly different in Python 2 to Python 3 - in Python 3 you would get b'\xff\x01' returned instead.
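To make the different interpretations a little more concrete, here is a rough plain-Python sketch of how one sequence of bits can yield several values. It deliberately does not use bitstring at all; the helper names as_uint and as_int are made up for illustration, and the signed view is assumed to be two's complement.

```python
def as_uint(bits):
    """Interpret a string of '0'/'1' characters as an unsigned integer."""
    return int(bits, 2)


def as_int(bits):
    """Interpret the same characters as a two's complement signed integer."""
    value = int(bits, 2)
    # A leading 1 bit means the value is negative in two's complement.
    if bits[0] == '1':
        value -= 1 << len(bits)
    return value


b = '110'
print(as_uint(b))               # unsigned view of the three bits
print(as_int(b))                # signed view of the same three bits
print(format(as_uint(b), 'o'))  # octal view: one digit per three bits
```

The three bits 110 are 6 when read as unsigned, but the leading 1 makes them negative when read as signed, which is why the signed and unsigned views of the same bits can disagree.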

Great - let’s try some more:

>>> b.hex

bitstring.InterpretError: Cannot convert to hex unambiguously - not multiple of 4 bits.

Oh dear. The problem we have here is that b is 3 bits long, whereas each hex digit represents 4 bits. This means that there is no unambiguous way to represent it in hexadecimal. There are similar restrictions on other interpretations (octal must be a multiple of 3 bits, bytes a multiple of 8 bits etc.)

An exception is raised rather than trying to guess the best hex representation as there are a

multitude of ways to convert to hex. I occasionally get asked why it doesn’t just do the ‘obvious’

conversion, which is invariably what that person expects from their own field of work. This could

be truncating bits at the start or end, or padding at the start or end with either zeros or ones.

Rather than try to guess what is meant we just raise an exception - if you want a particular

behaviour then write it explicitly:

>>> (b + [0]).hex
'c'
>>> ([0] + b).hex
'6'

Here we’ve added a zero bit first to the end and then to the start. Don’t worry too much about how it all works, but just to give you a taster the zero bit [0] could also have been written as BitArray([0]), BitArray('0b0'), BitArray(bin='0'), '0b0' or just 1 (this final method isn’t a typo, it means construct a bitstring of length one, with all the bits initialised to zero - it does look a bit confusing though, which is why I prefer [0] and [1] to represent single bits). Take a look at The auto initialiser for more details.
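The reason the two paddings give different answers can be sketched without bitstring. The following is only an illustration in plain Python: to_hex is a hypothetical helper that mimics the 'refuse if not a multiple of 4 bits' rule, and it raises a plain ValueError rather than the module's own exception.

```python
def to_hex(bits):
    """Convert a '0'/'1' string to hex digits, refusing ambiguous lengths."""
    if len(bits) % 4 != 0:
        raise ValueError('not a multiple of 4 bits')
    if not bits:
        return ''
    width = len(bits) // 4
    # Zero-pad so that leading zero bits still produce hex digits.
    return format(int(bits, 2), '0{}x'.format(width))


b = '110'
pad_at_end = to_hex(b + '0')    # 1100
pad_at_start = to_hex('0' + b)  # 0110
print(pad_at_end, pad_at_start)
```

Padding at the end shifts every bit up by one place, while padding at the start leaves the value alone, so neither choice is obviously the 'right' one.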

1.1.3 Modifying bitstrings

A BitArray can be treated just like a list of bits. You can slice it, delete sections, insert new

bits and more using standard index notation:

>>> print(a[3:9])

0b111110

>>> del a[-6:]

>>> print(a)

0b1111111100

The slicing works just as it does for other containers, so the deletion above removes the ﬁnal

six bits.

If you ask for a single item, rather than a slice, a boolean is returned. Naturally enough 1 bits are True whereas 0 bits are False.

>>> a[0]

True

>>> a[-1]

False

To join together bitstrings you can use a variety of methods, including append, prepend, insert, and plain + or += operations:

>>> a.prepend('0b01')
>>> a.append('0o7')
>>> a += '0x06'

Here we ﬁrst put two bits at the start of a, then three bits on the end (a single octal digit) and

ﬁnally another byte (two hex digits) on the end.

Note how we are just using ordinary strings to specify the new bitstrings we are adding. These

get converted automatically to the right sequence of bits.

Note: The length in bits of bitstrings specified with strings depends on the number of characters, including leading zeros. So each hex character is four bits, each octal character three bits and each binary character one bit.
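As a quick sanity check of that rule, here is a small plain-Python sketch. It is not part of bitstring; literal_length is a hypothetical helper and, unlike the real initialisers, it makes no attempt to handle whitespace or comma-separated tokens.

```python
BITS_PER_CHAR = {'0x': 4, '0o': 3, '0b': 1}


def literal_length(token):
    """Return the length in bits of a prefixed literal such as '0x06'."""
    prefix, digits = token[:2].lower(), token[2:]
    if prefix not in BITS_PER_CHAR:
        raise ValueError('expected a 0x, 0o or 0b prefix')
    # Every character counts, including leading zeros.
    return BITS_PER_CHAR[prefix] * len(digits)


for token in ('0b01', '0o7', '0x06'):
    print(token, literal_length(token))
```

So the prepend, append and += calls above add two, three and eight bits respectively, taking the ten bit bitstring to 23 bits.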

1.1.4 Finding and Replacing

A find method is provided to search for bit patterns within a bitstring. You can choose whether to

search only on byte boundaries or at any bit position:

>>> a = BitArray('0xa9f')
>>> a.find('0x4f')
(3,)

Here we have found the 0x4f byte in our bitstring, though it wasn’t obvious from the hexadecimal as it was at bit position 3. To see this more clearly consider this equality:

>>> a == '0b101, 0x4f, 0b1'

True

in which we’ve broken the bitstring into three parts to show the found byte. This also illustrates

using commas to join bitstring sections.
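If you'd like to convince yourself that the byte really is hiding at bit position 3, the search can be imitated on ordinary strings of '0' and '1' characters. This is just a sketch that says nothing about how the module searches internally; hex_to_bits, find_bits and the byte_aligned_only argument are all invented for the example.

```python
def hex_to_bits(hex_digits):
    """Expand hex digits (without a 0x prefix) into a '0'/'1' string."""
    return ''.join(format(int(d, 16), '04b') for d in hex_digits)


def find_bits(haystack, needle, byte_aligned_only=False):
    """Return the first bit position of needle in haystack, or None."""
    step = 8 if byte_aligned_only else 1
    for pos in range(0, len(haystack) - len(needle) + 1, step):
        if haystack.startswith(needle, pos):
            return pos
    return None


data = hex_to_bits('a9f')
print(data)
print(find_bits(data, hex_to_bits('4f')))
print(find_bits(data, hex_to_bits('4f'), byte_aligned_only=True))
```

Restricting the search to byte boundaries finds nothing here, because the only byte-aligned starting position in a 12 bit string is position 0.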

1.1.5 Constructing a bitstring

Let’s say you have a speciﬁcation for a binary ﬁle type (or maybe a packet speciﬁcation etc.)

and you want to create a bitstring quickly and easily in Python. For this example I’m going

to use a header from the MPEG-2 video standard. Here’s how the header is described in the

standard:

sequence_header()                          No. of bits   Mnemonic
sequence_header_code                       32            bslbf
horizontal_size_value                      12            uimsbf
vertical_size_value                        12            uimsbf
aspect_ratio_information                   4             uimsbf
frame_rate_code                            4             uimsbf
bit_rate_value                             18            uimsbf
marker_bit                                 1             bslbf
vbv_buffer_size_value                      10            uimsbf
constrained_parameters_flag                1             bslbf
load_intra_quantiser_matrix                1             uimsbf
if (load_intra_quantiser_matrix)
    { intra_quantiser_matrix[64] }         8*64          uimsbf
load_non_intra_quantiser_matrix            1             uimsbf
if (load_non_intra_quantiser_matrix)
    { non_intra_quantiser_matrix[64] }     8*64          uimsbf
next_start_code()

The mnemonics mean things like uimsbf = ‘Unsigned integer, most signiﬁcant bit ﬁrst’.

So to create a sequence_header for your particular stream with width of 352 and height of 288

you could start like this:

s = BitArray()
s.append('0x000001b3')  # the sequence_header_code
s.append('uint:12=352')  # 12 bit unsigned integer
s.append('uint:12=288')
...

which is ﬁne, but if you wanted to be a bit more concise you could just write

s = BitArray('0x000001b3, uint:12=352, uint:12=288')

This is better, but it might not be a good idea to have the width and height hard-wired in like

that. We can make it more ﬂexible by using a format string and the pack function:

import bitstring

width, height = 352, 288
s = bitstring.pack('0x000001b3, 2*uint:12', width, height)

where we have also used 2*uint:12 as shorthand for uint:12, uint:12.

The pack function can also take a dictionary as a parameter which can replace the tokens in

the format string. For example:

fmt = '''sequence_header_code,
         uint:12=horizontal_size_value,
         uint:12=vertical_size_value,
         uint:4=aspect_ratio_information,
         ...
         '''
d = {'sequence_header_code': '0x000001b3',
     'horizontal_size_value': 352,
     'vertical_size_value': 288,
     'aspect_ratio_information': 1,
     ...
     }
s = bitstring.pack(fmt, **d)
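It can help to see what packing amounts to at the bit level. The sketch below is plain Python and is not how bitstring implements pack; pack_fields is a hypothetical helper that simply shifts each field in, most significant field first, and only the first four header fields are included.

```python
def pack_fields(fields):
    """Concatenate (value, width_in_bits) pairs, first field most significant.

    Returns a (combined_integer, total_width_in_bits) tuple.
    """
    combined, total = 0, 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError('{} does not fit in {} bits'.format(value, width))
        # Make room for the new field, then drop it into the low bits.
        combined = (combined << width) | value
        total += width
    return combined, total


header = [
    (0x000001b3, 32),  # sequence_header_code
    (352, 12),         # horizontal_size_value
    (288, 12),         # vertical_size_value
    (1, 4),            # aspect_ratio_information
]
value, length = pack_fields(header)
hex_digits = format(value, '0{}x'.format(length // 4))
print(length, hex_digits)
```

Doing this by hand gets tedious quickly, and it gets worse once fields stop lining up with byte or hex-digit boundaries, which is really the point of having format strings.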

1.1.6 Parsing bitstreams

You might have noticed that pack returned a BitStream rather than a BitArray. This

isn’t a problem as the BitStream class just adds a few stream-like qualities to BitArray

which we’ll take a quick look at here.

First, let’s look at the stream we’ve just created:

>>> s

BitStream('0x000001b31601201')

The stream-ness of this object is via its bit position, and various reading and peeking methods.

First let’s try a read or two, and see how this affects the bit position:

>>> s.pos
0
>>> s.read(24)
BitStream('0x000001')
>>> s.pos
24
>>> s.read('hex:8')
'b3'
>>> s.pos
32

First we read 24 bits, which returned a new BitStream object, then we used a format string

to read 8 bits interpreted as a hexadecimal string. We know that the next two sets of 12 bits

were created from integers, so to read them back we can say

>>> s.readlist('2*uint:12')

[352, 288]

If you don’t want to use a bitstream then you can always use unpack. This takes much the

same form as readlist except it just unpacks from the start of the bitstring. For example:

>>> s.unpack('bytes:4, 2*uint:12, uint:4')
['\x00\x00\x01\xb3', 352, 288, 1]
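The idea of a bit position that advances as you read is easy to mimic. What follows is a toy reader in plain Python, not the BitStream class itself; BitReader and read_uint are made-up names, and it only knows how to read unsigned integers.

```python
class BitReader(object):
    """A toy stream over a string of '0'/'1' characters."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0  # current bit position, starts at the beginning

    def read_uint(self, n):
        """Read n bits as an unsigned integer and advance the position."""
        if self.pos + n > len(self.bits):
            raise ValueError('not enough bits left to read')
        chunk = self.bits[self.pos:self.pos + n]
        self.pos += n
        return int(chunk, 2) if chunk else 0


# The same 60 bits as the sequence header built earlier.
header_bits = format(0x000001b31601201, '060b')

reader = BitReader(header_bits)
start_code = reader.read_uint(24)
print(start_code, reader.pos)
stream_id = reader.read_uint(8)
print(hex(stream_id), reader.pos)
print(reader.read_uint(12), reader.read_uint(12))
```

Unpacking, by contrast, would always start from bit 0 and leave no position behind, so it needs no state at all.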

1.2 Worked examples

Below are a few examples of using the bitstring module, as I always ﬁnd that a good example

can help more than a lengthy reference manual.

1.2.1 Hamming distance

The Hamming distance between two bitstrings is the number of bit positions in which the two

bitstrings differ. So for example the distance between 0b00110 and 0b01100 is 2 as the second

and fourth bits are different.

Write a function that calculates the Hamming distance between two bitstrings.

def hamming_distance(a, b):
    return (a ^ b).count(True)

Er, that’s it. The ^ is a bit-wise exclusive or, which means that the bits in a^b are only set if they differ in a and b. The count method just counts the number of 1 (or True) bits.

>>> from bitstring import Bits
>>> a = Bits('0b00110')
>>> hamming_distance(a, '0b01100')
2
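The same XOR-and-count idea works on ordinary Python integers, which makes for an easy cross-check. This is a plain-Python sketch with a made-up function name, and it assumes both values are non-negative and are compared as if padded to the same length.

```python
def hamming_distance_int(x, y):
    """Count the bit positions in which two non-negative integers differ."""
    # XOR leaves a 1 wherever the inputs disagree; then count those 1s.
    return bin(x ^ y).count('1')


print(hamming_distance_int(0b00110, 0b01100))
```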

1.2.2 Sieve of Eratosthenes

The sieve of Eratosthenes is an ancient (and very inefﬁcient) method of ﬁnding prime numbers.

The algorithm starts with the number 2 (which is prime) and marks all of its multiples as not

prime, it then continues with the next unmarked integer (which will also be prime) and marks

all of its multiples as not prime.

So to print all primes under a million you could write:

from bitstring import BitArray

# create a BitArray with a million zero bits.
# The bits will be set to indicate that the bit position isn't prime.
has_factors = BitArray(1000000)
# On Python 3 use range in place of xrange.
for i in xrange(2, 1000000):
    if not has_factors[i]:
        print(i)
        # Set all multiples of our prime to 1.
        has_factors.set(True, xrange(i*2, 1000000, i))

I’ll leave optimising the algorithm as an exercise for the reader, but it illustrates both bit checking

and setting. One reason you might want to use a bitstring for this purpose (instead of a

plain list for example) is that the million bits only take up a million bits in memory, whereas

for a list of integers it would be much more. Try asking for a billion elements in a list - unless

you’ve got some really nice hardware it will fail, whereas a billion element bitstring only takes

125MB.
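To see where the memory saving comes from, here is a rough plain-Python version of the same sieve that keeps one bit per number in a bytearray. It is only a sketch of the idea and makes no claim about bitstring's internals; primes_below and bytes_for_bits are hypothetical names.

```python
def bytes_for_bits(n):
    """Number of whole bytes needed to hold n bits."""
    return (n + 7) // 8


def primes_below(limit):
    """Return all primes below limit, using one bit per candidate."""
    has_factors = bytearray(bytes_for_bits(limit))  # every bit starts at 0

    def is_set(i):
        return (has_factors[i >> 3] >> (7 - (i & 7))) & 1

    def set_bit(i):
        has_factors[i >> 3] |= 1 << (7 - (i & 7))

    primes = []
    for i in range(2, limit):
        if not is_set(i):
            primes.append(i)
            # Mark every multiple of this prime as having a factor.
            for multiple in range(i * 2, limit, i):
                set_bit(multiple)
    return primes


print(primes_below(30))
print(bytes_for_bits(10**9))
```

All the shifting and masking in is_set and set_bit is exactly the sort of fiddly detail that indexing into a bitstring hides from you.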

CHAPTER

TWO

INTRODUCTION

While it is not difﬁcult to manipulate binary data in Python, for example using the struct

and array modules, it can be quite ﬁddly and time consuming even for quite small tasks,

especially if you are not dealing only with whole-byte data.

The bitstring module provides four classes, BitStream, BitArray, ConstBitStream

and Bits, instances of which can be constructed from integers, ﬂoats, hex, octal, binary,

strings or ﬁles, but they all just represent a string of binary digits. I shall use the general

term ‘bitstring’ when referring generically to any of the classes, and use the class names for

parts that apply to only one or another.

BitArray objects can be sliced, joined, reversed, inserted into, overwritten, packed, unpacked

etc. with simple functions or slice notation. BitStream objects can also be read

from, searched in, and navigated in, similar to a ﬁle or stream.

Bitstrings are designed to be as lightweight as possible and can be considered to be just a list

of binary digits. They are however stored efﬁciently - although there are a variety of ways of

creating and viewing the binary data, the bitstring itself just stores the byte data, and all views

are calculated as needed, and are not stored as part of the object.

The different views or interpretations on the data are accessed through properties such as hex,

bin and int, and an extensive set of functions is supplied for modifying, navigating and

analysing the binary data.

A complete reference for the module is given in the Reference section, while the rest of this

manual acts more like a tutorial or guided tour. Below are just a few examples to whet your

appetite; everything here will be covered in greater detail in the rest of this manual.

from bitstring import BitArray

Just some of the ways to create bitstrings:

# from a binary string
a = BitArray('0b001')
# from a hexadecimal string
b = BitArray('0xff470001')
# straight from a file
c = BitArray(filename='somefile.ext')
# from an integer
d = BitArray(int=540, length=11)
# using a format string
d = BitArray('int:11=540')

Easily construct new bitstrings:

# 5 copies of 'a' followed by two new bytes
e = 5*a + '0xcdcd'
# put a single bit on the front
e.prepend('0b1')
# take a slice of the first 7 bits
f = e[:7]
# replace 3 bits with 9 bits from octal string
f[1:4] = '0o775'
# find and replace 2 bit string with 16 bit string
f.replace('0b01', '0xee34')

Interpret the bitstring however you want:

>>> print(e.hex)
9249cdcd
>>> print(e.int)
-1840656947
>>> print(e.uint)
2454310349

2.1 Getting Started

The easiest way to install bitstring is to use easy_install via:

sudo easy_install bitstring

or similar.

If you want an earlier version, or need other ﬁles in the full package, you can download and

extract the contents of the .zip provided on the project’s website.

First download the latest release (see the Downloads tab on the project’s homepage). Note that this manual covers only the version for Python 2.6 and later. Version 1.0, for Python 2.4 / 2.5, can also be found on the project’s homepage.

If you then extract the contents of the zip file you should find files organised in these directories:

• bitstring/ : The bitstring module files.

• test/ : Unit tests for the module, plus some example files for testing purposes.

• doc/ : This manual as a PDF and as HTML.

If you downloaded the source and want to install, run:

python setup.py install

You might need to add a ‘sudo’ to the start of that command, depending on your system. This

will copy the source ﬁles to your Python installation’s site-packages directory.

The module comes with comprehensive unit tests. To run them yourself use your favourite unit

test running method, mine is:

nosetests -w test

which should run all the tests (over 400) and say OK. If tests fail then either your version of

Python isn’t supported (you need Python 2.6, 2.7 or 3.x, though earlier versions supported 2.4

and 2.5) or something unexpected has happened - in which case please tell me about it.

CHAPTER

THREE

CREATION

You can create bitstrings in a variety of ways. Internally they are stored as byte arrays, which

means that no space is wasted, and a bitstring containing 10MB of binary data will only take

up 10MB of memory.

3.1 The bitstring classes

Four classes are provided by the bitstring module: BitStream and BitArray together with

their immutable versions ConstBitStream and Bits:

• Bits (object): This is the most basic class. It is immutable and so its contents can’t be changed after creation.

• BitArray (Bits): This adds mutating methods to its base class.

• ConstBitStream (Bits): This adds methods and properties to allow the bits to be treated as a stream of bits, with a bit position and reading/parsing methods.

• BitStream (BitArray, ConstBitStream): This is the most versatile class, having both the bitstream methods and the mutating methods.

Before version 3.0 Bits was known as ConstBitArray. The old name is still available for

backward compatibility.

The term ‘bitstring’ is used in this manual to refer generically to any of these classes.

Most of the examples in this manual use the BitArray class, with BitStream used when

necessary. For most uses the non-const classes are more versatile and so probably your best

choice when starting to use the module.

To summarise when to use each class:

• If you need to change the contents of the bitstring then you must use BitArray or

BitStream. Truncating, replacing, inserting, appending etc. are not available for the

const classes.

• If you need to use a bitstring as the key in a dictionary or as a member of a set then you

must use Bits or a ConstBitStream. As BitArray and BitStream objects are

mutable they do not support hashing and so cannot be used in these ways.

• If you are creating directly from a ﬁle then a BitArray or BitStream will read the

ﬁle into memory whereas a Bits or ConstBitStream will not, so using the const

classes allows extremely large ﬁles to be examined.

• If you don’t need the extra functionality of a particular class then the simpler ones might

be faster and more memory efﬁcient. The fastest and most memory efﬁcient class is

Bits.

The Bits class is the base class of the other three classes. This means that isinstance(s, Bits) will be true if s is an instance of any of the four classes.
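The inheritance relationships described above can be pictured with four empty stand-in classes. These are not the real bitstring classes, just hollow shells with the same names and the same base classes as listed in this section, which is enough to show why the isinstance check works.

```python
class Bits(object):
    """Stand-in for the immutable base class."""


class BitArray(Bits):
    """Stand-in for the class that adds mutating methods."""


class ConstBitStream(Bits):
    """Stand-in for the class that adds a bit position and reading."""


class BitStream(BitArray, ConstBitStream):
    """Stand-in for the class that combines both sets of features."""


s = BitStream()
print(isinstance(s, Bits))
# The order in which Python looks up methods for the combined class.
print([cls.__name__ for cls in BitStream.__mro__])
```

Note that the check only works in one direction: every stand-in BitStream is a Bits, but a plain Bits is not a BitArray.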

3.2 Using the constructor

When initialising a bitstring you need to specify at most one initialiser. These will be explained

in full below, but brieﬂy they are:

• auto : Either a specially formatted string, a list or tuple, a file object, integer, bytearray, bytes or another bitstring.

• bytes : A bytes object (a str in Python 2.6), for example read from a binary file.

• hex, oct, bin : Hexadecimal, octal or binary strings.

• int, uint : Signed or unsigned bit-wise big-endian binary integers.

• intle, uintle : Signed or unsigned byte-wise little-endian binary integers.

• intbe, uintbe : Signed or unsigned byte-wise big-endian binary integers.

• intne, uintne : Signed or unsigned byte-wise native-endian binary integers.

• float / floatbe, floatle, floatne : Big, little and native endian floating point numbers.

• se, ue : Signed or unsigned exponential-Golomb coded integers.

• sie, uie : Signed or unsigned interleaved exponential-Golomb coded integers.

• bool : A boolean (i.e. True or False).

• filename : Directly from a file, without reading into memory.

3.2.1 From a hexadecimal string

>>> c = BitArray(hex='0x000001b3')
>>> c.hex
'000001b3'

The initial 0x or 0X is optional. Whitespace is also allowed and is ignored. Note that the

leading zeros are significant, so the length of c will be 32.

If you include the initial 0x then you can use the auto initialiser instead. As it is the ﬁrst

parameter in __init__ this will work equally well:

c = BitArray('0x000001b3')

3.2.2 From a binary string

>>> d = BitArray(bin='0011 00')
>>> d.bin
'001100'

An initial 0b or 0B is optional and whitespace will be ignored.

As with hex, the auto initialiser will work if the binary string is prefixed by 0b:

>>> d = BitArray('0b001100')

3.2.3 From an octal string

>>> o = BitArray(oct='34100')
>>> o.oct
'34100'

An initial 0o or 0O is optional, but 0o (a zero and lower-case ‘o’) is preferred as it is slightly more readable.

As with hex and bin, the auto initialiser will work if the octal string is prefixed by 0o:

>>> o = BitArray('0o34100')

3.2.4 From an integer

>>> e = BitArray(uint=45, length=12)
>>> f = BitArray(int=-1, length=7)
>>> e.bin
'000000101101'
>>> f.bin
'1111111'

For initialisation with signed and unsigned binary integers (int and uint respectively) the

length parameter is mandatory, and must be large enough to contain the integer. So for

example if length is 8 then uint can be in the range 0 to 255, while int can range from

-128 to 127. Two’s complement is used to represent negative numbers.
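The ranges quoted above follow directly from the length, and the two's complement pattern for a negative number is just the value taken modulo 2 to the power of the length. Here is a plain-Python sketch of both; int_range and to_bits are hypothetical helpers rather than anything bitstring provides, and they raise a plain ValueError for out-of-range values.

```python
def int_range(length, signed):
    """Return the (lowest, highest) integer that fits in length bits."""
    if signed:
        return -(1 << (length - 1)), (1 << (length - 1)) - 1
    return 0, (1 << length) - 1


def to_bits(value, length, signed=False):
    """Return value as a '0'/'1' string of exactly length characters."""
    low, high = int_range(length, signed)
    if not low <= value <= high:
        raise ValueError('{} does not fit in {} bits'.format(value, length))
    # The modulo maps negative values onto their two's complement pattern.
    return format(value % (1 << length), '0{}b'.format(length))


print(int_range(8, signed=False), int_range(8, signed=True))
print(to_bits(45, 12))
print(to_bits(-1, 7, signed=True))
```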

The auto initialiser can be used by giving a colon and the length in bits immediately after the int or uint token, followed by an equals sign then the value:

>>> e = BitArray('uint:12=45')
>>> f = BitArray('int:7=-1')

The plain int and uint initialisers are bit-wise big-endian. That is to say that the most

signiﬁcant bit comes ﬁrst and the least signiﬁcant bit comes last, so the unsigned number one

will have a 1 as its final bit with all other bits set to 0. These can be any number of bits long.

For whole-byte bitstring objects there are more options available with different endiannesses.

3.2.5 Big and little-endian integers

>>> big_endian = BitArray(uintbe=1, length=16)
>>> little_endian = BitArray(uintle=1, length=16)
>>> native_endian = BitArray(uintne=1, length=16)

There are unsigned and signed versions of three additional ‘endian’ types. The unsigned ver-

sions are used above to create three bitstrings.

The first of these, big_endian, is equivalent to just using the plain bit-wise big-endian uint initialiser, except that all intbe or uintbe interpretations must be of whole-byte bitstrings, otherwise a ValueError is raised.

The second, little_endian, is interpreted as least significant byte first, i.e. it is a byte reversal of big_endian. So we have:

>>> big_endian.hex
'0001'
>>> little_endian.hex
'0100'

Finally we have native_endian, which will equal either big_endian or little_endian, depending on whether you are running on a big or little-endian machine (if you really need to check then use import sys; sys.byteorder).
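The same byte orderings can be seen with the standard library alone. The following is a sketch for Python 3 (it relies on int.to_bytes and bytes.hex) and is not part of bitstring; it simply mirrors the 16-bit examples above.

```python
import sys

value = 1
big = value.to_bytes(2, byteorder="big")            # most significant byte first
little = value.to_bytes(2, byteorder="little")      # least significant byte first
native = value.to_bytes(2, byteorder=sys.byteorder)

print(big.hex())             # 0001
print(little.hex())          # 0100
print(little == big[::-1])   # True: little-endian is a byte reversal
print(native == (big if sys.byteorder == "big" else little))  # True
```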

3.2.6 From a ﬂoating point number

>>> f1 = BitArray(float=10.3, length=32)
>>> f2 = BitArray('float:64=5.4e31')

Floating point numbers can be used for initialisation provided that the bitstring is 32 or 64 bits long. Standard Python floating point numbers are 64 bits long, so if you use 32 bits then some accuracy could be lost.

Note that the exact bits used to represent the floating point number could be platform dependent. Most PCs will conform to the IEEE 754 standard, and presently other floating point representations are not supported (although they should work on a single platform - it just might get confusing if you try to interpret a generated bitstring on another platform).

Similar to the situation with integers there are big and little endian versions. The plain float

is big endian and so floatbe is just an alias.
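The loss of accuracy at 32 bits, and the relationship between the big and little endian forms, can be illustrated with the standard struct module. This is a sketch that assumes an IEEE 754 platform and does not go through bitstring.

```python
import struct

x = 10.3
as32 = struct.pack(">f", x)   # 4 bytes, big-endian single precision
as64 = struct.pack(">d", x)   # 8 bytes, big-endian double precision

back32 = struct.unpack(">f", as32)[0]
back64 = struct.unpack(">d", as64)[0]

print(len(as32) * 8, len(as64) * 8)         # 32 64
print(back64 == x)                          # True: 64 bits round-trips exactly
print(back32 == x)                          # False: 32 bits loses precision
print(struct.pack("<d", x) == as64[::-1])   # True: little-endian is byte-reversed
```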

As with other initialisers you can also auto initialise, as demonstrated with the second example

below:


>>> little_endian = BitArray(floatle=0.0, length=64)
>>> native_endian = BitArray('floatne:32=-6.3')

3.2.7 Exponential-Golomb codes

Initialisation with integers represented by exponential-Golomb codes is also possible. ue is

an unsigned code while se is a signed code. Interleaved exponential-Golomb codes are also

supported via uie and sie:

>>> g = BitArray(ue=12)
>>> h = BitArray(se=-402)
>>> g.bin
'0001101'
>>> h.bin
'0000000001100100101'

For these initialisers the length of the bitstring is fixed by the value it is initialised with, so the length parameter must not be supplied and it is an error to do so. If you don’t know what exponential-Golomb codes are then you are in good company, but they are quite interesting, so I’ve included a section on them (see Exponential-Golomb Codes).

The auto initialiser may also be used by giving an equals sign and the value immediately after a ue or se token:

>>> g = BitArray('ue=12')
>>> h = BitArray('se=-402')

You may wonder why you would bother with auto in this case as the syntax is slightly longer.

Hopefully all will become clear in the next section.
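To see why the length is fixed by the value, here is a small pure-Python sketch of an exponential-Golomb encoder. It assumes the usual mapping for the signed code (positive n maps to 2n - 1, zero and negative n map to -2n), which reproduces both bin outputs shown above; encode_ue and encode_se are hypothetical helpers, not bitstring functions.

```python
def encode_ue(n):
    """Unsigned exponential-Golomb code for n >= 0, as a binary string."""
    if n < 0:
        raise ValueError("ue codes cannot represent negative numbers")
    code = format(n + 1, "b")             # binary of n + 1, no leading zeros
    return "0" * (len(code) - 1) + code   # prefix with len(code) - 1 zeros


def encode_se(n):
    """Signed code: map n onto an unsigned value, then encode that."""
    return encode_ue(2 * n - 1 if n > 0 else -2 * n)


print(encode_ue(12))     # 0001101
print(encode_se(-402))   # 0000000001100100101
```

Larger magnitudes need more leading zeros, so the code length always follows from the value being encoded.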

3.2.8 From raw byte data

Using the length and offset parameters to specify the length in bits and an offset at the start to be ignored is particularly useful when initialising from raw data or from a file.

a = BitArray(bytes=b'\x00\x01\x02\xff', length=28, offset=1)
b = BitArray(bytes=open("somefile", 'rb').read())

The length parameter is optional; it defaults to the length of the data in bits (and so will be a

multiple of 8). You can use it to truncate some bits from the end of the bitstring. The offset

parameter is also optional and is used to truncate bits at the start of the data.
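The effect of length and offset on raw bytes can be pictured as a slice of the data's bits. The helper below is a hypothetical stand-in written in plain Python; it assumes that an omitted length means "everything after the offset".

```python
def bits_from_bytes(data, length=None, offset=0):
    """Return the selected bits of `data` as a binary string."""
    all_bits = "".join(format(byte, "08b") for byte in bytearray(data))
    if length is None:
        length = len(all_bits) - offset   # default: all bits after the offset
    if offset < 0 or length < 0 or offset + length > len(all_bits):
        raise ValueError("offset and length do not fit the data")
    return all_bits[offset:offset + length]


raw = b"\x00\x01\x02\xff"
print(len(bits_from_bytes(raw)))               # 32
selected = bits_from_bytes(raw, length=28, offset=1)
print(len(selected))                           # 28
print(selected)   # 0000000000000010000001011111
```

One bit is dropped from the start and three from the end, leaving 28 of the original 32 bits.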

You can also use a bytearray object, either explicitly with a bytes=some_bytearray

keyword or via the auto initialiser:

c=BitArray(a_bytearray_object)

If you are using Python 3.x you can use this trick with bytes objects too. This should be

used with caution as in Python 2 it will instead be interpreted as a string (it’s not possible


to distinguish between str and bytes in Python 2) and so your code won’t work the same

between Python versions.

d = BitArray(b'\x23g$5')  # Use with caution! Only works correctly in Python 3.

3.2.9 From a ﬁle

Using the filename initialiser allows a file to be analysed without the need to read it all into memory. The way to create a file-based bitstring is:

p = Bits(filename="my2GBfile")

This will open the file in binary read-only mode. The file will only be read as and when other operations require it, and the contents of the file will not be changed by any operations. If only a portion of the file is needed then the offset and length parameters (specified in bits) can be used.

Note that we created a Bits here rather than a BitArray, as they have quite different behaviour in this case. The immutable Bits will never read the file into memory (except as needed by other operations), whereas if we had created a BitArray then the whole of the file would immediately have been read into memory. This is because in creating a BitArray you are implicitly saying that you want to modify it, and so it needs to be in memory.

It’s also possible to use the auto initialiser for file objects. It’s as simple as:

f = open('my2GBfile', 'rb')
p = Bits(f)

3.3 The auto initialiser

The auto parameter is the first parameter in the __init__ function and so the auto= can be omitted when using it. It accepts either a string, an iterable, another bitstring, an integer, a bytearray or a file object.

Strings starting with 0x or hex: are interpreted as hexadecimal, 0o or oct: implies octal, and strings starting with 0b or bin: are interpreted as binary. You can also initialise with the various integer initialisers as described above. If given another bitstring it will create a copy of it, (non string) iterables are interpreted as boolean arrays and file objects act as a source of binary data. Finally you can use an integer to create a zeroed bitstring of that number of bits.

>>> fromhex = BitArray('0x01ffc9')
>>> frombin = BitArray('0b01')
>>> fromoct = BitArray('0o7550')
>>> fromint = BitArray('int:32=10')
>>> fromfloat = BitArray('float:64=0.2')
>>> acopy = BitArray(fromoct)
>>> fromlist = BitArray([1, 0, 0])
>>> f = open('somefile', 'rb')
>>> fromfile = BitArray(f)


>>> zeroed = BitArray(1000)
>>> frombytes = BitArray(bytearray(b'xyz'))

It can also be used to convert between the BitArray and Bits classes:

>>> immutable = Bits('0xabc')
>>> mutable = BitArray(immutable)
>>> mutable += '0xdef'
>>> immutable = Bits(mutable)

As always the bitstring doesn’t know how it was created; initialising with octal or hex might

be more convenient or natural for a particular example but it is exactly equivalent to initialising

with the corresponding binary string.

>>> fromoct.oct
'7550'
>>> fromoct.hex
'f68'
>>> fromoct.bin
'111101101000'
>>> fromoct.uint
3944
>>> fromoct.int
-152
>>> BitArray('0o7777') == '0xfff'
True
>>> BitArray('0xf') == '0b1111'
True
>>> frombin[::-1] + '0b0' == fromlist
True

Note how in the final examples above only one half of the == needs to be a bitstring, the other half gets auto initialised before the comparison is made. This is in common with many other functions and operators.

You can also chain together string initialisers with commas, which causes the individual bitstrings to be concatenated.

>>> s = BitArray('0x12, 0b1, uint:5=2, ue=5, se=-1, se=4')
>>> s.find('uint:5=2, ue=5')
True
>>> s.insert('0o332, 0b11, int:23=300', 4)

Again, note how the format used in the auto initialiser can be used in many other places where

a bitstring is needed.


CHAPTER

FOUR

PACKING

Another method of creating BitStream objects is to use the pack function. This takes a format specifier which is a string with comma separated tokens, and a number of items to pack according to it. Its signature is bitstring.pack(format, *values, **kwargs).

For example using just the *values arguments we can say:

s = bitstring.pack('hex:32, uint:12, uint:12', '0x000001b3', 352, 288)

which is equivalent to initialising as:

s = BitStream('0x000001b3, uint:12=352, uint:12=288')

The advantage of the pack function is if you want to write more general code for creation.

def foo(a, b, c, d):
    return bitstring.pack('uint:8, 0b110, int:6, bin, bits', a, b, c, d)

s1 = foo(12, 5, '0b00000', '')
s2 = foo(101, 3, '0b11011', s1)

Note how you can use some tokens without sizes (such as bin and bits in the above example), and use values of any length to fill them. If the size had been specified then a ValueError would be raised if the parameter given was the wrong length. Note also how bitstring literals can be used (the 0b110 in the bitstring returned by foo) and these don’t consume any of the items in *values.

You can also include keyword, value pairs (or an equivalent dictionary) as the final parameter(s). The values are then packed according to the positions of the keywords in the format string. This is most easily explained with some examples. Firstly the format string needs to contain parameter names:

format = 'hex:32=start_code, uint:12=width, uint:12=height'

Then we can make a dictionary with these parameters as keys and pass it to pack:

d = {'start_code': '0x000001b3', 'width': 352, 'height': 288}
s = bitstring.pack(format, **d)

Another method is to pass the same information as keywords at the end of pack’s parameter list:


s = bitstring.pack(format, width=352, height=288, start_code='0x000001b3')

The tokens in the format string that you must provide values for are:

int:n       n bits as a signed integer.
uint:n      n bits as an unsigned integer.
intbe:n     n bits as a big-endian whole byte signed integer.
uintbe:n    n bits as a big-endian whole byte unsigned integer.
intle:n     n bits as a little-endian whole byte signed integer.
uintle:n    n bits as a little-endian whole byte unsigned integer.
intne:n     n bits as a native-endian whole byte signed integer.
uintne:n    n bits as a native-endian whole byte unsigned integer.
float:n     n bits as a big-endian floating point number (same as floatbe).
floatbe:n   n bits as a big-endian floating point number (same as float).
floatle:n   n bits as a little-endian floating point number.
floatne:n   n bits as a native-endian floating point number.
hex[:n]     [n bits as] a hexadecimal string.
oct[:n]     [n bits as] an octal string.
bin[:n]     [n bits as] a binary string.
bits[:n]    [n bits as] a new bitstring.
bool        single bit as a boolean (True or False).
ue          an unsigned integer as an exponential-Golomb code.
se          a signed integer as an exponential-Golomb code.
uie         an unsigned integer as an interleaved exponential-Golomb code.
sie         a signed integer as an interleaved exponential-Golomb code.

and you can also include constant bitstring tokens constructed from any of the following:

0b...         binary literal.
0o...         octal literal.
0x...         hexadecimal literal.
int:n=m       signed integer m in n bits.
uint:n=m      unsigned integer m in n bits.
intbe:n=m     big-endian whole byte signed integer m in n bits.
uintbe:n=m    big-endian whole byte unsigned integer m in n bits.
intle:n=m     little-endian whole byte signed integer m in n bits.
uintle:n=m    little-endian whole byte unsigned integer m in n bits.
intne:n=m     native-endian whole byte signed integer m in n bits.
uintne:n=m    native-endian whole byte unsigned integer m in n bits.
float:n=f     big-endian floating point number f in n bits.
floatbe:n=f   big-endian floating point number f in n bits.
floatle:n=f   little-endian floating point number f in n bits.
floatne:n=f   native-endian floating point number f in n bits.
ue=m          exponential-Golomb code for unsigned integer m.
se=m          exponential-Golomb code for signed integer m.
uie=m         interleaved exponential-Golomb code for unsigned integer m.
sie=m         interleaved exponential-Golomb code for signed integer m.
bool=b        a single bit, either True or False.


You can also use a keyword for the length specifier in the token, for example:

s = bitstring.pack('int:n=-1', n=100)

And finally it is also possible just to use a keyword as a token:

s = bitstring.pack('hello, world', world='0x123', hello='0b110')

As you would expect, there is also an unpack function that takes a bitstring and unpacks it according to a very similar format string. This is covered later in more detail, but a quick example is:

>>> s = bitstring.pack('ue, oct:3, hex:8, uint:14', 3, '0o7', '0xff', 90)
>>> s.unpack('ue, oct:3, hex:8, uint:14')
[3, '7', 'ff', 90]

4.1 Compact format strings

Another option when using pack, as well as other methods such as read and byteswap, is to use a format specifier similar to those used in the struct and array modules. These consist of a character to give the endianness, followed by more single characters to give the format.

The endianness character must start the format string and unlike in the struct module it is not optional (except when used with byteswap):

>   Big-endian
<   Little-endian
@   Native-endian

For ‘network’ endianness use > as network and big-endian are equivalent. This is followed by at least one of these format characters:

b   8 bit signed integer
B   8 bit unsigned integer
h   16 bit signed integer
H   16 bit unsigned integer
l   32 bit signed integer
L   32 bit unsigned integer
q   64 bit signed integer
Q   64 bit unsigned integer
f   32 bit floating point number
d   64 bit floating point number

The exact type is determined by combining the endianness character with the format character,

but rather than give an exhaustive list a single example should explain:

>h   Big-endian 16 bit signed integer      intbe:16
<h   Little-endian 16 bit signed integer   intle:16
@h   Native-endian 16 bit signed integer   intne:16


As you can see all three are signed integers in 16 bits, the only difference is the endianness. The native-endian @h will equal the big-endian >h on big-endian systems, and equal the little-endian <h on little-endian systems. For the single byte codes b and B the endianness doesn’t make any difference, but you still need to specify one so that the format string can be parsed correctly.

An example:

s = bitstring.pack('>qqqq', 10, 11, 12, 13)

is equivalent to

s = bitstring.pack('intbe:64, intbe:64, intbe:64, intbe:64', 10, 11, 12, 13)

Just as in the struct module you can also give a multiplicative factor before the format character, so the previous example could be written even more concisely as

s = bitstring.pack('>4q', 10, 11, 12, 13)

You can of course combine these format strings with other initialisers, even mixing endiannesses (although I’m not sure why you’d want to):

s = bitstring.pack('>6h3b, 0b1, <9L', *range(18))

This rather contrived example takes the numbers 0 to 17 and packs the first 6 as signed big-endian 2-byte integers, the next 3 as single bytes, then inserts a single 1 bit, before packing the remaining 9 as little-endian 4-byte unsigned integers.
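The correspondence between compact format strings and the longer tokens can be sketched as a small translation function. This is a hypothetical helper written for illustration only, not the parser bitstring uses; it handles a single compact token such as '>4q' rather than a full comma separated format string.

```python
import re

# Endianness prefix -> suffix used by the long token names.
_ENDIAN = {">": "be", "<": "le", "@": "ne"}
# Format character -> (token family, size in bits), as listed in the table above.
_CODES = {
    "b": ("int", 8), "B": ("uint", 8),
    "h": ("int", 16), "H": ("uint", 16),
    "l": ("int", 32), "L": ("uint", 32),
    "q": ("int", 64), "Q": ("uint", 64),
    "f": ("float", 32), "d": ("float", 64),
}


def expand_compact(fmt):
    """Expand e.g. '>4q' into 'intbe:64, intbe:64, intbe:64, intbe:64'."""
    if not fmt or fmt[0] not in _ENDIAN:
        raise ValueError("format must start with >, < or @")
    body = fmt[1:]
    if not re.fullmatch(r"(?:\d*[bBhHlLqQfd])+", body):
        raise ValueError("unrecognised compact format: %r" % fmt)
    tokens = []
    for count, code in re.findall(r"(\d*)([bBhHlLqQfd])", body):
        family, bits = _CODES[code]
        tokens.extend(["%s%s:%d" % (family, _ENDIAN[fmt[0]], bits)] * int(count or 1))
    return ", ".join(tokens)


print(expand_compact(">4q"))   # intbe:64, intbe:64, intbe:64, intbe:64
print(expand_compact("<h"))    # intle:16
print(len(expand_compact(">6h3b").split(", ")))   # 9
```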


CHAPTER

FIVE

INTERPRETING BITSTRINGS

Bitstrings don’t know or care how they were created; they are just collections of bits. This

means that you are quite free to interpret them in any way that makes sense.

Several Python properties are used to create interpretations for the bitstring. These properties

call private functions which will calculate and return the appropriate interpretation. These don’t

change the bitstring in any way and it remains just a collection of bits. If you use the property

again then the calculation will be repeated.

Note that these properties can potentially be very expensive in terms of both computation and memory requirements. For example if you have initialised a bitstring from a 10 GB file object and ask for its binary string representation then that string will be around 80 GB in size!

For the properties described below we will use these:

>>> a = BitArray('0x123')
>>> b = BitArray('0b111')

5.1 bin

The most fundamental interpretation is perhaps as a binary string (a ‘bitstring’). The bin

property returns a string of the binary representation of the bitstring. All bitstrings can use this

property and it is used to test equality between bitstrings.

>>> a.bin
'000100100011'
>>> b.bin
'111'

Note that the initial zeros are significant; for bitstrings the zeros are just as important as the ones!

5.2 hex

For whole-byte bitstrings the most natural interpretation is often as hexadecimal, with each

byte represented by two hex digits.


If the bitstring does not have a length that is a multiple of four bits then an InterpretError

exception will be raised. This is done in preference to truncating or padding the value, which

could hide errors in user code.

>>> a.hex
'123'
>>> b.hex
ValueError: Cannot convert to hex unambiguously - not multiple of 4 bits.

5.3 oct

For an octal interpretation use the oct property.

If the bitstring does not have a length that is a multiple of three then an InterpretError

exception will be raised.

>>> a.oct
'0443'
>>> b.oct
'7'
>>> (b + '0b0').oct
ValueError: Cannot convert to octal unambiguously - not multiple of 3 bits.
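The length rule for hex and oct follows from each digit standing for a fixed number of bits. The sketch below works on plain binary strings rather than bitstring objects, and to_base is a hypothetical name; it raises a ValueError instead of truncating or padding.

```python
def to_base(bits, bits_per_digit):
    """Interpret a binary string as hex (4 bits/digit) or octal (3 bits/digit)."""
    if len(bits) % bits_per_digit:
        raise ValueError("length %d is not a multiple of %d bits"
                         % (len(bits), bits_per_digit))
    digits = "0123456789abcdef"
    return "".join(digits[int(bits[i:i + bits_per_digit], 2)]
                   for i in range(0, len(bits), bits_per_digit))


a = "000100100011"   # the bits of 0x123
b = "111"

print(to_base(a, 4))   # 123
print(to_base(a, 3))   # 0443
print(to_base(b, 3))   # 7
try:
    to_base(b, 4)
except ValueError as error:
    print("error:", error)
```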

5.4 uint / uintbe / uintle / uintne

To interpret the bitstring as a binary (base-2) bit-wise big-endian unsigned integer (i.e. a non-

negative integer) use the uint property.

>>> a.uint
291
>>> b.uint
7

For byte-wise big-endian, little-endian and native-endian interpretations use uintbe,

uintle and uintne respectively. These will raise a ValueError if the bitstring is not

a whole number of bytes long.

>>> s = BitArray('0x000001')
>>> s.uint     # bit-wise big-endian
1
>>> s.uintbe   # byte-wise big-endian
1
>>> s.uintle   # byte-wise little-endian
65536
>>> s.uintne   # byte-wise native-endian (will be 1 on a big-endian platform!)
65536


5.5 int / intbe / intle / intne

For a two’s complement interpretation as a base-2 signed integer use the int property. If the first bit of the bitstring is zero then the int and uint interpretations will be equal, otherwise the int will represent a negative number.

>>> a.int
291
>>> b.int
-1

For byte-wise big, little and native endian signed integer interpretations use intbe, intle and intne respectively. These work in the same manner as their unsigned counterparts described above.
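The relationship between the unsigned, signed and byte-wise interpretations can be reproduced with built-in integer functions. This is a sketch on plain strings and bytes for Python 3, with as_uint and as_int as hypothetical helper names.

```python
def as_uint(bits):
    """Bit-wise big-endian unsigned interpretation of a binary string."""
    return int(bits, 2)


def as_int(bits):
    """Two's complement interpretation: a leading 1 bit means negative."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


a = "000100100011"   # the bits of 0x123
b = "111"
print(as_uint(a), as_int(a))   # 291 291
print(as_uint(b), as_int(b))   # 7 -1

raw = bytes.fromhex("000001")
print(int.from_bytes(raw, "big"))      # 1
print(int.from_bytes(raw, "little"))   # 65536
```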

5.6 float / floatbe / floatle / floatne

For a floating point interpretation use the float property. This uses your machine’s underlying floating point representation and will only work if the bitstring is 32 or 64 bits long.

Different endiannesses are provided via floatle and floatne. Note that as floating point interpretations are only valid on whole-byte bitstrings there is no difference between the bit-wise big-endian float and the byte-wise big-endian floatbe.

Note also that standard floating point numbers in Python are stored in 64 bits, so use this size if you wish to avoid rounding errors.

5.7 bytes

A common need is to retrieve the raw bytes from a bitstring for further processing or for writing to a file. For this use the bytes interpretation, which returns a bytes object (which is equivalent to an ordinary str in Python 2.6/2.7).

If the length of the bitstring isn’t a multiple of eight then a ValueError will be raised. This is because there isn’t an unequivocal representation as bytes. You may prefer to use the method tobytes as this will pad with between one and seven zero bits up to a byte boundary if necessary.

>>> open('somefile', 'wb').write(a.tobytes())
>>> open('anotherfile', 'wb').write(('0x0' + a).bytes)
>>> a1 = BitArray(filename='somefile')
>>> a1.hex
'1230'
>>> a2 = BitArray(filename='anotherfile')
>>> a2.hex
'0123'


Note that the tobytes method automatically padded with four zero bits at the end, whereas

for the other example we explicitly padded at the start to byte align before using the bytes

property.
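The two padding choices can be compared directly on a binary string. The following is a plain-Python sketch for Python 3; to_bytes_padded is a hypothetical helper that pads at the end with zero bits, as the text describes for tobytes.

```python
def to_bytes_padded(bits):
    """Pad a binary string with trailing zero bits to a byte boundary."""
    padding = -len(bits) % 8   # 0 to 7 bits needed to reach a multiple of 8
    padded = bits + "0" * padding
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


a = "000100100011"   # the 12 bits of 0x123

end_padded = to_bytes_padded(a)            # zeros appended at the end
start_padded = to_bytes_padded("0000" + a)  # explicit padding at the start

print(end_padded.hex())     # 1230
print(start_padded.hex())   # 0123
```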

5.8 ue

The ue property interprets the bitstring as a single unsigned exponential-Golomb code and

returns an integer. If the bitstring is not exactly one code then an InterpretError is raised

instead. If you instead wish to read the next bits in the stream and interpret them as a code

use the read function with a ue format string. See Exponential-Golomb Codes for a short

explanation of this type of integer representation.

>>> s = BitStream(ue=12)
>>> s.bin
'0001101'
>>> s.append(BitArray(ue=3))
>>> print(s.readlist('2*ue'))
[12, 3]

5.9 se

The se property does much the same as ue and the provisos there all apply. The obvious

difference is that it interprets the bitstring as a signed exponential-Golomb rather than unsigned

- see Exponential-Golomb Codes for more information.

>>> s = BitStream('0x164b')
>>> s.se
InterpretError: BitArray, is not a single exponential-Golomb code.
>>> while s.pos < s.length:
...     print(s.read('se'))
-5
2
0
-1
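Reading consecutive codes works because each code announces its own length through its leading zeros. The decoder below is a pure-Python sketch using the usual signed mapping (odd unsigned values are positive, even ones are zero or negative); read_ue and read_se are hypothetical names. Applied to the 16 bits of 0x164b it finds four codes that exactly fill the data.

```python
def read_ue(bits, pos):
    """Decode one unsigned code starting at `pos`; return (value, new_pos)."""
    zeros = 0
    while pos + zeros < len(bits) and bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1   # `zeros` zeros, then a 1, then `zeros` more bits
    if end > len(bits):
        raise ValueError("ran off the end while reading a code")
    return int(bits[pos + zeros:end], 2) - 1, end


def read_se(bits, pos):
    """Decode one signed code starting at `pos`; return (value, new_pos)."""
    k, pos = read_ue(bits, pos)
    return ((k + 1) // 2 if k % 2 else -(k // 2)), pos


bits = format(0x164b, "016b")
pos, values = 0, []
while pos < len(bits):
    value, pos = read_se(bits, pos)
    values.append(value)

print(values)   # [-5, 2, 0, -1]
```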

5.10 uie / sie

A slightly different type, interleaved exponential-Golomb codes are also supported. The principles are the same as with ue and se - see Exponential-Golomb Codes for detail of the differences.


CHAPTER

SIX

SLICING, DICING AND SPLICING

Manipulating binary data can be a bit of a challenge in Python. One of its strengths is that you don’t have to worry about the low level data, but this can make life difficult when what you care about is precisely the thing that is safely hidden by high level abstractions.

In this section some more methods are described that treat data as a series of bits, rather than bytes.

6.1 Slicing

Slicing takes three arguments: the first position you want, one past the last position you want and a multiplicative factor which defaults to 1.

The third argument (the ‘step’) will be described shortly, but most of the time you’ll probably just need the bit-wise slice, where for example a[10:12] will return a 2-bit bitstring of the 10th and 11th bits in a, and a[32] will return just the 32nd bit.

>>> a = BitArray('0b00011110')
>>> b = a[3:7]
>>> print(a, b)
0x1e 0xf

For single bit indices (as opposed to slices) a boolean is returned; that is True for ‘1’ bits and

False for ‘0’ bits:

>>> a[0]

False

>>> a[4]

True

If you want a single bit as a new bitstring then use a one-bit slice instead:

>>> a[0:1]
BitArray('0b0')

Indexing also works for missing and negative arguments, just as it does for other containers.


>>> a = BitArray('0b00011110')
>>> print(a[:5])     # first 5 bits
0b00011
>>> print(a[3:])     # everything except first 3 bits
0b11110
>>> print(a[-4:])    # final 4 bits
0xe
>>> print(a[:-1])    # everything except last bit
0b0001111
>>> print(a[-6:-4])  # from 6 from the end to 4 from the end
0b01

6.1.1 Stepping in slices

The step parameter (also known as the stride) can be used in slices and has the same meaning

as in the built-in containers:

>>> s = BitArray(16)
>>> s[::2] = [1]*8
>>> s.bin
'1010101010101010'
>>> del s[8::2]
>>> s.bin
'101010100000'
>>> s[::3].bin
'1010'

Negative slices are also allowed, and should do what you’d expect. So for example s[::-1] returns a bit-reversed copy of s (which is similar to using s.reverse(), which does the same operation on s in-place).
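Since the step has the same meaning as in the built-in containers, the example above can be replayed with an ordinary list of 0 and 1 integers. This sketch uses no bitstring objects at all.

```python
# A plain list of 0/1 integers stands in for a 16-bit bitstring here.
s = [0] * 16
s[::2] = [1] * 8          # set every second element, starting at index 0
after_set = "".join(map(str, s))

del s[8::2]               # drop indices 8, 10, 12 and 14
after_del = "".join(map(str, s))

every_third = "".join(map(str, s[::3]))
reversed_copy = s[::-1]   # a reversed copy; s itself is unchanged

print(after_set)     # 1010101010101010
print(after_del)     # 101010100000
print(every_third)   # 1010
```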

6.2 Joining

To join together a couple of bitstring objects use the + or += operators, or the append and prepend methods.

# Six ways of creating the same BitArray:
a1 = BitArray(bin='000') + BitArray(hex='f')
a2 = BitArray('0b000') + BitArray('0xf')
a3 = BitArray('0b000') + '0xf'
a4 = BitArray('0b000')
a4.append('0xf')
a5 = BitArray('0xf')
a5.prepend('0b000')
a6 = BitArray('0b000')
a6 += '0xf'


Note that the final three methods all modify a bitstring, and so will only work with BitArray objects, not the immutable Bits objects.

If you want to join a large number of bitstrings then the method join can be used to improve efficiency and readability. It works like the ordinary string join function in that it uses the bitstring that it is called on as a separator when joining the list of bitstring objects it is given. If you don’t want a separator then it can be called on an empty bitstring.

bslist = [BitArray(uint=n, length=12) for n in range(1000)]
s = BitArray('0b1111').join(bslist)

6.3 Truncating, inserting, deleting and overwriting

The functions in this section all modify the bitstring that they operate on and so are not available

for Bits objects.

6.3.1 Deleting and truncating

To delete bits just use del as you would with any other container:

>>> a = BitArray('0b00011000')
>>> del a[3:5]  # remove 2 bits at pos 3
>>> a.bin
'000000'
>>> b = BitArray('0x112233445566')
>>> del b[24:40]
>>> b.hex
'11223366'

You can of course use this to truncate the start or end bits just as easily:

>>> a = BitArray('0x001122')
>>> del a[-8:]  # remove last 8 bits
>>> del a[:8]   # remove first 8 bits
>>> a == '0x11'
True

True

6.3.2 insert

As you might expect, insert takes one BitArray and inserts it into another. A bit position must be specified for BitArray objects, but for BitStream objects the current pos is used if no position is given.

>>> a = BitArray('0x00112233')
>>> a.insert('0xffff', 16)
>>> a.hex
'0011ffff2233'


6.3.3 overwrite

overwrite does much the same as insert, but predictably the BitArray object’s data is

overwritten by the new data.

>>> a = BitStream('0x00112233')
>>> a.pos = 4
>>> a.overwrite('0b1111')  # Uses current pos as default
>>> a.hex
'0f112233'

6.4 The bitstring as a list

If you treat a bitstring object as a list whose elements are all either ‘1’ or ‘0’ then you won’t go far wrong. The table below gives some of the equivalent ways of using methods and the standard slice notation.

Using functions         Using slices
s.insert(bs, pos)       s[pos:pos] = bs
s.overwrite(bs, pos)    s[pos:pos + bs.len] = bs
s.append(bs)            s[s.len:s.len] = bs
s.prepend(bs)           s[0:0] = bs
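The slice forms in the table behave the same way on an ordinary Python list, which makes them easy to try out. In this sketch the four functions are hypothetical stand-ins that work on copies of a list of 0 and 1 integers, with len(bs) playing the role of bs.len.

```python
def insert(s, bs, pos):
    out = list(s)
    out[pos:pos] = bs              # empty slice: nothing is replaced
    return out


def overwrite(s, bs, pos):
    out = list(s)
    out[pos:pos + len(bs)] = bs    # replace exactly len(bs) elements
    return out


def append(s, bs):
    out = list(s)
    out[len(out):len(out)] = bs
    return out


def prepend(s, bs):
    out = list(s)
    out[0:0] = bs
    return out


s, bs = [0, 0, 0, 0], [1, 1]
print(insert(s, bs, 2))      # [0, 0, 1, 1, 0, 0]
print(overwrite(s, bs, 1))   # [0, 1, 1, 0]
print(append(s, bs))         # [0, 0, 0, 0, 1, 1]
print(prepend(s, bs))        # [1, 1, 0, 0, 0, 0]
```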

6.5 Splitting

6.5.1 split

Sometimes it can be very useful to use a delimiter to split a bitstring into sections. The split

method returns a generator for the sections.

>>> a = BitArray('0x4700004711472222')
>>> for s in a.split('0x47', bytealigned=True):
...     print(s.hex)
470000
4711
472222

Note that the first item returned is always the bitstring before the first occurrence of the delimiter, even if it is empty.
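The byte-aligned behaviour can be imitated on a bytes object, which also makes the (possibly empty) first item visible. This is a sketch for Python 3 under the assumption that delimiter matches do not overlap; split_bytes is a hypothetical helper, not the library's method.

```python
def split_bytes(data, delimiter):
    """Yield the part before the first delimiter, then one section per delimiter."""
    found = data.find(delimiter)
    if found == -1:
        yield data               # no delimiter at all: a single section
        return
    yield data[:found]           # may be empty
    while found != -1:
        following = data.find(delimiter, found + len(delimiter))
        yield data[found:] if following == -1 else data[found:following]
        found = following


data = bytes.fromhex("4700004711472222")
sections = [section.hex() for section in split_bytes(data, b"\x47")]
print(sections)   # ['', '470000', '4711', '472222']
```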

6.5.2 cut

If you just want to split into equal parts then use the cut method. This takes a number of bits as its first argument and returns a generator for chunks of that size.


>>> a = BitArray('0x47001243')
>>> for byte in a.cut(8):
...     print(byte.hex)
47
00
12
43


CHAPTER

SEVEN

READING, PARSING AND UNPACKING

7.1 Reading and parsing

The BitStream and ConstBitStream classes contain a number of methods for reading the bitstring as if it were a file or stream. Depending on how it was constructed the bitstream might actually be contained in a file rather than stored in memory, but these methods work for either case.

In order to behave like a file or stream, every bitstream has a property pos which is the current position from which reads occur. pos can range from zero (its value on construction) to the length of the bitstream, a position from which all reads will fail as it is past the last bit. Note that the pos property isn’t considered a part of the bitstream’s identity; this allows it to vary for immutable ConstBitStream objects and means that it doesn’t affect equality or hash values.

The property bytepos is also available, and is useful if you are only dealing with byte data and don’t want to always have to divide the bit position by eight. Note that if you try to use bytepos and the bitstring isn’t byte aligned (i.e. pos isn’t a multiple of 8) then a ByteAlignError exception will be raised.

7.1.1 read / readlist

For simple reading of a number of bits you can use read with an integer argument. A new

bitstring object gets returned, which can be interpreted using one of its properties or used for

further reads. The following example does some simple parsing of an MPEG-1 video stream

(the stream is provided in the test directory if you downloaded the source archive).

>>> s = ConstBitStream(filename='test/test.m1v')
>>> print(s.pos)
0
>>> start_code = s.read(32).hex
>>> width = s.read(12).uint
>>> height = s.read(12).uint
>>> print(start_code, width, height, s.pos)
000001b3 352 288 56
>>> s.pos += 37


>>> flags = s.read(2)
>>> constrained_parameters_flag = flags.read(1)
>>> load_intra_quantiser_matrix = flags.read(1)
>>> print(s.pos, flags.pos)
95 2

If you want to read multiple items in one go you can use readlist. This can take an iterable

of bit lengths and return a list of bitstring objects. So for example instead of writing:

a = s.read(32)
b = s.read(8)
c = s.read(24)

you can equivalently use just:

a, b, c = s.readlist([32, 8, 24])
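The way pos advances with each read can be modelled in a few lines. The class below is a toy stand-in written for illustration, not the library's ConstBitStream; the seven header bytes are synthesised here to match the start code, width and height printed earlier rather than taken from the test file.

```python
class BitReader:
    """Minimal read-only bit stream over a bytes object."""

    def __init__(self, data):
        self.bits = "".join(format(byte, "08b") for byte in bytearray(data))
        self.pos = 0                      # current bit position

    def peek(self, n):
        if self.pos + n > len(self.bits):
            raise ValueError("cannot read past the end of the data")
        return self.bits[self.pos:self.pos + n]

    def read(self, n):
        chunk = self.peek(n)
        self.pos += n                     # reading advances the position
        return chunk

    def readlist(self, lengths):
        return [self.read(n) for n in lengths]


header = bytes.fromhex("000001b3160120")   # 0x000001b3, then 352 and 288 in 12 bits each
s = BitReader(header)
start_code, width, height = s.readlist([32, 12, 12])
print("%08x" % int(start_code, 2), int(width, 2), int(height, 2), s.pos)
# 000001b3 352 288 56
```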

7.1.2 Reading using format strings

The read / readlist methods can also take a format string similar to that used in the auto initialiser. Only one token should be provided to read and a single value is returned. To read multiple tokens use readlist, which unsurprisingly returns a list.

The format string consists of comma separated tokens that describe how to interpret the next bits in the bitstring. The tokens are:

int:n       n bits as a signed integer.
uint:n      n bits as an unsigned integer.
intbe:n     n bits as a byte-wise big-endian signed integer.
uintbe:n    n bits as a byte-wise big-endian unsigned integer.
intle:n     n bits as a byte-wise little-endian signed integer.
uintle:n    n bits as a byte-wise little-endian unsigned integer.
intne:n     n bits as a byte-wise native-endian signed integer.
uintne:n    n bits as a byte-wise native-endian unsigned integer.
float:n     n bits as a big-endian floating point number (same as floatbe).
floatbe:n   n bits as a big-endian floating point number (same as float).
floatle:n   n bits as a little-endian floating point number.
floatne:n   n bits as a native-endian floating point number.
hex:n       n bits as a hexadecimal string.
oct:n       n bits as an octal string.
bin:n       n bits as a binary string.
bits:n      n bits as a new bitstring.
bytes:n     n bytes as a bytes object.
ue          next bits as an unsigned exponential-Golomb code.
se          next bits as a signed exponential-Golomb code.
uie         next bits as an interleaved unsigned exponential-Golomb code.
sie         next bits as an interleaved signed exponential-Golomb code.
bool        next bit as a boolean (True or False).


So in the earlier example we could have written:

start_code =s.read(’hex:32’)

width =s.read(’uint:12’)

height =s.read(’uint:12’)

and we also could have combined the three reads as:

start_code, width, height =s.readlist(’hex:32, 12, 12’)

where here we are also taking advantage of the default uint interpretation for the second and

third tokens.

You are allowed to use one ‘stretchy’ token in a readlist. This is a token without a length
specified, which will stretch to encompass as many bits as possible. This is often useful
when you just want to assign something to ‘the rest’ of the bitstring:

a, b, everything_else = s.readlist('intle:16, intle:24, bits')

In this example the bits token will consist of everything left after the first two tokens are read,
and could be empty.

It is an error to use more than one stretchy token, or to use a ue, se, uie or sie token after
a stretchy token. The reason you can’t use exponential-Golomb codes after a stretchy token is
that the codes can only be read forwards; that is, you can’t ask “if this code ends here, where
did it begin?” as there could be many possible answers.
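
To see why exponential-Golomb codes can only be read forwards, here is a minimal stdlib-only sketch of a decoder. It works on plain strings of '0' and '1' characters and is not the module's implementation; the names read_ue and read_se are hypothetical. The length of a code is only known after counting its leading zeros, which is why the start of the code has to be known first.

```python
def read_ue(bits, pos=0):
    """Decode one unsigned exponential-Golomb code starting at index pos.

    Returns (value, new_pos). Assumes bits contains a complete code.
    """
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    # The code is `zeros` zero bits followed by zeros + 1 significant bits.
    end = pos + 2 * zeros + 1
    value = int(bits[pos + zeros:end], 2) - 1
    return value, end


def read_se(bits, pos=0):
    """Decode one signed code: code numbers 1, 2, 3, 4 map to 1, -1, 2, -2."""
    code_num, end = read_ue(bits, pos)
    magnitude = (code_num + 1) // 2
    return (magnitude if code_num % 2 else -magnitude), end
```

The decoder never looks backwards: given only an end position there would be no way to tell how many of the preceding zeros belong to the code.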

7.1.3 Peeking

In addition to the read methods there are matching peek methods. These are identical to the
read methods except that they do not advance the position in the bitstring to after the read elements.

s = ConstBitStream('0x4732aa34')
if s.peek(8) == '0x47':
    t = s.read(16)    # t is first 2 bytes '0x4732'
else:
    s.find('0x47')
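
The difference between reading and peeking can be sketched with a tiny stand-in class. This is a stdlib-only illustration over a string of '0' and '1' characters; BitReader is a hypothetical name and not part of the bitstring module.

```python
class BitReader:
    """Minimal stand-in for a bitstream: a bit string plus a position."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def peek(self, n):
        # Return the next n bits without moving the position.
        return self.bits[self.pos:self.pos + n]

    def read(self, n):
        # Same as peek, but advance past what was returned.
        chunk = self.peek(n)
        self.pos += len(chunk)
        return chunk


r = BitReader(format(0x4732AA34, "032b"))
first_byte = r.peek(8)      # the position is still 0 afterwards
two_bytes = r.read(16)      # the position moves to 16
```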

7.2 Unpacking

The unpack method works in a very similar way to readlist. The major difference is that

it interprets the whole bitstring from the start, and takes no account of the current pos. It’s a

natural complement of the pack function.

s = pack('uint:10, hex, int:13, 0b11', 130, '3d', -23)
a, b, c, d = s.unpack('uint:10, hex, int:13, bin:2')

7.3 Seeking

The properties pos and bytepos are available for getting and setting the position, which is

zero on creation of the bitstring.

Note that you can only use bytepos if the position is byte aligned, i.e. the bit position is a

multiple of 8. Otherwise a ByteAlignError exception is raised.

For example:

>>> s = BitStream('0x123456')
>>> s.pos
0
>>> s.bytepos += 2
>>> s.pos                    # note pos versus bytepos
16
>>> s.pos += 4
>>> print(s.read('bin:4'))   # the final nibble '0x6'
0110

7.4 Finding and replacing

7.4.1 find / rfind

To search for a sub-string use the find method. If the find succeeds it will set the position
to the start of the next occurrence of the searched-for string and return a tuple containing that
position; otherwise it will return an empty tuple. By default the sub-string will be found at any
bit position. To restrict matches to byte boundaries, set bytealigned=True.

>>> s = ConstBitStream('0x00123400001234')
>>> found = s.find('0x1234', bytealigned=True)
>>> print(found, s.bytepos)
(8,) 1
>>> found = s.find('0xff', bytealigned=True)
>>> print(found, s.bytepos)
() 1

The reason for returning the bit position in a tuple is so that the return value is True in a
boolean sense if the sub-string is found, and False if it is not (if just the bit position were
returned there would be a problem with finding at position 0). The effect is that you can use
if s.find(...): and have it behave as you’d expect.
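
The tuple convention can be illustrated with a small stdlib-only sketch over strings of '0' and '1' characters. The function find_bits is hypothetical and only mimics the return convention described above; it is not how the module searches internally.

```python
def find_bits(haystack, needle, bytealigned=False):
    """Return (position,) for the first match, or () if there is none."""
    step = 8 if bytealigned else 1
    for pos in range(0, len(haystack) - len(needle) + 1, step):
        if haystack.startswith(needle, pos):
            return (pos,)
    return ()


data = format(0x00123400001234, "056b")
aligned_hit = find_bits(data, format(0x1234, "016b"), bytealigned=True)
hit_at_zero = find_bits("1010", "10")    # a match at position 0
miss = find_bits("1010", "111")
```

A match at position 0 gives the non-empty tuple (0,), which is truthy, whereas a bare 0 would have been treated as false.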

rfind does much the same as find, except that it will find the last occurrence, rather than
the first.

>>> t = BitArray('0x0f231443e8')
>>> found = t.rfind('0xf')   # Search all bit positions in reverse
>>> print(found)
(31,)                        # Found within the 0x3e near the end

For all of these finding functions you can optionally specify a start and / or end to narrow
the search range. Note though that because it’s searching backwards rfind will start at end
and end at start (so you always need start < end).

7.4.2 findall

To find all occurrences of a bitstring inside another (even overlapping ones), use findall.
This returns a generator for the bit positions of the found strings.

>>> r = BitArray('0b011101011001')
>>> ones = r.findall([1])
>>> print(list(ones))
[1, 2, 3, 5, 7, 8, 11]
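
As a rough illustration of what "even overlapping ones" means, the following stdlib-only sketch yields every match position in a string of '0' and '1' characters. The name findall_bits is hypothetical and the code is not taken from the module.

```python
def findall_bits(haystack, needle):
    """Yield the start of every match, including overlapping ones."""
    pos = haystack.find(needle)
    while pos != -1:
        yield pos
        # Restart one bit later (not len(needle) later) so overlaps are kept.
        pos = haystack.find(needle, pos + 1)


ones = list(findall_bits("011101011001", "1"))
overlapping = list(findall_bits("11110", "111"))
```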

7.4.3 replace

To replace all occurrences of one BitArray with another use replace. The replacements
are done in-place, and the number of replacements made is returned. This method changes the
contents of the bitstring and so isn’t available for the Bits or ConstBitStream classes.

>>> s = BitArray('0b110000110110')
>>> s.replace('0b110', '0b1111')
3              # The number of replacements made
>>> s.bin
'111100011111111'
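
The example can be checked by hand with ordinary Python strings. This is a stdlib-only sketch with a hypothetical helper name; unlike the module's replace it returns a new string, because Python strings are immutable.

```python
def replace_bits(bits, old, new):
    """Return (new_bits, count) after replacing non-overlapping matches."""
    count = bits.count(old)      # str.count ignores overlapping matches
    return bits.replace(old, new), count


result, replacements = replace_bits("110000110110", "110", "1111")
```

Note that the result is longer than the input: each 3-bit match was replaced with 4 bits, so the length grows from 12 to 15 bits.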

7.5 Working with byte aligned data

The emphasis with the bitstring module is always towards not worrying if things are a whole
number of bytes long or are aligned on byte boundaries. Internally the module has to worry
about this quite a lot, but the user shouldn’t have to care. To this end methods such as find,
findall, split and replace by default aren’t concerned with looking for things only on
byte boundaries and provide a parameter bytealigned which can be set to True to change
this behaviour.

This works fine, but it’s not uncommon to be working only with whole-byte data and all the
bytealigned=True can get a bit repetitive. To solve this it is possible to change the default
throughout the module by setting bitstring.bytealigned. For example:

>>> s = BitArray('0xabbb')
>>> s.find('0xbb')                      # look for the byte 0xbb
(4,)                                    # found, but not on byte boundary
>>> s.find('0xbb', bytealigned=True)    # try again...
(8,)                                    # now found on a byte boundary
>>> bitstring.bytealigned = True        # change the default behaviour
>>> s.find('0xbb')
(8,)                                    # now only finds byte aligned

CHAPTER

EIGHT

MISCELLANY

8.1 Other Functions

8.1.1 bytealign

bytealign advances between zero and seven bits to make the pos a multiple of eight. It

returns the number of bits advanced.

>>> a = BitStream('0x11223344')
>>> a.pos = 1
>>> skipped = a.bytealign()
>>> print(skipped, a.pos)
7 8
>>> skipped = a.bytealign()
>>> print(skipped, a.pos)
0 8
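
The arithmetic behind the example is a single modulo operation. The helper below is a hypothetical stdlib-only sketch and not the module's code:

```python
def bytealign(pos):
    """Return (bits_skipped, new_pos) for aligning pos up to a multiple of 8."""
    skipped = -pos % 8          # 0 if already aligned, otherwise 1 to 7
    return skipped, pos + skipped
```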

8.1.2 reverse

This simply reverses the bits of the BitArray in place. You can optionally specify a range of

bits to reverse.

>>> a = BitArray('0b000001101')
>>> a.reverse()
>>> a.bin
'101100000'
>>> a.reverse(0, 4)
>>> a.bin
'110100000'

8.1.3 tobytes

Returns the byte data contained in the bitstring as a bytes object (equivalent to a str if you’re
using Python 2.6). This differs from using the plain bytes property in that if the bitstring isn’t
a whole number of bytes long then it will be made so by appending up to seven zero bits.

>>> BitArray('0b1').tobytes()
'\x80'
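
The padding rule can be sketched in a few lines of standard Python. The function bits_to_bytes is a hypothetical helper that works on strings of '0' and '1' characters; it only illustrates the rule and is not the module's implementation.

```python
def bits_to_bytes(bits):
    """Pack a bit string into bytes, zero-padding the final byte on the right."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


single_bit = bits_to_bytes("1")     # one set bit followed by seven padding zeros
```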

8.1.4 tofile

Writes the byte data contained in the bitstring to a file. The file should have been opened in a
binary write mode, for example:

>>> f = open('newfile', 'wb')
>>> BitArray('0xffee3241fed').tofile(f)

In exactly the same manner as with tobytes, up to seven zero bits will be appended to make
the file a whole number of bytes long.

8.1.5 startswith / endswith

These act like the same named functions on strings, that is they return True if the bitstring

starts or ends with the parameter given. Optionally you can specify a range of bits to use.

>>> s = BitArray('0xef133')
>>> s.startswith('0b111011')
True
>>> s.endswith('0x4')
False

8.1.6 ror / rol

To rotate the bits in a BitArray use ror and rol for right and left rotations respectively.

The changes are done in-place.

>>> s = BitArray('0x00001')
>>> s.rol(6)
>>> s.hex
'00040'
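
Rotation differs from shifting in that no bits are lost. A stdlib-only sketch on strings of '0' and '1' characters follows; rol and ror here are hypothetical free functions that return new strings rather than changing anything in place, and they assume a non-empty input.

```python
def rol(bits, n):
    """Rotate left: the first n bits wrap round to the end."""
    n %= len(bits)
    return bits[n:] + bits[:n]


def ror(bits, n):
    """Rotate right: the last n bits wrap round to the start."""
    n %= len(bits)
    cut = len(bits) - n
    return bits[cut:] + bits[:cut]


start = format(0x00001, "020b")
rotated = rol(start, 6)
as_hex = format(int(rotated, 2), "05x")
```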

8.2 Special Methods

A few of the special methods have already been covered, for example __add__ and
__iadd__ (the + and += operators) and __getitem__ and __setitem__ (reading and
setting slices via []). Here are some more:

8.2.1 __len__

This implements the len function and returns the length of the bitstring in bits.

It’s recommended that you use the len property instead of the function, as a limitation of
Python means that the function will raise an OverflowError if the bitstring has more than
sys.maxsize elements (that’s typically 256MB of data with 32-bit Python).

There’s not much more to say really, except to emphasise that it is always in bits and never
bytes.

>>> len(BitArray('0x00'))
8

8.2.2 __str__ / __repr__

These get called when you try to print a bitstring. As bitstrings have no preferred interpretation
the form printed might not be what you want - if not then use the hex, bin, int etc. properties.
The main use here is in interactive sessions when you just want a quick look at the bitstring.
The __repr__ tries to give a code fragment which if evaluated would give an equal bitstring.

The form used for the bitstring is generally the one which gives it the shortest representation.
If the resulting string is too long then it will be truncated with ... - this prevents very long
bitstrings from tying up your interactive session while they print themselves.

>>> a = BitArray('0b1111 111')
>>> print(a)
0b1111111
>>> a
BitArray('0b1111111')
>>> a += '0b1'
>>> print(a)
0xff
>>> print(a.bin)
11111111

8.2.3 __eq__ / __ne__

The equality of two bitstring objects is determined by their binary representations being equal.

If you have a different criterion you wish to use then code it explicitly, for example a.int ==

b.int could be true even if a == b wasn’t (as they could be different lengths).

>>> BitArray('0b0010') == '0x2'
True
>>> BitArray('0x2') != '0o2'
True
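
The remark that a.int == b.int can hold when a == b does not can be checked with a small stdlib-only sketch. The helper to_int is hypothetical and works on non-empty strings of '0' and '1' characters.

```python
def to_int(bits):
    """Two's complement value of a non-empty bit string."""
    value = int(bits, 2)
    if bits[0] == "1":               # sign bit set: subtract 2**length
        value -= 1 << len(bits)
    return value


a, b = "0010", "00000010"
same_bits = a == b                   # the lengths differ
same_value = to_int(a) == to_int(b)  # both interpret as 2
```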

8.2.4 __invert__

To get a bit-inverted copy of a bitstring use the ~ operator:

>>> a = BitArray('0b0001100111')
>>> print(a)
0b0001100111
>>> print(~a)
0b1110011000
>>> ~~a == a
True

8.2.5 __lshift__ / __rshift__ / __ilshift__ / __irshift__

Bitwise shifts can be achieved using <<, >>, <<= and >>=. Bits shifted off the left or right
are discarded and the vacated positions are filled with zero bits. If you need special behaviour,
such as keeping the sign of two’s complement integers, then do the shift on the property instead,
for example use a.int >>= 2.

>>> a = BitArray('0b10011001')
>>> b = a << 2
>>> print(b)
0b01100100
>>> a >>= 2
>>> print(a)
0b00100110
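
The difference between a zero-filling shift and a sign-preserving shift can be sketched with plain Python. The helpers below are hypothetical, operate on strings of '0' and '1' characters, and are not the module's implementation; the last line uses Python's own integer shift, which keeps the sign.

```python
def shift_left(bits, n):
    """Drop n bits from the left and append n zeros; the length is unchanged."""
    n = min(n, len(bits))
    return bits[n:] + "0" * n


def shift_right(bits, n):
    """Drop n bits from the right and prepend n zeros; the length is unchanged."""
    n = min(n, len(bits))
    return "0" * n + bits[:len(bits) - n]


logical = shift_right("10011001", 2)     # zero-filled
arithmetic = -103 >> 2                   # 0b10011001 is -103 as a signed 8-bit int
```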

8.2.6 __mul__ / __imul__ / __rmul__

Multiplication of a bitstring by an integer means the same as it does for ordinary strings:
concatenation of multiple copies of the bitstring.

>>> a = BitArray('0b10') * 8
>>> print(a.bin)
1010101010101010

8.2.7 __copy__

This allows the bitstring to be copied via the copy module.

>>> import copy
>>> a = Bits('0x4223fbddec2231')
>>> b = copy.copy(a)
>>> b == a
True
>>> b is a
False

It’s not terribly exciting, and isn’t the only method of making a copy. Using b =
BitArray(a) is another option, but b = a[:] may be more familiar to some.

8.2.8 __and__ / __or__ / __xor__ / __iand__ / __ior__ / __ixor__

Bit-wise AND, OR and XOR are provided for bitstring objects of equal length only (otherwise
a ValueError is raised).

>>> a = BitArray('0b00001111')
>>> b = BitArray('0b01010101')
>>> print((a & b).bin)
00000101
>>> print((a | b).bin)
01011111
>>> print((a ^ b).bin)
01011010
>>> b &= '0x1f'
>>> print(b.bin)
00010101

Part II

Reference

CHAPTER

NINE

QUICK REFERENCE

This section lists the bitstring module’s classes together with all their methods and attributes.

The next section goes into full detail with examples.

9.1 Bits

Bits(object)

A Bits is the most basic class. It is immutable, so once created its value cannot change. It is
a base class for all the other classes in the bitstring module.

9.1.1 Methods

• all – Check if all specified bits are set to 1 or 0.
• any – Check if any of specified bits are set to 1 or 0.
• count – Count the number of bits set to 1 or 0.
• cut – Create generator of constant sized chunks.
• endswith – Return whether the bitstring ends with a sub-bitstring.
• find – Find a sub-bitstring in the current bitstring.
• findall – Find all occurrences of a sub-bitstring in the current bitstring.
• join – Join bitstrings together using current bitstring.
• rfind – Seek backwards to find a sub-bitstring.
• split – Create generator of chunks split by a delimiter.
• startswith – Return whether the bitstring starts with a sub-bitstring.
• tobytes – Return bitstring as bytes, padding if needed.
• tofile – Write bitstring to file, padding if needed.
• unpack – Interpret bits using format string.

9.1.2 Special methods

Also available are the operators [], ==, !=, +, *, ~, <<, >>, &, | and ^.

9.1.3 Properties

• bin – The bitstring as a binary string.
• bool – For single bit bitstrings, interpret as True or False.
• bytes – The bitstring as a bytes object.
• float – Interpret as a floating point number.
• floatbe – Interpret as a big-endian floating point number.
• floatle – Interpret as a little-endian floating point number.
• floatne – Interpret as a native-endian floating point number.
• hex – The bitstring as a hexadecimal string.
• int – Interpret as a two’s complement signed integer.
• intbe – Interpret as a big-endian signed integer.
• intle – Interpret as a little-endian signed integer.
• intne – Interpret as a native-endian signed integer.
• len – Length of the bitstring in bits.
• oct – The bitstring as an octal string.
• se – Interpret as a signed exponential-Golomb code.
• ue – Interpret as an unsigned exponential-Golomb code.
• sie – Interpret as a signed interleaved exponential-Golomb code.
• uie – Interpret as an unsigned interleaved exponential-Golomb code.
• uint – Interpret as an unsigned base-2 integer.
• uintbe – Interpret as a big-endian unsigned integer.
• uintle – Interpret as a little-endian unsigned integer.
• uintne – Interpret as a native-endian unsigned integer.

9.2 BitArray

BitArray(Bits)

This class adds mutating methods to Bits.

9.2.1 Additional methods

• append – Append a bitstring.
• byteswap – Change byte endianness in-place.
• insert – Insert a bitstring.
• invert – Flip bit(s) between one and zero.
• overwrite – Overwrite a section with a new bitstring.
• prepend – Prepend a bitstring.
• replace – Replace occurrences of one bitstring with another.
• reverse – Reverse bits in-place.
• rol – Rotate bits to the left.
• ror – Rotate bits to the right.
• set – Set bit(s) to 1 or 0.

9.2.2 Additional special methods

Mutating operators are available: [], <<=, >>=, *=, &=, |= and ^=.

9.2.3 Attributes

The same as Bits, except that they are all (with the exception of len) writable as

well as readable.

9.3 ConstBitStream

ConstBitStream(Bits)

This class, previously known as just Bits (which is an alias for backward-compatibility), adds

a bit position and methods to read and navigate in the bitstream.

9.3.1 Additional methods

• bytealign – Align to next byte boundary.
• peek – Peek at and interpret next bits as a single item.
• peeklist – Peek at and interpret next bits as a list of items.
• read – Read and interpret next bits as a single item.
• readlist – Read and interpret next bits as a list of items.
• readto – Read up to and including next occurrence of a bitstring.

9.3.2 Additional attributes

• bytepos – The current byte position in the bitstring.
• pos – The current bit position in the bitstring.

9.4 BitStream

BitStream(BitArray, ConstBitStream)

This class, also known as BitString, contains all of the ‘stream’ elements of

ConstBitStream and adds all of the mutating methods of BitArray.

CHAPTER

TEN

THE BITSTRING MODULE

The bitstring module provides four classes: Bits, BitArray, ConstBitStream and
BitStream. Bits is the simplest, and represents an immutable sequence of bits, while
BitArray adds various methods that modify the contents (these classes are intended to
loosely mirror bytes and bytearray in Python 3). The ‘Stream’ classes have additional
methods to treat the bits as a file or stream.

If you need to change the contents of a bitstring after creation then you must use either the
BitArray or BitStream classes. If you need to use bitstrings as keys in a dictionary or
members of a set then you must use either a Bits or a ConstBitStream. In this section
the generic term ‘bitstring’ is used to refer to an object of any of these classes.

Note that for the bitstream classes the bit position within the bitstream (the position from which
reads occur) can change without affecting the equality operation. This means that the pos and
bytepos properties can change even for a ConstBitStream object.

The public methods, special methods and properties of these classes are detailed in this section.

10.1 The auto initialiser

Note that in places where a bitstring can be used as a parameter, any other valid input to the

auto initialiser can also be used. This means that the parameter can also be a format string

which consists of tokens:

• Starting with hex=, or simply starting with 0x implies hexadecimal. e.g. 0x013ff, hex=013ff
• Starting with oct=, or simply starting with 0o implies octal. e.g. 0o755, oct=755
• Starting with bin=, or simply starting with 0b implies binary. e.g. 0b0011010, bin=0011010
• Starting with int: or uint: followed by a length in bits and = gives base-2 integers.
  e.g. uint:8=255, int:4=-7
• To get big, little and native-endian whole-byte integers append be, le or ne respectively
  to the uint or int identifier. e.g. uintle:32=1, intne:16=-23
• For floating point numbers use float: followed by the length in bits and = and the
  number. The default is big-endian, but you can also append be, le or ne as with integers.
  e.g. float:64=0.2, floatle:32=-0.3e12
• Starting with ue=, uie=, se= or sie= implies an exponential-Golomb coded integer.
  e.g. ue=12, sie=-4

Multiple tokens can be joined by separating them with commas, so for example se=4, 0b1,
se=-1 represents the concatenation of three elements.

Parentheses and multiplicative factors can also be used, for example 2*(0b10, 0xf) is
equivalent to 0b10, 0xf, 0b10, 0xf. The multiplying factor must come before the thing
it is being used to repeat.

The auto parameter also accepts other types:

• A list or tuple, whose elements will be evaluated as booleans (imagine calling bool()
  on each item) and the bits set to 1 for True items and 0 for False items.
• A positive integer, used to create a bitstring of that many zero bits.
• A file object, presumably opened in read-binary mode, from which the bitstring will be
  formed.
• A bytearray object.
• In Python 3 only, a bytes object. Note this won’t work for Python 2 as bytes is just a
  synonym for str.

10.2 Compact format strings

For the read, unpack, peek methods and pack function you can use compact format strings
similar to those used in the struct and array modules. These start with an endian identifier:
> for big-endian, < for little-endian or @ for native-endian. This must be followed by at least
one of these codes:

Code    Interpretation
b       8 bit signed integer
B       8 bit unsigned integer
h       16 bit signed integer
H       16 bit unsigned integer
l       32 bit signed integer
L       32 bit unsigned integer
q       64 bit signed integer
Q       64 bit unsigned integer
f       32 bit floating point number
d       64 bit floating point number

For more detail see Compact format strings.
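
Since these codes are modelled on the standard struct module, the widths in the table can be cross-checked against struct itself. This sketch uses only the standard library and makes no claim about how bitstring parses compact format strings.

```python
import struct

# '>' selects big-endian with standard sizes, so each code has a fixed width.
sizes = {code: struct.calcsize(">" + code) * 8 for code in "bBhHlLqQfd"}

packed = struct.pack(">hB", -2, 255)     # 16-bit signed, then 8-bit unsigned
unpacked = struct.unpack(">hB", packed)
```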

10.3 Class properties

Bitstrings use a wide range of properties for getting and setting different interpretations on the

binary data, as well as accessing bit lengths and positions. For the mutable BitStream and

BitArray objects the properties are all read and write (with the exception of the length),

whereas for immutable objects the only write enabled properties are for the position in the

bitstream (pos/bitpos and bytepos).

CHAPTER

ELEVEN

THE BITS CLASS

class bitstring.Bits([auto, length, offset, **kwargs])

Creates a new bitstring. You must specify either no initialiser, just an auto value,
or one of the keyword arguments bytes, bin, hex, oct, uint, int, uintbe,
intbe, uintle, intle, uintne, intne, se, ue, sie, uie, float, floatbe,
floatle, floatne, bool or filename. If no initialiser is given then a zeroed
bitstring of length bits is created.

The initialiser for the Bits class is precisely the same as for BitArray, BitStream
and ConstBitStream.

offset is available when using the bytes or filename initialisers. It gives a number
of bits to ignore at the start of the bitstring.

Specifying length is mandatory when using the various integer initialisers. It must be
large enough that a bitstring can contain the integer in length bits. It must also be
specified for the float initialisers (the only valid values are 32 and 64). It is optional for
the bytes and filename initialisers and can be used to truncate data from the end of
the input value.

>>> s1 = Bits(hex='0x934')
>>> s2 = Bits(oct='0o4464')
>>> s3 = Bits(bin='0b100100110100')
>>> s4 = Bits(int=-1740, length=12)
>>> s5 = Bits(uint=2356, length=12)
>>> s6 = Bits(bytes=b'\x93@', length=12)
>>> s1 == s2 == s3 == s4 == s5 == s6
True
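
That these initialisers describe the same 12 bits can be verified with ordinary integer arithmetic. The following is a stdlib-only sketch; the variable names are illustrative and nothing here calls the bitstring module.

```python
LENGTH = 12

from_hex = int("934", 16)
from_oct = int("4464", 8)
from_uint = 2356
from_int = -1740 % (1 << LENGTH)          # two's complement in 12 bits
# b'\x93@' is 16 bits; length=12 keeps the first 12 and drops the last 4.
from_bytes = int.from_bytes(b"\x93@", "big") >> (16 - LENGTH)

pattern = format(from_hex, "0{}b".format(LENGTH))
```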

For information on the use of auto see The auto initialiser.

>>> s = Bits('uint:12=32, 0b110')
>>> t = Bits('0o755, ue=12, int:3=-1')

all(value[, pos])

Returns True if all of the specified bits are all set to value, otherwise returns
False.

If value is True then 1 bits are checked for, otherwise 0 bits are checked for.

pos should be an iterable of bit positions. Negative numbers are treated in the same
way as slice indices and it will raise an IndexError if pos < -s.len or pos
> s.len. It defaults to the whole bitstring.

>>> s = Bits('int:15=-1')
>>> s.all(True, [3, 4, 12, 13])
True
>>> s.all(1)
True

any(value[, pos])

Returns True if any of the specified bits are set to value, otherwise returns False.

If value is True then 1 bits are checked for, otherwise 0 bits are checked for.

pos should be an iterable of bit positions. Negative numbers are treated in the same
way as slice indices and it will raise an IndexError if pos < -s.len or pos
> s.len. It defaults to the whole bitstring.

>>> s = Bits('0b11011100')
>>> s.any(False, range(6))
True
>>> s.any(1)
True

count(value)

Returns the number of bits set to value.

value can be True or False or anything that can be cast to a bool, so you could
equally use 1 or 0.

>>> s = BitString(1000000)
>>> s.set(1, [4, 44, 444444])
>>> s.count(1)
3
>>> s.count(False)
999997

cut(bits[, start, end, count])

Returns a generator for slices of the bitstring of length bits.

At most count items are returned and the range is given by the slice [start:end],
which defaults to the whole bitstring.

>>> s = BitString('0x1234')
>>> for nibble in s.cut(4):
...     s.prepend(nibble)
>>> print(s)
0x43211234

endswith(bs[, start, end])

Returns True if the bitstring ends with the sub-string bs, otherwise returns False.

A slice can be given using the start and end bit positions and defaults to the whole
bitstring.

>>> s = Bits('0x35e22')
>>> s.endswith('0b10, 0x22')
True
>>> s.endswith('0x22', start=13)
False

find(bs[, start, end, bytealigned])

Searches for bs in the current bitstring and sets pos to the start of bs and returns it
in a tuple if found, otherwise it returns an empty tuple.

The reason for returning the bit position in a tuple is so that it evaluates as True even
if the bit position is zero. This allows constructs such as if s.find('0xb3'):
to work as expected.

If bytealigned is True then it will look for bs only at byte aligned positions (which
is generally much faster than searching for it in every possible bit position). start
and end give the search range and default to the whole bitstring.

>>> s = Bits('0x0023122')
>>> s.find('0b000100', bytealigned=True)
(16,)

findall(bs[, start, end, count, bytealigned])

Searches for all occurrences of bs (even overlapping ones) and returns a generator
of their bit positions.

If bytealigned is True then bs will only be looked for at byte aligned positions.
start and end optionally define a search range and default to the whole bitstring.

The count parameter limits the number of items that will be found - the default is to
find all occurrences.

>>> s = Bits('0xab220101') * 5
>>> list(s.findall('0x22', bytealigned=True))
[8, 40, 72, 104, 136]

join(sequence)

Returns the concatenation of the bitstrings in the iterable sequence joined with
self as a separator.

>>> s = Bits().join(['0x0001ee', 'uint:24=13', '0b0111'])
>>> print(s)
0x0001ee00000d7
>>> s = Bits('0b1').join(['0b0'] * 5)
>>> print(s.bin)
010101010

rfind(bs[, start, end, bytealigned])

Searches backwards for bs in the current bitstring and sets pos to the start of bs
and returns it in a tuple if found, otherwise it returns an empty tuple.

The reason for returning the bit position in a tuple is so that it evaluates
as True even if the bit position is zero. This allows constructs such as if
s.rfind('0xb3'): to work as expected.

If bytealigned is True then it will look for bs only at byte aligned positions. start
and end give the search range and default to 0 and len respectively.

Note that as it’s a reverse search it will start at end and finish at start.

>>> s = Bits('0o031544')
>>> s.rfind('0b100')
(15,)
>>> s.rfind('0b100', end=17)
(12,)

split(delimiter[, start, end, count, bytealigned])

Splits the bitstring into sections that start with delimiter. Returns a generator for
bitstring objects.

The first item generated is always the bits before the first occurrence of delimiter
(even if empty). A slice can be optionally specified with start and end, while count
specifies the maximum number of items generated.

If bytealigned is True then the delimiter will only be found if it starts at a byte
aligned position.

>>> s = Bits('0x42423')
>>> [bs.bin for bs in s.split('0x4')]
['', '01000', '01001000', '0100011']
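
The shape of the output, with an initial and possibly empty section followed by sections that each begin with the delimiter, can be reproduced with a stdlib-only sketch. The function split_bits is hypothetical; in particular, resuming the search just after the previous delimiter is an assumption chosen because it matches the example above, not a statement about the module's internals.

```python
def split_bits(bits, delimiter):
    """Yield sections; every section after the first starts with delimiter."""
    pos = bits.find(delimiter)
    if pos == -1:
        yield bits                    # no delimiter at all: one section
        return
    yield bits[:pos]                  # bits before the first delimiter, may be ''
    while pos != -1:
        # Assumption: the next search begins after the current delimiter.
        nxt = bits.find(delimiter, pos + len(delimiter))
        yield bits[pos:] if nxt == -1 else bits[pos:nxt]
        pos = nxt


sections = list(split_bits(format(0x42423, "020b"), "0100"))
```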

startswith(bs[, start, end])

Returns True if the bitstring starts with the sub-string bs, otherwise returns False.

A slice can be given using the start and end bit positions and defaults to the whole
bitstring.

tobytes()

Returns the bitstring as a bytes object (equivalent to a str in Python 2.6/2.7).

The returned value will be padded at the end with between zero and seven 0 bits to
make it byte aligned.

This method can also be used to output your bitstring to a file - just open a file in
binary write mode and write the function’s output.

>>> s = Bits(bytes=b'hello')
>>> s += '0b01'
>>> s.tobytes()
b'hello@'

tofile(f)

Writes the bitstring to the file object f, which should have been opened in binary
write mode.

The data written will be padded at the end with between zero and seven 0 bits to
make it byte aligned.

>>> f = open('newfile', 'wb')
>>> Bits('0x1234').tofile(f)

unpack(fmt, **kwargs)

Interprets the whole bitstring according to the fmt string or iterable and returns a list
of bitstring objects.

A dictionary or keyword arguments can also be provided. These will replace length
identifiers in the format string.

fmt is an iterable or a string with comma separated tokens that describe how to
interpret the next bits in the bitstring. See the entry for read for details.

>>> s = Bits('int:4=-1, 0b1110')
>>> i, b = s.unpack('int:4, bin')

If a token doesn’t supply a length (as with bin above) then it will try to consume
the rest of the bitstring. Only one such token is allowed.

bin

Property for the representation of the bitstring as a binary string.

bool

Property for representing the bitstring as a boolean (True or False).

If the bitstring is not a single bit then the getter will raise an InterpretError.

bytes

Property representing the underlying byte data that contains the bitstring.

When used as a getter the bitstring must be a whole number of bytes long or an
InterpretError will be raised.

An alternative is to use the tobytes method, which will pad with between zero
and seven 0 bits to make it byte aligned if needed.

>>> s = Bits('0x12345678')
>>> s.bytes
b'\x124Vx'

hex

Property representing the hexadecimal value of the bitstring.

If the bitstring is not a multiple of four bits long then getting its hex value will raise
an InterpretError.

>>> s = Bits(bin='1111 0000')
>>> s.hex
'f0'

int

Property for the signed two’s complement integer representation of the bitstring.

intbe

Property for the byte-wise big-endian signed two’s complement integer representation
of the bitstring.

Only valid for whole-byte bitstrings, in which case it is equal to s.int, otherwise
an InterpretError is raised.

intle

Property for the byte-wise little-endian signed two’s complement integer representation
of the bitstring.

Only valid for whole-byte bitstrings, in which case it is equal to s[::-8].int,
i.e. the integer representation of the byte-reversed bitstring.

intne

Property for the byte-wise native-endian signed two’s complement integer representation
of the bitstring.

Only valid for whole-byte bitstrings, and will equal either the big-endian or the
little-endian integer representation depending on the platform being used.

float

floatbe

Property for the floating point representation of the bitstring.

The bitstring must be either 32 or 64 bits long to support the floating point interpretations,
otherwise an InterpretError will be raised.

If the underlying floating point methods on your machine are not IEEE 754 compliant
then using the float interpretations is undefined (this is unlikely unless you’re
on some very unusual hardware).

The float property is bit-wise big-endian, which as all floats must be whole-byte
is exactly equivalent to the byte-wise big-endian floatbe.

floatle

Property for the byte-wise little-endian floating point representation of the bitstring.

floatne

Property for the byte-wise native-endian floating point representation of the bitstring.

len

length

Read-only property that gives the length of the bitstring in bits (len and length
are equivalent).

This is almost equivalent to using the len() built-in function, except that for large
bitstrings len() may fail with an OverflowError, whereas the len property
continues to work.

oct

Property for the octal representation of the bitstring.

If the bitstring is not a multiple of three bits long then getting its octal value will
raise an InterpretError.

>>> s = BitString('0b111101101')
>>> s.oct
'755'
>>> s.oct = '01234567'
>>> s.oct
'01234567'

se

Property for the signed exponential-Golomb code representation of the bitstring.

When used as a getter an InterpretError will be raised if the bitstring is not a
single code.

>>> s = BitString(se=-40)
>>> s.bin
'0000001010001'
>>> s += '0b1'
>>> s.se
Error: BitString is not a single exponential-Golomb code.

ue

Property for the unsigned exponential-Golomb code representation of the bitstring.

When used as a getter an InterpretError will be raised if the bitstring is not a
single code.

sie

Property for the signed interleaved exponential-Golomb code representation of the

bitstring.

When used as a getter an InterpretError will be raised if the bitstring is not a

single code.

uie

Property for the unsigned interleaved exponential-Golomb code representation of

the bitstring.

When used as a getter an InterpretError will be raised if the bitstring is not a

single code.

uint

Property for the unsigned base-2 integer representation of the bitstring.

uintbe

Property for the byte-wise big-endian unsigned base-2 integer representation of the

bitstring.

uintle

Property for the byte-wise little-endian unsigned base-2 integer representation of

the bitstring.

uintne

Property for the byte-wise native-endian unsigned base-2 integer representation of

the bitstring.

__add__(bs)

__radd__(bs)

s1 + s2

Concatenate two bitstring objects and return the result. Either bitstring can be ‘auto’

initialised.

s = Bits(ue=132) + '0xff'
s2 = '0b101' + s

__and__(bs)

__rand__(bs)

s1 & s2

Returns the bit-wise AND between two bitstrings, which must have the same length

otherwise a ValueError is raised.

>>> print(Bits('0x33') & '0x0f')

0x03

__bool__()

if s:

Returns True if at least one bit is set to 1, otherwise returns False.

This special method is used in Python 3 only; for Python 2 the equivalent is called

__nonzero__, but the details are exactly the same.

>>> bool(Bits())

False

>>> bool(Bits(’0b0000010000’))

True

>>> bool(Bits(’0b0000000000’))

False

__contains__(bs)

bs in s

Returns True if bs can be found in the bitstring, otherwise returns False.

Similar to using find, except that you are only told if it is found, and not where it

was found.

>>> ’0b11’ in Bits(’0x06’)

True

>>> ’0b111’ in Bits(’0x06’)

False

__copy__()

s2 = copy.copy(s1)

This allows the copy module to correctly copy bitstrings. Other equivalent

methods are to initialise a new bitstring with the old one or to take a complete slice.

>>> import copy

>>> s=Bits(’0o775’)

>>> s_copy1 =copy.copy(s)

>>> s_copy2 =Bits(s)

>>> s_copy3 =s[:]

>>> s== s_copy1 == s_copy2 == s_copy3

True

__eq__(bs)

s1 == s2

Compares two bitstring objects for equality, returning True if they have the same

binary representation, otherwise returning False.

>>> Bits(’0o7777’)== ’0xfff’

True

>>> a=Bits(uint=13, length=8)

>>> b=Bits(uint=13, length=10)

>>> a== b

False

__getitem__(key)

s[start:end:step]

Returns a slice of the bitstring.

The usual slice behaviour applies except that the step parameter gives a multiplica-

tive factor for start and end (i.e. the bits ‘stepped over’ are included in the

slice).

>>> s = Bits('0x0123456')
>>> s[4:8]
Bits('0x1')

>>> s[0:3:8]

Bits(’0x012345’)

If a single element is asked for then either True or False will be returned.

>>> s[0]
False
>>> s[-2]
True

__hash__()

hash(s)

Returns an integer hash of the Bits.

This method is not available for the BitArray or BitStream classes, as only

immutable objects should be hashed. You typically won’t need to call it directly,

instead it is used for dictionary keys and in sets.

__invert__()

Returns the bitstring with every bit inverted, that is all zeros replaced with ones, and

all ones replaced with zeros.

If the bitstring is empty then an Error will be raised.

>>> s = ConstBitStream('0b1110010')
>>> print(~s)
0b0001101
>>> print(~s & s)
0b0000000

__len__()

len(s)

Returns the length of the bitstring in bits if it is less than sys.maxsize, otherwise

raises OverflowError.

It’s recommended that you use the len property rather than the len function

because of the function’s behaviour for large bitstring objects, although calling the

special function directly will always work.

>>> s=Bits(filename=’11GB.mkv’)

>>> s.len

93944160032

>>> len(s)

OverflowError: long int too large to convert to int

>>> s.__len__()

93944160032
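The OverflowError comes from Python's built-in `len` rather than from bitstring. A small sketch, assuming CPython and using a hypothetical `HugeContainer` class that is not part of bitstring, reproduces the same behaviour:

```python
import sys


class HugeContainer:
    """Hypothetical stand-in for an object with more than sys.maxsize items."""

    def __len__(self):
        return sys.maxsize + 1


huge = HugeContainer()

try:
    len(huge)              # the built-in insists on an index-sized integer
    overflowed = False
except OverflowError:
    overflowed = True

direct_length = huge.__len__()  # calling the special method directly still works

print(overflowed, direct_length == sys.maxsize + 1)
```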

__lshift__(n)

s << n

Returns the bitstring with its bits shifted n places to the left. The n right-most bits

will become zeros.

>>> s=Bits(’0xff’)

>>> s<< 4

Bits(’0xf0’)

__mul__(n)

__rmul__(n)

s * n / n * s

Returns a bitstring consisting of n concatenations of another.

>>> a=Bits(’0x34’)

>>> b=a*5

>>> print(b)

0x3434343434

__ne__(bs)

s1 != s2

Compares two bitstring objects for inequality, returning False if they have the

same binary representation, otherwise returning True.

__nonzero__()

See __bool__.

__or__(bs)

__ror__(bs)

s1 | s2

Returns the bit-wise OR between two bitstrings, which must have the same length

otherwise a ValueError is raised.

>>> print(Bits(’0x33’)|’0x0f’)

0x3f

__repr__()

repr(s)

A representation of the bitstring that could be used to create it (which will often not

be the form used to create it).

If the result is too long then it will be truncated with ... and the length of the

whole will be given.

>>> Bits('0b11100011')
Bits('0xe3')

__rshift__(n)

s >> n

Returns the bitstring with its bits shifted n places to the right. The n left-most bits

will become zeros.

>>> s = Bits('0xff')
>>> s >> 4
Bits('0x0f')

__str__()

print(s)

Used to print a representation of the bitstring, trying to be as brief as possible.

If the bitstring is a multiple of 4 bits long then hex will be used, otherwise either

binary or a mix of hex and binary will be used. Very long strings will be truncated

with ....

>>> s=Bits(’0b1’)*7

>>> print(s)

0b1111111

>>> print(s +’0b1’)

0xff

__xor__(bs)

__rxor__(bs)

s1 ^ s2

Returns the bit-wise XOR between two bitstrings, which must have the same length

otherwise a ValueError is raised.

>>> print(Bits(’0x33’)^’0x0f’)

0x3c

CHAPTER

TWELVE

THE BITARRAY CLASS

class bitstring.BitArray([auto,length,offset,**kwargs ])

The Bits class is the base class for BitArray and so (with the exception of

__hash__) all of its methods are also available for BitArray objects. The initialiser

is also the same as for Bits and so won’t be repeated here.

A BitArray is a mutable Bits, and so the one thing all of the methods listed here

have in common is that they can modify the contents of the bitstring.

append(bs)

Join a BitArray to the end of the current BitArray.

>>> s=BitArray(’0xbad’)

>>> s.append(’0xf00d’)

>>> s

BitArray(’0xbadf00d’)

byteswap([fmt,start,end,repeat=True ])

Change the endianness of the BitArray in-place according to fmt. Return the

number of swaps done.

The fmt can be an integer, an iterable of integers or a compact format string similar

to those used in pack (described in Compact format strings). It defaults to 0, which

means reverse as many bytes as possible. The fmt gives a pattern of byte sizes to use

to swap the endianness of the BitArray. Note that if you use a compact format

string then the endianness identifier (<, > or @) is not needed, and if present it will

be ignored.

start and end optionally give a slice to apply the transformation to (it defaults to the

whole BitArray). If repeat is True then the byte swapping pattern given by the

fmt is repeated in its entirety as many times as possible.

>>> s=BitArray(’0x00112233445566’)

>>> s.byteswap(2)

>>> s

BitArray(’0x11003322554466’)

>>> s.byteswap(’h’)

>>> s

BitArray(’0x00112233445566’)

>>> s.byteswap([2,5])

>>> s

BitArray(’0x11006655443322’)

It can also be used to swap the endianness of the whole BitArray.

>>> s=BitArray(’uintle:32=1234’)

>>> s.byteswap()

>>> print(s.uintbe)

1234
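As a rough illustration of how an integer or list fmt is applied, the following sketch works on plain bytes rather than on a BitArray. The function name `byteswap_sketch` is hypothetical, compact format strings and the start and end arguments are left out, and the returned count (one per application of the whole pattern) is an assumption about what "number of swaps" means.

```python
def byteswap_sketch(data, fmt=0, repeat=True):
    """Return (swapped_bytes, pattern_applications) for an int or iterable fmt."""
    if isinstance(fmt, int):
        # 0 means "reverse as many bytes as possible", i.e. all of them.
        sizes = [fmt if fmt else len(data)]
    else:
        sizes = list(fmt)
    pattern_length = sum(sizes)

    out = bytearray(data)
    pos = 0
    applications = 0
    # Apply the whole pattern only while it still fits in the remaining bytes.
    while pattern_length > 0 and pos + pattern_length <= len(out):
        for size in sizes:
            out[pos:pos + size] = out[pos:pos + size][::-1]
            pos += size
        applications += 1
        if not repeat:
            break
    return bytes(out), applications


data = bytes.fromhex("00112233445566")
print(byteswap_sketch(data, 2)[0].hex())       # 11003322554466
print(byteswap_sketch(data, [2, 5])[0].hex())  # 11006655443322
```

The trailing byte 0x66 is left alone in the first call because a complete two-byte pattern no longer fits.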

insert(bs,pos)

Inserts bs at pos.

When used with the BitStream class the pos is optional, and if not present the

current bit position will be used. After insertion the property pos will be

immediately after the inserted bitstring.

>>> s=BitArray(’0xccee’)

>>> s.insert(’0xd’,8)

>>> s

BitArray(’0xccdee’)

>>> s.insert(’0x00’)

>>> s

BitArray(’0xccd00ee’)

invert([pos ])

Inverts one or many bits from 1 to 0 or vice versa.

pos can be either a single bit position or an iterable of bit positions.

Negative numbers are treated in the same way as slice indices and it will raise

IndexError if pos < -s.len or pos > s.len. The default is to invert

the entire BitArray.

>>> s = BitArray('0b111001')
>>> s.invert(0)
>>> s.bin
'011001'
>>> s.invert([-2, -1])
>>> s.bin
'011010'
>>> s.invert()
>>> s.bin
'100101'

overwrite(bs,pos)

Replaces the contents of the current BitArray with bs at pos.

When used with the BitStream class the pos is optional, and if not present the

current bit position will be used. After overwriting the property pos will be

immediately after the overwritten bitstring.

>>> s=BitArray(length=10)

>>> s.overwrite(’0b111’,3)

>>> s

BitArray(’0b0001110000’)

>>> s.pos

prepend(bs)

Inserts bs at the beginning of the current BitArray.

>>> s=BitArray(’0b0’)

>>> s.prepend(’0xf’)

>>> s

BitArray(’0b11110’)

replace(old,new[,start,end,count,bytealigned ])

Finds occurrences of old and replaces them with new. Returns the number of

replacements made.

If bytealigned is True then replacements will only be made on byte boundaries.

start and end give the search range and default to 0 and len respectively. If count

is specified then no more than this many replacements will be made.

>>> s=BitArray(’0b0011001’)

>>> s.replace(’0b1’,’0xf’)

>>> print(s.bin)

0011111111001111

>>> s.replace(’0b1’,’’, count=6)

>>> print(s.bin)

0011001111

reverse([start,end ])

Reverses bits in the BitArray in-place.

start and end give the range and default to 0 and len respectively.

>>> a=BitArray(’0b10111’)

>>> a.reverse()

>>> a.bin

’11101’

rol(bits[,start,end ])

Rotates the contents of the BitArray in-place by bits bits to the left.

start and end define the slice to use and default to 0 and len respectively.

Raises ValueError if bits < 0.

>>> s=BitArray(’0b01000001’)

>>> s.rol(2)

>>> s.bin

’00000101’

ror(bits[,start,end ])

Rotates the contents of the BitArray in-place by bits bits to the right.

start and end define the slice to use and default to 0 and len respectively.

Raises ValueError if bits < 0.
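The two rotations are mirror images of each other. The sketch below shows the idea on strings of '0' and '1' characters rather than on a BitArray; the helper names `rol_bits` and `ror_bits` are hypothetical, the start and end arguments are omitted, and returning an empty string unchanged is a simplification of this sketch.

```python
def rol_bits(bits, n):
    """Rotate a '0'/'1' string n places to the left."""
    if n < 0:
        raise ValueError("cannot rotate by a negative amount")
    if not bits:
        return bits
    n %= len(bits)
    return bits[n:] + bits[:n]


def ror_bits(bits, n):
    """Rotate a '0'/'1' string n places to the right."""
    if n < 0:
        raise ValueError("cannot rotate by a negative amount")
    if not bits:
        return bits
    split = len(bits) - (n % len(bits))
    return bits[split:] + bits[:split]


print(rol_bits("01000001", 2))  # 00000101
print(ror_bits("00000101", 2))  # 01000001
```

Rotating right by the same amount undoes a left rotation, which is a quick way to check either function.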

set(value[,pos ])

Sets one or many bits to either 1 (if value is True) or 0 (if value isn’t True). pos

can be either a single bit position or an iterable of bit positions. Negative numbers

are treated in the same way as slice indices and it will raise IndexError if pos

< -s.len or pos > s.len. The default is to set every bit in the BitArray.

Using s.set(True, x) can be more efficient than other equivalent methods

such as s[x] = 1, s[x] = '0b1' or s.overwrite('0b1', x),

especially if many bits are being set.

>>> s=BitArray(’0x0000’)

>>> s.set(True,-1)

>>> print(s)

0x0001

>>> s.set(1, (0,4,5,7,9))

>>> s.bin

’1000110101000001’

>>> s.set(0)

>>> s.bin

’0000000000000000’

bin

Writable version of Bits.bin.

bool

Writable version of Bits.bool.

bytes

Writable version of Bits.bytes.

hex

Writable version of Bits.hex.

int

Writable version of Bits.int.

When used as a setter the value must ﬁt into the current length of the BitArray,

else a ValueError will be raised.

>>> s=BitArray(’0xf3’)

>>> s.int

-13

>>> s.int =1232

ValueError: int 1232 is too large for a BitArray of length 8.
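The range check behind this error can be sketched in plain Python. The helper `fits_signed` is hypothetical and simply states the two's complement range for a given length; it is not the library's own check.

```python
def fits_signed(value, length):
    """True if value fits in a two's complement field of `length` bits."""
    return -(1 << (length - 1)) <= value < (1 << (length - 1))


# The single byte 0xf3 read as an 8-bit two's complement number.
as_signed = int.from_bytes(b"\xf3", byteorder="big", signed=True)

print(as_signed)             # -13
print(fits_signed(-13, 8))   # True
print(fits_signed(1232, 8))  # False: 8 bits only cover -128 to 127
```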

intbe

Writable version of Bits.intbe.

When used as a setter the value must ﬁt into the current length of the BitArray,

else a ValueError will be raised.

intle

Writable version of Bits.intle.

When used as a setter the value must ﬁt into the current length of the BitArray,

else a ValueError will be raised.

intne

Writable version of Bits.intne.

When used as a setter the value must ﬁt into the current length of the BitArray,

else a ValueError will be raised.

float

floatbe

Writable version of Bits.float.

floatle

Writable version of Bits.floatle.

floatne

Writable version of Bits.floatne.

oct

Writable version of Bits.oct.

se

Writable version of Bits.se.

ue

Writable version of Bits.ue.

sie

Writable version of Bits.sie.

uie

Writable version of Bits.uie.

uint

Writable version of Bits.uint.

When used as a setter the value must ﬁt into the current length of the BitArray,

else a ValueError will be raised.

uintbe

Writable version of Bits.uintbe.

When used as a setter the value must ﬁt into the current length of the BitArray,

else a ValueError will be raised.

uintle

Writable version of Bits.uintle.

When used as a setter the value must ﬁt into the current length of the BitArray,

else a ValueError will be raised.

uintne

Writable version of Bits.uintne.

When used as a setter the value must ﬁt into the current length of the BitArray,

else a ValueError will be raised.

__delitem__(key)

del s[start:end:step]

Deletes the slice speciﬁed.

__iadd__(bs)

s1 += s2

Appends bs to the current bitstring.

Note that for BitArray objects this will be an in-place change, whereas for Bits

objects using += will not call this method - instead a new object will be created (it

is equivalent to a copy and an __add__).

>>> s=BitArray(ue=423)

>>> s+= BitArray(ue=12)

>>> s.read(’ue’)

423

>>> s.read(’ue’)

__iand__(bs)

s &= bs

In-place bit-wise AND between two bitstrings. If the two bitstrings are not the same

length then a ValueError is raised.

__ilshift__(n)

s <<= n

Shifts the bits in-place n bits to the left. The n right-most bits will become zeros

and bits shifted off the left will be lost.

__imul__(n)

s*= n

In-place concatenation of n copies of the current bitstring.

>>> s=BitArray(’0xbad’)

>>> s*=3

>>> s.hex

’badbadbad’

__ior__(bs)

s |= bs

In-place bit-wise OR between two bitstrings. If the two bitstrings are not the same

length then a ValueError is raised.

__irshift__(n)

s >>= n

Shifts the bits in-place n bits to the right. The n left-most bits will become zeros

and bits shifted off the right will be lost.

__ixor__(bs)

s ^= bs

In-place bit-wise XOR between two bitstrings. If the two bitstrings are not the same

length then a ValueError is raised.

__setitem__(key,value)

s1[start:end:step] = s2

Replaces the slice speciﬁed with a new value.

>>> s=BitArray(’0x00112233’)

>>> s[1:2:8]=’0xfff’

>>> print(s)

0x00fff2233

>>> s[-12:] =’0xc’

>>> print(s)

0x00fff2c

CHAPTER

THIRTEEN

THE CONSTBITSTREAM CLASS

class bitstring.ConstBitStream([auto,length,offset,**kwargs ])

The Bits class is the base class for ConstBitStream and so all of its methods are

also available for ConstBitStream objects. The initialiser is also the same as for

Bits and so won’t be repeated here.

A ConstBitStream is a Bits with added methods and properties that allow it to be

parsed as a stream of bits.

bytealign()

Aligns to the start of the next byte (so that pos is a multiple of 8) and returns the

number of bits skipped.

If the current position is already byte aligned then it is unchanged.

>>> s = ConstBitStream('0xabcdef')
>>> s.pos += 3
>>> s.bytealign()
5
>>> s.pos
8
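The arithmetic behind byte alignment is short enough to sketch directly. The function `bytealign_sketch` is a hypothetical stand-in that takes a bit position and returns the new position together with the number of bits skipped.

```python
def bytealign_sketch(pos):
    """Return (aligned_pos, bits_skipped) for a bit position."""
    skipped = (8 - pos % 8) % 8  # zero when pos is already a multiple of 8
    return pos + skipped, skipped


print(bytealign_sketch(3))   # (8, 5)
print(bytealign_sketch(16))  # (16, 0)
```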

peek(fmt)

Reads from the current bit position pos in the bitstring according to the fmt string

or integer and returns the result.

The bit position is unchanged.

For information on the format string see the entry for the read method.

>>> s=ConstBitStream(’0x123456’)

>>> s.peek(16)

ConstBitStream(’0x1234’)

>>> s.peek(’hex:8’)

’12’

peeklist(fmt,**kwargs)

Reads from current bit position pos in the bitstring according to the fmt string or

iterable and returns a list of results.

A dictionary or keyword arguments can also be provided. These will replace length

identiﬁers in the format string. The position is not advanced to after the read items.

See the entries for read and readlist for more information.

read(fmt)

Reads from current bit position pos in the bitstring according to the format string

and returns a single result. If not enough bits are available then a ReadError is

raised.

fmt is either a token string that describes how to interpret the next bits in the bitstring

or an integer. If it’s an integer then that number of bits will be read, and returned as

a new bitstring. Otherwise the tokens are:

int:n n bits as a signed integer.

uint:n n bits as an unsigned integer.

float:n n bits as a ﬂoating point number.

intbe:n n bits as a big-endian signed integer.

uintbe:n n bits as a big-endian unsigned integer.

floatbe:n n bits as a big-endian ﬂoat.

intle:n n bits as a little-endian signed int.

uintle:n n bits as a little-endian unsigned int.

floatle:n n bits as a little-endian ﬂoat.

intne:n n bits as a native-endian signed int.

uintne:n n bits as a native-endian unsigned int.

floatne:n n bits as a native-endian ﬂoat.

hex:n n bits as a hexadecimal string.

oct:n n bits as an octal string.

bin:n n bits as a binary string.

ue next bits as an unsigned exp-Golomb.

se next bits as a signed exp-Golomb.

uie next bits as an interleaved unsigned exp-Golomb.

sie next bits as an interleaved signed exp-Golomb.

bits:n n bits as a new bitstring.

bytes:n n bytes as bytes object.

bool next bit as a boolean (True or False).

For example:

>>> s = ConstBitStream('0x23ef55302')
>>> s.read('hex:12')
'23e'
>>> s.read('bin:4')
'1111'
>>> s.read('uint:5')
10
>>> s.read('bits:4')
ConstBitStream('0xa')
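To show how a token such as hex:12 combines a length with an interpretation, here is a toy reader over a string of '0' and '1' characters. The class `BitReaderSketch` is hypothetical and handles only the bin, uint and hex tokens; it raises IndexError where the library is documented to raise ReadError.

```python
class BitReaderSketch:
    """Toy stand-in for a bit stream, built from a string of hex digits."""

    def __init__(self, hex_digits):
        self.bits = "".join(format(int(d, 16), "04b") for d in hex_digits)
        self.pos = 0

    def read(self, token):
        name, _, length = token.partition(":")
        n = int(length)
        if self.pos + n > len(self.bits):
            raise IndexError("not enough bits left to read")
        chunk = self.bits[self.pos:self.pos + n]
        if name == "bin":
            result = chunk
        elif name == "uint":
            result = int(chunk, 2)
        elif name == "hex":
            if n % 4:
                raise ValueError("hex needs a multiple of 4 bits")
            result = format(int(chunk, 2), "0{}x".format(n // 4))
        else:
            raise ValueError("unsupported token: " + token)
        self.pos += n  # only advance once the read has succeeded
        return result


reader = BitReaderSketch("23ef55302")
print(reader.read("hex:12"))  # 23e
print(reader.read("bin:4"))   # 1111
print(reader.read("uint:5"))  # 10
```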

The read method is useful for reading exponential-Golomb codes.

>>> s = ConstBitStream('se=-9, ue=4')
>>> s.read('se')
-9
>>> s.read('ue')
4

readlist(fmt,**kwargs)

Reads from current bit position pos in the bitstring according to the fmt string

or iterable and returns a list of results. If not enough bits are available then a

ReadError is raised.

A dictionary or keyword arguments can also be provided. These will replace length

identiﬁers in the format string. The position is advanced to after the read items.

See the entry for read for information on the format strings.

For multiple items you can separate using commas or given multiple parameters:

>>> s=ConstBitStream(’0x43fe01ff21’)

>>> s.readlist(’hex:8, uint:6’)

[’43’, 63]

>>> s.readlist([’bin:3’,’intle:16’])

[’100’, -509]

>>> s.pos =0

>>> s.readlist(’hex:b, uint:d’, b=8, d=6)

[’43’, 63]

readto(bs,bytealigned)

Reads up to and including the next occurrence of the bitstring bs and returns the

results. If bytealigned is True it will look for the bitstring starting only at whole-

byte positions.

Raises a ReadError if bs is not found, and ValueError if bs is empty.

>>> s=ConstBitStream(’0x47000102034704050647’)

>>> s.readto(’0x47’, bytealigned=True)

BitStream(’0x47’)

>>> s.readto(’0x47’, bytealigned=True)

BitStream(’0x0001020347’)

>>> s.readto(’0x47’, bytealigned=True)

BitStream(’0x04050647’)
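The behaviour in this example can be imitated on plain bytes, which corresponds to the byte-aligned case only. The function `readto_sketch` is hypothetical; it returns the chunk read together with the new position, and raises IndexError as a stand-in for ReadError.

```python
def readto_sketch(data, pos, marker):
    """Return (chunk, new_pos), reading from pos up to and including marker."""
    if not marker:
        raise ValueError("cannot search for an empty marker")
    found = data.find(marker, pos)
    if found == -1:
        raise IndexError("marker not found")
    end = found + len(marker)
    return data[pos:end], end


data = bytes.fromhex("47000102034704050647")
pos = 0
chunks = []
for _ in range(3):
    chunk, pos = readto_sketch(data, pos, b"\x47")
    chunks.append(chunk.hex())

print(chunks)  # ['47', '0001020347', '04050647']
```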

bytepos

Property for setting and getting the current byte position in the bitstring.

When used as a getter it will raise a ByteAlignError if the current position is not

byte aligned.

pos

bitpos

Read and write property for setting and getting the current bit position in the

bitstring. Can be set to any value from 0 to len.

The pos and bitpos properties are exactly equivalent - you can use whichever

you prefer.

if s.pos < 100:
    s.pos += 10

CHAPTER

FOURTEEN

THE BITSTREAM CLASS

class bitstring.BitStream([auto,length,offset,**kwargs ])

Both the BitArray and the ConstBitStream classes are base classes for

BitStream and so all of their methods are also available for BitStream objects.

The initialiser is also the same as for Bits and so won’t be repeated here.

A BitStream is a mutable container of bits with methods and properties that allow it

to be parsed as a stream of bits. There are no additional methods or properties in this

class - see its base classes (Bits, BitArray and ConstBitStream) for details.

CHAPTER

FIFTEEN

FUNCTIONS

bitstring.pack(format[,*values,**kwargs ])

Packs the values and keyword arguments according to the format string and returns a new

BitStream.

Parameters

•format – string with comma separated tokens

•values – extra values used to construct the BitStream

•kwargs – a dictionary of token replacements

Return type BitStream

The format string consists of comma separated tokens of the form name:length=value.

See the entry for read for more details.

The tokens can be ‘literals’, like 0xef,0b110,uint:8=55, etc. which just represent a set

sequence of bits.

They can also have the value missing, in which case the values contained in *values will be

used.

>>> a=pack(’bin:3, hex:4’,’001’,’f’)

>>> b=pack(’uint:10’,33)

A dictionary or keyword arguments can also be provided. These will replace items in the format

string.

>>> c=pack(’int:a=b’, a=10, b=20)

>>> d=pack(’int:8=a, bin=b, int:4=a’, a=7, b=’0b110’)

Plain names can also be used as follows:

>>> e=pack(’a, b, b, a’, a=’0b11’, b=’0o2’)

Tokens starting with an endianness identifier (<, > or @) imply a struct-like compact format

string (see Compact format strings). For example this packs three little-endian 16-bit integers:

>>> f=pack(’<3h’,12,3,108)
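Since the compact format is described as struct-like, the standard library's struct module gives a quick way to see which bytes such a format describes. This sketch uses only `struct`; that bitstring's pack yields the same 48 bits is an expectation based on the description above, not something checked here.

```python
import struct

# Three little-endian 16-bit signed integers, as in the '<3h' example.
packed = struct.pack("<3h", 12, 3, 108)
print(packed.hex())  # 0c0003006c00

# Unpacking with the same format recovers the original values.
values = struct.unpack("<3h", packed)
print(values)        # (12, 3, 108)
```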

And of course you can combine the different methods in a single pack.

A ValueError will be raised if the *values are not all used up by the format string, or if

a value provided doesn’t match the length specified by a token.

CHAPTER

SIXTEEN

EXCEPTIONS

exception bitstring.Error(Exception)

Base class for all module exceptions.

exception bitstring.InterpretError(Error,ValueError)

Inappropriate interpretation of binary data. For example using the ‘bytes’ property on a

bitstring that isn’t a whole number of bytes long.

exception bitstring.ByteAlignError(Error)

Whole-byte position or length needed.

exception bitstring.CreationError(Error,ValueError)

Inappropriate argument during bitstring creation.

exception bitstring.ReadError(Error,IndexError)

Reading or peeking past the end of a bitstring.

Part III

Appendices

Gathered together here are a few odds and ends that didn’t ﬁt well into either the user manual or

the reference section. The only unifying theme is that none of them provide any vital knowledge

about bitstring, and so they can all be safely ignored.

CHAPTER

SEVENTEEN

EXAMPLES

17.1 Creation

There are lots of ways of creating new bitstrings. The most ﬂexible is via the auto parameter,

which is used in this example.

# Multiple parts can be joined with a single expression...
s = BitArray('0x000001b3, uint:12=352, uint:12=288, 0x1, 0x3')

# and extended just as easily
s += 'uint:18=48000, 0b1, uint:10=4000, 0b100'

# To convert to an ordinary string use the bytes property
open('video.m2v', 'wb').write(s.bytes)

# The information can be read back with a similar syntax
start_code, width, height = s.readlist('hex:32, uint:12, uint:12')
aspect_ratio, frame_rate = s.readlist('2*bin:4')

17.2 Manipulation

s = BitArray('0x0123456789abcdef')

del s[4:8]                 # deletes the '1'
s.insert('0xcc', 12)       # inserts 'cc' between the '3' and '4'
s.overwrite('0b01', 30)    # changes the '6' to a '5'

# This replaces every '1' bit with a 5 byte Ascii string!
s.replace('0b1', BitArray(bytes='hello'))

del s[-1001:]              # deletes final 1001 bits
s.reverse()                # reverses whole BitString
s.prepend('uint:12=44')    # prepend a 12 bit integer

17.3 Parsing

This example creates a class that parses a structure that is part of the H.264 video standard.

class seq_parameter_set_data(object):
    def __init__(self, s):
        """Interpret next bits in BitString s as an SPS."""
        # Read and interpret bits in a single expression:
        self.profile_idc = s.read('uint:8')
        # Multiple reads in one go returns a list:
        self.constraint_flags = s.readlist('4*uint:1')
        self.reserved_zero_4bits = s.read('bin:4')
        self.level_idc = s.read('uint:8')
        self.seq_parameter_set_id = s.read('ue')
        if self.profile_idc in [100, 110, 122, 244, 44, 83, 86]:
            self.chroma_format_idc = s.read('ue')
            if self.chroma_format_idc == 3:
                self.separate_colour_plane_flag = s.read('uint:1')
            self.bit_depth_luma_minus8 = s.read('ue')
            self.bit_depth_chroma_minus8 = s.read('ue')
            # etc.

>>> s = BitStream('0x6410281bc0')
>>> sps = seq_parameter_set_data(s)
>>> print(sps.profile_idc)
100
>>> print(sps.level_idc)
40
>>> print(sps.reserved_zero_4bits)
0000
>>> print(sps.constraint_flags)
[0, 0, 0, 1]

17.4 Sieve of Eratosthenes

This classic (though inefﬁcient) method of calculating prime numbers uses a bitstring to store

whether each bit position represents a prime number. This takes much less memory than an

ordinary array.

def prime_sieve(top=1000000):
    b = BitArray(top)  # bitstring of '0' bits
    for i in xrange(2, top):
        if not b[i]:
            yield i
            # i is prime, so set all its multiples to '1'.
            b.set(True, xrange(i*i, top, i))

CHAPTER

EIGHTEEN

EXPONENTIAL-GOLOMB CODES

As this type of representation of integers isn’t as well known as the standard base-2

representation I thought that a short explanation of them might be welcome. This section can be safely

skipped if you’re not interested.

Exponential-Golomb codes represent integers using bit patterns that get longer for larger

numbers. For unsigned and signed numbers (the bitstring properties ue and se respectively) the

patterns start like this:

Bit pattern    Unsigned    Signed
1              0           0
010            1           1
011            2           -1
00100          3           2
00101          4           -2
00110          5           3
00111          6           -3
0001000        7           4
0001001        8           -4
0001010        9           5
0001011        10          -5
0001100        11          6
...            ...         ...

They consist of a sequence of n ‘0’ bits, followed by a ‘1’ bit, followed by n more bits. The

bits after the ﬁrst ‘1’ bit count upwards as ordinary base-2 binary numbers until they run out of

space and an extra ‘0’ bit needs to get included at the start.

The advantage of this method of representing integers over many other methods is that it can be

quite efﬁcient at representing small numbers without imposing a limit on the maximum number

that can be represented.
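The rule can be written down in a few lines of plain Python. This is a sketch that produces strings of '0' and '1' characters rather than bitstring objects; the function names `encode_ue` and `encode_se` are hypothetical, and the signed mapping is the one read off the table above (positive k maps to 2k-1, zero or negative k maps to -2k).

```python
def encode_ue(value):
    """Unsigned exponential-Golomb code as a '0'/'1' string."""
    if value < 0:
        raise ValueError("unsigned codes cannot represent negative numbers")
    binary = format(value + 1, "b")          # always starts with a '1'
    return "0" * (len(binary) - 1) + binary  # n zeros, then '1', then n more bits


def encode_se(value):
    """Signed exponential-Golomb code, via the mapping in the table above."""
    mapped = 2 * value - 1 if value > 0 else -2 * value
    return encode_ue(mapped)


for n in (0, 1, 2, 7):
    print(n, encode_ue(n))
print(-40, encode_se(-40))  # 0000001010001
```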

Exercise: Using the table above decode this sequence of unsigned Exponential Golomb codes:

001001101101101011000100100101

The answer is that it decodes to 3, 0, 0, 2, 2, 1, 0, 0, 8, 4. Note how you don’t need to know

how many bits are used for each code in advance - there’s only one way to decode it. To create

this bitstring you could have written something like:

a = BitStream().join([BitArray(ue=i) for i in [3, 0, 0, 2, 2, 1, 0, 0, 8, 4]])

and to read it back:

while a.pos != a.len:
    print(a.read('ue'))
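For readers who want to check the exercise without installing anything, the decoding can also be sketched over a plain string of '0' and '1' characters. The function `decode_ue_stream` is hypothetical and not part of bitstring.

```python
def decode_ue_stream(bits):
    """Decode back-to-back unsigned exponential-Golomb codes."""
    values = []
    pos = 0
    while pos < len(bits):
        # Count the leading zeros; they say how many bits follow the first '1'.
        zeros = 0
        while pos < len(bits) and bits[pos] == "0":
            zeros += 1
            pos += 1
        end = pos + zeros + 1
        if end > len(bits):
            raise ValueError("stream ends in the middle of a code")
        values.append(int(bits[pos:end], 2) - 1)
        pos = end
    return values


print(decode_ue_stream("001001101101101011000100100101"))
# [3, 0, 0, 2, 2, 1, 0, 0, 8, 4]
```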

The notation ue and se for the exponential-Golomb code properties comes from the H.264

video standard, which uses these types of code a lot. There are other ways to map the bitstrings

to integers:

18.1 Interleaved exponential-Golomb codes

This type of code is used in the Dirac video standard, and is represented by the attributes uie

and sie. For the interleaved codes the pattern is very similar to before for the unsigned case:

Bit pattern    Unsigned
1              0
001            1
011            2
00001          3
00011          4
01001          5
01011          6
0000001        7
0000011        8
0001001        9
...            ...

For the signed code it looks a little different:

Bit pattern    Signed
1              0
0010           1
0011           -1
0110           2
0111           -2
000010         3
000011         -3
000110         4
000111         -4
010010         5
010011         -5
...            ...

I’m sure you can work out the pattern yourself from here!
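For anyone who would rather see the pattern spelled out, here is a sketch in plain Python that reproduces the rows of both tables. The function names are hypothetical, and encoding zero as the single bit '1' is an assumption that follows from the same construction.

```python
def encode_uie(value):
    """Unsigned interleaved exponential-Golomb code as a '0'/'1' string."""
    if value < 0:
        raise ValueError("unsigned codes cannot represent negative numbers")
    tail = format(value + 1, "b")[1:]  # the bits after the leading '1'
    # Each data bit is preceded by a '0'; a final '1' ends the code.
    return "".join("0" + bit for bit in tail) + "1"


def encode_sie(value):
    """Signed variant: the unsigned code of abs(value) plus a sign bit."""
    code = encode_uie(abs(value))
    if value == 0:
        return code  # zero carries no sign bit
    return code + ("1" if value < 0 else "0")


print(encode_uie(9))   # 0001001
print(encode_sie(-5))  # 010011
```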

CHAPTER

NINETEEN

OPTIMISATION TECHNIQUES

The bitstring module aims to be as fast as reasonably possible, and although there is more

work to be done optimising some operations it is currently quite well optimised without

resorting to C extensions.

There are however some pointers you should follow to make your code efﬁcient, so if you need

things to run faster then this is the section for you.

19.1 Use combined read and interpretation

When parsing a bitstring one way to write code is in the following style:

width =s.read(12).uint

height =s.read(12).uint

flags =s.read(4).bin

This works ﬁne, but is not very quick. The problem is that the call to read constructs and

returns a new bitstring, which then has to be interpreted. The new bitstring isn’t used for

anything else and so creating it is wasted effort. Instead it is better to use a string parameter

that does the read and interpretation together:

width =s.read(’uint:12’)

height =s.read(’uint:12’)

flags =s.read(’bin:4’)

This is much faster, although probably not as fast as the combined call:

width, height, flags =s.readlist(’uint:12, uint:12, bin:4’)

19.2 Choose the simplest class you can

If you don’t need to modify your bitstring after creation then prefer the immutable Bits over

the mutable BitArray. This is typically the case when parsing, or when creating directly

from ﬁles.

The speed difference between the classes is noticeable, and there are also memory usage

optimisations that are made if objects are known to be immutable.

You should also prefer ConstBitStream to BitStream if you won’t need to modify any

bits.

One anti-pattern to watch out for is using += on a Bits object. For example, don’t do this:

s = Bits()
for i in range(1000):
    s += '0xab'

Now this is inefﬁcient for a few reasons, but the one I’m highlighting is that as the immutable

bitstring doesn’t have an __iadd__ special method the ordinary __add__ gets used instead.

In other words s += ’0xab’ gets converted to s = s + ’0xab’, which creates a new

Bits from the old on every iteration. This isn’t what you’d want or possibly expect. If s had

been a BitArray then the addition would have been done in-place, and have been much more

efficient.
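The same distinction exists between Python's built-in bytes and bytearray types, which the release notes describe as loose counterparts of Bits and BitArray. The sketch below uses only those built-ins, so it illustrates the language rule rather than measuring bitstring itself.

```python
# Immutable: += falls back to + and rebinds the name to a new object.
immutable = b""
before = immutable
immutable += b"\xab"
made_new_object = immutable is not before

# Mutable: += calls __iadd__ and extends the existing object in place.
mutable = bytearray()
alias = mutable
mutable += b"\xab"
extended_in_place = mutable is alias

print(made_new_object, extended_in_place)  # True True
```

Inside a loop the first form copies everything accumulated so far on each pass, which is where the cost comes from.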

19.3 Use dedicated functions for bit setting and checking

If you need to set or check individual bits then there are special functions for this. For example

one way to set bits would be:

s = BitArray(1000)
for p in [14, 34, 501]:
    s[p] = '0b1'

This creates a 1000 bit bitstring and sets three of the bits to ‘1’. Unfortunately the crucial

line spends most of its time creating a new bitstring from the ‘0b1’ string. You could make it

slightly quicker by using s[p] = True, but it is much faster (and I mean at least an order of

magnitude) to use the set method:

s=BitArray(1000)

s.set(True, [14,34,501])

As well as set and invert there are also checking methods all and any. So rather than

using

if s[100] and s[200]:
    do_something()

it’s better to say

if s.all(True, (100, 200)):
    do_something()

CHAPTER

TWENTY

RELEASE NOTES

20.1 Full Version History

20.1.1 November 21st 2011: version 3.0.0 released

This is a major release which breaks backward compatibility in a few places.

20.1.2 Backwardly incompatible changes

Hex, oct and bin properties don’t have leading 0x, 0o and 0b

If you ask for the hex, octal or binary representations of a bitstring then they will no longer

be preﬁxed with 0x,0o or 0b. This was done as it was noticed that the ﬁrst thing a lot of

user code does after getting these representations was to cut off the ﬁrst two characters before

further processing.

>>> a=BitArray(’0x123’)

>>> a.hex, a.oct, a.bin

(’123’, ’0443’, ’000100100011’)

Previously this would have returned (’0x123’, ’0o0443’, ’0b000100100011’)

This change might require some recoding, but it should all be simpliﬁcations.

ConstBitArray renamed to Bits

Previously Bits was an alias for ConstBitStream (for backward compatibility). This has now

changed so that Bits and BitArray loosely correspond to the built-in types bytes and bytearray.

If you were using streaming/reading methods on a Bits object then you will have to change it

to a ConstBitStream.

The ConstBitArray name is kept as an alias for Bits.

Stepping in slices has conventional meaning

The step parameter in __getitem__, __setitem__ and __delitem__ used to act as a

multiplier for the start and stop parameters. No one seemed to use it though and so it has now

reverted to the conventional meaning for containers.

If you are using step then recoding is simple: s[a:b:c] becomes s[a*c:b*c].

Some examples of the new usage:

>>> s = BitArray('0x0000')
>>> s[::4] = [1, 1, 1, 1]
>>> s.hex
'8888'
>>> del s[8::2]
>>> s.hex
'880'
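Because the step now has the conventional container meaning, an ordinary Python list of bits behaves the same way. The sketch below repeats the example with a list and also spells out the recoding rule for old code; the helper `to_hex` and the values a, b and c are illustrative only.

```python
def to_hex(bits):
    """Render a list of 0/1 values (a multiple of 4 long) as hex digits."""
    text = "".join(str(b) for b in bits)
    return format(int(text, 2), "0{}x".format(len(bits) // 4))


bits = [0] * 16
bits[::4] = [1, 1, 1, 1]   # set every fourth bit
hex_before = to_hex(bits)

del bits[8::2]             # from bit 8 onwards, delete every other bit
hex_after = to_hex(bits)

print(hex_before, hex_after)  # 8888 880

# Recoding old-style slices: s[a:b:c] becomes s[a*c:b*c].
a, b, c = 1, 2, 8
new_start, new_end = a * c, b * c  # the old s[1:2:8] is now s[8:16]
```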

20.1.3 New features

New readto method

This method is a mix between a ﬁnd and a read - it searches for a bitstring and then reads up to

and including it. For example:

>>> s=ConstBitStream(’0x47000102034704050647’)

>>> s.readto(’0x47’, bytealigned=True)

BitStream(’0x47’)

>>> s.readto(’0x47’, bytealigned=True)

BitStream(’0x0001020347’)

>>> s.readto(’0x47’, bytealigned=True)

BitStream(’0x04050647’)

pack function accepts an iterable as its format

Previously only a string was accepted as the format in the pack function. This was an oversight

as it broke the symmetry between pack and unpack. Now you can use formats like this:

fmt = ['hex:8', 'bin:3']
a = pack(fmt, '47', '001')
a.unpack(fmt)

20.1.4 June 18th 2011: version 2.2.0 released

This is a minor upgrade with a couple of new features.

New interleaved exponential-Golomb interpretations

New bit interpretations for interleaved exponential-Golomb (as used in the Dirac video codec)

are supplied via uie and sie:

>>> s = BitArray(uie=41)
>>> s.uie
41
>>> s.bin
'0b00010001001'

These are pretty similar to the non-interleaved versions - see the manual for more details. Credit

goes to Paul Sargent for the patch.
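For readers unfamiliar with the interleaved code, here is a minimal pure-Python sketch of the unsigned variant. It assumes the usual construction (write value + 1 in binary, drop the leading 1, prefix each remaining bit with a 0 and terminate with a 1), which reproduces the bit pattern shown above for 41. The function names are hypothetical and this is not the library's code.

```python
def encode_uie(value):
    """Encode a non-negative integer as an interleaved exp-Golomb bit string."""
    if value < 0:
        raise ValueError("unsigned code needs a non-negative value")
    binary = bin(value + 1)[2:]                     # e.g. 41 -> '101010'
    body = "".join("0" + bit for bit in binary[1:])  # each data bit follows a 0
    return body + "1"                                # a 1 terminates the code


def decode_uie(code):
    """Decode a well-formed interleaved exp-Golomb bit string."""
    value = 1
    i = 0
    while code[i] == "0":                            # a 0 means another data bit follows
        value = (value << 1) | int(code[i + 1])
        i += 2
    return value - 1


print(encode_uie(41), decode_uie(encode_uie(41)))
```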

New package-level bytealigned variable

A number of methods take a bytealigned parameter to indicate that they should only work
on byte boundaries (e.g. find, replace, split). Previously this parameter defaulted to
False. Instead it now defaults to bitstring.bytealigned, which itself defaults to
False, but can be changed to modify the default behaviour of the methods. For example:

>>> a = BitArray('0x00 ff 0f ff')
>>> a.find('0x0f')
(4,)    # found first not on a byte boundary
>>> a.find('0x0f', bytealigned=True)
(16,)   # forced looking only on byte boundaries
>>> bitstring.bytealigned = True  # Change default behaviour
>>> a.find('0x0f')
(16,)
>>> a.find('0x0f', bytealigned=False)
(4,)

If you’re only working with bytes then this can help avoid some errors and save some typing!
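The pattern behind this (a keyword argument that falls back to a changeable package-level default) can be sketched as follows. The `DEFAULTS` dictionary and this `find` function are assumptions made for illustration; they work on strings of '0' and '1' rather than real bitstrings.

```python
DEFAULTS = {"bytealigned": False}   # stand-in for the package-level variable


def find(bits, pattern, bytealigned=None):
    """Return (pos,) for the first match, or () if the pattern is absent."""
    if bytealigned is None:          # no explicit choice: use the shared default
        bytealigned = DEFAULTS["bytealigned"]
    step = 8 if bytealigned else 1
    for pos in range(0, len(bits) - len(pattern) + 1, step):
        if bits.startswith(pattern, pos):
            return (pos,)
    return ()


bits = format(0x00FF0FFF, "032b")
pattern = "00001111"

unaligned = find(bits, pattern)
forced_aligned = find(bits, pattern, bytealigned=True)

DEFAULTS["bytealigned"] = True      # change the default behaviour
default_aligned = find(bits, pattern)
forced_unaligned = find(bits, pattern, bytealigned=False)

print(unaligned, forced_aligned, default_aligned, forced_unaligned)
```

Using None as the parameter default is what lets the default be changed after the function has been defined.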

Other changes

• Fix for Python 3.2, correcting for a change to the binascii module.

• Fix for bool initialisation from 0 or 1.

• Efficiency improvements, including interning strategy.

20.1.5 February 23rd 2011: version 2.1.1 released

This is a release to fix a couple of bugs that were introduced in 2.1.0.

• Bug fix: Reading using the ‘bytes’ token had been broken (Issue 102).

• Fixed problem using some methods on ConstBitArray objects.

• Better exception handling for tokens missing values.

• Some performance improvements.

20.1.6 January 23rd 2011: version 2.1.0 released

20.1.7 New class hierarchy introduced with simpler classes

Previously there were just two classes, the immutable Bits which was the base class for the
mutable BitString class. Both of these classes have the concept of a bit position, from
which reads etc. take place so that the bitstring could be treated as if it were a file or stream.

Two simpler classes have now been added which are purely bit containers and don’t have a bit
position. These are called ConstBitArray and BitArray. As you can guess the former
is an immutable version of the latter.

The other classes have also been renamed to better reflect their capabilities. Instead of
BitString you should use BitStream, and instead of Bits you can use ConstBitStream.
The old names are kept as aliases for backward compatibility.

The class hierarchy is:

        ConstBitArray
       /             \
  BitArray      ConstBitStream (formerly Bits)
       \             /
    BitStream (formerly BitString)

Other changes

A lot of internal reorganisation has taken place since the previous version, most of which won’t
be noticed by the end user. Some things you might see are:

• New package structure. Previous versions have been a single file for the module and
  another for the unit tests. The module is now split into many more files so it can’t be
  used just by copying bitstring.py any more.

• To run the unit tests there is now a script called runtests.py in the test directory.

• File-based bitstrings are now implemented in terms of an mmap. This should be just an
  implementation detail, but unfortunately for 32-bit versions of Python this creates a limit
  of 4GB on the files that can be used. The workaround is either to get a 64-bit Python, or
  just stick with version 2.0.

• The ConstBitArray and ConstBitStream classes no longer copy byte data when
  a slice or a read takes place, they just take a reference. This is mostly a very nice
  optimisation, but there are occasions where it could have an adverse effect. For example, if a
  very large bitstring is created, a small slice taken and the original deleted, the byte data
  from the large bitstring would still be retained in memory.

• Optimisations. Once again this version should be faster than the last. The module is still
  pure Python but some of the reorganisation was to make it more feasible to put some of
  the code into Cython or similar, so hopefully more speed will be on the way.
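The memory-retention effect of slices that reference rather than copy is not specific to bitstring. A rough analogy using only the standard library is a memoryview slice, which keeps the whole underlying buffer alive. This sketch illustrates the general effect only and says nothing about how the library stores its data.

```python
big = bytearray(1000000)        # stand-in for the byte data of a very large bitstring
small = memoryview(big)[:4]     # a "slice" that references the data instead of copying

del big                         # the original name is gone ...
retained = len(small.obj)       # ... but the full buffer is still held by the slice

independent = bytes(small)      # an explicit copy keeps only the four bytes
small.release()                 # dropping the view lets the large buffer be freed

print(retained, len(independent))
```

Where the retained memory matters, taking an explicit copy of the small piece is the usual way out.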

20.1.8 July 26th 2010: version 2.0.3 released

1. Bug fix: Using peek and read for a single bit now returns a new bitstring as was intended,
   rather than the old behaviour of returning a bool.

2. Removed HTML docs from source archive - better to use the online version.

20.1.9 July 25th 2010: version 2.0.2 released

This is a major release, with a number of backwardly incompatible changes. The main change

is the removal of many methods, all of which have simple alternatives. Other changes are quite

minor but may need some recoding.

There are a few new features, most of which have been made to help the streamlining of the
API. As always there are performance improvements, and some API changes were made purely
with future performance in mind.

20.1.10 The backwardly incompatible changes are:

Methods removed

About half of the class methods have been removed from the API. They all have simple
alternatives, so what remains is more powerful and easier to remember. The removed methods are
listed here on the left, with their equivalent replacements on the right:

s.advancebit()              -> s.pos += 1
s.advancebits(bits)         -> s.pos += bits
s.advancebyte()             -> s.pos += 8
s.advancebytes(bytes)       -> s.pos += 8*bytes
s.allunset([a, b])          -> s.all(False, [a, b])
s.anyunset([a, b])          -> s.any(False, [a, b])
s.delete(bits, pos)         -> del s[pos:pos+bits]
s.peekbit()                 -> s.peek(1)
s.peekbitlist(a, b)         -> s.peeklist([a, b])
s.peekbits(bits)            -> s.peek(bits)
s.peekbyte()                -> s.peek(8)
s.peekbytelist(a, b)        -> s.peeklist([8*a, 8*b])
s.peekbytes(bytes)          -> s.peek(8*bytes)
s.readbit()                 -> s.read(1)
s.readbitlist(a, b)         -> s.readlist([a, b])
s.readbits(bits)            -> s.read(bits)
s.readbyte()                -> s.read(8)
s.readbytelist(a, b)        -> s.readlist([8*a, 8*b])
s.readbytes(bytes)          -> s.read(8*bytes)
s.retreatbit()              -> s.pos -= 1
s.retreatbits(bits)         -> s.pos -= bits
s.retreatbyte()             -> s.pos -= 8
s.retreatbytes(bytes)       -> s.pos -= 8*bytes
s.reversebytes(start, end)  -> s.byteswap(0, start, end)
s.seek(pos)                 -> s.pos = pos
s.seekbyte(bytepos)         -> s.bytepos = bytepos
s.slice(start, end, step)   -> s[start:end:step]
s.tell()                    -> s.pos
s.tellbyte()                -> s.bytepos
s.truncateend(bits)         -> del s[-bits:]
s.truncatestart(bits)       -> del s[:bits]
s.unset([a, b])             -> s.set(False, [a, b])

Many of these methods have been deprecated for the last few releases, but there are some new

removals too. Any recoding needed should be quite straightforward, so while I apologise for

the hassle, I had to take the opportunity to streamline and rationalise what was becoming a bit

of an overblown API.

set / unset methods combined

The set/unset methods have been combined in a single method, which now takes a boolean
as its first argument:

s.set([a, b]) -> s.set(1, [a, b])

s.unset([a, b]) -> s.set(0, [a, b])

s.allset([a, b]) -> s.all(1, [a, b])

s.allunset([a, b]) -> s.all(0, [a, b])

s.anyset([a, b]) -> s.any(1, [a, b])

s.anyunset([a, b]) -> s.any(0, [a, b])

all / any only accept iterables

The all and any methods (previously called allset, allunset, anyset and
anyunset) no longer accept a single bit position. The recommended way of testing a single
bit is just to index it, for example instead of:

>>> if s.all(True, i):

just use

>>> if s[i]:

If you really want to you can of course use an iterable with a single element, such as

s.any(False, [i]), but it’s clearer just to write not s[i].

Exception raised on reading off end of bitstring

If a read or peek goes beyond the end of the bitstring then a ReadError will be raised. The

previous behaviour was that the rest of the bitstring would be returned and no exception raised.

BitStringError renamed to Error

The base class for errors in the bitstring module is now just Error, so it will
likely appear in your code as bitstring.Error instead of the rather repetitive
bitstring.BitStringError.

Single bit slices and reads return a bool

A single index slice (such as s[5]) will now return a bool (i.e. True or False) rather than a
single bit bitstring. This is partly to reflect the style of the bytearray type, which returns an
integer for single items, but mostly to avoid common errors like:

>>> if s[0]:
...     do_something()

While the intent of this code snippet is quite clear (i.e. do_something if the first bit of s is set)
under the old rules s[0] would be true as long as s wasn’t empty. That’s because any one-bit
bitstring was true as it was a non-empty container. Under the new rule s[0] is True if s starts
with a 1 bit and False if s starts with a 0 bit.

The change does not affect reads and peeks, so s.peek(1) will still return a single bit
bitstring, which leads on to the next item...

Empty bitstrings or bitstrings with only zero bits are considered False

Previously a bitstring was False if it had no elements, otherwise it was True. This is standard

behaviour for containers, but wasn’t very useful for a container of just 0s and 1s. The new

behaviour means that the bitstring is False if it has no 1 bits. This means that code like this:

>>> if s.peek(1):

... do_something()

should work as you’d expect. It also means that Bits(1000), Bits(0x00) and
Bits('uint:12=0') are all also False. If you need to check for the emptiness of a
bitstring then instead check the len property:

if s -> if s.len

if not s -> if not s.len
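The difference between "empty" and "contains no 1 bits" can be sketched with a tiny container that defines its own truth value. `TinyBits` is a hypothetical class written for this note; it only imitates the rule described above.

```python
class TinyBits:
    """Hypothetical bit container that is true only if it holds a 1 bit."""

    def __init__(self, bits):
        self._bits = [int(b) for b in bits]

    def __len__(self):
        return len(self._bits)

    def __bool__(self):
        # Without this method Python would fall back to len(self) != 0.
        return any(self._bits)

    __nonzero__ = __bool__      # Python 2 spelling of the same hook


zeros = TinyBits("0000")
empty = TinyBits("")
mixed = TinyBits("0100")

print(bool(zeros), len(zeros), bool(empty), len(empty), bool(mixed))
```

Both `zeros` and `empty` are false, so only the length distinguishes them, which is why the emptiness check has to use len.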

Length and offset disallowed for some initialisers

Previously you could create bitstrings using expressions like:

>>> s = Bits(hex='0xabcde', offset=4, length=13)

This has now been disallowed, and the offset and length parameters may only be used when
initialising with bytes or a file. To replace the old behaviour you could instead use

>>> s = Bits(hex='0xabcde')[4:17]

Renamed format parameter fmt

Methods with a format parameter have had it renamed to fmt, to prevent hiding the built-in
format. Affects methods unpack, read, peek, readlist, peeklist and byteswap
and the pack function.

Iterables instead of *format accepted for some methods

This means that for the affected methods (unpack, readlist and peeklist) you will
need to use an iterable to specify multiple items. This is easier to show than to describe, so
instead of

>>> a, b, c = s.readlist('uint:12', 'hex:4', 'bin:7')

you would instead write

>>> a, b, c = s.readlist(['uint:12', 'hex:4', 'bin:7'])

Note that you could still use the single string 'uint:12, hex:4, bin:7' if you preferred.

Bool auto-initialisation removed

You can no longer use True and False to initialise single bit bitstrings. The reasoning

behind this is that as bool is a subclass of int, it really is bad practice to have Bits(False)

be different to Bits(0) and to have Bits(True) different to Bits(1).

If you have used bool auto-initialisation then you will have to be careful to replace it as the

bools will now be interpreted as ints, so Bits(False) will be empty (a bitstring of length

0), and Bits(True) will be a single zero bit (a bitstring of length 1). Sorry for the confusion,

but I think this will prevent bigger problems in the future.

There are a few alternatives for creating a single bit bitstring. My favourite is to use a list with

a single item:

Bits(False) -> Bits([0])

Bits(True) -> Bits([1])
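The underlying Python fact is that bool is a subclass of int, so any dispatch on int also catches True and False. The sketch below uses a hypothetical `auto_bits` helper that returns a string of '0' and '1' characters; it mimics the rules described above and is not the library's initialiser.

```python
def auto_bits(value):
    """Hypothetical auto-initialiser: an int means 'this many zero bits'."""
    if isinstance(value, int):       # True and False pass this check as well
        return "0" * value
    return "".join("1" if item else "0" for item in value)


from_false = auto_bits(False)        # treated as the integer 0
from_true = auto_bits(True)          # treated as the integer 1
single_zero = auto_bits([0])
single_one = auto_bits([1])

print(issubclass(bool, int), repr(from_false), repr(from_true), single_zero, single_one)
```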

New creation from file strategy

Previously if you created a bitstring from a file, either by auto-initialising with a file object or
using the filename parameter, the file would not be read into memory unless you tried to modify
it, at which point the whole file would be read.

The new behaviour depends on whether you create a Bits or a BitString from the file. If you
create a Bits (which is immutable) then the file will never be read into memory. This allows
very large files to be opened for examination even if they could never fit in memory.

If however you create a BitString, the whole of the referenced file will be read to store in
memory. If the file is very big this could take a long time, or fail, but the idea is that in saying
you want the mutable BitString you are implicitly saying that you want to make changes and
so (for now) we need to load it into memory.

The new strategy is a bit more predictable in terms of performance than the old. The main point
to remember is that if you want to open a file and don’t plan to alter the bitstring then use the
Bits class rather than BitString.

Just to be clear, in neither case will the contents of the file ever be changed - if you want to
output the modified BitString then use the tofile method, for example.

find and rfind return a tuple instead of a bool

If a find is unsuccessful then an empty tuple is returned (which is False in a boolean sense)
otherwise a single item tuple with the bit position is returned (which is True in a boolean
sense). You shouldn’t need to recode unless you explicitly compared the result of a find to True
or False, for example this snippet doesn’t need to be altered:

>>> if s.find('0x23'):
...     print(s.bitpos)

but you could now instead use

>>> found = s.find('0x23')
>>> if found:
...     print(found[0])

The reason for returning the bit position in a tuple is so that finding at position zero can still be
True - it’s the tuple (0,) - whereas not found can be False - the empty tuple ().

20.1.11 The new features in this release are:

New count method

This method just counts the number of 1 or 0 bits in the bitstring.

>>> s = Bits('0x31fff4')
>>> s.count(1)
16

read and peek methods accept integers

The read, readlist, peek and peeklist methods now accept integers as parameters to
mean “read this many bits and return a bitstring”. This has allowed a number of methods to be
removed from this release, so for example instead of:

>>> a,b,c=s.readbits(5,6,7)

>>> if s.peekbit():

... do_something()

you should write:

>>> a,b,c=s.readlist([5,6,7])

>>> if s.peek(1):

... do_something()

byteswap used to reverse all bytes

The byteswap method now allows a format specifier of 0 (the default) to signify that all
of the whole bytes should be reversed. This means that calling just byteswap() is almost
equivalent to the now removed reversebytes() method (a small difference is that byteswap
won’t raise an exception if the bitstring isn’t a whole number of bytes long).

Auto initialise with bytearray or (for Python 3 only) bytes

So rather than writing:

>>> a=Bits(bytes=some_bytearray)

you can just write

>>> a=Bits(some_bytearray)

This also works for the bytes type, but only if you’re using Python 3. For Python 2 it’s not
possible to distinguish between a bytes object and a str. For this reason this method should be
used with some caution as it will make your code behave differently with the different major
Python versions.

>>> b = Bits(b'abcd\x23\x00')  # Only Python 3!

set,invert,all and any default to whole bitstring

This means that you can for example write:

>>> a = BitString(100)  # 100 zero bits
>>> a.set(1)            # set all bits to 1
>>> a.all(1)            # are all bits set to 1?
True
>>> a.any(0)            # are any set to 0?
False
>>> a.invert()          # invert every bit

New exception types

As well as renaming BitStringError to just Error there are also new exceptions which
use Error as a base class.

These can be caught in preference to Error if you need finer control. The new exceptions
sometimes also derive from built-in exceptions:

1. ByteAlignError(Error) - whole byte position or length needed.

2. ReadError(Error, IndexError) - reading or peeking off the end of the bitstring.

3. CreationError(Error, ValueError) - inappropriate argument during bitstring creation.

4. InterpretError(Error, ValueError) - inappropriate interpretation of binary data.
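Deriving from both Error and a built-in exception means that callers can catch either. The sketch below rebuilds the hierarchy listed above with locally defined classes; `read_bits` is a hypothetical helper working on a string of bits, used only to trigger the error.

```python
class Error(Exception):
    """Base class for all errors in this sketch."""


class ByteAlignError(Error):
    """Whole byte position or length needed."""


class ReadError(Error, IndexError):
    """Reading or peeking off the end."""


class CreationError(Error, ValueError):
    """Inappropriate argument during creation."""


class InterpretError(Error, ValueError):
    """Inappropriate interpretation of binary data."""


def read_bits(bits, pos, n):
    """Return n bits starting at pos, refusing to run off the end."""
    if pos + n > len(bits):
        raise ReadError("cannot read %d bits, only %d available" % (n, len(bits) - pos))
    return bits[pos:pos + n]


try:
    read_bits("1010", 2, 8)
    caught_as_index_error = False
except IndexError as exc:            # code written for built-in exceptions still works
    caught_as_index_error = isinstance(exc, Error)

print(caught_as_index_error)
```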

20.1.12 March 18th 2010: version 1.3.0 for Python 2.6 and 3.x released

20.1.13 New features

byteswap method for changing endianness

Changes the endianness in-place according to a format string or integer(s) giving the byte
pattern. See the manual for details.

>>> s = BitString('0x00112233445566')
>>> s.byteswap(2)
>>> s
BitString('0x11003322554466')
>>> s.byteswap('h')
>>> s
BitString('0x00112233445566')
>>> s.byteswap([2, 5])
>>> s
BitString('0x11006655443322')
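The integer and list forms of the byte pattern can be sketched on plain bytes. This hypothetical function returns a new bytes object instead of working in place, does not handle format strings such as 'h', and assumes that only whole repetitions of the pattern are swapped with any remainder left untouched.

```python
def byteswap(data, fmt=0):
    """Reverse bytes in groups given by fmt (0 means: reverse all of the bytes)."""
    if not data:
        return b""
    if isinstance(fmt, int):
        pattern = [fmt or len(data)]
    else:
        pattern = list(fmt)
    if not pattern or any(size <= 0 for size in pattern):
        raise ValueError("byte counts must be positive")

    out = bytearray()
    pos = 0
    for _ in range(len(data) // sum(pattern)):   # whole repetitions only (assumption)
        for size in pattern:
            out += data[pos:pos + size][::-1]
            pos += size
    out += data[pos:]                            # leftover bytes are kept as they are
    return bytes(out)


original = bytes.fromhex("00112233445566")
pairs = byteswap(original, 2)
restored = byteswap(pairs, 2)
grouped = byteswap(original, [2, 5])

print(pairs.hex(), restored.hex(), grouped.hex())
```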

Multiplicative factors in bitstring creation and reading

For example:

>>> s = Bits('100*0x123')

Token grouping using parentheses

For example:

>>> s = Bits('3*(uint:6=3, 0b1)')

Negative slice indices allowed

The start and end parameters of many methods may now be negative, with the same meaning

as for negative slice indices. Affects all methods with these parameters.

Sequence ABCs used

The Bits class now derives from collections.Sequence, while the BitString class
derives from collections.MutableSequence.

Keywords allowed in readlist, peeklist and unpack

Keywords for token lengths are now permitted when reading. So for example, you can write

>>> s = bitstring.pack('4*(uint:n)', 2, 3, 4, 5, n=7)
>>> s.unpack('4*(uint:n)', n=7)
[2, 3, 4, 5]

start and end parameters added to rol and ror

join function accepts other iterables

Also its parameter has changed from ‘bitstringlist’ to ‘sequence’. This is technically a backward

incompatibility in the unlikely event that you are referring to the parameter by name.

__init__ method accepts keywords

Rather than a long list of initialisers the __init__ methods now use a **kwargs dictionary

for all initialisers except ‘auto’. This should have no effect, except that this is a small backward

incompatibility if you use positional arguments when initialising with anything other than auto

(which would be rather unusual).

More optimisations

A number of methods have been speeded up.

Bug fixed in replace method

(it could fail if start != 0).

20.1.14 January 19th 2010: version 1.2.0 for Python 2.6 and 3.x released

20.1.15 New ‘Bits’ class

Introducing a brand new class, Bits, representing an immutable sequence of bits.

The Bits class is the base class for the mutable BitString. The differences between Bits and

BitStrings are:

• Bits are immutable, so once they have been created their value cannot change. This of
  course means that mutating methods (append, replace, del etc.) are not available for Bits.

• Bits are hashable, so they can be used in sets and as keys in dictionaries.

• Bits are potentially more efficient than BitStrings, both in terms of computation and
  memory. The current implementation is only marginally more efficient though - this
  should improve in future versions.

You can switch from Bits to a BitString or vice versa by constructing a new object from the

old.

>>> s = Bits('0xabcd')
>>> t = BitString(s)
>>> t.append('0xe')
>>> u = Bits(t)

The relationship between Bits and BitString is supposed to loosely mirror that between bytes

and bytearray in Python 3.
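The built-in pair that this relationship mirrors shows the same trade-offs, and can be tried without the library. The sketch uses only bytes and bytearray; nothing here is bitstring API.

```python
frozen = bytes.fromhex("abcd")               # immutable, like Bits
lookup = {frozen: "usable as a dictionary key"}

mutable = bytearray(frozen)                  # mutable copy, like BitString(s)
try:
    hash(mutable)
    mutable_is_hashable = True
except TypeError:                            # mutable containers refuse to be hashed
    mutable_is_hashable = False

mutable.append(0xE0)                         # mutation is only possible on the copy
refrozen = bytes(mutable)                    # and back again, like Bits(t)

print(mutable_is_hashable, refrozen.hex(), frozen.hex())
```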

Deprecation messages turned on

A number of methods have been flagged for removal in version 2. Deprecation warnings will
now be given, which include an alternative way to do the same thing. All of the deprecated
methods have simpler equivalent alternatives.

>>> t = s.slice(0, 2)
__main__:1: DeprecationWarning: Call to deprecated function slice.
Instead of 's.slice(a, b, c)' use 's[a:b:c]'.

The deprecated methods are: advancebit, advancebits, advancebyte,
advancebytes, retreatbit, retreatbits, retreatbyte, retreatbytes,
tell, seek, slice, delete, tellbyte, seekbyte, truncatestart and
truncateend.

Initialise from bool

Booleans have been added to the list of types that can ‘auto’ initialise a bitstring.

>>> zerobit =BitString(False)

>>> onebit =BitString(True)

Improved efficiency

More methods have been speeded up, in particular some deletions and insertions.

Bug fixes

A rare problem with truncating the start of bitstrings was fixed.

A possible problem outputting the final byte in tofile() was fixed.

20.1.16 December 22nd 2009: version 1.1.3 for Python 2.6 and 3.x released

This version hopefully fixes an installation problem for platforms with case-sensitive file
systems. There are no new features or other bug fixes.

20.1.17 December 18th 2009: version 1.1.2 for Python 2.6 and 3.x released

This is a minor update with (almost) no new features.

Improved efficiency

The speed of many typical operations has been increased, some substantially.

Initialise from integer

A BitString of ‘0’ bits can be created using just an integer to give the length in bits. So instead of

>>> s = BitString(length=100)

you can write just

>>> s = BitString(100)

This matches the behaviour of bytearrays and (in Python 3) bytes.

• A defect related to using the set / unset functions on BitStrings initialised from a file has
  been fixed.

20.1.18 November 24th 2009: version 1.1.0 for Python 2.6 and 3.x released

Note that this version will not work for Python 2.4 or 2.5. There may be an update for these
Python versions some time next year, but it’s not a priority quite yet. Also note that only one
version is now provided, which works for Python 2.6 and 3.x (done with the minimum of
hackery!)

20.1.19 New features

Improved efficiency

A fair number of functions have improved efficiency, some quite dramatically.

New bit setting and checking functions

Although these functions don’t do anything that couldn’t be done before, they do make some
common use cases much more efficient. If you need to set or check single bits then these are
the functions you need.

• set / unset : Set bit(s) to 1 or 0 respectively.

• allset / allunset : Check if all bits are 1 or all 0.

• anyset / anyunset : Check if any bits are 1 or any 0.

>>> s=BitString(length=1000)

>>> s.set((10,100,44,12,1))

>>> s.allunset((2,22,222))

True

>>> s.anyset(range(7,77))

True

New rotate functions

ror / rol : Rotate bits to the right or left respectively.

>>> s = BitString('0b100000000')
>>> s.ror(2)
>>> s.bin
'0b001000000'
>>> s.rol(5)
>>> s.bin
'0b000000100'
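Rotation is easy to reproduce on a string of '0' and '1' characters, which makes the example above simple to check by hand. These two hypothetical helpers return new strings rather than rotating in place.

```python
def ror(bits, n):
    """Rotate a bit string right by n places."""
    if not bits:
        raise ValueError("cannot rotate an empty bit string")
    n %= len(bits)
    return bits[-n:] + bits[:-n] if n else bits


def rol(bits, n):
    """Rotate a bit string left by n places."""
    if not bits:
        raise ValueError("cannot rotate an empty bit string")
    n %= len(bits)
    return bits[n:] + bits[:n]


start = "100000000"
after_ror = ror(start, 2)
after_rol = rol(after_ror, 5)

print(after_ror, after_rol)
```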

Floating point interpretations

New float initialisations and interpretations are available. These only work for BitStrings of
length 32 or 64 bits.

>>> s = BitString(float=0.2, length=64)
>>> s.float
0.20000000000000001
>>> t = bitstring.pack('<3f', -0.4, 1e34, 17.0)
>>> t.hex
'0xcdccccbedf84f67700008841'
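The hex digits in the pack example can be cross-checked with the standard struct module, which uses the same IEEE 754 layouts. This is only a verification sketch; the big-endian '>d' format for the 64-bit case is an assumption about byte order made for illustration.

```python
import struct

# Three little-endian 32-bit floats, matching the '<3f' format above.
packed = struct.pack("<3f", -0.4, 1e34, 17.0)
hex_digits = packed.hex()

# A 64-bit float survives a round trip exactly, which 0.2 stored in 32 bits would not.
double_bytes = struct.pack(">d", 0.2)
round_trip = struct.unpack(">d", double_bytes)[0]
single_trip = struct.unpack(">f", struct.pack(">f", 0.2))[0]

print(hex_digits, round_trip == 0.2, single_trip == 0.2)
```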

‘bytes’ token reintroduced

This token returns a bytes object (equivalent to a str in Python 2.6).

>>> s = BitString('0x010203')
>>> s.unpack('bytes:2, bytes:1')
['\x01\x02', '\x03']

‘uint’ is now the default token type

So for example these are equivalent:

a, b = s.readlist('uint:12, uint:12')
a, b = s.readlist('12, 12')

20.1.20 October 10th 2009: version 1.0.1 for Python 3.x released

This is a straight port of version 1.0.0 to Python 3.

For changes since the last Python 3 release read all the way down in this document to version

0.4.3.

This version will also work for Python 2.6, but there’s no advantage to using it over the 1.0.0

release. It won’t work for anything before 2.6.

20.1.21 October 9th 2009: version 1.0.0 for Python 2.x released

Version 1 is here!

This is the first release not to carry the ‘beta’ tag. It contains a couple of minor new features
but is principally a release to fix the API. If you’ve been using an older version then you almost
certainly will have to recode a bit. If you’re not ready to do that then you may wish to delay
updating.

So the bad news is that there are lots of small changes to the API. The good news is that all the

changes are pretty trivial, the new API is cleaner and more ‘Pythonic’, and that by making it

version 1.0 I’m promising not to tweak it again for some time.

20.1.22 API Changes

New read / peek functions for returning multiple items

The functions read, readbits, readbytes, peek, peekbits and peekbytes now only ever return a

single item, never a list.

The new functions readlist, readbitlist, readbytelist, peeklist, peekbitlist and peekbytelist can

be used to read multiple items and will always return a list.

So a line like:

>>> a, b = s.read('uint:12, hex:32')

becomes

>>> a, b = s.readlist('uint:12, hex:32')

Renaming / removing functions

Functions have been renamed as follows:

seekbit     -> seek
tellbit     -> tell
reversebits -> reverse
deletebits  -> delete
tostring    -> tobytes

and a couple have been removed altogether:

• deletebytes - use delete instead.

• empty - use not s rather than s.empty().

Renaming parameters

The parameters ‘startbit’ and ‘endbit’ have been renamed ‘start’ and ‘end’. This affects the
methods slice, find, findall, rfind, reverse, cut and split.

The parameter ‘bitpos’ has been renamed to ‘pos’. This affects the methods seek, tell,
insert, overwrite and delete.

Mutating methods return None rather than self

This means that you can’t chain functions together, so

>>> s.append('0x00').prepend('0xff')
>>> t = s.reverse()

needs to be rewritten as

>>> s.append('0x00')
>>> s.prepend('0xff')
>>> s.reverse()
>>> t = s

Affects truncatestart, truncateend, insert, overwrite, delete, append,
prepend, reverse and reversebytes.

Properties renamed

The ‘data’ property has been renamed to ‘bytes’. Also if the BitString is not a whole number

of bytes then a ValueError exception will be raised when using ‘bytes’ as a ‘getter’.

Properties ‘len’ and ‘pos’ have been added to replace ‘length’ and ‘bitpos’, although the longer

names have not been removed so you can continue to use them if you prefer.

Other changes

• The unpack method now always returns a list, never a single item.

• BitStrings are now ‘unhashable’, so calling hash on one or making a set will fail.

• The colon separating the token name from its length is now mandatory. So for example
  BitString('uint12=100') becomes BitString('uint:12=100').

• Removed support for the ‘bytes’ token in format strings. Instead of
  s.read('bytes:4') use s.read('bits:32').

20.1.23 New features

Added endswith and startswith functions

These do much as you’d expect; they return True or False depending on whether the BitString

starts or ends with the parameter.

>>> BitString('0xef342').startswith('0b11101')

True

20.1.24 September 11th 2009: version 0.5.2 for Python 2.x released

Finally some tools for dealing with endianness!

New interpretations are now available for whole-byte BitStrings that treat them as big, little, or
native-endian:

>>> big = BitString(intbe=1, length=16)  # or BitString('intbe:16=1') if you prefer.
>>> little = BitString(intle=1, length=16)
>>> print big.hex, little.hex
0x0001 0x0100
>>> print big.intbe, little.intle
1 1
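The byte layouts in this example match what Python 3's built-in integer conversions produce, which gives a quick way to check endianness questions without the library. This sketch uses only int.to_bytes and int.from_bytes.

```python
big = (1).to_bytes(2, "big")          # most significant byte first
little = (1).to_bytes(2, "little")    # least significant byte first

# Reading each back with the matching byte order recovers the same value.
big_value = int.from_bytes(big, "big")
little_value = int.from_bytes(little, "little")

# Reading with the wrong byte order silently gives a different number.
misread = int.from_bytes(little, "big")

print(big.hex(), little.hex(), big_value, little_value, misread)
```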

‘Struct’-like compact format codes

To save some typing when using pack, unpack, read and peek, compact format codes
based on those used in the struct and array modules have been added. These must start with
a character indicating the endianness (>, < or @ for big, little and native-endian), followed by
characters giving the format:

b    1-byte signed int
B    1-byte unsigned int
h    2-byte signed int
H    2-byte unsigned int
l    4-byte signed int
L    4-byte unsigned int
q    8-byte signed int
Q    8-byte unsigned int

For example:

>>> s = bitstring.pack('<4h', 0, 1, 2, 3)

creates a BitString with four little-endian 2-byte integers. While

>>> x, y, z = s.read('>hhl')

reads them back as two big-endian two-byte integers and one four-byte big-endian integer.

Of course you can combine this new format with the old ones however you like:

>>> s.unpack('<h, intle:24, uint:5, bin')
[0, 131073, 0, '0b0000000001100000000']
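Because the compact codes are modelled on the struct module, the same bytes can be produced and reinterpreted with struct itself. The sketch below repeats the pack and read steps on plain bytes and checks the 24-bit little-endian value from the mixed example; it does not cover the bit-level tokens, which struct cannot express.

```python
import struct

packed = struct.pack("<4h", 0, 1, 2, 3)      # four little-endian 2-byte integers

# Reinterpret the same eight bytes as big-endian: two 2-byte ints and one 4-byte int.
x, y, z = struct.unpack(">hhl", packed)

# The three bytes after the first 2-byte integer, read as a little-endian integer.
middle = int.from_bytes(packed[2:5], "little")

print(packed.hex(), x, y, z, middle)
```

The values of y and z show how reading with the wrong endianness changes the numbers without raising any error.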

20.1.25 August 26th 2009: version 0.5.1 for Python 2.x released

This update introduces pack and unpack functions for creating and disassembling BitStrings.

New pack() and unpack() functions

The pack function provides a flexible new method for creating BitStrings. Tokens for BitString
‘literals’ can be used in the same way as in the constructor.

>>> from bitstring import BitString, pack
>>> a = pack('0b11, 0xff, 0o77, int:5=-1, se=33')

You can also leave placeholders in the format, which will be filled in by the values provided.

>>> b = pack('uint:10, hex:4', 33, 'f')

Finally you can use a dictionary or keywords.

>>> c = pack('bin=a, hex=b, bin=a', a='010', b='ef')

The unpack method is similar to the read method except that it always unpacks from the
start of the BitString.

>>> x, y = b.unpack('uint:10, hex')

If a token is given without a length (as above) then it will expand to fill the remaining bits in
the BitString. This also now works with read and peek.

New tostring() and tofile() methods

The tostring method just returns the data as a string, with up to seven zero bits appended
to byte align. The tofile method does the same except writes to a file object.

>>> f = open('myfile', 'wb')
>>> BitString('0x1234ff').tofile(f)

Other changes

The use of = is now mandatory in ‘auto’ initialisers. Tokens like uint12 100 will no longer
work. Also the use of a : before the length is encouraged, but not yet mandated. So the
previous example should be written as uint:12=100.

The ‘auto’ initialiser will now take a file object.

>>> f = open('myfile', 'rb')
>>> s = BitString(f)

20.1.26 July 19th 2009: version 0.5.0 for Python 2.x released

This update breaks backward compatibility in a couple of areas. The only one you probably
need to be concerned about is the change to the default for bytealigned in find, replace,
split, etc.

See the user manual for more details on each of these items.

Expanded abilities of ‘auto’ initialiser

More types can be initialised through the ‘auto’ initialiser. For example instead of

>>> a=BitString(uint=44, length=16)

you can write

>>> a = BitString('uint16=44')

Also, different comma-separated tokens will be joined together, e.g.

>>> b = BitString('0xff') + 'int8=-5'

can be written

>>> b = BitString('0xff, int8=-5')

New formatted read and peek methods

These take a format string similar to that used in the auto initialiser. If only one token is
provided then a single value is returned, otherwise a list of values is returned.

>>> start_code, width, height = s.read('hex32, uint12, uint12')

is equivalent to

>>> start_code = s.readbits(32).hex
>>> width = s.readbits(12).uint
>>> height = s.readbits(12).uint

The tokens are:

int n    : n bits as a signed integer.
uint n   : n bits as an unsigned integer.
hex n    : n bits as a hexadecimal string.
oct n    : n bits as an octal string.
bin n    : n bits as a binary string.
ue       : next bits as an unsigned exp-Golomb.
se       : next bits as a signed exp-Golomb.
bits n   : n bits as a new BitString.
bytes n  : n bytes as a new BitString.

See the user manual for more details.
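
As a rough illustration of how such a format string can be interpreted, the sketch below parses a few token types and consumes bits from a plain string of '0' and '1' characters. It is not the library's implementation: the name read_tokens and the string representation are assumptions made for this example, only the hex, uint, int and bin tokens are handled, and hex values are returned without a prefix.

```python
import re

def read_tokens(bits, fmt):
    """Read values from a '0'/'1' string using tokens like 'hex32, uint12'.

    Hypothetical helper for illustration; returns (values, remaining_bits).
    """
    values = []
    pos = 0
    for token in fmt.split(','):
        match = re.match(r'^(hex|uint|int|bin)(\d+)$', token.strip())
        if match is None:
            raise ValueError('unsupported token: %r' % token)
        name, length = match.group(1), int(match.group(2))
        chunk = bits[pos:pos + length]
        if length == 0 or len(chunk) != length:
            raise ValueError('cannot read %d bits for token %r' % (length, token))
        pos += length
        if name == 'bin':
            values.append(chunk)
        elif name == 'hex':
            if length % 4:
                raise ValueError('hex needs a multiple of 4 bits')
            values.append('%0*x' % (length // 4, int(chunk, 2)))
        elif name == 'uint':
            values.append(int(chunk, 2))
        else:
            # 'int' is interpreted as two's complement.
            value = int(chunk, 2)
            if chunk[0] == '1':
                value -= 1 << length
            values.append(value)
    return values, bits[pos:]

# A 32-bit start code followed by two 12-bit fields (352 and 288).
stream = format(0x000001b3, '032b') + format(352, '012b') + format(288, '012b')
(start_code, width, height), rest = read_tokens(stream, 'hex32, uint12, uint12')
print(start_code, width, height)
```

Reading all three fields in one call keeps the field layout in a single place, which is the main attraction of the formatted read.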

hex and oct methods removed

The special methods for hex and oct have been removed. Please use the hex and oct

properties instead.

>>> hex(s)

becomes

>>> s.hex

join made a method

The join function must now be called on a BitString object, which will be used to join the list

together. You may need to recode slightly:

>>> s = bitstring.join('0x34', '0b1001', '0b1')

becomes

>>> s = BitString().join('0x34', '0b1001', '0b1')

More than one value allowed in readbits, readbytes, peekbits and peekbytes

If you specify more than one bit or byte length then a list of BitStrings will be returned.

>>> a, b, c = s.readbits(10, 5, 5)

is equivalent to

>>> a = s.readbits(10)

>>> b = s.readbits(5)

>>> c = s.readbits(5)

bytealigned defaults to False, and is at the end of the parameter list

Functions that have a bytealigned parameter have changed so that it now defaults to False rather

than True. Also its position in the parameter list has changed to be at the end. You may need to

recode slightly (sorry!)
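
To make the effect of the new default concrete, here is a minimal sketch of a search over a string of '0' and '1' characters. The function find_bits and its return convention (a position, or -1) are assumptions for this example rather than the library's interface; the point is only that a byte-aligned search accepts matches at positions that are multiples of 8 and nowhere else.

```python
def find_bits(bits, pattern, bytealigned=False):
    """Return the first position of pattern in a '0'/'1' string, or -1.

    With bytealigned=True only positions that are multiples of 8 count.
    """
    pos = bits.find(pattern)
    while pos != -1:
        if not bytealigned or pos % 8 == 0:
            return pos
        # Try the next candidate position, one bit further on.
        pos = bits.find(pattern, pos + 1)
    return -1

data = '0000111100001111' + '11110000'
print(find_bits(data, '1111'))                    # 4
print(find_bits(data, '1111', bytealigned=True))  # 16
```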

readue and readse methods have been removed

Instead you should use the new read function with a ‘ue’ or ‘se’ token:

>>> i = s.readue()

becomes

>>> i = s.read('ue')

This is more flexible as you can read multiple items in one go, plus you can now also use the

peek method with ue and se.
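
For readers unfamiliar with the ue and se tokens, the following sketch decodes exponential-Golomb codes from a string of '0' and '1' characters. The helper names read_ue and read_se are hypothetical and the code is independent of the library; it only illustrates the coding scheme, in which a run of n leading zeros is followed by an (n + 1)-bit number one greater than the encoded value.

```python
def read_ue(bits, pos=0):
    """Decode one unsigned exp-Golomb code; returns (value, new_pos)."""
    leading_zeros = 0
    while pos + leading_zeros < len(bits) and bits[pos + leading_zeros] == '0':
        leading_zeros += 1
    end = pos + 2 * leading_zeros + 1
    if end > len(bits):
        raise ValueError('ran out of bits while reading exp-Golomb code')
    # The bits after the zero run hold the value plus one.
    value = int(bits[pos + leading_zeros:end], 2) - 1
    return value, end

def read_se(bits, pos=0):
    """Decode one signed exp-Golomb code; returns (value, new_pos)."""
    code, pos = read_ue(bits, pos)
    # Unsigned codes 0, 1, 2, 3, 4 map to 0, 1, -1, 2, -2.
    value = (code + 1) // 2 if code % 2 else -(code // 2)
    return value, pos

# Four unsigned codes back to back: 0, 1, 2 and 3.
stream = '1' + '010' + '011' + '00100'
pos, values = 0, []
while pos < len(stream):
    value, pos = read_ue(stream, pos)
    values.append(value)
print(values)  # [0, 1, 2, 3]
```

Because each code carries its own length, no bit count appears in the token, which is why ue and se take no length in a format string.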

Minor bugs fixed

See the issue tracker for more details.

20.1.27 June 15th 2009: version 0.4.3 for Python 2.x released

This is a minor update. This release is the first to bundle the bitstring manual. This is a PDF

and you can find it in the docs directory.

New ‘cut’ method

This method returns a generator for constant sized chunks of a BitString.

>>> for byte in s.cut(8):

...     do_something_with(byte)

You can also specify a startbit and endbit, as well as a count, which limits the number of items

generated:

>>> first100TSPackets = list(s.cut(188*8, count=100))
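
The idea behind cut can be sketched with an ordinary generator over a string of '0' and '1' characters. This is an illustration only, not the library's code: the startbit and endbit parameters are left out, and dropping a trailing chunk that is shorter than the requested size is an assumption of this sketch.

```python
def cut(bits, size, count=None):
    """Yield consecutive size-bit chunks of a '0'/'1' string.

    Assumption for this sketch: a final chunk shorter than size is dropped.
    """
    if size <= 0:
        raise ValueError('size must be positive')
    produced = 0
    for start in range(0, len(bits) - size + 1, size):
        if count is not None and produced >= count:
            return
        yield bits[start:start + size]
        produced += 1

# Four 16-bit words, of which only the first three are requested.
packets = list(cut('0100011100011111' * 4, 16, count=3))
print(len(packets), packets[0])
```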

‘slice’ method now equivalent to __getitem__

This means that a step can also be given to the slice method so that the following are now the

same thing, and it’s just a personal preference which to use:

>>> s1 = s[a:b:c]

>>> s2 = s.slice(a, b, c)

findall gets a ‘count’ parameter

So now

>>> list(a.findall(s, count=n))

is equivalent to

>>> list(a.findall(s))[:n]

except that it won’t need to generate the whole list and so is much more efficient.
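
The saving comes from laziness: a generator stops searching as soon as enough matches have been produced. The sketch below shows the same effect with itertools.islice over a hypothetical findall generator on a string of '0' and '1' characters; it stands in for the library method and is not its implementation.

```python
from itertools import islice

def findall(bits, pattern):
    """Lazily yield every position of pattern in a '0'/'1' string."""
    pos = bits.find(pattern)
    while pos != -1:
        yield pos
        pos = bits.find(pattern, pos + 1)

data = '1' * 1000
# Only three searches are carried out, although 999 matches exist.
first_three = list(islice(findall(data, '11'), 3))
print(first_three)  # [0, 1, 2]
```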

Changes to ‘split’

The split method now has a ‘count’ parameter rather than ‘maxsplit’. This makes the interface

closer to that for cut, replace and findall. The final item generated is now no longer the whole

of the rest of the BitString.

• A couple of minor bugs were fixed. See the issue tracker for details.

20.1.28 May 25th 2009: version 0.4.2 for Python 2.x released

This is a minor update that almost preserves compatibility with version 0.4.0. The one

slight exception is findall() returning a generator, detailed below.

Stepping in slices

The use of the step parameter (also known as the stride) in slices has been added. Its use is

a little non-standard as it effectively gives a multiplicative factor to apply to the start and stop

parameters, rather than skipping over bits.

For example this makes it much more convenient if you want to give slices in terms of bytes

instead of bits. Instead of writing s[a*8:b*8] you can use s[a:b:8].

When using a step the BitString is effectively truncated to a multiple of the step, so s[::8]

is equal to s if s is an integer number of bytes, otherwise it is truncated by up to 7 bits. So the

final seven complete 16-bit words could be written as s[-7::16].

Negative slices are also allowed, and should do what you’d expect. So for example s[::-1]

returns a bit-reversed copy of s (which is similar to s.reversebits(), which does the

same operation on s in-place). As another example, to get the first 10 bytes in reverse byte

order you could use s_bytereversed = s[0:10:-8].
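
Since this use of the step differs from ordinary Python slicing, a small sketch may help. The function step_slice below is hypothetical and works on a string of '0' and '1' characters; it covers only positive steps with non-negative or omitted indices, so negative indices and negative steps are deliberately left out.

```python
def step_slice(bits, start=None, stop=None, step=1):
    """Slice a '0'/'1' string with start and stop counted in units of step bits.

    Sketch only: positive steps and non-negative (or omitted) indices.
    """
    if step < 1:
        raise ValueError('this sketch only handles positive steps')
    for index in (start, stop):
        if index is not None and index < 0:
            raise ValueError('this sketch only handles non-negative indices')
    units = len(bits) // step  # number of whole units available
    start = 0 if start is None else min(start, units)
    stop = units if stop is None else min(stop, units)
    # The step multiplies the indices instead of skipping over bits.
    return bits[start * step:stop * step]

s = '0000000011111111' + '10101'  # two whole bytes plus five spare bits
print(step_slice(s, 1, 2, 8))           # the second byte, like s[8:16]
print(step_slice(s, step=8) == s[:16])  # the spare bits are truncated
```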

Removed restrictions on offset

You can now specify an offset of greater than 7 bits when creating a BitString, and the use of

offset is also now permitted when using the filename initialiser. This is useful when you want

to create a BitString from the middle of a file without having to read the file into memory.

>>> f = BitString(filename='reallybigfile', offset=8000000, length=32)

Integers can be assigned to slices

You can now assign an integer to a slice of a BitString. If the integer doesn’t fit in the size of

the slice given then a ValueError exception is raised. So this is now allowed and works as expected:

>>> s[8:16] = 106

and is equivalent to

>>> s[8:16] = BitString(uint=106, length=8)
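
The range check involved can be sketched in a few lines. The helper uint_to_bits is an assumption for this example, handles only non-negative values, and operates on a list of '0' and '1' characters rather than on a BitString.

```python
def uint_to_bits(value, length):
    """Return value as a length-bit unsigned binary string, or raise ValueError."""
    if length < 1:
        raise ValueError('length must be at least 1')
    if value < 0 or value >= (1 << length):
        raise ValueError('%d does not fit in %d bits' % (value, length))
    return format(value, '0%db' % length)

bits = list('0' * 24)
# The slice is 8 bits wide, so the integer is written as 8 binary digits.
bits[8:16] = uint_to_bits(106, 8)
print(''.join(bits))
```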

Fewer exceptions raised

Some changes have been made to slicing so that fewer exceptions are raised, bringing the

interface closer to that for lists. So for example trying to delete past the end of the BitString will

now just delete to the end, rather than raising a ValueError.

Initialisation from lists and tuples

A new option for the auto initialiser is to pass it a list or tuple. The items in the list or tuple are

evaluated as booleans and the bits in the BitString are set to 1 for True items and 0 for False

items. This can be used anywhere the auto initialiser can currently be used. For example:

>>> a = BitString([True, 7, False, 0, ()])  # 0b11000

>>> b = a + ['Yes', '']  # Adds '0b10'

>>> (True, True, False) in a

True
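
The conversion relies on Python's usual truth testing, which the following sketch reproduces on plain strings. The helper bits_from_items is hypothetical and simply mirrors the three examples above.

```python
def bits_from_items(items):
    """Map each item to '1' if it is truthy and '0' otherwise."""
    return ''.join('1' if item else '0' for item in items)

a = bits_from_items([True, 7, False, 0, ()])
b = a + bits_from_items(['Yes', ''])
print(a, b)
# Containment becomes a substring test on the bit pattern.
print(bits_from_items((True, True, False)) in a)
```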

Miscellany

• reversebits now has optional startbit and endbit parameters.

• As an optimisation findall will return a generator, rather than a list. If you still want

the whole list then of course you can just call list() on the generator.

• Improved efficiency of rfind.

• A couple of minor bugs were fixed. See the issue tracker for details.

20.1.29 April 23rd 2009: Python 3 only version 0.4.1 released

This version is just a port of version 0.4.0 to Python 3. All the unit tests pass, but beyond

that only limited ad hoc testing has been done and so it should be considered an experimental

release. That said, the unit test coverage is very good - I’m just not sure if anyone even wants

a Python 3 version!

20.1.30 April 11th 2009: version 0.4.0 released

New methods

Added rfind, findall and replace. These do pretty much what you’d expect - see the

docstrings or the wiki for more information.

More special methods

Some missing methods were added: __repr__, __contains__, __rand__, __ror__,

__rxor__ and __delitem__.

Miscellany

A couple of small bugs were fixed (see the issue tracker).

There are some small backward incompatibilities relative to version 0.3.2:

Combined find and findbytealigned

findbytealigned has been removed, and becomes part of find. The default start position

has changed on both find and split to be the start of the BitString. You may need to recode:

>>> s1.find(bs)

>>> s2.findbytealigned(bs)

>>> s2.split(bs)

becomes

>>> s1.find(bs, bytealigned=False, startbit=s1.bitpos)

>>> s2.find(bs, startbit=s2.bitpos) # bytealigned defaults to True

>>> s2.split(bs, startbit=s2.bitpos)

Reading off end of BitString no longer raises exception

Previously a read or peek function that encountered the end of the BitString would raise a

ValueError. It will now instead return the remainder of the BitString, which could be an empty

BitString. This is closer to the file object interface.
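
For comparison, this is how a standard Python file-like object behaves when a read runs past the end of the data. The example uses io.BytesIO from the standard library and does not involve bitstring at all.

```python
import io

f = io.BytesIO(b'\x01\x02\x03')
first = f.read(2)    # two bytes are available and both are returned
second = f.read(10)  # only one byte is left, so one byte is returned
third = f.read(10)   # nothing is left, so the result is empty
print(first, second, third)
```

No exception is raised in either of the last two reads, so the caller detects the end of the data by checking the length of what came back.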

Removed visibility of offset

The offset property was previously read-only, and has now been removed from public view

altogether. As it is used internally for efficiency reasons you shouldn’t really have needed to

use it. If you do then use the _offset parameter instead (with caution).

20.1.31 March 11th 2009: version 0.3.2 released

Better performance

A number of methods (especially find and findbytealigned) have been sped up considerably.

Bit-wise operations

Added support for bit-wise AND (&), OR (|) and XOR (^). For example:

>>> a = BitString('0b00111')

>>> print a & '0b10101'

0b00101
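
The same operations can be checked by hand with Python integers. The helper bitwise below is hypothetical and works on strings of '0' and '1' characters; requiring both operands to have the same length is an assumption of this sketch.

```python
import operator

def bitwise(op, a, b):
    """Apply an integer bit-wise operator to two equal-length '0'/'1' strings."""
    if not a or len(a) != len(b):
        raise ValueError('operands must be non-empty and of equal length')
    # Pad the result back to the original width so leading zeros survive.
    return format(op(int(a, 2), int(b, 2)), '0%db' % len(a))

print(bitwise(operator.and_, '00111', '10101'))  # 00101
print(bitwise(operator.or_, '00111', '10101'))   # 10111
print(bitwise(operator.xor, '00111', '10101'))   # 10010
```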

Miscellany

Added seekbit and seekbyte methods. These complement the ‘advance’ and ‘retreat’

functions, although you can still just use bitpos and bytepos properties directly.

>>> a.seekbit(100)  # Equivalent to a.bitpos = 100

Allowed comparisons between BitString objects and strings. For example this will now work:

>>> a = BitString('0b00001111')

>>> a == '0x0f'

True

20.1.32 February 26th 2009: version 0.3.1 released

This version only adds features and fixes bugs relative to 0.3.0, and doesn’t break backwards

compatibility.

Octal interpretation and initialisation

The oct property now joins bin and hex. Just prefix octal numbers with ‘0o’:

>>> a = BitString('0o755')

>>> print a.bin

0b111101101
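
Each octal digit corresponds to exactly three bits, which is why 0o755 gives nine binary digits. A minimal sketch of the conversion, using a hypothetical helper on plain strings:

```python
def oct_to_bin(octal):
    """Convert an octal string such as '0o755' to its binary digits."""
    digits = octal[2:] if octal.startswith('0o') else octal
    # Every octal digit expands to three bits, keeping leading zeros.
    return ''.join(format(int(digit, 8), '03b') for digit in digits)

print(oct_to_bin('0o755'))  # 111101101
```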

Simpler copying

Rather than using b = copy.copy(a) to create a copy of a BitString, now you can just use

b = BitString(a).

More special methods

Lots of new special methods added, for example bit-shifting via << and >>, equality testing

via == and !=, bit inversion (~) and repeated concatenation using *.

Also __setitem__ is now supported so BitString objects can be modified using standard

index notation.

Proper installer

Finally got round to writing the distutils script. To install, just run python setup.py

install.

20.1.33 February 15th 2009: version 0.3.0 released

Simpler initialisation from binary and hexadecimal

The first argument in the BitString constructor is now called ‘auto’ and will attempt to interpret

the type of a string. Prefix binary numbers with ‘0b’ and hexadecimals with ‘0x’:

>>> a = BitString('0b0')  # single zero bit

>>> b = BitString('0xffff')  # two bytes

Previously the first argument was ‘data’, so if you relied on this then you will need to recode:

>>> a = BitString('\x00\x00\x01\xb3')  # Don't do this any more!

becomes

>>> a = BitString(data='\x00\x00\x01\xb3')

or just

>>> a = BitString('0x000001b3')

This new notation can also be used in functions that take a BitString as an argument. For

example:

>>> a = BitString('0x0011') + '0xff'

>>> a.insert('0b001', 6)

>>> a.find('0b1111')

BitString made more mutable

The methods append, deletebits, insert, overwrite, truncatestart and

truncateend now modify the BitString that they act upon. This allows for cleaner and more

efficient code, but you may need to rewrite slightly if you depended upon the old behaviour:

>>> a = BitString(hex='0xffff')

>>> a = a.append(BitString(hex='0x00'))

>>> b = a.deletebits(10, 10)

becomes

>>> a = BitString('0xffff')

>>> a.append('0x00')

>>> b = copy.copy(a)

>>> b.deletebits(10, 10)

Thanks to Frank Aune for suggestions in this and other areas.

Changes to printing

The binary interpretation of a BitString is now prepended with ‘0b’. This is in keeping with the

Python 2.6 (and 3.0) bin function. The prefix is optional when initialising using bin=.

Also, if you just print a BitString with no interpretation it will pick something appropriate - hex

if it is an integer number of bytes, otherwise binary. If the BitString representation is very long

it will be truncated by ‘...’ so it is only an approximate interpretation.

>>> a = BitString('0b0011111')

>>> print a

0b0011111

>>> a += '0b0'

>>> print a

0x3e
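
The choice between the two interpretations can be sketched as follows. The function default_repr is hypothetical, works on a string of '0' and '1' characters, and does not model the truncation of very long values with ‘...’.

```python
def default_repr(bits):
    """Pick hex for a whole number of bytes, otherwise binary."""
    if bits and len(bits) % 8 == 0:
        # One hex digit per four bits, keeping leading zeros.
        return '0x%0*x' % (len(bits) // 4, int(bits, 2))
    return '0b' + bits

print(default_repr('0011111'))        # 0b0011111
print(default_repr('0011111' + '0'))  # 0x3e
```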

More convenience functions

Some missing methods such as advancebit and deletebytes have been added. Also a

number of ‘peek’ methods make an appearance as have prepend and reversebits. See

the Tutorial for more details.

20.1.34 January 13th 2009: version 0.2.0 released

Some fairly minor updates, not really deserving of a whole version point update.

20.1.35 December 29th 2008: version 0.1.0 released

First release!
