
Get yourself into a Python cPickle

Serialization is a handy technique for packaging objects. Learn how to use Python's cPickle for easy data storage and retrieval.


Serialization is a useful technique that allows you to preserve object values in a string. There are many reasons to do this, and most languages today provide some prepackaged approach. Python is no different. In the Python scripting language, you have several serialization options, two of which are pickle, written in Python, and cPickle, written in C. I'll start with a little background information on serialization. Then, I'll walk through an example that shows how to use cPickle to serialize and deserialize list values.

Your basic pickle
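Serialization, also called pickling or flattening, converts structured data into a data stream format. Essentially, this means that structures such as lists, tuples, functions, and classes are flattened into a stream of characters (plain ASCII by default) that records both the values and how they fit together. Functions and classes are stored by name rather than by value, so the code that defines them must be importable when the data is unpickled. The pickle data format is standardized, so strings serialized with pickle can be deserialized with cPickle and vice versa.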
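
As a minimal sketch of that round trip, the snippet below flattens a list with dumps and rebuilds it with loads. The try/except import is an assumption added so the example also runs on Python 3, where cPickle no longer exists as a separate module; protocol 0 is the original ASCII-based format.

```python
# Use cPickle where it exists (Python 2); fall back to pickle (Python 3).
try:
    import cPickle as pickle
except ImportError:
    import pickle

condiments = ["ketchup", "mustard", "relish"]

# Protocol 0 is the original ASCII-based pickle format.
flattened = pickle.dumps(condiments, 0)

# loads rebuilds an equal, but separate, list from the flattened data.
restored = pickle.loads(flattened)

print(restored == condiments)  # True
print(restored is condiments)  # False
```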

The main difference between cPickle and pickle is performance. The cPickle module can be many times faster because it's written in C. The tradeoff is flexibility: in cPickle, Pickler and Unpickler are functions rather than classes, so they can't be subclassed to extend or customize pickling, whereas the pickle classes can.

Serialization is useful in a number of ways. It can save time and resources when data needs to be transmitted, encrypted, or stored in a database. The information is serialized in Python and then sent or stored; when it is retrieved, it is deserialized back into a usable object.
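
To illustrate the database case, here is a hedged sketch that stores a pickled list in an in-memory SQLite database and reads it back. The table and column names are hypothetical, chosen only for this example, and the fallback import is an assumption so the code also runs on Python 3.

```python
import sqlite3

try:
    import cPickle as pickle
except ImportError:
    import pickle

# Hypothetical table and column names, used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kitchen (name TEXT PRIMARY KEY, contents BLOB)")

in_fridge = ["ketchup", "mustard", "relish"]

# Serialize the list, then store the result like any other value.
blob = sqlite3.Binary(pickle.dumps(in_fridge))
conn.execute("INSERT INTO kitchen VALUES (?, ?)", ("fridge", blob))

# Retrieve the stored data and deserialize it back into a list.
row = conn.execute(
    "SELECT contents FROM kitchen WHERE name = ?", ("fridge",)
).fetchone()
from_db = pickle.loads(bytes(row[0]))
conn.close()

print(from_db)
```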

For additional information on cPickle and pickle, refer to the Python online reference manual’s section "3.14 pickle—Python object serialization."

Serving up condiments: A cPickle example
Now let's look at a simple example that demonstrates basic usage of the cPickle module.

First, the condiments.py script informs Python that we’ll be using the cPickle module:
import cPickle

Next, I define the object I want to serialize and store. In this case, it’s a list of condiments I’ve got in the fridge:
inFridge = ["ketchup", "mustard", "relish"]
print inFridge


That print statement’s output will display the following:
['ketchup', 'mustard', 'relish']

I want to save my results in a file called fridge.txt, so the script creates a file object by opening the file for writing:
FILE = open("fridge.txt", 'w')

Now the magic happens. I call cPickle's dump function to pickle my data and write the results to the file:
cPickle.dump(inFridge, FILE)

I’m finished for now, so the script closes the file:
FILE.close()

I now have a file, fridge.txt, that contains the following:
(lp1
S'ketchup'
p2
aS'mustard'
p3
aS'relish'
p4
a.


The pickle and cPickle modules have an option to save the information in a binary format; however, I’ve used the default ASCII because it is human-readable.
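
For comparison, here is a minimal sketch of the binary option. The protocol is passed as a third argument to dump, and the file is opened in binary mode for both writing and reading. The file name fridge.bin and the temporary directory are assumptions for the demo, as is the fallback import for Python 3.

```python
import os
import tempfile

try:
    import cPickle as pickle
except ImportError:
    import pickle

in_fridge = ["ketchup", "mustard", "relish"]

# Hypothetical file name, placed in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "fridge.bin")

# The third argument selects the protocol; HIGHEST_PROTOCOL is the
# newest binary format this interpreter supports.
with open(path, "wb") as f:
    pickle.dump(in_fridge, f, pickle.HIGHEST_PROTOCOL)

# Binary pickles must be read back from a file opened in binary mode.
with open(path, "rb") as f:
    from_file = pickle.load(f)

print(from_file == in_fridge)  # True
```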

Now I've looked in my kitchen and realized I also have pickles. I can add them to my list and then reserialize it:
inFridge.append("pickles")
print inFridge


The output of my print command now displays:
['ketchup', 'mustard', 'relish', 'pickles']

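That looks right, so I have my script repickle the inFridge list and write it to the file. Opening the file in 'w' mode truncates it, so the new pickle replaces the old one rather than being added after it: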
FILE = open("fridge.txt", 'w')
cPickle.dump(inFridge, FILE)
FILE.close()


To get my information out of the file and back into a usable list, I simply open the file for reading and call cPickle.load to unpickle it. For the purposes of demonstration, I've used a new variable, inFridgeFile, to store the results:
FILE = open("fridge.txt", 'r')
inFridgeFile = cPickle.load(FILE)
FILE.close()

print inFridgeFile

The output of the print command displays:
['ketchup', 'mustard', 'relish', 'pickles']

Because the file was reopened in write mode, the second dump replaced the first, so fridge.txt holds a single pickle of the four-item list with nothing duplicated. The inFridgeFile variable contains my information restored to its original list format.
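
The sketch below makes the file-mode behavior concrete by counting how many pickles the file holds after each save. The helper names save and load_all are hypothetical, and the temporary directory, binary file modes, and fallback import are assumptions so the example runs on both Python 2 and Python 3.

```python
import os
import tempfile

try:
    import cPickle as pickle
except ImportError:
    import pickle

path = os.path.join(tempfile.mkdtemp(), "fridge.txt")
in_fridge = ["ketchup", "mustard", "relish"]

def save(items, mode):
    # Protocol 0 keeps the ASCII format used in the article.
    with open(path, mode) as f:
        pickle.dump(items, f, 0)

def load_all():
    # Keep calling load until the file runs out of pickles.
    found = []
    with open(path, "rb") as f:
        while True:
            try:
                found.append(pickle.load(f))
            except EOFError:
                return found

save(in_fridge, "wb")
in_fridge.append("pickles")
save(in_fridge, "wb")      # write mode replaces the first pickle
print(len(load_all()))     # 1

save(in_fridge, "ab")      # append mode adds a second pickle
print(len(load_all()))     # 2
```

If you do append several pickles to one file, each call to load returns the next one in the order it was written.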

To put a lid on it
You have various options for serializing your data in Python, including pickle and cPickle. My cPickle example showed that this implementation is truly handy, especially since dump and load preserve your object's structure, and you can modify the restored object and pickle it again whenever you need to. This functionality will keep you from being stuck in a pickle next time you're saving or transmitting objects.


 
