Python Data Persistence - Pickle Module


Advertisements

Python’s terminology for serialization and deserialization is pickling and unpickling respectively. The pickle module in Python library, uses very Python specific data format. Hence, non-Python applications may not be able to deserialize pickled data properly. It is also advised not to unpickle data from un-authenticated source.

The serialized (pickled) data can be stored in a byte string or a binary file. This module defines dumps() and loads() functions to pickle and unpickle data using byte string. For file based process, the module has dump() and load() function.

Python’s pickle protocols are the conventions used in constructing and deconstructing Python objects to/from binary data. Currently, pickle module defines 5 different protocols as listed below −

Sr.No. Names & Description
1

Protocol version 0

Original “human-readable” protocol backwards compatible with earlier versions.

2

Protocol version 1

Old binary format also compatible with earlier versions of Python.

3

Protocol version 2

Introduced in Python 2.3 provides efficient pickling of new-style classes.

4

Protocol version 3

Added in Python 3.0. recommended when compatibility with other Python 3 versions is required.

5

Protocol version 4

was added in Python 3.4. It adds support for very large objects

Example

The pickle module consists of dumps() function that returns a string representation of pickled data.

from pickle import dump
dct={"name":"Ravi", "age":23, "Gender":"M","marks":75}
dctstring=dumps(dct)
print (dctstring)

Output

b'\x80\x03}q\x00(X\x04\x00\x00\x00nameq\x01X\x04\x00\x00\x00Raviq\x02X\x03\x00\x00\x00ageq\x03K\x17X\x06\x00\x00\x00Genderq\x04X\x01\x00\x00\x00Mq\x05X\x05\x00\x00\x00marksq\x06KKu.

Example

Use loads() function, to unpickle the string and obtain original dictionary object.

from pickle import load
dct=loads(dctstring)
print (dct)

Output

{'name': 'Ravi', 'age': 23, 'Gender': 'M', 'marks': 75}

Pickled objects can also be persistently stored in a disk file, using dump() function and retrieved using load() function.

import pickle
f=open("data.txt","wb")
dct={"name":"Ravi", "age":23, "Gender":"M","marks":75}
pickle.dump(dct,f)
f.close()

#to read
import pickle
f=open("data.txt","rb")
d=pickle.load(f)
print (d)
f.close()

The pickle module also provides, object oriented API for serialization mechanism in the form of Pickler and Unpickler classes.

As mentioned above, just as built-in objects in Python, objects of user defined classes can also be persistently serialized in disk file. In following program, we define a User class with name and mobile number as its instance attributes. In addition to the __init__() constructor, the class overrides __str__() method that returns a string representation of its object.

class User:
   def __init__(self,name, mob):
      self.name=name
      self.mobile=mob
   def __str__(self):
return ('Name: {} mobile: {} '. format(self.name, self.mobile))

To pickle object of above class in a file we use pickler class and its dump()method.

from pickle import Pickler
user1=User('Rajani', 'raj@gmail.com', '1234567890')
file=open('userdata','wb')
Pickler(file).dump(user1)
Pickler(file).dump(user2)
file.close()

Conversely, Unpickler class has load() method to retrieve serialized object as follows −

from pickle import Unpickler
file=open('usersdata','rb')
user1=Unpickler(file).load()
print (user1)
Advertisements