Object Oriented Python - Files and Strings


Advertisements

Strings

Strings are the most popular data types used in every programming language. Why? Because we, understand text better than numbers, so in writing and talking we use text and words, similarly in programming too we use strings. In string we parse text, analyse text semantics, and do data mining – and all this data is human consumed text.The string in Python is immutable.

String Manipulation

In Python, string can be marked in multiple ways, using single quote ( ‘ ), double quote( “ ) or even triple quote ( ‘’’ ) in case of multiline strings.

>>> # String Examples
>>> a = "hello"
>>> b = ''' A Multi line string,
Simple!'''
>>> e = ('Multiple' 'strings' 'togethers')

String manipulation is very useful and very widely used in every language. Often, programmers are required to break down strings and examine them closely.

Strings can be iterated over (character by character), sliced, or concatenated. The syntax is the same as for lists.

The str class has numerous methods on it to make manipulating strings easier. The dir and help commands provides guidance in the Python interpreter how to use them.

Below are some of the commonly used string methods we use.

Sr.No. Method & Description
1

isalpha()

Checks if all characters are Alphabets

2

isdigit()

Checks Digit Characters

3

isdecimal()

Checks decimal Characters

4

isnumeric()

checks Numeric Characters

5

find()

Returns the Highest Index of substrings

6

istitle()

Checks for Titlecased strings

7

join()

Returns a concatenated string

8

lower()

returns lower cased string

9

upper()

returns upper cased string

10

partion()

Returns a tuple

11

bytearray()

Returns array of given byte size

12

enumerate()

Returns an enumerate object

13

isprintable()

Checks printable character

Let’s try to run couple of string methods,

>>> str1 = 'Hello World!'
>>> str1.startswith('h')
False
>>> str1.startswith('H')
True
>>> str1.endswith('d')
False
>>> str1.endswith('d!')
True
>>> str1.find('o')
4
>>> #Above returns the index of the first occurence of the character/substring.
>>> str1.find('lo')
3
>>> str1.upper()
'HELLO WORLD!'
>>> str1.lower()
'hello world!'
>>> str1.index('b')
Traceback (most recent call last):
   File "<pyshell#19>", line 1, in <module>
      str1.index('b')
ValueError: substring not found
>>> s = ('hello How Are You')
>>> s.split(' ')
['hello', 'How', 'Are', 'You']
>>> s1 = s.split(' ')
>>> '*'.join(s1)
'hello*How*Are*You'
>>> s.partition(' ')
('hello', ' ', 'How Are You')
>>>

String Formatting

In Python 3.x formatting of strings has changed, now it more logical and is more flexible. Formatting can be done using the format() method or the % sign(old style) in format string.

The string can contain literal text or replacement fields delimited by braces {} and each replacement field may contains either the numeric index of a positional argument or the name of a keyword argument.

syntax

str.format(*args, **kwargs)

Basic Formatting

>>> '{} {}'.format('Example', 'One')
'Example One'
>>> '{} {}'.format('pie', '3.1415926')
'pie 3.1415926'

Below example allows re-arrange the order of display without changing the arguments.

>>> '{1} {0}'.format('pie', '3.1415926')
'3.1415926 pie'

Padding and aligning strings

A value can be padded to a specific length.

>>> #Padding Character, can be space or special character
>>> '{:12}'.format('PYTHON')
'PYTHON '
>>> '{:>12}'.format('PYTHON')
' PYTHON'
>>> '{:<{}s}'.format('PYTHON',12)
'PYTHON '
>>> '{:*<12}'.format('PYTHON')
'PYTHON******'
>>> '{:*^12}'.format('PYTHON')
'***PYTHON***'
>>> '{:.15}'.format('PYTHON OBJECT ORIENTED PROGRAMMING')
'PYTHON OBJECT O'
>>> #Above, truncated 15 characters from the left side of a specified string
>>> '{:.{}}'.format('PYTHON OBJECT ORIENTED',15)
'PYTHON OBJECT O'
>>> #Named Placeholders
>>> data = {'Name':'Raghu', 'Place':'Bangalore'}
>>> '{Name} {Place}'.format(**data)
'Raghu Bangalore'
>>> #Datetime
>>> from datetime import datetime
>>> '{:%Y/%m/%d.%H:%M}'.format(datetime(2018,3,26,9,57))
'2018/03/26.09:57'

Strings are Unicode

Strings as collections of immutable Unicode characters. Unicode strings provide an opportunity to create software or programs that works everywhere because the Unicode strings can represent any possible character not just the ASCII characters.

Many IO operations only know how to deal with bytes, even if the bytes object refers to textual data. It is therefore very important to know how to interchange between bytes and Unicode.

Converting text to bytes

Converting a strings to byte object is termed as encoding. There are numerous forms of encoding, most common ones are: PNG; JPEG, MP3, WAV, ASCII, UTF-8 etc. Also this(encoding) is a format to represent audio, images, text, etc. in bytes.

This conversion is possible through encode(). It take encoding technique as argument. By default, we use ‘UTF-8’ technique.

>>> # Python Code to demonstrate string encoding 
>>> 
>>> # Initialising a String 
>>> x = 'Howcodex' 
>>> 
>>> #Initialising a byte object 
>>> y = b'Howcodex'
>>> 
>>> # Using encode() to encode the String >>> # encoded version of x is stored in z using ASCII mapping 
>>> z = x.encode('ASCII') 
>>> 
>>> # Check if x is converted to bytes or not 
>>> 
>>> if(z==y): 
   print('Encoding Successful!') 
else: 
   print('Encoding Unsuccessful!') 
Encoding Successful!

Converting bytes to text

Converting bytes to text is called the decoding. This is implemented through decode(). We can convert a byte string to a character string if we know which encoding is used to encode it.

So Encoding and decoding are inverse processes.

>>> 
>>> # Python code to demonstrate Byte Decoding 
>>> 
>>> #Initialise a String 
>>> x = 'Howcodex' 
>>> 
>>> #Initialising a byte object 
>>> y = b'Howcodex' 
>>> 
>>> #using decode() to decode the Byte object 
>>> # decoded version of y is stored in z using ASCII mapping 
>>> z = y.decode('ASCII')
>>> #Check if y is converted to String or not 
>>> if (z == x): 
   print('Decoding Successful!') 
else: 
   print('Decoding Unsuccessful!') Decoding Successful! 
>>>

File I/O

Operating systems represents files as a sequence of bytes, not text.

A file is a named location on disk to store related information. It is used to permanently store data in your disk.

In Python, a file operation takes place in the following order.

  • Open a file
  • Read or write onto a file (operation).Open a file
  • Close the file.

Python wraps the incoming (or outgoing) stream of bytes with appropriate decode (or encode) calls so we can deal directly with str objects.

Opening a file

Python has a built-in function open() to open a file. This will generate a file object, also called a handle as it is used to read or modify the file accordingly.

>>> f = open(r'c:\users\rajesh\Desktop\index.webm','rb')
>>> f
<_io.BufferedReader name='c:\\users\\rajesh\\Desktop\\index.webm'>
>>> f.mode
'rb'
>>> f.name
'c:\\users\\rajesh\\Desktop\\index.webm'

For reading text from a file, we only need to pass the filename into the function. The file will be opened for reading, and the bytes will be converted to text using the platform default encoding.

Advertisements