Python - Constrained Search



Often, after we get the result of a search, we need to search one level deeper into part of that result. For example, in a given body of text we may want to find the web addresses and also extract the different parts of each address, such as the protocol and the domain name. In such a scenario we use the group feature of regular expressions, which divides the match into groups based on the pattern we supply. We create the groups by putting parentheses around each part of the pattern we want to capture, leaving the fixed separator text we want to match outside the parentheses.

import re

text = "The web address is https://www.howcodex.com"

# Use "://" and "." as the fixed separators between the three groups
result = re.search(r'([\w.-]+)://([\w.-]+)\.([\w.-]+)', text)
if result:
    # group() returns the whole match; group(n) returns the n-th parenthesized group
    print("The main web address:", result.group())
    print("The protocol:", result.group(1))
    print("The domain name:", result.group(2))
    print("The TLD:", result.group(3))

When we run the above program, we get the following output:

The main web address: https://www.howcodex.com
The protocol: https
The domain name: www.howcodex
The TLD: com
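
When the text contains more than one web address, the same grouping idea can be applied to every match. The following is a minimal sketch, not part of the original example: the sample sentence, the second address (docs.example.org), and the group names protocol, domain and tld are assumptions chosen for illustration. It uses the standard re.finditer function and the (?P<name>...) named-group syntax.

```python
import re

# Hypothetical sample text; the second address is made up for illustration.
text = "Visit https://www.howcodex.com or http://docs.example.org for details"

# Same pattern as in the example above, with each group given a name.
pattern = re.compile(
    r'(?P<protocol>[\w.-]+)://(?P<domain>[\w.-]+)\.(?P<tld>[\w.-]+)'
)

# finditer yields one match object per address found in the text;
# groupdict() returns the named groups of a match as a dictionary.
addresses = [match.groupdict() for match in pattern.finditer(text)]

for parts in addresses:
    print(parts["protocol"], parts["domain"], parts["tld"])
```

Named groups make the code easier to read than numbered ones, because result.group("domain") says what is being extracted where result.group(2) does not. Note that this simple pattern would include trailing punctuation in the last group if an address were immediately followed by a period.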