RSS (Rich Site Summary) is a format for delivering regularly changing web content. Many news sites, weblogs and other online publishers syndicate their content as an RSS feed to anyone who wants it. In Python, we can read and process these feeds with the feedparser package, which is installed as shown below.
pip install feedparser
In the example below we print the keys of a single feed entry. This shows the structure of the feed and helps us decide which parts of it we want to process.
import feedparser

NewsFeed = feedparser.parse("https://timesofindia.indiatimes.com/rssfeedstopstories.cms")
# entries is zero-indexed, so entries[1] is the second post in the feed
entry = NewsFeed.entries[1]
print(list(entry.keys()))
When we run the above program, we get output like the following (the exact keys depend on the feed):
['summary_detail', 'published_parsed', 'links', 'title', 'summary', 'guidislink', 'title_detail', 'link', 'published', 'id']
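The key list above contains both published (the date as text) and published_parsed (the same date as a time tuple). As a minimal sketch, assuming published_parsed holds a 9-field time tuple expressed in UTC, it can be turned into a timezone-aware datetime as follows. The helper name to_utc_datetime and the sample value are illustrative stand-ins, not part of feedparser, so that the example runs without network access.

```python
import time
from datetime import datetime, timezone


def to_utc_datetime(parsed_time):
    """Convert a 9-field time tuple (assumed to be UTC) to an aware datetime."""
    # The first six fields are year, month, day, hour, minute, second.
    return datetime(*parsed_time[:6], tzinfo=timezone.utc)


# Stand-in for entry.published_parsed: Fri, 18 May 2018 20:13:13 GMT.
sample_parsed = time.struct_time((2018, 5, 18, 20, 13, 13, 4, 138, 0))

published_at = to_utc_datetime(sample_parsed)
print(published_at.isoformat())  # 2018-05-18T20:13:13+00:00
```

With a real feed, the same helper would be called as to_utc_datetime(entry.published_parsed), after checking that the key exists and is not None.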
In the example below we read the number of posts in the RSS feed and the title of one post.
import feedparser

NewsFeed = feedparser.parse("https://timesofindia.indiatimes.com/rssfeedstopstories.cms")
print('Number of RSS posts :', len(NewsFeed.entries))
# entries[1] is the second post in the feed
entry = NewsFeed.entries[1]
print('Post Title :', entry.title)
When we run the above program, we get the following output:
Number of RSS posts : 5
Post Title : Cong-JD(S) in SC over choice of pro tem speaker
Based on the entry structure above, we can extract the details we need from the feed, as shown below. Each entry is a dictionary-like object, so its keys can be read either with dictionary syntax (entry['title']) or, as in these examples, as attributes (entry.title).
import feedparser

NewsFeed = feedparser.parse("https://timesofindia.indiatimes.com/rssfeedstopstories.cms")
# entries[1] is the second post in the feed
entry = NewsFeed.entries[1]
print(entry.published)
print("******")
print(entry.summary)
print("------News Link--------")
print(entry.link)
When we run the above program, we get the following output:
Fri, 18 May 2018 20:13:13 GMT
******
Controversy erupted on Friday over the appointment of BJP MLA K G Bopaiah as pro tem speaker for the assembly, with Congress and JD(S) claiming the move went against convention that the post should go to the most senior member of the House. The combine approached the SC to challenge the appointment. Hearing is scheduled for 10:30 am today.
------News Link--------
https://timesofindia.indiatimes.com/india/congress-jds-in-sc-over-bjp-mla-made-pro-tem-speaker-hearing-at-1030-am/articleshow/64228740.cms
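Not every feed supplies every field, and reading a missing key as an attribute raises an error. Because entries behave like dictionaries, a more defensive approach is to use get() with a default value. The following is a minimal sketch. The helper name describe_entry and the fallback strings are illustrative choices, and a plain dict stands in for a feed entry so that the example runs without network access.

```python
def describe_entry(entry):
    """Return a one-line text summary of a feed entry.

    `entry` can be any object with a dict-style get() method, which
    includes the entries returned by feedparser. Missing fields fall
    back to placeholder text instead of raising an error.
    """
    title = entry.get("title", "(no title)")
    published = entry.get("published", "(no date)")
    link = entry.get("link", "(no link)")
    return f"{title} | {published} | {link}"


# Plain dict standing in for a feed entry that has no 'published' field.
partial_entry = {"title": "Example headline", "link": "https://example.com/story"}

line = describe_entry(partial_entry)
print(line)  # Example headline | (no date) | https://example.com/story
```

With a real feed, the helper would be called as describe_entry(NewsFeed.entries[1]).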