How to scrape information from a page and serve it as an RSS feed?
This Python tutorial explains how to scrape the information published on a webpage and transform it into a clean, valid RSS feed.
Let’s begin with this trial page I’ve built. What we want to do is transform all its paragraphs, titles and dates into a valid RSS feed.
In order to do so I will use two libraries: lxml for the scraping and Yattag for generating the XML code of the feed.
Here’s the full code; I walk through it in the comments. You can also find it in a GitHub gist.