
Building a simple RSS feed reader with PHP

18 JAN 2011

If you aren't already familiar with what RSS stands for, I highly advise you to have a look at this great article on ProBlogger, then come back to read the rest of this entry. In this article I will also assume that you are more or less familiar with object-oriented programming and that you know some PHP.

Back in the days of PHP 4.x, reading an RSS stream would have involved opening a TCP connection to the URI, reading the content and parsing it with an endless stream of regular expressions. PHP 5 lets you do it in a few lines of code with the help of simplexml_load_file().

Prerequisites

You will need the libxml extension installed, since SimpleXML is built on top of it. To check for it, point your browser to a page that calls phpinfo() and look for a libxml section similar to the image below:

[Image: libxml section of the phpinfo() output]

If the extension is not enabled, you will need to recompile PHP with the --enable-libxml argument (or ask your web hosting provider to set it up for you if you are on a shared host). Also, since you will be opening remote XML feeds, you will need allow_url_fopen = on in your php.ini file.
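
If you would rather check from a script than read through the phpinfo page, a minimal sketch along these lines reports the same prerequisites. It relies only on the built-in extension_loaded(), function_exists() and ini_get() functions; the $checks array and its labels are names made up for this example.

[cc lang="php"]
<?php
// Collect the prerequisites for calling simplexml_load_file() on a remote feed.
// The $checks array and its keys are illustrative names, not part of PHP.
$checks = array(
    'libxml extension'      => extension_loaded('libxml'),
    'simplexml_load_file()' => function_exists('simplexml_load_file'),
    // ini_get() returns a string such as "1", "On" or ""; normalise it to a boolean.
    'allow_url_fopen'       => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
);

foreach ($checks as $name => $ok) {
    echo $name . ': ' . ($ok ? 'OK' : 'missing') . "\n";
}
[/cc]

Anything reported as missing needs to be fixed in the PHP build or in php.ini before the rest of the examples will work against a remote feed.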

Reading the remote feed with simplexml_load_file()

Getting the RSS feed is as simple as:

[cc lang="php"]
<?php
$rss = simplexml_load_file('https://www.lifelinedesign.ca/blog/feed/');
?>
[/cc]

For this example, we will be using the Lifeline Design feed. Provided there are no errors, $rss is now a SimpleXMLElement object, with one property we are interested in: $rss->channel contains all information in the RSS feed. $rss->channel is a SimpleXMLElement object as well, with the following properties:

  • title: The title of the blog, the one that you see at the very top of your browser window when you open the front page. In our case, it will be “Lifeline Blog.”
  • link: The URL of the blog, “https://www.lifelinedesign.ca/blog”.
  • description: Description of the site. It might not be relevant for the blog, since a lot of WordPress themes don’t take it into account. With LLD, the description is the standard “Just another WordPress weblog.”
  • lastBuildDate: The time and date of the last update, namely the latest post published.
  • generator: Usually the name and version of the script the site is running on. With WordPress blogs, this string is usually in the form of “http://wordpress.org/?v=x.x.x”.
  • language: The language the site runs on, given as a language code such as “en”.
  • item: The posts offered in the RSS feed, one item element per post. SimpleXML exposes them as an iterable SimpleXMLElement rather than a true PHP array.
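
To see these channel properties without depending on a live feed, the sketch below parses a small hand-written RSS document with simplexml_load_string(). The sample XML and all of its values are invented for illustration; a real feed would be loaded with simplexml_load_file() instead.

[cc lang="php"]
<?php
// Hypothetical sample feed, made up for this example.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/blog</link>
    <description>Just another WordPress weblog</description>
    <lastBuildDate>Tue, 18 Jan 2011 10:00:00 +0000</lastBuildDate>
    <generator>http://wordpress.org/?v=x.x.x</generator>
    <language>en</language>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);
$channel = $rss->channel;

// Each property is itself a SimpleXMLElement; cast it to string to get the text.
$title    = (string) $channel->title;
$link     = (string) $channel->link;
$language = (string) $channel->language;

// lastBuildDate is an RFC 822 style date, which strtotime() can parse.
$lastBuild = strtotime((string) $channel->lastBuildDate);

echo $title . ' (' . $link . '), language: ' . $language . "\n";
echo 'Last updated on: ' . gmdate('Y-m-d H:i', $lastBuild) . " UTC\n";
[/cc]

The explicit (string) casts matter as soon as the values are compared, stored in an array or passed to functions that expect plain strings; echo alone performs the conversion implicitly.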

To get the actual posts, you will need to loop through the $rss->channel->item object, which is also an instance of SimpleXMLElement, and read the following properties:

  • title: the title of the article;
  • link: the permalink to the post;
  • comments: the URL for comments on the article (with most WordPress installations this would be http://example.com/permalink-to-article/#comments);
  • pubDate: self-explanatory, the date and time when the article was published;
  • category: a SimpleXMLElement object with the category under which the article lies;
  • guid: the URL of the article in standard canonical format (http://example.com/?p=article_id);
  • description: depending on the settings on that site, this will contain either a preview or the full article.
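
As a rough illustration of reading these item properties, the sketch below loops over a hand-written two-post feed and copies the values into plain PHP arrays. The sample XML, the $posts array and its keys are all assumptions made for this example; note that one item may carry several category elements, so they are collected in an inner loop.

[cc lang="php"]
<?php
// Hypothetical two-post feed, made up for this example.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first-post/</link>
      <pubDate>Mon, 17 Jan 2011 09:30:00 +0000</pubDate>
      <category>PHP</category>
      <category>Tutorials</category>
      <guid>http://example.com/?p=1</guid>
      <description>Preview of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second-post/</link>
      <pubDate>Tue, 18 Jan 2011 10:00:00 +0000</pubDate>
      <category>News</category>
      <guid>http://example.com/?p=2</guid>
      <description>Preview of the second post.</description>
    </item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);

$posts = array();
foreach ($rss->channel->item as $item) {
    // An item may carry several <category> elements; collect them all.
    $categories = array();
    foreach ($item->category as $category) {
        $categories[] = (string) $category;
    }

    $posts[] = array(
        'title'      => (string) $item->title,
        'link'       => (string) $item->link,
        'published'  => strtotime((string) $item->pubDate), // Unix timestamp
        'categories' => $categories,
    );
}

foreach ($posts as $post) {
    echo $post['title'] . ' [' . implode(', ', $post['categories']) . '] '
        . gmdate('Y-m-d', $post['published']) . "\n";
}
[/cc]

Copying the values into ordinary arrays is a design choice rather than a requirement: it makes the posts easy to sort, cache or pass to a template without carrying SimpleXMLElement objects around.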

The code

As outlined at the beginning, a few lines of code are enough to do the job:

[cc lang="php"]
<?php

$rss = simplexml_load_file('https://www.lifelinedesign.ca/blog/feed/');

if ($rss) {
    echo '<h1>' . $rss->channel->title . '</h1>'; // channel title, aka the name of the blog
    echo 'Last updated on: ' . $rss->channel->lastBuildDate; // date of the last post
    echo '<ul>';

    foreach ($rss->channel->item as $item) { // loop through the articles
        echo '<li><a href="' . $item->link . '">' . $item->title . '</a>'; // title and link
        echo ' (' . $item->pubDate . ')'; // publishing date
        echo '<br />' . $item->description . '</li>'; // contents
    }

    echo '</ul>';
}

?>
[/cc]

Of course, there is still room for improvement (for example, if there is no feed at that URL, libxml will emit a set of nasty warnings), yet the code can be used to jump-start your own RSS reader.
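
One possible way to tame those warnings is sketched below, using libxml_use_internal_errors() so that parse errors are collected instead of printed. loadFeed() and its $errors parameter are hypothetical names introduced for this example, and the check for a channel element is an assumption about what counts as a usable feed.

[cc lang="php"]
<?php
/**
 * Load a feed, returning null instead of letting libxml print warnings.
 * loadFeed() is a hypothetical helper name used only in this example.
 */
function loadFeed($source, &$errors = array())
{
    // Collect libxml errors internally instead of emitting them as warnings.
    $previous = libxml_use_internal_errors(true);

    $rss = simplexml_load_file($source);

    // Copy the collected messages so the caller can log or display them.
    $errors = array();
    foreach (libxml_get_errors() as $error) {
        $errors[] = trim($error->message);
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting

    // Assumption: a document without a <channel> element is not a usable feed.
    if ($rss === false || !isset($rss->channel)) {
        return null;
    }

    return $rss;
}
[/cc]

Calling loadFeed('https://www.lifelinedesign.ca/blog/feed/', $errors) should then give you either the feed object or null, with $errors holding whatever libxml complained about. Failures at the network level, such as a host that cannot be reached, may still raise ordinary PHP warnings from the stream layer, so it is worth testing this against the kinds of failure you expect.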
