Max Schmitt
10th June 2020

XML RSS feeds with Node.js

RSS feeds are great for keeping readers of your website in the loop and are a nice source of traffic for frequently updated sites.

I recently created an RSS feed for my own site and I'm going to show you how you can create one yourself using Node.js.

Anatomy of an RSS feed

Here is what the XML markup of an RSS feed basically looks like:

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <!-- Here goes some general info about your site -->
    <atom:link href="https://maximilianschmitt.me/feed.rss" rel="self" type="application/rss+xml" />
    <title>Max Schmitt</title>
    <link>https://maximilianschmitt.me</link>
    <description>Posts and articles by a web developer specializing in Node.js and React</description>
    <language>en-us</language>
    <!-- Every <item> contains a blog post -->
    <item>
      <title>Testing mobile, tablet and desktop devices with Cypress</title>
      <!-- <pubDate> expects your date to be in RFC-822 format -->
      <pubDate>Mon, 01 Jul 2019 00:00:00 +0200</pubDate>
      <link>https://maximilianschmitt.me/posts/cypress-testing-mobile-tablet-desktop</link>
      <!-- The <guid> can be any unique identifier, so the URL would work well -->
      <guid>https://maximilianschmitt.me/posts/cypress-testing-mobile-tablet-desktop</guid>
      <description><![CDATA[<p>Put your post&apos;s HTML inside CDATA</p>]]></description>
    </item>
    <!-- Here is another <item> -->
    <item>
      <title>Error reporting with Rollbar and Next.js</title>
      <pubDate>Fri, 10 Jan 2020 00:00:00 +0100</pubDate>
      <link>https://maximilianschmitt.me/posts/error-reporting-rollbar-nextjs</link>
      <guid>https://maximilianschmitt.me/posts/error-reporting-rollbar-nextjs</guid>
      <description><![CDATA[<!-- This is where your article's HTML goes -->]]></description>
    </item>
    <!-- ... -->
  </channel>
</rss>

Most of it is pretty self-explanatory, but there are a few things to watch out for.

Gotcha: <pubDate> format

The <pubDate> of an item is expected in RFC-822 format (RSS 2.0 additionally allows, and prefers, a four-digit year). In a library like Moment.js or Day.js, you would express it with the following format tokens:

ddd, DD MMM YYYY HH:mm:ss ZZ // RFC-822 date-time
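If you would rather not pull in a date library for this, the same format can be produced by hand. The following is a minimal sketch, not something taken from my generator: the function name formatRfc822 and its offsetMinutes parameter are made up for illustration, and it assumes you already know the UTC offset you want to print.

```javascript
const DAYS = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']
const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
  'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

// Left-pad a number to two digits, e.g. 7 -> "07"
const pad = (n) => String(n).padStart(2, '0')

// Hypothetical helper: formats `date` as "ddd, DD MMM YYYY HH:mm:ss ZZ"
// for a zone that is `offsetMinutes` ahead of UTC (negative = behind).
function formatRfc822(date, offsetMinutes = 0) {
  // Shift the timestamp so the UTC getters return the wall-clock time
  // of the target zone.
  const local = new Date(date.getTime() + offsetMinutes * 60000)
  const sign = offsetMinutes < 0 ? '-' : '+'
  const abs = Math.abs(offsetMinutes)
  const zone = sign + pad(Math.floor(abs / 60)) + pad(abs % 60)

  return (
    DAYS[local.getUTCDay()] + ', ' +
    pad(local.getUTCDate()) + ' ' +
    MONTHS[local.getUTCMonth()] + ' ' +
    local.getUTCFullYear() + ' ' +
    pad(local.getUTCHours()) + ':' +
    pad(local.getUTCMinutes()) + ':' +
    pad(local.getUTCSeconds()) + ' ' +
    zone
  )
}

// Midnight on 1 July 2019 in a UTC+2 zone
const pubDate = formatRfc822(new Date('2019-06-30T22:00:00Z'), 120)
```

If a UTC timestamp is good enough for your feed, the built-in date.toUTCString() is an even shorter route: it returns strings like "Mon, 01 Jul 2019 00:00:00 GMT", and GMT is a zone name that RFC-822 accepts.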

Gotcha: <guid> tag

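The <guid> of each item in an RSS feed must be unique, and it should not change after publication, because readers use it to tell which items they have already seen. If you have a post ID, you can use that; otherwise, just use the post's URL. One caveat: a <guid> is treated as a permanent URL by default, so a non-URL ID should carry the attribute isPermaLink="false".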
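Here is a rough sketch of how that choice could look in code. In RSS 2.0 the <guid> element has an optional isPermaLink attribute that defaults to "true", so an ID that is not a URL should set it to "false". The helper name guidNode and the post.id field are assumptions for this example; the returned objects use the _attr convention of the xml package from npm.

```javascript
// Hypothetical helper: builds the <guid> node for one post.
// `post.id` is an assumed field; `post.href` is the post's site-relative path.
function guidNode(post, siteBaseUrl) {
  if (post.id != null) {
    // A plain ID is not a URL, so tell readers not to treat it as a link
    return { guid: [{ _attr: { isPermaLink: 'false' } }, String(post.id)] }
  }
  // No ID available: fall back to the absolute URL of the post
  return { guid: siteBaseUrl + post.href }
}

const withId = guidNode({ id: 42, href: '/posts/a' }, 'https://example.com')
const withoutId = guidNode({ href: '/posts/b' }, 'https://example.com')
```

Passed through the xml package, the first shape should serialize to a <guid> element with the attribute and the ID as its text, the second to a plain <guid> containing the URL.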

Gotcha: HTML, CDATA and the <description> tag

The correct place to put the HTML markup of your post is the <description> tag. If you don't want to escape every special character in the HTML, you can wrap it in a CDATA section, which tells the XML parser to treat everything inside it as plain text.

<description>
<!-- ❌ Invalid, HTML is not escaped: -->
<p>Welcome to my blog post</p>
<!-- ✅ Valid, HTML is escaped: -->
&lt;p&gt;Welcome to my blog post&lt;/p&gt;
<!-- ✅ Valid, HTML is inside CDATA section: -->
<![CDATA[<p>Welcome to my blog post</p>]]>
</description>
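If you ever build the XML string by hand, both valid variants are a few lines of plain JavaScript. This is a sketch with made-up helper names (escapeXml and wrapInCdata), not code from a library. One detail worth knowing: a CDATA section ends at the first "]]>", so if your HTML happens to contain that sequence, it has to be split across two CDATA sections.

```javascript
// Escape the five characters that are special in XML.
// "&" must be replaced first, or the other entities get double-escaped.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;')
}

// Wrap HTML in a CDATA section. Any "]]>" inside the HTML is split so that
// "]]" closes one section and ">" opens the next.
function wrapInCdata(html) {
  return '<![CDATA[' + html.split(']]>').join(']]]]><![CDATA[>') + ']]>'
}

const escaped = escapeXml('<p>Tom & Jerry</p>')
const wrapped = wrapInCdata('<p>Welcome to my blog post</p>')
```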

Converting JSON to XML/RSS

Now that you know what the end result should look like, we'll create the XML by feeding a good old JavaScript object to the amazing xml package, which you can get from npm.

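const xml = require('xml')

// Base URL of the site, without a trailing slash
const SITE_BASE_URL = 'https://maximilianschmitt.me'

// `posts` is the list of posts loaded by the site generator. Each post is
// expected to have `href`, `data.title`, `rfc822Date` and `html`.
const xmlObject = {
  rss: [
    {
      // Attributes of the <rss> root element
      _attr: {
        version: '2.0',
        'xmlns:atom': 'http://www.w3.org/2005/Atom'
      }
    },
    {
      channel: [
        {
          // Self-reference to the feed's own URL
          'atom:link': {
            _attr: {
              href: SITE_BASE_URL + '/feed.rss',
              rel: 'self',
              type: 'application/rss+xml'
            }
          }
        },
        { title: 'Max Schmitt' },
        { link: 'https://maximilianschmitt.me' },
        { description: 'A short description' },
        { language: 'en-us' },
        // One <item> per post
        ...posts.map((post) => {
          const absoluteHREF = SITE_BASE_URL + post.href
          return {
            item: [
              { title: post.data.title },
              { pubDate: post.rfc822Date },
              { link: absoluteHREF },
              { guid: absoluteHREF },
              // _cdata wraps the post's HTML in a CDATA section
              { description: { _cdata: post.html } }
            ]
          }
        })
      ]
    }
  ]
}

// The xml package does not add the XML declaration here, so prepend it
const xmlString = '<?xml version="1.0" encoding="UTF-8"?>' + xml(xmlObject)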
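The snippet above takes the posts array as a given. Here is one way such an array could be prepared, as a sketch under assumptions: the raw post objects, the data.date field and the prepareFeedPosts helper are all hypothetical, and the date is formatted with the built-in toUTCString(), which yields a GMT timestamp that is valid for <pubDate>.

```javascript
// Hypothetical raw posts, as a static site generator might load them
const rawPosts = [
  {
    href: '/posts/cypress-testing-mobile-tablet-desktop',
    data: { title: 'Testing with Cypress', date: '2019-07-01T00:00:00Z' },
    html: '<p>First post</p>'
  },
  {
    href: '/posts/error-reporting-rollbar-nextjs',
    data: { title: 'Error reporting', date: '2020-01-10T00:00:00Z' },
    html: '<p>Second post</p>'
  }
]

// Returns the newest `limit` posts, newest first, each with an added
// `rfc822Date` string ready to be used as the <pubDate> value.
function prepareFeedPosts(allPosts, limit = 20) {
  return allPosts
    .slice() // copy, so the caller's array is not reordered
    .sort((a, b) => new Date(b.data.date) - new Date(a.data.date))
    .slice(0, limit)
    .map((post) => ({
      ...post,
      rfc822Date: new Date(post.data.date).toUTCString()
    }))
}

const posts = prepareFeedPosts(rawPosts)
```

Capping the number of items is a judgment call: it keeps the feed file small, at the cost of older posts dropping out of the feed.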

To make your RSS feed easily discoverable, put this inside the <head> tag of your site:

<link rel="alternate" type="application/rss+xml" href="https://maximilianschmitt.me/feed.rss" title="Maximilian Schmitt">

Using cheerio to generate absolute URLs

An RSS reader will expect absolute URLs for any links or images inside your posts.

If, like me, you use relative URLs inside your post HTML, you can use cheerio to go over the markup and make all relative URLs absolute:

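const cheerio = require('cheerio')
const url = require('url')

const absoluteHREF = SITE_BASE_URL + post.href
const $ = cheerio.load(post.html)

// Resolve every image source against the post's own URL
$('img').each(function (i, el) {
  const originalSRC = $(el).attr('src')
  if (!originalSRC) return // skip images without a src attribute
  const absoluteSRC = url.resolve(absoluteHREF + '/', originalSRC)
  $(el).attr('src', absoluteSRC)
})

// Do the same for every link
$('a').each(function (i, el) {
  const originalLink = $(el).attr('href')
  if (!originalLink) return // skip anchors without an href attribute
  const absoluteLink = url.resolve(absoluteHREF + '/', originalLink)
  $(el).attr('href', absoluteLink)
})

// Only the markup inside the #rss-main element ends up in the feed
const postHTML = $('#rss-main').html()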
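The resolving step itself does not depend on cheerio. Node's documentation marks url.resolve() as a legacy API and points to the WHATWG URL class instead, so here is a small sketch of the same idea using the global URL constructor. The helper name toAbsoluteUrl and the example.com base URL are made up for illustration.

```javascript
// Hypothetical helper: resolves `maybeRelative` against `baseUrl`.
// Missing values and strings that cannot be parsed are returned unchanged.
function toAbsoluteUrl(maybeRelative, baseUrl) {
  if (!maybeRelative) return maybeRelative
  try {
    return new URL(maybeRelative, baseUrl).href
  } catch (err) {
    return maybeRelative
  }
}

// The trailing slash matters: without it, "my-post" would be treated as a
// file name and replaced instead of being kept as a directory.
const base = 'https://example.com/posts/my-post/'

const image = toAbsoluteUrl('images/a.png', base)
const rootLink = toAbsoluteUrl('/about', base)
const external = toAbsoluteUrl('https://other.example/x', base)
```

URLs that are already absolute pass through untouched, so the helper can be applied to every src and href without checking first.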

And that's how I generate the RSS feed for my site, as part of my custom static site generator of course. 🙃

If you enjoyed this post, you might also be interested in creating a sitemap.xml with Node.js.

If you're looking for an in-depth explanation of the various XML tags available in RSS, check out this great article about RSS 2.0.