Migrating from Radiant CMS to... *anything* else

I recently mentioned the painful transition of this website from the Ruby/Rails Radiant content management system to... well, anything that would actually work. Given its popularity, I have to assume that Ruby and Rails can be made to work well -- or that thousands of development teams are herd-following idiots, but that can't be true, right? -- but my experience was a nightmare.

Mysterious Rakefiles, UI-disaster server commands, awful integration with system packages, god-awful outdated Radiant documentation, and changes with every release. Eventually an update of the base Ubuntu OS completely broke Radiant. I tried using RubyGems in all the ways I could find, and updated every package to the latest version that Radiant thought it wanted, but couldn't get it to run again. I tried making a new Radiant site and migrating the database via the advertised commands: it crashed. In the end it seemed that Radiant's own declaration of package dependencies was inconsistent. This was just the final straw after several years of expecting a Rails epiphany, and of dreading every time that I'd have to restart the server and somehow get the creaking mess up and running again.

Well, enough was enough. I've now moved to using the Nikola static site generator instead and couldn't be happier: it's got a great command-line UI, it's totally clear what's going on, I can hack and extend it if I want to, and my data is forever in a human-readable, editable (even when offline!) format.

Radiant's page data is categorically not available in a human-readable format, so a significant part of the effort to get this site back to life went into writing a script to read its article database and dump out the pages in a form I could use. Fortunately the db is just a single-file SQLite database, and the table structure was pretty simple, so the dump script was easy. Here it is for posterity:

radiant2txt

#! /usr/bin/env python

"Convert a RadiantCMS SQLite3 db file into separate page and header text files"

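import optparse, os
op = optparse.OptionParser(usage="%prog DBFILE [-o OUTDIR]")
op.add_option("-o", "--out", dest="OUTDIR", default="out")
opts, args = op.parse_args()
if not args:
    op.error("no database file given")
# Create the output directory if it does not exist yet
if not os.path.isdir(opts.OUTDIR):
    os.makedirs(opts.OUTDIR)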

import sqlite3
conn = sqlite3.connect(args[0])
conn.row_factory = sqlite3.Row
c = conn.cursor()

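import unicodedata
def norm(s):
    "Strip accents and other non-ASCII characters, returning text rather than bytes"
    # The final decode keeps the result a text string, so it can be concatenated
    # and wrapped under Python 3 as well; NULL database values become ""
    return unicodedata.normalize("NFD", s or u"").encode("ascii", "ignore").decode("ascii")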

import datetime
def date(s):
    return datetime.datetime.strptime(s, "%Y-%m-%d %H:%M:%S").date().isoformat() if s else ""

import textwrap, re
class DocWrapper(textwrap.TextWrapper):
    """Wrap text in a document, processing each paragraph individually"""

    def __init__(self):
        self.tw = textwrap.TextWrapper(width=120, break_long_words=False)

    def wrap(self, text):
        """Override textwrap.TextWrapper to process 'text' properly when
        multiple paragraphs present"""
        para_edge = re.compile(r"(\n\s*\n)", re.MULTILINE)
        paragraphs = para_edge.split(text)
        wrapped_lines = []
        for para in paragraphs:
            if para.isspace():
                wrapped_lines.append('')
            else:
                wrapped_lines.extend(self.tw.wrap(para))
        return wrapped_lines

dw = DocWrapper()

for page in conn.execute("SELECT * FROM pages"):
    pagename = page["slug"] if page["slug"] != "/" else "index"
    outfile = os.path.join(opts.OUTDIR, "%s.md" % pagename)
    with open(outfile, "w") as f:
        f.write("<!-- \n")
        f.write(".. title: " + norm(page["title"]) + "\n")
        f.write(".. slug: " + pagename + "\n")
        if page["published_at"]:
            f.write(".. date: " + page["published_at"] + "\n")
        else:
            f.write(".. date: 2008-06-01 12:00:00\n")
        f.write(".. type: text\n")
        f.write(".. category: blog\n")
        f.write("-->")
        f.write("\n\n")
        for part in conn.execute("SELECT * FROM page_parts WHERE page_id = ? ORDER BY page_parts.name", (page["id"],)):
            text = dw.fill(norm(part["content"]))
            if text:
                f.write(text + "\n")

To get a bunch of pages out in the format I wanted (my site was using Markdown syntax, so the script writes out to a bunch of .md files), I ran this like:

./radiant2txt myradiantsite/db/radiant_live.sqlite.db -o out-nikola
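
Each file written this way starts with an HTML comment holding the ".. key: value" metadata lines that the script emits (title, slug, date, type, category). For checking the output before handing it to Nikola, something like the following rough sketch can read that header back into a dictionary. The name parse_header and the regular expression are my own illustration, not part of Nikola or of radiant2txt, and it only assumes the header layout shown in the script above.

```python
import re

# Matches a leading "<!-- ... -->" comment and captures the lines inside it
HEADER_RE = re.compile(r"\A<!--\s*\n(.*?)\n-->", re.DOTALL)

def parse_header(text):
    """Return the '.. key: value' lines of a leading HTML comment as a dict.

    Hypothetical helper: returns an empty dict if no header comment is found.
    """
    match = HEADER_RE.match(text)
    if not match:
        return {}
    meta = {}
    for line in match.group(1).splitlines():
        if line.startswith(".. ") and ":" in line:
            # Split on the first colon only, so dates like 12:00:00 survive
            key, value = line[3:].split(":", 1)
            meta[key.strip()] = value.strip()
    return meta

sample = (
    "<!-- \n"
    ".. title: Hello\n"
    ".. slug: index\n"
    ".. date: 2008-06-01 12:00:00\n"
    "-->\n\n"
    "Body text\n"
)
print(parse_header(sample)["slug"])  # prints: index
```

Running it over every file in the output directory and flagging any that come back with no title or slug is a quick way to find the pages that need attention.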

A bit of manual hacking followed, but 95% of the job was done by the radiant2txt script. Use it if you like, but don't ask me for support; if you need something a bit different, hack it!
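
If you do need to adapt it, a sensible first step is to look at which tables and columns your database actually has, since radiant2txt assumes a "pages" table and a "page_parts" table. Here is a minimal sketch using only Python's standard sqlite3 module; the function name list_schema is made up for illustration, and I haven't checked it against any particular Radiant version.

```python
import sqlite3

def list_schema(db_path):
    """Return {table name: [column names]} for an SQLite database file."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalogue of tables, indexes, etc.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        schema = {}
        for table in tables:
            # PRAGMA table_info returns one row per column; field 1 is its name
            quoted = '"%s"' % table.replace('"', '""')
            rows = conn.execute("PRAGMA table_info(%s)" % quoted).fetchall()
            schema[table] = [row[1] for row in rows]
        return schema
    finally:
        conn.close()
```

Calling list_schema on the same database file that was passed to radiant2txt and printing the result shows whether the column names used in the script (slug, title, published_at, page_id, name, content) are all present.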
