How to write a blog engine in Haskell Part 2

In my last post I laid out the top-level structure of the blog engine, how to find the right files in the posts folder and how to represent posts as data types.

This time I'd like to introduce a type-safe way to render HTML directly in Haskell without sacrificing readability or ease of use.

There are probably a million ways to render HTML but in practice there are only a handful of possibilities you might consider for putting data into HTML code.

Template Languages

The most popular way of rendering HTML is good old PHP-style templates, which let you interleave HTML code with executable bits of a dumb language to fill in dynamic values. This is a straightforward approach which gets the job done but certainly has some disadvantages:

  • You might want to validate your HTML during development. As the template language itself is not a subset of HTML, you can't validate the template itself so easily.

  • Reusable code lives in helper functions which just return dumb strings instead of data structures, leading to error-prone code.

  • In general there is no type checking, which makes it hard to build abstractions on top of the rendering logic.

Blaze Html

Let me introduce the BlazeHtml HTML combinator library for Haskell. It's incredibly simple to use, always generates well-formed HTML, offers type-safety and is blazingly fast, as the name already states.

Have a look at the type signature of a typical HTML combinator:

a :: Html -> Html

This function takes an Html element as content for a tag and returns another Html element, which represents a link in this case.

A more complete example:

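import Control.Monad (forM_)
import Text.Blaze.Html5
import Text.Blaze.Html5.Attributes

-- A page containing a list with the numbers from 1 to 10.
renderPage :: Html
renderPage = docTypeHtml $ do
  body $ do
    -- The Int annotation keeps the type of the literals unambiguous for toHtml.
    ul $ forM_ [1 .. 10 :: Int] (li . toHtml)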

renderPage generates HTML for a list with numbers from 1 to 10. Note that you don't have any impedance mismatch between HTML and the host language. Code and data are easily mixed without losing type-safety. This is a big deal for me. Just imagine all the time you lost reloading a web page after some minor edit, just to see that you have to switch back to your editor again.

Rendering Posts with Sundown

Rendering posts should be as simple and straightforward as possible. A blog post is just a regular Markdown file with a title as its first line. So to convert a blog post into HTML we basically just convert the file via Sundown, the Markdown library from GitHub, and insert the resulting HTML into a layout template.

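{-# LANGUAGE OverloadedStrings #-}
-- div and id are used as HTML combinators below, so hide the Prelude versions.
import Prelude hiding (div, id)
import Data.List (intersperse)
import Data.List.Split (splitOn)
import Text.Blaze.Html4.Strict hiding (head, map, title, contents)
import Text.Blaze.Html4.Strict.Attributes hiding (content, title)
import qualified Text.Blaze.Html4.Strict as H
import qualified Text.Blaze.Html4.Strict.Attributes as A
-- The String renderer shares the H alias so that H.renderHtml resolves.
import qualified Text.Blaze.Html.Renderer.String as H
import Text.Sundown.Html.String as S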

data Blog = Blog {
  blogUri,
  blogTitle :: String
}

data Post = Post {
  postFolder,
  postFile,
  postText :: String
}

-- Take the first line of a post file as post title.
postTitle :: Post -> String
postTitle post = head $ lines $ postText post

-- Return the filename of the post without extension.
postName :: Post -> String
postName post = head $ splitOn "." $ postFile post

-- Returns the path to the post on the website.
postLink :: Post -> String
postLink post = "/" ++ (postFolder post) ++ "/" ++ (postName post) ++ ".html"

-- Returns just the rendered body of a post without title.
postBody :: Post -> String
postBody post = S.renderHtml s allExtensions noHtmlModes True Nothing
  where s = concat $ intersperse "\n" $ drop 3 $ lines $ postText post

-- Render the html layout, insert the blog title, post title and post content.
renderLayout :: Blog -> Html -> Html
renderLayout blog content = do
  docType
  html $ do
    H.head $ do
      H.title $ toHtml $ blogTitle blog
    body $ do
      h2 ! id "header" $ do
        a ! href "/" $ toHtml $ blogTitle blog
      div ! class_ "content" $ do
        preEscapedToHtml content

-- Render a single post.
renderPost :: Post -> Html
renderPost post =
  div ! class_ "article" $ do
    h1 $ do
      a ! href (toValue (postLink post)) $ toHtml $ postTitle post
    preEscapedToHtml $ postBody post

-- Render a complete page containing one post.
renderPostPage :: Blog -> Post -> String
renderPostPage blog post = H.renderHtml $ renderLayout blog $ renderPost post

BlazeHtml escapes any value by default to prevent XSS. So any value you want to insert has to be of type Html or AttributeValue. Look at the code for the post title inside renderPost. The href for the link needs to be converted and the text of the link as well.

preEscapedToHtml is an explicit way to insert raw strings into the HTML document. In our case it is used to insert the page content into the layout and to insert the rendered markdown into the post template.

Atom and RSS feeds

In the next post we will have a look at rendering feeds with the feed package.

Quick Guide for Passenger on Ubuntu Hardy

This is a short guide for installing Phusion Passenger on Ubuntu Hardy. This includes the installation of Ruby 1.8.6, Apache 2.2.8, MySQL 5.0.51a, Git 1.5.4 and Rails 2.1.1.

Essential Build Tools

First we need to install the compiler toolchain (make, gcc and libc).

$ apt-get install build-essential

Git

This guide is based on Git, so we install the git package:

$ apt-get install git-core

If you want to host a git repository on this machine, initialize a new repository:

$ mkdir /var/git
$ mkdir /var/git/myapp
$ cd /var/git/myapp
$ git --bare init

Now you can push your application code from your local machine to your repository:

$ cd ~/myapp
$ git remote add origin ssh://myserver.com/var/git/myapp
$ git push origin master

Ruby

We are going to install Ruby and all the essential ruby libraries.

$ apt-get install ruby ruby1.8-dev rubygems irb ri rdoc rake libruby libruby-extras

Gem Executable Path

Strangely, the rubygems package does not set up the path for executables, so we add the following line to /etc/profile:

export PATH=/var/lib/gems/1.8/bin:$PATH

To immediately use the new executable path, we source the profile file:

$ . /etc/profile

Apache

This is just a basic Apache install. We need the development files for compiling Passenger:

$ apt-get install apache2 apache2-prefork-dev

MySQL

I use MySQL, so I needed to install the server and client packages and the Ruby gem, which compiles a native extension:

$ apt-get install mysql-server mysql-client
$ gem install mysql

Phusion Passenger

This is now the actual Passenger install, which consists of installing a gem and compiling the Apache module:

$ gem install passenger
$ passenger-install-apache2-module

Apache configuration

The compilation of the Passenger Apache module finishes with an instruction for your httpd.conf. Depending on your Passenger version, you will get something like this, which you add to your /etc/apache2/httpd.conf:

LoadModule passenger_module /var/lib/gems/1.8/gems/passenger-2.0.3/ext/apache2/mod_passenger.so
PassengerRoot /var/lib/gems/1.8/gems/passenger-2.0.3
PassengerRuby /usr/bin/ruby1.8

Additionally you probably want to enable mod_rewrite, which is needed for Rails:

$ a2enmod rewrite

Installing your Rails app

We create an app folder in /var/www and check out the source from our git repository:

$ cd /var/www
$ mkdir myapp
$ cd myapp
$ git init
$ git remote add origin /var/git/myapp
$ git pull origin master

Installing Rails

We don't install Rails as a gem, because your application should be pinned to a specific Rails version. Git submodules allow you to embed a foreign repository in your source tree.

We are now going to link the Rails repository to vendor/rails and check out version 2.1.1. Finally we commit the submodule link to our repository:

$ cd /var/www/myapp/
$ git submodule add git://github.com/rails/rails.git vendor/rails
$ cd vendor/rails
$ git checkout v2.1.1
$ cd ../..
$ git add vendor/rails
$ git commit -m 'linked rails as submodule'

You probably need to set up your database:

$ mysqladmin create myapp_production
$ mysqladmin create myapp_development
$ mysqladmin create myapp_test
$ rake db:migrate

Now your Rails app should be able to run on the WEBrick server:

$ ./script/server

Virtual host

Adding a virtual host for your rails application is now super easy thanks to Passenger. Create a file named /etc/apache2/sites-available/myapp:

<VirtualHost *:80>
    ServerName myserver.com
    DocumentRoot /var/www/myapp/public
</VirtualHost>

Now we disable the default site and add our new virtual host:

$ a2dissite default
$ a2ensite myapp

After restarting Apache your Rails application should run on Apache:

$ /etc/init.d/apache2 restart

User authentication

In case your Rails app is not meant to be seen in public, I recommend protecting it with HTTP Authentication.

Create a password file:

$ htpasswd -c /var/www/myapp/config/auth myusername

And add this to your virtual host configuration (Inside the VirtualHost section):

<Location />
    AuthType Basic
    AuthName "My App"
    AuthUserFile /var/www/myapp/config/auth
    Require valid-user
</Location>

Conclusion

Phusion Passenger simplifies the installation of Rails applications significantly. I don't have to worry about mod_proxy, mod_proxy_balancer, mongrel and mongrel_cluster or even FastCGI. This is definitely simpler.

I have to mention that Rails is just one option for your Ruby application. Setting up any other Ruby framework should be possible through the support of the Rack interface.

I really hope that the convention of using one rackup file and one public folder will settle down as a standard for Ruby web applications, so that hosting companies will focus on supporting this standard and Ruby developers don't need to worry about finding support for their favorite web frameworks.
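To make the Rack idea concrete, here is a minimal sketch of what such an application boils down to: any object that responds to call, takes the request environment and returns a status, a hash of headers and a body. The greeting text and the hand-built environment hash are my own assumptions for illustration; in a real rackup file you would finish with run app instead of calling the app yourself.

```ruby
# A Rack application is any object that responds to #call(env) and
# returns [status, headers, body]. The greeting is an illustrative assumption.
app = lambda do |env|
  body = "Hello from #{env['PATH_INFO']}"
  [200, { 'Content-Type' => 'text/plain' }, [body]]
end

# In a rackup file the last line would be: run app
# Here we call the app directly with a hand-built environment instead.
status, headers, body = app.call({ 'REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/' })
```

Because the contract is this small, a server like Passenger only needs the rackup file to start any framework built on top of it.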

Kontrol - a micro framework

Kontrol is a small web framework written in Ruby, which runs directly on Rack. It provides a simple pattern matching algorithm for routing and uses GitStore as data storage.

All examples can be found in the examples folder of the kontrol project, which is hosted on GitHub.

Kontrol has its own project page now! Please look for current information there.

Quick Start

Create a file named hello_world.ru:

require 'kontrol'

class HelloWorld < Kontrol::Application
  map do
    get '/' do
      "Hello World!" 
    end
  end
end

run HelloWorld.new

Now run:

rackup hello_world.ru

Browse to http://localhost:9292 and you will see “Hello World”.

Features

Kontrol is just a thin layer on top of Rack. It provides a routing algorithm, a simple template mechanism and some convenience stuff to work with GitStore.

A Kontrol application is a class, which provides some context to the defined actions. You will probably use these methods:

  • request: the Rack request object
  • response: the Rack response object
  • params: union of GET and POST parameters
  • cookies: shortcut to request.cookies
  • session: shortcut to request.env['rack.session']
  • redirect(path): renders a redirect response to specified path

Routing

Routing is just as simple as using regular expressions with groups. Each group will be provided as argument to the block.

Create a file named routing.ru:

require 'kontrol'

class Routing < Kontrol::Application
  map do
    get '/pages/(.*)' do |name|
      "This is the page #{name}!"
    end

    get '/(\d*)/(\d*)' do |year, month|
      "Archive for #{year}/#{month}"
    end
  end
end

run Routing.new

Now run this application:

rackup routing.ru

You will now see how regex groups and parameters are related. For example, if you browse to localhost:9292/2008/12, the app will display Archive for 2008/12.
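If you are curious how this kind of routing can work, here is a rough sketch using only plain Ruby. It is not Kontrol's actual implementation; TinyRouter and its methods are hypothetical names I made up for illustration. Each pattern is anchored and compiled to a regular expression, and the captured groups are passed to the block as arguments.

```ruby
# Hypothetical mini-router, not Kontrol's real code.
class TinyRouter
  def initialize
    @routes = []
  end

  # Register a handler; the pattern is anchored so it must match the whole path.
  def get(pattern, &block)
    @routes << [Regexp.new("\\A#{pattern}\\z"), block]
  end

  # Find the first matching route and call its block with the captured groups.
  def call(path)
    @routes.each do |regex, block|
      match = regex.match(path)
      return block.call(*match.captures) if match
    end
    nil
  end
end

router = TinyRouter.new
router.get('/pages/(.*)') { |name| "This is the page #{name}!" }
router.get('/(\d*)/(\d*)') { |year, month| "Archive for #{year}/#{month}" }
```

Calling router.call('/2008/12') runs the second block with year and month taken from the two groups.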

Nested Routes

Routes can be nested. This way you can avoid repeating patterns and define handlers for a set of HTTP verbs. Each handler will be called with the same arguments.

require 'kontrol'

class Nested < Kontrol::Application
  map do
    map '/blog' do
      get '/archives' do
        "The archives!"
      end
    end

    map '/(.*)' do
      get do |path|
        "<form method='post'><input type='submit'/></form>"
      end

      post do |path|
        "You posted to #{path}"
      end
    end
  end
end

run Nested.new

Now run this app like:

rackup nested.ru

The second route catches all paths except the /blog route. Inside the second route there are two different handlers for GET and POST actions.

So if you browse to /something, you will see a submit button. After submitting you will see the result of the second handler.
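One way to picture nesting is that the outer pattern becomes a prefix for everything defined inside it, and each handler is stored together with its HTTP verb. The following is only a sketch of that idea in plain Ruby, with the hypothetical name NestedRouter; Kontrol itself may work differently.

```ruby
# Hypothetical sketch of nested routes, not Kontrol's real code.
class NestedRouter
  def initialize
    @routes = []   # entries of [verb, regex, handler]
    @prefix = ''
  end

  # Extend the current prefix for the duration of the block.
  def map(pattern = '')
    outer = @prefix
    @prefix = outer + pattern
    yield
  ensure
    @prefix = outer
  end

  def get(pattern = '', &handler)
    add('GET', pattern, handler)
  end

  def post(pattern = '', &handler)
    add('POST', pattern, handler)
  end

  # Dispatch on verb and path; every handler gets the same captured groups.
  def call(verb, path)
    @routes.each do |route_verb, regex, handler|
      next unless route_verb == verb
      match = regex.match(path)
      return handler.call(*match.captures) if match
    end
    nil
  end

  private

  def add(verb, pattern, handler)
    @routes << [verb, Regexp.new("\\A#{@prefix}#{pattern}\\z"), handler]
  end
end

router = NestedRouter.new
router.map('/blog') do
  router.get('/archives') { "The archives!" }
end
router.map('/(.*)') do
  router.get  { |path| "Form for #{path}" }
  router.post { |path| "You posted to #{path}" }
end
```

Both handlers inside the second map share the prefix and therefore receive the same path argument, which mirrors the behaviour described above.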

Templates

Rendering templates is as simple as calling a template file with some parameters, which are accessible inside the template as instance variables. Additionally you will need a layout template.

Create a template named templates/layout.rhtml:

<html>
  <body>
    <%= @content %>
  </body>
</html>

And now another template named templates/page.rhtml:

<h1><%= @title %></h1>
<%= @body %>

Create a templates.ru file:

require 'kontrol'

class Templates < Kontrol::Application
  map do
    get '/(.*)' do |name|
      render "page.rhtml", :title => name.capitalize, :body => "This is the body!"
    end
  end
end

run Templates.new

Now run this example:

rackup templates.ru

If you browse to any path on localhost:9292, you will see the rendered template. Note that the title and body parameters have been passed to the render call.
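The mechanism behind this can be sketched with ERB from Ruby's standard library. This is an assumption about how such rendering may be done, not Kontrol's actual code; View and render_page are hypothetical names, and the templates are inlined as strings instead of being read from the templates folder. The parameters become instance variables, the page is rendered first, and the result is handed to the layout as @content.

```ruby
require 'erb'

# Hypothetical view object: each parameter becomes an instance variable
# that the template can read.
class View
  def initialize(vars)
    vars.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  def render(source)
    ERB.new(source).result(binding)
  end
end

# Inlined stand-ins for templates/layout.rhtml and templates/page.rhtml.
LAYOUT = "<html><body><%= @content %></body></html>"
PAGE   = "<h1><%= @title %></h1><%= @body %>"

# Render the page first, then wrap it in the layout.
def render_page(vars)
  content = View.new(vars).render(PAGE)
  View.new(:content => content).render(LAYOUT)
end

html = render_page(:title => "Index", :body => "This is the body!")
```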

Using GitStore

GitStore is another library, which allows you to store code and data in a convenient way in a git repository. The repository is checked out into memory and any data may be saved back into the repository.

Install GitStore and Grit with:

$ gem sources -a http://gems.github.com  # you only have to do this once
$ sudo gem install mojombo-grit georgi-git_store

We create a Markdown file named pages/index.md:

Hello World
===========

This is the **Index** page!

We now have a simple page, which should be rendered as the response. We create a simple app in a file named git_app.ru:

require 'kontrol'
require 'bluecloth'

class GitApp < Kontrol::Application
  map do
    get '/(.*)' do |name|
      BlueCloth.new(store['pages', name + '.md']).to_html
    end
  end
end

run GitApp.new

Add all these files to your repo:

git init
git add pages/index.md
git commit -m 'init'

Run the app:

rackup git_app.ru

Browse to http://localhost:9292/index and you will see the rendered page generated from the markdown file.

This application runs straight from the git repository. You can delete all files except the rackup file and the app will still serve the page from your repo.

Using Javascript Templates for a Delicious Sidebar

Processing JSON data from an external source with Javascript templates is a natural fit. Create a template inside your HTML Document by adding class names and variables and write a few lines for fetching the JSON, that's all. This tutorial is an example for my Javascript Template Engine called Patroon.

Writing the template

In my sidebar you can see the result of my example. My latest bookmarks are shown as a list. Quite simple. The template looks like this:

<div class="bookmarks">
  <ul id="bookmarks-template">
    <li class="bookmark">
      <a href="{u}">{d}</a>
    </li>
  </ul>
</div>

There are two variables here: u and d. I don't know if Delicious wants to save some bytes here, but descriptive names wouldn't hurt in this case. u refers to the URL of the bookmark and d is the title. We are expanding an array of bookmarks into the li element, which is marked by the class name bookmark.

Fetching the JSON Feed

The feed resides on a different domain, so we have to use a script tag to fetch the data. This is because of security restrictions, which limit AJAX calls to the domain of the current web page.

The feed url for your bookmarks looks like this:

http://feeds.delicious.com/v2/json/{username}

If you want to fetch some of the other feeds, just look at the documentation, which describes 18 different feed types.

A very useful option in our case is to provide a callback function, which gets called after the JSON script was loaded. We define renderBookmarks as our callback.

The following code inserts the script tag to load the Delicious JSON feed of my bookmarks. This is done when the page is loaded:

$(function() {
    var head = document.getElementsByTagName("head")[0];
    var script = document.createElement('script');

    script.setAttribute("src", "http://feeds.delicious.com/v2/json/matthias_georgi?callback=renderBookmarks");
    script.setAttribute("type", "text/javascript");

    head.appendChild(script);
});

I'm using jQuery here for the window load event. Other libraries would need some other API call.

Rendering the JSON data

The code for rendering consists of just two lines. First we are instantiating the Template. We have to provide the id of the template node (the template is part of your document).

Second we expand the template using the jQuery helper. The variable data contains just the array of bookmarks. To match the li element of the template, which has the class name bookmark, we must set the template variable bookmark to hold the bookmarks array.

function renderBookmarks(data) {
  var template = new Template('bookmarks-template');
  $('.bookmarks').expand(template, { bookmark: data });
}

Result

The resulting HTML of my bookmark sidebar looks like this:

<div class="bookmarks">
  <ul id="bookmarks-template">            
    <li class="bookmark">
      <a href="http://delicious.com/help/json/">
        <span>delicious/help/feeds</span>
      </a>
    </li>
    <li class="bookmark">
      <a href="http://code.google.com/apis/youtube/reference.html">
        <span>Reference Guide: Data API Protocol - YouTube APIs and Tools - Google Code</span>
      </a>
    </li>
    <li class="bookmark">
      <a href="http://rewrite.rubyforge.org/">
      <span>rewrite</span>
      </a>
    </li>
    <li class="bookmark">
      <a href="http://www.infoq.com/interviews/Rewrite-Reginald-Braithwaite">
        <span>InfoQ: Reginald Braithwaite on Rewrite</span>
      </a>
    </li>
  </ul>
</div>

You may wonder why there are extra span elements around the variable expansions. This is necessary for inserting HTML from a variable. If I want to replace a text node with some HTML, I have to insert a span element and use the innerHTML property. If you know something better, please let me know.

Conclusion

Using Javascript templates with JSON feeds is simple and efficient. You write standards-compliant HTML sprinkled with some variables and expand this with some JSON data, that's all.

Related Work

There are some other libraries for javascript templating, which are related to Patroon:

Patroon is probably the smallest templating solution around and consists only of 130 lines of code.

Quick Guide for Passenger on Natty Narwhal

This is a short guide for installing Phusion Passenger and Ruby Enterprise Edition on Ubuntu Natty Narwhal. Depending on your machine this will take 30-60 minutes on a fresh Ubuntu install.

Installing build tools and libraries

First we need to install the compiler toolchain (make, gcc and libc) and necessary libraries.

$ apt-get install build-essential zlib1g-dev libssl-dev libreadline5-dev libmysqlclient-dev

Ruby Enterprise Edition

We are going to download and compile Ruby Enterprise Edition. The installer asks for the target directory. I recommend installing into /opt/ruby unless you want to host different versions on this machine.

$ wget http://rubyenterpriseedition.googlecode.com/files/ruby-enterprise-1.8.7-2011.03.tar.gz
$ tar xzf ruby-enterprise-1.8.7-2011.03.tar.gz
$ cd ruby-enterprise-1.8.7-2011.03
$ ./installer

Now we include the path to the ruby binaries in /etc/environment. It should look like this:

PATH="/opt/ruby/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games"

After relogin you should be able to type ruby -v and get a response like this:

ruby 1.8.7 (2011-02-18 patchlevel 334) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2011.03

Apache and Passenger

We need to install Apache and necessary development libraries to compile Phusion Passenger.

$ apt-get install libcurl4-openssl-dev apache2-mpm-prefork apache2-prefork-dev libapr1-dev libaprutil1-dev
$ passenger-install-apache2-module

Apache configuration

The compilation of the Passenger Apache module finishes with an instruction for your httpd.conf. Depending on your Passenger version, you will get something like this, which you add to your /etc/apache2/httpd.conf:

LoadModule passenger_module /opt/ruby/lib/ruby/gems/1.8/gems/passenger-3.0.7/ext/apache2/mod_passenger.so
PassengerRoot /opt/ruby/lib/ruby/gems/1.8/gems/passenger-3.0.7
PassengerRuby /opt/ruby/bin/ruby

If you browse to your url, you should see the standard apache “It works” page.

MySQL

The Ruby Enterprise installer already compiled Ruby's mysql client library, so now we only need the server and client packages:

$ apt-get install mysql-server mysql-client

Virtual host config

Adding a virtual host for your Rails application is easy. Assuming that your application resides in /var/www/myapp, create a file named /etc/apache2/sites-available/myapp and fill in:

<VirtualHost *:80>
    ServerName myserver.com
    DocumentRoot /var/www/myapp/public
</VirtualHost>

Now we disable the default site and add our new virtual host:

$ a2dissite default
$ a2ensite myapp

After restarting Apache your Rails application should run on Apache:

$ /etc/init.d/apache2 restart

Viewing RI in a web browser

I'm a big fan of the Firefox keyword search. For example I have keywords for LEO, Wikipedia and man pages. Sometimes I want to look up API documentation in Ruby, and typing ri camelize into the address bar and viewing the documentation as a web page seems quite natural to me. So I wrote a quick and dirty CGI script, which calls RI and outputs HTML.

CGI script

I've put the following code in a file named /usr/lib/cgi-bin/ri.rb. This is the default location for CGI scripts on my system for Apache.

#!/usr/bin/env ruby

require 'rdoc/ri/ri_driver'
require 'rubygems'

print "Content-type: text/html\r\n\r\n"

ARGV << '-f' << 'html'

ri = RiDriver.new

print '<html><body style="width:600px; margin:auto; padding:20px">'
ri.process_args
print '</body></html>'

This script does the same thing as if you typed ri somequery -f html. I put some HTML around it to give it some style, but that's it.

The Keyword Search

So I want to type ri String.capitalize and the browser should send a request to http://localhost/cgi-bin/ri.rb?String.capitalize.

Just add a new bookmark, give it the keyword ri and use this URL:

 http://localhost/cgi-bin/ri.rb?%s
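Note that the script never reads the query string itself. As far as I know it relies on the CGI convention that a query string without an = sign is handed to the script as command-line arguments, which is how the query ends up in ARGV. If your server does not do this, a fallback along these lines might help. ri_args is a hypothetical helper of mine, not part of RI.

```ruby
require 'uri'

# Hypothetical helper: use the command-line arguments if the server
# supplied them, otherwise decode the query string from the CGI environment.
def ri_args(argv, env)
  return argv unless argv.empty?
  query = env['QUERY_STRING'].to_s
  # Words are separated by '+'; each word may contain percent-escapes.
  query.split('+').map { |part| URI.decode_www_form_component(part) }
end

args = ri_args([], { 'QUERY_STRING' => 'String%23capitalize' })
```

In the script you would call it as ri_args(ARGV, ENV) and append the result to ARGV before the '-f' and 'html' arguments are added.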

Now we're done. One thing I would like to improve is to add hyperlinks to the output. For example viewing the documentation of a class brings up all documented methods. Each of them should be a link to the actual documentation. Probably some monkey patching on the HtmlFormatter class would do the job.