GitStore - using Git as a versioned data store in Ruby

December 26, 2008

GitStore is a small Ruby library that provides an easy interface to the version control system Git. It aims to use Git as a versioned data store, much like the well-known PStore. GitStore checks out the repository into an in-memory representation, which can be modified and finally committed. Your data is stored in a folder structure that can be checked out and examined, while the application accesses it in a convenient hash-like way. The library is based on Grit, the main technology behind GitHub.

GitStore has its own project page now! Please look for current information there.

Installation

GitStore can easily be installed as a gem if you have RubyGems 1.2.0:

$ gem sources -a http://gems.github.com   # you only have to do this once
$ sudo gem install mojombo-grit georgi-git_store

If you don’t have RubyGems 1.2.0, you can download the package from the GitHub page and build the gem yourself:

$ gem build git_store.gemspec
$ sudo gem install git_store

Usage Example

The first thing to do is initialize a new Git repository:

$ mkdir test
$ cd test
$ git init

Now you can create a GitStore instance and store some data. Data is serialized according to the file extension, so for YAML storage use the ‘yml’ extension:

  class WikiPage < Struct.new(:author, :title, :body); end
  class User < Struct.new(:name); end

  store = GitStore.new('.')
  store['users/matthias.yml'] = User.new('Matthias')
  store['pages/home.yml'] = WikiPage.new('matthias', 'Home', 'This is the home page...')
  store.commit 'Added user and page'

Note that directories will be created automatically.

Another way to access a path is:

  store['config', 'wiki.yml'] = { 'name' => 'My Personal Wiki' }

Finally, you can access the Git store as a Hash of Hashes, but in this case you have to create the Tree objects manually:

  store['users'] = GitStore::Tree.new
  store['users']['matthias.yml'] = User.new('Matthias')
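
To make the relationship between these access styles concrete, here is a minimal sketch built on plain nested Hashes. It is an illustration only and not GitStore's implementation: the class name `PathTree` and its methods are hypothetical, and nothing here touches Git.

```ruby
# Hypothetical illustration: a nested-Hash tree addressed by path.
# This is not GitStore's code; it only mimics the access styles shown above.
class PathTree
  attr_reader :root

  def initialize
    @root = {}
  end

  # Accepts 'a/b/c.yml' or 'a', 'b', 'c.yml'; creates missing subtrees.
  def []=(*args)
    value = args.pop
    *dirs, name = split(args)
    node = dirs.inject(@root) { |tree, dir| tree[dir] ||= {} }
    node[name] = value
  end

  # Returns the value or subtree at the path, or nil if it does not exist.
  def [](*args)
    split(args).inject(@root) do |tree, key|
      return nil unless tree.is_a?(Hash)
      tree[key]
    end
  end

  private

  # Normalizes both path styles into a flat list of path segments.
  def split(args)
    args.flat_map { |part| part.split('/') }
  end
end

tree = PathTree.new
tree['users/matthias.yml'] = { 'name' => 'Matthias' }
tree['config', 'wiki.yml'] = { 'name' => 'My Personal Wiki' }
```

In this sketch the intermediate "directories" are created on assignment, which corresponds to the automatic directory creation mentioned above; the Hash-of-Hashes style is simply `tree['users']['matthias.yml']`.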

Where is my data?

When you call the commit method, your data is written straight into the Git repository, with no intermediate file representation. To look at your data, you can use a Git browser such as git-gui, or just check out the files:

$ git checkout

Iteration

Iterating over the data objects is quite easy. Furthermore you can iterate over trees and subtrees, so you can partition your data in a meaningful way. For example you may separate the config files and the pages of a wiki:

  store['pages/home.yml'] = WikiPage.new('matthias', 'Home', 'This is the home page...')
  store['pages/about.yml'] = WikiPage.new('matthias', 'About', 'About this site...')
  store['pages/links.yml'] = WikiPage.new('matthias', 'Links', 'Some useful links...')
  store['config/wiki.yml'] = { 'name' => 'My Personal Wiki' }

  store.each { |obj| ... }            # yields all pages and the config hash
  store['pages'].each { |page| ... }  # yields only the pages
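
The idea of iterating over a whole tree or only a subtree can be sketched with a small recursive walk over nested Hashes. This is a hypothetical illustration, not how GitStore implements `each`; the helper name `each_object` is made up, and the stored objects are plain strings to keep the sketch short.

```ruby
# Hypothetical illustration of walking nested trees; not GitStore's code.
# Yields every non-Hash value (a "data object") found under the given tree.
def each_object(tree, &block)
  tree.each_value do |entry|
    if entry.is_a?(Hash)
      each_object(entry, &block)   # descend into a subtree
    else
      block.call(entry)            # a leaf: yield the stored object
    end
  end
end

data = {
  'pages'  => { 'home.yml' => 'Home', 'about.yml' => 'About', 'links.yml' => 'Links' },
  'config' => { 'wiki.yml' => 'My Personal Wiki' }
}

all = []
each_object(data) { |obj| all << obj }               # walks pages and config

pages = []
each_object(data['pages']) { |page| pages << page }  # walks only the pages subtree
```

Because a subtree is just another tree, the same walk works at any level, which is what makes partitioning data by directory convenient.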

Serialization

Serialization depends on the filename extension. You can add more handlers if you like; the interface looks like this:

  class YAMLHandler
    def read(id, name, data)
      YAML.load(data)
    end

    def write(data)
      data.to_yaml
    end
  end

  GitStore::Handler['yml'] = YAMLHandler.new

Shinmun uses its own handler for files with the md extension:

  class PostHandler
    def read(name, data)
      Post.new(:filename => name, :src => data)
    end

    def write(post)
      post.dump
    end
  end

  GitStore::Handler['md'] = PostHandler.new
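
The general pattern behind such handlers, a registry keyed by file extension, can be sketched without GitStore at all. Everything below is an assumption made for illustration: the `HANDLERS` constant, the `handler_for` helper, the two handler classes, and the simplified `read(data)` signature are hypothetical and are not part of GitStore's API.

```ruby
require 'yaml'
require 'json'

# Hypothetical sketch of extension-based dispatch; not GitStore's code.
# Each handler responds to read(data) and write(object); this simplified
# signature is an assumption made for the sketch.
class YamlTextHandler
  def read(data)
    YAML.safe_load(data)
  end

  def write(object)
    object.to_yaml
  end
end

class JsonTextHandler
  def read(data)
    JSON.parse(data)
  end

  def write(object)
    JSON.generate(object)
  end
end

# Registry mapping a file extension to the handler responsible for it.
HANDLERS = {
  'yml'  => YamlTextHandler.new,
  'json' => JsonTextHandler.new
}

# Picks a handler from the file extension of the given path.
def handler_for(path)
  ext = File.extname(path).delete_prefix('.')
  HANDLERS.fetch(ext) { raise ArgumentError, "no handler for .#{ext}" }
end

config = { 'name' => 'My Personal Wiki' }
yaml_text = handler_for('config/wiki.yml').write(config)
json_text = handler_for('config/wiki.json').write(config)
```

Adding a new format then only means registering one more object that responds to `read` and `write`, which mirrors the way the handlers above are assigned by extension.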

Related Work

John Wiegley has already done something similar for Python. His implementation has its own Git interface, whereas GitStore uses the wonderful Grit library.


Posted in category Ruby by Matthias Georgi. Tagged with git, database.