ThoughtWorks
An Incremental Approach to Content Management Using Git

Andy Robinson

Published: Jun 17, 2014

One of the many challenges with building or refreshing a website is the selection of a Content Management System (CMS). Despite our best efforts the CMS can often be a source of difficulty in a project, but there are alternatives. Read about the approach we took on www.thoughtworks.com to developing functionality to support content management in an incremental fashion.

When it comes to content management, organisations will often select a CMS product early in the life cycle of a website development.  This is frequently the source of pain later on in the project; frameworks usually enforce their view of the world upon their developers, and trying to choose correctly at the point where you understand the least about the project is nigh-on impossible.

Apart from wanting to avoid a big and difficult decision at the outset, there are a number of reasons why we took the decision not to use a CMS on www.thoughtworks.com from the get-go:

  1. We needed to start quickly without having to learn and configure a framework.
  2. We wanted to use all of the practices which we ourselves espouse (TDD, continuous integration, continuous delivery).  Many content management systems have poor support for these practices.
  3. We wanted a service-based architecture, so that we could evolve the system, experiment with different technologies and replace parts of the whole without a complete rewrite.
  4. We wanted flexibility in terms of the languages and tools we used.
  5. We are, after all, software developers!

Without a framework to install and configure we were able to start building the website quickly.  The first pages were working within the first week or two, and within eight weeks we had a functioning website.  The website you see today represents more than a year of additional iterative development, with the team delivering value every week.

#1 Content as static files

To get up and running quickly we initially built a largely static site, using templates to encapsulate and share presentation elements, and keeping the page data in separate content files stored as JSON [https://en.wikipedia.org/wiki/Json] and Slim [http://slim-lang.com/] (a HAML [http://haml.info/]-like mark-up and templating system for Ruby).  With a CD pipeline which allowed us to push changes to live easily we were able to maintain and push content without absorbing much developer time, but there were still requirements we needed to meet:

  • Content needed to be released outside of the application release cycle.
  • Contributors wanted to be able to change content without needing intervention from the development team (and the development team did not want to be involved in every content change).

Some form of content management was required.
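As a rough illustration of the static approach described above, the sketch below keeps page data in JSON and renders it through a shared template. It is not the site's actual code: the content fields, the template and the `render_news` helper are hypothetical, and ERB is used only because it ships with Ruby, whereas the site used Slim.

```ruby
require "json"
require "erb"

# Hypothetical content file. On a real site this would live on disk
# alongside the templates rather than in a constant.
NEWS_JSON = <<~JSON
  { "title": "New office opens", "body": "We have opened a new office." }
JSON

# Hypothetical shared template (ERB stands in for Slim here).
NEWS_TEMPLATE = ERB.new(<<~HTML)
  <article>
    <h1><%= ERB::Util.html_escape(item["title"]) %></h1>
    <p><%= ERB::Util.html_escape(item["body"]) %></p>
  </article>
HTML

# Parse the content data and render it through the template.
def render_news(json_text)
  item = JSON.parse(json_text)
  NEWS_TEMPLATE.result_with_hash(item: item)
end

puts render_news(NEWS_JSON)
```

Because the data and the presentation are separate files, a content change touches only the JSON, but it still has to ship with the application release.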

#2 Not all content is equal

One thing we observed about our content was that different kinds of content change at very different rates – for example, event or news information is constantly in motion, and each event has a limited shelf-life.  Information on some other pages (for example software-testing [https://www.thoughtworks.com/software-testing]) is much longer-lived and slow-moving.

This allowed us to prioritise work on making content editable, with slower moving items remaining as static text.   It was clear that we didn't have to replicate an entire CMS, or even make all of the content on the website editable – we just needed to make sure that the fast moving items could be changed.

#3 Introducing the content service

Our first step was to move a single fast moving item (news) to a new content service, initially still as static files.  Moving just a single kind of data allowed us to stand up a working service relatively quickly, and then we added other kinds of content gradually.  Even this simple change added value: we could now change news items by releasing the content service, and no longer needed to release the entire web application.
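A content service of this kind can be very small. The sketch below is a minimal Rack-style application (an object whose `call(env)` returns status, headers and body) that serves news items from static JSON files. The class name, the `/news/<id>` route and the directory layout are assumptions made for illustration, not details from the original service.

```ruby
require "json"
require "tmpdir"
require "fileutils"

# Hypothetical content service that serves news items from static files.
class ContentService
  def initialize(content_dir)
    @content_dir = content_dir
  end

  def call(env)
    # Accept only simple ids so a request cannot escape the content folder.
    match = %r{\A/news/([a-z0-9-]+)\z}.match(env["PATH_INFO"].to_s)
    return not_found unless match

    file = File.join(@content_dir, "news", "#{match[1]}.json")
    return not_found unless File.file?(file)

    [200, { "content-type" => "application/json" }, [File.read(file)]]
  end

  private

  def not_found
    [404, { "content-type" => "application/json" },
     [JSON.generate(error: "not found")]]
  end
end

# Usage, with a throwaway directory standing in for the deployed content.
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, "news"))
  File.write(File.join(dir, "news", "launch.json"),
             JSON.generate(title: "Launch"))
  status, _headers, body =
    ContentService.new(dir).call("PATH_INFO" => "/news/launch")
  puts status, body.first
end
```

Adding another kind of content then means adding another route and folder, without touching the web application.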

#4 Moving to GitHub [github.com]

The next stage was to allow the content service to accept updates to the information it was managing. At this point we needed to consider the kind of data store we would use going forward, and how we would ensure that the data was consistent between different load balanced instances of the content service. Typically this would be the point that we might move to a database (actually the data store is one of the decisions usually made right at the start – and indeed we had GitHub in mind from the early stages of the project).

The decision to use git and GitHub was based on a number of factors:

  • git provides a simple migration from a file-based data store; the files can just be checked-in to a repository.
  • git is widely used and understood.
  • git provides version control and workflow, which can be used to support content management.
  • GitHub provides a safe cloud-based store for data.
  • Simple editing and publishing can be achieved just by committing.

In order to move the content data files a number of pieces of functionality were required:

  • A new content repository was created and all content was committed to the repository
  • A local repository was created on each instance of the content service as part of the deployment process
  • Files were served from the local repository folders – no change from serving them from the file system as we were previously.
  • The local repositories in each instance of the content service had to be updated when changes were made to the master copy (in GitHub).  This is straightforward as GitHub provides a facility called Webhooks [https://help.GitHub.com/articles/post-receive-hooks], where a POST request is automatically sent to a list of URLs when the repository is changed.
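The webhook step in the list above might be handled along the following lines. This is a sketch under stated assumptions rather than the original implementation: the `ContentRepositoryUpdater` class, the branch name and the clone path are hypothetical, and the only detail taken from GitHub is that a push payload carries the updated ref in its `ref` field.

```ruby
require "json"

# Keeps one local clone in step with the master copy on GitHub.
class ContentRepositoryUpdater
  def initialize(repo_dir, branch: "master", runner: nil)
    @repo_dir = repo_dir
    @branch = branch
    # The runner is injectable so the git call can be stubbed in tests.
    @runner = runner || ->(*cmd) { system(*cmd) }
  end

  # payload_json is the body of the POST that GitHub sends after a push.
  # Returns true only when a pull was attempted and succeeded.
  def handle_push(payload_json)
    payload = JSON.parse(payload_json)
    return false unless payload["ref"] == "refs/heads/#{@branch}"

    # Fast-forward only: the local clone is never edited in place, so any
    # divergence is a fault worth surfacing rather than merging away.
    @runner.call("git", "-C", @repo_dir, "pull", "--ff-only") ? true : false
  end
end

# Usage inside whatever endpoint receives the webhook (path is illustrative):
#   updater = ContentRepositoryUpdater.new("/var/content-repo")
#   updater.handle_push(request_body)
```

Each load-balanced instance would be registered as a webhook URL, so every instance pulls the same commit independently.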

We made the decision to use a local repository for two reasons:

  • Serving content directly from GitHub proved too slow to be practical
  • A local repository gives a robust solution – even if there were problems with GitHub, our application could continue to serve content to the web.

At this point we had completely disconnected the content publishing cycle from the application (and service) releases.  Content changes (for content served through the content service) could be made simply by pushing to the content GitHub repository.

#5 Allowing content updates through the content service

So far we were only supporting reading GitHub content from the content service.  For a fully functional service we needed to be able to support the creation and updating of content.  Allowing updates to the local repository in a load-balanced environment and then pushing to git would introduce the possibility of clashes between different content service instances.  Instead we elected to push all changes directly to the GitHub API, and then let these changes propagate back to all the content service instances through the Webhook mechanism.

#6 The GitHub API [https://developer.github.com/v3/]

The GitHub API provides access to most of the features of git via a RESTful [https://en.wikipedia.org/wiki/Representational_State_Transfer] interface. You can manipulate the data held in GitHub by referencing the underlying building blocks of git (Blobs, Commits and Trees) or through a high-level interface by referring to files.  We started by using the more high-level interface, thus:

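require "base64"
require "octokit"

# Creates a file in the repository as a single commit.
def create_file(content, path, message)
  # Depending on the Octokit version, create_contents may Base64-encode
  # the content itself; if so, pass `content` straight through instead.
  base64_content = Base64.strict_encode64(content)
  @octokit_client.create_contents("my-git-account/my-repo",
                                  path, message, base64_content)
end

create_file("Contents to store", "my/path/to/content.txt", "Commit msg")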

This example illustrates creating a file in GitHub using the Octokit Ruby gem [https://github.com/octokit/octokit.rb] to take care of the communication directly with GitHub.  The file contents must be Base64 encoded, and then it's a single method call telling GitHub where to store the file.  The API creates a new commit with this file added to the repository.

We found that the simple interface proved unreliable in practice – the GitHub API had a nasty habit of not always updating the head ref, leaving the repository in an inconsistent state.  The API is still under development, and we have found that many of the earlier wrinkles have been ironed out, so this may not be a problem now.

We moved to the lower level interface, which we found more reliable but which, as the following equivalent code snippet shows, involves more work:

def create_content(content, path, message) 
  content_reference = @octokit_client.create_blob(content) 
  commit_reference = create_commit_reference(message,content_reference, path) 
  @octokit_client.update_head_ref_to(commit_reference) 
end 

def create_commit_reference(commit_message, content_reference, path) 
  head_reference = @octokit_client.get_head_reference 
  base_tree_reference = @octokit_client.get_tree(head_reference) 
  tree_reference = @octokit_client.create_tree(base_tree_reference, content_reference, path) 
  @octokit_client.create_commit(head_reference, tree_reference, commit_message)
end 

create_content("Contents to store","my/path/to/content.txt","Commit msg") 

At the lower level, you need to:

  1. Create your new content blob (file contents)
  2. Get the current head reference and its associated tree
  3. Create a new tree based on the current tree, but with the new blob added with the correct path
  4. Create a new commit referencing the new tree
  5. Move the head reference to point to the new commit
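For readers who want to try this, the five steps can be sketched against Octokit's Git Data methods roughly as follows. This is an illustrative sketch, not the code the site ran: the `commit_file` function and its `repo` and `branch` parameters are assumptions, and the client is passed in so that the function can be exercised without network access.

```ruby
# Commits a single file through the lower-level API.
# client is expected to be an Octokit::Client; repo is "owner/name".
def commit_file(client, repo, branch, path, content, message)
  # 1. Create the blob holding the new file contents.
  blob_sha = client.create_blob(repo, content)

  # 2. Find the current head commit and the tree it points at.
  head_sha = client.ref(repo, "heads/#{branch}").object.sha
  base_tree_sha = client.commit(repo, head_sha).commit.tree.sha

  # 3. Create a tree based on the current one, with the blob at `path`.
  entry = { path: path, mode: "100644", type: "blob", sha: blob_sha }
  tree_sha = client.create_tree(repo, [entry], base_tree: base_tree_sha).sha

  # 4. Create a commit for the new tree, with the old head as its parent.
  commit_sha = client.create_commit(repo, message, tree_sha, head_sha).sha

  # 5. Move the branch to the new commit (without forcing the update).
  client.update_ref(repo, "heads/#{branch}", commit_sha, false)
  commit_sha
end

# Usage (requires the octokit gem and a token with write access):
#   require "octokit"
#   client = Octokit::Client.new(access_token: ENV["GITHUB_TOKEN"])
#   commit_file(client, "my-git-account/my-repo", "master",
#               "my/path/to/content.txt", "Contents to store", "Commit msg")
```

Leaving the update unforced means that if another writer moved the branch in the meantime, the final step fails instead of silently discarding their commit.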

This is actually what git does when you do a commit, but thankfully most of the time you don't need to think about the lower level operations.  If you are interested in reading more, git from the bottom up [http://ftp.newartisans.com/pub/git.from.bottom.up.pdf] is a comprehensive walk-through of how git actually works under the covers.
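As a small taste of those lower level operations, the sketch below computes the identifier git gives a blob: the SHA-1 of a short header followed by the file contents. The `git_blob_id` helper is a hypothetical name; its result should match what `git hash-object` reports for a file with the same contents.

```ruby
require "digest/sha1"

# Git names a blob by hashing "blob <size in bytes>\0<contents>".
def git_blob_id(content)
  Digest::SHA1.hexdigest("blob #{content.bytesize}\0#{content}")
end

puts git_blob_id("Contents to store")
```

Because the identifier depends only on the contents, committing identical content twice reuses the same blob.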

With content updates being passed through to the GitHub API, we now had a fully functioning content service.

Read Part 2, where I go over the details of how we implemented the high level content management operations (save and publish) using git.
