Seeing meaning with semantic markup
Aegir Hallmundur on Wednesday, 15th July, 2009

Picture: col_adamson on Flickr
How do you update your website? If you have a content management system, it will probably have some kind of WYSIWYG (What You See Is What You Get) text editor built in that gives you a set of controls something like this:
The basic TinyMCE toolbar
Tools like this are good because they give you a lot of control over what your content looks like, but that control and freedom come with an important trade-off: unlike a more restrictive system (think of a stock management system, say), a WYSIWYG editor can’t tell you when something has gone wrong. Content can be formatted in so many ways and still, to the human eye, remain meaningful. Look at the example below:

If you were to come across that, you’d be able to work out what was going on without any trouble. It all looks right, right? To a human being with decent eyesight and a lifetime of experience of signs, books, menus, shopping lists etc. it’s an easy one to work out. However, what if you can’t see it and have to rely on a computer to read it out? Imagine a search engine coming across the text above. Assuming it was formatted in a way that was popular until recently, the search engine would read it like so:

Not so great. Because the content is mashed together, it’s hard for the search engine to work out what the page is about. It might come up in a keyword search, but many search engines are developing ways of presenting more meaningful ‘intelligent’ results to users, so a search for “part time course in Southampton” could return the exact bit of text from the page, and if it’s well formatted, it remains readable. A mash of text, however, might put someone off and they’ll click elsewhere.
You need to provide some clues, some way of identifying what’s what that a computer can deal with correctly.
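To make that a bit more concrete, here is a rough sketch in Python of how a very simple program might ‘read’ two versions of the same content. Everything in it is an assumption made for illustration: the course listing is invented, the `read_structure` helper is hypothetical, and real search engines and screen readers are far more sophisticated than this. It only shows the principle that a machine can label what it finds when the markup tells it what each piece is.

```python
from html.parser import HTMLParser

# Hypothetical content, invented for this example.
# Version 1: formatted purely for appearance (font sizes, bold, line breaks).
PRESENTATIONAL = (
    '<font size="5"><b>Part-time courses</b></font><br>'
    '<b>Pottery</b><br>Mondays, 7pm<br>'
    '<b>Life drawing</b><br>Thursdays, 6pm'
)

# Version 2: the same words, marked up according to what they are.
SEMANTIC = (
    '<h2>Part-time courses</h2>'
    '<h3>Pottery</h3><p>Mondays, 7pm</p>'
    '<h3>Life drawing</h3><p>Thursdays, 6pm</p>'
)


class StructureParser(HTMLParser):
    """Collects (role, text) pairs, where role is the enclosing structural tag."""

    # Tags that say what a piece of text *is*, not what it looks like.
    STRUCTURAL = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "li"}

    def __init__(self):
        super().__init__()
        self.blocks = []   # list of (role, text) in document order
        self._stack = []   # currently open structural tags

    def handle_starttag(self, tag, attrs):
        if tag in self.STRUCTURAL:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Text outside any structural tag has no meaning the program can use.
        role = self._stack[-1] if self._stack else "unlabelled"
        self.blocks.append((role, text))


def read_structure(html):
    """Return the (role, text) pairs found in an HTML fragment."""
    parser = StructureParser()
    parser.feed(html)
    parser.close()
    return parser.blocks


if __name__ == "__main__":
    for name, html in (("presentational", PRESENTATIONAL), ("semantic", SEMANTIC)):
        print(name)
        for role, text in read_structure(html):
            print(f"  {role}: {text}")
```

Run against the first version, every piece of text comes back as `unlabelled`, which is the mash described above. Run against the second, the same words arrive tagged as headings and paragraphs, so the program can tell a course title from its timetable.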
The benefits of doing so are various – in addition to some of the search engine optimisation benefits, you’ll have a more attractive and accessible site, your editors will be happier and, since well-structured code tends to be leaner, your pages will download faster too.
Showing your meaning
So really what I’m talking about here is known as semantic markup, i.e. indicating the meaning, rather than the appearance, of a document. There are best practice guidelines for sticking to semantic principles online, but these are usually targeted at web developers – the people who build sites – rather than the people who have to maintain the content of sites. As I’ve mentioned, the tools available to content creators don’t really help all that much with this – the only option they give for making sure the code behind the text is correct is to, well, look at the code and edit it by hand. It’s not really the best option for a lot of people.
Now, this isn’t to say that the tools available do a particularly bad job, but it’s clear from experience that they could be better.
There are already some tools and standards that allow you to mark up the meaning of your text, such as Markdown, Textile and the kind of conventions that Wikipedia follows, but these are either very limited or can get complicated fast. Trying to create a table of data with many of them can be extraordinarily difficult.
A glimpse of the future
Fortunately there are some new tools being developed that will, in the near future, let you mark up your text meaningfully while being easy to use. These are WYSIWYM editors, *What You See Is What You Mean*.
The way they work is to give you tools for marking up your text according to what kind of information it is. This is a header, that’s a paragraph, that other bit is a list, and here’s a table. You see your content displayed in a specific way – it won’t reflect your site’s design in the editor, but you’ll be able to see what’s going on much more clearly and be sure there aren’t any bits of code ‘hiding’ that will mess up your pages. If it’s there, you’ll be able to see it, edit it and remove it.
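As a rough illustration of the idea, here is a minimal Python sketch in which content is stored as a list of blocks, each labelled with what it is, and the markup is generated from those labels. This is not how WYM Editor or any other real editor is built; the block names (`heading`, `paragraph`, `list`, `table`) and the `render` function are assumptions made up for this example.

```python
from html import escape


def render(blocks):
    """Turn a list of (kind, content) blocks into semantic HTML.

    Hypothetical block kinds used in this sketch:
      "heading"   -> content is a string
      "paragraph" -> content is a string
      "list"      -> content is a list of strings
      "table"     -> content is a list of rows; the first row is the header
    """
    parts = []
    for kind, content in blocks:
        if kind == "heading":
            parts.append(f"<h2>{escape(content)}</h2>")
        elif kind == "paragraph":
            parts.append(f"<p>{escape(content)}</p>")
        elif kind == "list":
            items = "".join(f"<li>{escape(item)}</li>" for item in content)
            parts.append(f"<ul>{items}</ul>")
        elif kind == "table":
            header, *rows = content
            head = "".join(f"<th>{escape(cell)}</th>" for cell in header)
            body = "".join(
                "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
                for row in rows
            )
            parts.append(
                f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"
            )
        else:
            # Nothing unrecognised slips through: there is no 'hidden' markup.
            raise ValueError(f"unknown block type: {kind!r}")
    return "\n".join(parts)


if __name__ == "__main__":
    # Invented sample content.
    page = [
        ("heading", "Part-time courses"),
        ("paragraph", "All courses run for ten weeks."),
        ("list", ["Pottery", "Life drawing"]),
        ("table", [["Course", "Day"], ["Pottery", "Monday"], ["Life drawing", "Thursday"]]),
    ]
    print(render(page))
```

Because the markup is generated from the labels rather than typed or pasted in, the only things that can end up in the page are the kinds of block the editor knows about, and any stray text is escaped rather than treated as code.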

WYM Editor
The tools that exist, such as WYM Editor (above), are very good, but still need some development work to make them easy enough to use. We’re following the developments closely and will contribute where we can, and hopefully we’ll be able to offer a WYSIWYM editor in our own CMSs very soon.