Saturday, April 12, 2008

Web 3.0 or the Semantic Web

If Web 2.0 represented a revolution for computer users, allowing more or less anyone with an internet connection to quickly upload content and communicate ideas and creative works to the world, then Web 3.0 represents a very different type of revolution, one which may be far less immediately apparent but may offer the biggest opportunity for the sophisticated rethinking of scholarly communications.

Web 2.0's greatest asset was that users did not need to learn any programming or coding in order to create content; it could all be done visually with the WYSIWYG (What You See Is What You Get) editors found in almost all web services. Web 3.0 utilises what is known as the "semantic web" to make the bridge between human and computer understanding even more reliable. XML is the backbone of the semantic web (Dunsire et al 2008) as it separates content from formatting. XML is a data structure for carrying and transferring data, which it does in a flexible, standardised but platform-neutral, text-only format (W3Schools 2008). The standard allows creators of content to select or create a suitable data structure which self-describes the file's content. This means that the content is expressed in a way that is machine-readable but also contextualised and descriptive. In combination with formatting from HTML, XHTML and CSS, data can therefore easily be moved between otherwise incompatible systems and software and reformatted quickly into a logical structure. For instance, this example from W3Schools (2008) shows how a logical structure (defined by the content creator) is established, used and expressed in XML:

[Figure: tree-style diagram of a bookstore structure created with XML]



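<bookstore>
  <book>
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book>
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>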


You can see that tags are descriptive and indicate the type of text contained, rather than defining the presentation or universal role (e.g. "title") of the content. This may not look revolutionary, but it does represent a whole new way of structuring information for the web, one that already makes, and will continue to make, for more intelligent and flexible data usage and re-purposing, which should also have an impact on the accuracy and relevance of semantic web search. Since XML is very much format-neutral and descriptive, it can describe sound, video, text etc. in comparable and equally machine-readable ways. With HTML 5 (the current draft of which includes specific tags for embedding sound, video etc.), XML will be able to support fully multi-modal materials in a way that makes them as easy to connect to, find and manipulate as current HTML pages, if not easier. In addition, the separation of form and content offers far greater support for accessibility tools than traditional data structures. It is, for instance, obvious from looking at the XML in the example above that a screen reader may be written or adapted to use the contextual structure to better explain the content to a visually impaired reader. Similarly, tools used to change the appearance of the web to suit users (high contrast, moving or larger text etc.) will be able to work better with XML data than with data loaded only in HTML.
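
To make the point about machine readability and re-purposing concrete, here is a minimal Python sketch using only the standard library. It assumes the bookstore records use the element names bookstore, book, title, author, year and price; the function names (load_books, to_html_list, describe_for_screen_reader) are hypothetical and exist only for this illustration. It reads the same data once and then presents it in two different ways: as an HTML list and as a spoken-style description of the kind an accessibility tool might generate.

```python
import xml.etree.ElementTree as ET
from html import escape

# Bookstore data embedded so the sketch is self-contained.
# Element names are an assumption based on the example discussed above.
BOOKSTORE_XML = """
<bookstore>
  <book>
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book>
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
"""


def load_books(xml_text):
    """Parse the XML into a list of plain dictionaries, one per book."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.findall("book"):
        title = book.find("title")
        books.append({
            "title": title.text,
            "lang": title.get("lang"),
            "author": book.findtext("author"),
            "year": int(book.findtext("year")),
            "price": float(book.findtext("price")),
        })
    return books


def to_html_list(books):
    """One possible presentation: an HTML bulleted list."""
    items = [
        f"<li><em>{escape(b['title'])}</em>, {escape(b['author'])} ({b['year']})</li>"
        for b in books
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def describe_for_screen_reader(book):
    """Another presentation: a sentence that names each field explicitly."""
    return (f"{book['title']}, by {book['author']}, "
            f"published {book['year']}, price {book['price']:.2f}")


if __name__ == "__main__":
    books = load_books(BOOKSTORE_XML)
    print(to_html_list(books))
    for b in books:
        print(describe_for_screen_reader(b))
```

Because the tags say what each value is, the same file can also be filtered (for instance, all books from 2005) without any knowledge of how it will eventually be displayed.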

Also important to the semantic web (and indeed the social web) is the concept of the Resource Description Framework or RDF (Bilder 2008), which allows the authoring of content to be declared with information such as date of edit, author's name, copyright information etc. A combination of RDF and XML allows for the construction of traceable, machine-readable information with a high level of context and connectivity (e.g. to other works by another author). For real benefit, however, RDF and XML must be applied at the point of creating resources, and this is why scholarly communications, including e-learning materials, should be looking towards creating materials in XML so that they can be used effectively in the new semantic web.
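
As a rough illustration of declaring authorship information in RDF, the following Python sketch builds a small RDF/XML description using the standard library. It assumes the Dublin Core element set (creator, date, rights) as the vocabulary, which is one common choice rather than the only one; the function name build_rdf_description, the example.org URL and the author details are all hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF and for the Dublin Core element set.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_rdf_description(resource_url, creator, date, rights):
    """Return an RDF/XML string describing who made a resource, when, and under what rights."""
    # Ask ElementTree to use readable prefixes when serialising.
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)

    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # rdf:about identifies the resource that the statements are about.
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": resource_url},
    )
    for name, value in (("creator", creator), ("date", date), ("rights", rights)):
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # All values below are illustrative placeholders.
    print(build_rdf_description(
        "http://example.org/posts/semantic-web",
        "A. Author",
        "2008-04-12",
        "Copyright 2008 A. Author",
    ))
```

Generating this description at the moment the resource is created, rather than adding it later, is what keeps the authorship trail reliable.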
