Friday, August 26, 2011

Document Object Model (DOM)

I remember once attending a seminar by Microsoft where they were describing themselves as the cleverest people in the world for inventing what they called the Component Object Model (COM). It was only years later that I learned that they didn't really invent the idea at all, but borrowed it from existing object-oriented programming languages.

Anyway, the Document Object Model (DOM), although it rhymes with COM, has nothing to do with Microsoft, but is a World Wide Web Consortium (W3C) standard defined as:

A platform and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of a document.

According to the W3Schools tutorial on DOM:

The DOM is separated into 3 different parts / levels:

  • Core DOM - standard model for any structured document
  • XML DOM - standard model for XML documents
  • HTML DOM - standard model for HTML documents

The DOM defines the objects and properties of all document elements, and the methods (interface) to access them. [It] is a standard for how to get, change, add, or delete [XML/HTML] elements.

According to the tutorial:

DOM views an HTML document as a tree-structure ... called a node-tree.

And it goes on to show a diagram of a node tree. I'm not sure how useful this form of representation is for me, but I'll show it below for completeness:

Essentially, the DOM breaks web documents into a set of objects, which it calls nodes, and these nodes have properties which can be read or acted upon by methods. The Tutorial provides a link to a reference page listing DOM objects and their properties, along with JavaScript objects and Browser objects.
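To make the idea of nodes with properties a little more concrete, here is a toy sketch in plain JavaScript. It is not the real DOM API: makeNode and findById are hypothetical helpers invented for illustration, and the tree is just nested objects. It only mimics the general shape of a node tree and of looking up a node by its id.

```javascript
// Toy model of a node tree. NOT the real DOM API: just plain objects
// that mimic the idea of nodes with properties and child nodes.
function makeNode(tagName, id, children, text) {
  return {
    tagName: tagName,
    id: id || null,
    children: children || [],
    text: text || ""
  };
}

// Hypothetical helper: depth-first search for a node with a matching id,
// loosely analogous in spirit to document.getElementById().
function findById(node, id) {
  if (node.id === id) {
    return node;
  }
  for (const child of node.children) {
    const found = findById(child, id);
    if (found !== null) {
      return found;
    }
  }
  return null;
}

// A tiny tree: html -> body -> (h1, p#intro)
const tree = makeNode("html", null, [
  makeNode("body", null, [
    makeNode("h1", null, [], "My title"),
    makeNode("p", "intro", [], "Hello World!")
  ])
]);

const intro = findById(tree, "intro");
console.log(intro.tagName + ": " + intro.text); // prints "p: Hello World!"
```

A real browser builds a much richer tree than this (text and attributes are nodes in their own right, for instance), but the principle of walking objects and reading their properties is the same.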

The tutorial gives a "hello world" example, which to me illustrates the extraordinary messiness of this language/methodology. Before citing the example, I shall mention that two of the objects listed on the reference page are the document object and the HTMLElement object. One of the methods listed for the document object is the getElementById() method. One of the properties listed for the HTMLElement object is .innerHTML, which in any other language would probably be .text. Another is .id. So now here is the code:


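<p id="intro">Hello World!</p>

<script type="text/javascript">
// Look up the paragraph by its id and read its innerHTML property
var txt = document.getElementById("intro").innerHTML;
document.write("<p>The intro paragraph text is: " + txt + "</p>");
</script>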


So the first paragraph is assigned the id "intro" by the HTML. Then in the JavaScript this HTMLElement object is referred to by its id, and its innerHTML property is assigned to the JavaScript variable txt. The content of this variable is then concatenated with the text of a second paragraph created with the JavaScript document.write method. The result is shown in the image below:

That seems a lot of work to achieve something fairly trivial, but at least now I have some understanding of what DOM is all about. It is essentially a labeling system for HTML elements, which enables them to be changed programmatically (usually with JavaScript). In practice it feels like an extension of JavaScript to embrace HTML elements and their properties, although, as the W3C definition quoted above says, the interface itself is language-neutral.
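As a rough illustration of changing elements programmatically, here is a minimal sketch that does the same job as the tutorial example without document.write, which is generally discouraged these days. The function name appendIntroReport and its doc parameter are my own inventions, not part of the tutorial. The calls it makes (getElementById, createElement, appendChild and the textContent property) are standard DOM members, and the sketch assumes the page contains the id="intro" paragraph from the example above.

```javascript
// Sketch: read the text of #intro and append a new paragraph reporting it.
// `doc` is any object exposing getElementById, createElement and
// body.appendChild; in a browser, pass the global `document`.
function appendIntroReport(doc) {
  const intro = doc.getElementById("intro");
  if (intro === null) {
    return null; // no element with that id on the page
  }
  const report = doc.createElement("p");
  // textContent is treated as plain text, never parsed as markup
  report.textContent = "The intro paragraph text is: " + intro.textContent;
  doc.body.appendChild(report);
  return report;
}

// Run only where a real DOM exists (for example, inside a browser page).
if (typeof document !== "undefined") {
  appendIntroReport(document);
}
```

Passing the document in as a parameter is purely a design choice for this sketch: it keeps the function free of globals, so it can be tried out with a stand-in object outside a browser.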
