XML – Elements, Attributes, Entities – IT Training and Consulting

In this tutorial you will learn about elements, the anatomy of tags, tag naming rules, invalid and valid tags, root and child elements, attributes, when to use attributes, entities, character data sections, comments and processing instructions.

Elements

Elements are the basic building blocks of XML.

An element may contain
– Other elements
– Character data
– Character references
– Entity references
– Comments

These are collectively known as element content.

Ex: <student>Mason Hill</student>

An element consists of three parts
1. Opening tag: <student>
2. Content: Mason Hill
3. Closing tag: </student>
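
The three parts can also be inspected programmatically. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (the tutorial itself does not depend on any particular parser).

```python
import xml.etree.ElementTree as ET

# Parse one element made of an opening tag, content and a closing tag.
student = ET.fromstring("<student>Mason Hill</student>")

print(student.tag)   # the name shared by the opening and closing tags
print(student.text)  # the character data between the two tags
```

Here student.tag holds the element name and student.text holds its content.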

Anatomy of tags

Every element must have an opening tag and a closing tag. The opening tag is the element name written between a less-than sign (<) and a greater-than sign (>), for example <student>. The closing tag is written the same way, with a forward slash (/) immediately after the less-than sign, for example </student>. No space is allowed between < (or </) and the element name.

The data between the opening and closing tags of an element is its content.

For example,

<student>Nick Price</student>

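Here Nick Price is the content of the element. An XML parser passes any whitespace inside the element on to the application, but most browsers and many other applications ignore leading and trailing whitespace in the content, so for them

<student> Nick Price </student>

is the same as

<student>
Nick Price
</student>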
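
How whitespace is treated depends on the consuming application. As a rough sketch, assuming Python's standard xml.etree.ElementTree module, the parser reports the whitespace exactly as written, and the application decides whether to trim it.

```python
import xml.etree.ElementTree as ET

compact = ET.fromstring("<student>Nick Price</student>")
spread = ET.fromstring("<student>\nNick Price\n</student>")

# The parser keeps the line breaks, so the raw text differs.
raw_equal = compact.text == spread.text

# After trimming, both forms carry the same content.
trimmed_equal = compact.text.strip() == spread.text.strip()

print(raw_equal, trimmed_equal)
```

Trimming with strip() is a choice made by this example, not something the XML parser does on its own.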

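Note: Unlike HTML, XML does not allow an unclosed single tag such as <br>. An element without content must either have a closing tag or be written as an empty-element tag, for example <br/>.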

Tag naming rules

An XML name must begin with
– a letter,
– an underscore (_), or
– a colon (:).

The remaining characters may be any of the above, plus digits, hyphens (-) and full stops (.).
The colon character should not be used, except as a namespace delimiter.
XML names are not limited to ASCII characters; letters from other scripts, including ideographic characters, may be used.
A name should not begin with the string "xml" in any combination of upper and lower case ("xml", "XML", "Xml" and so on), because such names are reserved.

Based on the above rules, here are examples of invalid and valid tags.

Invalid tags

<.stock></.stock>
<1product></1product>
<product^stock></product^stock>

Valid tags

<_stock></_stock>
<product1></product1>
<product-stock></product-stock>
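
The naming rules can be checked by handing candidate names to a parser. This is a hedged sketch, assuming Python's standard xml.etree.ElementTree module. The helper is_well_formed is a hypothetical name introduced here, and 1product is an illustrative name chosen to show a name that starts with a digit.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Names that break the rules: leading full stop, leading digit, illegal character.
invalid_names = [".stock", "1product", "product^stock"]
# Names that follow the rules.
valid_names = ["_stock", "product1", "product-stock"]

invalid_results = [is_well_formed(f"<{name}></{name}>") for name in invalid_names]
valid_results = [is_well_formed(f"<{name}></{name}>") for name in valid_names]

print(invalid_results)
print(valid_results)
```

Note that this check only covers what the parser enforces. The reservation of names beginning with "xml" is a convention that this parser does not reject.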

Root and child elements

The root element is the first element in a document, and it contains all other elements. In the following example, student is the root element, and the elements contained within it (name, roll-number) are child elements.

<student>
<name>
Bill Gates
</name>
<roll-number>
55
</roll-number>
</student>
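
The root and child relationship can be explored in code. The following is a small sketch, assuming Python's standard xml.etree.ElementTree module; the indentation inside the sample document is only for readability.

```python
import xml.etree.ElementTree as ET

document = """
<student>
  <name>Bill Gates</name>
  <roll-number>55</roll-number>
</student>
"""

root = ET.fromstring(document)

# Iterating over an element yields its direct children in document order.
child_tags = [child.tag for child in root]

# find() looks up the first direct child with the given name.
roll_number = root.find("roll-number").text

print(root.tag, child_tags, roll_number)
```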

In XML, tags cannot overlap. The opening and closing tags of a child element must both lie inside its parent element, and sibling elements must not overlap each other. The following example is therefore not allowed.

<student>
<name>
<roll-number>
Jason
</name>
</roll-number>
</student>

The proper format is as follows.

<student>
<name>
Jason
</name>
<roll-number>
</roll-number>
</student>
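
A parser enforces this nesting rule. As a minimal sketch, assuming Python's standard xml.etree.ElementTree module (the helper name parses is hypothetical), the overlapping form is rejected while the properly nested form is accepted.

```python
import xml.etree.ElementTree as ET

overlapping = "<student><name><roll-number>Jason</name></roll-number></student>"
nested = "<student><name>Jason</name><roll-number></roll-number></student>"

def parses(xml_text):
    """Return True if xml_text is accepted by the parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

overlapping_ok = parses(overlapping)  # closing tags are in the wrong order
nested_ok = parses(nested)            # every element closes inside its parent

print(overlapping_ok, nested_ok)
```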

The root element is also called the document element. A document has exactly one root element, and all other elements lie within it.

NOTE: An element can be empty, i.e. contain no data, like the roll-number element in the above example. Such elements are called EMPTY ELEMENTS and can also be written as a single empty-element tag, for example <roll-number/>.
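
XML allows an empty element to be written either as a start tag followed directly by an end tag or as one empty-element tag ending in />. The sketch below, assuming Python's standard xml.etree.ElementTree module, suggests that a parser treats both spellings alike.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<roll-number></roll-number>")
short_form = ET.fromstring("<roll-number/>")

# Neither form has text or child elements.
both_empty = (
    long_form.text is None
    and short_form.text is None
    and len(long_form) == 0
    and len(short_form) == 0
)

print(both_empty)
```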

Attributes

Attributes give additional information about elements. They can be specified only in the start tag of an element (or in an empty-element tag), and their values must always be enclosed in quotation marks, either double or single. This is unlike HTML, where attribute values may also be written without quotation marks.

Syntax: <tag attribute="value">content</tag>

Example:

<problem size="huge" cause="unknown" solution="run away"/>

If elements are the "nouns" of XML, then attributes are its "adjectives". An element can have zero, one or more attributes. However, an attribute name can appear only once within an element.

Bad: <Test name="John" name="Doe"/>
Good: <Test first="John" last="Doe"/>
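
Attribute handling can be tried out with a parser. This is a hedged sketch, assuming Python's standard xml.etree.ElementTree module. It reads attributes as a dictionary, shows that the XML specification accepts single as well as double quotes, and confirms that a repeated attribute name is rejected.

```python
import xml.etree.ElementTree as ET

problem = ET.fromstring('<problem size="huge" cause="unknown" solution="run away"/>')
attributes = problem.attrib  # attribute names mapped to their values

# Single-quoted values are equally acceptable in XML.
single_quoted = ET.fromstring("<problem size='huge'/>")

# Repeating an attribute name makes the document not well-formed.
try:
    ET.fromstring('<Test name="John" name="Doe"/>')
    duplicate_rejected = False
except ET.ParseError:
    duplicate_rejected = True

print(attributes, single_quoted.get("size"), duplicate_rejected)
```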

The million dollar question

When do I use attributes?

Unfortunately, there is no definite answer to this question, and there are many conflicting views on the use of attributes. A widely accepted view is that attributes should hold metadata, i.e. data about data, and their use is recommended in such cases. One example is a lang attribute that describes the language of the element's content.

Entities

Entity references are placeholders for characters that are reserved in the language or that may otherwise be misinterpreted. For example, the less-than (<) and greater-than (>) symbols are reserved for delimiting tags. If the content of an element itself contained one of these symbols, the data would be misinterpreted. Entity references are used to avoid this. The ampersand (&) symbol is reserved to mark the start of an entity reference, and a semicolon (;) marks its end.

The predefined entities are as follows

&lt;	<	LESS THAN
&gt;	>	GREATER THAN
&amp;	&	AMPERSAND
&quot;	"	QUOTATION MARK
&apos;	'	APOSTROPHE
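
Entity references are converted in both directions by standard tools. The following is a minimal sketch, assuming Python's standard library: the parser turns references back into characters, and xml.sax.saxutils.escape produces the references when writing text into XML. The note element is a hypothetical example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Reading: the parser replaces each entity reference with its character.
note = ET.fromstring("<note>5 &lt; 7 &amp; 7 &gt; 5</note>")
decoded = note.text

# Writing: escape() replaces &, < and > with entity references.
encoded = escape("5 < 7 & 7 > 5")

print(decoded)
print(encoded)
```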

Character data sections

Character data (CDATA) sections contain raw text that the XML parser does not interpret as markup, so characters such as < and & can appear in them literally.

Syntax : <![CDATA[ raw data ]]>

Example:

<book ISIN="INB101235647">
<author>
Kacey Price
<![CDATA[ Kacey has also authored "Complete Reference" series]]>
</author>
</book>
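
To see what "not parsed" means in practice, here is a small sketch, assuming Python's standard xml.etree.ElementTree module. The CDATA content containing <b> and & is a hypothetical example. The parser hands it over as plain text and does not create a b element.

```python
import xml.etree.ElementTree as ET

author = ET.fromstring(
    "<author>Kacey Price<![CDATA[ wrote <b>Complete Reference</b> & more]]></author>"
)

text = author.text         # CDATA content is merged into the element text
child_count = len(author)  # the <b> inside CDATA did not become a child element

print(text)
print(child_count)
```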

Comments

Comments are enclosed in <!-- Comments -->

Example :

<!-- This is the start of the second child element -->
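
Comments are not element content, but some parsers still expose them. The sketch below assumes Python's standard xml.dom.minidom module, which keeps comments as separate nodes next to the elements; the surrounding student document is a hypothetical example.

```python
from xml.dom import Node, minidom

doc = minidom.parseString(
    "<student>"
    "<name>Jason</name>"
    "<!-- This is the start of the second child element -->"
    "<roll-number>55</roll-number>"
    "</student>"
)
root = doc.documentElement

# Separate the comment nodes from the element nodes.
comments = [n.data for n in root.childNodes if n.nodeType == Node.COMMENT_NODE]
elements = [n.tagName for n in root.childNodes if n.nodeType == Node.ELEMENT_NODE]

print(comments)
print(elements)
```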

Processing instructions

Processing instructions pass information to applications, which use it to carry out special tasks.

Syntax : <?target data?>

Example

<?xml version="1.0" encoding="ISO-8859-1"?>

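NOTE: Here version specifies the version of XML being used, while encoding names the character encoding of the document so that parsers can decode it. Strictly speaking, this particular line is the XML declaration, which shares the syntax of a processing instruction and must be the very first thing in the document.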
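
A parser can report processing instructions to the application. This is a hedged sketch, assuming Python's standard xml.dom.minidom module. The xml-stylesheet instruction is a commonly used processing instruction, and the file name style.css is a hypothetical value.

```python
from xml.dom import Node, minidom

source = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>'
    b'<?xml-stylesheet type="text/css" href="style.css"?>'
    b"<student>Jason</student>"
)
doc = minidom.parseString(source)

# Collect (target, data) pairs for every processing instruction at document level.
instructions = [
    (n.target, n.data)
    for n in doc.childNodes
    if n.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]

print(instructions)
```

With this parser, only the xml-stylesheet instruction shows up in the list; the XML declaration is consumed by the parser and is not reported as a processing instruction node.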

[Figure: An XML document displayed in IE 5.0 or above.]
