XML – Elements in Document Type Definitions (DTD)

In this tutorial you will learn about elements in a DTD: child elements (nested elements), declaring elements with character data only, with mixed content, with any content and with no content, and element order indicators and qualifiers.

Elements in DTD.

ELEMENTS

Every element used in a valid XML document must be declared in the document's DTD.

SYNTAX: <!ELEMENT element_name content_specification>

element_name: Specifies the name of the XML element (tag)
content_specification: Specifies the content of the element, which can be one of the following five types

I) Standard Content
II) Only Character Data
III) Mixed Content
IV) Any Type of Content
V) No Content

CHILD ELEMENTS (NESTED ELEMENTS)

Most Element declarations define one or more child elements.

For example, <!ELEMENT customer (customer_name)>

Here, the element customer contains one and only one nested element, customer_name.

<!ELEMENT Address (Name, Street, City)>

Here, the element Address contains three nested elements, Name, Street and City, in that order.

SYNTAX: <!ELEMENT parent (child)>

OR

<!ELEMENT parent (child1, child2, . . . , childN)>
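
As a rough sketch of how such a declaration fits into a complete document, the file below embeds the Address declaration in an internal DTD subset. The text-only declarations for Name, Street and City and the sample values are assumptions added for illustration, since every element used in a valid document must be declared.

```xml
<?xml version="1.0"?>
<!DOCTYPE Address [
  <!-- Address must contain exactly Name, Street and City, in this order -->
  <!ELEMENT Address (Name, Street, City)>
  <!-- Assumed leaf declarations: each child holds text only -->
  <!ELEMENT Name   (#PCDATA)>
  <!ELEMENT Street (#PCDATA)>
  <!ELEMENT City   (#PCDATA)>
]>
<Address>
  <Name>Nick Price</Name>
  <Street>12 High Street</Street>
  <City>London</City>
</Address>
```

A validating parser should reject this document if any of the three children is missing, repeated or out of order.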

DECLARING ELEMENTS WITH CHARACTER DATA ONLY

Top-level elements generally contain other elements, but low-level elements may contain parsed character data. In a DTD, #PCDATA is the keyword used to declare elements with parsed character data. An element declared as (#PCDATA)

  • Can contain character data
  • Can contain entity references such as &lt; and &gt;
  • Cannot contain other elements

SYNTAX: <!ELEMENT element_name (#PCDATA)>

Example:

<!ELEMENT Street (#PCDATA)>

Here, the element Street contains parsed character data.

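Note that #CDATA is not a valid keyword in an element declaration; a declaration such as <!ELEMENT City (#CDATA)> is rejected by a conforming parser. CDATA appears in two other places in XML: as an attribute type in <!ATTLIST> declarations, and as CDATA sections in the document itself.

A CDATA section, written <![CDATA[ . . . ]]>, can be used inside any element declared as (#PCDATA). Its content is passed through as literal text, so characters such as < and & need no escaping.

Example: <!ELEMENT City (#PCDATA)>

Here, the DTD declares the City element to contain character data.
For the document fragment <City> London </City>
the XML parser reports the data as " London ", including the surrounding spaces,
because whitespace inside character data is always passed on to the application.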
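
A minimal sketch of character data in practice, assuming hypothetical Note and Remark elements and made-up sample text: markup characters inside a (#PCDATA) element are written as entity references, or wrapped in a CDATA section in the document itself. The CDATA section belongs to the document, not to the element declaration.

```xml
<?xml version="1.0"?>
<!DOCTYPE Note [
  <!-- Note and Remark are hypothetical names used only for this sketch -->
  <!ELEMENT Note (Street, Remark)>
  <!ELEMENT Street (#PCDATA)>
  <!ELEMENT Remark (#PCDATA)>
]>
<Note>
  <!-- Entity references stand in for markup characters -->
  <Street>Corner of 5th &amp; Main, house number &lt; 10</Street>
  <!-- A CDATA section passes its content through as literal text -->
  <Remark><![CDATA[Use <Street> & <City> tags here]]></Remark>
</Note>
```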


DECLARING ELEMENTS WITH MIXED CONTENT

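At times an element must be declared with mixed content, i.e. both character data and child elements. In such situations #PCDATA is listed first, the allowed child elements follow it separated by the pipe symbol (|), and the whole group is closed with an asterisk (*).

SYNTAX: <!ELEMENT parent (#PCDATA | child1 | child2 | . . . | childN)*>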

Example:

<bank>
This account is Active
<account>123456</account>
This account is Closed
<account>423578</account>
</bank>
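
The bank example can be turned into a complete, checkable document as sketched below. The declaration for account is an assumption; the source shows only the instance, not its DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE bank [
  <!-- Mixed content: #PCDATA comes first, alternatives are separated by |, and the group ends with * -->
  <!ELEMENT bank (#PCDATA | account)*>
  <!-- Assumed declaration: account holds text only -->
  <!ELEMENT account (#PCDATA)>
]>
<bank>
This account is Active
<account>123456</account>
This account is Closed
<account>423578</account>
</bank>
```

A mixed-content declaration constrains only which child elements may appear, not their order or how many times they occur.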

DECLARING ELEMENTS WITH ANY CONTENT

In real-world scenarios, the developer is often unsure of the exact document structure while creating the DTD. In such cases, the ANY keyword comes in handy. An element declared as ANY can

  • Contain child elements
  • Contain character data
  • Contain mixed content

SYNTAX: <!ELEMENT element_name ANY>
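
A small sketch of ANY in use, with hypothetical notes and remark elements. One point worth keeping in mind is that ANY relaxes the content model only: any child element that appears inside must still be declared somewhere in the DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE notes [
  <!-- notes and remark are hypothetical names used only for this sketch -->
  <!ELEMENT notes ANY>
  <!-- Elements used inside an ANY element must still be declared -->
  <!ELEMENT remark (#PCDATA)>
]>
<notes>
  Free text mixed with <remark>declared child elements</remark> in any order.
</notes>
```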

DECLARING ELEMENTS WITH NO CONTENT

Sometimes an element is required to have only attributes and no content. In such scenarios the EMPTY keyword is used.

SYNTAX: <!ELEMENT element_name EMPTY>
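
As an illustration, the sketch below declares a hypothetical image element that carries its data in an assumed src attribute and has no content.

```xml
<?xml version="1.0"?>
<!DOCTYPE image [
  <!-- image and src are hypothetical names used only for this sketch -->
  <!ELEMENT image EMPTY>
  <!-- The attribute, not the element content, carries the data -->
  <!ATTLIST image src CDATA #REQUIRED>
]>
<image src="logo.png"/>
```

The form <image src="logo.png"></image> is equivalent, but any text or whitespace between the two tags makes the document invalid.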

ELEMENT ORDER INDICATORS AND QUALIFIERS

The order and qualification symbols are listed in the table below.

TYPE        VALUE   CONTEXT                   DESCRIPTION

ORDER       |       Choice                    Either one child element or another can occur
            ()      Group                     Groups related elements together
            ,       Sequence                  Element must follow another element

QUALIFIER   ?       Optional                  Elements appear once or not at all
            *       Optional and Repeatable   Elements appear zero or more times
            +       Required and Repeatable   Elements appear one or more times

EXAMPLES:

The pipe symbol (|) specifies choice, so the occurrence of either child element is considered valid by the parser.

The following declaration specifies that name must contain either first_name or last_name

<!ELEMENT name (first_name | last_name)>


Thus,
<name>
<first_name>Nick</first_name>
</name>

as well as

<name>
<last_name>Price</last_name>
</name>

are valid.


The sequence operator (,) specifies the order in which child elements must appear.
The following declaration requires first_name followed by middle_name followed by last_name

<!ELEMENT name (first_name, middle_name, last_name)>

<name>
<first_name>Nick</first_name>
<middle_name>John</middle_name>
<last_name>Price</last_name>
</name>

is valid while

<name>
<last_name>Price</last_name>
<first_name>Nick</first_name>
<middle_name>John</middle_name>
</name>

or

<name>
<middle_name>John</middle_name>
<first_name>Nick</first_name>
<last_name>Price</last_name>
</name>


or any other such combinations are invalid.

The optional operator (?) declares that the element appears either once or not at all.

Thus for the declaration

<!ELEMENT EVENT (LOCATION, SPONSOR?)>

<EVENT>
<LOCATION>West Bay Ballpark</LOCATION>
</EVENT>

OR

<EVENT>
<LOCATION>West Bay Ballpark</LOCATION>
<SPONSOR>Flying Toys</SPONSOR>
</EVENT>


are valid.

while

<EVENT>
<LOCATION>West Bay Ballpark</LOCATION>
<SPONSOR>Flying Toys</SPONSOR>
<SPONSOR>Plastic Toys</SPONSOR>
</EVENT>

is invalid.

The plus sign (+) is used to indicate one or more instances of the element.

Thus for the declaration

<!ELEMENT EVENTLIST (EVENT+)>

<EVENTLIST>
<EVENT>Balsa Wood Flyer Days</EVENT>
<EVENT>Sundays in the Park</EVENT>
<EVENT>Teach Your Child to Fly</EVENT>
</EVENTLIST>

or

<EVENTLIST>
<EVENT>Balsa Wood Flyer Days</EVENT>
</EVENTLIST>

are valid

while

<EVENTLIST>
</EVENTLIST>

is not valid.

The asterisk (*) signifies zero or more appearances of the element.

Thus for the declaration

<!ELEMENT EVENT (LOCATION*, EVENT-NAME)>

<EVENT>
<LOCATION>West Bay Ballpark</LOCATION>
<LOCATION>North Side Park</LOCATION>
<EVENT-NAME>Sundays in the Park</EVENT-NAME>
</EVENT>


OR

<EVENT>
<EVENT-NAME>Sundays in the Park</EVENT-NAME>
</EVENT>

are valid.
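
The indicators and qualifiers can also be combined, with parentheses grouping a sub-expression so that a qualifier applies to the whole group. The sketch below is one possible combination, not a declaration taken from the examples above; the VENUE element, the content model for EVENT and the value "Main Pavilion" are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE EVENTLIST [
  <!-- One or more EVENT elements -->
  <!ELEMENT EVENTLIST (EVENT+)>
  <!-- A name, then one or more of LOCATION or VENUE in any mix, then an optional SPONSOR -->
  <!ELEMENT EVENT (EVENT-NAME, (LOCATION | VENUE)+, SPONSOR?)>
  <!ELEMENT EVENT-NAME (#PCDATA)>
  <!ELEMENT LOCATION (#PCDATA)>
  <!-- VENUE is a hypothetical element added for this sketch -->
  <!ELEMENT VENUE (#PCDATA)>
  <!ELEMENT SPONSOR (#PCDATA)>
]>
<EVENTLIST>
  <EVENT>
    <EVENT-NAME>Sundays in the Park</EVENT-NAME>
    <LOCATION>West Bay Ballpark</LOCATION>
    <VENUE>Main Pavilion</VENUE>
    <SPONSOR>Flying Toys</SPONSOR>
  </EVENT>
  <EVENT>
    <EVENT-NAME>Balsa Wood Flyer Days</EVENT-NAME>
    <LOCATION>North Side Park</LOCATION>
  </EVENT>
</EVENTLIST>
```

Both EVENT elements satisfy the content model: the first uses the group twice and includes the optional SPONSOR, the second uses the group once and omits it.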
