XML – Elements in Document Type Definitions (DTD)
In this tutorial you will learn about elements in a DTD: child elements (nested elements), declaring elements with character data only, with mixed content, with any content and with no content, and the element order indicators and qualifiers.
ELEMENTS
Every element used in a valid XML document must be declared in the document’s DTD.
SYNTAX: <!ELEMENT element_name content_specification>
element_name: Specifies the name of the XML element
content_specification: Specifies the content of the element, which can be one of the following five types
I) Standard Content (child elements)
II) Only Character Data
III) Mixed Content
IV) Any Type of Content
V) No Content
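As a rough orientation, the five content types could be declared as in the sketch below. The element names (order, item, note, emphasis, extra, marker) are hypothetical and chosen only for illustration; each keyword is explained in the sections that follow.

```xml
<!-- Hypothetical element names, one declaration per content type -->
<!ELEMENT order    (item, note)>           <!-- I)   child elements only -->
<!ELEMENT item     (#PCDATA)>              <!-- II)  character data only -->
<!ELEMENT note     (#PCDATA | emphasis)*>  <!-- III) mixed content -->
<!ELEMENT emphasis (#PCDATA)>
<!ELEMENT extra    ANY>                    <!-- IV)  any content -->
<!ELEMENT marker   EMPTY>                  <!-- V)   no content -->
```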
CHILD ELEMENTS (NESTED ELEMENTS)
Most element declarations define one or more child elements.
For example, <!ELEMENT customer (customer_name)>
Here, the element customer contains one and only one nested element, customer_name.
<!ELEMENT Address (Name, Street, City)>
Here, the element Address contains three nested elements, Name, Street and City, in that order.
SYNTAX: <!ELEMENT parent (child)>
OR
<!ELEMENT parent (child1, child2, . . . , childN)>
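The Address declaration can be tried out in a complete document. The following is a minimal sketch that places the DTD in an internal subset; the address values are invented, and the child elements are assumed to hold plain text, so they are declared with #PCDATA (covered in the next section).

```xml
<?xml version="1.0"?>
<!DOCTYPE Address [
  <!ELEMENT Address (Name, Street, City)>
  <!ELEMENT Name    (#PCDATA)>
  <!ELEMENT Street  (#PCDATA)>
  <!ELEMENT City    (#PCDATA)>
]>
<!-- Valid: the three children appear exactly once, in the declared order -->
<Address>
  <Name>Nick Price</Name>
  <Street>12 Example Road</Street>
  <City>London</City>
</Address>
```

Every child named in the parent's content specification needs its own declaration, otherwise a validating parser reports the document as invalid.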
DECLARING ELEMENTS WITH CHARACTER DATA ONLY
Top-level elements generally contain other elements, while low-level elements usually contain parsed character data. In a DTD, #PCDATA is the keyword used to declare elements with parsed character data. An element declared as #PCDATA:
Can contain character data
Can contain entity references such as &lt; and &gt;
Cannot contain other elements
SYNTAX: <!ELEMENT element_name (#PCDATA)>
Example:
– <!ELEMENT Street (#PCDATA)>
• The element Street contains parsed character data
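To illustrate what "parsed" means here, the sketch below uses the Street declaration with invented text. The parser resolves the entity references in the content, so the application would receive the characters & and < rather than the escaped forms.

```xml
<?xml version="1.0"?>
<!DOCTYPE Street [
  <!ELEMENT Street (#PCDATA)>
]>
<!-- Valid: character data and entity references only, no child elements -->
<Street>Baker &amp; Main, house number &lt; 10</Street>
```

Wrapping part of the text in another element, for example a hypothetical <b> tag, would make the document invalid under this declaration.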
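Note that #CDATA is not a valid keyword in an element declaration; elements holding character data are always declared with #PCDATA. CDATA appears in two other places in XML: as an attribute type in <!ATTLIST> declarations, and in CDATA sections (<![CDATA[ ... ]]>), which let a document include text containing characters such as < and & without escaping them.
Whitespace inside a #PCDATA element is not stripped by the parser.
Example: <!ELEMENT City (#PCDATA)>
Here, the DTD declares the City element to contain character data.
In the XML document <City> London </City>
the XML parser passes the data to the application as “ London ” and not as “London”; any trimming is left to the application.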
DECLARING ELEMENTS WITH MIXED CONTENT
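At times it is required to declare elements with mixed content, i.e. both character data and other elements. In such situations #PCDATA is listed first, the permitted child elements are separated by the pipe symbol (|), and the whole group is followed by an asterisk (*).
SYNTAX: <!ELEMENT parent (#PCDATA | child1 | child2 | . . . | childN)*>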
Example:
<bank>
This account is Active
<account>123456</account>
This account is Closed
<account>423578</account>
</bank>
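For reference, a declaration that would accept the bank example might look like the sketch below. In XML 1.0 the mixed-content form is (#PCDATA | child1 | ... | childN)*, with #PCDATA first and a closing asterisk; the account element is assumed to hold plain text.

```xml
<?xml version="1.0"?>
<!DOCTYPE bank [
  <!ELEMENT bank    (#PCDATA | account)*>
  <!ELEMENT account (#PCDATA)>
]>
<!-- Valid: text and account elements may be interleaved freely -->
<bank>
  This account is Active
  <account>123456</account>
  This account is Closed
  <account>423578</account>
</bank>
```

A mixed-content declaration cannot constrain the order or the number of the child elements; it only limits which element names may appear among the text.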
DECLARING ELEMENTS WITH ANY CONTENT
In real-world scenarios, the developer is often unsure of the exact document structure while creating the DTD. At such times the ANY keyword comes in handy. An element declared as ANY can
- Contain child elements
- Contain character data
- Contain mixed content
SYNTAX: <!ELEMENT element_name ANY>
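A small sketch of ANY in use follows; the element names notes, title and remark are hypothetical. Note that ANY relaxes the structure of the element itself, but any child element that appears inside it must still be declared somewhere in the DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE notes [
  <!ELEMENT notes  ANY>
  <!ELEMENT title  (#PCDATA)>
  <!ELEMENT remark (#PCDATA)>
]>
<!-- Valid: text and declared elements in any order and any number -->
<notes>
  Free text is allowed here.
  <remark>Declared elements may appear in any order.</remark>
  <title>Draft</title>
</notes>
```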
DECLARING ELEMENTS WITH NO CONTENT
Sometimes an element is required to have only attributes and no content. In such scenarios the EMPTY keyword is used.
SYNTAX: <!ELEMENT element_name EMPTY>
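As an illustration, an empty element that carries its information in an attribute could be declared as sketched below. The element name image and the attribute name source are assumptions made for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE image [
  <!ELEMENT image EMPTY>
  <!ATTLIST image source CDATA #REQUIRED>
]>
<!-- Valid: no content at all, only an attribute -->
<image source="logo.png"/>
```

The form <image source="logo.png"></image> is equally valid, but nothing may appear between the tags, not even whitespace.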
ELEMENT ORDER INDICATORS AND QUALIFIERS
The symbols that govern element order and occurrence are listed in the table below.
| TYPE | VALUE | CONTEXT | DESCRIPTION |
|---|---|---|---|
| ORDER | \| | Choice | Either one child element or another can occur |
| | () | Group | Groups related elements together |
| | , | Sequence | Element must follow another element |
| QUALIFIER | ? | Optional | Elements appear once or not at all |
| | * | Optional and Repeatable | Elements appear zero or more times |
| | + | Required and Repeatable | Elements appear one or more times |
EXAMPLES:
The pipe symbol (|) specifies choice, so the occurrence of either child element is considered valid by the parser.
The following declaration specifies that name must contain either first_name or last_name
<!ELEMENT name (first_name | last_name)>
Thus,
<name>
<first_name>Nick</first_name>
</name>
as well as
<name>
<last_name>Price</last_name>
</name>
are valid.
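By contrast, a plain choice without a qualifier allows exactly one of the alternatives. The sketch below, which assumes both children hold plain text, is well-formed XML but would be rejected by a validating parser.

```xml
<?xml version="1.0"?>
<!DOCTYPE name [
  <!ELEMENT name       (first_name | last_name)>
  <!ELEMENT first_name (#PCDATA)>
  <!ELEMENT last_name  (#PCDATA)>
]>
<!-- Well-formed but NOT valid: the choice permits only one child -->
<name>
  <first_name>Nick</first_name>
  <last_name>Price</last_name>
</name>
```

An empty name element would be invalid as well, because one of the two alternatives is required.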
The sequence operator (,) is used to specify the order of child elements.
The following declaration requires first_name followed by middle_name followed by last_name
<!ELEMENT name (first_name, middle_name, last_name)>
<name>
<first_name>Nick</first_name>
<middle_name>John</middle_name>
<last_name>Price</last_name>
</name>
is valid while
<name>
<last_name>Price</last_name>
<first_name>Nick</first_name>
<middle_name>John</middle_name>
</name>
or
<name>
<middle_name>John</middle_name>
<first_name>Nick</first_name>
<last_name>Price</last_name>
</name>
or any other such combination is invalid.
The optional operator (?) declares that the element may appear once or not at all.
Thus, for the declaration
<!ELEMENT EVENT (LOCATION, SPONSOR?)>
<EVENT>
<LOCATION>West Bay Ballpark</LOCATION>
</EVENT>
OR
<EVENT>
<LOCATION>West Bay Ballpark</LOCATION>
<SPONSOR>Flying Toys</SPONSOR>
</EVENT>
are valid,
while
<EVENT>
<LOCATION>West Bay Ballpark</LOCATION>
<SPONSOR>Flying Toys</SPONSOR>
<SPONSOR>Plastic Toys</SPONSOR>
</EVENT>
is invalid.
The plus sign (+) is used to indicate one or more instances of the element.
Thus, for the declaration
<!ELEMENT EVENTLIST (EVENT+)>
<EVENTLIST>
<EVENT>Balsa Wood Flyer Days</EVENT>
<EVENT>Sundays in the Park</EVENT>
<EVENT>Teach Your Child to Fly</EVENT>
</EVENTLIST>
or
<EVENTLIST>
<EVENT>Balsa Wood Flyer Days</EVENT>
</EVENTLIST>
are valid,
while
<EVENTLIST>
</EVENTLIST>
is not valid.
The asterisk (*) signifies zero or more appearances of the element.
Thus, for the declaration
<!ELEMENT EVENT (LOCATION*, EVENT-NAME)>
<EVENT>
<LOCATION>West Bay Ballpark</LOCATION>
<LOCATION>North Side Park</LOCATION>
<EVENT-NAME>Sundays in the Park</EVENT-NAME>
</EVENT>
OR
<EVENT>
<EVENT-NAME>Sundays in the Park</EVENT-NAME>
</EVENT>
are valid.
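The examples above use each symbol on its own, but parentheses allow them to be combined by grouping a choice inside a sequence. The following is a sketch only: the WEBCAST element is hypothetical, and all leaf elements are assumed to hold plain text.

```xml
<?xml version="1.0"?>
<!DOCTYPE EVENT [
  <!-- One EVENT-NAME, then one or more LOCATION or WEBCAST elements
       in any mix, then zero or more SPONSOR elements -->
  <!ELEMENT EVENT      (EVENT-NAME, (LOCATION | WEBCAST)+, SPONSOR*)>
  <!ELEMENT EVENT-NAME (#PCDATA)>
  <!ELEMENT LOCATION   (#PCDATA)>
  <!ELEMENT WEBCAST    (#PCDATA)>
  <!ELEMENT SPONSOR    (#PCDATA)>
]>
<EVENT>
  <EVENT-NAME>Sundays in the Park</EVENT-NAME>
  <LOCATION>West Bay Ballpark</LOCATION>
  <WEBCAST>Live stream of the afternoon session</WEBCAST>
  <SPONSOR>Flying Toys</SPONSOR>
</EVENT>
```

Here the qualifier + applies to the whole group in parentheses, so a document with only WEBCAST elements and no LOCATION would also satisfy the declaration.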