Settings Files (2)

From TrillWiki


by DREADNOUGHT on 09/04/04 Source

To give you a better idea of how these things work, let's take the example given above and go step by step. First, the whole thing:

<!ENTITY % color 'blue'>
<!ENTITY % ENTcolor '<!ENTITY &#37; colordtd SYSTEM "mods\%color;\colors.dtd" >'>
%ENTcolor;
%colordtd;

Yum... BlueTurtle code -Tometheus

Now, at the beginning, we have

<!ENTITY % color 'blue'>

That's the easy part. It's a dtd entity, %color;, defined as 'blue'. Next, we have this long entity:

<!ENTITY % ENTcolor '<!ENTITY &#37; colordtd SYSTEM "mods\%color;\colors.dtd" >'>

That's a dtd entity, %ENTcolor;, and when it's called, two things get processed. First, the &#37; forces the parser to evaluate the string, since parsers are required to translate it to '%' immediately. Second, while the string is being parsed, '%color;' gets converted to 'blue'. Now consider what happens when the line is processed by the call:

%ENTcolor;

Since %ENTcolor; is a dtd entity, it gets evaluated immediately, and Trillian (or whatever is parsing the XML) now sees that string as if it were part of the dtd itself:

<!ENTITY % colordtd SYSTEM "mods\blue\colors.dtd">
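The two substitutions can be mimicked with ordinary string handling. The following Python sketch is only a toy model, not Trillian's parser: the helper name replacement_text and its regular expression are assumptions made for illustration, and it only handles decimal character references and dtd entity references.

```python
import re

def replacement_text(literal, params):
    """Toy model: resolve &#NN; and %name; in an entity's quoted value."""
    def substitute(match):
        if match.group(1) is not None:
            # Decimal character reference, e.g. &#37; becomes '%'
            return chr(int(match.group(1)))
        # dtd (parameter) entity reference, e.g. %color; becomes 'blue'
        return params[match.group(2)]
    # A single left-to-right pass: the '%' produced by &#37; is never
    # re-read as the start of another reference.
    return re.sub(r"&#([0-9]+);|%([A-Za-z_][\w.-]*);", substitute, literal)

params = {"color": "blue"}
literal = r'<!ENTITY &#37; colordtd SYSTEM "mods\%color;\colors.dtd" >'
params["ENTcolor"] = replacement_text(literal, params)
print(params["ENTcolor"])
# <!ENTITY % colordtd SYSTEM "mods\blue\colors.dtd" >
```

Changing the value of color in params to another folder name would change the resulting path in the same way, which is the point of the indirection.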

Finally, the dtd calls %colordtd;. Now that we've defined the %colordtd; entity (albeit in a roundabout way), it can be safely used, and the parser evaluates the contents of %colordtd;. Previously, when we evaluated %ENTcolor;, it got turned into the string

<!ENTITY % colordtd SYSTEM "mods\blue\colors.dtd">

but the same thing does not happen for %colordtd; because it is a SYSTEM entity. To put that another way, Trillian (or whatever is parsing the XML) does not see the string "mods\blue\colors.dtd" as if it were part of the dtd itself. Instead, it sees the contents of that file as if they were a part of the dtd itself.

Let's say mods\blue\colors.dtd contained this:

<!ENTITY windowColor '
<color red="255" green="128" blue="0">
<rect>
<left num="0" width="0"/><right num="0" width="1"/>
<top num="0" height="0"/><bottom num="0" height="1"/>
</rect>
</color>
'>


Once our dtd above has been processed, our original contents:

<!ENTITY % color 'blue'>
<!ENTITY % ENTcolor '<!ENTITY &#37; colordtd SYSTEM "mods\%color;\colors.dtd" >'>
%ENTcolor;
%colordtd;


will have been changed into this (without the comments that I've added):

<!ENTITY % color 'blue'>
<!ENTITY % ENTcolor '<!ENTITY &#37; colordtd SYSTEM "mods\%color;\colors.dtd" >'>
<!-- the following line used to be %ENTcolor; -->
<!ENTITY % colordtd SYSTEM "mods\blue\colors.dtd">
<!-- the following line used to be %colordtd; -->
<!ENTITY windowColor '
<color red="255" green="128" blue="0">
<rect>
<left num="0" width="0"/><right num="0" width="1"/>
<top num="0" height="0"/><bottom num="0" height="1"/>
</rect>
</color>
'>


Now, let's say that we have this in our trillian.xml file:

%color;
%ENTcolor;
%colordtd;
&windowColor;


Once that has been processed, it will look like this:

%color;
%ENTcolor;
%colordtd;
<color red="255" green="128" blue="0">
<rect>
<left num="0" width="0"/><right num="0" width="1"/>
<top num="0" height="0"/><bottom num="0" height="1"/>
</rect>
</color>


Notice that the dtd entities didn't get parsed. If we had tried to force them to be parsed by using, for example, &ENTcolor; instead of %ENTcolor;, we would have gotten an error, because &ENTcolor; has not been defined; only %ENTcolor; has.
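This behaviour can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree as a stand-in, on the assumption that Trillian's parser behaves the same way for these cases. The skin root element, the inline DOCTYPE (instead of a separate trillian.dtd), and the shortened windowColor value are assumptions made to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Internal DTD standing in for trillian.dtd (assumption: the real skin
# keeps these declarations in a separate file).
DTD = """<!DOCTYPE skin [
  <!ENTITY % color 'blue'>
  <!ENTITY windowColor '<color red="255" green="128" blue="0"/>'>
]>
"""

def parse(body):
    """Parse a <skin> element whose content is the given text."""
    return ET.fromstring(DTD + "<skin>" + body + "</skin>")

# %color; is left alone in the XML, while &windowColor; is expanded.
skin = parse("%color;&windowColor;")
print(repr(skin.text))                # '%color;'
print([child.tag for child in skin])  # ['color']

# &color; was never declared (only %color; was), so parsing fails.
try:
    parse("&color;")
    undefined_rejected = False
except ET.ParseError as error:
    undefined_rejected = True
    print("rejected:", error)
```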

Now, let's see another example, to try the opposite (using an XML entity in the DTD).

Let's start with a fairly simple trillian.dtd:

<!ENTITY ent1 '<foo/>'>
<!ENTITY % ent2 '&ent1;'>
<!ENTITY ent3 '%ent2;'>
<!ENTITY ent4 '&ent1;'>


The XML parser will see %ent2; defined as the string "&ent1;" (not as "<foo/>"), and it will also see &ent3; defined as the string "&ent1;" (not as "%ent2;"). The entity &ent4;, as you might expect, is also defined as the string "&ent1;"

If we were to use all of these in trillian.xml:

&ent1;
%ent2;
&ent3;
&ent4;


the result would be

<foo/>
%ent2;
<foo/>
<foo/>


You might be wondering, "You told me &ent3; and &ent4; were both defined as the string &ent1;. Why did &ent3; and &ent4; evaluate to <foo/> in the xml instead of evaluating to the string &ent1;?" This happens because when trillian.xml was being parsed, the parser didn't just use the contents of &ent3; and &ent4; directly; it evaluated them first, similar to how %ENTcolor; was evaluated before its contents were placed into the dtd in the previous example.

If you had actually wanted &ent3; and &ent4; to contain the string "&ent1;" after being evaluated in trillian.xml, you would have needed to escape the ampersand in the declaration. For ent3, that means declaring it as:

<!ENTITY ent3 '&amp;ent1;'>
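To see the difference between a plain reference and an escaped one inside an entity value, the declarations can be run through a real parser. This sketch uses Python's xml.etree.ElementTree as a stand-in for Trillian's parser. The root, plain, and escaped element names are hypothetical, and ent3 is declared directly with the escaped ampersand rather than through %ent2;.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE root [
  <!ENTITY ent1 '<foo/>'>
  <!ENTITY ent3 '&amp;ent1;'>
  <!ENTITY ent4 '&ent1;'>
]>
<root><plain>&ent4;</plain><escaped>&ent3;</escaped></root>"""

root = ET.fromstring(DOC)
plain = root.find("plain")
escaped = root.find("escaped")

# &ent4; is evaluated all the way down to the <foo/> element.
print([child.tag for child in plain])  # ['foo']

# &ent3; stops at the literal text, because &amp; only yields '&'.
print(escaped.text)                    # &ent1;
```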


Now comes the tricky part (only the most curious or masochistic should read on, but I'll try to make it as simple as I can).

If we had an ent5:

<!ENTITY ent5 '&ent4;'>

and used it in our .xml file, it wouldn't show up as "&ent4;", nor even as "&ent1;", but as "<foo/>". It would seem as though ent5 got parsed repeatedly until it couldn't be parsed anymore, before it was finally placed into the .xml. However, this is not exactly the case: if we had declared ent5 to be '&amp;ent4;', it would have evaluated to '&ent4;', but it would not be evaluated again after that. What really happens for ent5 to evaluate all the way down to <foo/> is this:

  1. The parser looks in the .xml file and sees &ent5;
  2. The parser looks up ent5 and sees that it is "&ent4;"
  3. The parser parses that string, and when it gets to &ent4;, it has to go look that up as well.
  4. It looks up ent4 and parses it; ent4 contains "&ent1;", so the parser continues on with looking up ent1
  5. The parser finds ent1 and parses it; it contains "<foo/>", which is already fully parsed; the parser is done

For anyone out there that didn't notice a difference, let's look at what the parser sees in the other string, "&amp;ent4;". It parses the string, and it first sees &amp;, which it converts to just &. The & symbol cannot be parsed any further, so it's done with that part of the string. It then looks to the rest of the string and sees just "ent4;". This cannot be parsed, so the parser is done.
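The lookup steps and the escaped case can be traced with a small model. This Python sketch is not a real XML parser: the expand helper, the lookups list, and the entity name ent5_escaped are all made up for illustration, and only the predefined amp entity is modelled.

```python
import re

# Declarations from the text, plus a hypothetical escaped variant.
ENTITIES = {
    "ent1": "<foo/>",
    "ent4": "&ent1;",
    "ent5": "&ent4;",
    "ent5_escaped": "&amp;ent4;",
}

def expand(text, lookups):
    """Toy model of reference expansion; records each entity looked up."""
    def substitute(match):
        name = match.group(1)
        if name == "amp":
            # Predefined entity: yields a literal '&' that is not re-read
            return "&"
        lookups.append(name)
        # The looked-up value is itself parsed, so nested references expand
        return expand(ENTITIES[name], lookups)
    return re.sub(r"&(\w+);", substitute, text)

for reference in ("&ent5;", "&ent5_escaped;"):
    lookups = []
    print(reference, "->", expand(reference, lookups), "via", lookups)
# &ent5; -> <foo/> via ['ent5', 'ent4', 'ent1']
# &ent5_escaped; -> &ent4; via ['ent5_escaped']
```

The first line of output follows the numbered steps: each lookup triggers the next until nothing is left to look up. The second stops after one lookup, because the "&" produced from the escaped ampersand and the "ent4;" after it are never joined back into a reference.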

