bcxml taxonomy format

Introduction

This document describes the (early) 2003 revision of the bcxml taxonomy format. It is called a taxonomy because it uses an inheritance hierarchy; having a structure (for instance a hierarchy) is a core property of a taxonomy. As it formally specifies the concepts in the hierarchy by adding properties (and also describing/defining those properties), it can rightfully be called a (formal) ontology. Historically, though, it is called a taxonomy, so we'll stick with that.

The bcxml taxonomy format and the data format were originally developed in the eConstruct (http://www.econstruct.org/) project in 2000-2002. Main authors were Michel Böhms, Peter Bonsma, Reinout van Rees and Frits Tolman. The format described in this document introduces only relatively small changes.

The UML diagram gives an overview of the model. The utility concepts used by the rest of the model are white; the core concepts, taxonomy and object, are light yellow (or light gray in b/w); and the property handling concepts are dark gray.

The diamonds indicate xml elements contained in other elements, the one with the diamond being the enclosing one. A closed one indicates that elements cannot be accessed directly, whereas <Object> can be accessed individually as it is connected with an open diamond. In practice this means that you can select individual <Object>s in the taxonomy and that you can access the properties, units etcetera only through that object.
UML attributes are mapped to attributes in xml, for instance <Taxonomy xml:lang="de"/>.
The triangles indicate that elements inherit the enclosed elements and attributes of the element which the triangle points towards. So <Taxonomy>, <Object> etc. contain <Name>s, <Synonym>s etc. because they inherit this from <Description>.
The arrows with a label xxxxRef indicate likewise named elements that contain the value of the "id" attribute of the <Object> they wish to point towards. So a <PartRef>Door</PartRef> points towards the object with id "Door".
{xor} and {[A-Z]} indicate restrictions. The first is used to indicate that a <Property> can contain either an <AllowedObjectRef> or a <Measure>, but not both. The second is used to restrict the allowed values.

Figure 1. UML diagram - the bcxml taxonomy format

Utility concepts used for translation and description

Before we start off with the explanation of the core concepts in the bcxml taxonomy format, it is best to first introduce two utility concepts that are used in the rest of the document.

Figure 2. UML diagram - description and translatable

Translations

A small abstract type definition that adds an xml:lang attribute to a string. xml:lang allows you to specify the language of the string. Languages are of the form 'en' for English, 'nl' for Dutch etc. Country specifications (ISO 3166 country codes) can be added to them to allow for different words in the different language variants: 'en-US', 'en-GB', 'nl-NL', 'nl-BE'. See http://www.ietf.org/rfc/rfc1766.txt for a detailed explanation.

Figure 3. XML schema - TranslatableType

<xs:complexType name="TranslatableType" abstract="true">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute ref="xml:lang" use="required"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

Figure 4. XML - example for translation

<Name xml:lang="en-GB">lift</Name>
<Name xml:lang="en-US">elevator</Name>
<Name xml:lang="nl">lift</Name>
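
As an illustration of how a consumer might use xml:lang, here is a minimal Python sketch that picks a <Name> for a requested language and falls back from a country variant ('nl-BE') to the bare language ('nl'). The function name pick_name, the fallback rule and the sample data are assumptions made for this example; the format itself does not prescribe a lookup strategy. The bcxml namespace declaration is left out of the sample for brevity.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fixed XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample data, not taken from a real taxonomy.
SNIPPET = """
<Object id="Door">
  <Name xml:lang="en">door</Name>
  <Name xml:lang="nl">deur</Name>
</Object>
"""

def pick_name(element, wanted):
    """Return the <Name> text for `wanted`, or for its bare language code."""
    names = {n.get(XML_LANG): (n.text or "").strip()
             for n in element.findall("Name")}
    if wanted in names:
        return names[wanted]
    # Assumed fallback: drop the country part, so 'nl-BE' becomes 'nl'.
    return names.get(wanted.split("-")[0])  # None if nothing matches

door = ET.fromstring(SNIPPET)
print(pick_name(door, "nl-BE"))
print(pick_name(door, "en"))
```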

External reference

<ExternalReference> allows you to link to an external document. That document can be an image, an ArchiCAD file, a legal document, etcetera. You can indicate the type of document using the optional type attribute. The content of type is not prescribed, but we propose the use of "norm", "image", "legal", "documentation" and "cad".

Figure 5. XML schema - ExternalReferenceType

<xs:complexType name="ExternalReferenceType" abstract="false">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="type" use="optional"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

Figure 6. XML - example for external reference

<ExternalReference type="documentation">
  http://www.m-w.com/cgi-bin/dictionary?book=Dictionary&amp;va=elevator
</ExternalReference>
<ExternalReference type="image">
  http://www.bcxml.org/images/Door.jpg
</ExternalReference>
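
A minimal Python sketch of how the references of one type could be collected. The helper name references_of_type is an assumption for this example, and the enclosing <Object> only serves as a container. Note that the URL is surrounded by whitespace in the example above, so the text is stripped, and that an ampersand in a URL has to be written as &amp;amp; inside XML.

```python
import xml.etree.ElementTree as ET

# Sample based on Figure 6; the wrapping <Object> is only a container here.
SNIPPET = """
<Object id="Elevator">
  <ExternalReference type="documentation">
    http://www.m-w.com/cgi-bin/dictionary?book=Dictionary&amp;va=elevator
  </ExternalReference>
  <ExternalReference type="image">
    http://www.bcxml.org/images/Door.jpg
  </ExternalReference>
</Object>
"""

def references_of_type(element, wanted_type):
    """Return the stripped URLs of all <ExternalReference>s of one type."""
    return [(ref.text or "").strip()
            for ref in element.findall("ExternalReference")
            if ref.get("type") == wanted_type]

container = ET.fromstring(SNIPPET)
print(references_of_type(container, "image"))
```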

Description

<Description> is also an abstract type that won't show up in the taxonomy. The functionality it offers is used, though, in <Taxonomy>, <Object>, <Property> and <RestrictedValue>. It allows the following four descriptive elements to be added. The first three use the mandatory xml:lang attribute to specify the language of the text.

<Name> is the principal name. If multiple terms are equally appropriate you still HAVE to pick one as <Name> as it isn't much use to have the computer pick a different one every time at random. You have to have at least one <Name>.
<Synonym>s can be added for alternatives. They are optional.
An <Explanation> is a textual explanation (or description). Just a string without additional formatting. It is optional.
Also available is <ExternalReference>, allowing you to add URLs pointing towards additional documentation, legal documents, images, etc. It is optional.

Figure 7. XML schema - DescriptionType

<xs:complexType name="DescriptionType" abstract="true">
  <xs:sequence>
    <xs:element name="Name" type="TranslatableType"
      minOccurs="1" maxOccurs="unbounded"/>
    <xs:element name="Explanation" type="TranslatableType"
      minOccurs="0" maxOccurs="unbounded"/>
    <xs:element name="Synonym" type="TranslatableType"
      minOccurs="0" maxOccurs="unbounded"/>
    <xs:element name="ExternalReference" type="ExternalReferenceType"
      minOccurs="0" maxOccurs="unbounded"/>
    <xs:any namespace="##other" processContents="lax"
      minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:anyAttribute namespace="##other" processContents="lax"/>
</xs:complexType>

Figure 8. XML - example for description (and translation)

<...some...element... id="Elevator">
  <Name xml:lang="en-GB">lift</Name>
  <Name xml:lang="en-US">elevator</Name>
  <Name xml:lang="nl">lift</Name>
  <Explanation xml:lang="en">
    an apparatus or machine used for hoisting
  </Explanation>
  <Synonym xml:lang="nl">personenlift</Synonym>
  <Synonym xml:lang="nl">goederenlift</Synonym>
  <ExternalReference type="documentation">
    http://www.m-w.com/cgi-bin/dictionary?book=Dictionary&amp;va=elevator
  </ExternalReference>
</...some...element...>

Core concepts: taxonomy with objects

Taxonomy

Figure 9. UML diagram - taxonomy

<Taxonomy> is the toplevel element. It contains either the complete taxonomy or a subset extracted from it. It contains elements of type <Object>.

Figure 10. XML schema - Taxonomy

<xs:element name="Taxonomy" type="TaxonomyType"/>

Figure 11. XML schema - TaxonomyType

<xs:complexType name="TaxonomyType">
  <xs:complexContent>
    <xs:extension base="DescriptionType">
      <xs:sequence>
        <xs:element ref="Object"
          minOccurs="1" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute ref="xml:lang" use="required"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

Figure 12. XML - example for taxonomy

<Taxonomy xmlns="http://www.bcXML.org/2003/bcXML" xml:lang="en">
  <Name xml:lang="en">bcbuildingdefinitions</Name>
  <Explanation xml:lang="en">
    Taxonomy exported from the LexiCon. 
    Default language is English.
  </Explanation>
  <Object ....>
    ....
  </Object>
  <Object ....>
    ....
  </Object>
  <!-- etcetera -->
</Taxonomy>

Object

Figure 13. UML diagram - object

<Object>s are always embedded in a <Taxonomy> toplevel element. <Object> is the most important element, containing the information on the objects, or concepts, that are described and defined in the taxonomy.

An object can be defined as a concept that is of interest to the building industry and that you want to make explicit by adding properties. On the one hand it could be something concrete, something touchable like a table, a foundation pile or a dragline. On the other hand it can be an activity (like painting or pile driving) or something more abstract or generic like material, process or space.

<Object> can point to a supertype object (from which it inherits its properties etc.) using the <SupertypeRef> element. The bcxml taxonomy format offers only a simple form of inheritance: all <PartRef>s and <Property>s of the supertype(s) are copied. There is no mechanism to override part of the supertype's properties, as a property's identity hinges on its "localid" attribute and you can't have duplicate ids.

Next follows a list of <Property>s.

<PartRef>s reference <Object>s that are parts of this object. This is formally called "decomposition". A letterbox is part of a door, for instance. Or a room is part of a floor is part of a building is part of a building site.

Lastly, a mandatory id attribute is made available. This is used to uniquely identify items within the taxonomy. It is defined to have the NCName type, which means that it should start with a letter (a-z, A-Z) or an underscore (_). It may only contain letters, digits (0-9), underscores (_), dashes (-) and periods (.). The rest is prohibited (including spaces, tabs and slashes (/)).
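
The rule for id values can be approximated with a regular expression. The Python sketch below is an ASCII-only simplification: the real NCName production also allows a leading underscore (included here) and a range of non-ASCII letters (left out here). The names ID_PATTERN and is_valid_id are made up for this example.

```python
import re

# ASCII-only approximation of NCName: a letter or underscore first,
# then letters, digits, underscores, dashes and periods.
ID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_valid_id(candidate):
    """Return True when `candidate` matches the simplified id rule."""
    return ID_PATTERN.fullmatch(candidate) is not None

for candidate in ["WoodenDoorleaf", "door leaf", "1stFloor", "a/b"]:
    print(candidate, is_valid_id(candidate))
```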

Figure 14. XML schema - ObjectType

<xs:complexType name="ObjectType">
  <xs:complexContent>
    <xs:extension base="DescriptionType">
      <xs:sequence>
        <xs:element name="SupertypeRef" type="xs:NCName"
          minOccurs="0"/>
        <xs:element name="Property" type="PropertyType"
          minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="PartRef" type="xs:NCName"
          minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" use="required" type="xs:NCName"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

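The element definition below adds a number of key and keyref definitions. They are meant to let a validating parser check that every <Object> id is unique and that the crossreferences to <Object>s in the taxonomy point at an existing id.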

Figure 15. XML schema - Object

<xs:element name="Object" type="ObjectType">
  <xs:key name="ObjectKey">
    <xs:selector xpath="Object"/>
    <xs:field xpath="@id"/>
  </xs:key>
  <xs:keyref name="SupertypeKeyRef" refer="ObjectKey">
    <xs:selector xpath="Object"/>
    <xs:field xpath="SupertypeRef"/>
  </xs:keyref>
  <xs:keyref name="PartKeyRef" refer="ObjectKey">
    <xs:selector xpath="Object"/>
    <xs:field xpath="PartRef"/>
  </xs:keyref>
  <xs:keyref name="RestrictedValueKeyRef" refer="ObjectKey">
    <xs:selector xpath="Object/Property"/>
    <xs:field xpath="AllowedObjectRef"/>
  </xs:keyref>
</xs:element>
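
The same checks can also be done outside a schema validator. The Python sketch below is one possible way to do that, under the assumption that the taxonomy is small enough to load in memory; the function names duplicate_ids and dangling_references and the sample taxonomy are invented for this example, and the bcxml namespace declaration is left out for brevity.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: the PartRef to "Letterbox" has no matching <Object>.
TAXONOMY = """
<Taxonomy xml:lang="en">
  <Name xml:lang="en">demo</Name>
  <Object id="Wood"><Name xml:lang="en">wood</Name></Object>
  <Object id="Oak">
    <Name xml:lang="en">oak</Name>
    <SupertypeRef>Wood</SupertypeRef>
  </Object>
  <Object id="Door">
    <Name xml:lang="en">door</Name>
    <Property localid="madeOf">
      <Name xml:lang="en">made of</Name>
      <AllowedObjectRef>Wood</AllowedObjectRef>
    </Property>
    <PartRef>Letterbox</PartRef>
  </Object>
</Taxonomy>
"""

def duplicate_ids(taxonomy):
    """Return the ids that are used by more than one <Object>."""
    counts = Counter(obj.get("id") for obj in taxonomy.findall("Object"))
    return sorted(i for i, n in counts.items() if n > 1)

def dangling_references(taxonomy):
    """Return (object id, referring element, target) for unknown targets."""
    ids = {obj.get("id") for obj in taxonomy.findall("Object")}
    problems = []
    for obj in taxonomy.findall("Object"):
        refs = (obj.findall("SupertypeRef") + obj.findall("PartRef")
                + obj.findall("Property/AllowedObjectRef"))
        for ref in refs:
            target = (ref.text or "").strip()
            if target not in ids:
                problems.append((obj.get("id"), ref.tag, target))
    return problems

root = ET.fromstring(TAXONOMY)
print(duplicate_ids(root))
print(dangling_references(root))
```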

Figure 16. XML - example for object

<Object id="Wood">
  <Name xml:lang="en">wood</Name>
</Object>
<Object id="Oak">
  <Name xml:lang="en">oak</Name>
  <SupertypeRef>Wood</SupertypeRef>
</Object>
<Object id="Letterbox">
  <Name xml:lang="en">letter box</Name>
</Object>
<Object id="WoodenDoorleaf">
  <Name xml:lang="en">wooden door leaf</Name>
  <Property localid="doorleafHeight">
    <!-- See "Property handling" -->
  </Property>
  <Property localid="madeOf">
    <AllowedObjectRef>Wood</AllowedObjectRef>
  </Property>
  <PartRef>Letterbox</PartRef>
</Object>
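
To make the "copying" form of inheritance concrete, here is a minimal Python sketch that walks the <SupertypeRef> chain and collects the properties and parts an object ends up with. The function name effective_members and the sample objects are assumptions for this example; the namespace declaration is again left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample taxonomy.
TAXONOMY = """
<Taxonomy xml:lang="en">
  <Name xml:lang="en">demo</Name>
  <Object id="Door">
    <Name xml:lang="en">door</Name>
    <Property localid="height"><Name xml:lang="en">height</Name></Property>
    <PartRef>Letterbox</PartRef>
  </Object>
  <Object id="FireDoor">
    <Name xml:lang="en">fire door</Name>
    <SupertypeRef>Door</SupertypeRef>
    <Property localid="fireResistance">
      <Name xml:lang="en">fire resistance</Name>
    </Property>
  </Object>
  <Object id="Letterbox"><Name xml:lang="en">letter box</Name></Object>
</Taxonomy>
"""

def effective_members(objects, object_id):
    """Return (property localids, part ids), own ones first, then inherited."""
    properties, parts, seen = [], [], set()
    current = object_id
    while current is not None and current not in seen:
        seen.add(current)  # guards against an accidental supertype cycle
        obj = objects[current]
        properties += [p.get("localid") for p in obj.findall("Property")]
        parts += [(p.text or "").strip() for p in obj.findall("PartRef")]
        supertype = obj.findtext("SupertypeRef")
        current = supertype.strip() if supertype else None
    return properties, parts

objects = {o.get("id"): o for o in ET.fromstring(TAXONOMY).findall("Object")}
print(effective_members(objects, "FireDoor"))
```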

Property handling

Property

Figure 17. UML diagram - Property

<Property>s are always associated with <Object>s; they are never used on their own. Properties should be seen in their context, which means their enclosing object, because semantics (meaning) depend on the context. A property like "height" has a completely different meaning when it is the height of a door than when it is the height of the budget.

Properties can handle values in two different ways:

A measure, which basically wraps around a unit and a value. <Measure> is described in the next section.
An alternative is to use an <AllowedObjectRef> to point from the property to an object. That object is then seen as the allowed value, so that object and its subtypes can be used as values. Indicating the object "Window" as the AllowedObjectRef allows you to fill in any of the different kinds of windows that have "Window" as their supertype.
This allows you to be as specific as you need in your data. If "wood" is specific enough as value of the property "material", fine. If you need "oak" (which has "wood" as its supertype), also fine. You can also fill in some or all of its properties to make it even more specific.

This element has a localid attribute to make it identifiable within the context of the <Object>.
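
The "object and its subtypes" rule for <AllowedObjectRef> boils down to a walk up the supertype chain. The Python sketch below assumes the supertype links have already been read into a plain dictionary; the names SUPERTYPES and is_allowed_value and the "Steel" entry are hypothetical.

```python
# Hypothetical supertype table: object id -> supertype id (None for a root).
SUPERTYPES = {"Wood": None, "Oak": "Wood", "Steel": None}

def is_allowed_value(supertypes, candidate, allowed):
    """True when `candidate` is `allowed` itself or an (indirect) subtype."""
    seen = set()
    current = candidate
    while current is not None and current not in seen:
        if current == allowed:
            return True
        seen.add(current)  # guards against an accidental supertype cycle
        current = supertypes.get(current)
    return False

print(is_allowed_value(SUPERTYPES, "Oak", "Wood"))
print(is_allowed_value(SUPERTYPES, "Steel", "Wood"))
```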

Figure 18. XML schema - PropertyType

<xs:complexType name="PropertyType">
  <xs:complexContent>
    <xs:extension base="DescriptionType">
      <xs:choice>
        <xs:element name="Measure"
          minOccurs="0" maxOccurs="unbounded">
          <!-- The part about MeasureType is displayed in Figure 21 -->
        </xs:element>
        <xs:element name="AllowedObjectRef" type="xs:NCName"
          minOccurs="0" maxOccurs="unbounded"/>
      </xs:choice>
      <xs:attribute name="localid" use="required" type="xs:NCName"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

Figure 19. XML - example for property

<Property localid="doorleafHeight">
  <Measure>
    <!-- See "Measure" in next section -->
  </Measure>
</Property>
<Property localid="madeOf">
  <AllowedObjectRef>Wood</AllowedObjectRef>
</Property>
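
A small Python sketch of how a reader of the format could tell the two kinds of property apart and reject a mix of both. The function name property_kind and its return values are assumptions for this example.

```python
import xml.etree.ElementTree as ET

def property_kind(prop):
    """Classify a <Property> as 'measure', 'object' or 'unspecified'."""
    has_measure = prop.find("Measure") is not None
    has_object_ref = prop.find("AllowedObjectRef") is not None
    if has_measure and has_object_ref:
        # The format allows one or the other, never both.
        raise ValueError("property %r mixes Measure and AllowedObjectRef"
                         % prop.get("localid"))
    if has_measure:
        return "measure"
    if has_object_ref:
        return "object"
    return "unspecified"

height = ET.fromstring(
    '<Property localid="doorleafHeight"><Measure/></Property>')
made_of = ET.fromstring(
    '<Property localid="madeOf">'
    '<AllowedObjectRef>Wood</AllowedObjectRef></Property>')
print(property_kind(height), property_kind(made_of))
```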

Measure

Figure 20. UML diagram - Measure

When you have multiple measures, a <Context> can be added to each measure to indicate the context where that measure is to be used ('NL' for use in the Netherlands for instance, or 'ISO1234' to indicate an official standard).

An optional attached <Unit> can be specified. Letters and digits are OK, as are *, ^ and /. EUR/m^2 is OK, for instance. Spaces are not allowed. Everything should be specified in S.I. units: so kg, N and m, and not mg, kN or km. Also allowed are the standard currency codes like EUR, USD, GBP, SIT, NLG. There is no easy way to check this without making the model substantially bigger, so you have to take good care when writing units.
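
The character restriction on <Unit> (though not the S.I. requirement) is easy to check mechanically. The Python sketch below reuses the pattern from the schema; since XML Schema patterns always have to match the whole value, fullmatch is used. The names UNIT_PATTERN and is_valid_unit are made up for this example.

```python
import re

# Same character class as the xs:pattern in the schema for <Unit>.
UNIT_PATTERN = re.compile(r"[a-zA-Z0-9/*^]+")

def is_valid_unit(unit):
    """Check the allowed characters only; it cannot tell 'km' from 'm'."""
    return UNIT_PATTERN.fullmatch(unit) is not None

for unit in ["EUR/m^2", "kg/m", "kg / m"]:
    print(repr(unit), is_valid_unit(unit))
```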

Some <RestrictedValue>s can also be added to the measure. When present they limit the allowed values for the property to just those values. This element (like <Property>) has a localid attribute to make it identifiable within the context of the <Object>.

Figure 21. XML schema - MeasureType

<xs:complexType>
  <xs:sequence>
    <xs:element name="Unit"
      minOccurs="0" maxOccurs="1">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:pattern value="[a-zA-Z0-9/*^]+"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>
    <xs:element name="Context" type="xs:string"
      minOccurs="0" maxOccurs="1"/>
    <xs:element name="RestrictedValue"
      minOccurs="0" maxOccurs="unbounded">
      <xs:complexType>
        <xs:complexContent>
          <xs:extension base="DescriptionType">
            <xs:attribute name="localid" use="required" type="xs:NCName"/>
          </xs:extension>
        </xs:complexContent>
      </xs:complexType>
    </xs:element>
    <xs:any namespace="##other" processContents="lax"
      minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:anyAttribute namespace="##other" processContents="lax"/>
</xs:complexType>

Figure 22. XML - example for measure

<Property localid="...">
  <Measure>
    <Unit>kg/m</Unit>
  </Measure>
</Property>
<Property localid="fireResistance">
  <!-- Fire resistance is measured differently in the UK and
      NL. UK: a class. NL: a duration, given here in seconds.
      Each according to a local standard. -->
  <Measure>
    <Context>UK</Context>
    <RestrictedValue localid="FD30">
      <Name xml:lang="en">FD30</Name>
      <Explanation xml:lang="en">
        fire resistance class "FD30" according to UK standard
      </Explanation>
    </RestrictedValue>
    <RestrictedValue localid="FD60">
      <Name xml:lang="en">FD60</Name>
      <Explanation xml:lang="en">
        fire resistance class "FD60" according to UK standard
      </Explanation>
    </RestrictedValue>
  </Measure>
  <Measure>
    <Unit>s</Unit> <!-- That's seconds. -->
    <Context>NL</Context>
  </Measure>
</Property>
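
A minimal Python sketch of how software could pick the measure that applies in a given context and list its restricted values. The function names measure_for_context and allowed_values are assumptions for this example, and the elements are written in the order the schema in Figure 21 prescribes (<Unit> before <Context>).

```python
import xml.etree.ElementTree as ET

# Sample based on Figure 22; the <Name> of the property is made up.
PROPERTY = """
<Property localid="fireResistance">
  <Name xml:lang="en">fire resistance</Name>
  <Measure>
    <Context>UK</Context>
    <RestrictedValue localid="FD30">
      <Name xml:lang="en">FD30</Name>
    </RestrictedValue>
    <RestrictedValue localid="FD60">
      <Name xml:lang="en">FD60</Name>
    </RestrictedValue>
  </Measure>
  <Measure>
    <Unit>s</Unit>
    <Context>NL</Context>
  </Measure>
</Property>
"""

def measure_for_context(prop, context):
    """Return the first <Measure> whose <Context> equals `context`."""
    for measure in prop.findall("Measure"):
        if (measure.findtext("Context") or "").strip() == context:
            return measure
    return None

def allowed_values(measure):
    """Return the localids of the <RestrictedValue>s; empty means free."""
    return [v.get("localid") for v in measure.findall("RestrictedValue")]

prop = ET.fromstring(PROPERTY)
print(allowed_values(measure_for_context(prop, "UK")))
print(measure_for_context(prop, "NL").findtext("Unit"))
```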

XML schema frontmatter

The definitions shown in the rest of the document are wrapped in the standard xml schema wrapper, which declares the namespaces and imports the xml namespace (for the xml:lang attribute).

Figure 23. XML schema - Schema frontmatter

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns="http://www.bcXML.org/2003/bcXML"
  targetNamespace="http://www.bcXML.org/2003/bcXML"
  elementFormDefault="qualified">
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
    schemaLocation="http://www.w3.org/2001/xml.xsd"/>
  <!-- The part about Taxonomy is displayed in Figure 10 -->
  <!-- The part about TaxonomyType is displayed in Figure 11 -->
  <!-- The part about Object is displayed in Figure 15 -->
  <!-- The part about ObjectType is displayed in Figure 14 -->
  <!-- The part about PropertyType is displayed in Figure 18 -->
  <!-- The part about DescriptionType is displayed in Figure 7 -->
  <!-- The part about TranslatableType is displayed in Figure 3 -->
  <!-- The part about ExternalReferenceType is displayed in Figure 5 -->
</xs:schema>