Thesauri and lists of keywords are stored by DC-X in its topic map – a set of database tables modeled after the XML Topic Maps (XTM) 1.0 standard. For an introduction to topic maps, see the wonderful article The TAO of Topic Maps by Steve Pepper.

So far we have implemented only half of the XTM standard; we’ll look into supporting more of it when the need arises. But the core concepts are all there. (By the way: Why not RDF? Because topic maps are a higher-level abstraction, since RDF triples have less semantics built in, and they seemed to provide more value „out of the box“.)

The benefits of treating thesaurus and list terms as topics in a topic map:

  • Built-in support for multiple names, which we’re using to store translations for terms: All lists and thesauri can now be multi-lingual.
  • Class/instance relationship between terms: The „City“ list is itself a topic; „Hamburg“ and „Oslo“ are instances of the „City“ topic. This way an unlimited number of lists or thesauri can co-exist. Terms can even belong to multiple lists.
  • Arbitrary relations between terms: A thesaurus hierarchy is modeled using associations like „broader/narrower“ or „synonym/preferred term“. Geographic hierarchies can use „part/whole“ associations.
  • External identifier URIs can be specified for any term, so metadata can be mapped to metadata of other software using RDF, or anything else that points to the same URI.
  • Custom metadata can be attached to any term. We’ll use this for thesaurus „scope notes“, geo coordinates for cities etc.
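
To make the list above a bit more concrete, here is a minimal Python sketch of such a data model. It is not the DC-X schema: the class names, field names, the example.org URI and the coordinates are all made up for illustration. It only shows how multiple names, class/instance membership, typed associations, external identifiers and custom metadata can hang off a single topic.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topic:
    """One term (or one list) in the topic map. All names here are hypothetical."""
    topic_id: str
    names: dict[str, str] = field(default_factory=dict)         # language code -> name
    instance_of: set[str] = field(default_factory=set)          # ids of the lists it belongs to
    subject_identifiers: set[str] = field(default_factory=set)  # external identifier URIs
    occurrences: dict[str, str] = field(default_factory=dict)   # custom metadata (scope note, geo, ...)


@dataclass(frozen=True)
class Association:
    """A typed relation in which each topic plays a named role."""
    assoc_type: str                     # e.g. "part/whole" or "broader/narrower"
    roles: tuple[tuple[str, str], ...]  # (role name, topic id) pairs


class TopicMap:
    def __init__(self) -> None:
        self.topics: dict[str, Topic] = {}
        self.associations: list[Association] = []

    def add(self, topic: Topic) -> Topic:
        self.topics[topic.topic_id] = topic
        return topic

    def relate(self, assoc_type: str, **roles: str) -> None:
        """Record an association, e.g. relate("part/whole", part="hamburg", whole="germany")."""
        self.associations.append(Association(assoc_type, tuple(sorted(roles.items()))))

    def instances_of(self, class_id: str) -> list[str]:
        """All topic ids that are instances of the given list topic."""
        return sorted(t.topic_id for t in self.topics.values() if class_id in t.instance_of)

    def related(self, assoc_type: str, from_role: str, from_id: str, to_role: str) -> list[str]:
        """Topic ids playing `to_role` wherever `from_id` plays `from_role`."""
        found = []
        for assoc in self.associations:
            roles = dict(assoc.roles)
            if (assoc.assoc_type == assoc_type
                    and roles.get(from_role) == from_id
                    and to_role in roles):
                found.append(roles[to_role])
        return found


tm = TopicMap()
tm.add(Topic("city", names={"en": "City", "de": "Stadt"}))        # the list itself is a topic
tm.add(Topic("country", names={"en": "Country", "de": "Land"}))
tm.add(Topic("germany", names={"en": "Germany", "de": "Deutschland"}, instance_of={"country"}))
tm.add(Topic(
    "hamburg",
    names={"en": "Hamburg", "de": "Hamburg"},
    instance_of={"city"},
    subject_identifiers={"http://example.org/id/hamburg"},  # placeholder URI
    occurrences={"geo": "53.55,9.99"},                       # illustrative coordinates
))
tm.add(Topic("oslo", names={"en": "Oslo"}, instance_of={"city"}))
tm.relate("part/whole", part="hamburg", whole="germany")

print(tm.instances_of("city"))
print(tm.related("part/whole", "part", "hamburg", "whole"))
```

Because a list like „City“ is just another topic, adding a new list needs no new structure, and a term can be put on several lists by adding more ids to `instance_of`.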

We are already importing the (multi-lingual) IPTC subject codes thesaurus and CLDR language and country name lists into the DC-X topic map via the XTM XML format. Importing custom thesauri (in a few common text file formats) is also supported. A couple of DC-X fields are set up to auto-fill lists in the topic map as documents with new values come in. Lists and thesauri can be used for auto-completion during document editing, or for lookup in an „assistant dialog“.
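
For readers who have never seen XTM, here is a small sketch of reading topics from an XTM 1.0 document with Python's standard library. This is not the DC-X importer. The sample document is hand-written, and treating the scope of a base name as its language is an assumption (a common convention, not something the standard mandates).

```python
import xml.etree.ElementTree as ET

# Namespaces defined by XTM 1.0 and XLink.
XTM = "{http://www.topicmaps.org/xtm/1.0/}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hand-written sample; the topics "en" and "de" referenced in the scopes
# are left out to keep it short.
SAMPLE = """<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="country">
    <baseName><baseNameString>Country</baseNameString></baseName>
  </topic>
  <topic id="DE">
    <instanceOf><topicRef xlink:href="#country"/></instanceOf>
    <baseName>
      <scope><topicRef xlink:href="#en"/></scope>
      <baseNameString>Germany</baseNameString>
    </baseName>
    <baseName>
      <scope><topicRef xlink:href="#de"/></scope>
      <baseNameString>Deutschland</baseNameString>
    </baseName>
  </topic>
</topicMap>"""


def parse_topics(xtm_text):
    """Return {topic id: {"instance_of": [...], "names": {language: name}}}."""
    root = ET.fromstring(xtm_text)
    topics = {}
    for topic in root.iter(f"{XTM}topic"):
        # Class membership: <instanceOf><topicRef xlink:href="#class-id"/></instanceOf>
        classes = [
            ref.get(XLINK_HREF).lstrip("#")
            for ref in topic.findall(f"{XTM}instanceOf/{XTM}topicRef")
        ]
        names = {}
        for base_name in topic.findall(f"{XTM}baseName"):
            # Assumption: the scoping topic identifies the language of the name.
            scope_ref = base_name.find(f"{XTM}scope/{XTM}topicRef")
            language = scope_ref.get(XLINK_HREF).lstrip("#") if scope_ref is not None else ""
            names[language] = base_name.findtext(f"{XTM}baseNameString")
        topics[topic.get("id")] = {"instance_of": classes, "names": names}
    return topics


topics = parse_topics(SAMPLE)
print(topics["DE"])
```

Names without a scope end up under the empty-string key, which a real importer would probably map to a default language.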

In an upcoming DC-X release we will add a simple topic map browser and editor so that administrators can modify lists and thesauri. We will also look into automatically following „use/preferred term“ relations, so that the administrator can define values that are to be corrected automatically during document import.
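
The idea of following „use/preferred term“ relations during import could look roughly like the sketch below. This is purely hypothetical, since the feature does not exist yet: the function name, the plain dictionary standing in for the associations, and the example terms are all invented for illustration.

```python
def normalize(value, preferred_by_variant):
    """Follow "use/preferred term" links until a preferred term is reached.

    `preferred_by_variant` maps a non-preferred term to the term that should
    be used instead. Chains are followed; the `seen` set stops the loop if
    the relations accidentally form a cycle.
    """
    seen = set()
    while value in preferred_by_variant and value not in seen:
        seen.add(value)
        value = preferred_by_variant[value]
    return value


# Invented example relations: "HH" -> "Hansestadt Hamburg" -> "Hamburg".
use_relations = {
    "HH": "Hansestadt Hamburg",
    "Hansestadt Hamburg": "Hamburg",
}

incoming_values = ["HH", "Oslo", "Hansestadt Hamburg"]
corrected = [normalize(v, use_relations) for v in incoming_values]
print(corrected)
```

Values without a „use“ relation pass through unchanged, so the correction only touches terms the administrator has explicitly mapped.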

Differences compared to DC5: Lists and thesauri are no longer stored as flat files in the file system; they live in the database. They are available out of the box in DC-X with much less configuration overhead. Multiple languages are now supported. All kinds of relations between terms are now possible, not just simple hierarchies.

6 Responses to DC-X: The Topic Map
  1. Avatar

    I’m happy to see that you are creating a Topic Maps engine, but slightly worried that you seem to be using the XTM 1.0 spec as your reference. That has led people astray before. You may find that you are better off working from the Topic Maps Data Model (TMDM).

    I assume DC-X is a commercial system, right? And probably only sold together with services?

    • Avatar

      Yes, DC-X is a commercial system – and at the moment we or one of our partners will need to set it up, but I hope we’ll soon offer a cheaper, standardized package. (But I’m not a sales guy.) Why are you asking?

      Thanks a lot for the TMDM link!

      And thanks for all your work on Topic Maps… I’ve been watching TM for years and am very happy that I finally can start implementing a topic map in our product. Wish me luck 🙂

  2. […] the company’s blog entry: The benefits of treating thesaurus and list terms as topics in a topic […]

  3. Avatar

    I was asking because if DC-X is commercially available I want to list it in my index of Topic Maps tools. When the standardized package is available, please let me know so I can list it.

    And good luck with the topic map! 🙂

  4. […] activities could be centralized in one piece of software, as it is with Ontopia or apparently with DC-X, to name an open source and commercial […]

  5. A glimpse of the DC-X topic map engine @ Digital Collections Blog 27. August 2010 at 15:33

    […] have written before about the embedded topic map engine in DC-X. It’s not meant to be a standalone engine for generic use, but does a great job […]

