Latest Publications

A glimpse of the DC-X topic map engine

Posted on August 27, 2010, 15:33h by Tim Strehle

I have written before about the embedded topic map engine in DC-X. It’s not meant to be a standalone engine for generic use, but does a great job powering the lists, synonyms and thesauri in DC-X.

Until recently, the engine’s functionality had been hidden in the backend and exposed only through our PHP API, SQL, and the command line. But we finally had the time to implement basic topic editing in the DC-X administration interface. I’ve been a topic maps fan since 2003, so it’s been very exciting for me to actually build software based on topic maps (and get paid for it)! Of course, everything’s still quite ugly, it’s not 100% topic map compliant, and lots of features are missing (e.g., editing occurrences). But we’re working on it, and we are already rolling the first versions out to our customers…

Just two screenshots for those who are curious (I know that the content that’s shown doesn’t make much sense):


DC-X 1.3.4 is available

Posted on July 16, 2010, 11:35h by Tim Strehle

The new DC-X version 1.3.4 is now available for download. It’s one of the larger releases; here are a few of the changes:

  • A new “Rights mapping” feature lets you specify which rights profile should be automatically attached to a document when it is imported. All metadata fields can be used for this mapping. Example: If Creator is John Doe and Hotfolder is Images, use the rights profile “John Doe’s images”.
  • During import, the name of the hotfolder will be added to the document metadata. This way you always know where a document came from.
  • More hotfolder settings can be defined in the administration interface: Pool, required MIME types, MIME type checks.
  • RSS or Atom feeds can be sent to different hotfolders after download. Use this to collect feeds for different user groups, each into their own pool.
  • The first version of a Topic Map editor has been added to the admin interface. Now you can manage list entries, add alias names and translate names.
  • Translation strings for the DC-X user interface can also be edited in the admin interface, making lives easier for our partners who are localizing DC-X into their language.
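The rights mapping rule from the example above can be pictured roughly as follows. This is only a minimal sketch, not DC-X code: the `MappingRule` class, the `match_profiles` function and the sample document are made up for illustration, and it assumes a rule matches when every listed metadata field has exactly the given value.

```python
from dataclasses import dataclass


@dataclass
class MappingRule:
    """Hypothetical rule: if all conditions match, attach the profile."""
    conditions: dict  # metadata field name -> required value
    profile: str      # name of the rights profile to attach


def match_profiles(metadata, rules):
    """Return the profile names of all rules whose conditions all match."""
    return [
        rule.profile
        for rule in rules
        if all(metadata.get(field) == value
               for field, value in rule.conditions.items())
    ]


# The rule from the release note: Creator is John Doe and Hotfolder is Images.
RULES = [
    MappingRule({"Creator": "John Doe", "Hotfolder": "Images"},
                "John Doe's images"),
]

# Invented sample document metadata, as it might look right after import.
doc = {"Creator": "John Doe", "Hotfolder": "Images", "Title": "Harbour at dusk"}
print(match_profiles(doc, RULES))
```

A document from a different hotfolder would simply get no profile from this rule.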

The Wiki for DC-X documentation has moved

Posted on July 12, 2010, 14:21h by Tim Strehle

Documentation for DC-X is still a bit thin, but we’re working on it. The documentation we’re writing is in a Wiki that is accessible to our customers and partners.

This Wiki has moved to a new URL today:

wiki.digicol.de

Please let us know if you have difficulties accessing the Wiki.


Demo server upgraded to DC-X 1.3.3

Posted on July 12, 2010, 12:46h by Tim Strehle

Our demo server has now been upgraded from DC-X 1.2.2 to 1.3.3 – quite a jump! Please let us know if you discover something that seems broken after the upgrade.

Some highlights in the new version:

  • Optimized query performance by shrinking the Solr index and not loading filters with each query. (Click on the filter panel to load filters.)
  • After you tag a document, it now appears immediately when you click on the tab name in the left-hand navigation.
  • New actions “Edit rights” and “Edit usages”.
  • Actions not permitted for the selected documents are now disabled.
  • The document view displayed on double click now shows image captions.
  • Safari 5 has been added to the list of supported browsers.
  • Lots of bug fixes and minor tweaks, as well as a lot of code cleanup behind the scenes and administration improvements.

Demo server upgrade

Posted on July 12, 2010, 10:12h by Tim Strehle

We’re currently upgrading our demo server to the latest DC-X release, which means a few minutes of downtime. We’ll post a list of changes here when the update is done.


DC-X UI broken by Vodafone 3G transparent proxy

Posted on June 18, 2010, 16:06h by Tim Strehle

Our sales team found out that sometimes the DC-X user interface doesn’t work when loaded via a Vodafone Germany 3G UMTS connection. Logging in is possible, but the main page that follows is broken: the JavaScript and CSS contain errors and are not executed by the browser.

A look at the HTML source of the page shows that the Vodafone proxy messes with all web pages delivered over UMTS: It inserts some custom bmi.js JavaScript, apparently compresses each image, includes all referenced JavaScript and CSS files directly in the HTML page and minifies it. And somewhere in that process it is destroying the integrity of the JavaScript and CSS, breaking the application.

There’s probably not much we can do about this. We might try minifying all JavaScript and CSS ourselves and hope that it helps… Would be great if Vodafone could at least document that behaviour!

If you are a DC-X user and are affected by this problem, please let us know.


XSLT: xsl:include and XML namespace handling fixed in xsltproc

Posted on May 7, 2010, 00:24h by Tim Strehle

You’re unlikely to run into the same problem, but just in case… I ran into an issue today where on one server, empty XML namespace declarations were inserted where they hadn’t been before. As I understand it, my sloppy XSLT programming had been tolerated by a bug in xsltproc (libxml2/libxslt), and that bug has since been fixed.

Here’s my test.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<in/>

An XSLT file, include.xslt:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template name="external-element">
    <xsl:element name="p"/>
  </xsl:template>
</xsl:stylesheet>

And the main XSLT file, test.xslt:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns="http://www.digicol.com/xmlns/dcx"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

  <xsl:output method="xml" encoding="UTF-8"/>

  <xsl:include href="include.xslt"/>

  <xsl:template name="internal-element">
    <xsl:element name="p"/>
  </xsl:template>

  <xsl:template match="/in">
    <document>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <xsl:element name="p"/>
        <xsl:call-template name="internal-element"/>
        <xsl:call-template name="external-element"/>
      </body>
    </document>
  </xsl:template>

</xsl:stylesheet>

Our older xsltproc (“Using libxml 20626, libxslt 10117 and libexslt 813”) transformed this into:

<?xml version="1.0" encoding="UTF-8"?>
<document xmlns="http://www.digicol.com/xmlns/dcx">
  <body xmlns="http://www.w3.org/1999/xhtml">
    <p/>
    <p xmlns="http://www.digicol.com/xmlns/dcx"/>
    <p/>
  </body>
</document>

A more current version of xsltproc (“Using libxml 20703, libxslt 10124 and libexslt 813”) and Saxon 6.5.5 both return what I suppose is correct:

<?xml version="1.0" encoding="UTF-8"?>
<document xmlns="http://www.digicol.com/xmlns/dcx">
  <body xmlns="http://www.w3.org/1999/xhtml">
    <p/>
    <p xmlns="http://www.digicol.com/xmlns/dcx"/>
    <p xmlns=""/>
  </body>
</document>

Interesting (if you’re into that kind of thing). I must admit that I hadn’t given namespaces much thought in this case.

(The fix is simple: Set the “namespace” attribute of xsl:element.)
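To see that the two results above really are different documents and not just different spellings of the same one, you can parse both and look at the namespace each p element ends up in. A small sketch using Python’s standard xml.etree.ElementTree; the helper name `p_namespaces` is made up, and the XML declaration is left out of the strings for brevity:

```python
import xml.etree.ElementTree as ET

DCX = "http://www.digicol.com/xmlns/dcx"
XHTML = "http://www.w3.org/1999/xhtml"

# Output of the older xsltproc: the third <p/> inherits the XHTML namespace.
OLD_OUTPUT = f"""<document xmlns="{DCX}">
  <body xmlns="{XHTML}">
    <p/>
    <p xmlns="{DCX}"/>
    <p/>
  </body>
</document>"""

# Output of the newer xsltproc and Saxon: the third <p> is in no namespace.
NEW_OUTPUT = f"""<document xmlns="{DCX}">
  <body xmlns="{XHTML}">
    <p/>
    <p xmlns="{DCX}"/>
    <p xmlns=""/>
  </body>
</document>"""


def p_namespaces(xml_text):
    """Return the namespace URI ('' for none) of every p element, in order."""
    namespaces = []
    for element in ET.fromstring(xml_text).iter():
        # ElementTree spells qualified names as "{uri}local".
        uri, _, local = element.tag.rpartition("}")
        if local == "p":
            namespaces.append(uri.lstrip("{"))
    return namespaces


print(p_namespaces(OLD_OUTPUT))
print(p_namespaces(NEW_OUTPUT))
```

Only the third entry differs: XHTML in the old output, the empty string (no namespace) in the new one. That third p is the one created by the included stylesheet, which declares no default namespace.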


DC-X: Managing image rights

Posted on March 1, 2010, 14:15h by Tim Strehle

Usage rights for images – and other types of assets like video – are a tricky, but important part of Digital Asset Management. Am I allowed to use this image for my publication, under which conditions, and will I have to pay for it? Are there any restrictions? A DAM system must help answer these questions.

As is the case whenever money is involved, details matter. Only a subset of my publications might be allowed to use the image, and only for a limited time. I may or may not have to pay again if I reuse the same image. Assets may have to be prohibited from reuse due to legal reasons. An exclusive deal may lock out everyone else from using the same image (or even variants – and probably only for a limited time, or only in a certain geographical region).

Capturing all these details as structured data is the goal of the PLUS Coalition’s Picture Licensing Universal System. It looks like a great standard for exchanging rights metadata, but we feel it is overkill to use internally within our DC-X DAM system: We need something that’s simple enough that checking usage rights for hundreds of assets has no noticeable impact on response times. (But we’re still planning to add support for PLUS metadata during image import and export.)

Here’s how DC-X manages usage rights metadata:

There’s a database table for rights metadata properties, with columns for a property name and a value, scope, publication, remark, and date from/to. A simple example: Property name="Fee Required", value="1", scope="Online" means that usage on an online website incurs a fee (value="0" would mean “no fee”).
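As a rough illustration of that table, here is one row modelled as a Python dataclass, together with a fee check. This is a sketch under assumptions, not the real DC-X schema: the class, field and function names are invented, and it assumes that an empty scope or date means “no restriction”.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RightsProperty:
    """One row of the rights properties table (field names are assumed)."""
    name: str
    value: str
    scope: Optional[str] = None        # assumption: None applies to any scope
    publication: Optional[str] = None
    remark: str = ""
    date_from: Optional[date] = None   # assumption: None means open-ended
    date_to: Optional[date] = None

    def applies(self, scope, on):
        """True if this property covers the given scope on the given day."""
        if self.scope is not None and self.scope != scope:
            return False
        if self.date_from is not None and on < self.date_from:
            return False
        if self.date_to is not None and on > self.date_to:
            return False
        return True


def fee_required(properties, scope, on):
    """True if any applicable "Fee Required" property has the value "1"."""
    return any(
        p.name == "Fee Required" and p.value == "1" and p.applies(scope, on)
        for p in properties
    )


# The example from the text: online usage incurs a fee.
props = [RightsProperty("Fee Required", "1", scope="Online")]
print(fee_required(props, "Online", date(2010, 3, 1)))
print(fee_required(props, "Print", date(2010, 3, 1)))
```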

Predefined property names are “Contract”, “Embargo”, “Exclusive Rights”, “External Syndication”, “Fee Required”, “Internal Syndication”, “Notice”, “Price Category”, “Purchased”, “Rights Agent”, “Rights Unclear”, “Singular Usage”, “Usage Permitted”. This list will certainly be expanded, and customers can define their own types.

Usually a contract exists with each provider, defining common rights for all images sent by them. To make this easier to handle, DC-X rights metadata properties can be bundled into “rights profiles”. We recommend creating one rights profile per provider (or multiple rights profiles if there are different conditions for subsets of their images). Example: A rights profile named “Reuters images” could combine the properties UsagePermitted="1", FeeRequired="0", ExternalSyndication="0", Notice="Credit required" if your contract with Reuters allowed you to use all images sent by them with no additional per-image fee, but images must be credited and redistribution is not allowed. (If different parts of your organization have different contracts with Reuters, you could even use the “scope” or “publication” fields to limit properties to a certain part.) And if your contract changes, you simply update the rights profile; all images will immediately reflect the changes.
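The reason a profile change shows up on all images at once is that documents only refer to the profile and don’t carry their own copy of its properties. A minimal in-memory sketch of that idea (DC-X keeps this in the database; the dictionaries, document ids and the `properties_for` function are invented, and letting later profiles win on conflicts is an assumption):

```python
# Profile name -> bundled rights properties (the Reuters example from above).
profiles = {
    "Reuters images": {
        "UsagePermitted": "1",
        "FeeRequired": "0",
        "ExternalSyndication": "0",
        "Notice": "Credit required",
    },
}

# Hypothetical document id -> names of the attached profiles (not copies).
documents = {
    "img-001": ["Reuters images"],
    "img-002": ["Reuters images"],
}


def properties_for(doc_id):
    """Resolve a document's properties by looking up its profiles on demand."""
    merged = {}
    for profile_name in documents[doc_id]:
        # Assumption: if two profiles set the same property, the later wins.
        merged.update(profiles[profile_name])
    return merged


before = properties_for("img-001")["FeeRequired"]

# The contract changes: update the profile once...
profiles["Reuters images"]["FeeRequired"] = "1"

# ...and every document referring to it reflects the change.
after = properties_for("img-002")["FeeRequired"]
print(before, after)
```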

Rights profiles must be attached to documents (multiple profiles per document are allowed). This can be done manually, or automatically during ingestion: At the moment, you can configure DC-X hotfolders to automatically attach a certain rights profile to all images coming in through it. In the future, it will also be possible to have DC-X determine the appropriate rights profile by looking at the document’s metadata (like the IPTC Credit or ByLine field).

If you need to override certain properties – like when a certain Reuters image must not be used anymore for legal reasons – you can also attach rights metadata properties directly to documents (without rights profiles). Here’s a screenshot from DC-X that shows an image with both a rights profile and a directly attached property:

[Screenshot: dcx-rights]
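One way to picture the override: start with the properties from the attached profile, then let directly attached properties replace entries of the same name. This is only a sketch of that precedence rule as described above; the function name and the sample values are made up, and the actual lookup in DC-X may work differently.

```python
def effective_properties(profile_properties, direct_properties):
    """Combine both sources; directly attached properties take precedence."""
    effective = dict(profile_properties)   # copy, so the profile stays intact
    effective.update(direct_properties)    # direct values replace same names
    return effective


reuters_profile = {"UsagePermitted": "1", "FeeRequired": "0"}

# A single image that must not be used anymore for legal reasons.
blocked_image = effective_properties(reuters_profile, {"UsagePermitted": "0"})
print(blocked_image)
```

The profile itself is untouched, so all other Reuters images keep UsagePermitted="1".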

To visualize usage rights in search results, DC-X displays icons like the Euro sign or the globe in the screenshot above. They are called “flags” and represent rules being dynamically applied to each document. Example for the “Euro sign” flag definition: “Display if the rights property FeeRequired=1 exists.” (Flags can do much more; they can inspect other document metadata like IPTC fields or image file properties like size and colorspace.) Any number of flags can be defined by the customer.
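A flag can be thought of as a named rule plus an icon, evaluated per document when the search results are rendered. The following sketch uses invented names (`Flag`, `flags_for`, the document layout) and a second, purely hypothetical “large image” flag to show a rule that looks at file properties instead of rights:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Flag:
    """Hypothetical flag definition: show the icon if the rule holds."""
    name: str
    icon: str
    rule: Callable[[dict], bool]


FLAGS = [
    # The rule from the text: display if the rights property FeeRequired=1 exists.
    Flag("fee", "€", lambda doc: doc.get("rights", {}).get("FeeRequired") == "1"),
    # Invented example of a rule on image file properties.
    Flag("large", "L", lambda doc: doc.get("width", 0) >= 3000),
]


def flags_for(doc):
    """Return the icons of all flags whose rule matches the document."""
    return [flag.icon for flag in FLAGS if flag.rule(doc)]


print(flags_for({"rights": {"FeeRequired": "1"}, "width": 800}))
print(flags_for({"rights": {}, "width": 4000}))
```

Because the rules are applied at display time, changing a rights property or a flag definition changes the icons without touching the documents.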

Rights properties can be used in the DC-X user interface to detect whether certain actions are allowed (e.g., the export to the online CMS can detect that you’re trying to export an image for which you do not have online usage rights). Finally, usage rights can be queried and even changed through the DC-X Web Service API.

Differences compared to DC5: DC5 had no notion of usage rights built in; rights handling was meant to be implemented during the installation and customization phase.

Import performance numbers from a real-world DC-X installation

Posted on February 25, 2010, 12:54h by Tim Strehle

Now that we are a few months into a medium-sized DC-X installation, I’d like to share a few real-world numbers regarding image and text import speed. During mass import runs, the system had a relatively high load but was still usable. I’m quite happy with the performance so far:

  • 50,000 images imported per hour (off-the-shelf DC-X importer); includes generation of preview images
  • 400,000 text articles (XML) imported per hour (minor performance tweaks needed, 8 parallel processes)
  • 800,000 documents indexed per hour by the Solr full-text search server

The total number of documents in that DC-X instance is currently 3.2 million, with the data taking up 44 GB in MySQL and 31 GB in Solr (plus the actual image, PDF and other files). A full optimization run of the Solr index takes 25 minutes.
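For a feeling of what these numbers mean together, here is some back-of-the-envelope arithmetic. It assumes that the indexing rate stays constant over a full run and that 1 GB is 10^9 bytes; neither was measured, so take the results as rough estimates only.

```python
TOTAL_DOCUMENTS = 3_200_000
SOLR_DOCS_PER_HOUR = 800_000
MYSQL_BYTES = 44 * 10**9   # 44 GB, assuming decimal gigabytes
SOLR_BYTES = 31 * 10**9    # 31 GB, assuming decimal gigabytes

# Time to reindex everything in Solr at the rate quoted above.
reindex_hours = TOTAL_DOCUMENTS / SOLR_DOCS_PER_HOUR

# Average storage per document, in kilobytes (1 kB = 1000 bytes).
mysql_kb_per_doc = MYSQL_BYTES / TOTAL_DOCUMENTS / 1000
solr_kb_per_doc = SOLR_BYTES / TOTAL_DOCUMENTS / 1000

print(f"Full Solr reindex: about {reindex_hours:.0f} hours")
print(f"MySQL: about {mysql_kb_per_doc:.2f} kB per document")
print(f"Solr: about {solr_kb_per_doc:.2f} kB per document")
```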

The servers DC-X is running on (set up by Janz):

  • Three IBM System x3650 M2, each with:
      • two quad-core Intel Nehalem processors (Xeon X5570 @ 2.93GHz/1333MHz/8MB L3)
      • 48 GB RAM

The first server is running MySQL and Apache, the second one Solr and regular import processes and Apache, the third one Apache plus occasional mass import processes. Storage being used:

Software:

Major DC-X demo server update

Posted on February 11, 2010, 16:09h by Thorsten Mann

Starting tomorrow we will roll out a major DC-X upgrade on our demo server. Since the last update was back in October, a lot of new features have been added in the meantime. The upgrade process will start at 10am (GMT+1) and should be finished in the late afternoon. During that time we expect downtime, and things will look broken.

If you need access to a working demo server tomorrow or during the weekend, please contact us.
