I’m currently at a customer site, where I found that many documents had data in the wrong field ("IPTC02015" instead of "Category"). An excellent opportunity to play with the DC-X command line tools:

php /opt/dcx/bin/dcx_textquery.php --app default '+IPTC02015:[* TO *]' -m 1000 | php /opt/dcx/bin/dcx_export.php --app default -t document - | sed 's/IPTC02015>/Category>/g' | php /opt/dcx/bin/dcx_update.php --app default -

This invocation runs a full-text search for all documents that have the "IPTC02015" field set, feeds the document IDs into a script that exports these documents as XML, runs "sed" to rename the field, and uses the modified XML to update the documents in the DC-X database. (The full-text index is updated automatically.) Marvellous!
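To see what the "sed" step does on its own, here is a minimal sketch that runs the same substitution on a made-up XML snippet. The element layout is an assumption for illustration only; the real DC-X export format may look different.

```shell
# Hypothetical sample line; not actual DC-X export output.
sample='<doc><IPTC02015>Sports</IPTC02015></doc>'

# Same substitution as in the pipeline: the pattern ends in ">", so it
# matches both the opening tag and the closing tag of the element.
result=$(printf '%s\n' "$sample" | sed 's/IPTC02015>/Category>/g')
printf '%s\n' "$result"

# Caveat: an opening tag that carries attributes does not end in
# "IPTC02015>", so only its closing tag would be renamed.
with_attr='<IPTC02015 lang="en">Sports</IPTC02015>'
broken=$(printf '%s\n' "$with_attr" | sed 's/IPTC02015>/Category>/g')
printf '%s\n' "$broken"
```

The first result has both tags renamed; the second ends up with mismatched tags. It may be worth checking the exported XML before piping it into the update script.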

Tim Strehle
About Tim Strehle

Tim was part of Digital Collections' Research & Development team from 1999 to 2017. He is an expert in metadata and thesauri.
