A few months into a medium-sized DC-X installation, I’d like to share some real-world numbers on image and text import speed. During mass import runs, the system was under relatively high load but remained usable. I’m quite happy with the performance so far:

  • 50,000 images imported per hour (off-the-shelf DC-X importer); includes generation of preview images
  • 400,000 text articles (XML) imported per hour (minor performance tweaks needed, 8 parallel processes)
  • 800,000 documents indexed per hour by the Solr full-text search server

The total number of documents in that DC-X instance is currently 3.2 million, with the data taking up 44 GB in MySQL and 31 GB in Solr (plus the actual image, PDF and other files). A full optimization run of the Solr index takes 25 minutes.
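To put these figures in perspective, here is a small back-of-the-envelope calculation in Python. It is only a sketch based on the numbers quoted above, and it makes a few assumptions that are mine, not measurements: that the eight import processes share the load evenly, that Solr would sustain its quoted indexing rate across the whole index, and that 1 GB means 1024 × 1024 KB.

```python
# Figures quoted in the post.
IMAGES_PER_HOUR = 50_000
ARTICLES_PER_HOUR = 400_000
IMPORT_PROCESSES = 8
SOLR_DOCS_PER_HOUR = 800_000
TOTAL_DOCS = 3_200_000
MYSQL_GB = 44
SOLR_GB = 31


def per_second(per_hour):
    """Convert an hourly rate into a per-second rate."""
    return per_hour / 3600


def kb_per_doc(total_gb, docs):
    """Average size per document in KB (assumes 1 GB = 1024 * 1024 KB)."""
    return total_gb * 1024 * 1024 / docs


images_per_sec = per_second(IMAGES_PER_HOUR)

# Assumption: the 8 import processes contribute equally.
articles_per_sec_per_proc = per_second(ARTICLES_PER_HOUR) / IMPORT_PROCESSES

# Assumption: the quoted Solr rate holds for a complete reindex.
full_reindex_hours = TOTAL_DOCS / SOLR_DOCS_PER_HOUR

mysql_kb_per_doc = kb_per_doc(MYSQL_GB, TOTAL_DOCS)
solr_kb_per_doc = kb_per_doc(SOLR_GB, TOTAL_DOCS)

print(f"Images per second:               {images_per_sec:.1f}")
print(f"Articles per second per process: {articles_per_sec_per_proc:.1f}")
print(f"Full Solr reindex (hours):       {full_reindex_hours:.1f}")
print(f"MySQL KB per document:           {mysql_kb_per_doc:.1f}")
print(f"Solr KB per document:            {solr_kb_per_doc:.1f}")
```

Under those assumptions, the image importer handles roughly 14 images per second, each text import process handles about the same number of articles per second, and reindexing all 3.2 million documents would take about four hours.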

The servers DC-X is running on (set up by Janz):

  • Three IBM System x3650 M2, each with:
      • two quad-core Intel Nehalem processors (Xeon X5570 @ 2.93GHz/1333MHz/8MB L3)
      • 48 GB RAM

The first server runs MySQL and Apache; the second runs Solr, the regular import processes and Apache; the third runs Apache plus occasional mass import processes. Storage being used:

Software:
