Regarding those slow XMMS2 queries…

The screaming groupies disturb my concentration, so I will reluctantly take a few minutes to silence the general hysteria. The secret is no more: we have über-optimized queries in XMMS2!

All thanks go to nesciens, who dived in the optimization documentation of SQLite to fix some stupid bottlenecks of my query generator. The two most important elements are the use of

COLLATION NOCASE

for case-insensitive comparison (instead of using

LOWER(value)

, which disabled indexing), and the creation of better indices on the Media table (including collated indices).

Let’s get some numbers out!

Quick note: in addition to a pair of [key, value] indices (one with BINARY collation and one with NOCASE), he added a pair of [id, key, value] indices. I couldn’t highlight any effect on performance of the latter two, but I left them in for those tests. (Hint: more tests and insight are welcome!)

I ran a series of benchmarks with different search patterns. The execution time has been averaged over 100 executions ran with the command

time for i in $(seq 100); do $CMD > /dev/null; done

. I compared the performance of execution in two versions of XMMS2:

  • default DrKosmos (git hash: f171d33ca13e1715d3b167e8fa958a724eb032ce)
  • optimized DrKosmos (git hash: 94661d22a7437b48c9cb5d4b1cd6483350d0a9a4, nesciens-sqloptims in my tree)

The queries used were the following (all matched songs):

Q1) artist:Air album:”Moon Safari”
Q2) artist:AIR
Q3) artist:Air album:”Moon*”
Q4) artist~Air


In the first set of tests, I used xmms2 as a black-box and ran queries from the CLI. The real execution time was extracted. The data is presented in the following table and graph.

Query Standard Optimized
Q1 368.25 68.46
Q2 396.11 158.97
Q3 331.23 77.23
Q4 961.69 869.53

CLI search execution time
The optimized version is clearly faster than the default version, except for the partial matching query (Q4) which is expensive in both cases.

In the second set of tests, I ran the same queries directly in the SQLite database to measure the pure query time. The user execution time was extracted in that case. Here comes the data in a table and graph.

Query Standard Optimized
Q1 168.89 8.52
Q2 168.02 7.16
Q3 164.53 7.43
Q4 203.00 140.97

SQLite queries execution time
The difference is even more visible here, although I cannot explain why. The benefit of using indices (in Q1-Q3) is unquestionable.

I also benchmarked the performance of running

xmms2 info 1

(read all properties from the DB) with both versions, and the execution time was similar.

In short, these performance improvements should make it possible to activate find-as-you-type in clients, or at least very responsive searches.

Another optimization I would like to implement is searching for a given value in multiple properties (e.g. “artist:Air OR title:Air”). Currently, this is done with expensive JOINs, but it doesn’t have to. I just have to decide in which abstraction layer to improve search in multiple fields (query generator, collections API, etc).

It looks like S4 is not as desperately needed as we first though, although it might still make a very interesting GSoC 2008 project (did you apply yet?)!

Leave a Reply