Albin Larsson: Blog

Culture, Climate, and Code

Cache Busting Wikidata SPARQL Queries

26th January 2018

UPDATE: By setting a cache-control: no-cache header you can disable this query caching.
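As a rough sketch of what that header looks like in practice, the snippet below builds (but does not send) a request with Python's standard library. The endpoint URL, the `build_uncached_request` helper name, the `Accept` header, and the `User-Agent` string are my assumptions for illustration and are not taken from the post:

```python
import urllib.parse
import urllib.request

# Assumed endpoint of the public Wikidata Query Service (not named in the post).
ENDPOINT = "https://query.wikidata.org/sparql"


def build_uncached_request(query: str) -> urllib.request.Request:
    """Build a GET request that asks the service not to serve a cached result."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{ENDPOINT}?{params}",
        headers={
            # The header mentioned in the update above.
            "Cache-Control": "no-cache",
            # Assumption: ask for JSON results via the standard SPARQL media type.
            "Accept": "application/sparql-results+json",
            # Hypothetical identifier; replace with one describing your own tool.
            "User-Agent": "cache-busting-example/0.1",
        },
    )


request = build_uncached_request("SELECT ?item WHERE { ?item wdt:P31 wd:Q146. } LIMIT 1")
# Passing `request` to urllib.request.urlopen() would send it; that step is
# left out so the sketch runs without network access.
```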

The problem

Whenever you write a Wikidata query from which you do not expect the same result each time, you will run into the issue of caching. Let's take the following example, which returns a random cat:

SELECT ?item ?itemLabel (MD5(CONCAT(STR(?item), STR(RAND()))) as ?random)
WHERE 
{
  ?item wdt:P31 wd:Q146.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE], en". }
} ORDER BY ?random 
LIMIT 1

Run in the Wikidata Query Editor

If you run this once you will get a random cat; if you run it twice you will get the same random cat.

This is because Wikidata has cached the result so that it can serve you faster the second time.

The Solution

A solution some people use is to “just add a space somewhere”. This works because Wikidata caches queries based on the SPARQL string, not the actual parsed and performed query.

In a real implementation, keeping track of added spaces is not an option, so I decided to use random comments instead, which can be generated with a hashing function or similar:

#01e8c03a6bdfe392431d8189130fcfc0
SELECT ?item ?itemLabel (MD5(CONCAT(STR(?item), STR(RAND()))) as ?random)
WHERE 
{
  ?item wdt:P31 wd:Q146.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE], en". }
} ORDER BY ?random 
LIMIT 1

Changing the comment string and rerunning the query will result in a new random cat.

In real-world implementations, where you might minify your queries, I suggest adding the comment just before performing the query.
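A minimal sketch of that last step, assuming the query has already been minified into a single string. The `cache_bust` helper name is hypothetical, and a random UUID stands in for the hash-like comment shown earlier; any string that changes on each call would do:

```python
import uuid


def cache_bust(query: str) -> str:
    """Return the query with a unique SPARQL comment line placed in front of it."""
    # uuid4().hex gives 32 random hex characters, similar in shape to the
    # comment in the example above. The newline ends the comment so that it
    # does not swallow a single-line (minified) query.
    return f"#{uuid.uuid4().hex}\n{query}"


minified_query = "SELECT ?item WHERE { ?item wdt:P31 wd:Q146. } LIMIT 1"
first = cache_bust(minified_query)
second = cache_bust(minified_query)
```

Because the comment differs on every call, `first` and `second` are different strings even though the query itself is unchanged, which is what makes a string-keyed cache treat them as separate queries.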

Final Notes
