Albin Larsson: Blog

Culture, Climate, and Code and Wikidata

12th November 2016

The Kyrksok logo

A few weekends ago Jan Ainali, Lars Lundqist, Ulrika Nilsson, David Zardini and I attended the Hack4Heritage hackathon in Stockholm. We ended up creating Kyrksök, a directory of churches in Sweden.

Kyrksök links together various sources such as Wikipedia, Commons and Bebyggelseregistret and makes their content more accessible and discoverable.

Setting up a site and displaying content is never that interesting or much of a challenge, not even when the content comes from many different sources, as long as there are links between them.

Guess what,

there were no such links.

Luckily Wikidata is a perfect place to link all the datasets together.

Actually, most of the time spent on Kyrksök went into linking Bebyggelseregistret to Wikidata and working on the existing data, mostly common tasks such as normalizing labels and verifying existing statements and links.

Once all the third-party URIs were in Wikidata, we could fetch a list of churches that had the required data using Wikidata's SPARQL endpoint (Kyrksök query: visualized/source).
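To give an idea of what such a query can look like, here is a minimal sketch in Python. It is not the actual Kyrksök query. The class and property IDs (Q16970 for church building, P1260 for the Swedish Open Cultural Heritage URI) and the choice of required fields are my assumptions for illustration, so verify them on Wikidata before relying on them.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed identifiers, chosen for illustration; verify them on Wikidata.
CHURCH_CLASS = "Q16970"       # assumed: church building
SWEDEN = "Q34"                # Sweden
HERITAGE_URI_PROP = "P1260"   # assumed: Swedish Open Cultural Heritage URI


def build_query(limit=None):
    """Build a SPARQL query for Swedish churches that have every required link."""
    query = f"""
SELECT ?item ?itemLabel ?coord ?heritage ?commons WHERE {{
  ?item wdt:P31/wdt:P279* wd:{CHURCH_CLASS} ;
        wdt:P17 wd:{SWEDEN} ;
        wdt:P625 ?coord ;
        wdt:{HERITAGE_URI_PROP} ?heritage ;
        wdt:P373 ?commons .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "sv,en". }}
}}"""
    if limit is not None:
        query += f"\nLIMIT {int(limit)}"
    return query


def parse_bindings(result):
    """Flatten the SPARQL JSON result format into a list of plain dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in result["results"]["bindings"]
    ]


def fetch_churches(limit=None):
    """Run the query against the public endpoint (needs network access)."""
    params = urllib.parse.urlencode({"query": build_query(limit), "format": "json"})
    request = urllib.request.Request(
        ENDPOINT + "?" + params,
        headers={"User-Agent": "kyrksok-example/0.1 (illustrative sketch)"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_bindings(json.load(response))
```

Because every triple in the pattern is mandatory, only churches that already have all the links come back, which is the whole point of doing the linking work in Wikidata first.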

By using Pywikibot and KSamsok-PY we could then index data from all the sources we needed without much work. Everything was indexed to a SQLite file. That's mainly it: by then we had a great dataset indexed from a bunch of different sources.
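The indexing step itself can be very small. The sketch below only covers writing already-fetched records to SQLite with Python's standard library; it leaves out the Pywikibot and KSamsok-PY calls entirely. The table name, the columns and the record layout are hypothetical and not taken from the Kyrksök code.

```python
import sqlite3

# Hypothetical schema: one row per church, keyed on its Wikidata ID.
SCHEMA = """
CREATE TABLE IF NOT EXISTS churches (
    wikidata     TEXT PRIMARY KEY,
    label        TEXT NOT NULL,
    lat          REAL NOT NULL,
    lon          REAL NOT NULL,
    heritage_uri TEXT,
    commons      TEXT
)
"""

INSERT = """
INSERT OR REPLACE INTO churches
    (wikidata, label, lat, lon, heritage_uri, commons)
VALUES
    (:wikidata, :label, :lat, :lon, :heritage_uri, :commons)
"""


def index_churches(connection, records):
    """Store the records and return the number of rows now in the index.

    Each record is a dict with the keys named in INSERT. Re-indexing a church
    replaces its old row instead of duplicating it.
    """
    connection.execute(SCHEMA)
    with connection:  # commits on success, rolls back on error
        connection.executemany(INSERT, records)
    return connection.execute("SELECT COUNT(*) FROM churches").fetchone()[0]
```

Calling it with `sqlite3.connect("churches.sqlite")` (a made-up file name) and a list of merged records would produce the single file everything else reads from.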

We then fired up a REST API (Python/Flask) for this SQLite file with both search and bounding box methods. Even so, the API is fewer than 80 lines of code. Just check it out.
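The two methods boil down to two SQL queries. Here is a rough sketch of that query logic using only the standard library, assuming a hypothetical table `churches(wikidata, label, lat, lon)`. The function names and parameters are mine and this is not the actual Kyrksök API code.

```python
import sqlite3

COLUMNS = "wikidata, label, lat, lon"


def _rows(cursor):
    """Turn a cursor's rows into dicts that are easy to serialize as JSON."""
    names = [column[0] for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


def search(connection, term, limit=20):
    """Find churches whose label contains the term.

    SQLite's LIKE is case-insensitive for ASCII letters by default. Note that
    % and _ inside the term act as wildcards here.
    """
    cursor = connection.execute(
        f"SELECT {COLUMNS} FROM churches WHERE label LIKE ? ORDER BY label LIMIT ?",
        (f"%{term}%", limit),
    )
    return _rows(cursor)


def in_bounding_box(connection, south, west, north, east, limit=100):
    """Find churches inside a latitude/longitude box.

    This simple version does not handle boxes that cross the antimeridian,
    which is not a concern for data limited to Sweden.
    """
    cursor = connection.execute(
        f"SELECT {COLUMNS} FROM churches "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? "
        "ORDER BY label LIMIT ?",
        (south, north, west, east, limit),
    )
    return _rows(cursor)
```

A Flask view for each method would then only need to read the request parameters, call one of these functions and return the list as JSON, which is why the whole API can stay so short.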

The best thing about this is that the approach can be applied to any idea, since the data in Wikidata is extremely diverse. Wikidata has become an incredibly important source for many of the projects I'm involved in, including Biocaching.

Go ahead and check out all the Kyrksök repositories over at GitHub.

(Kyrksök = “church search”)
