Albin Larsson: Blog

Culture, Climate, and Code

An Actionable Approach to Data Quality for Cultural Heritage Institutions

30th October 2019

In this post I’m introducing a new data quality portal that we are currently testing with the K-samsök data partners. It’s a data quality tool without any percentages or metrics.

To solve data quality issues at cultural heritage institutions (or anywhere) two key things need to be achieved:

  1. Awareness - individuals need to be aware of the specific data quality issues present in their data.
  2. Actionability - individuals need to have the tooling and knowledge to find and fix individual quality issues.

The first one is something that aggregation actors have been targeting for a while through spreadsheets and percentages. The screenshot below shows the second(ish) iteration of our licensing statistics/issues spreadsheet that we share with data partners (the code behind it is written by my colleague Marcus and it’s open source).

Screenshot of spreadsheet containing SOCH license statistics

Some institutions have been able to act upon the insights given by this, while others have not. A common issue is that institutions cannot query their own data. Often this is because their collection management system lacks the capability; sometimes it’s a combination of that and user knowledge. Either way, it’s a barrier.

Being an aggregator allows us to lower barriers for 70+ data partners by curating advanced queries and building a GUI around them. So instead of being unable to query their data, or being stuck writing some boolean and/or whatever query in their CMS, they can now list problematic objects within a few clicks.

Screenshot of the main SOCH data portal showing possible queries

The data quality queries currently available to data partners in the proof-of-concept version of the data portal (we and our partners have plenty of others in mind).

Screenshot of a quality page displaying items with errors in a table.

After selecting a quality query and a providing institution, the user is presented with a list of problematic or possibly problematic objects. Each entry contains a link back to the source page or CMS, as well as other metadata that might be relevant for the given query.
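To make the idea concrete, here is a minimal sketch in Python of what "pick a curated query, pick an institution, get a list" could look like. This is not the portal's actual code: the `Record` fields, the query names, and the `problematic_records` function are all made up for illustration, and a real aggregator would run the checks against its index rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Record:
    """A hypothetical, simplified aggregated object."""
    institution: str
    title: str
    license: Optional[str]
    source_url: str  # link back to the source page or CMS


# Hypothetical curated queries: each name maps to a check that
# returns True when a record has the problem in question.
QUALITY_QUERIES: Dict[str, Callable[[Record], bool]] = {
    "missing-license": lambda r: not r.license,
    "missing-title": lambda r: not r.title.strip(),
}


def problematic_records(
    records: List[Record], query_name: str, institution: str
) -> List[Record]:
    """List one institution's records that fail the chosen quality query."""
    check = QUALITY_QUERIES[query_name]
    return [r for r in records if r.institution == institution and check(r)]
```

The point of the design is that the partner never writes the query. They only choose from names that someone has already curated, and every result carries the link they need to go and fix the object at the source.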

In a couple of weeks it will be clear if this new approach has an impact. I know what I’m betting on.

My new approach to online privacy

4th July 2019

For the last few years I have had an online privacy approach in the style of “Do not put all your eggs in the same basket”, exemplified by “If I use Google for email I won’t use it for browsing the web”.

Now, after a few years of empirical learning, I have decided to change this approach. It’s clear that the owner of “my” online data (the irony) is seldom static, nor does it keep the data within its own walls.

My new approach is to create as little online data as possible. Below are some actual examples of things that have led me towards this decision.

There are probably plenty of cases where these types of issues have been combined and exposed information about me to third parties unknown to me.

What I’m doing to limit online data about me

One might see me as paranoid or a privacy geek, but these actions come from actual concerns and real-world examples.

How to set up a Generous Interface Prototype in Less than a Day 🔗

9th April 2019

An Awesome List

9th February 2019

Screenshot of list contents

I created one of those “awesome lists” for K-samsök resources. I have personally found awesome lists useful when starting with something new, or when I just need to investigate useful components for a hobby project. Therefore I decided that it might help someone else to have one list for all the best K-samsök projects and resources, so that people might get started quicker with one of Europe’s largest Linked Open Data platforms for heritage data.

The list is on GitHub and licensed under CC0.
