Twice recently, in two different large companies, people whom I otherwise respect as quite sensible have told me that their organisations had appointed people who, within their own function, were ‘responsible for data quality’.
Well, when I went to Data Governance School back at the beginning of the century, the first thing that we were taught – and the first principle of data quality management – was that everybody is responsible for data quality.
So what’s going on? Is it me? Have I missed some major new insight into the effective management of data quality? I don’t think so – please tell me if I am wrong – but this is so fundamental that we need to get it clear.
There are a number of aspects to managing data quality – let’s look at them in turn.
- Deciding what ‘good quality’ looks like. It doesn’t necessarily have to be perfect, but it has to be fit for the purpose for which it is intended to be used. Somebody needs to be responsible for this, and for letting people know when it’s not fit for purpose; this is the role of someone who is normally called a ‘data steward’.
- Measuring the quality. If you can’t measure it, you don’t know whether it’s any good – and you won’t be able to tell if it’s getting better (or worse, come to that). It’s normally the job of IT to design and implement ‘data quality scorecards’, but the business needs to be able to state the criteria.
- Improving the quality. Maybe this is what people mean when they say that they’ve nominated an individual to be ‘responsible for quality’ – but this is not the same thing by a long way. If you ask someone to do this, expect them to ask three questions:
- “What’s the quality like today?”
- “How will I know if it’s got better?”
- “What if I can’t get the resource because other people’s problems are bigger than ours?”
You can’t improve it if you can’t measure it, and you shouldn’t be expected to improve it if it’s already fit for purpose and/or other business areas have a higher priority need to improve the data that they use.
- Running the data quality management process. Best practice dictates a formal process for managing data quality:
- identifying the problem and giving it a quick triage to see how urgent it is;
- logging it on a formal register, with enough information for the analysts to prioritise the diagnosis;
- analysing it to find the root cause, and finding out how to fix it;
- prioritising the fix, maybe with sticking-plaster in the short term while the ‘proper’ solution gets planned and executed;
- fixing it; and finally
- monitoring it to make sure you really did fix it.
Of course you need somebody to run this and to manage any ‘fix’ projects; if you have a central data quality team, that’s where responsibility should lie (and if you haven’t got one, get one).
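On the measurement point above: a ‘data quality scorecard’ can be as simple as a handful of completeness and validity checks that the business has stated and IT has implemented. Here is a minimal sketch in Python – the field names, sample records and business rules are purely illustrative assumptions, not anything from a real system:

```python
import re

# Illustrative sample data; in practice this would come from a real source system.
customers = [
    {"name": "Ada Lovelace", "email": "ada@example.com", "postcode": "SW1A 1AA"},
    {"name": "Alan Turing", "email": "", "postcode": "CB2 1TN"},
    {"name": "Grace Hopper", "email": "grace@example", "postcode": ""},
]

def completeness(records, field):
    """Share of records where the field is populated at all."""
    return sum(1 for r in records if r[field]) / len(records)

def validity(records, field, pattern):
    """Share of populated values that satisfy a business rule (here, a regex)."""
    populated = [r[field] for r in records if r[field]]
    return sum(1 for v in populated if re.fullmatch(pattern, v)) / len(populated)

# The business states the criteria; these measures make them visible over time.
scorecard = {
    "email completeness": completeness(customers, "email"),
    "email validity": validity(customers, "email", r"[^@]+@[^@]+\.[^@]+"),
    "postcode completeness": completeness(customers, "postcode"),
}
for measure, score in scorecard.items():
    print(f"{measure}: {score:.0%}")
```

Run the same scorecard on a schedule and you can answer the “is it getting better?” question with numbers rather than impressions.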
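The issue-management steps above can be sketched, very roughly, as a register that tracks each problem through the stages. This is a toy illustration – the stage names, urgency scale and class names are my own assumptions, not a prescribed implementation:

```python
from dataclasses import dataclass
from enum import Enum, auto

# Hypothetical stages mirroring the process described above.
class Stage(Enum):
    TRIAGED = auto()      # identified and given a quick urgency assessment
    LOGGED = auto()       # recorded on the formal register
    ANALYSED = auto()     # root cause found
    PRIORITISED = auto()  # fix scheduled (perhaps with a short-term workaround)
    FIXED = auto()
    MONITORED = auto()    # confirming the fix really worked

@dataclass
class Issue:
    description: str
    urgency: int          # triage outcome: 1 = critical, 3 = can wait
    root_cause: str = ""  # filled in during analysis
    stage: Stage = Stage.TRIAGED

class Register:
    """A formal register of data quality issues."""
    def __init__(self):
        self.issues = []

    def log(self, issue):
        issue.stage = Stage.LOGGED
        self.issues.append(issue)

    def next_to_analyse(self):
        # Analysts pick up the most urgent logged issue first.
        logged = [i for i in self.issues if i.stage is Stage.LOGGED]
        return min(logged, key=lambda i: i.urgency) if logged else None

reg = Register()
reg.log(Issue("Duplicate customer records", urgency=2))
reg.log(Issue("Missing postcodes on 30% of addresses", urgency=1))

issue = reg.next_to_analyse()
issue.root_cause = "No postcode validation at point of entry"
issue.stage = Stage.ANALYSED
```

Whether the register lives in code, a workflow tool or a spreadsheet matters far less than that it exists and that one team is accountable for running it.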
It’s really not that difficult
True, there are a number of aspects to consider as we’ve seen – and it needs to be backed up by proper data governance to make sure that the fixes stay fixed and the quality doesn’t deteriorate again.
But if you’re going to tell specific individuals that they’re ‘responsible for data quality’, don’t expect the rest of the organisation to feel ownership. Don’t make data quality ‘somebody else’s problem’.