Data Profiling is nothing new and, unsurprisingly, there’s plenty to read on the topic.
There are a few tools out there too, and vendors will happily tell you what profiling is, why you should do it and how.
The problem is, as tools become more and more powerful and users are overwhelmed with the options available to them, it becomes increasingly difficult for data-folk to know where to start.
Modern Data Profiling tools often come with handy wizards and pointers on how the creators think you should do profiling, but there’s one thing usually missing: relevance to you and your business.
Data Profiling for your business:
You already know what profiling is and how to get started with it, so once you’ve got your hands on a tool (or whilst you’re waiting for it to download or for your license to arrive), these are the things you should start to think about:
What is my business’ Data Strategy?
Even if your business doesn’t have a formal data strategy, it will have aims and objectives. Before you begin ingesting and profiling your business data, you should know what you’re trying to achieve.
What does the data mean to each part of the business?
Marketing is going to have a very different understanding and culture around data to, say, your HR department.
Before you start profiling the data, you should assign value to it. This can include risk, financial value and value to the business.
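One way to make this concrete is to score each data set before you profile it and work through them in priority order. The sketch below is only an illustration: the data set names, the 1 to 5 scale and the idea of simply adding the three scores together are all assumptions, not a recommended scoring model.

```python
from dataclasses import dataclass


@dataclass
class DatasetValue:
    """Hypothetical scorecard for one data set (1 = low, 5 = high)."""
    name: str
    risk: int       # exposure if the data is wrong or leaked
    financial: int  # direct monetary value
    business: int   # wider value to the business

    def priority(self) -> int:
        # Assumption: an unweighted sum is good enough for a first pass.
        return self.risk + self.financial + self.business


# Made-up example data sets and scores.
datasets = [
    DatasetValue("marketing_contacts", risk=2, financial=4, business=5),
    DatasetValue("hr_employee_records", risk=5, financial=2, business=3),
    DatasetValue("archived_web_logs", risk=1, financial=1, business=1),
]

# Highest-value data sets first: profile these before anything else.
ranked = sorted(datasets, key=lambda d: d.priority(), reverse=True)

for dataset in ranked:
    print(dataset.name, dataset.priority())
```

In practice the scores would come from conversations with each department rather than from your own guesswork, and you may well want to weight risk differently from financial value.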
Who cares about the data?
Based on your findings from assigning value to the data, you should quickly get an idea of who would be interested in the findings from your profiling. If you can’t find anybody who cares about the data, don’t waste time knocking on closed doors; seek out new advocates instead.
Who has the ability to change the data?
You don’t want to begin a project only to find out halfway through that you don’t have the correct security permissions to even see the data, let alone fix any problems your profiling uncovers. Make sure you get this sorted before you start.
Who will a change to the data impact?
If you’re not in a test environment, you’re likely to uncover issues with data quality that could have a big impact on the business and the people in it.
This is a good opportunity for you to think about how you might become everybody’s best friend. If your analysts are having a horrible time manually uncovering duplicates that you can quickly find with profiling, you’ve got an opportunity to remove some real business pain for them.
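To give a feel for the kind of duplicate check a profiling tool automates, here is a minimal sketch in plain Python. The customer records, the field names and the normalisation rule (lower-casing and collapsing whitespace) are assumptions made for illustration; real tools typically offer fuzzier matching than this.

```python
from collections import defaultdict


def normalise(value):
    """Lower-case and collapse whitespace so trivial variations compare equal."""
    return " ".join(str(value).lower().split())


def find_duplicates(records, key_fields):
    """Group row indexes by normalised key fields; keep groups with more than one row."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(normalise(record.get(field, "")) for field in key_fields)
        groups[key].append(index)
    return {key: rows for key, rows in groups.items() if len(rows) > 1}


# Made-up sample records: the first two differ only in case and spacing.
customers = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "ada  lovelace", "email": "ADA@example.com "},
    {"name": "Alan Turing", "email": "alan@example.com"},
]

duplicates = find_duplicates(customers, ["name", "email"])
print(duplicates)
```

Here rows 0 and 1 are flagged as the same customer, which is exactly the sort of match an analyst would otherwise have to spot by eye.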
Who does the data help?
This is an important question, but it’s rhetorical because the answer should be everybody.
As an organisation climbs the maturity curve and becomes more data literate, your profiling efforts should be continually identifying upstream benefits to the rest of the business.
We’re not talking about cutting costs here; we’re talking about creating value.
So there you have it: in 2020 and beyond, data profiling should start with the right attitude and the right questions, asked before you’ve even loaded your first data set.