Data cleansing is a form of data management to ensure that the information your business stores is accurate, complete, formatted, unique, relevant and up-to-date. How often you clean your data and the methods used can vary depending on the business or industry but no matter what, an effective data cleansing strategy has the capability of being a core driver in business performance.
In a world dominated by digital, we rely on data more and more. This could be for sending emails to the correct address or ensuring the right amount is on an invoice or, in a more innovative sense, telling our customers what products they might like or what movies to watch and music to listen to. Data is becoming the central force behind an enormous amount of business activity.
This article looks at why clean data is so important across several different industries and what the potential risks are of neglecting data governance altogether.
Why should data be cleansed?
It is thought that anything up to 70% of data could be out of date after just one year, resulting in missed sales opportunities or misguided marketing activity. IBM has suggested that in 2016, bad data took $3 trillion out of the US economy and spending on redundant, obsolete or trivial data in the UK costs about £435k per year.
Source: Royal Mail
Beyond the marketing and sales opportunities, bad data means you walk the tightrope of being blacklisted by email providers if you don’t have a verification process, face a higher chance of complaints if you continually send poor or wrong information or, in the worst case, suffer financial implications if you are not able to align all your sources.
The reason for data cleansing ultimately depends on your industry, budget and objective of doing so.
Other sources suggest a different rate of data decay, saying that only 80% of data is still valid after a year (Source: Finelyfetted). If we consider that about 0.9% of the population dies each year, it is quite a big turn-off to customers if you are known to continue sending communications addressed to people who have died.
The data cleansing process
Prior to beginning your data cleansing, a proper data audit should be conducted as each business may have different data quality issues that should be addressed.
Imagine you are in a meeting with the CEO, COO and CFO looking at revenue for the previous month. The CEO thinks you made $10 million, the COO believes it is $11 million and the CFO thinks it was $9.5 million. In theory, they are all correct, but the problem has stemmed from them all using different data sources and metrics to get their numbers. Confusion like this can quickly cause mass hysteria in a business which is why definitions are highly important before even beginning a cleansing strategy.
Once the data sources are aligned, you can start reviewing the best methods for cleansing as you have a far clearer view of the potential problems. For example, the audit might show you are missing lots of email addresses, so you need an email verification system, or that customer dates of birth are in different formats and you need to focus on validation.
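As a rough illustration of what a first audit pass could look like, the sketch below counts how often each field is missing or blank. The function name `audit_missing` and the sample customer records are hypothetical and only there to show the idea; a real audit would cover far more checks.

```python
def audit_missing(records, fields):
    """Return the share of records where each field is missing or blank."""
    total = len(records)
    report = {}
    for field in fields:
        missing = sum(
            1
            for record in records
            if record.get(field) is None or str(record.get(field)).strip() == ""
        )
        report[field] = missing / total if total else 0.0
    return report


# Hypothetical sample data, invented for this example.
customers = [
    {"name": "Ann", "email": "ann@example.com", "dob": "1990-04-12"},
    {"name": "Bob", "email": "", "dob": "12/04/1985"},
    {"name": "Cy", "email": None, "dob": ""},
    {"name": "Di", "email": "di@example.com"},
]

report = audit_missing(customers, ["email", "dob"])
print(report)  # {'email': 0.5, 'dob': 0.5}
```

A report like this gives a quick indication of which fields deserve attention first.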
StrategicDB offer a free data audit report designed to help clients work out which metrics should be normalised or which should be benchmarked. The image below shows how a data audit report can help you begin your data cleansing journey.
Overview of data cleansing methods
1. Email verification
Ensuring you hold the correct email addresses for your customer base has become of the
Most digital businesses will deploy email verification systems that check if the data entered is correct to avoid typos and spam. The best option is to have this running in real-time behind your website to filter out poor-quality data straight away but if not, an entire cleanse each quarter as a minimum is probably a good idea.
As well as email, many businesses will follow similar procedures for checking cell phone details or mailing addresses before the customer is able to register their details. A common method is to send the customer an SMS with a code to verify their details before they place an order.
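A very small sketch of the kind of check that can run behind a sign-up form is shown below. It only tests whether a value has the basic shape of an email address; it does not confirm that the mailbox exists, which is what a full verification service would do. The pattern and the function name `looks_like_email` are assumptions made for this example, not a standard.

```python
import re

# Deliberately simple pattern: something@something.tld, with no spaces
# and exactly one "@". Real-world address rules are more permissive.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s.]+")


def looks_like_email(value):
    """Return True if the value has the basic shape of an email address."""
    return bool(EMAIL_PATTERN.fullmatch(value.strip()))


print(looks_like_email("jane@example.com"))  # True
print(looks_like_email("jane@example"))      # False
```

Catching obvious typos at the point of entry is cheap; confirming that an address can actually receive mail usually needs a confirmation email or a third-party service.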
2. Data validation
Like email verification, data validation makes sure that any piece of data coming into your business is correct. For example, are dates of birth formatted correctly and are customers giving their correct cell phone numbers?
Front end sites can validate all this information and it is the best way to cleanse data before it even comes into your database. This might work in the form of input masks which force the customer to enter data in a specific way rather than it being free form text.
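For data that has already arrived in mixed formats, a validation step might look something like the sketch below, which tries a short list of accepted date formats and converts matches to a single standard one. The list of formats is an assumption for this example (note that it reads `12/04/1985` as day first), and `normalise_date` is a hypothetical helper name.

```python
from datetime import datetime

# Assumed input formats for this example; day-first for slash-separated dates.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y")


def normalise_date(text):
    """Return the date as YYYY-MM-DD, or None if no accepted format matches."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # Try the next format.
    return None


print(normalise_date("12/04/1985"))  # 1985-04-12
print(normalise_date("31/02/2020"))  # None (no such date)
```

Values that come back as `None` can be flagged for review rather than silently stored.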
3. De-duplication
Finding duplicates in your data and removing or merging them is very important. If a customer is able to have two records, they may start getting duplicate emails, phone calls, letters or text messages. This is both a poor customer experience and an unnecessary business cost.
De-duplication may not always be 100% accurate but you can do the best job possible. It is quite common for customers to use multiple email addresses, for example, and picking up on that is not always simple when searching for duplicate accounts.
However, minimizing duplicate records by checking details on the front end or having rules built in for certain flags can be an excellent cost-saving exercise that ultimately improves customer experience.
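One simple rule of this kind is to match records on a cleaned-up name and postcode rather than on email alone, so that a customer who registered twice with different email addresses is still picked up. The sketch below assumes each record has `name`, `postcode` and `email` fields; the field names, the matching rule and the sample data are all hypothetical, and a rule this crude can wrongly merge two different people who share a name and postcode.

```python
def dedupe_key(record):
    """Build a matching key from a normalised name and postcode."""
    name = " ".join(record["name"].lower().split())
    postcode = record["postcode"].replace(" ", "").upper()
    return (name, postcode)


def merge_duplicates(records):
    """Merge records that share a key, collecting all of their email addresses."""
    merged = {}
    for record in records:
        key = dedupe_key(record)
        entry = merged.setdefault(
            key, {"name": record["name"], "postcode": key[1], "emails": set()}
        )
        entry["emails"].add(record["email"].strip().lower())
    return list(merged.values())


# Hypothetical sample data, invented for this example.
records = [
    {"name": "Jane Smith", "postcode": "AB1 2CD", "email": "jane@example.com"},
    {"name": "jane  smith", "postcode": "ab12cd", "email": "J.Smith@example.org"},
    {"name": "John Doe", "postcode": "XY9 8ZZ", "email": "john@example.com"},
]

result = merge_duplicates(records)
print(len(result))  # 2
```

In practice, likely matches are often sent for manual review instead of being merged automatically.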
4. Data standardisation and normalisation
Standardisation and normalisation are management practices for optimisation and streamlining of your company data.
Standardisation is a way of taking disparate datasets and putting them on the same scale for more accurate analysis using averages and standard deviations. A good example of this is with seasonal businesses. Say you sell ice cream at an average of $420 per day with a standard deviation of $50, but in the summer you sell $520 per day. To standardise the summer value, you would calculate (520 - 420) / 50 = 2. If you sold $600 in a day, it would be (600 - 420) / 50 = 3.6. This turns large values into a standard scale for more accurate data analysis.
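The same calculation can be written as a short function. This is a minimal sketch using the ice cream figures from the example; the function name `standardise` is just an illustrative choice.

```python
def standardise(value, mean, std_dev):
    """Return how many standard deviations a value sits from the mean."""
    if std_dev == 0:
        raise ValueError("standard deviation must be non-zero")
    return (value - mean) / std_dev


print(standardise(520, 420, 50))  # 2.0
print(standardise(600, 420, 50))  # 3.6
```

Note that the subtraction happens before the division, which is why the brackets matter when the sum is written out by hand.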
How often should data be cleansed?
There isn’t really a definitive answer to how often data should be cleansed but in an ideal world you’d like everything to happen in real-time, certainly in retail.
For much of the time, you can get front end tools to manage email verification, postcode validation or de-duplication virtually in real-time, and if data errors are costing your business a lot of money this could be the right thing to invest in. If your business is working with Big Data, cleansing, or not doing so, can have a large impact on campaign performance and ROI if quality is not up to scratch. However, some smaller businesses could still be working from spreadsheets, in which case an annual cleanse may be sufficient to ensure your records are kept up-to-date.
The best solution here is to base the regularity of your data cleansing on how much not doing so might be costing you as a business.
Creating a data quality dashboard – cleansing as an asset
Arguably one of the most difficult parts of data cleansing is finding out whether there is a problem and if so, what the problem is. A data quality dashboard helps with this and, beyond that, is set up to show the cost of any poor data management, allowing senior business leaders to get a view of the impact of data governance on their goals.
Taking a practical example, let’s imagine you are getting several customer returns, meaning the company couriers keep bringing packages back to be checked and redistributed. The root cause of the returns is that you don’t have an automated address finder on your business website, so the customer is being asked to manually enter their details each time.
The returns you are getting back are due to human errors in typing addresses and could be resolved if you invested in an automated address finder but the Board are not willing to sign off the investment in such technology.
The data quality dashboard will break down the cost of the errors and show how it impacts business income and efficiency. It will highlight the need for investment and give a cost/benefit view for better strategic decision making. With this cost/benefit view available, data will be treated more as an asset, in a ledger-type sense.
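The cost/benefit view for the address finder example could be reduced to a calculation like the sketch below. Every input here is a hypothetical figure chosen purely for illustration, and the function name and parameters are assumptions; a real dashboard would draw these values from courier invoices and order data.

```python
def address_finder_case(returns_per_month, cost_per_return,
                        tool_cost_per_month, share_of_returns_fixed):
    """Compare the monthly cost of address errors with the cost of fixing them."""
    cost_of_errors = returns_per_month * cost_per_return
    saving = cost_of_errors * share_of_returns_fixed
    return {
        "cost_of_errors": cost_of_errors,
        "saving": saving,
        "net_benefit": saving - tool_cost_per_month,
    }


# Hypothetical inputs: 200 returns a month at $8 each, a $500 monthly tool
# that is assumed to prevent 75% of address-related returns.
case = address_finder_case(200, 8.0, 500.0, 0.75)
print(case)  # {'cost_of_errors': 1600.0, 'saving': 1200.0, 'net_benefit': 700.0}
```

Putting a single net figure in front of the Board makes the investment decision easier to argue than a general appeal for better data.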
As Big Data becomes even bigger, a strong data cleansing strategy has the potential to drive business performance by creating an environment of trustworthy data, improving the efficiencies that impact bottom line profit. As of July 2017, 90% of the world’s data had been created in the previous 2 years. Whilst businesses continue to explore new methods of working with this data, cleansing will only become a greater challenge if it isn’t fully managed.
Sales and marketing teams run the risk of higher cost, poor customer experience, lower customer loyalty and the potential for blacklisting if they don’t drive data cleansing strategies in the business. An article in the Harvard Business Review has pointed to bad data costing the US over $3 trillion per year, with a major cause being sales teams working with erroneous prospect or customer information and service teams wasting time dealing with incorrect orders.
Without a cleansing strategy, it is no wonder that senior business leaders fail to trust what they see and rely on gut feel rather than the data they are provided, as marketing or sales forecasts prove to be inaccurate.
Data cleansing provides the quality in to get the quality back out.
This post was submitted by a TNS expert.