Are You Ready to Tackle Your Junk Data?


By Ben Harrison  |  September 23, 2021

Mark Twain is often quoted as saying, “Everybody talks about the weather, but no one ever does anything about it.” A similar sentiment applies to junk data. It is a blight that perpetually limits our ability to make the data-driven decisions that improve our businesses. Being informed by quality data is more imperative than ever for a business that wants to remain competitive and profitable. If your data is ‘junk’, you can never become a data-driven organization.


Sometimes junk data accumulates because organizations lack disciplined business processes. They may also lack the analytics framework needed to monitor those processes and ensure they are followed. Most dimensional or categorical data is generated either manually by human input or by automated systems. When either fails, data is assigned inconsistent values, and reporting on inconsistent junk data is nearly impossible.


Another cause of junk data is the lack of a planned, governed decision framework. Many companies’ analytics efforts focus on one specific ‘use case’ at a time. Each project becomes a silo built for a specific user or purpose. These independent repositories of data are not built to a common standard, so data structures vary widely from one solution to the next.


By contrast, a governed decision framework eliminates siloed solutions and has these characteristics:


1. User-friendly

You shouldn’t need to be a data scientist to use and understand the data.


2. Consistently formatted

Dates, numbers, percentages, and dimensional values should be formatted the same throughout the data set. If you use ‘96.5%’ in one column, another percentage column in the same data set should not read ‘.965’.


3. Trustworthy

It should be possible to validate aggregated data against the detail transactions, so that end users know they can trust the information and insights they obtain.


4. Complete

If you have dimensional data such as a ‘region’ code, every row of data should have a value. If the value doesn’t apply, show something like ‘N/A’ instead of leaving it blank. Blank values don’t tell the end user whether the field was accidentally or purposefully left out.
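
To make the formatting, trust, and completeness checks above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any particular product: the function names, the ‘region’ and ‘amount’ field names, and the sample rows are all hypothetical. It also assumes that a bare number of 1 or less is a fraction, a rule you would want to confirm for your own data.

```python
from collections import defaultdict


def normalize_percent(value):
    """Return a percentage as text with one decimal place, e.g. '96.5%'.

    Assumption (hypothetical rule): a bare number of 1 or less is a
    fraction ('.965' means 96.5%); anything ending in '%' is already a percent.
    """
    text = str(value).strip()
    if text.endswith("%"):
        number = float(text[:-1])
    else:
        number = float(text)
        if abs(number) <= 1:
            number *= 100
    return f"{number:.1f}%"


def fill_missing(rows, field, placeholder="N/A"):
    """Return copies of the rows with blank or missing values made explicit."""
    cleaned = []
    for row in rows:
        value = row.get(field)
        if value is None or str(value).strip() == "":
            row = {**row, field: placeholder}
        cleaned.append(row)
    return cleaned


def reconcile(detail_rows, reported_totals, key, amount, tolerance=0.005):
    """Recompute totals from detail rows and list every total that disagrees.

    Returns {key value: (reported total, recomputed total)}.
    """
    recomputed = defaultdict(float)
    for row in detail_rows:
        recomputed[row[key]] += row[amount]

    mismatches = {}
    for name in set(reported_totals) | set(recomputed):
        reported = reported_totals.get(name, 0.0)
        actual = recomputed.get(name, 0.0)
        if abs(reported - actual) > tolerance:
            mismatches[name] = (reported, actual)
    return mismatches


# Hypothetical sample data: one transaction has a blank region code.
transactions = [
    {"region": "West", "amount": 120.0},
    {"region": "", "amount": 80.0},
    {"region": "East", "amount": 50.0},
]
cleaned = fill_missing(transactions, "region")

# Hypothetical summary report; the East total does not match the detail.
reported_totals = {"West": 120.0, "East": 55.0, "N/A": 80.0}

print(normalize_percent(".965"))   # 96.5%
print(normalize_percent("96.5%"))  # 96.5%
print(reconcile(cleaned, reported_totals, "region", "amount"))
```

In this sample, the blank region becomes an explicit ‘N/A’, both percentage styles come out as ‘96.5%’, and the reconciliation flags East because the reported total of 55.0 does not match the 50.0 found in the detail. In practice, checks like these would normally live in whatever tool loads and models your data rather than in a standalone script.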


Prospective clients often tell Preferred Strategies that they want to ‘clean up’ their data before they start an analytics initiative. Our experience has been that until a prospective client embarks on an analytics initiative (and experiences the frustration of junk data), they will never have the process discipline required to enforce the creation of reliable, trustworthy data. If your company is ready to build a governed decision framework that will help clean up your data and provide a solid foundation for analytics excellence, contact us to see how we can help. In other words, don’t just talk about the weather; do something about it.



About the Author

Ben Harrison

Ben is an experienced business analyst with a demonstrated history of working in the construction and process industries.
