Coronavirus disease 2019 (COVID-19) is a global pandemic that has disrupted our daily lives with no immediate end in sight. The global community has relied upon leaders and experts to communicate what the pandemic is, how to mitigate its risk, and the impact it is having on us as a species. On the latter point, leaders have relied primarily upon data to share and illustrate that impact. In this article, I will focus on the importance of data and some of the lessons we are learning from its role in managing this global pandemic.
Sourcing Data is Hard
Sourcing data is first and foremost about the credibility of the reporting source. How is the data collected (manual processes or automated technical processes)? What is the scope of the collection? At what frequency (day, hour, minute, second) is data collected? In the case of COVID-19, we have learned that credibility matters when weighing nation-state reporting (e.g., China) against independent-body reporting (e.g., the WHO). Independent-body reporting can be scrutinized even further when it comes from academic or think tank organizations (do they have the resources and/or expertise that underscore credibility?).
Getting sourcing right, whether in a global pandemic crisis or in a business initiative, is fundamental to having any chance of leveraging data to inform, and ultimately make, decisions.
Normalizing Data is Even Harder
Normalizing data presents an array of challenges, even when the data comes from credible, trusted sources. Take the following COVID-19 example: at any given time you can compare reporting from the World Health Organization (WHO) and the US Centers for Disease Control and Prevention (CDC) and find discrepancies in vital statistics that would otherwise seem easy to reconcile. What we often overlook in any data collection initiative is basic normalization (setting forth a common set of definitions, standards, and procedures for the input, processing, and output of data). This is the primary reason we see so many variances in the reporting of COVID-19 data.
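The idea of a common set of definitions and standards can be sketched in a few lines of code. The sketch below is purely illustrative (the source records, field names, and figures are all invented, not real WHO or CDC data): two hypothetical sources report the same statistic under different field names and date formats, and a small normalization step maps both onto one shared schema before any comparison is attempted.

```python
from datetime import datetime

# Hypothetical raw records from two sources reporting the same statistic
# under different field names and date formats (figures are invented).
source_a = {"report_date": "2020-04-01", "confirmed": 1000}
source_b = {"date": "04/01/2020", "total_cases": 950}

def normalize(record, field_map, date_format):
    """Map a source-specific record onto a shared schema:
    common field names and ISO 8601 dates."""
    out = {common: record[src] for src, common in field_map.items()}
    # Standardize dates so records from different sources line up.
    out["date"] = datetime.strptime(out["date"], date_format).date().isoformat()
    return out

normalized_a = normalize(source_a, {"report_date": "date", "confirmed": "cases"}, "%Y-%m-%d")
normalized_b = normalize(source_b, {"date": "date", "total_cases": "cases"}, "%m/%d/%Y")

# Both records now share field names and formats and can be compared directly.
print(normalized_a)  # {'date': '2020-04-01', 'cases': 1000}
print(normalized_b)  # {'date': '2020-04-01', 'cases': 950}
```

Without the agreed-upon schema in the middle, any comparison of the two raw records would be comparing apples to oranges, which is exactly the variance we see in real-world reporting.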
Additionally, when you begin mixing qualitative methods into exclusively quantitative reporting, you begin skewing objective data points. The damage compounds when data points are “cherry picked,” creating even more disparate sets of data that likely do not adhere to a common set of standards, and yielding yet another problem: having to re-unify the data and re-trace the branches back to the root. In a business environment, the consequences of this problem are, at worst, financial. In managing a global pandemic, it can be the difference between lives saved and lives lost.
Finding Truth in Data is Hardest
Sometimes we want data to prove a predisposition we have already formed. If we were to reverse engineer the data to support that predisposition, while still backed by a standardized methodology and credible sourcing, we could. Unfortunately, this can be a common practice in many environments. If you closely monitor the media and political edges of COVID-19, you will see it happening right in front of you.
Whether that practice yields truth for some or for many, can it be considered finding truth in data? That is the million-dollar question in any setting where we want data to be the basis for our decision(s). Finding truth in data may be in the eye of the beholder, as they say. A single source of truth, unchallenged, can be reassuring and dangerous at the same time.
As in the practice of law, I have found it most useful to rely upon the definitions that were created in good faith, follow the methodology or process to arrive at a reasonable conclusion or decision, and live with that as your single source of truth.