Data Owner

I am sharing this data science pill so that professionals without a data background can see how much they can contribute to turning their company into a data-driven one.

A few days ago I received an Excel list of sessions from a school.
The first column caught my data analyst's eye: «Teaching mode»

After two nested formulas in Excel, I got a list of unique values like this:

Híbrida
Hibrido
LinkedIn
Online
Presencial
Teams
Zoom
Zoom Meet
Zoom W+Zoom M
Zoom Web
Zoom Web.
Zoom WebBrasil
Zoom Webinar

The list made me think about the headache the academic director faces when trying to find out how many sessions were online, in-person, or hybrid.
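To give an idea of the cleanup work involved, here is a minimal Python sketch that maps the thirteen labels above onto three teaching modes and counts them. The matching rules are my own assumptions, not the school's definitions: I treat every platform name (Zoom, Teams, LinkedIn) as an online session, and the helper names `normalize` and `teaching_mode` are purely illustrative.

```python
import unicodedata
from collections import Counter

# The unique labels found in the «Teaching mode» column.
RAW_LABELS = [
    "Híbrida", "Hibrido", "LinkedIn", "Online", "Presencial", "Teams",
    "Zoom", "Zoom Meet", "Zoom W+Zoom M", "Zoom Web", "Zoom Web.",
    "Zoom WebBrasil", "Zoom Webinar",
]

# Assumption: any label starting with one of these words means "online".
ONLINE_PREFIXES = ("online", "zoom", "teams", "linkedin")


def normalize(value: str) -> str:
    """Strip accents, surrounding spaces and trailing dots, then lowercase."""
    decomposed = unicodedata.normalize("NFKD", value)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_accents.strip().rstrip(".").strip().lower()


def teaching_mode(raw: str) -> str:
    """Map a free-text label to one of three modes, or 'Unknown'."""
    key = normalize(raw)
    if key.startswith("hibrid"):
        return "Hybrid"
    if key.startswith("presencial"):
        return "In-person"
    if key.startswith(ONLINE_PREFIXES):
        return "Online"
    return "Unknown"  # anything unexpected is flagged, not guessed


counts = Counter(teaching_mode(label) for label in RAW_LABELS)
for mode, total in counts.most_common():
    print(mode, total)
```

Note that this counts distinct labels, not sessions: with the real spreadsheet the same function would be applied to every row. Even so, someone still has to decide what a label such as «Zoom W+Zoom M» really means, and that decision belongs to a person, not to the script.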

Culture of data

This simple use case shows that becoming a data-driven organization takes more than a wish from the board of directors.

Transforming a company into one that makes data-driven decisions requires a culture of data to permeate the whole organization. That takes time.

Today, ETL (Extract, Transform, Load) tools help analysts prepare data for further treatment with statistical algorithms. But despite such tools, we are still stuck with the 80/20 data science dilemma: preparing data takes 80% of the time.

One small part of a data culture is defining the data owner role. In the case of the school, who is the data owner of «Teaching mode»?

Basically, the owner of «Teaching mode» has to define that only «Online», «In-person», and «Hybrid» are allowed. If there were a requirement to know which platform the school is using, an additional column would have to be filled with «Zoom», «Teams», «Hangout», «Webex», and so on.
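As a rough illustration of what such a definition could look like once it is written down, here is a small Python sketch. The `Session` class and the two sets of allowed values are hypothetical; a real data owner would decide the actual lists, and the platform list in particular is only the set of examples mentioned above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lists of allowed values, as a data owner might define them.
ALLOWED_MODES = {"Online", "In-person", "Hybrid"}
ALLOWED_PLATFORMS = {"Zoom", "Teams", "Hangout", "Webex"}


@dataclass(frozen=True)
class Session:
    """One row of the sessions list, with mode and platform kept apart."""
    teaching_mode: str
    platform: Optional[str] = None  # only relevant when a platform is used

    def __post_init__(self) -> None:
        # Reject any value the data owner has not explicitly allowed.
        if self.teaching_mode not in ALLOWED_MODES:
            raise ValueError(f"Invalid teaching mode: {self.teaching_mode!r}")
        if self.platform is not None and self.platform not in ALLOWED_PLATFORMS:
            raise ValueError(f"Invalid platform: {self.platform!r}")


ok = Session(teaching_mode="Online", platform="Zoom")

try:
    Session(teaching_mode="Zoom Web.")  # the kind of value found in the list
except ValueError as error:
    print(error)
```

The point is not the code itself but the separation: one field answers "how was the session delivered?" and the other answers "with which tool?", so neither question pollutes the other.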

In companies that use ERPs or CRMs, IT people prevent these mistakes with dropdowns that show users only the valid values. But in the areas that corporate software does not reach, the mistakes will persist.

My aim is for this small pill to open your eyes to this common problem in data science.