Effective data management starts with a data governance framework. Essentially, this is a set of rules defining the types of data you will collect and how that data will be represented. More precisely, the Data Governance Institute defines data governance as “a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods.”
Enterprise data architecture is evolving, driven largely by technological pressure and the growing demand for self-service analytics. Traditionally, implementing data governance has meant working with a functionally and technically experienced data management team to execute the following activities.
The first step in creating a data governance framework is to identify the functional use cases that define your business. A typical retail use case is a customer interaction at the point of sale, while a utility use case might be online bill payment.
Defining use cases helps you think in terms of real-life scenarios. You can then deconstruct each use case to identify the major data entities, their attributes and their relationships to other entities. In a retail business, major data entities include products, customers, stores, warehouses, suppliers and vendors. A utility company’s data entities include customers, field workers, power stations, substations and meters.
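As a minimal sketch of what this deconstruction might produce, the Python dataclasses below model a few retail entities and one relationship. The entity names and attributes are illustrative assumptions, not a prescribed model.

```python
from dataclasses import dataclass

# Illustrative retail data entities and one relationship entity.
# Names and attributes are assumptions for demonstration only.

@dataclass
class Customer:
    customer_id: int
    email: str
    street_address: str
    phone: str

@dataclass
class Product:
    sku: str
    name: str
    supplier_id: int  # relationship: each product references a supplier

@dataclass
class Purchase:
    # Relationship entity: links a customer to a product at a store.
    customer_id: int
    sku: str
    store_id: int
    quantity: int
```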
Once you’ve established your major data entities, you need to define acceptable representations of that data. A retail customer might be defined as someone who makes a purchase at a store or online and has provided an email address, street address and phone number. A utility customer might be defined in terms of a monthly billing relationship, with key related entities such as service point, usage and rates.
With this level of detail, you can start to define data quality. For example, every phone number must be 10 digits, and every email address must include the “@” symbol followed by a domain name. All departments must also agree to represent vendors, materials and other shared data in the same way. Once you have defined a set of data rules, you can implement them in an IT system to maintain the quality of your data throughout its life cycle.
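A minimal sketch of how the two example rules above might be enforced in code follows; the regular expressions are illustrative assumptions, not production-grade validators.

```python
import re

# Data quality rules from the text: a 10-digit phone number and an
# email address containing "@" followed by a domain name.

def is_valid_phone(phone: str) -> bool:
    digits = re.sub(r"\D", "", phone)  # strip punctuation and spaces
    return len(digits) == 10

def is_valid_email(email: str) -> bool:
    # Requires an "@" followed by a domain with at least one dot.
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email) is not None

print(is_valid_phone("(555) 123-4567"))    # True
print(is_valid_email("jane@example.com"))  # True
print(is_valid_email("jane@localhost"))    # False under this rule
```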
Historically, it was accepted that these implementations would involve lengthy delivery timelines before end users received the high-quality data, tools and artifacts they had originally requested. But data volumes and sources are growing dramatically, data is being delivered at ever-greater velocity, and end users increasingly expect self-service analytics. The traditional implementation approach is no longer suitable for meeting the needs of an organization’s decision-makers.
The enterprise’s data architecture must evolve to deliver the speed and agility these information demands require. An experienced team will recognize a common set of characteristics among organizations that have adapted to meet these data management challenges:
- Collaboration. Fostering collaboration among business, technology and leadership groups helps establish data as a shared asset, develop a common vocabulary and determine how best to manage data throughout its life cycle.
- Promotion of self-service capabilities. Self-service analytics gets data into the hands of decision-makers as quickly as possible. Empowering end users with this capability reduces reliance on project delivery timelines, allowing the enterprise to react, anticipate and plan.
- Focus on capturing metadata. Metadata is key to enabling analytics: it allows an organization to catalog its data, which in turn supports many types of analyses (see the sketch after this list).
- Evolution from historical reporting to predictive and prescriptive analytics. As an organization better understands, manages and derives insights from its data, it can unlock higher-value reporting capabilities.
- Adoption of lightweight toolsets/platforms. From a technical perspective, developing speed and agility means adopting tools that integrate easily with the organization’s applications and data sources, carry a minimal infrastructure footprint and can extract the metadata necessary for analysis, as shown below.
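As a minimal sketch of the last two points, the snippet below uses SQLAlchemy’s inspection API, one example of a lightweight tool, to extract table and column metadata into a simple catalog. The in-memory SQLite database and its customers table are illustrative stand-ins for an organization’s real data sources.

```python
from sqlalchemy import create_engine, inspect, text

# Stand-in data source: an in-memory SQLite database with one table.
engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE customers ("
        "customer_id INTEGER PRIMARY KEY, email TEXT, phone TEXT)"
    ))

# Walk the source and record each table's columns in a simple catalog.
inspector = inspect(engine)
catalog = {
    table: [
        {"name": col["name"], "type": str(col["type"]),
         "nullable": col["nullable"]}
        for col in inspector.get_columns(table)
    ]
    for table in inspector.get_table_names()
}

print(catalog)
# {'customers': [{'name': 'customer_id', 'type': 'INTEGER', ...}, ...]}
```

In practice, a catalog like this would be persisted and enriched with business definitions and ownership, but the same extract-and-record pattern applies across data sources.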