Introduction
With the recent changes introduced by the No Child Left Behind
Act, signed into law on January 8, 2002, State Departments of
Education, along with local education agencies, have struggled to
interpret, understand, and respond to the new demands of the
legislation around the issue of accountability. As stated in the Learning
First Alliance Summary in January of 2002 and 2003, “While the
changes to standards and assessments are substantial, the
changes in accountability are more far reaching. Some of these
requirements apply to all districts and schools while others
apply only to districts and schools receiving funds under
Title I.”
Nowhere does accountability have a greater impact on the
education process than at the local district and school level.
Perhaps there was a time, not so long ago, when a priori decision
making was the norm in the delivery of educational services, but
rest assured that time has passed in American education. Now
decisions must be based on reality. Every
person involved in the education process must be on the same
page, see the same facts and arrive at the same decisions.
Just the Facts of Data Quality
Webster’s Dictionary defines a fact as something that has
actually happened or is true. The English Thesaurus lists
information as an alternative term to the word fact. We start
here simply to state two fundamental propositions essential in
meeting the demands of accountability. First, accountability
decision making must be based on information that accurately
represents what actually happened. Second, you can only achieve
the first by leveraging your data as a strategic asset.
Data is not information. This seems like an obvious statement to
make, but the two terms “data” and “information” are often used
interchangeably. Simply stated, raw data is the numbers and
letters collected about an organization and its day-to-day
activities. Information represents data that has been given a
context of meaning for users and consumers of that information.
Although data and information are directly related to one
another, the two are distinctly different.
The failure to distinguish between data and information seems,
at least in part, to contribute to a lack of understanding of why
data quality must be addressed. The reality is that data
is the foundation on which information is built. This crucial
fact is the reason it is imperative to address the issue of
data quality. The relationship between data quality, data,
and information is demonstrated in Figure 1.

http://thephantomwriters.com/client-img/CanDoEDU-img1.gif
As shown in Figure 1, as the level of quality in the data
increases, so too does the level of quality in the information.
In layman’s terms, if the quality of the data is bad, then the
quality of the information produced will also be bad. If you
don’t have good information, then what do you have? Bad
information may be even worse than worthless, because consumers
of the information will be making education decisions based on
it. Such decision making could lead to outcomes that adversely
affect the education process.
The whole premise of data-driven decision making is to get
information that will allow administrators and educators to make
better decisions. How can you make good, sound decisions when
the information that is being used to make those decisions is
derived from data that is less than the highest possible quality?
In short, the old adage of garbage in, garbage out is forever
true. That is why an understanding of the relationship between
data quality, data, and information is an integral step toward
establishing data quality.
Consequences of Ignoring Data Quality
It is also important to understand the potential impact of data
quality on education and specifically accountability when the
issue of data quality is not adequately addressed. Three
possible outcomes of ignoring data quality are illustrated in
Figures 2 – 4.
As shown in Figure 2, the perfect outcome represents a scenario
in which the Accountability View completely mirrors the Actual
View of education. That is to say, there is no need to address
data quality when we look at it from this outcome. The data
used to create information gives a complete and accurate view
of the education process without having to be concerned about
data quality.

http://thephantomwriters.com/client-img/CanDoEDU-img2.gif
This scenario is utopian in that it rarely happens in the real
world. One of the goals of information is to accurately
represent reality. Data are collected by human beings about
human beings for human beings. In short, human hands are
involved in the process at every stage. Human beings are
imperfect creatures in that they make mistakes, and that means
you can expect to have imperfections in your data. This is not
meant as a criticism. It is simply a fact. Consequently, either
the issue of data quality must be addressed, or you are more
likely to encounter one of the two outcomes presented in Figures
3 and 4.
A second possible outcome of ignoring data quality is
demonstrated in Figure 3. In this scenario, the Accountability
View under-represents the Actual View of education. Such a
misrepresentation may lead a decision-maker to allocate resources
to address a perceived issue or problem, when in fact no issue
or problem exists. From that standpoint, those resources are
wasted.

http://thephantomwriters.com/client-img/CanDoEDU-img3.gif
A third possible outcome of ignoring data quality is presented in
Figure 4. In this outcome, the Accountability View over-represents
the Actual View, as demonstrated by the two graphs in Figure 4.
This type of scenario may lead a decision-maker to the conclusion
that resources should not be deployed in certain key strategic
education areas, when in actuality a problem or issue does exist
and should be addressed. Again, this scenario can lead to the
misuse or misallocation of valuable resources.

http://thephantomwriters.com/client-img/CanDoEDU-img4.gif
Building Data Quality
One question often asked is how to address data quality, and that
is the fundamental purpose of this article. Decision makers must
have a starting point for addressing data quality – a common
understanding. After all, it is extremely difficult to know
how to get where you’re going if you don’t know where to start.

http://thephantomwriters.com/client-img/CanDoEDU-img5.gif
The first task is to establish a common understanding of exactly
what data quality means. More to the point, let’s establish a
common ground for communicating about data quality as
demonstrated in Figure 5: The ABC Model of Data Quality.
Good data quality doesn’t happen by itself. Some means of
judging the level of quality must be devised in order to
establish data quality. Thus, the first step toward establishing
quality is to define a standard that may be utilized to assess
data quality. It’s very much like using a yardstick to measure
the length of a line.
The ABC Model of Data Quality consists of three specific criteria
that are recommended to determine good data quality. These
criteria are accuracy, business logic, and completeness. This
approach assumes that good data quality is achieved when data
meet these criteria: that is, when the data are accurate, when
they conform to the logic of the education enterprise, and when
they are complete.
Accuracy
Data are considered to be accurate when the fact that is
represented is true in its representation. For example,
absenteeism may be a problem that needs to be addressed within a
specific school. It certainly makes sense that if a child does
not go to school, then that child cannot learn what is being
taught in the classroom. In this example, let’s say that some of
the absenteeism data are entered with an absence date that falls
on a Saturday or a Sunday. In this scenario, the absenteeism data
are inaccurate, and therefore could lead to misinterpretation and
misinformation about the absenteeism problem at that school.
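The accuracy check described above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions:
the record layout, the field names student_id and absent_on, and
the function name weekend_absences are hypothetical and are not
taken from any actual student information system.

```python
from datetime import date


def weekend_absences(records):
    """Return the absence records dated on a Saturday or a Sunday."""
    # date.weekday() returns 0 for Monday through 6 for Sunday,
    # so values of 5 and 6 identify the weekend.
    return [r for r in records if r["absent_on"].weekday() >= 5]


# Hypothetical sample data: field names and values are illustrative.
absences = [
    {"student_id": "S001", "absent_on": date(2003, 3, 3)},  # Monday
    {"student_id": "S002", "absent_on": date(2003, 3, 8)},  # Saturday
    {"student_id": "S003", "absent_on": date(2003, 3, 9)},  # Sunday
]

flagged = weekend_absences(absences)
for record in flagged:
    print(record["student_id"], record["absent_on"].isoformat())
```

Records flagged this way cannot be true as entered, so they would
need to be corrected at the source before any absenteeism figures
are reported.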
Business Logic
The business of education is a complicated process that is
multifaceted. It is certain then that data must also reflect the
multifaceted nature of the education process, as information used
as a basis for decision making must be inclusive of all facets.
All data should meet this criterion, as data that does not
support the education business logic should not be utilized in
making decisions. This criterion is the ultimate test of data
quality as the whole reason for the data is to support decisions
about the delivery of education. It is in the area of business
logic that the more sophisticated and complicated aspects of data
quality can and should be addressed. A very simple test of
education business logic would be to check for students reported
as absent on days when there is no school, such as a Saturday, a
Sunday, or a holiday. No student can be absent when there is no
school to be absent from. Clearly, such a record is a data
quality issue of business logic.
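A business logic test of this kind might be sketched as follows.
The sketch assumes that a district can supply its own calendar of
holidays; the holiday set, the record layout, and the function
names is_school_day and absences_on_non_school_days are all
hypothetical and stand in for whatever a real district calendar
would provide.

```python
from datetime import date


def is_school_day(day, holidays):
    """A school day is a weekday that is not a listed holiday."""
    # weekday() is 0-4 for Monday through Friday.
    return day.weekday() < 5 and day not in holidays


def absences_on_non_school_days(records, holidays):
    """Return absence records dated on days when school was not held."""
    return [r for r in records
            if not is_school_day(r["absent_on"], holidays)]


# Hypothetical district calendar with a single holiday (a Monday).
holidays = {date(2003, 5, 26)}

# Hypothetical absence records: field names are illustrative.
absences = [
    {"student_id": "S001", "absent_on": date(2003, 5, 23)},  # Friday
    {"student_id": "S002", "absent_on": date(2003, 5, 24)},  # Saturday
    {"student_id": "S003", "absent_on": date(2003, 5, 26)},  # holiday
    {"student_id": "S004", "absent_on": date(2003, 5, 27)},  # Tuesday
]

violations = absences_on_non_school_days(absences, holidays)
for record in violations:
    print(record["student_id"], record["absent_on"].isoformat())
```

Keeping the calendar as an input, rather than hard-coding it,
reflects the point that the rule belongs to the education
enterprise and not to the software.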
Completeness
Data are complete when the values contained within the data
conform to a predefined set of acceptable values that represents
the total picture. That is to say, all the data values that are
supposed to be there are actually there. For example, suppose
test data with 100,000 test results are missing a value for
50,000 of those results in the data element representing gender,
when that data element is supposed to have no missing data at
all. Here gender is represented with values of “M” for males and
“F” for females. In this scenario, upon looking at standardized
test scores for a mandated State test by gender, it is found that
males have higher Math scores. Further examination of the gender
data finds that the data primarily contain a value of “M”. Many
of the female test results have no gender value recorded, when
in fact the data should identify both male and female test
results. In such scenarios, the information is affected by the
incomplete data, and any analysis or interpretation of the data
will lead to an incomplete picture of the test results.
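A completeness check along these lines could be sketched as shown
below. It counts how many test results have no gender value and
how many carry a value outside the acceptable set of “M” and “F”.
The field name gender, the function name completeness_report, and
the sample records are assumptions made for illustration only.

```python
ALLOWED_GENDER = {"M", "F"}  # the predefined set of acceptable values


def completeness_report(results, field="gender", allowed=ALLOWED_GENDER):
    """Count missing and unacceptable values for one data element."""
    total = len(results)
    missing = 0
    invalid = 0
    for result in results:
        value = result.get(field)
        if value is None or value == "":
            missing += 1   # nothing was recorded at all
        elif value not in allowed:
            invalid += 1   # recorded, but not an acceptable value
    share = missing / total if total else 0.0
    return {"total": total, "missing": missing,
            "invalid": invalid, "missing_share": share}


# Hypothetical test results: scores and field names are illustrative.
test_results = [
    {"score": 71, "gender": "M"},
    {"score": 84, "gender": "F"},
    {"score": 65, "gender": "M"},
    {"score": 90, "gender": None},
    {"score": 88, "gender": ""},
    {"score": 77, "gender": "X"},
]

report = completeness_report(test_results)
print(report)
```

A high share of missing values is a signal that any comparison of
scores by gender should wait until the gaps are explained or
filled.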
Summary and Conclusion
Data quality has largely been defined as something technical,
since it is directly related to data. While it is true that it
is a data management task, the truth of the matter is that it
has to be an organizational responsibility that is shared at
all levels from data collection to information dissemination
and consumption. Educators, administrators, and technical
personnel must all be involved in data quality.
Technology cannot be a substitute for human judgment, nor can
it eradicate human error. This fact is why education decision
makers must consider the issue of data quality where
accountability is concerned. Given the dire consequences of No
Child Left Behind such as reduction of funds, firing of teachers,
principals and administrators, or the redistricting of a local
district, can local district and school decision makers afford
to make the wrong decisions? When the impact of accountability
is considered, can you ignore the consequences of bad decision
making, such as allocating resources to address a problem that
does not exist or, vice versa, failing to allocate resources to
address a problem that does exist? Can we rely on the possibility of
a Perfect Outcome in light of the current environment where
administrators and educators already have to do more with fewer
resources?
The ABC Model provides a theoretical foundation by which data
quality can be judged by education decision makers. The model
presented here is not intended to establish a standard for data
quality in education; it is not a practitioner’s guide. However,
data quality is an issue that must be addressed even in the
difficult situation currently faced by local districts and
schools. The ABC Model of Data Quality outlines a simple
approach. In such a situation, simplicity works well.