A test is said to be valid if it measures what it is
intended to measure.
Construct validity is the term used for this general, overarching notion of
validity. A claim that a test is valid has to be supported by evidence, and
the different kinds of evidence give us the different types of validity
described below.
1. Content validity
A test has content validity if its content
constitutes a representative sample of the language skills, structures,
etc. with which it is meant to be concerned. In other words, it should be a proper
sample of the relevant structures dealt with in the teaching programme. The test
should not be based on only one particular section of the syllabus. If it is,
there will be a negative backwash effect, that is, the test will have a harmful
influence on teaching and learning.
A document specifying the skills and structures
to be tested (a test specification) is necessary to create a good test. For
content validity, the test should be a faithful representation of this
specification. Content validity can therefore be judged by comparing the test
specification with the test content. Usually this validation is done by someone
who was not involved in constructing the test.
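The comparison between specification and test content can be roughed out as a
simple coverage check. The sketch below is only an illustration: the function
name `content_coverage`, the skill labels and the item tags are all hypothetical,
and a count like this can support, but not replace, the judgment of an
independent reviewer.

```python
def content_coverage(specification, test_items):
    """Compare a test specification with the content of a test.

    specification: iterable of skill/structure labels the test should sample.
    test_items: dict mapping an item id to the labels that item exercises.
    Returns (share of the specification covered, labels never tested,
             labels tested but absent from the specification).
    """
    spec = set(specification)
    if not spec:
        raise ValueError("the specification must list at least one label")
    tested = {label for labels in test_items.values() for label in labels}
    covered = spec & tested
    missing = sorted(spec - tested)    # specified but never tested
    off_spec = sorted(tested - spec)   # tested but not in the specification
    return len(covered) / len(spec), missing, off_spec


# Hypothetical specification and item tags, for illustration only.
specification = ["past tense", "conditionals", "passive voice", "reported speech"]
test_items = {
    "Q1": ["past tense"],
    "Q2": ["past tense", "conditionals"],
    "Q3": ["articles"],
}

coverage, missing, off_spec = content_coverage(specification, test_items)
print(f"coverage: {coverage:.0%}")
print("not tested:", missing)
print("outside the specification:", off_spec)
```

A test that draws on only part of the specification shows up here as a low
coverage figure and a long list of untested labels.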
What is the significance of content
validity? First, the greater the content validity, the more accurate the
measurement is likely to be. Second, a test that lacks content validity is
likely to cause negative backwash. Writing a full test specification is
therefore a necessary step towards ensuring a test’s content validity.
2. Criterion-related validity
Criterion-related validity is defined as
the degree to which the results of the test agree with another set of results
provided by some independent, highly dependable assessment of the candidate’s
language ability. This independent assessment is the criterion against which
the test is validated here.
There are two kinds of criterion-related
validity.
a. Concurrent validity
Concurrent validity is established when a test and its criterion are
administered at about the same time. For example, it may not be
practical to test every item in the test specification. In such cases, a shorter
test is administered on a large scale to save time, money and effort. To check
that the shorter test is valid, administer the full test to a selected sample of
students, using four scorers to ensure reliable scoring. This is the criterion
against which the shorter test is validated. The two sets of scores are then
compared using a correlation coefficient. If there is a high level of
agreement, the shorter test is said to be valid. The purpose of the
test determines what level of agreement should be expected: a high-stakes test
should show closer agreement with the criterion, while a low-stakes test can
accept a lower level. In informal situations, teachers’
judgment can also serve as the criterion against which concurrent validation is done.
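To make the comparison concrete, here is a minimal sketch of computing a
correlation coefficient between the short test and the full test. It assumes
the Pearson product-moment coefficient, which the text above does not specify,
and the scores are invented for illustration; `pearson_r` is a hypothetical
helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cross / sqrt(ss_x * ss_y)


# Invented scores for the same six students (assumption, not real data).
short_test = [12, 15, 9, 18, 14, 11]   # the short, large-scale test
full_test = [55, 68, 41, 80, 60, 52]   # the full test used as criterion

r = pearson_r(short_test, full_test)
print(f"validity coefficient: {r:.2f}")
```

A coefficient close to 1 indicates close agreement with the criterion; how
close it needs to be depends, as noted above, on the stakes of the test.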
b. Predictive validity
Predictive validity is a kind of criterion-related validity: the
degree to which a test can predict a candidate’s future performance. For example,
a test like the GRE is meant to predict whether a candidate will be able to
undertake graduate studies at a university.
Here, the criterion can be an assessment of the candidate’s language ability
made by his or her supervisor at the university, or the outcome of a course.
The test scores are validated against the chosen criterion, and the level of
correlation that can be expected depends on which criterion is chosen.