The ELT Blog: Validity of a Test

A test is said to be valid if it measures what it is intended to measure.

Construct validity is the most general notion of validity: the extent to which a test measures the underlying ability (the construct) it is meant to measure.

A claim of validity has to be supported by evidence. The different types of validity correspond to different kinds of evidence. They are described below.

1. Content validity

A test has content validity if its content constitutes a fair, representative sample of the language skills, structures, etc. with which it is meant to be concerned. In other words, it is a proper sample of the relevant structures dealt with in the teaching programme. The test should not be based on only one particular section of the syllabus. If it is, it is likely to have a negative backwash effect on teaching and learning.

A document specifying the skills and structures to be tested (the test specification) is necessary to create a good test. For content validity, the test is expected to be a sound representation of this specification. We can compare the test specification with the test content to judge the content validity of the test. Usually this validation is done by someone who was not involved in constructing the test.
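The comparison between specification and test content can be sketched in a few lines of Python. This is only an illustration, not a standard procedure: the function name `coverage_report`, the area labels and the item tags are all hypothetical, and it assumes each test item has been tagged with the one area of the specification it targets.

```python
from collections import Counter


def coverage_report(specification, test_items):
    """Compare the areas listed in a test specification with the
    areas that the test items actually target."""
    spec_areas = set(specification)
    if not spec_areas:
        raise ValueError("the specification must list at least one area")
    items_per_area = Counter(test_items)
    tested_areas = set(items_per_area)
    return {
        # share of specified areas sampled by at least one item
        "coverage": len(spec_areas & tested_areas) / len(spec_areas),
        # specified areas the test never samples
        "missing": sorted(spec_areas - tested_areas),
        # areas tested although they are not in the specification
        "unexpected": sorted(tested_areas - spec_areas),
        "items_per_area": dict(items_per_area),
    }


# Hypothetical specification and item tags, for illustration only.
specification = ["past simple", "present perfect", "conditionals", "passive voice"]
test_items = ["past simple", "past simple", "past simple", "conditionals"]

report = coverage_report(specification, test_items)
print(report["coverage"])
print(report["missing"])
```

A test like the one above, which draws most of its items from one area and leaves others untested, is the kind of case the independent reviewer would flag. The judgment itself still rests with that reviewer; the count only makes the gaps easier to see.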

What is the significance of content validity? First, the greater the content validity, the more accurate the measurement will be. Second, if there is no content validity, there is likely to be negative backwash. Therefore, writing a full test specification is a necessary step to ensure a test’s content validity.

2. Criterion-related validity

Criterion-related validity is defined as the degree to which the results of the test agree with another set of results provided by some independent, highly dependable assessment of the candidate’s language ability. This independent assessment is the criterion against which the test is validated here.

There are two kinds of criterion-related validity.

a. Concurrent validity

Concurrent validity is established when a test and its criterion are administered at about the same time. This can be done in several ways. For example, it may not be practical to test every item in the test specification, so a short test is administered on a large scale to save time, money and effort. To check that the short test is valid, a full test is also given to a selected sample of students, using four scorers to ensure reliable scoring. This full test is the criterion against which the shorter test is validated. The two sets of scores are then compared using a correlation coefficient. If there is a high level of agreement, the shorter test is said to be valid. The purpose of the test determines what level of agreement is to be expected: a high-stakes test should look for greater agreement with the criterion, while a low-stakes test can accept less. In informal situations, the teachers’ judgment can also serve as the criterion against which concurrent validation is done.
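As a rough illustration of comparing the two sets of scores, the sketch below computes a Pearson correlation coefficient in plain Python. The post does not say which coefficient to use, so Pearson is an assumption here, and the scores are invented for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # sum of products of deviations from the means
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # sums of squared deviations for each list
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return covariation / sqrt(spread_x * spread_y)


# Invented scores for the same six students: short test vs. full test (criterion).
short_test = [12, 15, 9, 18, 14, 7]
full_test = [58, 71, 45, 88, 66, 39]

r = pearson_r(short_test, full_test)
print(round(r, 2))
```

A value close to 1 means the short test ranks and spaces the students much as the full test does. How close is close enough depends, as noted above, on the purpose of the test.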

b. Predictive validity

Predictive validity is a kind of criterion-related validity: the degree to which a test can predict a candidate’s future performance. For example, a test like the GRE is used to predict whether a candidate will be able to undertake graduate studies at a university.

Here, the criterion can be an assessment of the candidate’s language ability made by his/her supervisor at the university, or the result/outcome of a course. The level of correlation to be expected depends on the criterion chosen. The test score is then validated against that criterion.

Thursday, 6 October 2016
