
Projects
The Multilingual Glossary of Language Testing Terms was originally developed by ALTE members with funding from the European Commission's LINGUA programme (94-09/1801/UK - lll). The idea of producing a multilingual glossary of assessment terms grew out of the needs experienced by the members of ALTE - the difficulties of talking about language testing issues in the range of languages of ALTE member organisations.
The glossary is of use not only to members of ALTE but to many others who are involved in language testing and assessment. The original glossary contains entries in ten languages: Catalan, Danish, Dutch, English, French, German, Irish, Italian, Portuguese and Spanish. It is available in paperback and on CD-ROM.
Available from Cambridge University Press. For more information go to publications.
The Glossary has been produced in other languages by ALTE partners. For instance, under the TiPS Development Programme for Testing in Polish and Slovene – a Socrates Lingua 2 project – an English, Polish and Slovene version was produced. Similarly, under the Socrates-funded Devprothell project, a group of ALTE partners produced a glossary in Estonian, Hungarian, Latvian and Lithuanian.
Some example terms from the glossary follow:
ACCREDITATION
The granting of recognition of a test, usually by an official
body such as a government department, examinations board, etc.
ADMINISTRATION
The date or period during which a test takes place. Many tests
have a fixed date of administration several times a year, while
others may be administered on demand.
ANCHOR ITEM
An item which is included in two or more tests. Anchor items have
known characteristics, and form one section of a new version of
a test in order to provide information about that test and the
candidates who have taken it, e.g. to calibrate a new test to
a measurement scale.
ASSESSOR
Someone who assigns a score to a candidate's performance
in a test, using subjective judgement to do so. Assessors are
normally qualified in the relevant field, and are required to
undergo a process of training and standardisation. In oral testing
the roles of Assessor and Interlocutor are sometimes distinguished.
Also referred to as Examiner or Rater.
CALIBRATION
The process of determining the scale of a test or tests. Calibration
may involve anchoring items from different tests to a common difficulty
scale (the theta scale). When a test is constructed from calibrated
items, then scores on the test indicate the candidate's ability,
i.e. their location on the theta scale.
CLERICAL MARKING
A method of marking in which Markers do not need to exercise any
special expertise or subjective judgement. They mark by following
a mark scheme which specifies all acceptable responses to each
test item.
COMMUNICATIVE TASK / ACTIVITY
A classroom or examination exercise which involves or tests an
individual's ability to deal with a communication event.
COMPONENT
Part of an examination, often presented as a separate test, with
its own instruction booklet and time limit. Components are often
skills-based, and have titles such as Listening Comprehension
or Composition. Also referred to as subtest.
COMPUTERISED MARKING (SCORING)
Various ways of using computer systems to minimise error in the
marking of objective tests. For example, this can be done by scanning
information from the candidate's mark sheet by means of an
optical mark reader, and producing data which can be used to provide
scores or analyses.
CONJUNCTION
A word used to connect clauses or sentences or words in the same
clause: for example 'and', 'but', 'if'.
CONTENT ANALYSIS
A means of describing and analysing the content of test materials.
This analysis is necessary in order to ensure that the content
of the test meets its specification. It is essential in establishing
content and construct validity.
DESCRIPTOR
A brief description accompanying a band on a rating scale, which
summarises the degree of proficiency or type of performance expected
for a candidate to achieve that particular score.
DIRECTED WRITING TASK
See definition for Guided Writing Task.
DISCRETE ITEM
A self-contained item. It is not linked to a text, other items
or any supplementary material. An example of an item type used
in this way is multiple-choice.
DISCRIMINATION
The power of an item to discriminate between weaker and stronger
candidates. Various indices of discrimination are used. Some (e.g.
point-biserial, biserial) are based on a correlation between the
score on the item and a criterion, such as total score on the
test or some external measure of proficiency. Others are based
on the difference in the item's difficulty for low and high
ability groups. In item response theory, the 2- and 3-parameter
models estimate item discrimination as the A-parameter.
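
As an illustration of the correlation-based indices mentioned above, the following minimal sketch computes a point-biserial index as the Pearson correlation between a right/wrong item score and a criterion, here the total test score. The function name and the sample data are hypothetical and are not taken from the glossary.

```python
from math import sqrt

def point_biserial(item_scores, criterion_scores):
    """Pearson correlation between a 0/1 item score and a criterion score."""
    n = len(item_scores)
    if n != len(criterion_scores) or n < 2:
        raise ValueError("need two equal-length lists with at least two candidates")
    mean_x = sum(item_scores) / n
    mean_y = sum(criterion_scores) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, criterion_scores))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in criterion_scores)
    if var_x == 0 or var_y == 0:
        # Everyone answered the same way, or all totals are equal.
        raise ValueError("index is undefined when either variable has no variance")
    return cov / sqrt(var_x * var_y)

# Hypothetical data: eight candidates, one item, and their total test scores.
item = [1, 1, 1, 0, 1, 0, 0, 0]
total = [38, 35, 33, 30, 28, 22, 20, 15]
print(round(point_biserial(item, total), 2))
```

A value near 1 suggests that stronger candidates tend to answer the item correctly, while a value near 0 or below suggests the item does not separate weaker from stronger candidates.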
DISCURSIVE COMPOSITION
A writing task in which the candidate has to discuss a topic on
which various views can be held, or argue in support of personal
opinions.
DOUBLE MARKING
A method of assessing performance in which two individuals independently
assess candidate performance on a test.
EDITING
The process by which examination materials submitted by item writers
are modified and put into the form in which they will appear on
an examination paper.
EXAMINER
Refer to definition for Assessor.
FACILITY INDEX
The proportion of correct responses to an item, expressed on a
scale of 0 to 1. It is also sometimes expressed as a percentage.
Also referred to as facility value or p-value.
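
A minimal sketch of this calculation, assuming each response is recorded as 1 (correct) or 0 (incorrect); the function name and the sample responses are hypothetical.

```python
def facility_index(responses):
    """Proportion of correct responses to one item (1 = correct, 0 = incorrect)."""
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)

# Hypothetical responses of ten candidates to a single item.
item_responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
p = facility_index(item_responses)
print(f"facility index: {p:.2f} ({p:.0%})")  # facility index: 0.70 (70%)
```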
GAP-FILLING ITEM
Any type of item which requires the candidate to insert some written
material – letters, numbers, single words, phrases, sentences
or paragraphs – into spaces in a text. The response may be
supplied by the candidate or selected from a set of options.
GRADE
A test score may be reported to the candidate as a grade, for
example on a scale of A to E, where A is the highest grade available,
B is a good pass, C a pass and D and E are failing grades.
GRADING
The process of converting test scores or marks into grades.
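
The conversion can be sketched as a lookup against grade boundaries. The boundaries below are purely hypothetical, as is the function name; in practice each examination sets its own.

```python
# Hypothetical grade boundaries (minimum percentage mark for each grade),
# listed from highest to lowest. Marks below the last boundary get an E.
BOUNDARIES = [(80, "A"), (70, "B"), (60, "C"), (50, "D")]

def grade_for(mark):
    """Convert a percentage mark (0-100) into a grade from A to E."""
    if not 0 <= mark <= 100:
        raise ValueError("mark must be between 0 and 100")
    for minimum, grade in BOUNDARIES:
        if mark >= minimum:
            return grade
    return "E"

for mark in (85, 60, 42):
    print(mark, grade_for(mark))
```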
GUIDED WRITING TASK
A task which involves the candidate in the production of a written
text, where graphic or textual information, such as pictures,
letters, postcards and instructions, is used to control and standardise
the expected response.
INFORMATION TRANSFER
A technique of testing which involves taking information given
in a certain form and presenting it in a different form. Examples
of such tasks are: taking information from a text and using it
to label a diagram; rewriting an informal note as a formal announcement.
INTERVAL SCALE
A scale of measurement on which the distance between any two adjacent
units of measurement is the same, but in which there is no absolute
zero point.
INTONATION
The tone given to words with the effect that, for example, a question
can be distinguished from a statement.
ITEM
Each testing point in a test which is given a separate mark or
marks. Examples are: one gap in a cloze test; one multiple-choice
question with three or four options; one sentence for grammatical
transformation; one question to which a sentence-length response
is expected.
ITEM BANKING
An approach to the management of test items which entails storing
information about items so that tests of known content and difficulty
can be constructed. Normally, the approach makes use of a computer
database, and is based on latent trait theory, which means that
items can be related to each other by means of a common difficulty
scale.
ITEM RESPONSE THEORY
A group of mathematical models for relating an individual's
test performance to that individual's level of ability. These
models are based on the fundamental theory that an individual's
expected performance on a particular test question, or item, is
a function of both the level of difficulty of the item and the
individual's level of ability.
LANGUAGE FOR SPECIFIC PURPOSES (LSP)
Language teaching or testing which focuses on the area of language
used for a particular activity or profession; for example, English
for Air Traffic Control, Spanish for Commerce.
LEXIS
A term used to refer to vocabulary.
LINK ITEM
Refer to definition for Anchor Item.
MARK
The outcome of an examination, often expressed as a percentage.
Because of adjustments such as heavier weighting for some items,
the mark is not always the same as the total score.
MARK SCHEME
A list of all the acceptable responses to the items in a test.
A mark scheme makes it possible for a Marker to assign a score
to a test accurately.
MARKER
Someone who assigns a score to a candidate's responses to
a written test. This may involve the use of expert judgement or,
in the case of a clerical Marker, the relatively unskilled application
of a mark scheme.
MARKING
Assigning a mark to a candidate's responses to a test. This
may involve professional judgement, or the application of a mark
scheme which lists all acceptable responses.
MATCHING TASK
A test type which involves bringing together elements from two
separate lists. One kind of matching test consists of selecting
the correct phrase to complete each of a number of unfinished
sentences. A type used in tests of reading comprehension involves
choosing from a list something like a holiday or a book to suit
a person whose particular requirements are described.
MEASUREMENT
Generally, the process of finding the amount of something by comparison
with a fixed unit, e.g. using a ruler to measure length. In the
social sciences, measurement often refers to the quantification
of characteristics of persons, such as language proficiency.
MULTIPLE-CHOICE GAP-FILLING
A type of test item in which the candidate's task is to select
from a set of options the correct word or phrase to insert into
a space in a text.
MULTIPLE-CHOICE ITEM
A type of test item which consists of a question or incomplete
sentence (stem), with a choice of answers or ways of completing
the sentence. The candidate's task is to choose the correct
option (key) from a set of three, four or five possibilities,
and no production of language is involved. For this reason, multiple-choice
items are normally used in tests of reading and listening. They
may be discrete or text-based.
MULTIPLE-MATCHING TASK
A test task in which a number of questions or sentence completion
items, generally based on a reading text, are set. The responses
are provided in the form of a bank of words or phrases, each of
which can be used an unlimited number of times. The advantage
is that options are not removed as the candidate works through
the items (as with other forms of matching) so that the task does
not become progressively easier.
NARRATIVE TEXT
A text in which a story is told or events recounted.
OBJECTIVE TEST
A test which can be scored by applying a mark scheme, without
the need to bring expert opinion or subjective judgement to the
task.
OPEN-ENDED QUESTION
A type of item or task in a written test which requires the candidate
to supply, as opposed to select, a response. The purpose of this
kind of item is to elicit a relatively unconstrained response,
which may vary in length from a few words to an extended essay.
The mark scheme therefore allows for a range of acceptable answers.
OPTICAL MARK READER (OMR)
An electronic device used for scanning information directly from
mark sheets or answer sheets. Candidates or Examiners can mark
item responses or tasks on a mark sheet and this information can
be directly read into the computer. Also referred to as scanner.
PAPER CONSTRUCTION
The process of selecting the items which will make up an examination
paper, and adding rubrics and an answer key.
PREPOSITION
A word which expresses the relationship between a noun or pronoun
and another word: for example 'on', 'with', 'for'.
PRETESTING
A stage in the development of test materials at which items are
tried out with representative samples from the target population
in order to determine their difficulty. Following statistical
analysis, those items that are considered satisfactory can be
used in live tests.
PROMPT
In tests of speaking or writing, graphic materials or texts designed
to elicit a response from the candidate.
PROOF-READING TASK
A test task which involves checking a text for errors of a specified
type, e.g. spelling or structure. Part of the task may also consist
of marking errors and supplying correct forms.
QUESTION
Sometimes used to refer to a test task or item.
RASCH MODEL
A mathematical model, also known as the simple logistic model,
which posits a relationship between the probability of a person
completing a task and the difference between the ability of the
person and the difficulty of the task. Mathematically equivalent
to the one-parameter model in item response theory. The Rasch
model has been extended in various ways, e.g. to handle scalar
responses or multiple facets accounting for the difficulty
of a task.
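
The relationship described above can be sketched with the standard logistic form of the model, in which the probability of success depends only on the difference between ability and difficulty, both expressed on the same scale. The function and parameter names are illustrative.

```python
from math import exp

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the simple logistic model.

    Both arguments are on the same (logit) scale; only their difference matters.
    """
    return 1.0 / (1.0 + exp(-(ability - difficulty)))

# When ability equals difficulty the probability is 0.5; it rises as the
# person's ability exceeds the task's difficulty and falls when it is lower.
for ability in (-1.0, 0.0, 1.0):
    print(ability, round(rasch_probability(ability, difficulty=0.0), 3))
```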
RAW SCORE
A test score that has not been statistically manipulated by any
transformation, weighting or re-scaling.
REGISTER
A distinct variety of speech or writing characteristic of a particular
activity or a particular degree of formality.
ROLE PLAY
A task type which is sometimes used in speaking tests in which
candidates have to imagine themselves in a specific situation
or adopt specific roles.
RUBRIC
The instructions given to candidates to guide their responses
to a particular test task.
SCALE
A set of numbers or categories for measuring something. Four types
of measurement scale are distinguished - nominal, ordinal, interval
and ratio.
SCALE DESCRIPTOR
Refer to definition for Descriptor.
SCAN
To read something quickly, in order to look for a specific piece
of information or answer to a question. A scanning exercise often
consists of questions placed before a text.
SCRIPT
The paper containing a candidate's responses to a test, used
particularly of open-ended task types.
SEMI-AUTHENTIC TEXT
A text taken from a real-life source that has been edited for
use in a test, e.g. to adapt the vocabulary and/or grammar to
the level of the candidates.
SENTENCE COMPLETION
An item type in which only half of a sentence is given. The candidate's
task is to complete the sentence, either by supplying suitable
words (possibly based on the reading of a text) or by choosing
them from various options given.
SENTENCE TRANSFORMATION
An item type in which a complete sentence is given as a prompt,
followed by the first one or two words of a second sentence which
expresses the content of the first in a different grammatical
form. For example, the first sentence may be active, and the candidate's
task is to present the identical content in passive form.
SETTING
The whole process by which examination materials are produced
and papers constructed.
SKIM
To read rapidly so that the main point is understood, although
details will be missed.
SPECIFICATIONS
A description of the characteristics of an examination, including
what is tested, how it is tested, details such as number and length
of papers, item types used, etc.
STRESS
The emphasis put on a syllable or word in spoken language.
STRUCTURAL COMPETENCE
Structural competence refers to an individual's ability in
and knowledge of the grammatical structures of a language.
SYLLABUS
A detailed document which lists all the areas covered in a particular
programme of study, and the order in which content is presented.
SYNONYM
A word which means the same, or almost the same, as another;
for example, 'shut' and 'close' in 'shut the door' and 'close the door'.
SYNTACTIC STRUCTURES
The grammatical structures of language.
'TABLE-TOP' MARKING
A method of marking examination papers which involves gathering
all the Markers together to mark for a limited period of time,
rather than sending papers out to be marked by people in their
own homes.
TASK
A combination of rubric, input and response. For example, a reading
text with several multiple-choice items, all of which can be responded
to by referring to a single rubric.
TEST METHOD CHARACTERISTICS
The defining characteristics of different test methods. These
may include environment, rubric, language of instructions, format,
etc.
TEXT
A piece of connected discourse, written or spoken, used as the
basis for a set of test items.
THRESHOLD LEVEL
An influential specification in functional terms of a basic level
of foreign language competence, published by the Council of Europe
in 1976 for English, and updated in 1990. Versions have since
been produced for a number of European languages.
TRANSFORMATION ITEM
Refer to definition for sentence transformation.
UTTERANCE
A chain of spoken words.
VETTING
A stage in the cycle of test production at which the test developers
assess materials commissioned from item writers and decide which
should be rejected as not fulfilling the specifications of the
test, and which can go forward to the editing stage.
WAYSTAGE LEVEL
A specification of an elementary level of foreign language competence
first published by the Council of Europe in 1977 for English and
revised in 1990. It provides a less demanding objective than Threshold,
being estimated to have approximately half the Threshold learning
load.
WEIGHTING
The assignment of a different number of maximum points to a test
item, task or component in order to change its relative contribution
in relation to other parts of the same test. For example, if double
marks are given to all the items in Task One of a test, Task One
will account for a greater proportion of the total score than
other tasks.
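
The Task One example above can be worked through in a short sketch. The task names, scores and maximum marks are hypothetical, and each task is assumed to be marked out of 10 before weighting.

```python
def weighted_total(task_scores, weights):
    """Sum task scores after multiplying each by its weight (default 1)."""
    return sum(score * weights.get(task, 1) for task, score in task_scores.items())

# Hypothetical candidate: raw scores on three tasks, each marked out of 10.
raw_scores = {"Task One": 8, "Task Two": 6, "Task Three": 7}
max_scores = {"Task One": 10, "Task Two": 10, "Task Three": 10}
weights = {"Task One": 2}  # double marks for every item in Task One

total = weighted_total(raw_scores, weights)    # 8*2 + 6 + 7 = 29
maximum = weighted_total(max_scores, weights)  # 10*2 + 10 + 10 = 40
share = max_scores["Task One"] * weights["Task One"] / maximum
print(total, maximum, share)  # 29 40 0.5
```

Without weighting, Task One would contribute one third of the available marks; with double marks it contributes half.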
WORD FORMATION
An item type where the candidate has to produce a form of a word
based on another form of the same word which is given as input.