Applied Psychology. Reliability: the degree of consistency of results obtained from repeated applications of a measurement technique. Basic requirements for criteria.

Test 3. Research methods

1. Data on real human behavior obtained through external observation are called:

a) L – data;

b) Q-data;

c) T-data;

d) Z-data.

2. The type of results recorded using questionnaires and other self-assessment methods is called:

a) L – data;

b) Q-data;

c) T-data;

d) Z-data.

3. An assignment of numbers to objects in which equal differences between numbers correspond to equal differences in the measured attribute or property of the object presupposes a scale of:

a) names;

b) order;

c) intervals;

d) ratios.

4. The order scale corresponds to measurement at the level:

a) nominal;

b) ordinal;

c) interval;

d) ratio.

5. Ranking objects according to the degree of expression of a certain characteristic is the essence of measurement at the level:

a) nominal;

b) ordinal;

c) interval;

d) ratio.

6. It is extremely rare in psychology to use the following scale:

a) names;

b) order;

c) intervals;

d) ratios.

7. The postulates that govern transformations of ordinal scales do not include the postulate of:

a) trichotomy;

b) asymmetry;

c) transitivity;

d) dichotomy.

8. In the most general form, measurement scales are represented by the scale:

a) names;

b) order;

c) intervals;

d) ratios.

9. No arithmetic operations can be performed on values measured on the scale of:

a) names;

b) order;

c) intervals;

d) ratios.

10. Establishing the equality of ratios between individual values is permissible at the level of the scale of:

a) names;

b) order;

c) intervals;

d) ratios.

11. B.G. Ananyev classifies the longitudinal research method as belonging:

a) to organizational methods;

b) to empirical methods;

c) to methods of data processing;

d) to interpretive methods.

12. Purposeful, systematically carried out perception of objects in the knowledge of which a person is interested is:

a) experiment;

b) content analysis;

c) observation;

d) the method of analyzing the products of activity.

13. Long-term and systematic observation, the study of the same people, which allows one to analyze mental development at various stages of life and draw certain conclusions based on this, is usually called research:

a) pilot;

b) longitudinal;

c) comparative;

d) complex.

14. The concept of “self-observation” is synonymous with the term:

a) introversion;

b) introjection;

c) introspection;

d) introscopy.

15. The systematic use of modeling is most typical:

a) for humanistic psychology;

b) for Gestalt psychology;

c) for psychoanalysis;

d) for the psychology of consciousness.

16. A brief, standardized psychological test that attempts to evaluate a particular mental process or personality as a whole is:

a) observation;

b) experiment;

c) testing;

d) self-observation.

17. The subject's obtaining of data about his own mental processes and states at the moment of their occurrence or immediately afterwards is:

a) observation;

b) experiment;

c) testing;

d) self-observation.

18. The active intervention of a researcher in the activities of a subject in order to create conditions for establishing a psychological fact is called:

a) content analysis;

b) analysis of activity products;

c) conversation;

d) experiment.

19. The main methods of modern psychogenetic research do not include the method of:

a) twins;

b) adopted children;

c) families;

d) introspection.

20. Depending on the situation, the following types of observation can be distinguished:

a) field;

b) complete (continuous);

c) systematic;

d) discrete.

21. A method of studying the structure and nature of people’s interpersonal relationships based on measuring their interpersonal choice is called:

a) content analysis;

b) comparison method;

c) the method of social units;

d) sociometry.

22. The first experimental psychological laboratory was opened by:

a) W. James;

b) G. Ebbinghaus;

c) W. Wundt;

d) H. Wolf.

23. The world's first experimental laboratory began its work:

a) in 1850;

b) in 1868;

c) in 1879;

24. The first experimental psychological laboratory in Russia is known:

a) since 1880;

b) since 1883;

c) since 1885;

25. The first pedological laboratory was created:

a) A.P. Nechaev in 1901;

b) S. Hall in 1889;

c) W. James in 1875;

d) N.N. Lange in 1896.

26. In Russia, the first experimental psychological laboratory was opened by:

a) I.M. Sechenov;

b) G.I. Chelpanov;

c) V.M. Bekhterev;

d) I.P. Pavlov.

27. The researcher's ability to evoke a given mental process or property at will is the main advantage of:

a) observation;

b) the experiment;

c) content analysis;

d) analysis of activity products.

28. The experimental method tests hypotheses about the presence of:

a) phenomena;

b) connections between phenomena;

c) cause-and-effect relationship between phenomena;

d) correlations between phenomena.

29. Establishing the most general mathematical and statistical patterns is made possible by:

a) content analysis;

b) analysis of activity products;

c) conversation;

d) experiment.

30. An associative experiment for studying unconscious affective formations was developed and proposed by:

a) P. Janet;

b) S. Freud;

c) J. Breuer;

d) C. Jung.

31. The concept of an "ideal experiment" was introduced into scientific circulation by:

a) R. Gottsdanker;

b) A.F. Lazursky;

c) D. Campbell;

d) W. Wundt.

32. The concept of “full compliance experiment” was introduced into scientific circulation by:

a) R. Gottsdanker;

b) A.F. Lazursky;

c) D. Campbell;

d) W. Wundt.

33. Intermediate between natural research methods and methods where strict control of variables is applied is:

a) thought experiment;

b) quasi-experiment;

c) laboratory experiment;

d) conversation method.

34. A characteristic that is actively changed in a psychological experiment is called a variable:

a) independent;

b) dependent;

c) external;

d) side.

35. According to D. Campbell, potentially controllable variables are classified among the experimental variables called:

a) independent;

b) dependent;

c) side;

d) external.

36. The validity achieved in a real experiment, as compared with an ideal one, serving as a criterion of the trustworthiness of results, is called:

a) internal;

b) external;

c) operational;

d) construct.

37. The measure of compliance of the experimental procedure with objective reality characterizes the validity:

a) internal;

b) external;

c) operational;

d) construct.

38. In a laboratory experiment, the validity most often violated is:

a) internal;

b) external;

c) operational;

d) construct.

39. The concept of “ecological validity” is more often used as a synonym for the concept of “validity”:

a) internal;

b) external;

c) operational;

d) construct.

40. Eight main factors violating internal validity and four factors violating external validity were identified by:

a) R. Gottsdanker;

b) A.F. Lazursky;

c) D. Campbell;

d) W. Wundt.

41. The factor of non-equivalence of groups in composition, which reduces the internal validity of the study, was named by D. Campbell:

a) selection;

b) statistical regression;

c) experimental attrition;

d) natural development.

42. The placebo effect was discovered by:

a) psychologists;

b) teachers;

c) doctors;

d) physiologists.

43. The influence of the presence of an external observer in an experiment is known as the effect of:

a) placebo;

b) Hawthorne;

c) social facilitation;

d) halo.

44. The influence of the experimenter on the results is most significant in studies:

a) psychophysiological;

b) “global” individual processes (intelligence, motivation, decision-making, etc.);

c) personality psychology and social psychology;

d) psychogenetic.

45. As a specially developed technique, introspection was most consistently applied in the psychological research of:

a) A.N. Leontyev;

b) W. Wundt;

c) V.M. Bekhterev;

d) S. Freud.

46. Psychological techniques constructed on educational material and intended to assess the level of mastery of educational knowledge and skills are known as tests of:

a) achievements;

b) intelligence;

c) personality;

d) projective.

47. Assessment of an individual's capacity to master knowledge, skills and abilities of a general or specific nature is carried out by means of tests of:

a) achievements;

b) intelligence;

c) personality;

d) abilities.

48. An assessment of the consistency of indicators obtained by retesting the same subjects with the same test or an equivalent form of it characterizes the test in terms of its:

a) validity;

b) reliability;

c) trustworthiness;

d) representativeness.

49. The test quality criterion used to determine the test's correspondence to the domain of mental phenomena being measured represents its validity of the following type:

a) construct;

b) criterion;

c) content;

d) predictive.

50. The test quality criterion used when measuring any complex mental phenomenon that has a hierarchical structure, which because of this is impossible to measure with one act of testing, is known as:

a) construct validity of the test;

b) criterion-related validity of the test;

c) content validity of the test;

d) reliability of the test.

51. The data of personality questionnaires should not be influenced by:

a) the use of incorrect standards by the subjects;

b) lack of introspection skills among the subjects;

c) discrepancy between the intellectual capabilities of respondents and the requirements of the survey procedure;

d) personal influence of the researcher.

52. To establish a statistical relationship between variables, the following is used:

a) Student’s t-test;

b) correlation analysis;

c) method of analyzing activity products;

d) content analysis.

53. Factor analysis was first used in psychology by:

a) R. Cattell;

b) K. Spearman;

c) J. Kelly;

d) L. Thurstone.

54. The most frequently occurring value in a set of data is called:

a) median;

b) mode;

c) decile;

d) percentile.

55. If psychological data are obtained on an interval or ratio scale, then to identify the nature of the relationship between features the correlation coefficient used is:

a) linear;

b) rank;

c) pairwise;

d) multiple.

56. Tabulation, presentation and description of the totality of the results of psychological research is carried out:

a) in descriptive statistics;

b) in the theory of statistical inference;

c) in testing hypotheses;

d) in modeling.

57. The widest range of mathematical methods in psychology is afforded by the quantification of indicators on the scale of:

a) names;

b) order;

c) ratios;

d) intervals.

58. Variance (dispersion) is an indicator of:

a) variability;

b) central tendency;

c) structural averages;

d) the mean.

59. Multivariate statistical methods do not include:

a) multidimensional scaling;

b) factor analysis;

c) cluster analysis;

d) correlation analysis.

60. A visual assessment of the similarities and differences between certain objects described by a large number of different variables is provided by:

a) multidimensional scaling;

b) factor analysis;

c) cluster analysis;

d) structural latent analysis.

61. The set of analytical and statistical procedures for identifying hidden variables (features), as well as the internal structure of connections between these characteristics, is called:

a) multidimensional scaling;

b) factor analysis;

c) cluster analysis;

d) structural latent analysis.

Reliability and validity of a test are characteristics of its compliance with the formal criteria that determine its quality and its suitability for use in practice.

What is reliability

When test reliability is assessed, the consistency of the results obtained on repeated administrations of the test is evaluated. Discrepancies in the data should be absent or insignificant; otherwise the test results cannot be trusted.

Test reliability is a criterion reflecting the following essential properties of a test:

  • reproducibility of the results obtained in the study;
  • the degree of accuracy of the measurement and of the instruments used;
  • stability of the results over a certain period of time.

In the interpretation of reliability, the following main components can be distinguished:

  • the reliability of the measuring instrument (that is, the soundness and objectivity of the test tasks), which can be assessed by calculating the corresponding coefficient;
  • the stability of the characteristic being studied over a long period of time, as well as the predictability and smoothness of its fluctuations;
  • objectivity of the result (that is, its independence from the personal preferences of the researcher).

Reliability factors

The degree of reliability can be affected by a number of negative factors, the most significant of which are the following:

  • imperfection of the methodology (incorrect or inaccurate instructions, unclear wording of tasks);
  • temporary instability or constant fluctuations in the values of the indicator being studied;
  • inadequacy of the environment in which initial and follow-up studies are conducted;
  • the changing behavior of the researcher, as well as the instability of the subject’s condition;
  • subjective approach when assessing test results.

Methods for assessing test reliability

The following techniques can be used to determine test reliability.

The retesting (test-retest) method is one of the most common. It establishes the degree of correlation between the results of studies conducted at different times. The technique is simple and effective; nevertheless, repeated examinations, as a rule, provoke irritation and negative reactions in subjects.

The main types of test validity are the following:

  • construct validity of a test is a criterion used when evaluating a test that has a hierarchical structure (applied when studying complex psychological phenomena);
  • criterion validity involves comparing test results with the subject's level of development of a given psychological characteristic;
  • content validity determines the correspondence of the methodology to the phenomenon being studied, as well as the range of parameters it covers;
  • predictive validity allows one to evaluate the future development of the parameter.

Types of Validity Criteria

Test validity is one of the indicators that allows you to assess the adequacy and suitability of a technique for studying a particular phenomenon. There are four main criteria that can affect it:

  • performer criterion (we are talking about the qualifications and experience of the researcher);
  • subjective criteria (the subject’s attitude towards a particular phenomenon, which is reflected in the final test result);
  • physiological criteria (health status, fatigue and other characteristics that can have a significant impact on the final test result);
  • criteria of chance (involved when determining the probability of the occurrence of a particular event).

The validity criterion is an independent source of data about a particular phenomenon (psychological property), the study of which is carried out through testing. Until the results obtained are checked for compliance with the criterion, validity cannot be judged.

Basic criteria requirements

External criteria that influence the test validity indicator must meet the following basic requirements:

  • compliance with the particular area in which the research is being conducted, relevance, as well as semantic connection with the diagnostic model;
  • the absence of any interference or sharp breaks in the sample (the point is that all participants in the experiment must meet pre-established parameters and be in similar conditions);
  • the parameter under study must be reliable, constant and not subject to sudden changes.

Ways to Establish Validity

Checking the validity of tests can be done in several ways.

Assessing face validity involves checking whether a test is fit for purpose.

Construct validity is assessed when a series of experiments are conducted to study a specific complex measure. It includes:

  • convergent validation - checking the relationship of assessments obtained using various complex techniques;
  • divergent validation, which consists in ensuring that the methodology does not imply the assessment of extraneous indicators that are not related to the main study.

Assessing predictive validity involves establishing the possibility of predicting future fluctuations of the indicator being studied.

Conclusions

Test validity and reliability are complementary indicators that provide the most complete assessment of the fairness and significance of research results. Often they are determined simultaneously.

Reliability shows how much the test results can be trusted. This means their constancy every time a similar test is repeated with the same participants. A low degree of reliability may indicate intentional distortion or an irresponsible approach.

The concept of test validity is associated with the qualitative side of the experiment. We are talking about whether the chosen tool corresponds to the assessment of a particular psychological phenomenon. Here, both qualitative indicators (theoretical assessment) and quantitative indicators (calculation of appropriate coefficients) can be used.

In a practical sense, reliability refers to the consistency or stability of measurement results. If a measuring instrument is reliable, then repeated measurement with the same instrument, even by a different person, will yield the same result. Conversely, unreliable measuring instruments produce different results depending on a wide variety of circumstances.

Reliability is a general requirement for any type of measurement under any conditions.

There are several ways to assess how reliably a test provides measurement results. Three methods are most commonly used.

1. Assessment of test reliability using the retesting method. This is one of the most frequently used procedures. The correlation coefficient is calculated between two variables: the results of measurements obtained by testing the same people twice with the same test, but at different times.

From the researcher's point of view, the retesting procedure is simple and takes little time. Test takers probably like it less because they have to take the test twice. As Smith and George emphasize, an important aspect of testing is motivating test takers to do well on the test. It is possible that during repeated testing, subjects feel impatience or boredom, which introduces additional error into the results.

When test-retest reliability is examined, the length of the interval between the two administrations also matters, and events occurring between the first and second tests can affect the results. If the interval is too short, the stability estimate may be affected by factors such as memorization of the test questions, experience gained during the first administration, or a decline in the test takers' interest in the test. If too much time passes between the two tests, the test takers may change in test-relevant ways (they may have prepared, gained experience, learned the material, and so on).

Uneven reactions of subjects to the first test introduce additional error into the assessment of the reliability of the test. For this reason, this method is most useful for assessing the reliability of tests designed to assess skills that are not related to memory and are unlikely to improve with short practice during the first test. Examples of such tests include tests of hearing acuity, problem-solving skills, and fine motor skills.
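As an illustration, the test-retest (stability) coefficient described above is simply the Pearson correlation between two administrations of the same test. The sketch below uses only the Python standard library; the score lists are invented for demonstration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores of the same five subjects, tested twice at different times
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

# The test-retest (stability) coefficient
r = pearson_r(first_administration, second_administration)
```

A coefficient near 1 indicates stable results; a low value would signal the kinds of error sources discussed above (memorization, boredom, changes in the subjects between administrations).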

2. Assessment of test reliability by checking internal consistency. Some of the problems associated with motivation, memory, and experience that arise in assessing test-retest reliability can be circumvented by using a method of checking the internal consistency of the test. This checks the consistency of answers to individual test questions, rather than the consistency of results obtained during testing at different times. One commonly used approach is to have several subjects take the test once, then divide the test into two parts, the results of which are scored separately. Each subject now has two results, and these are used to calculate the correlation coefficient.

Typically, the test is divided into two halves as follows: odd-numbered questions go into one half and even-numbered questions into the other. The resulting correlation coefficient r between the two sets of results is called the internal consistency coefficient or, sometimes, the split-half coefficient.
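A minimal sketch of this odd-even split: per-item scores are summed into two half-test totals per subject, and the two halves are correlated. The final step, stepping the half-test correlation up to full-test length with the Spearman-Brown formula, is a standard psychometric correction not named above. All data are invented.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sqrt(sum((a - mx) ** 2 for a in x))
                  * sqrt(sum((b - my) ** 2 for b in y)))

def split_half_reliability(item_scores):
    """item_scores: one list of per-item scores (e.g. 0/1) per subject.
    Returns (odd-even half correlation, Spearman-Brown corrected value)."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return r_half, 2 * r_half / (1 + r_half)        # Spearman-Brown step-up

# Hypothetical 0/1 answers of four subjects to a six-item test
answers = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],
]
r_half, r_full = split_half_reliability(answers)
```

The corrected value r_full estimates what the reliability of the whole test would be, since each half is only half as long as the real instrument.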

3. Assessment of test reliability using the equivalent forms method. In addition to checking internal consistency, an alternative procedure can be used, based on two different tests.

If both tests are based on the same material and are equivalent in form and difficulty, reliability can be assessed using the equivalent forms procedure. Each subject takes both tests, and the correlation coefficient r between the results, called the equivalence coefficient, is calculated. This name points to the main drawback of the method: the difficulty of constructing equivalent test forms. A test is considered reliable if the same results are obtained with the same measuring instrument; if the different forms of the test are not equivalent, then the same instrument is not being used, and the reliability estimate will accordingly be underestimated.

Constructing equivalent test forms can be difficult and time-consuming. In addition, before different test forms can be used to assess test reliability, they must be tested for equivalence using a different sample. However, once a test has been shown to be adequate and reliable, it may be helpful to have equivalent forms of the test on hand.

Unlike reliability, test validity is influenced not only by random factors but also by systematic ones, which introduce systematic biases into the results. These factors are other mental properties that prevent the property targeted by the test from manifesting itself in the results. For example, suppose we want to measure learning potential (a critical component of a person's overall intellectual ability), but we give the subject a test with a strict time limit and no opportunity to go back and correct mistakes. The desired mental property is then confounded in the test with an extraneous one, stress resistance: subjects with high stress resistance will perform the test better. This is the effect of systematic distortion.



In modern psychometrics, dozens of different theoretical and experimental methods for testing the validity of tests have been developed. The main element of almost all these methods is the so-called criterion. Validity criterion is a test-independent source of information about the mental property being measured, external to the test. We cannot judge the validity of a test until we compare its results with a source of true (or at least obviously more valid) information about the property being measured - with a criterion.

Specific laboratory criteria predominate in scientific research. For example, a compact test questionnaire for anxiety is being constructed. And as a criterion of validity, a special labor-intensive objective laboratory experiment is used, in which a real situation of anxiety is reproduced (volunteer subjects are threatened with electric shocks for erroneous actions, etc.).

In practice, very often pragmatic criteria are used as a criterion of validity - indicators of the effectiveness of the activity for the sake of predicting which testing is being undertaken. At school, the most typical criterion indicator is academic performance. But for a child’s socio-psychological adaptation, an external criterion indicator may be the level of popularity in the class.

27. The problem of adapting foreign and foreign-language tests and methods (theoretical and methodological questions in the adaptation of foreign tests and techniques)

Psychological practice needs scientifically grounded and at the same time economical, standardized psychodiagnostic tests. In this regard, the problem not only of developing domestic diagnostic methods but also of adapting tried and validated foreign ones has always been and remains relevant. Test adaptation is a set of measures ensuring the adequacy of a methodology under the new conditions of its application.

Main stages of test adaptation:

1) analysis of the author's original theoretical premises, which involves identifying points of contact with the theory and methodology of Russian psychology;

2) linguistic translation of the test text and its instructions into the user's language. This stage ends with an expert assessment of the correspondence of the translated version to the original texts;

3) experimental verification of the translated text against the criteria of validity, reliability, and trustworthiness, in accordance with psychometric requirements;

4) empirical standardization of the test on appropriate samples.

From the above stages it is clear that using a foreign-language test is not just a matter of translation into another language. The main difficulties are associated not only with linguistic but also with sociocultural differences between the environment in which the test was created and the one in which it will be used. The linguistic aspect of adaptation means adapting the vocabulary and grammar of the text to the age and educational specifics of the populations to be examined, as well as taking into account connotative meaning (logicians use the concept "connotative" as equivalent to "implied": connotative meaning is that which is assumed, implied, or expressed by a word, symbol, gesture, or event; connotative meanings usually define abstract qualities, general properties or classes of objects, or emotional components). It should be borne in mind that it is difficult, and sometimes simply impossible, to find in another culture an equivalent for the linguistic features of the culture in which the test was created. Professional translation of mental tests is therefore always accompanied by linguistic correction, and the language constructs are subject to verification of the correctness of their perception and interpretation (verification is confirmation that the final product complies with predefined reference requirements). Consequently, complete empirical adaptation of a test after its translation is mandatory, and it is often as difficult as developing an original methodology.

Recently, the adaptation of foreign tests has become not only an object of discussion among specialists, but also a direction of special research, the subject of relevant methodological, advisory, and instructional literature.

It is known what complex stages of adaptation many methods went through, for example the Minnesota Multiphasic Personality Inventory (MMPI) or R. Cattell's 16-factor personality questionnaire (16-PF). Adaptation of these methods consisted in checking the correspondence of American and domestic test norms, using statistical calculations of arithmetic means and standard deviations for the main diagnostic scales on new samples of subjects. Correlations between the scales of these methods were also studied. However, the most important stage in verifying the correctness of the adapted versions of these questionnaires, the analysis of the reproducibility of the diagnostic scales (that is, the analysis of correlations between individual items), was carried out much later. This made it possible to find out:

1) how legitimate was the borrowing of a system of differentiated concepts (personality traits) in relation to those that were proposed by developers in other sociocultural conditions;

2) what diagnostic concepts actually “work” in our conditions.

As a result of a series of studies, it turned out that foreign multifactor personality test questionnaires in relation to Russian-speaking samples reveal both stable diagnostic properties and specific features.

Thus, for practical psychodiagnostics adaptation of foreign tests means not only semantic interpretation in a new language version, but also their thorough experimental and normative testing in other sociocultural conditions using modern methods of mathematical analysis.

In contrast, validity refers to whether the methodology used in a given study accurately measures what it is intended to measure. For example, in the Peabody Picture Vocabulary Test the child is shown a booklet with pictures. The experimenter says the stimulus word aloud and asks the child to point to the one of the four pictures on the page that depicts the named object. This is simply a test of comprehension of orally presented English words. However, researchers sometimes mistakenly use it to measure intelligence. Needless to say, such use of the test is invalid, that is, unfounded.

Direct observation. Perhaps the most common type of measurement used with infants and young children is direct observation of the child's behavior in a particular situation. The researcher can observe how the child handles the toy or reacts to strangers. Children can be observed in a school setting to see how they work together to solve a problem. To increase the accuracy and information content of observations, scientists often use recording equipment, such as a video camera. If it is necessary to conduct research on older children, adolescents or adults, organizing direct observation of their behavior encounters increasing difficulties. Teenagers and adults do not really like to “go on stage”; they prefer to tell researchers about their thoughts and feelings.

Analysis of individual cases. This method aims at the study of individuality and may take the form of in-depth interviews, observation, or a combination of the two. Extraordinary people are often selected for study by this method: Nobel laureates, the mentally ill, survivors of concentration camps, talented musicians. Typically, an informal, qualitative approach is used to describe and evaluate their behavior. Case analysis can be used to open new areas of study or to examine more closely the sequential interaction of multiple conflicting influences. The earliest example of the use of this method is found in "child diaries" containing observational data on a developing infant. Entries in such diaries tend to be incomplete and unsystematic, as can be seen in the excerpts from the diary compiled by Moore (1896).

Week 5: recognized a person's face.

Week 9: recognized the breast and the mother's face at sight.

Week 12: recognized his own hand.

Week 16: recognized his thumb and the pacifier.

Week 17: recognized a marble from a few feet away.

Case studies are rarely used in developmental research because they involve problems of subjectivity and uncontrolled variables, and because they study a single individual. It is therefore almost impossible to establish cause-and-effect relationships or to generalize from their results. At the same time, a correctly conducted analysis of one person's development can stimulate a more rigorous study of the problems it reveals.

In practice areas such as medicine, education, social work and clinical psychology, case analysis is an important tool for making diagnoses and making recommendations. A short-term study using this method, such as a detailed analysis of a child's reactions to combat or trauma, may be useful for understanding later behavior. Although case analysis should be treated with caution as a research tool, it provides a vivid, visual, and detailed picture of how the whole individual changes in relation to his environment.

Achievement and ability tests. Written achievement or ability tests are a common form of measuring the physical and cognitive aspects of development. To be useful, these tests must be reliable and valid in measuring the abilities they are designed to measure. Most often they are paper forms filled out by hand, although computer versions are becoming increasingly common.

Self-report techniques. Self-report methods include interviews and various forms of reports and questionnaires filled out by the subjects themselves, in which the researcher asks questions designed to identify the respondent's opinions and typical forms of behavior. Sometimes subjects are asked to provide information about themselves: what they are like now, in the present, or what they were like in the past. Sometimes they are asked to reflect on their statements or intentions, make judgments about their behavior or lifestyle, or rate themselves on a set of personality traits. In any case, they are expected to try to be as fair and objective as possible. Sometimes such techniques include a "lie scale," which contains questions from the main part of the questionnaire repeated in a slightly modified form and is intended to assess the sincerity of the respondent. Despite this control, the data obtained through self-report techniques may be limited to what respondents are willing to report or consider acceptable to disclose to the researcher.

Despite the widespread use of interviews and questionnaires in studies of adolescents and adults, these methods require significant adaptation when working with children. In one such study, researchers sought to understand children's beliefs about themselves and their families, using a self-report technique known as interactive dialogue. One of these dialogues was devoted to the question "Who am I like, and who are my family members like?" The researcher prepared a set of cards with plot pictures for the interview. When answering questions, children sorted the cards into two groups, thereby indicating the similarities or differences between the situations depicted in the pictures and the relationships in their family (Reid, Ramey, & Burchinal, 1990).

Part 1. A comprehensive study of a person’s life path

Projective techniques. Sometimes the researcher asks no direct questions at all. In projective tests, subjects are presented with a picture, task, or situation containing an element of uncertainty, and they must tell a story, explain what is drawn, or find a way out of the situation. Because the ambiguity of the initial task means there can be no right or wrong answers, it is assumed that people will project their own feelings, attitudes, anxieties, and needs onto the situation. Probably the most famous projective technique is the Rorschach inkblot test. Another example is the Thematic Apperception Test (TAT), in which the subject is asked to make up short stories in response to a series of pictures of rather vague content. The tester then analyzes the themes running through all the stories the test taker has created.

Projective techniques such as the word-association test and the unfinished-sentences test are also widely used. Subjects may be asked to complete a sentence such as "My dad always..." They may be shown a set of pictures and asked to tell what is drawn, express their attitude toward what is depicted, analyze the pictures, or arrange them in such an order as to form a coherent story. For example, in one study, 4-year-old children took part in a game called "Bear Picnic." The experimenter told several stories about a family of teddy bears. Each child was then given one bear cub ("this will be your bear cub") and asked to complete the story (Mueller & Lucas, 1975).

End of excerpt. Source: Craig G. Developmental Psychology. 7th international edition. St. Petersburg: Peter, 2000. 992 pp.: ill. (Series "Masters of Psychology"). ISBN 5-314-00128-4.

Before psychodiagnostic techniques can be used for practical purposes, they must be tested against a number of formal criteria that attest to their quality and effectiveness. These requirements have evolved in psychodiagnostics over the years, in the course of developing and improving tests. As a result, it has become possible to protect psychology from all manner of incompetent fabrications claiming the status of diagnostic techniques.

The main criteria for evaluating psychodiagnostic techniques are reliability and validity. Foreign psychologists made a great contribution to the development of these concepts (A. Anastasi, E. Ghiselli, J. Guilford, L. Cronbach, R. Thorndike, E. Hagen, and others). They developed both the formal-logical and the mathematical-statistical apparatus (primarily the correlation method and factor analysis) for substantiating the degree to which techniques meet these criteria.

In psychodiagnostics the problems of reliability and validity are closely interrelated; nevertheless, there is a tradition of presenting these two most important characteristics separately. Following it, we begin by considering the reliability of techniques.

RELIABILITY

In traditional test theory, the term "reliability" means the relative constancy, stability, and consistency of test results on initial and repeated application to the same subjects. As A. Anastasi (1982) writes, one can hardly trust an intelligence test if at the beginning of the week a child scored 110 and by the end of the week scored 80. Repeated application of reliable techniques yields similar estimates. In this case, to a certain extent, both the results themselves and the ordinal place (rank) the subject occupies in the group may coincide. In both cases some discrepancies are possible when the experiment is repeated, but it is important that they be insignificant and remain within the same group. Thus, the reliability of a technique is a criterion that indicates the accuracy of psychological measurements, i.e., it allows us to judge how trustworthy the results are.

The degree of reliability of a technique depends on many factors. An important problem in practical diagnostics is therefore identifying the negative factors that affect measurement accuracy. Many authors have attempted to classify such factors; among them, the following are mentioned most often:

1) instability of the property being diagnosed;

2) imperfection of the diagnostic technique (carelessly drawn-up instructions, tasks heterogeneous in character, vaguely formulated guidelines for presenting the technique to subjects, etc.);

3) a changing examination situation (different times of day when experiments are carried out, different room illumination, presence or absence of extraneous noise, etc.);

4) differences in the experimenter's behavior (from experiment to experiment he presents the instructions differently, stimulates completion of the tasks differently, etc.);

5) fluctuations in the subject's functional state (good health in one experiment, fatigue in another, etc.);

6) elements of subjectivity in the methods of scoring and interpreting results (when subjects' answers are recorded and assessed for completeness, originality, etc.).

If all these factors are kept in mind and an effort is made to eliminate, within each of them, the conditions that reduce measurement accuracy, an acceptable level of test reliability can be achieved. One of the most important means of increasing the reliability of a psychodiagnostic technique is uniformity of the examination procedure and its strict regulation: the same environment and working conditions for the examined sample of subjects, the same type of instructions, the same time limits for everyone, the same methods and features of contact with subjects, the same order of presenting tasks, and so on. With such standardization of the research procedure, the influence of extraneous random factors on test results can be significantly reduced, and their reliability thereby increased.

The reliability characteristics of a technique are greatly influenced by the sample on which it is studied. The sample can either lower or raise this indicator: for example, reliability can be artificially inflated if the scatter of results in the sample is small, i.e., if the results lie close to one another. In that case, on repeated examination the new results will also cluster closely; possible changes in the subjects' rank places will be insignificant, and the reliability of the technique will therefore appear high. The same unjustified inflation of reliability can occur when analyzing the results of a sample made up of one group with very high scores and one group with very low test scores: these widely separated results will not overlap even when random factors intrude into the experimental conditions. For this reason, the manual usually describes the sample on which the reliability of the technique was determined.

Currently, reliability is increasingly determined on the most homogeneous samples, i.e. on samples similar in gender, age, level of education, professional training, etc. For each such sample, its own reliability coefficients are given. The reliability indicator given is applicable only to groups similar to those on which it was determined. If a technique is applied to a sample different from the one on which its reliability was tested, then this procedure must be repeated.

There are as many varieties of test reliability as there are conditions that influence the results of diagnostic tests (V. Cherny, 1983). However, only a few types of reliability find practical application.

Since all types of reliability reflect the degree of consistency of two independently obtained series of indicators, the mathematical and statistical technique by which the reliability of the methodology is established is correlation (according to Pearson or Spearman, see Chapter XIV). The more the resulting correlation coefficient approaches unity, the higher the reliability, and vice versa.
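As a minimal sketch of this computation (hypothetical scores of eight subjects; Python, standard library only), the Pearson correlation between two independently obtained series of indicators can be calculated directly:

```python
def pearson(x, y):
    """Pearson product-moment correlation between two score series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores of eight subjects on a first and a repeated testing.
first  = [12, 15, 9, 20, 14, 11, 18, 16]
second = [13, 14, 10, 19, 15, 10, 17, 17]

r = pearson(first, second)
print(round(r, 3))  # close to 1.0 -> high consistency between the two series
```

A coefficient near 1 indicates that the two series of indicators are consistent; near 0, that the repeated results bear little relation to the first.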

In this manual, when describing the types of reliability, the main emphasis is on the works of K.M. Gurevich (1969, 1975, 1977, 1979), who, after conducting a thorough analysis of foreign literature on this issue, proposed to interpret reliability as:

    reliability of the measuring instrument itself,

    stability of the studied trait;

3) constancy, i.e. relative independence of the results from the personality of the experimenter

The indicator characterizing the measuring instrument is proposed to be called the reliability coefficient, the indicator characterizing the stability of the measured property is the stability coefficient; and the indicator for assessing the influence of the experimenter’s personality is the coefficient of constancy.

It is in this order that it is recommended to check the methodology: it is advisable to first check the measurement tool. If the data obtained are satisfactory, then we can proceed to establishing a measure of stability of the property being measured, and after that, if necessary, consider the criterion of constancy.

Let us take a closer look at these indicators, which characterize the reliability of the psychodiagnostic technique from different angles.

1. Determining the reliability of the measuring instrument. The accuracy and objectivity of any psychological measurement depend on how the technique is constructed: how correctly the tasks are selected from the standpoint of their mutual consistency, and how homogeneous the technique is. Internal homogeneity means that the technique's tasks actualize the same property or attribute.

To check the reliability of the measuring instrument from the standpoint of its uniformity (homogeneity), the so-called split-half method is used. Typically, the tasks are divided into even- and odd-numbered, the two halves are processed separately, and the results of the two resulting series are then correlated with each other. To apply this method, the subjects must be placed in conditions that give them time to solve (or attempt to solve) all the tasks. If the technique is homogeneous, there will be no great difference in success between the two halves, and the correlation coefficient will be quite high.

Tasks can also be divided in other ways: for example, the first half of the test can be compared with the second, or the first and third quarters with the second and fourth. However, "splitting" into even and odd tasks appears the most appropriate, since this method is the most independent of such factors as warm-up, practice, fatigue, etc.

A technique is considered reliable when the obtained coefficient is not lower than 0.75-0.85. The best tests yield reliability coefficients on the order of 0.90 or more.

But at the initial stage of developing a diagnostic technique, low reliability coefficients can be obtained, for example, on the order of 0.46-0.50. This means that the developed methodology contains a certain number of tasks, which, due to their specificity, lead to a decrease in the correlation coefficient. Such tasks need to be specially analyzed and either remade or removed altogether.

To make it easier to establish which tasks lower the correlation coefficients, it is necessary to analyze the tables of raw data prepared for the correlations. It should be kept in mind that any change in the content of the technique (removal of tasks, their rearrangement, reformulation of questions or answers) requires recalculation of the reliability coefficients.

When examining reliability coefficients, one should not forget that they depend not only on the correct selection of tasks in terms of their mutual consistency, but also on the socio-psychological homogeneity of the sample on which the reliability of the measuring instrument was tested.

In fact, tasks may contain concepts that are little known to one part of the subjects, but well known to another part. The reliability coefficient will depend on how many such concepts are in the methodology; tasks with such concepts can be randomly located in both the even and odd half of the test. Obviously, the reliability indicator should not be attributed solely to the technique as such and cannot be relied upon to remain unchanged no matter what sample is tested.

2. Determining the stability of the trait being studied. Determining the reliability of the technique itself does not settle all the issues connected with its application. It is also necessary to establish how stable the trait the researcher intends to measure is. It would be a methodological mistake to count on the absolute stability of psychological characteristics. The fact that a measured trait changes over time poses no danger to reliability in itself. What matters is how much the results vary from experiment to experiment for the same subject, and whether these fluctuations cause the subject, for unknown reasons, to turn up now at the beginning, now in the middle, now at the end of the ranked sample. No specific conclusions about the level of the measured trait can be drawn for such a subject. Thus, fluctuations in the trait must not be unpredictable: if the reasons for sharp fluctuations are unclear, the trait cannot be used for diagnostic purposes.

To check the stability of the diagnosed trait or property, a procedure known as test-retest is used. It consists of re-examining the subjects with the same technique. The stability of the trait is judged by the correlation coefficient between the results of the first and repeated examinations, which indicates whether each subject retains his ordinal place (rank) in the sample.
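The test-retest check of whether subjects keep their ordinal places can be sketched with a Spearman rank correlation (hypothetical scores, plain Python; the simplified formula below assumes no tied scores):

```python
def spearman(x, y):
    """Spearman rank correlation; assumes no tied scores in either series."""
    def ranks(values):
        order = sorted(values)
        return [order.index(v) + 1 for v in values]
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical raw scores of eight subjects on the first and repeated testing.
first  = [31, 25, 28, 22, 34, 27, 19, 30]
retest = [29, 26, 27, 23, 33, 28, 20, 31]

rho = spearman(first, retest)
print(round(rho, 3))  # 0.952: subjects largely keep their rank order
```

A coefficient this high would indicate a stable trait; a subject wandering unpredictably through the rank order would pull it down sharply.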

The degree of resistance and stability of the diagnosed property is influenced by many factors. It has already been said above how important it is to observe the requirement of uniformity of the experimental procedure: if, for example, the first test was carried out in the morning, the second should also be carried out in the morning; if the first experiment was accompanied by a preliminary demonstration of the tasks, this condition must also be observed during the retest, and so on.

When determining the stability of a trait, the time interval between the first and the repeated examination is of great importance. The shorter the period between the two tests, the greater the chance (other things being equal) that the diagnosed trait will keep the level of the first test. As the interval grows, the stability of the trait tends to decrease, since the number of extraneous factors influencing it increases. The conclusion suggests itself that retesting should be conducted shortly after the first test. But there is a difficulty here: if the period between the two experiments is short, some subjects may reproduce their previous answers from memory and thus depart from the intended way of completing the tasks. In that case, the results of the two presentations of the technique can no longer be considered independent.

It is difficult to give a definite answer to the question of what period can be considered optimal for a repeated experiment. Only the researcher, proceeding from the psychological essence of the technique, the conditions in which it is carried out, and the characteristics of the sample of subjects, can determine this period, and such a choice must be scientifically justified. The testological literature most often cites time intervals of several months (but not more than six). When examining young children, whose age-related changes and development occur very quickly, these intervals may be on the order of several weeks (A. Anastasi, 1982).

It is important to remember that the stability coefficient should not be considered only from its narrow formal side, in terms of its absolute values. If the test examines a property that is undergoing intensive development during the testing period (for example, the ability to make generalizations), the stability coefficient may turn out to be low, but this should not be interpreted as a shortcoming of the test. Such a coefficient should rather be read as an indicator of change, of the development of the property under study. In this case K.M. Gurevich (1975) recommends examining in parts the sample on which the stability coefficient was established: one part of the subjects will turn out to develop at an even pace, another at a particularly rapid pace, and in a third part development is almost imperceptible. Each part of the sample deserves separate analysis and interpretation. It is therefore not enough to state that the stability coefficient is low; one must understand what it depends on.

A completely different requirement is placed on the stability coefficient if the author of the technique believes that the property being measured has already been formed and should be sufficiently stable. The stability coefficient in this case should be quite high (not lower than 0.80).

Thus, the question of the stability of the measured property is not always resolved unambiguously. The decision depends on the essence of the property being diagnosed.

3. Determination of constancy, that is, the relative independence of the results from the personality of the experimenter. Since a technique developed for diagnostic purposes is not intended to remain forever in the hands of its creators, it is extremely important to know to what extent its results are influenced by the personality of the experimenter. Although a diagnostic technique is always supplied with detailed instructions for its use, with rules and examples showing how to conduct the experiment, it is very difficult to regulate the experimenter's manner of behavior, speed of speech, tone of voice, pauses, and facial expression. The subject's attitude towards the test will always reflect how the experimenter himself relates to it (whether he allows negligence or acts exactly in accordance with the requirements of the procedure, whether he is demanding, persistent or lax, etc.).

The personality of the experimenter plays a particularly significant role when conducting so-called non-deterministic techniques (for example, in projective tests).

Although in testological practice the criterion of constancy is used infrequently, this, according to K.M. Gurevich (1969), is no reason to underestimate it. If the authors of a technique suspect a possible influence of the experimenter's personality on the outcome of the diagnostic procedure, it is advisable to check the technique against this criterion. One point is important to keep in mind here. If under a new experimenter all the subjects began to work equally somewhat better or somewhat worse, this fact in itself, although it deserves attention, will not affect the reliability of the technique. Reliability changes only when the experimenter's influence on the subjects differs: some begin to work better, others worse, and still others the same as under the first experimenter; in other words, when under the new experimenter the subjects change their ordinal places in the sample.

The coefficient of constancy is determined by correlating the results of two experiments conducted under relatively identical conditions on the same sample of subjects, but by different experimenters. The correlation coefficient should not be lower than 0.80.
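The constancy check described above can be sketched as follows. All scores are invented; the two series are the results of the same subjects under two different experimenters, and the 0.80 threshold is the one cited in the text:

```python
# Sketch of the constancy check: the same sample is examined under
# comparable conditions by two different experimenters, and the two
# result series are correlated. All scores below are hypothetical.

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

experimenter_1 = [34, 28, 41, 25, 37, 30]
experimenter_2 = [33, 29, 40, 26, 36, 31]

r = pearson(experimenter_1, experimenter_2)
# The criterion cited in the text: the coefficient should not fall below 0.80.
print("constancy acceptable:", r >= 0.80)
```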

So, three indicators of the reliability of psychodiagnostic techniques have been considered. The question may arise: must each of them be checked when a psychodiagnostic technique is created? This is debated in the foreign literature. Some researchers believe that all methods of determining test reliability are to some extent identical, so it is enough to check a technique by one of them. For example, G. Garrett (1962), the author of a book on statistics for psychologists and teachers repeatedly republished in the USA, finds no fundamental differences between the methods of testing reliability: in his opinion, they all show the reproducibility of test indicators, and sometimes one, sometimes another provides the better criterion. Other researchers take a different view. Thus, the authors of the "Standard Requirements for Pedagogical and Psychological Tests" (1974), in the chapter "Reliability," note that the reliability coefficient in the modern sense is a generic concept that includes several types, each with its own special meaning. This point of view is shared by K.M. Gurevich (1975): when different ways of determining reliability are discussed, we are dealing not with better or worse measures, but with measures of essentially different kinds of reliability. Indeed, what is a technique worth if it is unclear whether it is itself reliable as a measuring instrument, or whether the stability of the property being measured has been established? What is a diagnostic technique worth if it is not known whether its results may change depending on who conducts the experiment? No single indicator can replace the other verification methods, and none can be considered a necessary and sufficient characteristic of reliability. Only a technique with a complete reliability characteristic is fully suitable for diagnostic and practical use.

VALIDITY

After reliability, another key criterion for assessing the quality of techniques is validity. The question of a technique's validity is taken up only after its sufficient reliability has been established, since an unreliable technique is practically useless regardless of its validity.

It should be noted that the question of validity has until recently remained one of the most difficult. The most established definition of the concept is the one given in A. Anastasi's book: "Test validity is a concept that tells us what the test measures and how well it does it" (1982, p. 126). Validity is at its core a complex characteristic that includes, on the one hand, information about whether the technique is suitable for measuring what it was created for and, on the other hand, about its effectiveness and efficiency. For this reason there is no single universal approach to determining validity. Depending on which aspect of validity the researcher wants to consider, different methods of proof are used. In other words, the concept of validity includes different types, each with its own special meaning. Checking the validity of a technique is called validation.

Validity in its first understanding is related to the methodology itself, that is, it is the validity of the measuring instrument. This type of testing is called theoretical validation. Validity in the second understanding refers not so much to the methodology as to the purpose of its use. This is pragmatic validation.

So, during theoretical validation, the researcher is interested in the property itself measured by the technique. This essentially means that psychological validation itself is being carried out. With pragmatic validation, the essence of the subject of measurement (psychological property) is out of sight. The main emphasis is on proving that the “something” measured by the technique has a connection with certain areas of practice.

Conducting theoretical validation, as opposed to pragmatic validation, sometimes proves far more difficult. Without going into specific details for now, let us outline in general terms how pragmatic validity is checked: some external criterion independent of the technique is selected that determines success in a particular activity (educational, professional, etc.), and the results of the diagnostic technique are compared with it. If the connection between them is judged satisfactory, a conclusion is drawn about the practical effectiveness and efficiency of the diagnostic technique.

For theoretical validity it is much more difficult to find an independent criterion lying outside the technique. Therefore, in the early stages of the development of testology, when the concept of validity was just taking shape, intuitive ideas about what a test measures were relied upon:

1) a technique was recognized as valid because what it measures is simply "obvious";

2) proof of validity rested on the researcher's confidence that his method allows him to "understand the subject";

3) a technique was considered valid (i.e., the claim was accepted that such-and-such a test measures such-and-such a quality) merely because the theory on which it was based was "very good."

Acceptance of unfounded statements about the validity of the methodology could not continue for a long time. The first manifestations of truly scientific criticism debunked this approach: the search for scientifically based evidence began.

As already mentioned, to carry out theoretical validation of a technique is to show whether the technique really measures exactly the property, the quality, that it is supposed to measure according to the researcher. For example, if a test was developed to diagnose the mental development of schoolchildren, it must be analyzed whether it really measures this development and not some other characteristics (say, of personality or character). Thus, for theoretical validation the cardinal problem is the relationship between mental phenomena and the indicators through which we attempt to know them. Validation shows whether the author's intention and the actual results of the technique coincide.

It is not so difficult to carry out theoretical validation of a new technique if there is already a technique with known, proven validity for measuring a given property. The presence of a correlation between a new and a similar old technique indicates that the developed technique measures the same psychological quality as the reference one. And if the new method at the same time turns out to be more compact and economical in carrying out and processing the results, then psychodiagnosticians have the opportunity to use a new tool instead of the old one. This technique is especially often used in differential psychophysiology when creating methods for diagnosing the basic properties of the human nervous system (see Chapter VII).

But theoretical validity is established not only by comparison with related indicators, but also with indicators where, according to the hypothesis, there should be no significant connections. Thus, to check theoretical validity it is important both to establish the degree of connection with a related technique (convergent validity) and to show the absence of such a connection with techniques that have a different theoretical basis (discriminant validity).

It is much more difficult to carry out theoretical validation of a technique when such a path is impossible. Most often, this is the situation a researcher faces. In such circumstances, only the gradual accumulation of various information about the property being studied, the analysis of theoretical premises and experimental data, and significant experience in working with the technique make it possible to reveal its psychological meaning.

An important role in understanding what a technique measures is played by comparing its indicators with practical forms of activity. Here it is especially important that the technique be carefully worked out theoretically, i.e., that it have a solid, well-founded scientific basis. Then, when the technique is compared with an external criterion taken from everyday practice and corresponding to what it measures, information can be obtained that supports theoretical ideas about its essence.

It is important to remember that if theoretical validity is proven, then the interpretation of the obtained indicators becomes clearer and more unambiguous, and the name of the technique corresponds to the scope of its application.

As for pragmatic validation, it involves testing a technique from the point of view of its practical effectiveness, significance, and usefulness. It is given great importance, especially where the question of selection arises. The development and use of diagnostic techniques makes sense only when there is a reasonable assumption that the quality being measured manifests itself in certain life situations and in certain types of activity.

If we turn again to the history of the development of testology (A. Anastasi, 1982; B.S. Avanesov, 1982; K.M. Gurevich, 1970; "General Psychodiagnostics", 1987; B.M. Teplov, 1985, etc.), we can single out a period (the 1920s-30s) when the scientific content of tests and their theoretical "baggage" were of little interest. It was important that the test "work" and help quickly select the most prepared people. The empirical criterion for evaluating test tasks was considered the only correct guideline in solving scientific and applied problems.

The use of diagnostic techniques with purely empirical justification, without a clear theoretical basis, often led to pseudoscientific conclusions and unjustified practical recommendations. It was impossible to name exactly the abilities and qualities that the tests revealed. B.M. Teplov, analyzing the tests of that period, called them “blind tests” (1985).

This approach to the problem of test validity was typical until the early 1950s, not only in the USA but in other countries as well. The theoretical weakness of empirical validation methods could not but provoke criticism from scientists who called for basing test development not only on "bare" empiricism and practice but also on a theoretical concept. Practice without theory, as we know, is blind, and theory without practice is dead. At present, combined theoretical and pragmatic assessment of the validity of techniques is regarded as the most productive.

To carry out pragmatic validation of the methodology, i.e. To assess its effectiveness, efficiency, and practical significance, an independent external criterion is usually used - an indicator of the manifestation of the property being studied in everyday life. Such a criterion can be academic performance (for tests of learning abilities, tests of achievements, tests of intelligence), production achievements (for methods of professional orientation), the effectiveness of real activities - drawing, modeling, etc. (for special ability tests), subjective assessments (for personality tests).

American researchers Tiffin and McCormick (1968), having analyzed the external criteria used to prove validity, identify four types:

1) performance criteria (these may include the amount of work completed, academic performance, time spent on training, rate of growth in qualifications, etc.);

2) subjective criteria (these include various kinds of responses reflecting a person's attitude towards something or someone, his opinions, views, and preferences; subjective criteria are usually obtained by means of interviews, questionnaires, and surveys);

3) physiological criteria (they are used to study the influence of the environment and other situational variables on the human body and psyche; pulse rate, blood pressure, electrical resistance of the skin, symptoms of fatigue, etc. are measured);

4) criteria of accidents (applied when the purpose of the study concerns, for example, the problem of selecting for work such persons who are less susceptible to accidents).

The external criterion must meet three basic requirements: it must be relevant, free from contamination, and reliable.

Relevance refers to the semantic correspondence of the diagnostic tool to the independent, vitally important criterion. In other words, there must be confidence that the criterion involves precisely those features of the individual psyche that are measured by the diagnostic technique. The external criterion and the diagnostic technique must be in internal semantic correspondence with each other and qualitatively homogeneous in psychological essence (K.M. Gurevich, 1985). If, for example, a test measures individual characteristics of thinking, the ability to perform logical operations with certain objects and concepts, then the criterion should likewise reflect the manifestation of precisely these skills. The same applies to professional activity: it has not one but several goals and objectives, each of which is specific and imposes its own conditions of fulfillment, so there exist several criteria of performance in professional activity. Success on diagnostic techniques should therefore not be compared with production efficiency in general. It is necessary to find a criterion that, by the nature of the operations involved, is comparable with the technique.

If it is not known whether an external criterion is relevant to the property being measured, comparing the results of a psychodiagnostic technique with it becomes practically useless: it does not permit any conclusions about the validity of the technique.

The requirement of freedom from contamination arises because, for example, educational or industrial success depends on two variables: on the person himself, his individual characteristics measured by the technique, and on the situation, the conditions of study and work, which can introduce interference and "contaminate" the applied criterion. To avoid this to some extent, groups of people in more or less identical conditions should be selected for the study. Another method is to correct for the influence of interference; this correction is usually statistical in nature. Thus, for example, productivity should be taken not in absolute terms but relative to the average productivity of workers working under similar conditions.
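The statistical correction just mentioned can be sketched in a few lines. All figures are invented: each worker's output is expressed relative to the average output of workers in the same conditions, so that workers from differently equipped shops become comparable:

```python
# A minimal sketch of correcting the criterion for contamination:
# instead of comparing absolute productivity of workers who work under
# different conditions, each value is divided by the mean of its own
# group. All figures below are invented for illustration.

from statistics import mean

shops = {
    "old equipment": [80, 95, 70],
    "new equipment": [120, 140, 110],
}

# Relative productivity: 1.0 means "exactly average for one's conditions".
relative = {
    shop: [round(v / mean(vals), 2) for v in vals]
    for shop, vals in shops.items()
}

print(relative)
```

After the correction, the best worker in the poorly equipped shop is no longer penalized for conditions he does not control, which is exactly the point of decontaminating the criterion.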

When they say that a criterion must have statistically significant reliability, this means that it must reflect the constancy and stability of the function being studied.

The search for an adequate and easily identified criterion is a very important and complex task of validation. In Western testing, many methods are disqualified only because it was not possible to find a suitable criterion for testing them. For example, most questionnaires have questionable validity data because it is difficult to find an adequate external criterion that corresponds to what they measure.

Assessment of the validity of a technique can be quantitative or qualitative in character.

To calculate the quantitative indicator, the validity coefficient, the results obtained with the diagnostic technique are compared with data obtained on the same individuals using the external criterion. Different types of linear correlation (Spearman's, Pearson's) are used.

How many subjects are needed to calculate validity? Practice has shown that there should be no fewer than 50, and preferably more than 200. The question often arises: how large must the validity coefficient be to be considered acceptable? In general, it is enough for the validity coefficient to be statistically significant. A validity coefficient of about 0.20-0.30 is considered low, 0.30-0.50 average, and over 0.60 high.
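The calculation and interpretation just described can be sketched as follows. All scores are invented; the bands are the ones cited in the text (values between 0.50 and 0.60 are not explicitly classified there and are treated as average in this sketch):

```python
# A minimal sketch of computing and interpreting a validity coefficient:
# hypothetical test scores are correlated (Pearson) with a hypothetical
# external criterion, and the coefficient is labelled with the rough
# bands given in the text.

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def validity_band(r):
    """Bands from the text; 0.50-0.60 is not classified there and is
    treated as 'average' here by assumption."""
    a = abs(r)
    if a >= 0.60:
        return "high"
    if a >= 0.30:
        return "average"
    if a >= 0.20:
        return "low"
    return "below the usual range"

test_scores = [55, 62, 47, 70, 58, 65]  # hypothetical test results
criterion = [3, 4, 3, 5, 4, 4]          # hypothetical external criterion

print(validity_band(pearson(test_scores, criterion)))
```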

But, as A. Anastasi (1982), K.M. Gurevich (1970) and others emphasize, it is not always legitimate to use linear correlation to calculate the validity coefficient. This approach is justified only when it has been shown that success in the activity is directly proportional to success on the diagnostic test. Foreign testologists, especially those concerned with vocational suitability and selection, most often simply assume that whoever completes more test tasks is better suited to the profession. But it may also be that success in an activity requires the property only at the level of, say, 40% of the test tasks solved, and further success on the test has no significance for the profession. A vivid example from K.M. Gurevich's monograph: a postman must be able to read, but whether he reads at normal speed or very quickly no longer has professional significance. With such a relation between the technique's indicators and the external criterion, the most adequate way to establish validity may be the criterion of differences.

Another case is also possible: a level of the property higher than the profession requires interferes with professional success. Thus, F. Taylor found that the most developed female production workers had low labor productivity; their high level of mental development prevented them from working highly productively. In such a case, analysis of variance or the calculation of correlation ratios is better suited for establishing validity.

As the experience of foreign testologists has shown, no single statistical procedure can fully reflect the diversity of individual assessments. Therefore, another model is often used to prove the validity of techniques: clinical assessments. This is nothing more than a qualitative description of the essence of the property being studied; in this case we are talking about approaches that do not rely on statistical processing.

There are several types of validity, determined by the peculiarities of diagnostic techniques as well as by the temporal status of the external criterion. In many works (A. Anastasi, 1982; L.F. Burlachuk, S.M. Morozov, 1989; K.M. Gurevich, 1970; B.V. Kulagin, 1984; V. Cherny, 1983; "General Psychodiagnostics", 1987, etc.) the following are most often named:

1) Content validity. This approach is used primarily in achievement tests. Usually achievement tests include not all the material that students have covered but some small part of it (3-4 questions). Can one be sure that correct answers to these few questions indicate mastery of all the material? This is what a check of content validity should answer. To do this, success on the test is compared with teachers' expert assessments of this material. Content validity also applies to criterion-referenced tests. This approach is sometimes called logical validity.

2) Concurrent validity, or current validity, is determined using an external criterion on which information is collected at the same time as the experiments with the technique being tested. In other words, data relating to the present are collected: academic performance during the testing period, productivity during the same period, and so on. The results of success on the test are correlated with them.

3) "Predictive" validity (another name is "prognostic" validity). It is also determined by a fairly reliable external criterion, but information on it is collected some time after the test. The external criterion is usually a person's ability, expressed in some kind of assessment, for the type of activity for which he was selected on the basis of the diagnostic tests. Although this approach best matches the task of diagnostic techniques, predicting future success, it is very difficult to apply. The accuracy of the forecast is inversely related to the time allotted for the prediction: the more time passes after the measurement, the more factors have to be taken into account in assessing the prognostic significance of the technique, and it is almost impossible to take into account all the factors influencing the prediction.

4) "Retrospective" validity. It is determined on the basis of a criterion reflecting events or the state of the quality in the past, and can be used to obtain quick information about the predictive capabilities of a technique. Thus, to check the extent to which good aptitude-test results correspond to rapid learning, past performance assessments, past expert opinions, etc. can be compared in individuals with high and low current diagnostic indicators.

When providing data on the validity of a developed technique, it is important to indicate exactly which type of validity is meant (content, concurrent, etc.). It is also advisable to report the number and characteristics of the individuals on whom validation was carried out. Such information allows a researcher to judge how valid the technique is for the group to which he intends to apply it. As with reliability, it is important to remember that a technique may have high validity in one sample and low validity in another. Therefore, if a researcher plans to use the technique on a sample of subjects that differs significantly from the one on which validity was tested, he needs to repeat the validity check. The validity coefficient given in the manual applies only to groups of subjects similar to those on which it was determined.

Literature

Anastasi A. Psychological Testing: In 2 books / Ed. by K.M. Gurevich, V.I. Lubovsky. M., 1982. Book 1.

Gurevich K.M. On the Reliability of Psychophysiological Indicators // Problems of Differential Psychophysiology. M., 1969. Vol. VI. Pp. 266-275.

Gurevich K.M. Reliability of Psychological Tests // Psychological Diagnostics: Its Problems and Methods. M., 1975. Pp. 162-176.

Gurevich K.M. Statistics as an Apparatus of Proof in Psychological Diagnostics // Problems of Psychological Diagnostics. Tallinn, 1977. Pp. 206-225.

Gurevich K.M. What Is Psychological Diagnostics. M., 1985.
