User:Bataromatic/sandbox2

From Wikipedia, the free encyclopedia

The widespread definition in psychometrics, proposed by psychologist Stanley Smith Stevens (1946), is that measurement is "the assignment of numerals to objects or events according to some rule." Stevens introduced this definition in the same paper in which he proposed the four levels of measurement: nominal, ordinal, interval, and ratio.[1][2] This convention differs from the classical definition of measurement adopted in the physical sciences, namely that scientific measurement entails "the estimation or discovery of the ratio of some magnitude of a quantitative attribute to a unit of the same attribute" (p. 358).[3] This framework of distinguishing levels of measurement originated in psychology and is widely criticized by scholars in other disciplines.[4]

Stevens proposed his typology in a 1946 Science article titled "On the theory of scales of measurement".[5] In that article, Stevens suggested that all measurement in science was conducted using four different types of scales, unifying both "qualitative" measurement (described by his "nominal" type) and "quantitative" measurement (to differing degrees, all the rest of his scales). The concept of scale types later received the mathematical rigor that it lacked at its inception with the work of mathematical psychologists Theodore Alper (1985, 1987), Louis Narens (1981a, b), and R. Duncan Luce (1986, 1987, 2001). As Luce (1997, p. 395) wrote:

S. S. Stevens (1946, 1951, 1975) claimed that what counted was having an interval or ratio scale. Subsequent research has given meaning to this assertion, but given his attempts to invoke scale type ideas it is doubtful if he understood it himself ... no measurement theorist I know accepts Stevens's broad definition of measurement ... in our view, the only sensible meaning for 'rule' is empirically testable laws about the attribute.

While Stevens's typology is widely adopted, it is still being challenged by other theoreticians, particularly in the cases of the nominal and ordinal types (Michell, 1986).[6] Some, however, have argued that the degree of discord can be overstated. Hand says:[7]

Basic psychology texts often begin with Stevens’s framework and the ideas are ubiquitous. Indeed, the essential soundness of his hierarchy has been established for representational measurement by mathematicians, determining the invariance properties of mappings from empirical systems to real number continua. Certainly the ideas have been revised, extended, and elaborated, but the remarkable thing is his insight given the relatively limited formal apparatus available to him and how many decades have passed since he coined them.

Duncan (1986) objected to the use of the word measurement in relation to the nominal type, but Stevens (1975) said of his own definition of measurement that "the assignment can be any consistent rule. The only rule not allowed would be random assignment, for randomness amounts in effect to a nonrule".

The use of the mean as a measure of the central tendency for the ordinal type is still debatable among those who accept Stevens's typology. Many behavioural scientists do use the mean for ordinal data. This is often justified on the basis that the ordinal type in behavioural science is in fact somewhere between the true ordinal and interval types; although the interval difference between two ordinal ranks is not constant, it is often of the same order of magnitude.
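
The sensitivity of the mean to the numbers chosen for ordinal ranks can be illustrated with a small sketch. The ratings and the recoding below are hypothetical and are not taken from any cited study. The point is only that an order-preserving recoding of the categories can reverse which group has the higher mean, while the ordering of the medians is unaffected.

```python
from statistics import mean, median

# Hypothetical ratings on a 5-point ordinal scale for two groups.
group_a = [1, 2, 2, 5, 5]
group_b = [3, 3, 3, 3, 4]

# An order-preserving (monotone) recoding: the rank order of the
# categories is unchanged, only the spacing between them differs.
stretched = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}


def recode(scores, mapping):
    """Replace each ordinal code with its recoded value."""
    return [mapping[s] for s in scores]


for label, a, b in [
    ("original", group_a, group_b),
    ("recoded", recode(group_a, stretched), recode(group_b, stretched)),
]:
    print(label, "means:", mean(a), mean(b), "medians:", median(a), median(b))
```

With the original codes, group B has the higher mean; after recoding, group A does. The median of group B stays above that of group A in both cases. If the gaps between ranks are roughly equal, as the justification above supposes, such reversals become less likely, but the sketch shows why the practice remains contested.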

L. L. Thurstone made progress toward developing a justification for obtaining the interval type, based on the law of comparative judgment. A common application of the law is the analytic hierarchy process. Further progress was made by Georg Rasch (1960), who developed the probabilistic Rasch model that provides a theoretical basis and justification for obtaining interval-level measurements from counts of observations such as total scores on assessments.
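
As a minimal sketch, the dichotomous form of the Rasch model gives the probability of a correct response as a logistic function of the difference between a person's ability and an item's difficulty. The function names and parameter values below are illustrative assumptions. The example checks that the difference in log-odds between two persons is the same on every item, which is the kind of invariance that motivates treating the resulting estimates as interval-level.

```python
import math


def rasch_probability(ability, difficulty):
    """P(correct) under the dichotomous Rasch model (logistic in ability - difficulty)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


# Hypothetical person abilities and item difficulties, in logits.
ability_1, ability_2 = 1.0, -0.5
difficulties = [-1.0, 0.0, 2.0]

# For each item, compare the two persons on the log-odds scale.
differences = [
    log_odds(rasch_probability(ability_1, b)) - log_odds(rasch_probability(ability_2, b))
    for b in difficulties
]
print(differences)  # each entry is approximately ability_1 - ability_2
```

The comparison between the two persons does not depend on which item is used, although the response probabilities themselves differ from item to item.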

These divergent responses are reflected in alternative approaches to measurement. For example, methods based on covariance matrices are typically employed on the premise that numbers, such as raw scores derived from assessments, are measurements. Such approaches implicitly entail Stevens's definition of measurement, which requires only that numbers are assigned according to some rule. The main research task in social sciences is generally considered to be the discovery of associations between scores, and of factors posited to underlie such associations.[8]


strikethrough=original writing from page

This is all or in part lifted, do not copy directly:

Definition of measurement in the social sciences

The definition of measurement in the social sciences has a long history. A currently widespread definition, proposed by Stanley Smith Stevens (1946), is that measurement is "the assignment of numerals to objects or events according to some rule." This definition was introduced in the paper in which Stevens proposed four levels of measurement. Although widely adopted, this definition differs in important respects from the more classical definition of measurement adopted in the physical sciences, namely that scientific measurement entails "the estimation or discovery of the ratio of some magnitude of a quantitative attribute to a unit of the same attribute" (p. 358).[3]

Indeed, Stevens's definition of measurement was put forward in response to the British Ferguson Committee, whose chair, A. Ferguson, was a physicist. The committee was appointed in 1932 by the British Association for the Advancement of Science to investigate the possibility of quantitatively estimating sensory events. Although its chair and other members were physicists, the committee also included several psychologists. The committee's report highlighted the importance of the definition of measurement. While Stevens's response was to propose a new definition, which has had considerable influence in the field, this was by no means the only response to the report. Another, notably different, response was to accept the classical definition, as reflected in the following statement:

Measurement in psychology and physics are in no sense different. Physicists can measure when they can find the operations by which they may meet the necessary criteria; psychologists have but to do the same. They need not worry about the mysterious differences between the meaning of measurement in the two sciences (Reese, 1943, p. 49).[9]

These divergent responses are reflected in alternative approaches to measurement. For example, methods based on covariance matrices are typically employed on the premise that numbers, such as raw scores derived from assessments, are measurements. Such approaches implicitly entail Stevens's definition of measurement, which requires only that numbers are assigned according to some rule. The main research task, then, is generally considered to be the discovery of associations between scores, and of factors posited to underlie such associations.[8]

On the other hand, when measurement models such as the Rasch model are employed, numbers are not assigned based on a rule. Instead, in keeping with Reese's statement above, specific criteria for measurement are stated, and the goal is to construct procedures or operations that provide data that meet the relevant criteria. Measurements are estimated based on the models, and tests are conducted to ascertain whether the relevant criteria have been met.[citation needed]

Instruments and procedures

The first[citation needed] psychometric instruments were designed to measure the concept of intelligence.[10] One historical approach involved the Stanford-Binet IQ test, which derives from a scale developed originally by the French psychologist Alfred Binet. An alternative conception of intelligence is that cognitive capacities within individuals are a manifestation of a general component, or general intelligence factor, as well as cognitive capacity specific to a given domain.[citation needed]

Another major focus in psychometrics has been on personality testing. There has been a range of theoretical approaches to conceptualizing and measuring personality, though there is no widely agreed-upon theory. Some of the better-known instruments include the Minnesota Multiphasic Personality Inventory, the Five-Factor Model (or "Big 5") and tools such as the Personality and Preference Inventory and the Myers-Briggs Type Indicator. Attitudes have also been studied extensively using psychometric approaches.[citation needed] An alternative method involves the application of unfolding measurement models, the most general being the Hyperbolic Cosine Model (Andrich & Luo, 1993).[11]



This is all lifted, do not copy directly:

Later career and the Binet–Simon test

In 1899, Alfred Binet was asked to be a member of the Free Society for the Psychological Study of the Child. French education changed greatly at the end of the nineteenth century because of a law that made it mandatory for children ages six to fourteen to attend school. The group, of which Binet became a member, hoped to begin studying children in a scientific manner. Binet and many other members of the society were appointed to the Commission for the Retarded. The question became "What should be the test given to children thought to possibly have learning disabilities, that might place them in a special classroom?" Binet made it his problem to establish the differences that separate the normal child from the abnormal, and to measure such differences. L'Étude expérimentale de l'intelligence (Experimental Studies of Intelligence) was the book in which he described his methods; it was published in 1903.

Development of more tests and investigations began soon after the book, with the help of a young medical student named Theodore Simon. Simon had nominated himself a few years before as Binet's research assistant and worked with him on the intelligence tests that Binet is known for, which share Simon's name as well. In 1905, a new test for measuring intelligence was introduced and simply called the Binet–Simon scale. In 1908, they revised the scale, dropping, modifying, and adding tests and also arranging them according to age levels from three to thirteen.

In 1904 a French professional group for child psychology, La Société Libre pour l'Etude Psychologique de l'Enfant, was called upon by the French government to appoint a commission on the education of retarded children. The commission was asked to create a mechanism for identifying students in need of alternative education. Binet, being an active member of this group, found the impetus for the development of his mental scale.

Binet and Simon, in creating what is historically known as the Binet-Simon Scale, assembled a variety of tasks they thought were representative of typical children's abilities at various ages. This task-selection process was based on their many years of observing children in natural settings[12] and previously published research by Binet and others.[13] They then tested their measurement on a sample of fifty children, ten children in each of five age groups. The children selected for their study were identified by their school teachers as being average for their age. The purpose of this scale of normal functioning, which would later be revised twice using more stringent standards, was to compare children's mental abilities relative to those of their normal peers.[14]

The scale consisted of thirty tasks of increasing difficulty. The easier ones could be done by everyone. Some of the simplest test items assessed whether or not a child could follow a beam of light or talk back to the examiner. Slightly harder tasks required children to point to various named body parts, repeat back a series of 2 digits, repeat simple sentences, and define words like house, fork or mama. More difficult test items required children to state the difference between pairs of things, reproduce drawings from memory or to construct sentences from three given words such as "Paris, river and fortune." The hardest test items included asking children to repeat back 7 random digits, find three rhymes for the French word "obéissance" and to answer questions such as "My neighbor has been receiving strange visitors. He has received in turn a doctor, a lawyer, and then a priest. What is taking place?" (Fancher, 1985).

Reproduction of an item from the 1908 Binet-Simon intelligence scale, showing three pairs of pictures, about which the tested child was asked, "Which of these two faces is the prettier?" Reproduced from the article "A Practical Guide for Administering the Binet-Simon Scale for Measuring Intelligence" by J. W. Wallace Wallin in the March 1911 issue of the journal The Psychological Clinic (volume 5 number 1), public domain.

For the practical use of determining educational placement, the score on the Binet-Simon scale would reveal the child's mental age. For example, a 6-year-old child who passed all the tasks usually passed by 6-year-olds, but nothing beyond, would have a mental age that exactly matched his chronological age, 6.0 (Fancher, 1985).

Binet was forthright about the limitations of his scale. He stressed the remarkable diversity of intelligence and the subsequent need to study it using qualitative, as opposed to quantitative, measures. Binet also stressed that intellectual development progressed at variable rates and could be influenced by the environment; therefore, intelligence was not based solely on genetics, was malleable rather than fixed, and could only be found in children with comparable backgrounds.[14] Given Binet's stance that intelligence testing was subject to variability and was not generalizable, it is important to look at the metamorphosis that mental testing took on as it made its way to the U.S.

While Binet was developing his mental scale, the business, civic, and educational leaders in the U.S. were facing issues of how to accommodate the needs of a diversifying population, while continuing to meet the demands of society. There arose the call to form a society based on meritocracy[14] while continuing to underline the ideals of the upper class. In 1908, H.H. Goddard, a champion of the eugenics movement, found utility in mental testing as a way to evidence the superiority of the white race. After studying abroad, Goddard brought the Binet-Simon Scale to the United States and translated it into English.

Following Goddard in the U.S. mental testing movement was Lewis Terman, who took the Binet-Simon Scale and standardized it using a large American sample. The new Stanford-Binet scale was no longer used solely for advocating education for all children, as was Binet's objective. A new objective of intelligence testing was illustrated in the Stanford-Binet manual with testing ultimately resulting in "curtailing the reproduction of feeble-mindedness and in the elimination of an enormous amount of crime, pauperism, and industrial inefficiency".[15]

Addressing the question of why Binet did not speak out concerning the newfound uses of his measure, Siegler pointed out that Binet was somewhat of an isolationist in that he never traveled outside France and he barely participated in professional organizations.[14] Additionally, his mental scale was not adopted in his own country during his lifetime and therefore was not subjected to the same fate. Finally, when Binet did become aware of the "foreign ideas being grafted on his instrument" he condemned those who with "brutal pessimism" and "deplorable verdicts" were promoting the concept of intelligence as a single, unitary construct (White, 2000).

He conducted many studies of children. His experimental subjects ranged from 3 to 18 years old. Binet published the third version of the Binet-Simon scale shortly before his death in 1911. The Binet-Simon scale was and is hugely popular around the world, mainly because of the vast literature it has fostered, as well as its relative ease of administration.

Since his death, Binet has been honored in many ways, but two of these stand out. In 1917, the Free Society for the Psychological Study of the Child, of which Binet became a member in 1899 and which prompted his development of the intelligence tests, changed its name to La Société Alfred Binet, in memory of the renowned psychologist. The second honor did not come until 1984, when the journal Science 84 picked the Binet-Simon scale as one of twenty of the century's most significant developments or discoveries.

He studied sexual behavior, coining the term erotic fetishism to describe individuals with sexual interests in nonhuman objects, such as articles of clothing,[16] and linking this to the after-effects of early impressions in an anticipation of Freud.[17]

Between 1904 and 1909, Binet co-wrote several plays for the Grand Guignol theatre with the playwright André de Lorde.[18]

He also studied the abilities of Valentine Dencausse, the most famous chiromancer in Paris in those days.


Source: Latent Variables----> Psychology

Latent variables, as created by factor analytic methods, generally represent "shared" variance, or the degree to which variables "move" together. Variables that have no correlation cannot result in a latent construct based on the common factor model.[19]
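
The idea of "shared" variance can be sketched with simulated data. The example below is an illustration under stated assumptions, not a factor analysis: the loading of 0.8, the noise level, the sample size, and the helper `pearson` are all hypothetical choices. Two indicators generated from the same underlying factor correlate substantially, while two independently generated variables show a correlation near zero and so offer nothing for a common factor to explain.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# A simulated common factor and two indicators that both depend on it.
factor = [rng.gauss(0, 1) for _ in range(n)]
x1 = [0.8 * f + rng.gauss(0, 0.6) for f in factor]
x2 = [0.8 * f + rng.gauss(0, 0.6) for f in factor]

# Two variables generated independently of each other.
y1 = [rng.gauss(0, 1) for _ in range(n)]
y2 = [rng.gauss(0, 1) for _ in range(n)]

r_shared = pearson(x1, x2)
r_unrelated = pearson(y1, y2)
print("indicators of one factor:", round(r_shared, 2))
print("independent variables:", round(r_unrelated, 2))
```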

  1. ^ Kirch, Wilhelm, ed. (2008). "Level of Measurement". Encyclopedia of Public Health. Vol. 2. Springer. pp. 851–852. doi:10.1007/978-1-4020-5614-7_1971. ISBN 978-1-4020-5613-0.
  2. ^ Stevens, S. S. (7 June 1946). "On the Theory of Scales of Measurement". Science. 103 (2684): 677–680. Bibcode:1946Sci...103..677S. doi:10.1126/science.103.2684.677. PMID 17750512. S2CID 4667599.
  3. ^ a b Michell, Joel (August 1997). "Quantitative science and the definition of measurement in psychology". British Journal of Psychology. 88 (3): 355–383. doi:10.1111/j.2044-8295.1997.tb02641.x.
  4. ^ Michell, J. (1986). "Measurement scales and statistics: a clash of paradigms". Psychological Bulletin. 100 (3): 398–407. doi:10.1037/0033-2909.100.3.398.
  5. ^ Stevens, S. S. (7 June 1946). "On the Theory of Scales of Measurement". Science. 103 (2684): 677–680. Bibcode:1946Sci...103..677S. doi:10.1126/science.103.2684.677. PMID 17750512. S2CID 4667599.
  6. ^ Velleman, Paul F.; Wilkinson, Leland (1993). "Nominal, ordinal, interval, and ratio typologies are misleading". The American Statistician. 47 (1): 65–72. doi:10.2307/2684788. JSTOR 2684788.
  7. ^ Hand, David J. (2017). "Measurement: A Very Short Introduction—Rejoinder to discussion". Measurement: Interdisciplinary Research and Perspectives. 15 (1): 37–50. doi:10.1080/15366367.2017.1360022. hdl:10044/1/50223.
  8. ^ a b http://www.assessmentpsychology.com/psychometrics.htm
  9. ^ Reese, T.W. (1943). "The application of the theory of physical measurement to the measurement of psychological magnitudes, with three experimental examples". Psychological Monographs. 55: 1–89. doi:10.1037/h0061367.
  10. ^ "Los diferentes tipos de tests psicometricos - examen psicometrico". examenpsicometrico.com.
  11. ^ Andrich, D.; Luo, G. (1993). "A hyperbolic cosine latent trait model for unfolding dichotomous single-stimulus responses". Applied Psychological Measurement. 17: 253–276.
  12. ^ Wolf, Theta H. (1973). Alfred Binet. Chicago, IL: The University of Chicago Press. ISBN 9780226904986.
  13. ^ Gibbons, Aisa; Warne, Russell T. (2019). "First publication of subtests in the Stanford-Binet 5, WAIS-IV, WISC-V, and WPPSI-IV". Intelligence. 75: 9–18. doi:10.1016/j.intell.2019.02.005.
  14. ^ a b c d Siegler, Robert S. (1992). "The other Alfred Binet". Developmental Psychology. 28 (2): 179–190. doi:10.1037/0012-1649.28.2.179.
  15. ^ Terman, L., Lyman, G., Ordahl, G., Ordahl, L., Galbreath, N., & Talbert, W. (1916). The Stanford Revision and Extension of the Binet-Simon Scale for Measuring Intelligence. Baltimore: Warwick & York.
  16. ^ Binet, A. (1887). "Le fétichisme dans l'amour". Revue Philosophique. 24: 143–167, 252–274.
  17. ^ Freud, Sigmund (1991). On Sexuality: Three Essays on the Theory of Sexuality and Other Works. Penguin. p. 67. ISBN 0-140-13797-1.
  18. ^ "Grand Guignol Plays 1900 - 1909". GrandGuignol.com. Thrillpeddlers. Retrieved 10 November 2018.
  19. ^ Tabachnick, B.G.; Fidell, L.S. (2001). Using Multivariate Analysis. Boston: Allyn and Bacon. ISBN 978-0-321-05677-1.[page needed]
  20. ^ a b Borsboom, D.; Mellenbergh, G.J.; van Heerden, J. (2003). "The Theoretical Status of Latent Variables" (PDF). Psychological Review. 110 (2): 203–219. CiteSeerX 10.1.1.134.9704. doi:10.1037/0033-295X.110.2.203. PMID 12747522. Archived from the original (PDF) on 2013-01-20. Retrieved 2008-04-08.
  21. ^ Greene, Jeffrey A.; Brown, Scott C. (2009). "The Wisdom Development Scale: Further Validity Investigations". International Journal of Aging and Human Development. 68 (4): 289–320 (at p. 291). doi:10.2190/AG.68.4.b. PMID 19711618.
  22. ^ Spearman, C. (1904). ""General Intelligence," Objectively Determined and Measured". The American Journal of Psychology. 15 (2): 201–292. doi:10.2307/1412107. JSTOR 1412107.