A Mathematician Reads The Newspaper
Company Charged with Ethnic Bias in Hiring
Test Disparities Need Not Imply Racism
The comedian Mort Sahl remarks that some newspapers might report a nuclear exchange between the United States and Russia with the headline WORLD ENDS: WOMEN AND MINORITIES HARDEST HIT. Sarcasm and hyperbole aside, victimization and the differential treatment of groups, whether intentional or not, are the basis for many a news story. The percentage of African-American students at elite colleges, the proportion of women in managerial positions, and the ratio of Hispanic representatives in legislatures have all been written about extensively. Oddly enough, the shape of normal bell-shaped statistical curves sometimes has unexpected consequences for such situations. For example, even a slight divergence between the averages of different population groups is accentuated at the extreme ends of these curves, and these extremes often receive inordinate attention in the press. There are other inferences that have been drawn from this fact, some involving social policy issues such as affirmative action and jobs programs. The issue is a charged one, and I don't wish to endorse any dubious claims, but merely to clarify some mathematical points.
As an illustration, assume that two populations vary along some dimension - height, for example. Although it is not essential to the argument, make the further assumption that the two groups' heights vary in a normal or bell-shaped manner (see diagram). Then even if the average height of one group is only slightly greater than the average height of the other, people from the taller group will constitute a large majority among the very tall (the right tail of the curve). Likewise, people from the shorter group will constitute a large majority among the very short (the left tail of the curve). This is true even though the bulk of the people from both groups are of roughly average stature. Thus if group A has a mean height of 5' 8" and group B a mean height of 5' 7", then (depending on the exact variability of the heights) perhaps 90 percent or more of those over 6' 2" will be from group A. In general, any differences between the two groups will always be greatly accentuated at the extremes.
[Diagram: Two Normal Curves. Small differences in the mean lead to large differences at the extremes.]
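The height illustration can be checked numerically. The following is a minimal sketch, not a calculation from the original text: it assumes two equally sized groups with the same standard deviation, and the three standard deviations tried are illustrative guesses. The helper names `normal_tail` and `share_of_a_above` are made up for this example.

```python
import math

def normal_tail(x, mean, sd):
    """P(X > x) for X ~ Normal(mean, sd), via the complementary error function."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))

def share_of_a_above(threshold, mean_a, mean_b, sd):
    """Fraction of people taller than `threshold` who belong to group A.

    Assumes (hypothetically) that the groups are equally large and share one sd.
    """
    tail_a = normal_tail(threshold, mean_a, sd)
    tail_b = normal_tail(threshold, mean_b, sd)
    return tail_a / (tail_a + tail_b)

MEAN_A, MEAN_B = 68.0, 67.0  # 5 ft 8 in and 5 ft 7 in, in inches
THRESHOLD = 74.0             # 6 ft 2 in, in inches

for sd in (3.0, 2.0, 1.5):   # assumed spreads, in inches
    share = share_of_a_above(THRESHOLD, MEAN_A, MEAN_B, sd)
    print(f"sd = {sd} in: {share:.0%} of those over 6 ft 2 in are from group A")
```

Under these assumptions the share from group A rises as the spread shrinks, which is the sense in which the figure depends on the exact variability of the heights.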
These simple ideas can be used and misused by people of very different political persuasions. My concerns, as I've said, are only with some mathematical aspects of a very complicated story; let me again illustrate with a somewhat idealized case. Many people submit their job applications to a large corporation. Some of these people are Mexican and some are Korean, and the corporation uses a single test to determine which jobs to offer to whom. For whatever reasons (good, bad, justifiable or not), let's assume that although the scores of both groups are normally distributed with similar variability, those of the Mexican applicants are slightly lower on average than those of the Korean applicants.
The corporation's personnel officer notes the relatively small differences between the groups' means and observes with satisfaction that the many mid-level positions are occupied by both Mexicans and Koreans. [One might ask why there is any reference to Koreans and Mexicans in the first place-LB] She is puzzled, however, by the preponderance of Koreans assigned to the relatively few top jobs, those requiring an exceedingly high score on the qualifying test. [It could just be that Koreans have cultural traits giving them better organisational skills-LB] The personnel officer does further research and discovers that most of the holders of the comparably few bottom jobs, assigned to applicants because of their very low scores on the qualifying test, are Mexican. She may suspect racism, but the result might just as well be an unforeseen consequence of the way the normal distribution works. Paradoxically, if she lowers the threshold for entrance to mid-level jobs, she will actually end up increasing the percentage of Mexicans in the bottom category. [So much for positive discrimination-LB]
The fact is that groups differ in history, interests, and cultural values and along a whole host of other dimensions (which are impossible to disentangle). These differences constitute the group's identity and are what makes it possible even to talk about a collection of people as a group. Confronted with these social and historical dissimilarities, then, we shouldn't be astonished that members' scores on some standardized test are also likely to differ in mean and, much more substantially, at the extremes of the test-score distribution. (Much of this discussion is valid even if the distribution is not the normal bell-shaped one.) Such statistical disparities are not necessarily evidence of racism or ethnic prejudice, although, without a doubt, they sometimes are. [The trick is to know when it is and when it isn't-LB] One can and should debate whether the tests in question are appropriate for the purpose at hand, but one shouldn't be surprised when normal curves behave normally. As long as I'm issuing pronouncements, let me make another: the basic unit upon which our society or, indeed, any liberal society ("indeed" is a sure sign of something pompous coming up) is founded is the individual [Ref: M.Laver "The Politics of Private Desire"], not the group; I think it should stay that way.
Aside from having a questionable rationale, schemes of strict proportional representation are impossible to implement. Another thought experiment illustrates this point. Imagine a company - let's call it PC Industries - operating in a community that is 25 percent black, 75 percent white, 5 percent homosexual, and 95 percent heterosexual. Unknown to PCI and the community is the fact that only 2 percent of the blacks are homosexual, whereas 6 percent of the whites are. Making a concerted attempt to assemble a work force of 1,000 that "fairly" reflects the community, the company hires 750 whites and 250 blacks. However, just 5 of the blacks (or 2 percent) would be homosexual, whereas 45 of the whites (or 6 percent) would be (totalling 50, or 5 percent of all workers). Despite these efforts, the company could still be accused by its black employees of being homophobic, since only 2 percent of the black employees would be homosexual, not the community-wide 5 percent. The company's homosexual employees could likewise claim that the company was racist, since only 10 percent of their members would be black, not the community-wide 25 percent. White heterosexuals would certainly make similar complaints.
To complete the reductio ad absurdum, factor in several other groups: Hispanics, women, Norwegians, even. Their memberships will likely also intersect to various unknown degrees.* People will identify with varying intensity with various groups to which they belong (whose definitions are vague at best). The backgrounds and training across these various cross sections and intersections are extremely unlikely to be uniform. Statistical disparities will necessarily result.
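The threshold paradox can be made concrete with a short calculation. This is only a sketch under stated assumptions: the score scale (means of 98 and 102, standard deviation 15), the equal group sizes, and the cutoffs are all hypothetical, and `share_below` is a name invented for the example.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd)."""
    return 0.5 * math.erfc((mean - x) / (sd * math.sqrt(2)))

def share_below(cutoff, mean_low, mean_high, sd, frac_low=0.5):
    """Fraction of applicants scoring below `cutoff` who come from the lower-mean group.

    `frac_low` is the (assumed) fraction of all applicants in that group.
    """
    low = frac_low * normal_cdf(cutoff, mean_low, sd)
    high = (1.0 - frac_low) * normal_cdf(cutoff, mean_high, sd)
    return low / (low + high)

# Hypothetical test scale: the two group means differ by only 4 points.
MEAN_LOW, MEAN_HIGH, SD = 98.0, 102.0, 15.0

for cutoff in (85.0, 75.0, 65.0):  # successively lower entrance thresholds
    share = share_below(cutoff, MEAN_LOW, MEAN_HIGH, SD)
    print(f"cutoff {cutoff:.0f}: {share:.1%} of the bottom category is from the lower-mean group")
```

Each time the cutoff drops, the bottom category shrinks but moves further into the left tail, so the lower-mean group's share of it grows.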
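The arithmetic of the PC Industries example is easy to reproduce. The sketch below uses only the percentages given in the paragraph above; the variable names are illustrative.

```python
WORKFORCE = 1000

# Community percentages from the thought experiment.
group_pct = {"black": 25, "white": 75}
homosexual_pct = {"black": 2, "white": 6}  # rate within each group

# Hire in proportion to race, then apply each group's own rate.
hired = {g: WORKFORCE * pct // 100 for g, pct in group_pct.items()}
homosexual = {g: hired[g] * homosexual_pct[g] // 100 for g in hired}
total_homosexual = sum(homosexual.values())

print("hired by race:", hired)
print("homosexual employees by race:", homosexual)
print(f"homosexual share of work force: {total_homosexual / WORKFORCE:.0%}")
print(f"homosexual share of black employees: {homosexual['black'] / hired['black']:.0%}")
print(f"black share of homosexual employees: {homosexual['black'] / total_homosexual:.0%}")
```

The work force matches the community on each attribute taken separately (25 percent black, 5 percent homosexual), yet neither of the cross-tabulated shares matches the community-wide figure.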
Racism and homophobia and all other forms of group hatreds are real enough without making them our unthinking first inference when confronted with such disparities.
[Thus is shown the folly of thinking in terms of gender and race, and not in terms of people who may or may not have greater propensities or abilities regardless of what group they belong to-LB]
* A partially rhetorical question of tangential relevance: Assume that an organization wishes to "encourage" those having characteristic C, but cannot directly inquire of anyone whether he or she possesses it. Assume further that Mr X has a surname 20 percent of whose co-owners have characteristic C. If one knows nothing else about Mr X, then it seems prudent to suppose that there is a 20 percent chance that Mr X possesses C. If one later discovers that Mr X comes from a neighbourhood 70 percent of whose members have characteristic C, what should one's estimate now be of the likelihood that Mr X possesses C? And what if one subsequently learns that Mr X is also an active member of a nation-wide organization only 3 percent of whose members possess characteristic C? With all this information, what can one now conclude about the chances that Mr X has C?
Harvard Psychiatrist Believes Patients Abducted by Aliens
Mathematically Creating One's Own Pseudoscience
In addition to the occasional piece of informed advocacy, science reporting ought every so often to gently debunk pseudoscientific extravagances. The enduring obsessions with guardian angels and with statues that bleed or cry are cases in point. So is the writing of the Harvard psychiatrist John Mack [Ref: S.Blackmore], whose recent book chronicles (and more or less endorses the authenticity of) the experiences of patients of his who assert that they've been abducted by aliens in UFOs. It received wide notice, and both the Washington Post and the New York Times ran profiles on him, which, while not exactly credulous, were not exactly incredulous either.
One might have guessed that such a radical claim would have galvanized scores of reporters. Time magazine's James Willwerth had no trouble locating Donna Bassett, one of Dr Mack's patients. Ms Bassett had insinuated herself into his program and, pretending to be an abductee, concocted wild stories that made their way into Mack's book. And James Gleick in the New Republic wrote a scathing review of the book and what he termed its weaselly dodges and equivocations. Of course, The Skeptical Inquirer, whose reason for existence is the critical examination of such claims, also printed an article on the book. (As a fellow of the Committee for the Scientific Investigation of Claims of the Paranormal, which publishes The Skeptical Inquirer, I may be biased, but in my estimation the publication deserves a Pulitzer Prize for its work on these issues over the years.)
Sometimes mathematics is also helpful in uncloaking pseudoscientific claims and explaining their appeal. I wrote earlier about the plethora of coincidences that become apparent when one skims a paper, flicks through a magazine, and channel-surfs through cable television (not to mention simply living one's life). These remarkable relationships between totally dissimilar items frequently seem to have an air of scientific hypotheses: sunspots and the stock market, hemlines and presidential elections, Super Bowl outcomes and the economy. Very often there is some personal connection or some element of self-reference involved in these relationships. (This morning I prefixed the file names associated with this book with MRN for "Mathematician Reads Newspaper," and twenty minutes later I learned that former President Nixon had died. The undoubtedly cosmic connection between these two events is that Nixon's initials, RMN, are a permutation of MRN.) The sheer number of such possible links and associations should convince one that almost all are merely coincidences.
Rather than rehash the skeptical arguments, let me present a mathematical recipe anyone may use to develop his or her very own personal pseudoscience. The method comes from the Dutch physicist Cornelis de Jager, who used it to advance a theory about the metaphysical properties of Dutch bicycles.
The recipe: Take any four numbers associated with you (height, weight, birthday, social security number, whatever you like) and label them X, Y, Z and W. Now consider the expression X^{a}Y^{b}Z^{c}W^{d}, where the exponents a, b, c and d range over the values 0, 1, 2, 3, 4, 5, 1/2, 1/3, pi, or the negative of these numbers. (For any number N, N^{1/2} and N^{1/3} equal the square root and cube root of N, respectively, and N to a negative exponent, say, N^{-2}, is equal to one over N to the corresponding positive exponent, 1/N^{2}.) Since each of the four exponents may be any one of these seventeen numbers, the number of possible choices of a, b, c, and d is, by the multiplication principle, 83521 (17x17x17x17). There are thus this many values for the expression X^{a}Y^{b}Z^{c}W^{d}.
Among all these values, there will likely be several that equal, to two or three decimal places, universal constants such as the speed of light, the gravitational constant, Planck's constant, the fine structure constant, and so on. (If there are not, the units in which these constants are expressed can be altered.) A computer program can easily be written that can determine which of these universal constants is equal to one of the 83521 numbers generated from your original four.
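Such a program might look like the sketch below. It is one possible reading of the recipe, not de Jager's own code: the four "personal" numbers are invented, the two target constants are given in arbitrarily chosen units, and `best_matches` is a hypothetical helper. It works with logarithms, so all four numbers must be positive.

```python
import itertools
import math

# The seventeen allowed exponents: 0, plus each base value and its negative.
BASE = [1, 2, 3, 4, 5, 1 / 2, 1 / 3, math.pi]
EXPONENTS = [0.0] + [sign * e for e in BASE for sign in (1, -1)]

def best_matches(numbers, target, top=3):
    """Return the `top` (error, exponents) pairs with the smallest log-scale error.

    error = |ln(X^a Y^b Z^c W^d) - ln(target)|, roughly the relative error
    when it is small. All numbers and the target must be positive.
    """
    logs = [math.log(n) for n in numbers]
    log_target = math.log(target)
    scored = []
    for exps in itertools.product(EXPONENTS, repeat=len(numbers)):
        log_value = sum(e * l for e, l in zip(exps, logs))
        scored.append((abs(log_value - log_target), exps))
    scored.sort(key=lambda pair: pair[0])
    return scored[:top]

# Invented personal numbers: height (in), weight (lb), birth month, birth day.
personal = [70.0, 165.0, 7.0, 4.0]

# Two of the constants named in the text, in arbitrarily chosen units.
targets = {
    "speed of light (km/s)": 299792.458,
    "gravitational constant (SI)": 6.674e-11,
}

for name, target in targets.items():
    error, exps = best_matches(personal, target, top=1)[0]
    value = math.prod(n ** e for n, e in zip(personal, exps))
    print(f"{name}: exponents {exps} give {value:.6g} (log error {error:.2e})")
```

Changing the units of a target simply rescales it, which gives the search another chance at a close match, as the parenthetical remark above suggests.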
Thus you might learn that for your choice of X, Y, Z and W the number X^{2}Y^{2/3}Z^{-3}W^{-1} is equal to the sun's distance from the earth. Or you might discover any of a host of other correspondences between your personal numbers and these universal constants - all without having to undergo the rigors of alien abduction. De Jager found that the square of his bike's pedal diameter multiplied by the square root of the product of the diameters of his bell and light was equal to 1,836, the ratio of the mass of the proton to that of an electron.
Incidentally, the ratio of the height of the Sears Building in Chicago to the height of the Woolworth Building in New York is the same to four significant digits (1.836 versus 1,836) as the latter ratio. Salacious versions of this game are possible, of course. Compare it also to the coincidental similarities, described in the first section, that can be found to link any two American presidents.
That such accidental linkages and their more mundane, non-numerical cousins are extremely common is not fully appreciated. As a result hucksters can cash in on people's tendency to ascribe significance to any meaningless coincidence. Consider the foreign con artist whose lure was that he could help students hoping to enter a very competitive national university. The man bragged that he knew the arcane details of the admission process, had contacts with the appropriate officials, and so on. After gathering detailed information from the students, he collected an exorbitant fee, promising to return it if the student was not admitted. Every year he threw their information out, yet every year some of the students got in anyway. Their fees he kept.
Similar stories can be told about miracle medical interventions whose efficacy is only an illusion. Many people simply improve on their own. As in the case of randomness and the stock market, it's quite easy to see patterns, especially if one wants very much to find them. The lesson for all of us is that talking only to people who have a vested interest in some result or linkage can be beguiling (especially if that person is a Harvard professor of psychiatry); gullible journalism is often the result.
Of course, coincidences occasionally point up valuable yet overlooked connections or, vastly less often, defective scientific laws. But, as the philosopher David Hume observed more than 200 years ago, every piece of evidence for a miraculous coincidence - that is, for a contravention of natural law - is also evidence for the proposition that the regularities that the miracle contravened are not really laws of nature after all. Any near-instantaneous transmission of voices across great distances might have been considered a miracle centuries ago. As it turned out, however, the scientific principles prohibiting or seeming to prohibit such transmissions were not, in fact, natural laws.
Not only are coincidences not miraculous but - Freudians, tabloids, and popular sentiment to the contrary - an overwhelming majority of them have no significance whatsoever. Neither, I might add, will the expected numerological fatuities connected to the turn of the millennium in 2000 (2001 for purists). Since 1998 equals 3 times 666, the festivities may begin even sooner.
John Allen Paulos @ABCNews | GO Guide | Temple.Edu