Just read some articles from the May-June issue of American Scientist. Two of them really stood out to me:
1. "The Most Dangerous Equation" by Howard Wainer. (If you have an American Scientist subscription, or if your school has an institutional subscription, or if you are a Sigma Xi member, an online version is available here.) As (I inferred) an experimental scientist, the author argues that the most dangerous equation, at least among those equations ignorance of which leads to socio-economic loss, is what he calls de Moivre's equation (again I find myself awfully ignorant of the names of equations; but a rose by any other name would smell as sweet). This is an equation that should be familiar to most experimental scientists (physicists in particular), and I am especially familiar with it. The equation can be written as
$\sigma_{\text{mean}} = \sigma_{\text{sample}} / \sqrt{n}$
which says that the standard error of the mean is equal to the standard deviation of the sample divided by the square root of the sample size. Heuristically, this indicates a sort of cancellation between errors, especially when one takes measurements over large samples.
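To get a feel for the formula above, here is a minimal simulation sketch (mine, not from the article). It assumes normally distributed scores with a made-up mean of 80 and standard deviation of 10; the function names, the number of trials and the seed are all illustrative choices. It draws many samples of size n, measures how much the sample means scatter, and compares that with sigma / sqrt(n).

```python
import math
import random
import statistics


def simulated_standard_error(n, sigma=10.0, mu=80.0, trials=2000, seed=0):
    """Estimate the spread of the sample mean by brute force.

    Draws `trials` independent samples of size `n` from a normal
    distribution (an assumption made purely for illustration) and
    returns the standard deviation of the resulting sample means.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


def predicted_standard_error(n, sigma=10.0):
    """The prediction of de Moivre's equation: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


if __name__ == "__main__":
    # Print sample size, simulated spread and predicted spread side by side.
    for n in (4, 25, 100):
        print(n, round(simulated_standard_error(n), 3),
              round(predicted_standard_error(n), 3))
```

The simulated and predicted columns should agree to within a few percent, and both shrink as n grows. Note that the sketch plugs in the known population sigma, whereas in practice one would use the standard deviation estimated from the sample.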
The author illustrates the misunderstanding of this statistical phenomenon using five historical incidents. The most striking is the "Small Schools Movement". Quoting from the article:
The urbanization that characterized the 20th century led to ... increase in the size of schools. The era of one-room schoolhouses was replaced by one with large schools--often with more than a thousand students, dozens of teachers of many specialties and facilities that would not have been practical without the enormous increase in scale. Yet during the last quarter of the 20th century ... suggestion that smaller schools would provide a better education. In the late 1990s the Bill and Melinda Gates Foundation began supporting small schools on a broad-ranging, intensive, national basis ... They have since been joined in support ... by the Annenberg Foundation, the Carnegie Corporation, the Center for Collaborative Education, the Center for School Change, Harvard's Change Leadership Group ... and the U.S. Department of Education's Smaller Learning Communities Program.
Among the many pieces of "evidence" cited for "smaller is better" is this statistical gem:
... when schools are smaller, student achievement improves. The supporting evidence for this is that among high-performing schools, there is an unrepresentatively large proportion of smaller schools.
Yet, as the author goes on to note, his own investigation using publicly available data showed that, at the fifth-grade level in Pennsylvania, while small schools (lowest three percent in school size) are overrepresented fourfold among the top-achieving schools (top three percent in standardized tests), they are also overrepresented among the worst-achieving schools, this time sixfold. In fact, at the 11th-grade level, a linear regression shows that, on average, smaller schools actually do worse than large schools.
This suggests that the inference "because small schools are over-represented among the top schools, they must be better overall" is flawed. Indeed, this is just one of the consequences of de Moivre's equation. Since the standard error of the mean decreases as the sample size increases, a small sample is much more likely than a large one to exhibit a wild statistic far above (or, equally likely, below) average. To illustrate, consider the following scenario.
Suppose students are given a test on which each student scores either a 90 or a 70 with equal likelihood. Let's look at the statistical distribution for small schools versus slightly larger schools. For our small school, consider a school with 3 students. Tabulating all possibilities, we see that there is a 12.5% chance that the school (of 3 students) averages 90, another 12.5% that the average is 70, 37.5% that the average is 83.3, and 37.5% that the average is 76.7. Now consider a school with 6 students: the probabilities for the average scores are
90 - 1.5625%
86.7 - 9.375%
83.3 - 23.4375%
80 - 31.25%
76.7 - 23.4375%
73.3 - 9.375%
70 - 1.5625%
As one can already see, a small school is more likely than a big one to land on the high end (as well as on the low end) when averaging its students' scores.
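The tabulation above can be reproduced exactly with a few lines of Python. This is just a sketch of the toy model described here (each student independently scores 90 or 70 with probability 1/2); the function name and parameters are my own.

```python
from fractions import Fraction
from math import comb


def average_distribution(n_students, high=90, low=70):
    """Return {average score: probability} for a school of `n_students`,
    each independently scoring `high` or `low` with probability 1/2."""
    dist = {}
    for k in range(n_students + 1):  # k = number of students scoring `high`
        average = Fraction(k * high + (n_students - k) * low, n_students)
        # comb(n, k) of the 2**n equally likely outcomes have exactly k highs.
        dist[average] = Fraction(comb(n_students, k), 2 ** n_students)
    return dist


if __name__ == "__main__":
    for n in (3, 6):
        print(f"School with {n} students:")
        for average, prob in sorted(average_distribution(n).items(), reverse=True):
            print(f"  {float(average):5.1f} - {float(prob) * 100:g}%")
```

Trying larger values of `n_students` shows the extremes drying up quickly: the chance of a perfect 90 average halves with every additional student.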
Part of the implicit understanding that the author uses (and, I think, does not sufficiently stress) is that when comparing groups of data with similar means and differing variances, the groups with large variances tend to be over-represented in the "tails" far from the average, while the groups with small variances tend to concentrate around the norm. This fact is intuitively obvious, yet often overlooked.
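Here is a rough simulation sketch of that tail effect, reusing the 90-or-70 toy test from above. Everything specific in it is an assumption of mine rather than anything from the article: equal numbers of "small" schools (10 students) and "large" schools (200 students), 2000 schools of each kind, and a 3% cutoff chosen to echo the article's top three percent. Every student is statistically identical, so no school is truly better than another.

```python
import random


def small_school_share_in_tails(n_small=10, n_large=200, schools_per_group=2000,
                                tail_fraction=0.03, seed=0):
    """Return (share of small schools in the top tail,
               share of small schools in the bottom tail)."""
    rng = random.Random(seed)

    def school_average(n_students):
        # Each student scores 90 or 70 with equal probability.
        high_scorers = sum(rng.random() < 0.5 for _ in range(n_students))
        return 70 + 20 * high_scorers / n_students

    # Each entry is (average score, is_small_school).
    schools = [(school_average(n_small), True) for _ in range(schools_per_group)]
    schools += [(school_average(n_large), False) for _ in range(schools_per_group)]
    schools.sort(key=lambda school: school[0])

    tail_size = round(tail_fraction * len(schools))
    bottom, top = schools[:tail_size], schools[-tail_size:]

    def small_share(group):
        return sum(is_small for _, is_small in group) / len(group)

    return small_share(top), small_share(bottom)


if __name__ == "__main__":
    top_share, bottom_share = small_school_share_in_tails()
    print(f"small schools among the top 3%:    {top_share:.0%}")
    print(f"small schools among the bottom 3%: {bottom_share:.0%}")
```

Small schools make up half of the simulated population, yet they should dominate both the best and the worst 3%, purely because their averages have the larger variance.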
(I am "familiar" with this through one of the junior papers I wrote as an undergrad; mine, however, was on the Heisenberg limit: how entangled particles can average out statistical noise better than independent random variables.)
2. "Liquid-Mirror Telescopes" by Paul Hickson. (I linked it because I think it is free access.) This article is of particular interest to me because of the International Physics Olympiad. In IPhO'01, our experimental section was precisely that: a parabolic mirror surface formed by a spinning liquid. Of course, in telescopes they use a large-diameter plate of mercury, whereas we used a small tub of gelatin, but the principle is the same. I have fond memories of DJP getting sternly warned because he completed the experiment too early, got bored, and started tilting the liquid mirror assembly dangerously to project pretty patterns on the ceiling using a hand laser.
This issue of the magazine also answered a question that has been in the back of my mind for a couple of years, ever since S and I flew to SFO en route to Taiwan.
What the heck are these funky colors around San Francisco Bay?
The answer, quoting American Scientist, Volume 95, No. 3:
Water from the bay is pumped in [to the Cargill salt ponds], then shuttled from pond to pond as it evaporates to leave behind solid salt, a process that takes about five years. The increasingly salty brine hosts a succession of life forms, so the color of the pond can indicate the concentration of its salts. Low to mid-salinity ponds appear deep green; the color lightens, then shifts to oranges and reds as rising salinity favors salt-loving algae, brine shrimp and microorganisms.