Apparently it has long been assumed in some circles that early sex by teenagers leads to later delinquency. Two recent papers demonstrate just how muddled this theory is (along with most theories generalizing about human behavior), because they reach different conclusions depending on how the data were analyzed. The purpose of the first paper (Armour, S. and D.L. Haynie, 2007. Adolescent sexual debut and later delinquency. Journal of Youth and Adolescence 36:141-152) was to use data to support the theory, which it does. The second paper (Harden, K.P., J. Mendle, J.E. Hill, E. Turkheimer and R.E. Emery, 2008. Rethinking timing of first sex and delinquency. Journal of Youth and Adolescence, in press) uses the same dataset to reach the opposite conclusion, that earlier sex reduces future delinquency.
The second group of authors of course claim that their analysis is the better one, and in this case it is true. These papers, in fact, are a good demonstration of one of the major problems of large-dataset human studies, which is that they only control for factors (in this case, survey responses about race, income, parents’ education, GPA, drug use, etc.) that the researchers imagine could affect the data, and not for the hundreds of other factors that could also affect it but are ignored out of practicality or researcher bias. The authors’ hope is that their use of a giant dataset will obscure the fact that important information is lacking.
(Once again, we will put aside the first major problem of such studies, the use of self-reporting data. Of course since both groups of authors rely on them, neither mentions how unreliable they are, especially, one might assume, with regard to sexual experience. And one might also imagine that the group of people who are most likely to lie about sexual experience is teenagers.)
The second study is the better analysis because its authors recognize that pooling all the data loses important information. Pooling teenagers from all cultures and walks of life produces meaningless averages. To repeat a very nice analogy used by the authors of the second paper: if you wish to correlate meat consumption with life expectancy, and you compare two countries, one primarily meat-eating and the other not, you find a positive relationship, with higher meat-eating correlating with higher life expectancy. But a third, ignored variable also correlates positively with meat-eating, and that is level of industrialization. So to truly understand the relationship between meat-eating and life expectancy, you must control for industrialization. When the analysis is rerun within one country, which holds industrialization roughly constant, the correlation between meat-eating and life expectancy is negative.
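The sign flip in the meat analogy can be reproduced with a toy calculation. The sketch below uses invented numbers (the two hypothetical countries, the 0.5-year penalty per unit of meat, and the helper names `pearson` and `make_country` are all assumptions for illustration, not anything taken from either paper). It only shows that a pooled correlation can be positive while every within-country correlation is negative.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def make_country(base_meat, base_life, n=10):
    """Invented data: within a country, more meat goes with shorter life."""
    meat = [base_meat + i for i in range(n)]
    life = [base_life - 0.5 * i for i in range(n)]
    return meat, life


# Hypothetical countries: the industrialized one eats more meat AND lives longer,
# so industrialization is the ignored third variable.
rural_meat, rural_life = make_country(base_meat=0, base_life=60)
indus_meat, indus_life = make_country(base_meat=20, base_life=80)

# Pooling both countries ignores industrialization entirely.
pooled_r = pearson(rural_meat + indus_meat, rural_life + indus_life)

# Analyzing each country separately holds industrialization fixed.
within_rural_r = pearson(rural_meat, rural_life)
within_indus_r = pearson(indus_meat, indus_life)

print(f"pooled correlation:        {pooled_r:+.2f}")
print(f"within rural country:      {within_rural_r:+.2f}")
print(f"within industrial country: {within_indus_r:+.2f}")
```

With these made-up inputs the pooled correlation comes out positive and both within-country correlations come out negative, which is the same pattern the analogy describes.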
In addition, what is found in both papers is simply correlation, not causation (a trap that first-year undergraduates are taught to avoid, and yet one that catches so many human-behavior researchers). That is, the only information one has after the within-country meat analysis is that meat-eating is associated with lower life expectancy. The analysis has not shown that meat-eating causes lower life expectancy.
These are the two main problems with the first paper. The authors pool individuals across a wide range of cultural norms, which gives them a spurious result, and then conclude that early teen sex causes delinquency when the two are only correlated. Even though they use a crude control for cultural influence (average reported age of first sex for a given teenager’s high school), they ignore any potential unstudied factor that could cause both (just as industrialization causes both higher life expectancy and more meat-eating), obscuring the results for individuals.
The second paper solves that problem by analyzing only the identical twins in the dataset (which was large enough for them to have data for 289 twin pairs), thereby controlling for both genetics (which the twins share exactly) and environment (which twins living in the same household largely share). This is an appropriate twin analysis because (for this main point at least) the authors don’t need to separate genetics from environment to answer their question. (Twin studies that do attempt that separation confound objective data with subjective assumptions.)
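The logic of a within-pair comparison can be sketched in a few lines. This is a minimal illustration with invented data, not the second paper's actual model or results: `family_risk`, the 0.3 and 0.5 coefficients, and the helper names are all hypothetical. Each pair shares a family-level factor that drives both earlier sex and more delinquency, and taking the difference between twins cancels that shared factor.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of y on x (with an intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    return cov / var_x


def slope_through_origin(dxs, dys):
    """Slope of within-pair differences; no intercept is needed for differences."""
    return sum(dx * dy for dx, dy in zip(dxs, dys)) / sum(dx * dx for dx in dxs)


# Invented twin pairs: each entry is ((age_a, delinq_a), (age_b, delinq_b)).
pairs = []
for k in range(10):
    family_risk = float(k)         # shared by both twins (genes + household)
    mean_age = 18.0 - 0.3 * k      # riskier families: earlier first sex
    gap = 0.5 + 0.1 * (k % 3)      # how far each twin sits from the pair mean
    twins = []
    for sign in (+1, -1):
        age = mean_age + sign * gap
        # Assumed individual effect: the later twin is slightly MORE delinquent.
        delinquency = family_risk + 0.5 * sign * gap
        twins.append((age, delinquency))
    pairs.append(tuple(twins))

# Pooled analysis: treat all 20 twins as unrelated individuals.
ages = [age for pair in pairs for age, _ in pair]
delinq = [d for pair in pairs for _, d in pair]
pooled_slope = ols_slope(ages, delinq)

# Within-pair analysis: difference the twins so the shared factor drops out.
age_diffs = [a[0] - b[0] for a, b in pairs]
delinq_diffs = [a[1] - b[1] for a, b in pairs]
within_slope = slope_through_origin(age_diffs, delinq_diffs)

print(f"pooled slope:      {pooled_slope:+.3f}")
print(f"within-pair slope: {within_slope:+.3f}")
```

In this toy setup the pooled slope is negative (later first sex appears to go with less delinquency across families), while the within-pair slope is positive, because the shared family factor was producing the pooled association.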
On top of all this, though, is another major flaw in the dataset, which the second group of authors strangely acknowledge despite their analysis. The supposedly “independent” (time of first sex) and “dependent” (delinquency) variables are by definition related from the start, because in much of American society, teen sex itself is considered delinquent behavior. What they are doing is a bit like asking whether or not shoplifting is correlated with delinquency. This certainly confounds the first study.
What does it mean that the second study found that identical twins who have their first sexual experience earlier than their siblings are less likely to engage in delinquent behavior? The authors seem to feel they have no choice but to conclude that there is probably no relationship between these factors at all. Perhaps that is exactly what they would have found statistically if they had used a Bonferroni correction for their dozen or so analyses. Either that, or delinquency is caused by sexual frustration, and the problem of misbehaving teens is now solved.
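For readers unfamiliar with it, the Bonferroni correction simply divides the significance threshold by the number of tests run. The sketch below assumes twelve tests, matching the "dozen or so" above. The p-values are invented for illustration and are not taken from the paper, and the `bonferroni` helper is a hypothetical name.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the corrected threshold and a per-test significance flag."""
    m = len(p_values)
    threshold = alpha / m  # each test must clear alpha divided by the test count
    return threshold, [p <= threshold for p in p_values]


# Twelve invented p-values standing in for a dozen separate analyses.
p_values = [0.001, 0.03, 0.04, 0.20, 0.45, 0.07,
            0.012, 0.60, 0.33, 0.09, 0.51, 0.049]

uncorrected = [p <= 0.05 for p in p_values]
threshold, significant = bonferroni(p_values, alpha=0.05)

print(f"corrected threshold: {threshold:.5f}")
print(f"significant before correction: {sum(uncorrected)}")
print(f"significant after correction:  {sum(significant)}")
```

With these made-up values, several results pass the usual 0.05 cutoff but only one survives the corrected threshold, which is the kind of shrinkage the correction is designed to produce.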