A new article on CNN by psychology professors Wendy Williams and Stephen Ceci boldly proclaims that gender bias in Science, Technology, Engineering and Mathematics (STEM) is a myth. Their research has been published in the Proceedings of the National Academy of Sciences (PNAS). Unfortunately, their work rests on a flawed methodological premise, and their conclusions do not match their study design. This is not the first time these researchers have whipped up controversy by declaring the end of sexism in science.
Williams and Ceci write on CNN:
Many female graduate students worry that hiring bias is inevitable. A walk through the science departments of any college or university could convince us that the scarcity of female faculty (20% or less) in fields like engineering, computer science, physics, economics and mathematics must reflect sexism in hiring.
But the facts tell a different story…
Our results, coupled with actuarial data on real-world academic hiring showing a female advantage, suggest this is a propitious time for women beginning careers in academic science. The low numbers of women in math-based fields of science do not result from sexist hiring, but rather from women’s lower rates of choosing to enter math-based fields in the first place, due to sex differences in preferred careers and perhaps to lack of female role models and mentors.
In other words, Williams and Ceci claim that while women may encounter sexism before and during graduate training and after becoming professors, the only sexism they face in the hiring process is bias in their favour.
Williams and Ceci’s data show that, amongst their sample, women and men faculty say they would not discriminate against a woman candidate for a tenure-track position at a university. Sounds great, right? The problem is the discrepancy between their study design, which elicits hypothetical responses to hypothetical candidates in a manner that is nothing like real-world hiring conditions, and the researchers’ conclusion, which is that this hypothetical setting dispels the “myth” that women are disadvantaged in academic hiring. The background to this problem of inequality is that it is not a myth at all: a plethora of robust empirical research already shows not only that there are fewer women in STEM fields, but that women are less likely to be hired for STEM jobs, as well as promoted, remunerated and professionally recognised, in every respect of academic life.
Williams and Ceci sent out an email survey to a randomised sample of over 2,000 faculty members in the USA in two maths-intensive fields where women are under-represented (engineering and economics) and two non-maths-intensive fields where women are relatively better represented (psychology and biology). They had a 34% response rate, meaning their final sample was over 700 faculty. This rate of response is standard in many email surveys, but with this sort of study design, researchers need to critically examine and control for bias. In the social sciences, we know that people will participate in studies where 1) they are given an incentive (usually payment); or 2) they have a personal stake or interest in the study.
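As a quick arithmetic check of those figures (a sketch only; the invitation and respondent counts are as reported in the study):

```python
# Reported figures: roughly 2,090 faculty emailed, 711 provided data.
invited = 2090
responded = 711
response_rate = responded / invited
print(f"response rate: {response_rate:.0%}")  # prints "response rate: 34%"
```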
Williams and Ceci say they have addressed the self-selection bias of their sample by conducting two control experiments. In one, they sent out surveys to only 90 psychology faculty, who were paid $25 for participation. They had a 91% response rate (82 agreed to participate). Psychology not only has one of the highest proportions of women faculty relative to other fields, but this discipline uses gender as a central concept of study. That means that awareness of gender issues is higher than in most other fields. So including psychology as a control is not a true reflection of gender bias in broader STEM fields.
In another control study, Williams and Ceci surveyed engineering faculty by sending out hypothetical applicants’ CVs to 35 academics. This means that only a small sub-set of participants evaluated material resembling what we usually review when considering a candidate pool.
The rest of the sample – over 500 participants – were asked to rate three candidates based on narratives. This is not how we hire scientists.
In effect the study design does not simulate the conditions in which hiring decisions are made. Instead, participants self-selected to participate in a study knowing they’d be judging hypothetical candidates. While the researchers included a “foil” in their study design (one weaker male candidate) to contrast with two identical candidates who only varied in their gender, it is very easy to see from their study design that the researchers were examining gender bias in hiring.
Participants read a short vignette about three candidates in which the gender pronouns (he and she) were varied; some candidates were described as single; some were single divorced mothers; some were married mothers; with equivalent variations for men. Some of the stories contained adjectives usually associated with men (independent, analytic), others with “feminine” characteristics (creative, kind). Participants assessed the candidates based on a narrative written by a hypothetical hiring committee chair, and were then asked to rate their preferred candidate. Under these highly atypical conditions, participants said they were almost equally likely to hire women and men, and in some cases, some sub-groups said they would prefer to hire a woman.
Here’s the thing: we don’t hire scientists based on short narratives.
When we hire scientists, the first thing that is assessed is their CV. It is the CV that gets an interview; the interviewee sits before a panel; individual panellists make notes; the committee makes a decision together. The researchers claim that their control groups and their “consultants” prove that these individual evaluations would be no different from the way a panel makes hiring choices. To suggest this is ludicrous given that they have no data about how hiring panels make decisions. If they did have these data, their study would be a completely different piece of research.
Gender Bias in Hiring
The process of hiring any professional is the outcome of social interaction. Biases shape social exchanges. Biases also influence how we read and interpret CVs, so our previous social interactions, from education to our workplace setting, all have a bearing on how CVs are assessed. A panel involves deliberation, another social exchange that is influenced by pre-existing biases.
Various studies have used hypothetical CVs in experiments, and these demonstrate how gender and other biases influence outcomes. This includes a study showing that, amongst male and female psychologists who assessed potential candidates, both men and women preferred to hire a man, even when the woman had the same qualifications. In light of this previous research, it is most striking that in Williams and Ceci’s study, the paid control group of psychologists was used to show that gender bias is not present. The fact that the control group was paid for their time and opinion in a study by two psychologists, where no other participants were paid, is most unusual. There is nothing wrong with paying participants (their time is valuable), but if only a small sub-set are paid and others are not, then we need to question why.
Regardless, other research, including another study published by PNAS, shows that academics would prefer to hire men over women for prestigious managerial positions. Moreover, even in the life sciences, which have a relatively high proportion of women, male scientists in elite research institutions prefer to recruit men over women.
This issue aside, the fact remains that Williams and Ceci do not have data to support how scientists rank potential candidates. They have produced data about how scientists respond to a study about gender bias in academia, when they can easily guess that gender bias is being observed. Academics already understand that gender discrimination is morally wrong and unlawful. After all, North American universities have anti-discrimination policies in place, and they offer some level of training and information about their institutional stance on sex discrimination.
Research shows that academics do not fully understand how unconscious gender bias informs their decision-making and behaviour. Unconscious bias plays out in everyday interactions within STEM environments, from comments that undermine women’s professionalism, to “jokes,” to broader institutional practices that exclude women. Unconscious bias has a damaging effect on women, who are continually undermined at every stage of their education and careers.
The same goes for other professionals and the public at large: people are not aware of their biases unless they are trained to understand and address these preferences, which are deeply ingrained into us through early childhood socialisation. The myth that girls and women can’t succeed in STEM is demonstrated through the Draw a Scientist Test, a process that measures how young children are conditioned to accept the image of a scientist as being a White older man in a lab coat. This latent stereotype is further reinforced in the way girls are discouraged from learning STEM, and it impacts on their subsequent success when following these career paths.
A wealth of literature has shown that women are disadvantaged in STEM. Women academics who are mothers are less likely to be hired than fathers, and these fathers are offered an average of $11,000 more than mothers as a starting wage. Women are disadvantaged at every step of the hiring process, including for the types of activities that boost CVs for tenure-track positions, such as Fellowships.
Other research shows that, even when presented with empirical data about gender inequality in STEM, men are overwhelmingly more likely than others to reject the existence of gender bias. White men in particular either reject outright that inequality exists, or they otherwise think that inequality impacts on men, and that women are conversely more favoured. Sound familiar? This body of research demonstrates just how deeply held the so-called “myth” of gender inequality runs. Williams and Ceci have managed to reaffirm the popular, but ill-informed, idea that gender inequality is over, even when their own data cannot prove such a feat, particularly since it runs counter to decades of research.
Nationally representative data show that over a 30-year period, it is White women who have benefited from affirmative action, and that women of colour have made minimal progress under these diversity policies. Even so, White women remain under-represented in STEM relative to White men, while women of colour scientists are even more marginalised and less likely to be hired for jobs. As for the minuscule proportion of women of colour who manage to secure employment, they are subjected to routine sexism and racism within scientific settings. Transgender women, especially women of colour, are subjected to additional prejudices, including gender policing and stigma that further alienates and undermines their professionalism.
Power, Race & Gender Bias
This is not the first time Williams and Ceci have published flawed results on gender in STEM, and it’s not the first time they’ve completely ignored the real-world context in which women in STEM are battling to be hired, promoted and rewarded. This includes fighting not just sexism, but racism, homophobia, transphobia, ableism and so on. I critiqued their last study, which similarly tried to argue that sexism in academia is dead. Their data and methods prove no such thing. Rather, their previous studies set a precedent that continues in their current research: two White, tenured professors drawing insubstantial conclusions about gender inequalities that are simply not supported by their findings.
Power and race matter: these White professors who have “made it” in academia do not see a major problem with the gender imbalance in STEM. Instead, they explain this inequity away by arguing that women self-select not to enter academia, and that those who do subsequently accept the “motherhood penalty.” That is, that women choose to sacrifice their careers for child-rearing. Williams and Ceci do not recognise that institutional factors and unfair policies deny women a real “choice” about their family and professional responsibilities.
Elsewhere, I have shown that Williams and Ceci’s previous research is informed by a false narrative of individual choice. The same can be said for the present study. The researchers’ own biases lead them to believe that women and men belong to two discrete groups (making genderqueer and transgender scientists invisible). Similarly, they do not see that issues of intersectionality (the multiple experiences of inequalities faced by minority women) have a profound impact on gender inequity in STEM.
Ignoring race, sexuality and other socio-economic factors is a power dynamic: White, senior academics can pretend that race doesn’t matter, because racism does not adversely affect their individual progress. They can choose to believe that sexism is over because they have secured their tenure, even though they did so in a different climate to present-day pressures, where tenure is even tougher to find and early career researchers face precarious employment.
We must be ever-vigilant of how our biases contribute to inequality in STEM, and we must not accept abuse of power pandering to populist notions that we live and work in a so-called post-feminist, post-racial world. The evidence does not support such White patriarchal fantasies. Inequality has a concrete impact on the working lives of many women scientists, and this is felt most acutely by women of minority backgrounds. Rather than pretending the problem does not exist, let’s work together to eradicate gender inequality.
Edited to Add: Some excellent posts analysing Williams & Ceci below. I’ll add more as I find them.
- Karen James, #StillaProblem II: academic science is (still) sexist. Storify of discussions on Twitter about problems with the study.
- Helen De Cruz, Assessing inductive risk in the Williams and Ceci studies. “A problem with Ceci and Williams’ research (not just this paper, but their project as a whole) is their too-narrow focus on what counts as personal choices by women.”
- Marie Claire Shanahan, Be careful saying “The Myth about Women in Science” is solved. With discussion of other research showing bias exists.
- David Charbonneau, New Study Demonstrates Shocking Truth About Faculty Hiring. Satiric article on the problems with the study, with useful introduction about how hiring committees actually work.
- Claire Griffin, Women preferentially hired in STEM – but does that solve the problem? “Many of these hindrances are not based on “supply-side” decisions, as the paper calls the problem. Rather, they are a result of structural obstacles and biases within academia and society at large.”
- skullsinthestars, A one-act play about a study in hiring practices in STEM: “if you question people on their hypothetical preferences for hiring, it seems obvious to me that you’ll get very different answers than what you’d get in an actual hiring process.”
- Dr Kate Clancy has started a collection of articles on this study as an example of Gaslighting, leading to the #GaslightingDuo on Twitter
- Matthew R. Francis, A study in how not to talk about sexism in science. A follow up to his Slate post which covers the mainstream media’s uncritical reporting on this study.
Top image: photos adapted by Zuleyka Zevallos from Creative Commons (2.0) sources.
Comments on “The Myth About Women in Science? Bias in the Study of Gender Inequality in STEM”
Zuleyka, thank you for your engaging and well researched perspective. On Twitter, you mentioned that you were interested in my take on the study’s methods. So here are my thoughts.
I’ll respond to your methodological critiques point-by-point in the same order you raised them: (a) self-selection bias is a concern, (b) raters likely suspected the study’s purpose, and (c) the study did not simulate the real world. Have I missed anything? If so, let me know. Then I’ll also discuss the rigor of the peer review process.
As a forewarning to readers, the first half of this comment may come across as a boring methods discussion. However, the second half talks a little bit about the relevant players in this story and how the story has unfolded over time. Hence, the second half of this comment may interest a broader readership than the first half. But nevertheless, let’s dig into the methods.
(a) WAS SELF-SELECTION A CONCERN?
You note how emails were sent out to 2,090 professors in the first three of five experiments, of which 711 provided data, yielding a response rate of 34%. You also note a control experiment involving psychology professors that aimed to assess self-selection bias.
You critique this control experiment because, “including psychology as a control is not a true reflection of gender bias in broader STEM fields.” Would that experiment have been better if it incorporated other STEM fields? Sure.
But there’s other data that also speak to this issue. Analyses reported in the Supporting Information found that respondents and nonrespondents were similar “in terms of their gender, rank, and discipline.” And that finding held true across all four sampled STEM fields, not just psychology.
The authors note this type of analysis “has often been the only validation check researchers have utilized in experimental email surveys.” And such analyses often aren’t even done at all. Hence, the control experiment with psychology was their attempt to improve on prior methodological approaches and was only one part of their strategy for assessing self-selection bias.
(b) DID RATERS GUESS THE STUDY’S PURPOSE?
You noted that, for faculty raters, “it is very easy to see from their study design that the researchers were examining gender bias in hiring.” I agree this might be a potential concern.
But they did have data addressing that issue. As noted in the Supporting Information, “when a subset of 30 respondents was asked to guess the hypothesis of the study, none suspected it was related to applicant gender.” Many of those surveyed did think the study was about hiring biases for “analytic powerhouses” or “socially-skilled colleagues.” But not about gender biases, specifically. In fact, these descriptors were added to mask the true purpose of the study. And importantly, the gendered descriptors were counter-balanced.
The fifth experiment also addresses this concern by presenting raters with only one applicant. This methodological feature meant that raters couldn’t compare different applicants and then infer that the study was about gender bias. A female preference was still found even in this setup that more closely matched the earlier 2012 PNAS study.
(c) HOW WELL DID THE STUDY SIMULATE THE REAL WORLD?
You note scientists hire based on CVs, not short narratives. Do the results extend to evaluation of CVs?
There’s some evidence they do, from Experiment 4, in which 35 engineering professors favored women by 3-to-1.
Could the evidence for CV evaluation be strengthened? Absolutely. With the right resources (time; money), any empirical evidence can be strengthened. That experiment with CVs could have sampled more faculty or other fields of study. But let’s also consider that this study had 5 experiments involving 873 participants, which took three years for data collection.
Now let’s contrast that with the resources invested in the widely reported 2012 PNAS study. That study had 1 experiment involving 127 participants, which took two months for data collection. In other words, this current PNAS study invested more resources than the earlier one by almost 7:1 for number of participants and 18:1 for time collecting data. The current PNAS study also replicated its findings across five experiments, whereas the earlier study had no replication experiment.
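Those ratios are simple arithmetic; here is a quick sketch using the participant counts and collection periods cited in this comment:

```python
# Participant counts and data-collection durations as cited above.
current_participants, earlier_participants = 873, 127
current_months, earlier_months = 36, 2  # three years vs. two months

print(f"participants: {current_participants / earlier_participants:.1f}:1")  # about 6.9:1
print(f"collection time: {current_months // earlier_months}:1")              # 18:1
```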
My point is this: the available data show that the results for narrative summaries extend to CVs. Evidence for the CV results could be strengthened, but that involves substantial time and effort. Perhaps the results don’t extend to evaluation of CVs in, say, biology. But we have no particular reason to suspect that.
You raise a valuable point, though, that we should be cautious about generalizing from studies of hypothetical scenarios to real-world outcomes. So what do the real-world data show?
Scientists prefer *actual* female tenure-track applicants too. As I’ve noted elsewhere, “the proportion of women among tenure-track applicants increased substantially as jobseekers advanced through the process from applying to receiving job offers.”
This real-world preference for female applicants may come as a surprise to some. You wouldn’t learn about these real-world data by reading the introduction or discussion sections of the 2012 PNAS study, for instance.
That paper’s introduction section does acknowledge a scholarly debate about gender bias. But it doesn’t discuss the data that surround the debate. The discussion section makes one very brief reference to correlational data, but is silent beyond that.
Feeling somewhat unsatisfied with the lack of discussion, I was eager to hear what those authors had to say about those real-world data in more depth. So I talked with that study’s lead author, Corinne Moss-Racusin, in person after her talk at a social psychology conference in 2013.
She acknowledged knowing about those real-world data, but quickly dismissed them as correlational. She had a fair point. Correlational data can be ambiguous. These ambiguous interpretations are discussed at length in the Supporting Information for the most recent PNAS paper.
Unfortunately, however, I’ve found that dismissing evidence simply because it’s “correlational” can stunt productive discussion. In one instance, an academic journal declined to even send a manuscript of mine out for peer review “due to the strictly correlational nature of the data.” No specific concerns were mentioned, other than the study being merely “correlational.”
Moss-Racusin’s most recent paper on gender bias pretends that a scholarly debate doesn’t even exist. Her most recent paper cites an earlier paper by Ceci and Williams, but only to say that “among other factors (Ceci & Williams, 2011), gender bias may play a role in constraining women’s STEM opportunities.”
Failing to acknowledge this debate prevents newcomers to this conversation from learning about the real-world, “correlational” data. All data points should be discussed, including both the earlier and new PNAS studies on gender bias. The real-world data, no doubt, have ambiguity attached to them. But they deserve discussion nevertheless.
WAS THE PEER REVIEW PROCESS RIGOROUS?
Peer review is a cornerstone of producing valid science. But was the peer review process rigorous in this case? I have some knowledge on that.
I’ve talked at some length with two of the seven anonymous peer reviewers for this study. Both of them are extremely well respected scholars in my field (psychology), but had very different takes on the study and its methods.
One reviewer embraced the study, while the other said to reject it. This is common in peer review. The reviewer recommending rejection echoed your concern that raters might guess the purpose of the study if they saw two men and one woman as applicants.
You know what Williams and Ceci did to address that concern? They did another study.
Enter new data, stage left: Experiment 5.
That experiment more closely resembled the earlier 2012 PNAS paper and still found similar results by presenting only one applicant to each rater. These new data helped assuage the critical reviewer’s concerns.
That reviewer still had a few other concerns. For instance, the reviewer noted the importance of “true” audit studies, like Shelley Correll’s excellent work on motherhood discrimination. However, a “true” audit study might be impossible for the tenure-track hiring context because of the small size of academia.
The PNAS study was notable for having seven reviewers because the norm is two. The earlier 2012 PNAS study had two reviewers. I’ve reviewed for PNAS myself (not on a gender bias study); the journal published that study with only me and one other scholar as the peer reviewers. The journal’s website even notes that having two reviewers is common at PNAS.
So having seven reviewers is extremely uncommon. My guess is that the journal’s editorial board knew that the results would be controversial and therefore took heroic efforts to protect the reputation of the journal. PNAS has come under fire from multiple scientists who repeatedly criticize the journal for letting studies simply “slip by” and get published because of an old boys’ network.
The editorial board probably knew that would be a concern for this current study, regardless of the study’s actual methodological strengths. This suspicion is further supported by some other facts about the study’s review process.
External statisticians evaluated the data analyses, for instance. This is not common. Quoting from the Supporting Information, “an independent statistician requested these raw data through a third party associated with the peer review process in order to replicate the results. His analyses did in fact replicate these findings using R rather than the SAS we used.”
Now, I embrace methodological scrutiny in the peer review process. Frankly, I’m disappointed when I get peer reviews back and all I get is “methods were great.” I want people to critique my work! Critique helps improve it. But the scrutiny given to this study seems extreme, especially considering all the authors did to address the concerns, such as collecting data for a fifth experiment.
I plan on independently analyzing the data myself, but I trust the integrity of the analyses based on the information that I’ve read so far.
SO WHAT’S MY OVERALL ASSESSMENT?
Bloggers have brought up valid methodological concerns about the new PNAS paper. I am impressed with the time and effort put into producing detailed posts such as yours. However, my overall assessment is that these methodological concerns are not persuasive in the grand scheme. But other scholars may disagree.
So that’s my take on the methods. I welcome your thoughts in response. I doubt this current study will end debate about sex bias in science. Nor should it. We still have a lot to learn about what contexts might undermine women.
But the current study’s diverse methods and robust results indicate that hiring STEM faculty is likely not one of those contexts.
Disclaimer: Ceci was the editor of a study I recently published in Frontiers in Psychology. I have been in email conversation with Williams and Ceci, but did not send them a draft of this comment before posting. I was not asked by them to write this comment.
In the first two points (a and b), you have restated the study’s methodology, which I’ve already critiqued. The study makes broad claims about hiring practices in STEM by sampling four disciplines. In the first instance, it uses only one of these disciplines, psychology, as a control, with only 82 participants, who were paid, unlike other participants. If the study were about hiring bias in psychology, I would expect the control to be psychologists. It is most odd that the authors have chosen only a sub-set of people to test their control. Second, as I already noted, Williams and Ceci sent CVs, as part of their bias control, to only another sub-set of their sample, engineers, so this sub-set was evaluating material unlike that in the rest of the study, on which the conclusions are drawn.
The researchers’ odd choices weaken their position that their dataset has been tested for self-selection bias. I’ve already covered this, and you have not proved anything to the contrary. All we have to go on is what is published, and the published results are riddled with methodological flaws. At the end of the day, that’s what counts: what the researchers say about their data. As presented, the data and methods do not match the conclusions, which make wild claims about the lack of bias in hiring preferences in STEM.
A critical review of the methodology in light of the existing evidence of empirical studies demonstrates that the study is highly flawed, as the authors have overstated the significance of their results. All they have collected is data about how a sub-set of self-selecting faculty (those who responded to the survey) read short narratives about candidates. Again: such narratives are nothing like the actual material that faculty teams assess. The researchers have collected data about narratives and nothing more.
As for being able to guess the design of the study: as noted, it is not difficult to see the gender component, as other faculty have discussed on social media. For the benefit of other readers, the sample narrative provided by Williams and Ceci in their supplementary paper is reproduced at the end of my comment.

This narrative is about a single woman candidate, who supposedly mentions to the Chair that she has no children and no problems starting work. In the case of women with kids, the narratives mentioned their family situation. The study told participants to consider the information provided by the fictional chair, including family circumstances. Why would a fictional faculty Chair write a narrative including the applicant’s family situation unless this fictional chair wanted this information to be assessed? As I’ve noted, university policy requires some level of mandatory training on sexual discrimination. Anyone reading this narrative would pick up the language and social cues that gender discrimination was being screened for.

The 30 participants in the sub-set used as the control did not guess that the hypothesis was related to gender. To reiterate: only 30 out of over 2,000 were asked this question. Nevertheless, being asked to review three candidates, with only one woman, where family situation is being discussed, represents an obvious social cue that gender is an issue. Social desirability is a central part of social science methods training: participants try to answer survey questions in a way that they guess might be the “right” way, which is why we control for social desirability. This hasn’t been done in Williams and Ceci’s dataset.
Regardless, none of this changes the fact that the data we’ve been presented with in the published PNAS study concern narratives that have no bearing on the real-life materials assessed as part of the hiring process. Contrary to the researchers’ claims, they do not have data proving that individuals reading narratives is a useful simulation of how selection teams deliberate over hiring decisions. If they had these data, we would not likely be having this discussion about methods not matching conclusions. If they wanted to claim that gender bias favours women in hiring, they should have designed an experiment that better simulated the context in which hiring decisions are made. And those decisions are not made on the basis of one short paragraph about family.
I’ve linked to other studies showing that experiments with CVs do show bias when potential faculty candidates are mothers. These women are disadvantaged and less likely to be offered a position, and when they are, they are offered a dramatically lower starting rate than fathers. Real-world data, which I’ve also linked to, bear this out, showing that mothers take much longer to achieve tenure. Williams and Ceci argued in their last paper that women are not disadvantaged in academia, and they made reference to this (then-forthcoming) paper now published in PNAS. All of their research is tied together through the same ideological position that academia is “gender neutral,” a quote straight from Williams and Ceci (and their colleagues) in reference to their broader research project (of which this PNAS paper is one component).
For some reason, you have chosen to critique Moss-Racusin based on what you perceive to be her lack of resources and on some one-off conversation after a conference many years ago. You say Ceci and Williams spent three years collecting data for their study: they clearly needed to spend longer, and to channel their time and seemingly expansive resources into actually collecting the type of data that would allow them to make the claims they make. In any case, the length of time spent on data collection is neither here nor there. Neither is your perception of how much time and resources Moss-Racusin invested in her study. A well-funded study with flawed methods, which sets out to prove a political or ideological viewpoint that academia is “gender neutral”, is still a poor piece of research. Your critique of Moss-Racusin seems personal, and frankly does not help your position. Moss-Racusin conducted a good study that did not claim to have solved the problem of sexism in academia. Williams and Ceci have conducted a weak study given their grand conclusions. Data must match methods; that is the crux of my critique. Resources don’t change this fact.
As for your final point about the peer review process for this paper – your comments are perplexing. You assert that you know that seven reviewers read Williams and Ceci’s paper. Whether it’s two or seven reviewers, the peer review process for any given peer-reviewed journal is set by the editors of that journal as they see fit. It depends on the field or sub-discipline; most sociology journals will seek three reviewers, but a paper will go out to more if there are discrepancies in the reviewers’ recommendations. So if anything, the fact that this study was sent to seven reviewers may suggest that some reviewers’ comments gave rise to editorial misgivings, and so further reviewers were sought. Perhaps the PNAS editors decided on the otherwise arbitrary number of seven reviewers knowing that the study would be controversial. Regardless, PNAS decided to publish the study in the end. So, in fact, whatever the reviewers said is a moot point after the editorial team’s decision to publish the paper.
Ultimately, the decision to publish a flawed study rests on the shoulders of the editors, PNAS in this case, while critiques about the data and conclusions still reflect on the researchers, in this case Williams and Ceci. The number of reviewers has no bearing on the quality of what is actually published. Unless, of course, the editors are also the authors of a study, as was the case in Williams and Ceci’s last study on gender, in which case the peer review process has failed completely, given that study was riddled with methodological issues.
That’s the recurring theme in Williams and Ceci’s work: their scholarship is open to critique because their methods, conclusions, omissions and presumptions are coloured by privilege on several levels: race, class, sexuality and other socio-economic measures are repeatedly sidestepped because the researchers, being White tenured professors, presume that the academy has no problem with sexism. In their world view as White, privileged faculty, they don’t see discrimination, claiming in The New York Times about their previous study, “We are not your father’s academy anymore.” “We” being White professors with an aversion to addressing gender inequality and power dynamics. Well, plenty of studies have shown that girls and women are disadvantaged at every stage of their careers, including in hiring.
For some puzzling reason, you have elected to divulge your contact with two of the seven anonymous reviewers of Williams and Ceci’s study. The peer review process is anonymous for a reason: to protect authors and researchers from bias, influence and coercion, and to ensure that the process is as objective as possible. Many academic circles can be insular, so perhaps it’s easy to guess who might have been involved in reviewing certain papers. Even though you do not mention names, I find the fact that you discuss this in defence of Williams and Ceci both disturbing and off topic. The data matters at the end of the day, as I keep noting, not whatever investigations you decided to carry out on your own.
You are crossing over into unsavoury ethical ground here. If what you say is correct, it may suggest that PNAS does not adequately protect the integrity of the peer review process, by allowing researchers to contact and berate reviewers into explaining or justifying their publishing recommendations. The academic peer review system relies on trust and professional conduct. Part of that is not invoking what reviewers may have told you in confidence to support a study that you seem to be personally or professionally invested in. You are still a PhD student, David; I hope you will reflect on how this behaviour may impact your professional conduct.
You also disclose that Ceci reviewed a previous article of yours, that you have been in contact with Williams and Ceci over email, and that they have not read over the comment you’ve submitted here. You also saw fit to tell me on Twitter that Williams and Ceci have read my blog post. How are all these pieces of information related? You seem to be closely aligning yourself with the authors, and while you claim not to be working on their behalf, you sure are spending a lot of time defending them by invoking your connection to them. It is good that you enjoy their work. Critiquing the resources of other researchers and sharing private conversations are not the best way to convey this defence.
The peer review system is not the end of scholarly engagement nor critique. It is simply a gatekeeping service. We trust peer-reviewed journals to provide robust, expert opinions from reviewers and editors before publishing, but once a study is published, it is critiqued on the merits of the material that is published.
A fatal flaw in Williams and Ceci’s PNAS study methodology is that, no matter how the authors, or anyone else, spins it, Williams and Ceci simply do not have the data to back up their claims. Williams and Ceci have data about how faculty respond to narratives that have zero connection to how real life hiring is conducted in academia. They have overstated their flawed data to extrapolate their claim that women are not disadvantaged in hiring, and in fact, they make the claim that women are advantaged over men 2:1. Notwithstanding the obvious problem with this premise (that not all academics fall into a neat divide of “men” and “women” and race as well as other social markers compound inequality), the fact remains: extraordinary claims require extraordinary evidence. You are convinced by Williams and Ceci’s evidence, even though it contradicts the empirical literature and lived experiences of women in STEM. We can agree to disagree, but perhaps let’s stick to the evidence. This study does not offer conclusive evidence to negate gender bias. You can choose to believe otherwise. That is the beautiful thing about science, we can weigh up evidence using our training, expertise and experience, as well as the extensive body of literature. In this case, it is stacked against Williams and Ceci’s outlandish supposition that hiring bias is in women’s favour and that academia is gender neutral.
One sample narrative provided by Williams & Ceci (p.7):
Imagine you are on your department’s personnel/search committee. Your department plans to hire one person at the entry assistant-professor level. Your committee has struggled to narrow the applicant pool to three short-listed candidates (below), each of whom works in a hot area with an eminent advisor. The search committee evaluated each candidate’s research record, and the entire faculty rated each candidate’s job talk and interview on a 1-to-10 scale; average ratings are reported below. Now you must rank the candidates in order of hiring preference. Please read the search committee chair’s notes below and rate each candidate. The notes include comments made by some candidates regarding partner-hire and family issues, including the need for guaranteed slots at university daycare. If the candidate did not mention family issues, the chair did not discuss them.
Dr. X: X struck the search committee as a real powerhouse. Based on her vita, letters of recommendation, and their own reading of her work, the committee rated X’s research record as “extremely strong.” X’s recommenders all especially noted her high productivity, impressive analytical ability, independence, ambition, and competitive skills, with comments like “X produces high-quality research and always stands up under pressure, often working on multiple projects at a time.” They described her tendency to “tirelessly and single-mindedly work long hours on research, as though she is on a mission to build an impressive portfolio of work.” She also won a dissertation award in her final year of graduate school. X’s faculty job talk/interview score was 9.5 out of 10. At dinner with the committee, she impressed everyone as being a confident and professional individual with a great deal to offer the department. During our private meeting, X was enthusiastic about our department, and there did not appear to be any obstacles if we decided to offer her the job. She said she is single with no partner/family issues. X said our department has all the resources needed for her research.
Zuleyka, yes I agree the data and methods are the most important to consider, so let’s focus on those. I’m going to organize my thoughts under the same categories as before. We will probably have to agree to disagree on some points and that’s fine.
But, as a sidenote, I completely agree that some other studies show evidence of gender bias against women. Including the Moss-Racusin study. That study did show bias against women in hiring lab managers and that’s an important finding. And we need more experimental evidence of the contexts that do undermine women in a similar way. Agreed.
Let’s go back to the PNAS study now.
You make a fair critique of the control experiment that sampled psychologists to assess self-selection bias.
Can we agree, though, that there were other data for assessing self-selection bias? Specifically, the analyses comparing response rates by gender, rank, and discipline.
(b) Guessing study’s purpose
You note a Facebook comment made by Lisa Boulanger’s husband suggesting that participants did guess the study’s purpose, contrary to the data with those 30 participants.
Certainly her husband did think the study was about hiring biases. But can we agree her husband said nothing specific about gender?
And like I mentioned earlier, “Many of those surveyed did think the study was about hiring biases for ‘analytic powerhouses’ or ‘socially-skilled colleagues.’ But not about gender biases, specifically.”
Can we agree that the NRC 2010 report showed that the percent women increased from the applicant pool to job offers?
David, I’ve already covered these points more than once; first on social media, then in my article, and then in my reply to your first verbose response on my blog. Science does not require that we agree on anything. You can hold your views without needing me to validate things that I’ve already proven to be scientifically flawed.
While the overall number of women in STEM has improved compared to the 1970s, these gains have not gone nearly far enough. As I’ve mentioned repeatedly, here and elsewhere, most of the increase has been for some White women, while there are not enough women of colour, transgender women, LGBTQIA women, women with disabilities, and women of other intersecting minority backgrounds. Over this past week, Harvard announced that it had achieved gender parity for the first time in its history. MIT and other elite universities have publicly stated they are working towards gender equity. The Athena SWAN Charter in the UK and the Australian Academy of Science’s Science in Australia Gender Equity (SAGE) Forum are set up to evaluate and address gender inequality, including in hiring, and many universities are not faring as well as they should, which is why these programs support their improvement. In a forthcoming piece, my colleagues and I will show that the number of women academic hires has dropped at other universities as soon as they stop actively targeting gender diversity.
So: in light of real-world evidence, the overall patterns (with women making up less than 17% of senior faculty positions in Australia and the UK) may look positive to people with White privilege, male privilege, as well as anyone unaware of heterosexism, ableism and other dynamics of power, but the gender patterns are abysmal to everyone else who is invested in gender equality and inclusion in STEM.
An increase from almost no women in STEM to a little more (mostly White, heterosexual, able-bodied) women is not much progress given that we live in 2015. Lazy ideologies claiming that these small numbers represent a 2:1 advantage in hiring are misguided.
My criticisms of the Williams and Ceci study stand, no matter how much you’d like to show up on my blog to try to change my mind. The goal of science is not to harass women into agreeing with you; you present evidence, that evidence is evaluated using science, and if the evidence is valid and reliable, then minds get changed.
Williams and Ceci have produced a weak study with poor methods, publishing biased conclusions that fit their pre-existing world view that gender equality is not a big issue in academia. You can keep defending them for whatever reasons you see fit; but you don’t get to do it over and over on my blog. You have @ mentioned me at least 40 times on Twitter about this, and you’ve submitted two long comments (one of them twice) on my blog. That’s enough. Just so you know: berating women in science, on social media and in general in life, is not a good way to be a colleague. I will not allow you to post further comments on this post, as you’re simply restating the same points over and over and I don’t see your behaviour as constructive.
The real question now is why it’s so important to you, a White man, that a woman (of colour, no less) bends to your perception of gender bias. Gender, race and other social dynamics are present in all social interactions and they influence our epistemologies (which is why we train to control for and address them). Please refrain from pushing the same arguments, especially given that I specifically asked last time that you stop, and that I’d already explained elsewhere that my blog is moderated because White men like to harass me constantly given that I write about racism, sexism and other inequalities. I publish comments when I have the time to read and respond to them; whatever your timelines are, they don’t factor into my ability to moderate this blog, which I do for free, as a service to sociology, social justice studies, and STEM public education.
Your valiant defence of the #GaslightingDuo can continue wherever you like, just not on my blog.
Thank you for the thoughtful response. It will take me some time to process. But I first wanted to clarify one thing.
I did not contact PNAS about who the peer reviewers were. I was discussing the study with my colleagues, and two of them volunteered to me that they were peer reviewers. I did not ask them, but rather they volunteered that information. I even sent my comment to the reviewer whose views I discuss, and that reviewer expressed no concern with my comment. And like you mention, I have kept their identities private. The fact that there were seven reviewers is publicly noted in the acknowledgments section of the PNAS paper.
Hi David, I hope you understand my point. It is troubling that you evoke the peer review process as a way to defend this study. You say you sent your comment to one of the anonymous reviewers and they saw no problem. This seems odd to me – you had one of the anonymous PNAS peer reviewers read over your comment to my blog? Why? This question is rhetorical because the issue is broader. We don’t discuss the peer review process as a way to defend a paper for the reasons stated.
As I said, academic circles can be small. That does not mean you can invoke these personal connections in a public defence of a study. It serves no purpose other than to muddy the lines of objectivity and ethics. What is okay to you and one or two reviewers is not necessarily how the peer review system should work.
Still, as I say, this is all beside the point. You are using personal discussions and personal attacks to publicly defend the research by Williams and Ceci. They don’t need you to defend their work. As I’ve repeatedly stated, here and on Twitter, their data speaks for them. You want to believe their data and conclusions are justified – that’s your prerogative. There’s no need to keep going over the same ground. Your enthusiasm for this study is noted.
My critique stands.
Zuleyka, I agree that the most important issue is the methods and data, and I promise I will reply to your thoughtful comments about that. I just wanted to clarify the information. You said that what I said, “may suggest that PNAS does not adequately protect the integrity of the peer review process by allowing researchers to contact and berate reviewers to explain or justify their publishing recommendations.”
If that were true, that would be politically damaging to me. But it’s not true. So I wanted to clarify that.
I promise I will reply to your comments on the methods and data after I’ve had the chance to fully process your arguments.
Gender bias in STEM is a myth. Watch and learn:
Hi Shady and thanks for the link! For others reading, this link is to Hjernevask (Brainwash), a controversial documentary by two Norwegian comedians. While the presenter talks to some experts and the program includes lots of vox pops with the public, this is not science. The documentary covers the so-called “nature versus nurture” debate about why there are fewer girls in STEM professions, despite Norway’s progressive gender policies. My colleagues and I have already addressed the scientific literature on these issues, showing how institutional processes influence girls’ and women’s transition from education to careers in science. I’ve already linked to our writing in my post, but here’s the link again: http://blogs.nature.com/soapboxscience/2014/09/04/nature-vs-nurture-girls-and-stem
The comedians are just hosts of the documentary; the research is valid, and I read your blog post and none of what you state negates any of that, but I have several problems with your rhetoric there. It’s quite a reach and you are forming false connections in many cases. The stereotype that scientists are male that you mention arose from the fact that scientists have always been largely male in the past. This didn’t pop out of thin air; this stereotype was based on reality before the mainstream push to coax women into STEM fields. You mention that exposure to women scientists in classrooms increases the chance that boys and girls become scientists, but you’re making an egregious error here: this is largely due to these women entering classrooms and telling children this is what they should aspire to. How many other professions are doing this? If every profession had a representative doing this, would the results be the same? I highly doubt it, and this is representative of shoddy methodology. And to add to this, in societies with higher socio-economic status for the average individual (that also have a larger prevalence of programs to get women into the sciences), the differences in choices between men and women increase rather than decrease; fewer women go into STEM in these situations in comparison to countries like India where these stereotypes are rampant, so you’re plain wrong there. You weigh a lot of your argument on stereotypes and avoid innate preferences. Have you actually asked other women what they like? Both my sisters moved from technical fields (which they got into thinking that’s what women should aspire to) to people-oriented fields just because they liked dealing with people more. Are you oblivious to the fact that women are making these choices en masse, and that the choices they are making reflect the data we have on this? Are you going to avoid all the research that proves biology has an effect on behavior as well?
The study on identical twins raised in separate environments, for example. Are you going to avoid the fact that even though women scientists are visiting classrooms trying to coax girls into joining STEM, the overwhelming majority of girls in those classrooms still don’t want to go into STEM? Are you going to avoid the fact that fewer women are going into STEM than men even though it is much easier for them, with the abundance of women-only scholarships and other programs as well? Are you also going to avoid the fact that all the research in that Norwegian documentary on biology’s effect on behavior reflects societal structure to a T (which includes societies that have developed in isolation)? Do you think this is mere coincidence? Are you also going to avoid the fact that there is a massive difference in the literature, music and movie preferences that appeal to one gender over the other? I could go on… You also seem to think equality of outcome is somehow equality. It isn’t, and you would have to intentionally discriminate against a group to achieve it, which is what is happening now; women-only scholarships are an example. It’s disgusting that this is all done in pursuit of an arbitrary ideal of what some group thinks equality is. A major problem with your approach is that you assume behavior is purely a product of the environment; we know for a fact this isn’t true. Also, under-representation doesn’t mean inequality; saying it does is based upon assumptions. Women were overwhelmingly represented at the health services at the University of Waterloo, where I did my masters, from the receptionists to the psychologists present there. Does this indicate sexism against men? Of course not.
I see in the comments people referring to the bias in resumes. Christina Hoff Sommers points out the flaws of that research in one of her Factual Feminist videos or lectures (I don’t remember which one): the research only highlights the disciplines where there is a bias towards men, conveniently leaving out the other ones where there is a bias towards women, which is disingenuous to say the least. Why are you trying to push young girls and women into professions that they may not like? Don’t you see the problem with that?
Your comments are very long and hard to follow. I’ll address some of your key points.
Documentaries are not scientific evidence. I have used scientific research to back up my analysis. You have relied solely on your subjective ideas, which are coloured by your belief that women do not belong in science. That you say your sisters tried STEM and didn’t like it has zero bearing on the institutional patterns and data that I’ve discussed.
Professionals regularly visit schools, from members of the police, to community workers, to scientists and beyond. Science curriculum is filled with examples of male scientists, and there is a lack of women role models because many teachers are not trained to examine their own gender biases in how they teach science. We know from research that when girls interact with professional women scientists, this helps them see that they too can have a career in science. We also know from international research that girls perform the same as boys in maths, science and engineering subjects, and in some cases, they outperform boys. Lack of role models and negative experiences that have discouraged girls from pursuing STEM is an outcome of institutional sexism. Again, see the extensive research, which shows that programs that involve women scientists as part of the solution, working together with parents and teachers, helps to improve outcomes for girls in STEM.
Do you have such an adverse reaction when male professionals go into schools?
Point me to specific biological research that shows that girls are inferior at STEM or are somehow innately disinterested in science. I have linked to an article I co-authored with my colleagues, both biologists, showing that no such research exists.
There are many examples of women who have excelled in science throughout history and across cultures; however, due to sexism, these women’s achievements are slow to be recognised.
We already know that there is greater gender balance in some STEM fields, such as health sciences; however, this parity disappears as soon as we start seeing the data on senior research and clinical positions. In some fields where women outnumber men, such as nursing, men are still overrepresented in senior roles, and men are paid more than women.
Show me where I’m trying to “push young girls and women into professions that they may not like.”
By the way, since you erroneously use India as an example, the number of women studying science in India actually matches the number of men. Plus there is greater gender parity between men and women in other non-Western countries like Iran, across Latin America and in other nations. Nevertheless, women in these nations still face tremendous barriers once they enter the workplace, particularly once they have children, so gender bias is still an issue that needs to be redressed.
I’d welcome you to respond with credible, peer reviewed sources and be sure to explain your thoughts more clearly. Your personal biases against women are not evidence of women being disinterested in science.
Reblogged this on GEOCOGNITION RESEARCH LABORATORY and commented:
What she said!
Ceci and Williams in the study used people who are above the average; this may have had an effect that could downplay the condition of bias
Hi Fernando Junior. Not sure what you mean by “above the average”? The biggest problem with their methods is that they sent out a survey asking faculty to respond to a short hypothetical scenario. This does not simulate the reality of how women are hired in academia. So their claim that women are favoured 2:1 is invalid: they don’t have the data to support this claim.
First, I have to say that I have not read *every word* of the discussion above. The study that purports to debunk the inequity of STEM for females relied on verbal interviews with members of the academic community. (Was there a link provided to the original study?) In verbal interviews, people can easily reply with the politically correct answers.
However, an earlier study published by Cornell University highlights how when it comes to looking at resumes and doing the actual hiring and promotion, there is a different result indeed. Here is the link to the study:
The Impact of Gender on the Review of the Curricula Vitae of Job Applicants and Tenure Candidates: A National Empirical Study
Rhea E. Steinpreis, Katie A. Anders, and Dawn Ritzke
University of Wisconsin-Milwaukee
I believe that you will see how this study sheds light on the above discussion.
Hi Jan B,
I did link to the original study and I’ve actually discussed and linked to the study you’ve included in your comment. The study shows that psychology faculty are biased towards male applicants, which is in line with my overall argument that gender bias is alive and well in academia. Those with vested interests may deny the overwhelming empirical evidence, but all that does is detract from the goal of increasing gender equity. Thankfully there are many dedicated scholars who are fighting such biases with targeted policies, programs and training!
Hello Dr. Zevallos,
Thanks for the interesting article. I guess I have some questions.
You say the narrative is so far removed from the hiring process that it weakens the study. Isn’t it true that resumes and CVs also come with cover letters, teaching philosophies, recommendation letters, and other items very similar to a “narrative statement”? Even in the Moss-Racusin study, about half of the CV page evaluated was “narrative”. Yes, the CV and grades and numbers are important, but they aren’t all-important, and I speak from experience when I say that in higher education hiring, more than the CV is assessed.
You seem knowledgeable on Moss-Racusin as well, so perhaps you can help clear up an issue I have with it. This study is often referenced to establish bias in science hiring, but I’m not convinced because of a fatal flaw in the method of the study. I haven’t seen this brought up, and so I wonder whether I’m missing something or am off in some way (and someone can inform me). Here’s the method in a nutshell: about 120 academic faculty professionals were sent identical resumes, except for 60 the name was John and for the other 60 the name was Jennifer. Turns out that on average, Jennifer got lower scores and a lower salary offer. The conclusion? Bias against women in hiring. But hold on. How do you rule out that the 60 that got Jennifer just so happened to be the types of people to undervalue the resume? Without giving the same people the opportunity to show, without a doubt, that they give the woman lower scores, how do you discount the possibility that what you’re seeing is just one group giving a higher score for whatever reason?
Thank you for your time,
CVs often come with accompanying information such as responses to the selection criteria and recommendation letters. These are not presented in Ceci and Williams’ study. Ceci and Williams constructed poorly conceived and leading narratives that encouraged their participants to think about gender issues in hiring in very specific ways, which supported the researchers’ pre-established hypotheses. A better methodology would be to provide the CVs and have participants discuss and debate the merits of the skills presented in the CVs, using methodologies that are well-established in the scientific literature.
Moss-Racusin’s study did not present a narrative that included obviously gendered conditions that Ceci and Williams included – such as the applicant’s family situation. Moss-Racusin provided an application form that was developed with academic experts providing the type of information usually reviewed by hiring panels. I’ve linked to the original study. Have a read.
You perceive a “fatal flaw” with Moss-Racusin’s study, based on your perception that the people who preferred to hire a male applicant may be atypical. Again, read Moss-Racusin’s study which included:
Why do you think that a widely cited, reputable study needs to prove gender bias in ways beyond its methodology? The study also draws on wide-ranging empirical literature demonstrating that gender bias exists in many ways. I have also discussed and linked to studies and resources that explain how gender bias is pervasive and institutional.
Confirmation bias is a phenomenon where individuals reject scientific evidence that contradicts their personal beliefs, such as the belief that gender bias does not exist. Conversely, confirmation bias also leads individuals to believe evidence that validates their subjective world views.
I’ve shown the methodological flaws in Ceci and Williams’ study using established social science. Ceci and Williams are not experts on gender, and they have been criticised for their scientific flaws accordingly. The study was published in the authors’ own journal, and then widely criticised by the broader scientific community. This is how peer review works: studies may make it through peer review (especially if the authors own the journal), but the peer review process does not end there. Peer review continues as other researchers see the published results and weigh the evidence presented in light of established theories and methods. Ceci and Williams’ study is a poor example of social science used to advance the authors’ personal or political agendas. That’s not good for science, is it?