A new article on CNN by psychology professors Wendy Williams and Stephen Ceci boldly proclaims that gender bias in Science, Technology, Engineering and Mathematics (STEM) is a myth. Their research has been published in the Proceedings of the National Academy of Sciences (PNAS). Unfortunately, their work rests on a flawed methodological premise, and their conclusions do not match their study design. This is not the first time these researchers have whipped up false controversy by proclaiming the end of sexism in science.
Williams and Ceci write on CNN:
Many female graduate students worry that hiring bias is inevitable. A walk through the science departments of any college or university could convince us that the scarcity of female faculty (20% or less) in fields like engineering, computer science, physics, economics and mathematics must reflect sexism in hiring.
But the facts tell a different story…
Our results, coupled with actuarial data on real-world academic hiring showing a female advantage, suggest this is a propitious time for women beginning careers in academic science. The low numbers of women in math-based fields of science do not result from sexist hiring, but rather from women’s lower rates of choosing to enter math-based fields in the first place, due to sex differences in preferred careers and perhaps to lack of female role models and mentors.
While women may encounter sexism before and during graduate training and after becoming professors, the only sexism they face in the hiring process is bias in their favour.
Williams and Ceci’s data show that, amongst their sample, female and male faculty say they would not discriminate against a woman candidate for a tenure-track position at a university. Sounds great, right? The problem is the discrepancy between their study design, which elicits hypothetical responses to hypothetical candidates in a manner that is nothing like real-world hiring conditions, and the researchers’ conclusion, which is that this hypothetical setting dispels the “myth” that women are disadvantaged in academic hiring. The background to this problem of inequality is that it is not a myth at all: a plethora of robust empirical research already shows not only that there are fewer women in STEM fields, but that women are less likely to be hired for STEM jobs, as well as less likely to be promoted, remunerated and professionally recognised in every aspect of academic life.
Flawed Research

Photo: Argonne, via Flickr, CC 2.0
Williams and Ceci sent out an email survey to a randomised sample of over 2,000 faculty members in the USA in two maths-intensive fields where women are under-represented (engineering and economics) and two non-maths-intensive fields where women are relatively better represented (psychology and biology). They had a 34% response rate, meaning their final sample was over 700 faculty. This rate of response is standard in many email surveys, but with this sort of study design, researchers need to critically examine and control for bias. In the social sciences, we know that people will participate in studies where they are: 1) given an incentive (usually payment); or 2) they have a personal stake or interest in the study.
Williams and Ceci say they have addressed the self-selection bias of their sample by conducting two control experiments. In one, they sent surveys to only 90 psychology faculty, who were paid $25 for participation. They had a 91% response rate (82 agreed to participate). Psychology not only has one of the highest proportions of women faculty relative to other fields, but the discipline uses gender as a central concept of study. That means awareness of gender issues is higher than in most other fields. So including psychology as a control is not a true reflection of gender bias in broader STEM fields.
In the other control study, Williams and Ceci surveyed engineering faculty by sending hypothetical applicants’ CVs to 35 academics. This means that only a small sub-set of participants evaluated material resembling what we usually review when considering a candidate pool.
The rest of the sample – over 500 participants – were asked to rate three candidates based on narratives. This is not how we hire scientists.
In effect, the study design does not simulate the conditions in which hiring decisions are made. Instead, participants self-selected into a study knowing they’d be judging hypothetical candidates. While the researchers included a “foil” in their study design (one weaker male candidate) to contrast with two identical candidates who varied only in their gender, it is very easy to see from the study design that the researchers were examining gender bias in hiring.

Photo: Argonne, via Flickr, CC 2.0
Participants read a short vignette about three candidates in which the gender pronouns (he and she) were varied; some candidates were described as single, some as single divorced mothers, some as married mothers, and vice versa for the men. Some of the vignettes contained adjectives usually associated with men (independent, analytic), others with “feminine” characteristics (creative, kind). Participants assessed the candidates based on a narrative written by a hypothetical hiring committee chair, and were then asked to rate their preferred candidate. Under these highly atypical conditions, participants were almost equally likely to say they would hire women and men, and in some cases, sub-groups said they would prefer to hire a woman.
Here’s the thing: we don’t hire scientists based on short narratives.
When we hire scientists, the first thing assessed is the CV. It is the CV that gets an interview; the interviewee sits before a panel; individual panellists make notes; the committee makes a decision together. The researchers claim that their control groups and their “consultants” have proved that these individual evaluations would be no different from the way a panel makes hiring choices. To suggest this is ludicrous, given that they have no data about how hiring panels make decisions. If they did have these data, their study would be a completely different piece of research.
Gender Bias in Hiring

Photo: Argonne, via Flickr, CC 2.0
The process of hiring any professional is the outcome of social interaction. Biases shape social exchanges. Biases also influence how we read and interpret CVs, so our previous social interactions, from education to our workplace setting, all have a bearing on how CVs are assessed. A panel involves deliberation, another social exchange that is influenced by pre-existing biases.
Various studies have used hypothetical CVs in experiments, and these demonstrate how gender and other biases influence outcomes. This includes a study showing that, amongst male and female psychologists who assessed potential candidates, both men and women preferred to hire a man, even when the woman had the same qualifications. In light of this previous research, it is most striking that in Williams and Ceci’s study, a paid control group of psychologists was used to show that gender bias is not present. The fact that the control group was paid for their time and opinion in a study by two psychologists, when no other participants were paid, is most unusual. There is nothing wrong with paying participants (their time is valuable), but if only a small sub-set are paid and others are not, then we need to question why.
Regardless, other research, including another study published in PNAS, shows that academics prefer to hire men over women for prestigious managerial positions. Moreover, even in the life sciences, which have a relatively higher proportion of women, male scientists in elite research institutions prefer to recruit men over women.
This issue aside, the fact remains that Williams and Ceci do not have data on how scientists rank potential candidates. They have produced data about how scientists respond to a study about gender bias in academia, when those scientists can easily guess that gender bias is being observed. Academics already understand that gender discrimination is morally wrong and unlawful. After all, North American universities have anti-discrimination policies in place, and they offer some level of training and information about their institutional stance on sex discrimination.

Photo: The Women’s Museum, CC 2.0 via Flickr
Research shows that academics do not fully understand how unconscious gender bias informs their decision-making and behaviour. Unconscious bias plays out in everyday interactions within STEM environments, from comments that undermine women’s professionalism, to “jokes,” to broader institutional practices that exclude women. Unconscious bias has a damaging effect on women, who are continually undermined at every stage of their education and careers.
The same goes for other professionals and the public at large: people are not aware of their biases unless they are trained to understand and address these preferences, which are deeply ingrained in us through early childhood socialisation. The myth that girls and women can’t succeed in STEM is demonstrated through the Draw-a-Scientist Test, which measures how young children are conditioned to accept the image of a scientist as an older White man in a lab coat. This latent stereotype is further reinforced in the way girls are discouraged from learning STEM, and it impacts on their subsequent success when they follow these career paths.
A wealth of literature has shown that women are disadvantaged in STEM. Women academics who are mothers are less likely to be hired than fathers, and these fathers are offered an average of $11,000 more than mothers as a starting wage. Women are disadvantaged at every step of the hiring process, including in the types of activities that boost CVs for tenure-track positions, such as fellowships.
Other research shows that, even when presented with empirical data about gender inequality in STEM, men are overwhelmingly more likely than others to reject the existence of gender bias. White men in particular either reject outright that inequality exists, or they think that inequality impacts on men and that women are conversely favoured. Sound familiar? This body of research demonstrates just how deep the so-called “myth” of gender inequality runs. Williams and Ceci have managed to reaffirm the popular, but ill-informed, idea that gender inequality is over, even though their own data cannot support such a claim, particularly since it runs counter to decades of research.
Nationally representative data show that, over a 30-year period, it is White women who have benefited from affirmative action, while women of colour have made minimal progress under these diversity policies. Even so, White women remain under-represented in STEM relative to White men, while women of colour scientists are even more marginalised and less likely to be hired. As for the minuscule proportion of women of colour who manage to secure employment, they are subjected to routine sexism and racism within scientific settings. Transgender women, especially women of colour, face additional prejudices, including gender policing and stigma, that further alienate them and undermine their professionalism.
Power, Race & Gender Bias

Photo: Argonne, via Flickr, CC 2.0
This is not the first time Williams and Ceci have published flawed results on gender in STEM, and it’s not the first time they’ve completely ignored the real-world context in which women in STEM are battling to be hired, promoted and rewarded. This includes fighting not just sexism, but racism, homophobia, transphobia, ableism and so on. I critiqued their last study, which similarly tried to argue that sexism in academia is dead. Their data and methods prove no such thing. Rather, their previous studies set a precedent that continues in their current research, in which two White, tenured professors draw conclusions about gender inequalities that are simply not supported by their findings.
Power and race matter: these White professors who have “made it” in academia do not see a major problem with the gender imbalance in STEM. Instead, they explain this inequity away by arguing that women self-select out of academia, and that those who do enter accept the “motherhood penalty”; that is, that women choose to sacrifice their careers for child-rearing. Williams and Ceci do not recognise that institutional factors and unfair policies deny women a real “choice” about their family and professional responsibilities.
Elsewhere, I have shown that Williams and Ceci’s previous research is informed by a false narrative of individual choice. The same can be said for the present study. The researchers’ own biases lead them to believe that women and men belong to two discrete groups (making genderqueer and transgender scientists invisible). Similarly, they do not see that issues of intersectionality (the multiple experiences of inequalities faced by minority women) have a profound impact on gender inequity in STEM.

Women’s experiences of gender are mediated by their culture, family, sexuality and other social factors
Ignoring race, sexuality and other socio-economic factors is a power dynamic: White, senior academics can pretend that race doesn’t matter, because racism does not adversely affect their individual progress. They can choose to believe that sexism is over because they have secured tenure, even though they did so in a climate different from present-day pressures, where tenure is even tougher to secure and early career researchers face precarious employment.
We must be ever-vigilant of how our biases contribute to inequality in STEM, and we must not accept abuse of power pandering to populist notions that we live and work in a so-called post-feminist, post-racial world. The evidence does not support such White patriarchal fantasies. Inequality has a concrete impact on the working lives of many women scientists, and this is felt most acutely by women of minority backgrounds. Rather than pretending the problem does not exist, let’s work together to eradicate gender inequality.
Further Reading
Edited to Add: Some excellent posts analysing Williams & Ceci below. I’ll add more as I find them.
- Karen James, #StillaProblem II: academic science is (still) sexist. Storify of discussions on Twitter about problems with the study.
- Helen De Cruz, Assessing inductive risk in the Williams and Ceci studies. “A problem with Ceci and Williams’ research (not just this paper, but their project as a whole) is their too-narrow focus on what counts as personal choices by women.”
- Marie Claire Shanahan, Be careful saying “The Myth about Women in Science” is solved. With discussion of other research showing that bias exists.
- David Charbonneau, New Study Demonstrates Shocking Truth About Faculty Hiring. Satiric article on the problems with the study, with useful introduction about how hiring committees actually work.
- Claire Griffin, Women preferentially hired in STEM – but does that solve the problem? “Many of these hindrances are not based on “supply-side” decisions, as the paper calls the problem. Rather, they are a result of structural obstacles and biases within academia and society at large.”
- skullsinthestars, A one-act play about a study in hiring practices in STEM: “if you question people on their hypothetical preferences for hiring, it seems obvious to me that you’ll get very different answers than what you’d get in an actual hiring process.”
- Dr Kate Clancy has started a collection of articles on this study as an example of Gaslighting, leading to the #GaslightingDuo on Twitter
- Matthew R. Francis, A study in how not to talk about sexism in science. A follow up to his Slate post which covers the mainstream media’s uncritical reporting on this study.
Photo Credits
Top image: photos adapted by Zuleyka Zevallos from these Creative Commons (2.0) sources:
Zuleyka, thank you for your engaging and well researched perspective. On Twitter, you mentioned that you were interested in my take on the study’s methods. So here are my thoughts.
I’ll respond to your methodological critiques point-by-point in the same order as you: (a) self-selection bias is a concern, (b) raters likely suspected study’s purpose, and (c) study did not simulate the real world. Have I missed anything? If so, let me know. Then I’ll also discuss the rigor of the peer review process.
As a forewarning to readers, the first half of this comment may come across as a boring methods discussion. However, the second half talks a little bit about the relevant players in this story and how the story has unfolded over time. Hence, the second half of this comment may interest a broader readership than the first half. But nevertheless, let’s dig into the methods.
(a) WAS SELF-SELECTION A CONCERN?
You note how emails were sent out to 2,090 professors in the first three of five experiments, of which 711 provided data, yielding a response rate of 34%. You also note a control experiment involving psychology professors that aimed to assess self-selection bias.
You critique this control experiment because, “including psychology as a control is not a true reflection of gender bias in broader STEM fields.” Would that experiment have been better if it incorporated other STEM fields? Sure.
But there’s other data that also speak to this issue. Analyses reported in the Supporting Information found that respondents and nonrespondents were similar “in terms of their gender, rank, and discipline.” And that finding held true across all four sampled STEM fields, not just psychology.
The authors note this type of analysis “has often been the only validation check researchers have utilized in experimental email surveys.” And in many studies such analyses aren’t even done. Hence, the control experiment with psychology was their attempt to improve on prior methodological approaches and was only one part of their strategy for assessing self-selection bias.
(b) DID RATERS GUESS THE STUDY’S PURPOSE?
You noted that, for faculty raters, “it is very easy to see from their study design that the researchers were examining gender bias in hiring.” I agree this might be a potential concern.
But they did have data addressing that issue. As noted in the Supporting Information, “when a subset of 30 respondents was asked to guess the hypothesis of the study, none suspected it was related to applicant gender.” Many of those surveyed did think the study was about hiring biases for “analytic powerhouses” or “socially-skilled colleagues.” But not about gender biases, specifically. In fact, these descriptors were added to mask the true purpose of the study. And importantly, the gendered descriptors were counter-balanced.
The fifth experiment also addresses this concern by presenting raters with only one applicant. This methodological feature meant that raters couldn’t compare different applicants and then infer that the study was about gender bias. A female preference was still found even in this setup that more closely matched the earlier 2012 PNAS study.
(c) HOW WELL DID THE STUDY SIMULATE THE REAL WORLD?
You note scientists hire based on CVs, not short narratives. Do the results extend to evaluation of CVs?
There’s some evidence they do, from Experiment 4. In that experiment, 35 engineering professors favored women by 3-to-1.
Could the evidence for CV evaluation be strengthened? Absolutely. With the right resources (time; money), any empirical evidence can be strengthened. That experiment with CVs could have sampled more faculty or other fields of study. But let’s also consider that this study had 5 experiments involving 873 participants, which took three years for data collection.
Now let’s contrast the resources invested in the widely reported 2012 PNAS study. That study had 1 experiment involving 127 participants, which took two months for data collection. In other words, the current PNAS study invested more resources than the earlier one: almost 7:1 for number of participants (873 vs. 127) and 18:1 for time collecting data (36 months vs. 2). The current PNAS study also replicated its findings across five experiments, whereas the earlier study had no replication experiment.
My point is this: the available data show that the results for narrative summaries extend to CVs. Evidence for the CV results could be strengthened, but that involves substantial time and effort. Perhaps the results don’t extend to evaluation of CVs in, say, biology. But we have no particular reason to suspect that.
You raise a valuable point, though, that we should be cautious about generalizing from studies of hypothetical scenarios to real-world outcomes. So what do the real-world data show?
Scientists prefer *actual* female tenure-track applicants too. As I’ve noted elsewhere, “the proportion of women among tenure-track applicants increased substantially as jobseekers advanced through the process from applying to receiving job offers.”
https://theconversation.com/some-good-news-about-hiring-women-in-stem-doesnt-erase-sex-bias-issue-40212
This real-world preference for female applicants may come as a surprise to some. You wouldn’t learn about these real-world data by reading the introduction or discussion sections of the 2012 PNAS study, for instance.
That paper’s introduction section does acknowledge a scholarly debate about gender bias. But it doesn’t discuss the data that surround the debate. The discussion section makes one very brief reference to correlational data, but is silent beyond that.
Feeling somewhat unsatisfied with the lack of discussion, I was eager to hear what those authors had to say about those real-world data in more depth. So I talked with that study’s lead author, Corinne Moss-Racusin, in person after her talk at a social psychology conference in 2013.
She acknowledged knowing about those real-world data, but quickly dismissed them as correlational. She had a fair point. Correlational data can be ambiguous. These ambiguous interpretations are discussed at length in the Supporting Information for the most recent PNAS paper.
Unfortunately, however, I’ve found that dismissing evidence simply because it’s “correlational” can stunt productive discussion. In one instance, an academic journal declined to even send a manuscript of mine out for peer review “due to the strictly correlational nature of the data.” No specific concerns were mentioned, other than the study being merely “correlational.”
Moss-Racusin’s most recent paper on gender bias pretends that a scholarly debate doesn’t even exist. Her most recent paper cites an earlier paper by Ceci and Williams, but only to say that “among other factors (Ceci & Williams, 2011), gender bias may play a role in constraining women’s STEM opportunities.”
dx.doi.org/10.1177/0361684314565777
Failing to acknowledge this debate prevents newcomers to this conversation from learning about the real-world, “correlational” data. All data points should be discussed, including both the earlier and new PNAS studies on gender bias. The real-world data, no doubt, have ambiguity attached to them. But they deserve discussion nevertheless.
WAS THE PEER REVIEW PROCESS RIGOROUS?
Peer review is a cornerstone of producing valid science. But was the peer review process rigorous in this case? I have some knowledge on that.
I’ve talked at some length with two of the seven anonymous peer reviewers for this study. Both of them are extremely well respected scholars in my field (psychology), but had very different takes on the study and its methods.
One reviewer embraced the study, while the other said to reject it. This is common in peer review. The reviewer recommending rejection echoed your concern that raters might guess the purpose of the study if they saw two men and one woman as applicants.
You know what Williams and Ceci did to address that concern? They did another study.
Enter data, stage Experiment 5.
That experiment more closely resembled the earlier 2012 PNAS paper and still found similar results by presenting only one applicant to each rater. These new data definitely did help assuage the critical reviewer’s concerns.
That reviewer still has a few other concerns. For instance, the reviewer noted the importance of “true” audit studies, like Shelley Correll’s excellent work on motherhood discrimination. However, a “true” audit study might be impossible for the tenure-track hiring context because of the small size of academia.
The PNAS study was notable for having seven reviewers, because the norm is two. The earlier 2012 PNAS study had two reviewers. I’ve reviewed for PNAS myself (not on a gender bias study); the journal published that study with only me and one other scholar as the peer reviewers. The journal’s website even notes that having two reviewers is common at PNAS.
http://www.pnas.org/site/authors/guidelines.xhtml
So having seven reviewers is extremely uncommon. My guess is that the journal’s editorial board knew the results would be controversial and therefore took heroic efforts to protect the reputation of the journal. PNAS has come under fire from multiple scientists who repeatedly criticize the journal for letting studies simply “slip by” and get published because of an old boys’ network.
The editorial board probably knew that would be a concern for this current study, regardless of the study’s actual methodological strengths. This suspicion is further supported by some other facts about the study’s review process.
External statisticians evaluated the data analyses, for instance. This is not common. Quoting from the Supporting Information, “an independent statistician requested these raw data through a third party associated with the peer review process in order to replicate the results. His analyses did in fact replicate these findings using R rather than the SAS we used.”
Now I embrace methodological scrutiny in the peer review process. Frankly, I’m disappointed when I get peer reviews back and all I get is “methods were great.” I want people to critique my work! Critique helps improve it. But the scrutiny given to this study seems extreme, especially considering all the authors did to address the concerns such as collecting data for a fifth experiment.
I plan on independently analyzing the data myself, but I trust the integrity of the analyses based on the information that I’ve read so far.
SO WHAT’S MY OVERALL ASSESSMENT?
Bloggers have brought up valid methodological concerns about the new PNAS paper. I am impressed with the time and effort put into producing detailed posts such as yours. However, my overall assessment is that these methodological concerns are not persuasive in the grand scheme. But other scholars may disagree.
So that’s my take on the methods. I welcome your thoughts in response. I doubt this current study will end debate about sex bias in science. Nor should it. We still have a lot to learn about what contexts might undermine women.
But the current study’s diverse methods and robust results indicate that hiring STEM faculty is likely not one of those contexts.
Disclaimer: Ceci was the editor of a study I recently published in Frontiers in Psychology. I have been in email conversation with Williams and Ceci, but did not send them a draft of this comment before posting. I was not asked by them to write this comment.
dx.doi.org/10.3389/fpsyg.2015.00037
Hi David,
In your first two points (a and b), you have restated the study’s methodology, which I’ve already critiqued. The study makes broad claims about hiring practices in STEM by sampling four disciplines. In the first instance, it uses only one of these disciplines, psychology, as a control, with only 82 participants, who were paid, unlike the other participants. If the study were about hiring bias in psychology, I would expect the control to be psychologists. It is most odd that the authors have chosen only a sub-set of people for their control. Second, as I already noted, Williams and Ceci sent CVs, as part of their bias control, to only another sub-set of their sample, the mechanical engineers, so this subset was evaluating material unlike that used in the rest of the study, on which the conclusions are drawn.
The researchers’ odd choices weaken their position that their dataset has been tested for self-selection bias. I’ve already covered this, and you have not proved anything to the contrary. All we have to go on is what is published, and the published results are riddled with methodological flaws. At the end of the day, that’s what counts: what the researchers say about their data. As presented, the data and methods do not match the conclusions, which make wild claims about the lack of bias in hiring preferences in STEM.
A critical review of the methodology in light of the existing evidence of empirical studies demonstrates that the study is highly flawed, as the authors have overstated the significance of their results. All they have collected is data about how a sub-set of self-selecting faculty (those who responded to the survey) read short narratives about candidates. Again: such narratives are nothing like the actual material that faculty teams assess. The researchers have collected data about narratives and nothing more.
As for being able to guess the design of the study: as noted, it is not difficult to see the gender component, as other faculty have discussed on social media. For the benefit of other readers, the sample narrative provided by Williams and Ceci in their supplementary paper is reproduced at the end of my comment. This narrative is about a single woman candidate, who supposedly mentions to the Chair that she has no children and no problems starting work. In the case of women with kids, the narratives mentioned their family situation. The study told participants to consider the information provided by a fictional chair, including family circumstances. Why would a fictional faculty Chair write a narrative including the applicant’s family situation unless this fictional chair wanted that information to be assessed? As I’ve noted, university policy requires some level of mandatory training on sexual discrimination. Anyone reading this narrative would pick up on the language and social cues indicating that gender discrimination was being screened.

The 30 participants in the sub-set used as the control did not guess that the hypothesis was related to gender. To reiterate: only 30 out of over 2,000 were asked this question. Nevertheless, being asked to review three candidates, only one of them a woman, where family situation is discussed, represents an obvious social cue that gender is an issue. Social desirability is a central part of social science methods training: participants try to answer survey questions in the way they guess is the “right” way. This is why we control for social desirability. This hasn’t been done in Williams and Ceci’s dataset.
Regardless, none of this changes the fact that the data presented in the published PNAS study concern narratives that have no bearing on the real-life materials assessed as part of the hiring process. Contrary to the researchers’ claims, they do not have data proving that individuals reading narratives is a useful simulation of how selection teams deliberate over hiring decisions. If they had these data, we would probably not be having this discussion about methods not matching conclusions. If they wanted to claim that gender bias favours women in hiring, they should have designed an experiment that better simulated the context in which hiring decisions are made. And those decisions are not made on the basis of one short paragraph about family.
I’ve linked to other studies showing that experiments with CVs do show bias when potential faculty candidates are mothers. These women are disadvantaged and less likely to be offered a position, and when they are, they are offered a dramatically lower starting rate than fathers. Real-world data bear this out, which I’ve also linked to, showing that mothers take much longer to achieve tenure. Williams and Ceci argued in their last paper that women are not disadvantaged in academia, and they made reference to this (then-forthcoming) paper now published in PNAS. All of their research is tied together by the same ideological position that academia is “gender neutral”, a quote straight out of Williams and Ceci’s mouths (and their colleagues’) in reference to their broader research project (of which this PNAS paper is one component).
For some reason, you have chosen to critique Moss-Racusin based on what you perceive to be her lack of resources, and on a once-off conversation after a conference many years ago. You say Ceci and Williams spent three years collecting data for their study: they clearly needed to spend longer, and channel their time and seemingly expansive resources into actually collecting the type of data that would allow them to make the claims they make. In any case, the length of data collection is neither here nor there, and neither is your perception of how much time and resources Moss-Racusin invested in her study. A well-funded study with flawed methods, which sets out to prove a political or ideological viewpoint that academia is “gender neutral”, is still a poor piece of research. Your critique of Moss-Racusin seems personal, and frankly it does not help your position. Moss-Racusin conducted a good study that did not claim to have solved the problem of sexism in academia. Williams and Ceci have conducted a weak study given their grand conclusions. Data must match methods; that is the crux of my critique. Resources don’t change this fact.
As for your final point about the peer review process of this paper: your comments are perplexing. You assert that you know seven reviewers read Williams and Ceci’s paper. Whether it’s two or seven reviewers, the peer review process for any peer-reviewed journal is set by the editors of that journal as they see fit. It depends on the field or sub-discipline: most sociology journals will seek three reviewers, but a paper will go out to more if there are discrepancies in the reviewers’ recommendations. So if anything, the fact that this study was sent to seven reviewers may suggest that some reviewers’ comments gave rise to editorial misgivings, and so further reviewers may have been sought. Perhaps the PNAS editors decided on the otherwise arbitrary number of seven reviewers knowing that the study would be controversial. Regardless, PNAS decided to publish the study in the end. So, in fact, whatever the reviewers said is a moot point after the editorial team’s decision to publish the paper.
Ultimately, the decision to publish a flawed study rests on the shoulders of the editors, PNAS in this case, while critiques about the data and conclusions still reflect on the researchers, in this case Williams and Ceci. The number of reviewers has no bearing on the data that are actually published. Unless, of course, the editors are also the authors of a study, as was the case in Williams and Ceci’s last study on gender, in which case the peer review process has failed completely, given that study was riddled with methodological issues.
That’s the theme with Williams and Ceci’s work: their scholarship is open to critique because their methods, conclusions, omissions and presumptions are coloured by privilege on several levels. Race, class, sexuality and other socio-economic measures are repeatedly sidestepped because the researchers, being White tenured professors, presume that the academy has no problem with sexism. In their world view as White, privileged faculty, they don’t see discrimination, claiming in the New York Times about their previous study, “We are not your father’s academy anymore.” “We” being White professors with an aversion to addressing gender inequality and power dynamics. Well, plenty of studies have shown that girls and women are disadvantaged at every stage of their careers, including in hiring.
For some puzzling reason, you have elected to divulge your contact with two of the seven anonymous reviewers of Williams and Ceci’s study. The peer review process is anonymous for a reason: to protect authors and researchers from bias, influence and coercion, and to ensure that the process is as objective as possible. Many academic circles are insular, so perhaps it’s easy to guess who might have been involved in reviewing certain papers. Even though you do not mention names, I find the fact that you discuss this in defence of Williams and Ceci both disturbing and off topic. The data matter at the end of the day, as I keep noting, not whatever investigations you decided to carry out on your own.
You are crossing into unsavoury ethical ground here. If what you say here is correct, it may suggest that PNAS does not adequately protect the integrity of the peer review process by allowing researchers to contact and berate reviewers to explain or justify their publishing recommendations. The academic peer review system relies on trust and professional conduct. Part of that is not invoking what reviewers may have told you in confidence to support a study that you seem to be personally or professionally invested in. You are still a PhD student, David; I hope you will reflect on how this behaviour may impact on your professional conduct.
You also disclose that Ceci was the editor of a previous article of yours, that you have been in contact with Williams and Ceci over email, and that they have not read over the comment you’ve submitted here. You also saw fit to tell me on Twitter that Williams and Ceci have read my blog post. How are all these pieces of information related? You seem to be closely aligning yourself with the authors, and while you claim not to be working on their behalf, you sure are spending a lot of time defending them by invoking your connection to them. It is good that you enjoy their work. Critiquing other researchers’ resources and sharing private conversations are not the best way to convey this defence.
The peer review system is not the end of scholarly engagement or critique. It is simply a gatekeeping service. We trust peer-reviewed journals to provide robust, expert opinions from reviewers and editors before publishing, but once a study is published, it is critiqued on the merits of the published material.
A fatal flaw in the methodology of Williams and Ceci’s PNAS study is that, no matter how the authors, or anyone else, spin it, they simply do not have the data to back up their claims. Williams and Ceci have data about how faculty respond to narratives that have zero connection to how real-life hiring is conducted in academia. They have overstated their flawed data to extrapolate the claim that women are not disadvantaged in hiring; in fact, they claim that women are advantaged over men 2:1. Notwithstanding the obvious problem with this premise (not all academics fall into a neat divide of “men” and “women”, and race as well as other social markers compound inequality), the fact remains: extraordinary claims require extraordinary evidence. You are convinced by Williams and Ceci’s evidence, even though it contradicts the empirical literature and the lived experiences of women in STEM. We can agree to disagree, but perhaps let’s stick to the evidence. This study does not offer conclusive evidence to negate gender bias. You can choose to believe otherwise. That is the beautiful thing about science: we weigh up evidence using our training, expertise and experience, as well as the extensive body of literature. In this case, the evidence is stacked against Williams and Ceci’s outlandish supposition that hiring bias is in women’s favour and that academia is gender neutral.
***
One sample narrative provided by Williams & Ceci (p.7):
Imagine you are on your department’s personnel/search committee. Your department plans to hire one person at the entry assistant-professor level. Your committee has struggled to narrow the applicant pool to three short-listed candidates (below), each of whom works in a hot area with an eminent advisor. The search committee evaluated each candidate’s research record, and the entire faculty rated each candidate’s job talk and interview on a 1-to-10 scale; average ratings are reported below. Now you must rank the candidates in order of hiring preference. Please read the search committee chair’s notes below and rate each candidate. The notes include comments made by some candidates regarding partner-hire and family issues, including the need for guaranteed slots at university daycare. If the candidate did not mention family issues, the chair did not discuss them.
Dr. X: X struck the search committee as a real powerhouse. Based on her vita, letters of recommendation, and their own reading of her work, the committee rated X’s research record as “extremely strong.” X’s recommenders all especially noted her high productivity, impressive analytical ability, independence, ambition, and competitive skills, with comments like “X produces high-quality research and always stands up under pressure, often working on multiple projects at a time.” They described her tendency to “tirelessly and single-mindedly work long hours on research, as though she is on a mission to build an impressive portfolio of work.” She also won a dissertation award in her final year of graduate school. X’s faculty job talk/interview score was 9.5 out of 10. At dinner with the committee, she impressed everyone as being a confident and professional individual with a great deal to offer the department. During our private meeting, X was enthusiastic about our department, and there did not appear to be any obstacles if we decided to offer her the job. She said she is single with no partner/family issues. X said our department has all the resources needed for her research.
Zuleyka, yes I agree the data and methods are the most important to consider, so let’s focus on those. I’m going to organize my thoughts under the same categories as before. We will probably have to agree to disagree on some points and that’s fine.
But, as a sidenote, I completely agree that some other studies show evidence of gender bias against women. Including the Moss-Racusin study. That study did show bias against women in hiring lab managers and that’s an important finding. And we need more experimental evidence of the contexts that do undermine women in a similar way. Agreed.
Let’s go back to the PNAS study now.
(a) Self-selection
You make a fair critique of the control experiment that sampled psychologists to assess self-selection bias.
Can we agree, though, that there were other data for assessing self-selection bias? Specifically, the analyses comparing response rates by gender, rank, and discipline.
(b) Guessing study’s purpose
You note a Facebook comment made by Lisa Boulanger’s husband suggesting that participants did guess the study’s purpose, contrary to the data with those 30 participants.
Certainly her husband did think the study was about hiring biases. But can we agree her husband said nothing specific about gender?
And like I mentioned earlier, “Many of those surveyed did think the study was about hiring biases for ‘analytic powerhouses’ or ‘socially-skilled colleagues.’ But not about gender biases, specifically.”
(c) Real-world
Can we agree that the NRC 2010 report showed that the percent women increased from the applicant pool to job offers?
David, I’ve already covered these points more than once: first on social media, then in my article, and then in my reply to your first verbose response on my blog. Science does not require that we agree on anything. You can hold your views without needing me to validate things that I’ve already shown to be scientifically flawed.
While the overall number of women in STEM has improved compared to the 1970s, these gains have not gone nearly far enough. As I’ve mentioned repeatedly, here and elsewhere, most of the increase has been for some White women, while there are not enough women of colour, transgender women, LBTQIA women, women with disabilities, and women of other intersecting minority backgrounds. Over this past week, Harvard announced that it had achieved gender parity for the first time in its history. MIT and other elite universities have publicly stated they are working towards gender equity. The Athena SWAN Charter in the UK and the Australian Academy of Science’s Science in Australia Gender Equity (SAGE) Forum are set up to evaluate and address gender inequality, including in hiring, and many universities are not faring as well as they should, which is why these programs support their improvement. In a forthcoming piece, my colleagues and I will show that the number of women hired as academics drops at other universities as soon as they stop actively targeting gender diversity.
So: in light of real-world evidence, the overall patterns (with women holding less than 17% of senior faculty positions in Australia and the UK) may look positive to people with White privilege and male privilege, as well as anyone not aware of heterosexism, ableism and other dynamics of power, but the gender patterns are abysmal to everyone else who is invested in gender equality and inclusion in STEM.
An increase from almost no women in STEM to slightly more (mostly White, heterosexual, able-bodied) women is not much progress given that it is 2015. Lazy ideologies claiming these small numbers represent a 2:1 advantage in hiring are misguided.
My criticisms of the Williams and Ceci study stand, no matter how much you’d like to show up on my blog to try to change my mind. The goal of science is not to harass women into agreeing with you; you present evidence, that evidence is evaluated using science, and if the evidence is valid and reliable, then minds get changed.
Williams and Ceci have produced a weak study with poor methods, publishing biased conclusions that fit their pre-existing world-view that gender equality is not a big issue in academia. You can keep defending them for whatever reasons you see fit, but you don’t get to do it over and over on my blog. You have @-mentioned me at least 40 times on Twitter about this, and you’ve submitted two long comments (one of them twice) on my blog. That’s enough. Just so you know: berating women in science, on social media and in life generally, is not a good way to be a colleague. I will not allow you to post further comments on this post, as you’re simply restating the same points over and over and I don’t see your behaviour as constructive.
The real question now is why it’s so important to you, a White man, that a woman (of colour, no less) bends to your perception of gender bias. Gender, race and other social dynamics are present in all social interactions, and they influence our epistemologies (which is why we train to control for and address them). Please refrain from pushing the same arguments, especially given that I specifically asked you last time to stop, and given that I’d already explained elsewhere that my blog is moderated because White men like to harass me constantly, given I write about racism, sexism and other inequalities. I publish comments when I have the time to read and respond to them; your timelines don’t factor into my ability to moderate this blog, which I do for free, as a service to sociology, social justice studies, and STEM public education.
Your valiant defence of the #GaslightingDuo can continue wherever you like, just not on my blog.
Thank you for the thoughtful response. It will take me some time to process. But I first wanted to clarify one thing.
I did not contact PNAS about who the peer reviewers were. I was discussing the study with my colleagues, and two of them volunteered to me that they were peer reviewers. I did not ask them, but rather they volunteered that information. I even sent my comment to the reviewer whose views I discuss, and that reviewer expressed no concern with my comment. And like you mention, I have kept their identities private. The fact that there were seven reviewers is publicly noted in the acknowledgments section of the PNAS paper.
Hi David, I hope you understand my point. It is troubling that you invoke the peer review process as a way to defend this study. You say you sent your comment to one of the anonymous reviewers and they saw no problem. This seems odd to me: you had one of the anonymous PNAS peer reviewers read over your comment to my blog? Why? The question is rhetorical, because the issue is broader: we don’t discuss the peer review process as a way to defend a paper, for the reasons I’ve stated.
As I said, academic circles can be small. That does not mean you can invoke these personal connections in a public defence of a study. It serves no purpose other than to muddy the lines of objectivity and ethics. What is okay to you and one or two reviewers is not necessarily how the peer review system should work.
Still, as I say, this is all beside the point. You are using personal discussions and personal attacks to publicly defend Williams and Ceci’s research. They don’t need you to defend their work. As I’ve repeatedly stated, here and on Twitter, their data speak for them. You want to believe their data and conclusions are justified; that’s your prerogative. There’s no need to keep going over the same ground. Your enthusiasm for this study is noted.
My critique stands.
Zuleyka, I agree that the most important issue is the methods and data, and I promise I will reply to your thoughtful comments about that. I just wanted to clarify the information. You said that what I said, “may suggest that PNAS does not adequately protect the integrity of the peer review process by allowing researchers to contact and berate reviewers to explain or justify their publishing recommendations.”
If that were true, that would be politically damaging to me. But it’s not true. So I wanted to clarify that.
I promise I will reply to your comments on the methods and data after I’ve had the chance to fully process your arguments.
Gender bias in STEM is a myth. Watch and learn:
Hi Shady, and thanks for the link! For others reading, this link is to Hjernevask (Brainwash), a controversial documentary by two Norwegian comedians. While the presenter talks to some experts and includes lots of vox pops with the public, this is not science. The documentary covers the so-called “nature versus nurture” debate about why there are fewer girls in STEM professions, despite Norway’s progressive gender policies. My colleagues and I have already addressed the scientific literature on these issues, showing how institutional processes influence girls’ and women’s transition from education to careers in science. I’ve already linked to our writing in my post, but here’s the link again: http://blogs.nature.com/soapboxscience/2014/09/04/nature-vs-nurture-girls-and-stem
Ceci and Williams used people in their study who are above the average; this may have had an effect that could downplay the extent of bias.
Hi Fernando Junior. Not sure what you mean by “above the average”? The biggest problem with their methods is that they sent out a survey asking faculty to respond to a short hypothetical scenario. This does not simulate the reality of how women are hired in academia. So their claim that women are favoured 2:1 is invalid: they don’t have the data to support this claim.