Are bankers really more dishonest?

Nobody likes a merchant banker, and a new report in Nature, “Business culture and dishonesty in the banking industry”, makes the case that such distaste may have a sound basis: bankers who took a survey which asked questions about their jobs behaved more dishonestly than bankers who took a survey which addressed mundane, everyday topics, such as how much television they watched per week. It’s a catchy claim. But in contrast to the headlines, the data suggest something else: bankers were more honest overall than other groups, and at worst no more dishonest.

Each group of bankers was asked to toss a coin 10 times and report, online and anonymously, how often it landed on each side. They were told that each time the coin landed on a particular side (heads for some, tails for others), they could win $20.

The group who took the job-related survey reported 58.2% successful coin flips, while the control group reported 51.6% successful coin flips. Thus, the authors argued, priming the bankers with their professional identity made them more likely to dishonestly claim that they had tossed coins more successfully than they actually had.
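To see why group sizes matter for claims like this, note that under honest reporting each flip pays off with probability 0.5, so the question is whether a reported rate like 58.2% is credibly above chance. Here is a minimal sketch of that check in Python; the sample size is invented purely for illustration, and this is not the paper’s actual analysis.

```python
from scipy.stats import binomtest

# Hypothetical group size -- the paper's actual Ns are not reproduced here.
n_participants = 100
n_flips = n_participants * 10            # 10 coin tosses per person

for label, rate in [("primed bankers", 0.582), ("control bankers", 0.516)]:
    wins = round(rate * n_flips)
    test = binomtest(wins, n_flips, p=0.5, alternative="greater")
    print(f"{label}: {wins}/{n_flips} reported wins, one-sided p = {test.pvalue:.3g}")
```

With these made-up numbers, 58.2% sits comfortably above chance while 51.6% does not; how comfortably depends entirely on how many flips sit behind each percentage.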

To follow this up, the authors conducted two more studies with different populations: non-banking professionals and students. For these two groups, there was no effect of priming with professional identity; control groups and “treatment” (i.e. primed) groups performed similarly. Hence the headline finding: making bankers think about their professional identity made them more dishonest, other groups did not become more dishonest when primed with theirs, and thus, the argument goes, there is something about banking and banking culture that makes an honest person crooked.

But more dishonest than who?

Curiously, what is glossed over in the main paper – it appears only in the extended figures and the supplementary information – is that for the non-banking professionals and the students, the control groups were as dishonest as the primed groups. In fact, of all the groups, the odd one out is the banking control group. Whereas the banking control group reported 51.6% successful coin flips, the non-banker and student control groups reported 59.8% and 57.9% respectively. The primed banking group reported 58.2% successful flips, while the non-banker and student primed groups reported 55.8% and 56.4% respectively.

If we collapse across the control and primed groups and simply look at the average success rate for each sample population, bankers reported 54.6% successful coin flips, non-banking professionals 57.8%, and students 57.15%. Thus, overall, the bankers were the most honest group.
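Spelled out, that collapsing step is just an average of the two conditions within each sample. A quick sketch, using an equal weighting of the two conditions (the pooled figures quoted above presumably weight by the actual group sizes, which is why the banker number lands at 54.9% here rather than 54.6%):

```python
# Reported success rates (%) per condition, as quoted above.
rates = {
    "bankers":      {"control": 51.6, "primed": 58.2},
    "non-bankers":  {"control": 59.8, "primed": 55.8},
    "students":     {"control": 57.9, "primed": 56.4},
}

for group, conditions in rates.items():
    # Equal-weight average over the two conditions.
    pooled = sum(conditions.values()) / len(conditions)
    print(f"{group}: {pooled:.2f}% reported wins overall")
```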

So maybe the headline should be that bankers are more honest than other groups, until they’re reminded that they’re bankers. Then they’re as dishonest as everyone else (or at least, non-banking professionals, and students).

Hidden moderators and experimental control

Hidden moderators come up regularly as a possible explanation for failed replications. The argument goes something like this: the original experiment found the effect, but the replication did not. Therefore, some third, unknown variable has changed. Perhaps the attitudes or behaviours which gave rise to the effect are not present in the sampled population, or at least this specific sample:

Doyen et al. apparently did not check to make sure their participants possessed the same stereotype of the elderly as our participants did. – John Bargh

Perhaps the transposition of the experiment across time and space has led to the recruitment of subjects from a qualitatively different population:

Based on the literature on moral judgment, one possibility is that participants in the Michigan samples were on average more politically conservative than the participants in the original studies conducted in the UK. – Simone Schnall

And perhaps, in the case of some social priming effects, societal values have changed so much in the period between the original study and the replication that this specific effect will never be found again: its ecological niche has vanished, or has been occupied by another, more contemporary social more.

These are valid possible explanations for why a replication may have failed [1]. But the implication typically seems to be that since the replicators did not account for these potential hidden moderators, the replication is fatally flawed and should not be published as is. Faced with this critique from a reviewer, replicating authors are left with two alternatives: give up and don’t publish it; or collect more data and attempt to establish experimental control:

My recollection is that we used to talk about experimental control. Perhaps this was in the days of behaviourism. The idea was that the purpose of an experiment was to gain control over the behaviour of interest. A failure to replicate indicates that we don’t have control over the behaviour of interest, and is a sign that we should be doing more work in order to gain control.

– Chris Frith

In an ideal world, establishing experimental control is the best alternative. The original effect is genuine, but perhaps the luminance of the stimuli, the lighting in the experimental chamber, or the political leanings of the participants differed across experiments. Running more experiments which account for these variables means we improve our understanding of the effect, establishing the boundary conditions under which it does and does not appear. If the reviewer has correctly identified a hidden moderator, then the understanding of the effect is greater than it was before.

So what’s the catch?

This is all well and good when the effect itself is well established, with strong evidence in its favour. But what if the original evidence was weak? A significant effect does not mean the evidence was strong, and you can’t establish boundary conditions for an effect which doesn’t exist; you can only provide more opportunities for false positives. Demanding that replicators run more experiments to test for potential hidden moderators places an additional experimental burden on them, for an effect they have already provided evidence is, at best, substantially weaker than originally reported, and it places them in a difficult situation: running more experiments can never provide a definitive answer to the hidden moderator critique.
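To put a rough number on “more opportunities for false positives”: if the effect is in fact zero and each demanded follow-up is tested at α = .05, the chance of at least one spurious positive grows quickly with the number of follow-ups. A back-of-the-envelope sketch:

```python
# Chance of at least one false positive across k independent follow-up
# experiments, each tested at alpha = .05, when the true effect is zero.
alpha = 0.05
for k in (1, 3, 5, 10):
    print(f"{k} follow-ups: {1 - (1 - alpha) ** k:.2f} chance of a spurious 'effect'")
```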


Damned if you do; and damned if you don’t

Even if the effect re-emerges, this does not mean that it explains the discrepancy between the original and the replicating experiments: the problem with hidden moderators is that they’re hidden, and by definition, their influence on the results of the original study is unknown [2]. Thus, to an author, the hidden moderator critique can feel somewhat unfair: you are criticized for not controlling something which was not controlled in the original study. And if the reviewer identifies a potential hidden moderator that turns out to have no effect, they may demand yet more experiments to account for yet more hidden moderators, or worse, criticize the replicators for failing to identify conditions under which the effect emerges.

How sure are you about the results?

What’s missing is a consideration of the strength of the evidence [3]. It’s all too easy to over-estimate how strong the original evidence was [4]. It shouldn’t always be enough to simply say that the effect was significant in the original study, and that therefore those wishing to publish a failed replication must also find conditions under which it emerges, or at least account for as many different reasons why it may not emerge as the reviewer can think of. This may be appropriate if the original study provided strong evidence in favour of the effect – but if it didn’t, the barrier should be lower for a replication to be viable in its own right. What should be necessary is that the evidence the replication provides is strong on its own terms; if it is, it constitutes a valuable data point in its own right, even without follow-ups aimed at uncovering a putative moderator or mechanism for an effect we now have less reason to believe is a general one.
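Footnote [3] links to one way of quantifying that strength with Bayes factors. As a toy illustration, here is a minimal sketch using the coin-flip setup from the banking study, with a default Beta(1,1) prior and made-up counts (not the paper’s data or analysis): the same observed proportion can constitute weak or strong evidence depending on how many flips are behind it.

```python
import math
from scipy.special import betaln

def bf10(wins, flips):
    """Bayes factor for the reported win count:
    H1: win rate ~ Beta(1,1)  vs  H0: win rate = 0.5 (honest reporting)."""
    log_h1 = betaln(wins + 1, flips - wins + 1)  # marginal likelihood under H1
    log_h0 = flips * math.log(0.5)               # likelihood under the point null
    # The binomial coefficient is common to both models and cancels in the ratio.
    return math.exp(log_h1 - log_h0)

print(bf10(58, 100))     # roughly 0.4: weak, if anything slightly favours the null
print(bf10(582, 1000))   # tens of thousands: overwhelming evidence against 50%
```

Same percentage, very different strength of evidence – which is exactly why “it was significant” isn’t, on its own, much of a reply to a failed replication.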

 


[1] And if not, there’s always… aliens?!

[2] Even if the participant’s predilection for wearing outlandish hats moderates their susceptibility to the priming of personality judgements by the colour of the experimenter’s hat, there was no measure of outlandish hat-wearing in the original study.

[3] Here’s a nice example of using the Bayes Factor to do this from Felix Schoenbrodt: http://www.nicebread.de/reanalyzing-the-schnalljohnson-cleanliness-data-sets-new-insights-from-bayesian-and-robust-approaches/

[4] And this does not imply the original researchers did anything wrong, à la QRPs or p-hacking: I’m talking here simply about statistical strength and evidential value, not suggesting questionable practice or methodological failure. These things happen. That’s why we do statistics!

Every action potential, every neuron

Neuroscience was not always what I wanted to do. All I really wanted to do was play Smells Like Teen Spirit.

My parents bought me a battered nylon-strung acoustic from a car boot sale. Mr Brown, my chemistry teacher, taught me the basics. I quickly got the hang of simple chord shapes, the As, the Gs, the Es. Soon, I was banging out House of the Rising Sun and Blowin’ in the Wind like every neophyte guitarist before me.

But all I really wanted to do was play Smells Like Teen Spirit. How could I make my guitar sound like that? I started playing the basic melody on the low E string. Then I figured out power chords, which sound as limp on a classical acoustic as classical implies. I moved on to electric guitar (“Judas!”). Bigger. Louder. Cooler.

I had the basics. I cranked up the overdrive and made my chords crunch. I listened to the solo over and over until I could play every note without watching my fingers move up and down the fretboard. I figured out what effects to use, how close to stand to the amp to get feedback, how to mute the strings with my palm to get a percussive effect.

And all I’d really wanted to do was play Smells Like Teen Spirit. I moved on to more complicated songs, learning new riffs and new techniques, repeating them over and over until they became as natural as speaking. And I realized how often the details were less important than the generalities. You could shift all the notes to a different key, or play them all on a glockenspiel. You didn’t even have to play exactly the same notes. Once you had the overall structure down, you could take it for a walk to wherever you wanted to go.

These are grand times for neuroscience. Huge, ambitious projects with incredible scope garner Presidential attention and lavish funding. The big new idea? To record every action potential from every neuron; to build the most complete model of the human brain that’s ever been built; to be able to reproduce every instant of every task, on demand, just as if it were happening right now.

If all I’d really wanted to do was play Smells Like Teen Spirit, maybe, if I’d had the technology, I could have broken every instant down all the way to its individual frequency components. And then I could reproduce those exact frequencies on demand, without worrying about what produced them. Every detail, all the way through the song, exactly as it was, all without knowing a single chord, all without knowing how the guitar makes the sounds it does, what shape it needs to be, or how and why the strings resonate at particular frequencies, and all without knowing how and why the song was made, or why hearing it made me want to play it.
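In signal-processing terms, that fantasy is just a Fourier transform and its inverse: the waveform can be reconstructed exactly from its frequency components without learning anything about chords, structure, or intent. A toy sketch with a made-up “riff” built from three sine waves:

```python
import numpy as np

# A made-up "riff": two seconds of audio built from three sine waves (E2, A2, D3).
sample_rate = 44100
t = np.arange(0, 2.0, 1 / sample_rate)
riff = sum(np.sin(2 * np.pi * f * t) for f in (82.4, 110.0, 146.8))

# Break every instant down into its frequency components...
spectrum = np.fft.rfft(riff)

# ...then reproduce those exact frequencies on demand.
copy = np.fft.irfft(spectrum, n=len(riff))

# A numerically perfect copy -- obtained without knowing a single chord,
# how a guitar works, or why the song was written.
print(np.allclose(riff, copy))   # True
```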

When it comes to researching the brain, there are thousands of people playing in different keys, each learning different parts on different instruments, each trying to find out what note they should be playing. Sometimes we find a new instrument, or even a new note. Sometimes it turns out to be the same old instrument playing a different note, or the same old note on a different instrument. And different movements in the composition rise and fall on the weight of evidence.

Do we really need to rebuild a particular guitar to learn the song? Of course, details are important. But knowing how to reproduce the notes is not the same thing as knowing how to play them in the right order. And sometimes you need to know how the song goes before you can know when you’re hitting the wrong notes. Until we have a feel for the movements, how can we understand where the notes should go?

But I digress; after all, all I really wanted to do was play Smells Like Teen Spirit.

Frontiers Research Topics

A few months ago, I got an invitation to host a Research Topic at Frontiers, one of the (relatively) new wave of Open Access journals, where authors pay publication costs and readers can freely access articles. Apparently, my recent article would be an excellent fit for the Research Topics initiative! Research Topics are where a couple of editors get together and invite submissions on their pet topic – a bit like a special issue, or a conference symposium. I’ve seen some great examples of these on topics dear to my own heart, like VanRullen & Kreiman’s The timing of visual object recognition, so you’d think being asked to host one would be pretty cool, no?

Now here’s the thing: I get plenty of spam. Invitations to random conferences, offers of monoclonal antibodies, invitations to enlarge various appendages. Every couple of months I get a letter from a vanity publishing press asking if I want to publish my thesis as a book. The common theme is that they’re rather impersonal. When I get what reads like a form email that makes only a cursory reference to me and my work and then tells me what a great opportunity it’s providing me, I get suspicious.

After a moment’s pause, I discounted the invitation as spam, as I have done with the repeat invitations since. And the pause was only because it was from Frontiers, a journal family I like. From a few conversations on Twitter, it feels like this is a pretty common reaction.

The attitude that Open Access is simply vanity publishing is one I clearly disagree with (I’ve published in both PLOS ONE and Frontiers), but it’s a long way from being a dead opinion yet. It’s not great if even supporters of the OA movement and of Frontiers find these kinds of invitations a bit spammy.

There are a couple of things at play here. I’m a junior researcher. Nobody asks me to host a symposium or edit a special issue. I once mentioned to a colleague that perhaps we could try setting up a research topic with Frontiers, and the reaction was “isn’t that really for more senior researchers?” If I start inviting people, I feel like the most likely reaction will be “Who are you, and how did you get my address?” This is something Frontiers have explicitly claimed they’re trying to address, opening up such paths to junior researchers, but it’s not a stated aim on their website or in the emails.

The article that formed the basis of my invitation has been cited once – by me – so if you want to tell me that this is an article that can form the keystone of a research topic, you need to do more than re-state the title. Tell me why it’s interesting and why it might fit. Otherwise, I don’t get the feeling you’ve even read the abstract, and I start to get the impression that these special issue Research Topics are perhaps not so special.

So in other words, if you want to appeal to us juniors, you have to overcome both our insecurity about our early-career status and our doubts about your sincerity. If you want to reach out to us, make it feel like you’re interested in *us* and have some idea what our research area actually *is*. That’s if that’s what you really want to do.