Monday, November 27, 2017

Improving Evidence Presentation: An Example and Some Tips

This is an unusual blog post because it is about how to do research, not about what we have learned from research that has been done. Well, there is also something on what we have learned. I think it is very important that researchers show the data in their manuscripts by making graphs that show the reader the phenomenon and their explanation of it. This is not easy to do currently because management journals favor models over graphs, and models don’t show the data as clearly as graphs do.

As the editor of Administrative Science Quarterly, I am encouraging authors, associate editors, and reviewers to use more graphs. I also do it as an author, and this blog post is about a paper in Advances in Strategic Management that I wrote with Seo Yeon Song. We analyzed the ebook business, and our starting point was that there is a big movement toward self-publishing and indie (independent) publishing there, with an increased market share relative to the Big 5 publishers. Here is the graph showing this change:

How did we explain the change? Big 5 publishers can pay for advertisements, unlike indies, so indies must have some other advantage. We thought it was their readers rewarding good indie books with tweets and reviews on the amazon.com website. Here is a comparison of how Amazon reviews affect sales of Big 5 and indie ebooks:



See the difference? Indies don’t have advertisements to support sales, so each new review increases their sales more. This is something that can be seen from the data without any modeling. Of course we also modeled the data. I won’t show the model here, but instead show a graph comparing the effect of Amazon reviews (the count), Amazon review score, tweets (the count), and sentiment (how positive they were). It is easy to see the results, right? Amazon reviews have a much stronger effect than Twitter posts.


Finally, here is a graph that shows the review effect in a model that removes all other effects we could control for, such as the tweets. This is called a residual graph, and it can be used to check how much of the relation between the reviews and sales is explained by other factors. The answer is… almost nothing: the residual graph is visually nearly the same as the earlier graph of reviews and sales. It also shows how much variation is left to be explained by factors that are not yet in the model, which is clearly a lot.
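For readers who want to try a residual graph on their own data, here is a minimal sketch of one common way to build it, the added-variable (partial regression) construction. It is not necessarily how the graphs in the paper were made. The data are simulated, and the names residualize, plot_residuals, tweets, reviews, and log_sales are made up for illustration. The idea is to remove the part of sales and the part of reviews that the controls explain, and then plot what is left of one against what is left of the other.

```python
import numpy as np


def residualize(target, controls):
    """Return the part of `target` that a linear fit on `controls` leaves unexplained."""
    design = np.column_stack([np.ones(len(target)), controls])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


# Simulated data (an assumption for illustration, not the ebook data).
rng = np.random.default_rng(0)
n = 500
tweets = rng.normal(size=n)                      # control variable
reviews = 0.5 * tweets + rng.normal(size=n)      # variable of interest
log_sales = 0.8 * reviews + 0.1 * tweets + rng.normal(size=n)

controls = tweets.reshape(-1, 1)
sales_resid = residualize(log_sales, controls)   # sales with the tweet effect removed
reviews_resid = residualize(reviews, controls)   # reviews with the tweet effect removed

# Slope of the line through the residual scatter (both residuals have mean zero).
partial_slope = (reviews_resid @ sales_resid) / (reviews_resid @ reviews_resid)

# Coefficient on reviews in the full regression, for comparison.
full_design = np.column_stack([np.ones(n), reviews, tweets])
full_coef, *_ = np.linalg.lstsq(full_design, log_sales, rcond=None)


def plot_residuals(x_resid, y_resid, slope, path="residual_graph.png"):
    """Scatter the residuals and draw the fitted line (requires matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(x_resid, y_resid, s=8, alpha=0.5)
    xs = np.linspace(x_resid.min(), x_resid.max(), 50)
    ax.plot(xs, slope * xs, color="black")
    ax.set_xlabel("Reviews (residual)")
    ax.set_ylabel("Log sales (residual)")
    fig.savefig(path)
```

Calling plot_residuals(reviews_resid, sales_resid, partial_slope) writes the graph to a file. The slope of the line in the residual graph equals the coefficient on reviews in the full regression, so the picture and the model tell the same story, and the scatter around the line shows how much remains unexplained.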



Well, this was a short story about ebook sales, but the more important point is that researchers can show their findings well just by graphing the data. If you want to see the program that made these graphs and some sample data to use it on, click here and here.