Scientific Study Claims Ability to Predict Best-Selling Novels

January 17, 2014

Scientists at Stony Brook University conducted a study that used a technique called statistical stylometry to predict a book’s commercial success with 84 percent accuracy. The computer scientists running the study considered a range of factors in determining a book’s potential success, including writing style, novelty, engaging storyline, and “interestingness.” They also accounted for factors that are hard to measure, such as luck. The researchers downloaded books from the Project Gutenberg archive, applied their algorithm to the text, and compared its predictions with historical information about each book’s actual success. They also sought out “less successful” books from low-ranking Amazon lists, judged by both sales and negative reviews in the media. The works covered a wide range of literature, from science fiction to poetry.

Several trends became evident during the course of the study. Successful books made heavy use of conjunctions like “and” and “but,” contained large numbers of nouns and adjectives, and favored verbs describing thought processes, such as “recognized” or “remembered.” Conversely, less successful books tended to rely on verbs that explicitly describe actions and emotions, such as “wanted” or “promised,” and used more verbs and adverbs overall. Assistant Professor Yejin Choi, a co-author of the paper, said of the study, “To the best of our knowledge, our work is the first that provides quantitative insights into the connection between the writing style and the success of literary works… Our work examines a considerably larger collection—800 books over multiple genres, providing insights into lexical, syntactic, and discourse patterns that characterize the writing styles commonly shared among the successful literature.”
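To give a rough sense of what measuring such word-level trends might look like, here is a minimal Python sketch. It is not the researchers' algorithm: the word lists contain only the example words quoted above, and the function name `style_rates` and the sample sentence are made up for illustration. It simply reports what share of a text's words falls into each group.

```python
import re
from collections import Counter

# Illustrative word lists built only from the examples quoted in the article.
# The study's real feature set is not given here, so these are assumptions.
WORD_GROUPS = {
    "conjunctions": {"and", "but"},
    "thought_verbs": {"recognized", "remembered"},
    "explicit_verbs": {"wanted", "promised"},
}


def style_rates(text):
    """Return each word group's share of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        # Avoid dividing by zero on empty input.
        return {name: 0.0 for name in WORD_GROUPS}
    counts = Counter(words)
    return {
        name: sum(counts[w] for w in group) / total
        for name, group in WORD_GROUPS.items()
    }


# Hypothetical sample sentence, not taken from any book in the study.
sample = "She remembered the house and the garden, but he wanted to leave."
rates = style_rates(sample)
for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

A real stylometric model would combine many more features, such as part-of-speech counts, and would be trained against known outcomes. This sketch only shows the counting step.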

 

Source: The Telegraph
