9.3 Buzzwords

One of the methods I used for building the model was to refine it repeatedly, to try to make it better track actual philosophical topics. At the time, I was worried that this would have bad consequences. It felt like tightening the strings: generally a good idea, but if things are made too tight, they snap. After a few refinements, I started to think this was a silly bit of reasoning by analogy. These aren’t actually strings, so they can’t actually snap, right? Well, after one hundred iterations of the refinement script, I got a topic whose distribution looked like this.

A series of scatterplots showing the weighted number of articles about norms in twelve journals from 1900 to 2000. The number rises dramatically starting sometime between 1940 and 1980 across all journals.

Figure 9.7: Number of articles in the norms topic in the bad LDA.

Those are remarkable graphs. It seems that this topic is getting to be a bigger and bigger deal in all the journals. I had not seen anything like this; after 1970 it’s almost impossible to get the “generalist” journals, the philosophy of science journals, and the ethics journals moving in the same direction. Maybe it’s a function of the journals publishing more articles over time. We could check this by looking at what proportion of each journal’s articles falls in this topic, rather than the absolute number of (expected) articles.22
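The difference between the two measures can be sketched in a few lines. This is only an illustration, assuming each article comes with a journal, a year, and a probability of being in the topic; the rows below are made-up toy values, and the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical toy rows: (journal, year, probability the article is in the topic).
articles = [
    ("Mind", 1950, 0.02),
    ("Mind", 1950, 0.10),
    ("Mind", 2000, 0.20),
    ("Mind", 2000, 0.30),
    ("Mind", 2000, 0.10),
    ("Mind", 2000, 0.20),
]

def topic_by_journal_year(rows):
    """Return {(journal, year): (weighted_count, proportion)}.

    weighted_count is the sum of topic probabilities (the expected number
    of articles in the topic); proportion divides that by the number of
    articles the journal published that year.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for journal, year, prob in rows:
        totals[(journal, year)] += prob
        counts[(journal, year)] += 1
    return {key: (totals[key], totals[key] / counts[key]) for key in totals}

summary = topic_by_journal_year(articles)
for key in sorted(summary):
    weighted, proportion = summary[key]
    print(key, round(weighted, 3), round(proportion, 3))
```

In the toy data the weighted count grows faster than the proportion, simply because the journal publishes more articles in the later year. That is the effect the proportional graph is meant to factor out.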

A series of scatterplots showing the proportion of articles about norms in twelve journals from 1900 to 2000. The proportion rises dramatically sometime between 1940 and 1980 for all journals, though the rise is less steep for Analysis, Philosophy of Science, and British Journal for the Philosophy of Science.

Figure 9.8: Proportion of articles in the norms topic in the bad LDA.

It’s slightly less steep, especially in Philosophy of Science. But the generalist journals—except Analysis—and British Journal for the Philosophy of Science are still rising rapidly. Let’s look at what articles are primarily in this topic.

Table 9.31: Top articles in the norms topic in the bad LDA.
Article Topic Probability
Gershon Weiler, 1962, “On Relevance,” Mind 71:487–93. 0.4531
Rahul Kumar, 2003, “Reasonable Reasons in Contractualist Moral Argument,” Ethics 114:6–37. 0.3851
Jay F. Rosenberg, 1997, “Brandom’s Making It Explicit: A First Encounter,” Philosophy and Phenomenological Research 57:179–87. 0.3749
Sven Rosenkranz, 2001, “Farewell to Objectivity: A Critique of Brandom,” The Philosophical Quarterly 51:232–7. 0.3619
David Enoch, 2005, “Why Idealize?,” Ethics 115:759–87. 0.3433
Robert Brandom, 1997, “Replies,” Philosophy and Phenomenological Research 57:189–204. 0.3433
Mark Couch, 2009, “Functional Explanation in Context,” Philosophy of Science 76:253–69. 0.3422
David J. Chalmers, 2011, “Verbal Disputes,” Philosophical Review 120:515–66. 0.3385
Joshua Gert, 2010, “Color Constancy and the Color/Value Analogy,” Ethics 121:58–87. 0.3368
Paul Weiss, 1934, “A Home for Logic,” Philosophy of Science 1:238–8. 0.3301

This is very confusing in three different ways.

  1. Although the topic seems concentrated in the twenty-first century, two of the top ten articles, including the top one, are from much earlier (1962 and 1934).
  2. If that top article is excluded, no article has a topic probability of over 0.4. This is true even though for some journal-year pairs, the average topic probability is over 0.1. It feels like every article must get a reasonable probability of being in this topic.
  3. Relatedly, there doesn’t seem to be any thematic unity to the articles here. What could we even call the “topic” which has these ten articles as paradigm cases? I’m calling it norms because it looks from the graphs like the counterpart of our topic norms, but this is hardly a perfect name.

For comparison, the original topic 90 had a top ten list that looked a little more sensible.

Table 9.32: Top articles in the norms topic in the good LDA.
Article Topic Probability
Gideon Rosen, 1997, “Who Makes the Rules Around Here?,” Philosophy and Phenomenological Research 57:163–71. 0.6952
Jay F. Rosenberg, 1997, “Brandom’s Making It Explicit: A First Encounter,” Philosophy and Phenomenological Research 57:179–87. 0.6807
Sven Rosenkranz, 2001, “Farewell to Objectivity: A Critique of Brandom,” The Philosophical Quarterly 51:232–7. 0.6364
Michael Pendlebury, 2010, “How to be a Normative Expressivist,” Philosophy and Phenomenological Research 80:182–207. 0.5925
Anandi Hattiangadi, 2003, “Making It Implicit: Brandom on Rule Following,” Philosophy and Phenomenological Research 66:419–31. 0.5834
Macalester Bell, 2011, “Globalist Attitudes and the Fittingness Objection,” The Philosophical Quarterly 61:449–72. 0.5758
Neil Sinclair, 2009, “Recent Work: Recent Work in Expressivism,” Analysis 69:136–47. 0.5690
Allan Gibbard, 1996, “Review Essays: Thought, Norms, and Discursive Practice: Commentary on Robert Brandom, Making It Explicit,” Philosophy and Phenomenological Research 56:699–717. 0.5306
Pekka Väyrynen, 2013, “Grounding and Normative Explanation,” Proceedings of the Aristotelian Society (Supplementary Volume) 87:155–78. 0.5234
R. Jay Wallace, 2007, “Reasons, Relations, and Commands: Reflections on Darwall,” Ethics 118:24–36. 0.5198

The papers are more recent, the probabilities are higher, and there is some more unity to the topic. It’s about normativity and objectivity, very broadly construed, with a bit of a focus on Brandom. And we can still see hints of that in the new top ten list, but it’s gotten blurrier. The Rosen article that’s at nearly 70 percent in the original topic has dropped to around 20 percent in the new topic, reflecting the lack of thematic unity.

Maybe we can get a bit of a better look at this new weird topic by looking at its keywords. Remember that an LDA model assigns each word a probability of being in a paradigm article in the topic. We can compare that to the frequency of that word in the whole data set to get a sense of what’s characteristic of the topic. (Again, it’s necessary to restrict attention to the five thousand most common words here to prevent too much focus on words that appear just a handful of times.) And here’s what we get. The second column here is the ratio of the probability of the word being in a (paradigm) article in this topic to the word’s overall frequency.

Table 9.33: Top words in the norms topic in the bad LDA.
Word Ratio
accounts 15.2
role 14.7
commitment 13.8
commitments 13.4
account 12.9
proposal 11.6
constitutive 11.2
practices 10.9
challenge 10.8
typically 10.6
claims 10.2
worry 10.1
approach 9.9
relevant 9.9
project 9.6
focus 9.4
features 9.4
issue 9.2
appeal 9.0
provide 9.0
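The ratio in the second column can be sketched as follows. This is a minimal illustration rather than the code used for the book: the function name, the toy probabilities, and the word quiddity are all made up, and `vocab_size` stands in for the restriction to the five thousand most common words.

```python
def keyword_ratios(topic_word_prob, corpus_word_freq, vocab_size=5000):
    """Rank words by (probability in the topic) / (frequency in the corpus).

    Only the `vocab_size` most frequent corpus words are considered, so that
    very rare words cannot dominate the ranking.
    """
    common = sorted(corpus_word_freq, key=corpus_word_freq.get, reverse=True)[:vocab_size]
    ratios = {
        word: topic_word_prob.get(word, 0.0) / corpus_word_freq[word]
        for word in common
        if corpus_word_freq[word] > 0
    }
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy values, not taken from the model.
topic_word_prob = {"accounts": 0.003, "question": 0.002, "kant": 0.0001, "quiddity": 0.00005}
corpus_word_freq = {"accounts": 0.0002, "question": 0.002, "kant": 0.001, "quiddity": 0.000001}

# With the vocabulary limited to the three most common words, the rare word
# "quiddity" is dropped even though its raw ratio would be the largest.
ranked = keyword_ratios(topic_word_prob, corpus_word_freq, vocab_size=3)
for word, ratio in ranked:
    print(word, round(ratio, 1))
```

A ratio near 1 means the word is no more common in the topic than anywhere else; a ratio well above 1 marks it as characteristic of the topic.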

This I think is the clue to what’s happened. The “topic” here is just the distinctive vocabulary of twenty-first-century philosophy. The topic appears in all journals because these words have become more and more prevalent in all journals.

I’ll come back to direct evidence for this hypothesis in a minute, but first I wanted to show what a table like this looks like for normal topics. Here’s the same table for topic 90 in the original model.

Table 9.35: Top words in the norms topic in the good LDA.
Word Ratio
norms 54.4
norm 52.7
normative 46.4
gibbard 45.2
normativity 39.6
practices 37.0
attitudes 36.3
commitments 35.3
commitment 33.3
constitutive 29.2
resentment 27.7
disagreement 26.9
attitude 23.2
blackburn 22.0
practice 18.5
evaluative 17.1
correctness 17.0
disagreements 16.0
challenge 15.7
accounts 14.9

It’s possible to see some of the twenty-first-century vocabulary, like challenge and accounts, turning up at the bottom. But this is mostly what things should look like for a topic about normativity and objectivity.

Just to get a sense of the appropriate scale for what’s being measured here, here’s what the top of the table for the Kant topic looks like.

Table 9.37: Top words in the Kant topic in the good LDA.
Word Ratio
kant 163.9
kantian 110.9
maxim 107.8
maxims 104.7
transcendental 103.7
intuition 86.7
berkeley 79.7
critique 63.2
categorical 61.0
sensibility 48.0

That makes sense; the word Kant is 163 times more likely to appear in a paradigm Kant article than in philosophy in general. A ratio of 163 is high, but a topic whose highest ratio is only about fifteen is a sign that something has gone wrong. The only topic that is really like this in the original model is ordinary language philosophy.

Table 9.39: Top words in the ordinary language philosophy topic in the good LDA.
Word Ratio
think 10.9
really 10.0
answer 9.4
something 9.3
perhaps 9.1
quite 9.0
sort 8.5
anything 8.1
ask 7.9
course 7.6
want 7.5
certainly 7.4
seem 7.3
saying 7.3
said 7.3
thing 7.2
question 7.2
get 7.1
things 7.1
much 7.0

Like the new topic 90, this topic in the original model really tracked a style and not a content. It’s really striking that the distinctive words in ordinary language philosophy are so much shorter than the distinctive words in contemporary philosophy.23 But it’s probably time to start actually proving that words like commitment, challenge, and approach are distinctively twenty-first-century words. So rather than just look at the outputs of complicated models, I’m going to end with some simple graphs of word frequency over time.

The data set I’m using for the graphs to follow is the word lists as provided by JSTOR. That excludes some stop words, and all one and two letter words, but not all the other words that I filtered out before building the LDA. I’ll start with some graphs of the keywords from this new topic 90.
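For readers who want to reproduce graphs of this kind, here is a rough sketch of the calculation behind them. It assumes each article has been reduced to a year and a table of word counts; that layout, the function name, and the toy counts are illustrative assumptions, not a description of JSTOR's actual file format.

```python
from collections import Counter, defaultdict

def yearly_frequency(articles, word):
    """Return {year: share of all counted words in that year that are `word`}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, counts in articles:
        hits[year] += counts[word]            # a Counter gives 0 for absent words
        totals[year] += sum(counts.values())
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year] > 0}

# Hypothetical toy articles: (year, word counts).
articles = [
    (1950, Counter({"question": 8, "account": 1, "thing": 11})),
    (1950, Counter({"question": 5, "thing": 15})),
    (2010, Counter({"account": 6, "challenge": 4, "question": 10})),
]

account_by_year = yearly_frequency(articles, "account")
for year, share in account_by_year.items():
    print(year, round(share, 3))
```

Dividing by the total word count for the year matters here for the same reason that proportions mattered earlier: journals publish far more words in later years, so raw counts would rise even for words whose share is flat.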

A scatterplot showing the frequency of words about theories (_account_, _accounts_, _claim_, and _claims_) in journal articles from 1880 to after 2000. The frequency of all four words begins to increase more rapidly beginning around 1960.

Figure 9.9: Words about theories.

A scatterplot showing the frequency of words about plans (_appeal_, _focus_, _project_, and _role_) in journal articles from 1880 to after 2000. The frequency of all four words begins to increase more rapidly after 1960.

Figure 9.10: Words about plans.

A scatterplot showing the frequency of words about views (_commitment_, _commitments_, _proposal_, and _proposals_) in journal articles from 1880 to after 2000. The frequency for all four is quite low early on, and begins to increase significantly around 1930-1940.

Figure 9.11: Words about views.

A scatterplot showing the frequency of words about objections (_challenge_, _challenges_, _worries_, and _worry_) in journal articles from 1880 to after 2000. The frequency for all four begins to rise by around 1950, though _challenge_ starts to rise earlier, around 1920.

Figure 9.12: Words about objections.

A scatterplot showing the frequency of words about what's common (_practices_, _relevant_, _typically_) in journal articles from 1880 to after 2000. The frequency of _relevant_ begins to rise dramatically around 1900; _practices_ and _typically_ begin to rise around 1960.

Figure 9.13: Words about what’s common.

Not all of these words are shooting upwards, but many of them are. I had originally drawn these graphs with trend lines, but they aren’t needed to see the pattern. At this rate we’ll soon see articles made up of just the words account, typically, relevant and challenge, plus perhaps their plurals.

So this is why I just used fifteen refinements of the model rather than one hundred. The language of early twenty-first-century philosophy is distinctive enough that if you push a text-based analysis too hard, it ends up just tracking form rather than content.

But didn’t we have this already back in ordinary language philosophy? We did, though fortunately the binary sort helped find a couple of natural topics within it. Still, it would be nice to confirm that these words really were being used more frequently in midcentury. So let’s look at the same graphs for the keywords from ordinary language philosophy.

A scatterplot showing the frequency of words about speech acts (_answer_, _ask_, _question_, and _said_) in journal articles from 1880 to after 2000. _Question_ is always the most frequent and _ask_ is always the least frequent. All four words show a similar pattern with frequency peaks around the 1960s.

Figure 9.14: Words about speech acts.

A scatterplot showing the frequency of words about epistemic modality (_certainly_, _course_, _perhaps_, and _really_) in journal articles from 1880 to after 2000. _Course_ is generally the most frequent, and _certainly_ is the least frequent. All four words show a peak in frequency around 1960.

Figure 9.15: Words about epistemic modality.

A scatterplot showing the frequency of words about quantity (_much_, _quite_, _seem_, and _sort_) in journal articles from 1880 to after 2000. _Much_ generally has the highest frequency, though its frequency declines steadily with time. _Sort_ has the lowest frequency in the 1880s through mid-1900s. The words somewhat converge in frequency around 1970.

Figure 9.16: Words about quantity.

A scatterplot showing the frequency of words about mental state attribution (_get_, _think_, and _want_) in journal articles from 1880 to after 2000. _Think_ appears much more frequently than the other words throughout the time span. All three words have moderate peaks in frequency around the 1970s.

Figure 9.17: Words about mental state attribution.

A scatterplot showing the frequency of words about quantification (_anything_, _something_, and _things_) in journal articles from 1880 to after 2000. _Anything_ is the least frequent throughout most of the time span. All three words show a peak in frequency around 1960, though the peak is more pronounced for _something_ and _things_.

Figure 9.18: Words about quantification.

The first three have roughly the pattern I was expecting, but the last two don’t. I think there is a sense in which some of the stylistic changes that the ordinary language philosophers brought in persisted. And there is also a sense in which they were the last holdouts against the move to a more scientific philosophy. As is so often the case, it helps to look at a distinctive era as both the end of what came before it and the start of what came after it.

There is another puzzle that I left open above that I want to return to. How can the maximal topic probability be so close to the average topic probability for some journal-year pairs? The obvious answer is that every article is in the topic to some nontrivial degree. Let’s see how true that is. So for a few journal-year pairs, I’m going to go through every article and list the probability that it is in this new topic. I’ll start with Philosophical Review in 2004.

Table 9.41: Philosophical Review, 2004, probability that each article is in the bad topic.
Article Topic Probability
Abraham Sesshu Roth, 2004, “Shared Agency and Contralateral Commitments,” Philosophical Review 113:359–410. 0.2733
Jonathan Cohen, 2004, “Color Properties and Color Ascriptions: A Relationalist Manifesto,” Philosophical Review 113:451–506. 0.2178
Sebastian Gardner, 2004, “Critical Notice of Richard Moran, Authority and Estrangement: An Essay on Self-Knowledge,” Philosophical Review 113:249–67. 0.1906
Sukjae Lee, 2004, “Leibniz on Divine Concurrence,” Philosophical Review 113:203–48. 0.1868
Eric Lormand, 2004, “The Explanatory Stopgap,” Philosophical Review 113:303–57. 0.1602
Richard Holton, 2004, “Rational Resolve,” Philosophical Review 113:507–35. 0.1402
Robert Pasnau, 2004, “Form, Substance, and Mechanism,” Philosophical Review 113:31–88. 0.1301
Lex Newman, 2004, “Rocking the Foundations of Cartesian Knowledge: Critical Notice of Janet Broughton, ‘Descartes’s Method of Doubt’,” Philosophical Review 113:101–25. 0.1292
Frederick Kroon, 2004, “Descriptivism, Pretense, and the Frege-Russell Problems,” Philosophical Review 113:1–30. 0.1089
Daniel Sutherland, 2004, “Kant’s Philosophy of Mathematics and the Greek Mathematical Tradition,” Philosophical Review 113:157–201. 0.0975
Janet Broughton, 2004, “The Inquiry in Hume’s Treatise,” Philosophical Review 113:537–56. 0.0739
David Barnett, 2004, “Some Stuffs are not Sums of Stuff,” Philosophical Review 113:89–100. 0.0403
Michael Huemer, 2004, “Elusive Freedom? A Reply to Helen Beebee,” Philosophical Review 113:411–6. 0.0001
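To put a number on how flat this distribution is, here is a small sketch that takes the thirteen probabilities from Table 9.41 and compares the largest with the average. The variable names are illustrative, and the 5 percent cut-off is just a convenient threshold.

```python
# Topic probabilities for Philosophical Review, 2004, copied from Table 9.41.
probs = [
    0.2733, 0.2178, 0.1906, 0.1868, 0.1602, 0.1402, 0.1301,
    0.1292, 0.1089, 0.0975, 0.0739, 0.0403, 0.0001,
]

mean_prob = sum(probs) / len(probs)      # average probability, about 0.135
max_prob = max(probs)                    # the Roth article, 0.2733
max_to_mean = max_prob / mean_prob       # about 2: the top article is barely ahead

# How many articles sit at 5 percent or more in the topic.
at_least_five_percent = sum(p >= 0.05 for p in probs)

print(round(mean_prob, 4), round(max_to_mean, 2), at_least_five_percent)
```

For a topic with real content, one would expect a few articles far above the average and most near zero. Here the top article is only about twice the average, and eleven of the thirteen are at 5 percent or higher.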

The Roth paper really is about commitments, so it isn’t surprising that it’s a little higher than the others. But look how much this spreads around other articles. The model thinks there is something that all but one of these articles have seriously in common. And I think there isn’t anything substantive (as opposed to stylistic) this could be. Let’s move on to Ethics in 2010.

Table 9.42: Ethics, 2010, probability that each article is in the bad topic.
Article Topic Probability
Joshua Gert, 2010, “Color Constancy and the Color/Value Analogy,” Ethics 121:58–87. 0.3368
Japa Pallikkathayil, 2010, “Deriving Morality from Politics: Rethinking the Formula of Humanity,” Ethics 121:116–47. 0.2149
John Tasioulas, 2010, “Taking Rights Out of Human Rights,” Ethics 120:647–78. 0.1951
Mark Van Roojen, 2010, “A Fork in the Road for Expressivism,” Ethics 120:357–81. 0.1901
Edward S. Hinchman, 2010, “Conspiracy, Commitment, and the Self,” Ethics 120:526–56. 0.1756
Allen Buchanan, 2010, “The Egalitarianism of Human Rights,” Ethics 120:679–710. 0.1623
Louis‐Philippe Hodgson, 2010, “Kant on the Right to Freedom: A Defense,” Ethics 120:791–819. 0.1596
Gunnar Björnsson and Stephen Finlay, 2010, “Metaethical Contextualism Defended,” Ethics 121:7–36. 0.1520
Leslie Green, 2010, “Two Worries About Respect for Persons,” Ethics 120:212–31. 0.1519
Mark Van Roojen, 2010, “Moral Rationalism and Rational Amoralism,” Ethics 120:495–525. 0.1372
Rebecca Stangl, 2010, “Asymmetrical Virtue Particularism,” Ethics 121:37–57. 0.1353
James Griffin, 2010, “Human Rights: Questions of Aim and Approach,” Ethics 120:741–60. 0.1309
Joseph Raz, 2010, “On Respect, Authority, and Neutrality: A Response,” Ethics 120:279–301. 0.1025
Mikhail Valdman, 2010, “Outsourcing Self‐Government,” Ethics 120:761–90. 0.0949
Gopal Sreenivasan, 2010, “Duties and Their Direction,” Ethics 120:465–94. 0.0908
Stephen Darwall, 2010, “Authority and Reasons: Exclusionary and Second‐Personal,” Ethics 120:257–78. 0.0849
Rainer Forst, 2010, “The Justification of Human Rights and the Basic Right to Justification: A Reflexive Approach,” Ethics 120:711–40. 0.0808
Steven Wall, 2010, “Neutralism for Perfectionists: The Case of Restricted State Neutrality,” Ethics 120:232–56. 0.0726
Erik J. Wielenberg, 2010, “On the Evolutionary Debunking of Morality,” Ethics 120:441–64. 0.0687
Elizabeth Brake, 2010, “Minimal Marriage: What Political Liberalism Implies for Marriage Law,” Ethics 120:302–37. 0.0494
Peter A. Graham, 2010, “In Defense of Objectivism About Moral Obligation,” Ethics 121:88–115. 0.0340
Judith Lichtenberg, 2010, “Negative Duties, Positive Duties, and the ‘New Harms’,” Ethics 120:557–78. 0.0267
John Brunero, 2010, “Self‐Governance, Means‐Ends Coherence, and Unalterable Ends,” Ethics 120:579–91. 0.0138
Sarah Fine, 2010, “Freedom of Association is not the Answer,” Ethics 120:338–56. 0.0000
Ben Saunders, 2010, “Democracy, Political Equality, and Majority Rule,” Ethics 121:148–77. 0.0000

The same pattern shows up; most of the articles (nineteen of the twenty-five) are in the topic at a 5 percent probability or higher. To the extent that there was anything substantive in the topic, it was in normative ethics, so maybe the topic being so visible in Ethics isn’t too surprising. But let’s see what happens when we do the same thing for British Journal for the Philosophy of Science in 2011.

Table 9.43: BJPS, 2011, probability that each article is in the bad topic.
Article Topic Probability

Here we do get more articles that are clearly excluded. The difference between the last six articles is unimportant. Once you get below 0.1 percent, the probabilities are functions of how confident the model is in its central classifications. But it’s still striking how many of these are above 1.1 percent. There are ninety topics, so if the model had no idea where an article belonged, it would give each topic a probability of 1/90, or about 1.1 percent. The vast majority of the articles here are above that.
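The comparison with a know-nothing baseline is easy to sketch. As an illustration, the block below uses the Ethics 2010 probabilities from Table 9.42; the helper name is hypothetical, and the only other input is the count of ninety topics.

```python
NUM_TOPICS = 90
uniform_baseline = 1 / NUM_TOPICS   # about 0.0111, i.e. 1.1 percent

# Topic probabilities for Ethics, 2010, copied from Table 9.42.
ethics_2010 = [
    0.3368, 0.2149, 0.1951, 0.1901, 0.1756, 0.1623, 0.1596, 0.1520, 0.1519,
    0.1372, 0.1353, 0.1309, 0.1025, 0.0949, 0.0908, 0.0849, 0.0808, 0.0726,
    0.0687, 0.0494, 0.0340, 0.0267, 0.0138, 0.0000, 0.0000,
]

def count_above(probs, threshold):
    """Number of articles whose topic probability exceeds the threshold."""
    return sum(p > threshold for p in probs)

above_baseline = count_above(ethics_2010, uniform_baseline)   # 23 of 25
above_five_percent = count_above(ethics_2010, 0.05)           # 19 of 25

print(round(uniform_baseline, 4), above_baseline, above_five_percent)
```

A topic with real content should leave most articles in an unrelated journal-year at or below the baseline. One that tracks style will push nearly everything above it.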

There is another study I could imagine running here, but it would take so long that I’m going to leave it to later work. Repeatedly refining the model broke because of the distinctive language of twenty-first-century philosophy. There are two possible explanations for that.

  1. There has been a linguistic revolution over the last generation, and philosophers now write in a very different style to how they wrote a generation ago.
  2. This is an artifact of model building: if the data were cut off at any earlier date and someone ran the same study I did, there would be results like this. That is, doing what I did will get weird results whenever there is linguistic drift, and there is always linguistic drift.

I actually could test these by running the study I did for this book but stopping in, say, 1993. But I don’t think spending several hundred hours of processing time on teasing apart these two explanations would be worthwhile.

That’s in part because this question will resolve itself over time naturally. Hopefully more studies like mine (or preferably better designed studies than mine) will be run on data that goes through 2020 and beyond. Those will tell us even more about where philosophy is going, and answer several questions that I’ve left open as pleasant side effects.


  22. Most of the graphs in chapter 2 are proportional, not absolute like the graph I just showed you.

  23. I wanted to include here some graphs about average word lengths over time, but they don’t really show very much. There is a very gentle increase, focussed on the philosophy of science journals, but on the whole the distinctively short keywords don’t really track anything about average word length. There was a notable drop in average word length in Proceedings of the Aristotelian Society in the 1950s, but it didn’t show up elsewhere. Most notably, there was no similar drop in average word length in Mind or Philosophical Quarterly. That suggested it was something about the journal, perhaps connected to the fact that papers are read to the society, rather than about British philosophical culture more broadly.