9.1 Classic Books

In this section I look at eight philosophy books that are available on gutenberg.org. The books are:

Democracy and Education, by John Dewey
The Economic Consequences of the Peace, by J. M. Keynes²⁰
On Liberty, by John Stuart Mill
Principia Ethica, by G. E. Moore
Our Knowledge of the External World, by Bertrand Russell
The Analysis of Mind, by Bertrand Russell
The Problems of Philosophy, by Bertrand Russell
The Methods of Ethics, by Henry Sidgwick

I applied the model to these eight books, one chapter at a time. That is, for each chapter of each book, I asked the model what probability it assigned to that chapter belonging to each of the ninety topics. The outputs looked like this. (I’m including only the ten topics with the highest probability.)

Table 9.1: *The Problems of Philosophy*, chapter 5.
Subject	Probability
Ordinary language	0.1929
Denoting	0.1804
Knowledge	0.0925
Perception	0.0557
Idealism	0.0535
Physicalism	0.0486
Sense and reference	0.0360
Universals and particulars	0.0337
Propositions and implications	0.0259
Verification	0.0228

There are three things to note about this table.

One is that the probabilities are very widely spread around. This is what normally happens when doing these out-of-sample applications. The model is much more confident about the data it was trained on than it is about other data. And even in the training data, the average maximal probability was around 0.4. Here it is more usually 0.2 or lower.

The second is that ordinary language plays an outsized role in these models. The fact that it really isn’t like the other topics, that it is a style as much as a subject matter, keeps complicating the analysis.

And the third is that the topics here are old. Chapter 5 of The Problems of Philosophy reminds the model a little of “On Denoting”, which makes sense, and a little of Frege, which also makes sense, but the other subjects are very old. In fact, what’s surprising about this chapter is that it reminds the model of two relatively modern topics, not that it has eight or more old topics mixed in.
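Here is a minimal sketch of how a table like the one above can be read off a chapter's topic distribution. It assumes the model's output for a chapter is available as a mapping from topic name to probability; the function name `top_topics` is hypothetical, and the dictionary holds only five of the ninety values (copied from the table), so it does not sum to one.

```python
def top_topics(probs, k=10):
    """Return the k (topic, probability) pairs with the highest probability."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Five values copied from the table for The Problems of Philosophy, chapter 5.
# The real distribution has ninety entries; this is a partial illustration.
chapter5 = {
    "Knowledge": 0.0925,
    "Ordinary language": 0.1929,
    "Perception": 0.0557,
    "Denoting": 0.1804,
    "Idealism": 0.0535,
}

top3 = top_topics(chapter5, k=3)
top_topic, top_probability = top3[0]

for topic, probability in top3:
    print(f"{topic}\t{probability:.4f}")
```

The first entry of the ranked list is the chapter's maximal probability, which is the number being compared with the training-data figure of around 0.4.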

Let’s look at the top topics for each of the books. Since I am averaging the chapter probabilities, these numbers will be even lower than for individual chapters.
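To see why averaging lowers the numbers, here is a small sketch with made-up values. The two three-topic distributions are invented for illustration and are not model output, and the function name `average_distribution` is hypothetical. The point is only that the peak of an averaged distribution can never exceed the average of the individual peaks, and is strictly lower whenever the chapters peak on different topics.

```python
def average_distribution(chapters):
    """Average a list of topic -> probability mappings, topic by topic."""
    topics = set().union(*chapters)
    n = len(chapters)
    return {t: sum(ch.get(t, 0.0) for ch in chapters) / n for t in topics}


# Invented distributions for two chapters of one imaginary book.
chapters = [
    {"Psychology": 0.6, "Idealism": 0.3, "Time": 0.1},
    {"Psychology": 0.2, "Idealism": 0.3, "Time": 0.5},
]

book = average_distribution(chapters)

# Highest probability after averaging the chapters together.
peak_of_average = max(book.values())

# Average of each chapter's own highest probability.
average_of_peaks = sum(max(ch.values()) for ch in chapters) / len(chapters)

print(round(peak_of_average, 4), round(average_of_peaks, 4))
```

In this toy case the chapters peak at 0.6 and 0.5, but the averaged book peaks at only 0.4, because the two chapters favour different topics.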

Table 9.3: Democracy and Education
Subject	Probability
Life and Value	0.1188903
Psychology	0.0928031
Other History	0.0651241
Marx	0.0579922
Idealism	0.0392438
Methodology of Science	0.0361702
Egalitarianism	0.0275609
Ordinary Language	0.0245771
Liberal Democracy	0.0245334
Feminism	0.0229338

It’s a bit surprising that the model doesn’t identify this with pragmatism, and even more surprising that feminism turns up here. But otherwise this broadly makes sense.

Table 9.5: The Economic Consequences of the Peace
Subject	Probability
War	0.1496192
Other History	0.1138113
Marx	0.0871626
Egalitarianism	0.0689745
History and Culture	0.0520126
Psychology	0.0445217
Liberal Democracy	0.0354317
Life and Value	0.0338558
Medical Ethics and Freud	0.0298874
Crime and Punishment	0.0251323

This, on the other hand, doesn’t look quite right. It’s about World War I, so I guess it looks like war. But it isn’t really a history book. And it’s certainly not a Marx book. This does look like it pushed the model way past its comfort level.

Table 9.7: On Liberty
Subject	Probability
Life and Value	0.0902568
Liberal Democracy	0.0898736
Crime and Punishment	0.0693310
Other History	0.0535161
Ordinary Language	0.0503155
Psychology	0.0491301
Social Contract Theory	0.0454340
Freedom and Free Will	0.0422134
Population Ethics	0.0359232
Marx	0.0335514

Putting this in with other history does make some sense because a few of the papers in that topic are about Mill. But it’s striking to me that the model thinks of this book as going with social work of its time, even more than it sees it as going with topics on liberalism, or freedom. This is bringing up a limit of this approach that we’ve seen a few times before. Literary styles change over time, and the model doesn’t do well with that kind of change.

Table 9.9: Principia Ethica
Subject	Probability
Ordinary Language	0.1784152
Value	0.1450726
Idealism	0.0892730
Psychology	0.0605699
Moral Conscience	0.0402476
Life and Value	0.0294026
Propositions and Implications	0.0292766
Emotions	0.0272526
Ontological Argument	0.0261256
Promises and Imperatives	0.0258684

One way to look at this data is that it’s bringing out the extent to which the ordinary language movement wasn’t a repudiation of the philosophy that had gone before it, but a return to the way of doing philosophy exemplified by Moore and Russell. The model does not typically think papers from 1903 are ordinary language papers, but it does think that Principia Ethica is ordinary language.

Table 9.11: Our Knowledge of the External World as a Field for Scientific Method in Philosophy
Subject	Probability
Ordinary Language	0.0936422
Temporal Paradoxes	0.0902183
Idealism	0.0877616
Mathematics	0.0645098
Psychology	0.0617430
Other History	0.0424668
Classical Space and Time	0.0368795
Deduction	0.0290318
Life and Value	0.0274414
Propositions and Implications	0.0237543

The model really doesn’t identify Our Knowledge of the External World with any of the contemporary topics. The closest is mathematics, but this book seems surprisingly dated.

Table 9.13: The Analysis of Mind
Subject	Probability
Psychology	0.2235166
Ordinary Language	0.0851796
Depiction	0.0382893
Idealism	0.0372668
Emotions	0.0309470
Meaning and Use	0.0300003
Time	0.0283572
Causation	0.0280609
Physicalism	0.0237825
Other History	0.0226199

This is part of why I was happy to include psychology as a philosophy of mind topic. It is a bit different to how we now do philosophy of mind. But it includes a lot of what Russell does in The Analysis of Mind. And that’s a paradigm of a philosophy of mind book.

Table 9.15: The Problems of Philosophy
Subject	Probability
Ordinary Language	0.1520525
Idealism	0.1072594
Psychology	0.0566472
Knowledge	0.0383409
Physicalism	0.0351719
Denoting	0.0312222
Other History	0.0310846
Universals and Particulars	0.0300335
Perception	0.0283178
Justification	0.0272398

Again, we see that Moore and Russell were precursors as much as opponents of ordinary language philosophy. And between perception, denoting, knowledge and justification, we see flickers of contemporary philosophy entering into the picture.

Table 9.17: The Methods of Ethics
Subject	Probability
Psychology	0.0970969
Moral Conscience	0.0778821
Ordinary Language	0.0732375
Emotions	0.0668353
Life and Value	0.0639257
Idealism	0.0526327
Value	0.0447126
Virtues	0.0343360
Promises and Imperatives	0.0319940
Duties	0.0305731

This, on the other hand, is a bit disappointing. I would have thought the model would do a better job of identifying The Methods of Ethics as, well, a work of ethics. And we do see a few topics from ethics here, but also a lot of others.

Let’s turn to chapters. I’m not going to go through every chapter and display the topics for it. But it is interesting to look at the chapters the model is most confident about.

Table 9.19: Ten chapters with the highest topic probabilities.
Book	Chapter	Subject	Probability
The Analysis of Mind	14	Psychology	0.3837
The Analysis of Mind	6	Psychology	0.3136
The Analysis of Mind	8	Psychology	0.3088
The Analysis of Mind	15	Psychology	0.2865
The Analysis of Mind	4	Psychology	0.2815
Our Knowledge of the External World	7	Mathematics	0.2717
The Problems of Philosophy	14	Idealism	0.2710
The Methods of Ethics	15	Psychology	0.2545
The Problems of Philosophy	1	Ordinary language	0.2539
The Analysis of Mind	2	Psychology	0.2472

It really is confident about Russellian works, relatively speaking. Let’s see what happens if we leave off psychology.

Table 9.20: Ten chapters with the highest topic probabilities (excluding psychology).
Book	Chapter	Subject	Probability
Our Knowledge of the External World	7	Mathematics	0.2717
The Problems of Philosophy	14	Idealism	0.2710
The Problems of Philosophy	1	Ordinary language	0.2539
The Methods of Ethics	31	Moral conscience	0.2420
Our Knowledge of the External World	5	Temporal paradoxes	0.2416
The Methods of Ethics	4	Emotions	0.2395
Our Knowledge of the External World	6	Temporal paradoxes	0.2371
Principia Ethica	1	Ordinary language	0.2370
Principia Ethica	3	Value	0.2248
The Methods of Ethics	11	Emotions	0.2148

And pushing further forward, let’s see what happens if we leave off all of topics 1–30.

Table 9.21: Ten chapters with the highest topic probabilities (excluding first thirty topics).
Book	Chapter	Subject	Probability
Our Knowledge of the External World	7	Mathematics	0.2717
The Economic Consequences of the Peace	7	War	0.2056
The Economic Consequences of the Peace	5	War	0.1883
The Problems of Philosophy	5	Denoting	0.1804
The Economic Consequences of the Peace	4	War	0.1790
The Methods of Ethics	20	Egalitarianism	0.1680
The Analysis of Mind	9	Time	0.1615
On Liberty	1	Liberal democracy	0.1614
The Problems of Philosophy	12	Denoting	0.1577
Democracy and Education	7	Liberal democracy	0.1570

This doesn’t look like the model is doing too bad a job. Russell does talk about philosophy of mathematics, Keynes about war, and Mill about liberal democracy. And remember that egalitarianism is largely about Parfit, and hence about consequentialism, so it isn’t surprising Sidgwick ends up there. What if we restrict things to topics 61–90?

Table 9.22: Ten chapters with the highest topic probabilities (excluding the first sixty topics)
Book	Chapter	Subject	Probability
The Methods of Ethics	20	Egalitarianism	0.1680
The Problems of Philosophy	12	Justification	0.1221
The Analysis of Mind	12	Justification	0.1133
The Economic Consequences of the Peace	5	Egalitarianism	0.1109
The Problems of Philosophy	13	Knowledge	0.0962
The Problems of Philosophy	10	Knowledge	0.0938
The Problems of Philosophy	5	Knowledge	0.0925
The Methods of Ethics	30	Population ethics	0.0881
The Economic Consequences of the Peace	2	Egalitarianism	0.0877
The Methods of Ethics	19	Feminism	0.0822

And while the numbers are low, Russell on epistemology reminds the model much more of contemporary philosophy than most of the other books. This makes sense, I think.
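The rankings in the last four tables can be sketched as follows. This is a guess at the procedure, not a record of it: I am assuming that each chapter contributes only its single best remaining topic once the excluded topics are set aside, and that the probabilities are not renormalised (which fits the fact that the same values recur from table to table). The function name `most_confident` is hypothetical, and the chapter data are partial, using only values that appear in the tables above.

```python
def most_confident(chapters, excluded=frozenset(), n=10):
    """Rank chapters by the probability of their best non-excluded topic.

    chapters: list of (book, chapter_number, {topic: probability}).
    Returns up to n rows of (book, chapter_number, topic, probability).
    """
    rows = []
    for book, number, probs in chapters:
        allowed = {t: p for t, p in probs.items() if t not in excluded}
        if not allowed:
            continue  # nothing left to rank for this chapter
        topic = max(allowed, key=allowed.get)
        rows.append((book, number, topic, allowed[topic]))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows[:n]


# Partial distributions: only values reported in the tables above.
chapters = [
    ("The Problems of Philosophy", 5,
     {"Ordinary language": 0.1929, "Denoting": 0.1804, "Knowledge": 0.0925}),
    ("Our Knowledge of the External World", 7, {"Mathematics": 0.2717}),
    ("The Analysis of Mind", 14, {"Psychology": 0.3837}),
]

everything = most_confident(chapters)
filtered = most_confident(chapters, excluded={"Psychology", "Ordinary language"})

for row in filtered:
    print(*row, sep="\t")
```

With the full ninety-topic distributions every chapter would still have some topic left after filtering; The Analysis of Mind, chapter 14 drops out here only because the sketch knows a single value for it.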

To end this little inquiry, I want to look a bit at how much these books resemble the articles of their time. The next six graphs compare the eight books (collectively) to the journal articles published between 1876 and 1925. I’ll compare the average topic probability, and the maximum topic probability, for each of the ninety topics in the journals and in the books.²¹ I’ll do this thirty topics at a time, because otherwise we get a bunch of dots clustered together in the bottom left corner of the graph. So on this graph the x axis measures the average probability of a topic in journal articles up to 1925, and the y axis measures the average probability of a topic in the eight books.

A scatterplot comparing the distribution of topics 1–30 in journal articles up to 1925 and in the books being discussed. Most topics are present to roughly the same amount, but idealism is much less prevalent in the books, and ordinary language philosophy is more prevalent.

Figure 9.1: Average probability for the first thirty topics in journals and books.

The books I’ve chosen are more like ordinary language, and less like idealism, than the journals. And they have a little more ethics in them. We get a similar story if we look at the maximum values instead of the average values.

A scatterplot comparing the maximum probability distribution of topics 1–30 in journal articles up to 1925 and in the books being discussed. Maximum probabilities are generally higher for journal articles than for books. Ordinary language and Value are much more present in books.

Figure 9.2: Maximum probability for the first thirty topics in journals and books.

There are a lot more journal articles, and some of them are very short, so the maximum probabilities go much higher for the journals than the chapters. But otherwise there isn’t much of a pattern here. Let’s move on to the middle thirty topics.

A scatterplot comparing the average probability distribution of topics 31-60 in journal articles up to 1925 and in the books being discussed. War and liberal democracy are much more prevalent in books, and perception and Kant are more prevalent in journals. Otherwise, most topics are present to roughly the same degree.

Figure 9.3: Average probability for the second thirty topics in journals and books.

This perhaps tells us more about the books I chose than the difference between philosophy in books and philosophy in journals. I’m sure there was discussion of Kant and perception in books at the time; just not so much in these eight books. Maybe war and liberal democracy are under-represented in the journals relative to their importance to philosophy at the time; I would need more information. No one is talking about radical translation before 1925. The same patterns hold, more or less, if we look at maximum values.

A scatterplot comparing the maximum probability distribution of topics 31-60 in journal articles up to 1925 and in the books being discussed. Maximum probabilities are generally higher for journals than books. War is much more prevalent in books.

Figure 9.4: Maximum probability for the second thirty topics in journals and books.

The maximum probabilities are, as always, higher for the journals than for the book chapters. And there are some articles that are really about color, or about philosophy of mathematics. There is, as I’ve already mentioned, one book chapter that’s also about philosophy of mathematics. On to the last thirty.

A scatterplot comparing the average probability distribution of topics 61-90 in journal articles up to 1925 and in the books being discussed. Average probabilities are generally lower for journals. Egalitarianism and population ethics are much higher for books; knowledge is also higher in books.

Figure 9.5: Average probability for the last thirty topics in journals and books.

Note that the scales here are very different. The numbers are all low, but they are much lower for the journals than the books. So even though knowledge is way over to the right of the graph, the numbers for it are actually bigger in the books than the journals. And it’s a little bigger in the journals than I had quite realised; there are no articles primarily in epistemology, but it isn’t at zero like some other topics. Let’s end this section by looking at the maximum probabilities in each topic.

A scatterplot comparing the maximum probability distribution of topics 61-90 in journal articles up to 1925 and in the books being discussed. For every topic, the maximum probability is higher for journals than for books. Abortion and self-defense is the most extreme data point, and is much higher in journal articles than in books.

Figure 9.6: Maximum probability for the last thirty topics in journals and books.

For all thirty of these, the highest probability in the journals is higher than the highest probability in any book chapter. I was wondering whether the epistemology chapters would be the counterexamples to this claim, but they didn’t come that close. Where we did get close to a counterexample was that Sidgwick almost sounds more like a modern Parfitian than any journal author. But not quite—when there are three thousand journal articles, there will usually be one counterexample to any generalisation in there somewhere.

²⁰ Is this a philosophy book? Well, I think it’s an important work of applied political philosophy.
²¹ Small note on methodology. When I talk about the average topic probability for the books, this is something that gets calculated in two steps. First, I calculate the average for each book, across its chapters. Then I average the books. I’m doing this rather than averaging the chapters because that approach would mean that the books with more chapters would swamp the books with fewer.
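The two-step average described in this note can be sketched like this, alongside the pooled alternative it is meant to avoid. The book titles and probabilities are invented for illustration, and the function names are hypothetical; the sketch only assumes each book is a list of per-chapter topic distributions.

```python
def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def book_level_average(books, topic):
    """Average within each book first, then average the book averages."""
    per_book = [
        mean(chapter.get(topic, 0.0) for chapter in chapters)
        for chapters in books.values()
    ]
    return mean(per_book)


def chapter_level_average(books, topic):
    """Pool every chapter together, so longer books carry more weight."""
    return mean(
        chapter.get(topic, 0.0)
        for chapters in books.values()
        for chapter in chapters
    )


# Invented data: one long book that is all about war, one short book that is not.
books = {
    "Long book": [{"War": 0.2}, {"War": 0.2}, {"War": 0.2}, {"War": 0.2}],
    "Short book": [{"War": 0.0}],
}

two_step = book_level_average(books, "War")
pooled = chapter_level_average(books, "War")

print(round(two_step, 4), round(pooled, 4))
```

Here the two-step average is 0.1, since each book counts equally, while pooling the five chapters gives 0.16, because the long book supplies four of them.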