8.4 Correlations
The model assigns a probability to each topic-article pair. So for any two topics, we can ask how tightly correlated their probabilities are across the articles: does one tend to go up when the other goes up? There are 8010 ordered pairs of distinct topics (4005 once each pair is counted only once), so there is too much data here to usefully examine, or even visualise. But I wanted to go over the extremes. First, here are the thirty-two strongest correlations. (Why thirty-two? Because these seemed particularly interesting.)
Subject One | Subject Two | Correlation |
---|---|---|
Knowledge | Justification | 0.2214 |
Chance | Theory testing | 0.2102 |
Idealism | Self-consciousness | 0.1956 |
Faith and theism | Ontological argument | 0.1852 |
Idealism | Life and value | 0.1713 |
Propositions and implications | Deduction | 0.1708 |
Moral conscience | Virtues | 0.1615 |
Moral conscience | Promises and imperatives | 0.1595 |
Laws | Causation | 0.1550 |
Physicalism | Perception | 0.1533 |
Methodology of science | Theories and realism | 0.1525 |
Mechanisms | Cognitive science | 0.1495 |
Temporal paradoxes | Classical space and time | 0.1490 |
Concepts | Wide content | 0.1468 |
Laws | Explanation | 0.1454 |
Promises and imperatives | Intention | 0.1445 |
Color/colour | Perception | 0.1422 |
Modality | Composition and constitution | 0.1408 |
Reasons | Norms | 0.1397 |
Life and value | Marx | 0.1369 |
Sense and reference | Belief ascriptions | 0.1347 |
Meaning and use | Ordinary language | 0.1345 |
Promises and imperatives | Duties | 0.1343 |
Other history | History and culture | 0.1336 |
Denoting | Sense and reference | 0.1319 |
Temporal paradoxes | Time | 0.1292 |
Dewey and pragmatism | Moral conscience | 0.1289 |
Dewey and pragmatism | Value | 0.1282 |
Definitions | Meaning and use | 0.1268 |
Life and value | Faith and theism | 0.1261 |
Marx | Liberal democracy | 0.1243 |
History and culture | Marx | 0.1243 |
I think these mostly make sense. The two epistemology topics are very tightly connected. The two topics that are about formal methods in scientific reasoning are correlated. (Remember that chance included a lot of work on formal models of inference.) The philosophy of religion topics are correlated. Idealism is correlated with the other early topics. Topics about time are correlated. Denoting and sense and reference are correlated; Frege and Russell aren’t that far apart.
The bottom few here are particularly interesting. Moral conscience and value include some very analytic ethics; it’s interesting that they are so close to Dewey and pragmatism. Marx the topic plays well with life and value, i.e., idealist ethics, with liberal democracy, and with history and culture. This is a bit surprising since Marx himself didn’t play well with any of them. But life and value also plays well with faith and theism, the core philosophy of religion topic. That mildly surprised me, but perhaps it should not have given how important the Absolute is to idealists.
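A ranking like the one in the table above could be produced along the following lines. This is a minimal sketch, not the code behind the tables: it assumes the model output is available as one list of per-article probabilities per topic, it uses the ordinary Pearson correlation, and the names `pearson`, `pair_correlations`, and `strongest` as well as the toy numbers are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pair_correlations(topic_probs):
    """Map each unordered pair of topic names to its correlation.

    topic_probs: dict of topic name -> list of probabilities, one per
    article, with every list in the same article order.
    """
    return {
        (a, b): pearson(topic_probs[a], topic_probs[b])
        for a, b in combinations(topic_probs, 2)
    }


def strongest(correlations, n):
    """The n pairs with the largest correlation, highest first."""
    return sorted(correlations.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Toy data: three topics across four articles (invented numbers).
toy = {
    "knowledge": [0.60, 0.10, 0.05, 0.50],
    "justification": [0.30, 0.05, 0.05, 0.40],
    "idealism": [0.05, 0.70, 0.80, 0.05],
}
corrs = pair_correlations(toy)
for (a, b), r in strongest(corrs, 2):
    print(f"{a} | {b} | {r:.4f}")
```

Because `combinations` yields each unordered pair once, ninety topics give 4005 correlations rather than 8010; the correlation of A with B is the same number as the correlation of B with A.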
Let’s turn to the strongest negative correlations. These are a little less interesting.
Subject One | Subject Two | Correlation |
---|---|---|
Life and value | Arguments | -0.1298 |
Idealism | Arguments | -0.1167 |
Ordinary language | Sets and grue | -0.0962 |
Idealism | Norms | -0.0902 |
Life and value | Sets and grue | -0.0891 |
Life and value | Verification | -0.0888 |
Life and value | Propositions and implications | -0.0869 |
Life and value | Truth | -0.0841 |
Psychology | Arguments | -0.0789 |
Other history | Arguments | -0.0765 |
Mechanisms | Arguments | -0.0752 |
Idealism | Sets and grue | -0.0749 |
Life and value | Theories and realism | -0.0747 |
Ordinary language | Theories and realism | -0.0746 |
Methodology of science | Arguments | -0.0736 |
Life and value | Sense and reference | -0.0718 |
Methodology of science | Promises and imperatives | -0.0716 |
Ordinary language | Models | -0.0716 |
Life and value | Composition and constitution | -0.0714 |
Life and value | Deduction | -0.0712 |
Idealism | Justification | -0.0700 |
Life and value | Justification | -0.0693 |
Physicalism | Moral conscience | -0.0692 |
Life and value | Modality | -0.0685 |
Definitions | Arguments | -0.0684 |
The early topics and the late topics are negatively correlated. The idealists are negatively correlated with anyone who isn’t sympathetic to idealism. No one was offering arguments, at least not as such, in the early going. We’ll come back to this table below and see what we can find that’s more interesting.
What about the topics that come closest to being uncorrelated? Here are the twenty-five pairs whose correlations are nearest to zero.
Subject One | Subject Two | Correlation |
---|---|---|
Personal identity | Wide content | 0.0000 |
Ordinary language | Crime and punishment | 0.0000 |
Decision theory | Models | 0.0000 |
Explanation | Reasons | 0.0001 |
Origins and purposes | Races and DNA | 0.0001 |
Intention | Knowledge | 0.0001 |
Beauty | Meaning and use | 0.0001 |
Beauty | Functions | 0.0001 |
Psychology | Minds and machines | -0.0001 |
Origins and purposes | Abortion and self-defence | -0.0001 |
Origins and purposes | Cognitive science | 0.0002 |
Abortion and self-defence | Formal epistemology | 0.0002 |
Denoting | Arguments | 0.0002 |
Theory testing | Evolutionary biology | 0.0002 |
Hume | Personal identity | -0.0002 |
Universals and particulars | Functions | -0.0002 |
Explanation | Quantum physics | -0.0002 |
Deduction | Thermodynamics | -0.0002 |
Value | Liberal democracy | 0.0003 |
Arguments | Sense and reference | -0.0003 |
History and culture | Mechanisms | -0.0003 |
Chance | Mathematics | -0.0003 |
Mechanisms | Theory testing | -0.0003 |
Physicalism | Wide content | -0.0003 |
Deduction | Modality | 0.0004 |
I don’t know what I expected here, but I don’t think it was this. Some of these felt like they should be positively correlated. I guess just on timing grounds I expected personal identity to correlate with wide content. But I would have guessed beauty to be negatively correlated with meaning and use. Maybe there isn’t anything to be found here; this mostly looks like noise to me.
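Picking out the near-zero pairs is a matter of sorting by absolute value rather than by the signed correlation. The sketch below assumes the correlations are already held in a dictionary keyed by topic pair; the function name `closest_to_zero` and the two tiny values are invented for illustration. It also shows why a zero in a table like this is presumably a rounded value and not an exact one.

```python
def closest_to_zero(correlations, n):
    """The n pairs whose correlation has the smallest absolute value."""
    return sorted(correlations.items(), key=lambda kv: abs(kv[1]))[:n]


# The two tiny values are made up; the two large ones are copied from
# the tables in this section so the contrast is visible.
toy = {
    ("personal identity", "wide content"): 0.00003,
    ("hume", "personal identity"): -0.00021,
    ("knowledge", "justification"): 0.2214,
    ("life and value", "arguments"): -0.1298,
}
for (a, b), r in closest_to_zero(toy, 2):
    # Four decimal places matches the other tables, so 0.00003 shows as 0.0000.
    print(f"{a} | {b} | {r:.4f}")
```

Sorting on the signed value instead would put these pairs in the middle of the list, between the positive and negative extremes.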
The negative correlation table mostly featured topics from the first half of the topic list. (Indeed, every pair featured at least one such topic.) So let’s do the high and low correlation tables again, but restricted to topics 46–90.
Subject One | Subject Two | Correlation |
---|---|---|
Knowledge | Justification | 0.2214 |
Laws | Causation | 0.1550 |
Concepts | Wide content | 0.1468 |
Laws | Explanation | 0.1454 |
Modality | Composition and constitution | 0.1408 |
Reasons | Norms | 0.1397 |
Sense and reference | Belief ascriptions | 0.1347 |
Speech acts | Sense and reference | 0.1234 |
Theory testing | Theories and realism | 0.1224 |
Liberal democracy | Egalitarianism | 0.1110 |
Decision theory | Game theory | 0.1097 |
Truth | Vagueness | 0.1047 |
Quantum physics | Thermodynamics | 0.0991 |
Thermodynamics | Models | 0.0989 |
Causation | Models | 0.0987 |
Space and time | Quantum physics | 0.0983 |
Wide content | Cognitive science | 0.0960 |
Truth | Radical translation | 0.0960 |
Personal identity | Composition and constitution | 0.0946 |
Minds and machines | Wide content | 0.0899 |
Justification | Norms | 0.0898 |
Minds and machines | Cognitive science | 0.0892 |
Justification | Reasons | 0.0889 |
Liberal democracy | Duties | 0.0881 |
Theory testing | Models | 0.0877 |
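The restriction to topics 46–90 amounts to dropping every pair in which either topic falls outside that range, and only then ranking what is left. Here is a small sketch of that filter. The function names are hypothetical, the correlations are copied from the tables in this section, and the topic numbers are placeholders that only record whether a topic is early or late; they are not the model's actual numbering.

```python
def restrict_to_range(correlations, topic_number, low, high):
    """Keep only the pairs in which both topics are numbered low..high."""
    return {
        pair: r
        for pair, r in correlations.items()
        if all(low <= topic_number[name] <= high for name in pair)
    }


def ranked(correlations, descending=True):
    """Pairs sorted by correlation, highest first unless descending=False."""
    return sorted(correlations.items(), key=lambda kv: kv[1], reverse=descending)


# Correlations copied from the tables in this section.
corrs = {
    ("Knowledge", "Justification"): 0.2214,
    ("Idealism", "Self-consciousness"): 0.1956,
    ("Life and value", "Arguments"): -0.1298,
    ("Perception", "Truth"): -0.0595,
}
# Placeholder numbers: only "below 46" versus "46 or above" matters here.
numbers = {
    "Knowledge": 60, "Justification": 61, "Arguments": 62,
    "Perception": 63, "Truth": 64,
    "Idealism": 1, "Self-consciousness": 2, "Life and value": 3,
}
late = restrict_to_range(corrs, numbers, 46, 90)
print(ranked(late)[0])                    # strongest remaining pair
print(ranked(late, descending=False)[0])  # most negative remaining pair
```

A pair with one early and one late topic, such as life and value with arguments, is dropped even though one of its members is in range.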
The pairs in this restricted table all seem to make sense. That isn’t totally surprising, but it’s reassuring to see that the model seems to have not messed up here. Let’s look at the other end of the table.
Subject One | Subject Two | Correlation |
---|---|---|
Perception | Truth | -0.0595 |
Decision theory | Concepts | -0.0490 |
Perception | Decision theory | -0.0488 |
Liberal democracy | Truth | -0.0463 |
Perception | Liberal democracy | -0.0458 |
Causation | Truth | -0.0437 |
Perception | Duties | -0.0434 |
Laws | Perception | -0.0432 |
Truth | Reasons | -0.0431 |
Knowledge | Composition and constitution | -0.0430 |
Theories and realism | Knowledge | -0.0423 |
Perception | Reasons | -0.0422 |
Concepts | Formal epistemology | -0.0421 |
Truth | Egalitarianism | -0.0419 |
Duties | Truth | -0.0415 |
Arguments | Thermodynamics | -0.0404 |
Decision theory | Composition and constitution | -0.0402 |
Perception | Models | -0.0400 |
Liberal democracy | Composition and constitution | -0.0396 |
Theory testing | Composition and constitution | -0.0395 |
Truth | Evolutionary biology | -0.0394 |
Perception | Mathematics | -0.0393 |
Perception | Egalitarianism | -0.0390 |
Mathematics | Reasons | -0.0389 |
Liberal democracy | Concepts | -0.0387 |
This is a bit surprising. I thought I’d see pairs like liberal democracy and composition and constitution turning up a lot here. That is, I thought what we’d find would recreate the familiar ethics versus M&E divide. But pairs like that are not the bulk of the table. Instead, we get a lot of negatively correlated pairs that are on the same side of this (alleged) divide.
Some such pairs are not surprising. Concepts and formal epistemology are negatively correlated, but this makes perfect sense because virtually all the work in formal epistemology uses unstructured contents.
But one might worry that the lack of an ethics versus M&E divide here shows that the model has missed something important. I think a better conclusion is that the model is correctly detecting that M&E isn’t a useful kind of classification in contemporary philosophy. This feels like something that could do with further study, but I doubt text mining will be the way forward here. It would be interesting, for example, to see whether citation studies show that there is (or is not) a big ethics versus M&E divide.