8.3 High-Confidence Topics
I mentioned in section 8.1 that several topics appeared frequently among the articles the model was very confident about. Let’s look at those topics on a graph.
This graph takes the fifty articles the model is most confident about, and shows how many of them fall in each topic.
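The counting behind a graph like this can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the graph: the record layout (a `topic` label and a `confidence` score per article) and the function name are assumptions, and the toy data is invented purely to show the shape of the calculation.

```python
from collections import Counter


def top_confident_topic_counts(articles, n=50):
    """Count topics among the n articles with the highest model confidence.

    `articles` is assumed to be a list of dicts with hypothetical keys
    "topic" and "confidence"; the real data may be organised differently.
    """
    # Rank articles from most to least confident.
    ranked = sorted(articles, key=lambda a: a["confidence"], reverse=True)
    # Tally the topic labels of the top n.
    return Counter(a["topic"] for a in ranked[:n])


# Invented toy data, for illustration only.
toy_articles = [
    {"topic": "space and time", "confidence": 0.99},
    {"topic": "evolutionary biology", "confidence": 0.98},
    {"topic": "space and time", "confidence": 0.97},
    {"topic": "political philosophy", "confidence": 0.60},
]

toy_counts = top_confident_topic_counts(toy_articles, n=3)
```

With the real data one would leave `n` at fifty and plot the resulting counts by topic.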
As I noted back when talking about space and time, that topic has a surprisingly large number of articles the model is very confident about. But as we saw above, a lot of the articles the model is confident about are very short. Let’s focus instead on the articles that are at least ten pages long, and again look at the distribution of the fifty articles the model is most confident about.
It isn’t surprising that evolutionary biology is so prominent here; the model gets really confident that evolutionary biology articles are properly placed. The same thing happens when we increase the length to twenty pages.
There are still ten evolutionary biology articles, though mostly not the same ten. And there are fewer categories here: just eighteen are represented in these fifty articles. The purples and reds indicate that the articles were published much later. These trends continue when we raise the floor to thirty pages, though now the topics start to shift.
There is more quantum physics, and more political philosophy. And when we move to forty pages, which means we’re just looking at the longest two percent of articles, these trends really accelerate.
By this stage the graph is measuring less which articles the model is really confident in, and more which kinds of philosophers write articles that long. The answer is, apparently, philosophers of (quantum) physics, political philosophers, and early modern historians.
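The sequence of page floors used in this section (ten, twenty, thirty and forty pages) can be sketched as a single loop. As before, this is a hedged illustration and not the code behind the graphs: the `pages`, `topic` and `confidence` keys, the function names, and the toy records are all assumptions made for the example. For each floor it reports how many distinct topics appear among the most confident articles, and what share of all articles is long enough to be considered at all.

```python
from collections import Counter


def topic_counts_with_floor(articles, min_pages, n=50):
    """Topic counts for the n most confident articles of at least min_pages."""
    # Keep only articles that meet the length floor.
    long_enough = [a for a in articles if a["pages"] >= min_pages]
    ranked = sorted(long_enough, key=lambda a: a["confidence"], reverse=True)
    return Counter(a["topic"] for a in ranked[:n])


def summarise_floors(articles, floors=(10, 20, 30, 40), n=50):
    """For each page floor, summarise the top-n topic distribution."""
    summary = {}
    for floor in floors:
        counts = topic_counts_with_floor(articles, floor, n)
        eligible = sum(1 for a in articles if a["pages"] >= floor)
        summary[floor] = {
            "distinct_topics": len(counts),
            # Fraction of all articles that clear this floor.
            "share_eligible": eligible / len(articles) if articles else 0.0,
            "counts": counts,
        }
    return summary


# Invented toy data, for illustration only.
toy_articles = [
    {"topic": "quantum physics", "pages": 45, "confidence": 0.90},
    {"topic": "political philosophy", "pages": 41, "confidence": 0.80},
    {"topic": "evolutionary biology", "pages": 22, "confidence": 0.99},
    {"topic": "space and time", "pages": 4, "confidence": 0.995},
]

toy_summary = summarise_floors(toy_articles, n=2)
```

The `share_eligible` figure is the useful diagnostic here: once it gets very small, the top-n list says more about who writes long articles than about where the model is confident.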