As a follow-up to this previous post, which showed an image of the relationships between people mentioned in the news, I’ve been asked to provide more detail.
First, why bother at all? Exploring the implied relationships may tell us about the individuals in question, but it certainly provides more context for each of the other topics at hand. This context not only adds to our understanding of the topic, but can also be a valuable research tool for quickly determining which other topics may affect the one at hand.
What are the relationships shown? The image shows names occurring in the same news articles, which implies a relationship. This may be a formal relationship, e.g. the working relationship of Bush (8) and Condoleezza Rice (10); the numbers in parentheses are positions in the list below. Or the individuals may be related through a common topic, such as Michael Phelps (4) and Babe Ruth.
The top 20 names follow, ordered by centrality, each with its number of distinct implied relationships.
1. Barack Obama 1128
2. John McCain 902
3. Sarah Palin 405
4. Michael Phelps 237
5. Pervez Musharraf 95
6. Kwame Kilpatrick 103
7. Hillary Clinton 270
8. Bush 158
9. Joe Biden 218
10. Condoleezza Rice 107
11. Steve Jobs 160
12. John Edwards 101
13. Clark Rockefeller 69
14. Britney Spears 122
15. Brett Favre 65
16. Bernie Mac 60
17. Miley Cyrus 70
18. Bill Clinton 148
19. Anwar Ibrahim 34
20. Stephenie Meyer 27
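For anyone curious how counts like these could be produced, here is a minimal Python sketch. It is not the code behind the image. It assumes each article has already been reduced to the set of names it mentions (the `articles` sample and both function names are made up for illustration), and it only counts distinct co-occurring names, which corresponds to the number next to each name. It does not reproduce the centrality ordering, since the measure used for that is not described here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: each article reduced to the set of names it mentions.
articles = [
    {"Barack Obama", "John McCain", "Sarah Palin"},
    {"Barack Obama", "Joe Biden"},
    {"Michael Phelps", "Babe Ruth"},
    {"Barack Obama", "John McCain"},
]

def build_cooccurrence(articles):
    """Map each name to the set of names it shares at least one article with."""
    neighbors = defaultdict(set)
    for names in articles:
        # Every pair of names in the same article implies a relationship.
        for a, b in combinations(sorted(names), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors

def relationship_counts(neighbors):
    """Count distinct implied relationships per name, largest count first."""
    counts = {name: len(others) for name, others in neighbors.items()}
    # Ties are broken alphabetically so the output order is stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

neighbors = build_cooccurrence(articles)
ranking = relationship_counts(neighbors)
for name, count in ranking:
    print(f"{name} {count}")
```

Because the neighbors are stored as sets, a pair that appears together in several articles still counts as one relationship, which matches the idea of "different" relationships above.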
What’s the data set? A random sampling of news sites including NYTimes, Google, Yahoo!, CNN, Drudge, and the like.
Is this an accurate reflection of news? I am polling a number of the big news sites, so hopefully it’s not far off.
Any surprises? Miley Cyrus!