Category: Research

  • Anything But Random: 1,500,000 Blogs

    Before Google’s Social Graph API closed last month, I was able to gain access to a reasonable subset: 7,100,000 relationships across 1,500,000 sites (shown below). To be honest, it wasn’t what I was expecting. The attach rate $P(k) \sim k^{-\gamma}$ is pretty close to what Barabási, Albert, and Jeong found in “Scale-free characteristics of random networks.”…
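    For readers who want to check a power-law exponent on their own link data, here is a minimal sketch. The excerpt does not say how the exponent was measured, so everything here is an assumption: the function names `degree_sequence` and `estimate_gamma` are hypothetical, the edge list is a made-up toy, and the estimator is a commonly used discrete approximation to the maximum-likelihood fit, which tends to be rougher when `k_min` is very small.

```python
import math
from collections import Counter


def degree_sequence(edges):
    """Return every node's degree from an iterable of (source, target) pairs."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return list(degree.values())


def estimate_gamma(degrees, k_min=1):
    """Approximate maximum-likelihood estimate of gamma for P(k) ~ k^-gamma.

    Uses the discrete approximation
        gamma ~= 1 + n / sum(ln(k / (k_min - 0.5)))
    taken over the n degrees with k >= k_min.
    """
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


# Toy, made-up edge list (hypothetical site names), not the Social Graph data.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
degrees = degree_sequence(toy_edges)
gamma = estimate_gamma(degrees, k_min=1)
```

    On a real data set you would pass the full relationship list to `degree_sequence` and try several values of `k_min`, since a power law usually only describes the tail of the distribution.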

  • Maximizing Cliques

    Finding the largest cliques (completely connected sub-graphs) is hard. The best algorithms have worst-case run times proportional to 3^(#nodes/3). Add three nodes and the run time triples; add six and it is nine times as long. So what’s a guy without a supercomputer to do but start looking for shortcuts? Shortcut 1: Maximize for your goals. My interest is in characterizing a network, in…
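    To make the clique search concrete, here is a minimal sketch of the Bron–Kerbosch algorithm with pivoting, a standard way to enumerate maximal cliques whose worst case matches the 3^(#nodes/3) bound. It is not necessarily the approach the post goes on to use; the names `maximal_cliques`, `worst_case_bound`, and `toy_graph` are hypothetical, and the graph is a made-up example.

```python
def maximal_cliques(adjacency):
    """Yield every maximal clique of an undirected graph.

    adjacency maps each node to the set of its neighbours.
    """
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            # Nothing can extend this clique, so it is maximal.
            yield clique
            return
        # Pivot on the node with the most neighbours among the candidates;
        # only non-neighbours of the pivot need to be branched on.
        pivot = max(candidates | excluded,
                    key=lambda n: len(adjacency[n] & candidates))
        for node in list(candidates - adjacency[pivot]):
            yield from expand(clique | {node},
                              candidates & adjacency[node],
                              excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    yield from expand(set(), set(adjacency), set())


def worst_case_bound(node_count):
    """Evaluate the 3^(n/3) worst-case bound for a given node count."""
    return 3 ** (node_count / 3)


# Toy graph: a triangle a-b-c with one extra edge c-d.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
largest = max(maximal_cliques(toy_graph), key=len)
```

    The generator yields each maximal clique as a set, so picking the largest is a matter of taking the maximum by length. `worst_case_bound` is only there to let you see how quickly the bound grows as nodes are added.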

  • How I Stopped Worrying About Visualizing Networks – 1.5 Billion Edges

  • Brand Conversations and Stock Performance

    About a year ago, we visualized and compared the conversations around two competing brands from major sports apparel companies. The communication network of Brand A showed better potential for healthy and robust interaction. One year later, with more than 1,000,000 people talking about each brand, what do we see? Several million conversations later, we still see a deeply…

  • Gov Palin’s Email Network (new visualization)

    Cleaned up the data a little, and created a new visualization to better demonstrate the split between the two connected clusters.  The center of the smaller one is a Gov Palin email address that has the “Gov Sponsored” qualifier. It looks like this email address was used for her constituents to get in touch with her. [huge…

  • Palin’s Email Network

    Lots of cleanup left to do in the code that parses and cleans up the emails, but here’s a first pass. There seem to be at least two connected networks, and surprisingly both the Yahoo and the Gov’t email addresses are in the larger one. I wonder what the smaller one consists of? A very big thanks to the…

  • Visualizing Conversation Clusters on Twitter

    2.5 million tweeters having 11 million conversations. Pay attention to the clustering. Song: Dance on Vaseline (Thievery Corporation Remix) by David Byrne

  • Comparing Online Brand Conversations (Sports Apparel)

    Over a long enough period of time, maps of who is talking with whom mostly look the same. Many conversations start to overlap with each other, and eventually you see a large central core and any number of outliers. However, if you look over short enough periods, you can see patterns of how those conversations…

  • Relationships Behind Chicago’s Bid for the Olympics

    The two clusters in the core are roughly split into Democrats (l) and Republicans (r). Built from data by LittleSis.org.

  • Mathematicians Do It Randomly

    What would it look like if you took all of the Mathematics articles from JSTOR, the digital journal archive, and mapped co-authorship of the papers? It would look something like this. Interestingly, while the distribution does hold to the small-world network distribution exponent, there’s some “peakiness” about it that may suggest it’s…