Mathematicians Do It Randomly

What would it look like if you took all of the mathematics articles from JSTOR, the digital journal archive, and mapped co-authorship of the papers? It would look something like this.  Interestingly, while the distribution does hold to the exponent expected of a small-world network, there’s some “peakiness” about it that may suggest it’s not really one network but the merging of several.  Given the role of mathematics in so many other subjects, that would not be a surprise.

JSTOR Mathematics Authors
Largest cluster of co-authorship
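
For readers who want to try this kind of mapping themselves, here is a minimal sketch in Python. It is not how the image above was produced. It assumes each paper is simply a list of author names, and the toy `papers` data and the function names are made up for illustration. It builds the co-authorship graph, pulls out the largest cluster, and counts how many authors have each number of co-authors.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_coauthor_graph(papers):
    """Return an adjacency map: author -> set of co-authors."""
    graph = defaultdict(set)
    for authors in papers:
        unique = set(authors)
        for author in unique:
            graph[author]  # touch the key so solo authors appear as isolated nodes
        for a, b in combinations(sorted(unique), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return the connected clusters of authors as sets, largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def degree_distribution(graph):
    """Map degree -> number of authors with that many distinct co-authors."""
    return Counter(len(neighbours) for neighbours in graph.values())


# Hypothetical toy data: one list of author names per paper.
papers = [
    ["A", "B"],
    ["B", "C", "D"],
    ["E", "F"],
    ["G"],
]

graph = build_coauthor_graph(papers)
components = connected_components(graph)
largest = components[0]
dist = degree_distribution(graph)

print("largest cluster:", sorted(largest))
print("degree distribution:", sorted(dist.items()))
```

On real data, plotting the degree distribution on log-log axes, once for the whole graph and once per cluster, would be one way to check whether the “peakiness” comes from several merged networks.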

Zoomable image with names, after the jump.

The Never Ending Quest for Data

Luc Legay's Social Network
Radial Representation of a Social Network

Finding good data in this field is difficult; even most of the academic literature references relatively small networks of fewer than 100 or so individuals. I suggest that academic research is only now starting to take off (although the field is very far from new) because of the large real-world datasets available from social networking sites.

Nathan Eagle (Reality Mining at MIT) was kind enough to share 330,000 hours of proximity and cell phone communications data he and the team collected from volunteers over the course of the project. To say I am quite excited about digging into it would be a dramatic understatement.

For other large data sets, Duncan Watts is spending his sabbatical over at Yahoo!, and I can only hope there are other people looking really hard at the data available there and at Facebook, Hi5, Google, and many more. Research into people’s behavior, especially in a commercial setting, is not only a great thing for the unprecedented data; at least equally important, it also brings the ethical implications to the front.

Luc Legay's Facebook network

Friendship: #1 factor in whom we spend time with

Mobiles & Communication
Like all good science, analyzing social networks sometimes comes down to proving things we always thought were true. Sometimes we never had any idea how right we were. For example, we really do spend more time with people we like.

A few really bright folks from MIT and the Kennedy School have a paper pending publication:

[analyzing] 330,000 hours of continuous behavioral data logged by the mobile phones of 94 subjects, and compar[ing] these observations with self reported relational data.

Three significant conclusions:

  1. Self-reported data shows a mildly positive relationship with observed data, but is exceptionally noisy.
  2. Friendship outside of work is the best indicator of who spends time with whom at work.
  3. Physical proximity is a good indicator, and predictor, of friendship (and not-friendship).
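
To make the first conclusion concrete, here is a rough sketch of how one might compare self-reported and observed data. This is not the paper's method, which isn't described here. It assumes we have, for each pair of people, the hours per week they say they spend together and the hours their phones actually logged in proximity. The numbers are invented for illustration, and `pearson` is a plain hand-rolled correlation coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data, one entry per pair of people (hours per week).
reported_hours = [10, 2, 8, 1, 5, 12]             # what the survey says
observed_hours = [6.5, 3.0, 2.5, 0.5, 7.0, 9.0]   # what the phones logged

r = pearson(reported_hours, observed_hours)
print(f"correlation between reported and observed: {r:.2f}")
```

A value near 1 would mean surveys track behavior closely; a positive but modest value is what "mildly positive, but exceptionally noisy" would look like.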

So, what do these conclusions suggest for practitioners?

Observed vs Reported Data: Surveys are great for all the reasons surrounding explicit participation, but the bias effects are significant. Find a way to marry active participation with empirical exploration and analysis of social networks.

Friendship and After-hours: Don’t underestimate the power of emotion on business decisions. Since we’re more likely to agree with data that confirms what we already believe, let’s be realistic and recognize the impact that this tendency, viewed through friendship, has on communication in our firms.

Proximity and Friendship: While I was unable to tease out any correlation/causation relationship from this paper, if we consider friendship as a proxy for both trust and ease of working together (through shared history, goals, culture, etc.), there are some solid implications for the upper limit on the value of outsourcing.

