- Average 3.5 days between posts
- Many people post infrequently
- Few people post very frequently
- The distribution of intervals fits a power-law curve
- I'd like to see intervals measured in minutes or hours, which would be better for understanding posting patterns within a day.
- I'd also like to see whether some bloggers post in spurts: a cluster of posts followed by a comparatively long pause. If so, what are the size and interval of the spurts, and the frequency distribution of the pauses?
- The data was collected from the Blogalia blog hosting service. Do Spanish language bloggers behave differently than others?
- I wonder how intervals vary over time. Is there a natural life cycle to blogging? Do more experienced bloggers blog more frequently? Is there seasonality to the interval?
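As a rough illustration of the interval and spurt questions above, here is a minimal Python sketch. It is not the method used in the study. The timestamps are made up, and the function names and the six-hour threshold separating a spurt from a pause are assumptions chosen only for the example.

```python
from datetime import datetime


def intervals_in_hours(timestamps):
    """Return the gaps between consecutive posts, in hours."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() / 3600.0
            for a, b in zip(ordered, ordered[1:])]


def split_into_spurts(timestamps, max_gap_hours=6.0):
    """Group posts into spurts and collect the pauses between them.

    A spurt is a run of posts whose internal gaps are all
    <= max_gap_hours (an arbitrary, assumed threshold).
    Returns (spurts, pauses): a list of timestamp lists, and the
    gaps in hours that separate consecutive spurts.
    """
    ordered = sorted(timestamps)
    if not ordered:
        return [], []
    spurts = [[ordered[0]]]
    pauses = []
    for prev, cur in zip(ordered, ordered[1:]):
        gap = (cur - prev).total_seconds() / 3600.0
        if gap <= max_gap_hours:
            spurts[-1].append(cur)   # still inside the current spurt
        else:
            pauses.append(gap)       # a long pause ends the spurt
            spurts.append([cur])
    return spurts, pauses


# Made-up timestamps for one hypothetical blogger
posts = [
    datetime(2004, 3, 1, 9, 0),
    datetime(2004, 3, 1, 9, 40),
    datetime(2004, 3, 1, 11, 0),
    datetime(2004, 3, 5, 20, 0),
    datetime(2004, 3, 5, 21, 30),
]

gaps = intervals_in_hours(posts)
spurts, pauses = split_into_spurts(posts)
print("gaps (hours):", gaps)
print("spurt sizes:", [len(s) for s in spurts])
print("pauses (hours):", pauses)
```

With real data, the spurt sizes and pauses could each be binned into their own frequency distributions, and the threshold would need tuning rather than guessing.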
Websites of a particular class form increasingly complex networks, and new tools are needed to map and understand them. A way of visualizing this complex network is by mapping it. A map highlights which members of the community have similar interests, and reveals the underlying social network. In this paper, we will map a network of websites using Kohonen’s self-organizing map (SOM), a neural-net like method generally used for clustering and visualization of complex data sets. The set of websites considered has been the Blogalia weblog hosting site (based at http://www.blogalia.com/), a thriving community of around 200 members, created in January 2002. In this paper we show how SOM discovers interesting community features, its relation with other community-discovering algorithms, and the way it highlights the set of communities formed over the network.
Keywords: Weblogs, neural networks, self-organizing maps, clustering, web-based communities, social networks.
Why should you care? It's more than Tufte-style data visualization. It's creating tools for understanding what connects us, how ideas bind us, and a continuation of tools that help us extract meaning from our blogging behavior. I'm eager for the maturation of these models into widely accessible tools. Areas for evolution:
- Reflect change over time. People go through longer-term life changes and day-to-day variations in attitude, interest, connectivity, social connections, and competing demands on their blogging time. Develop tools that identify cohorts sharing life cycle changes or life/career events. Help contrast clustering of social networks with correlated geography and other personal metadata.
- Scale from hundreds to millions of blogs. You can fit 200 blogs on a size A piece of paper, 2000 on a flip chart poster. But how do you visualize the 100k French bloggers? Or Deanspace bloggers? Or everyone using Xanga? We need to understand which methods work well for identifying and visualizing very large communities, both their edges and their structures.
- Explore how out-of-system links correlate with clustering. As I understand it, the research in the paper used internal cross-references, excluding links away from the Blogalia service. I'd like to see those outbound links used to inform the clustering algorithms.
- Border effects. Having discovered communities, what other information can you visualize about the relationships between each pair and among groups of communities?
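For readers unfamiliar with the self-organizing map named in the abstract, here is a minimal pure-Python sketch of the general technique. It is not the paper's implementation. It assumes each blog is represented by a 0/1 vector of the community blogs it links to, and the toy data, grid size, learning rate, and function names are all invented for illustration.

```python
import math
import random


def best_matching_unit(weights, v):
    """Return the grid cell whose weight vector is closest to v."""
    return min(weights,
               key=lambda cell: sum((a - b) ** 2
                                    for a, b in zip(weights[cell], v)))


def train_som(vectors, rows=4, cols=4, epochs=200, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One randomly initialised weight vector per map cell
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # learning rate decays to 0
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood shrinks
        for v in rng.sample(vectors, len(vectors)):  # shuffled order
            bmu = best_matching_unit(weights, v)
            for cell, w in weights.items():
                d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                # Pull the cell toward the input, weighted by grid proximity
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights


# Toy link vectors: entry j is 1 if the blog links to blog j (made-up data).
group_a = [[1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0]]
group_b = [[0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 1, 1]]

som = train_som(group_a + group_b)
cells_a = {best_matching_unit(som, v) for v in group_a}
cells_b = {best_matching_unit(som, v) for v in group_b}
print("group A cells:", sorted(cells_a))
print("group B cells:", sorted(cells_b))
```

Blogs with similar link patterns land on the same or nearby cells, which is what lets the map be read as a picture of communities. Scaling this to much larger sets of blogs would call for sparse vectors and a faster implementation than this sketch.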