Stack Overflow: Badge Analysis Over Time

The 87/18 Rule Applied to Stack Overflow Badges as Awarded Over the Past Nine Months

Another day, and another Stack Overflow database dump XML to play with.  Some quick statistics from the badges.xml file:

  • 62 distinct badges
  • 239,005 user badges awarded
  • 49,261 users have received at least one badge
  • The Top 11 badges (of 62, making 18% of distinct badges) make up 87% of badges awarded
    • Teacher (13.1% of all badges awarded)
    • Student (12.4%)
    • Supporter (10.6%)
    • Scholar (10.1%)
    • Editor (9.8%)
    • Nice Answer (9.6%)
    • Autobiographer (5.3%)
    • Critic (4.8%)
    • Commentator (4.1%)
    • Popular Question (3.6%)
    • Organizer (3.2%)
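For anyone who wants to reproduce numbers like these, here is a minimal sketch of the counting step. It assumes each award in badges.xml is a `<row>` element carrying `UserId` and `Name` attributes; that layout, and the helper names `read_badges` and `summarize`, are assumptions for illustration and may not match the dump you have.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def read_badges(source):
    """Yield (user_id, badge_name) pairs from a badges.xml path or file object.

    Assumption: every award is a <row> element with UserId and Name
    attributes. Adjust the lookups if your dump stores them differently.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            yield elem.get("UserId"), elem.get("Name")
            elem.clear()  # drop attributes we no longer need on large dumps


def summarize(awards, top_n=11):
    """Return headline statistics for an iterable of (user_id, badge_name)."""
    awards = list(awards)
    if not awards:
        raise ValueError("no badge rows found")
    by_badge = Counter(name for _user, name in awards)
    total = len(awards)
    top = by_badge.most_common(top_n)
    return {
        "distinct_badges": len(by_badge),
        "total_awards": total,
        "users_with_badge": len({user for user, _name in awards}),
        # Each top badge as a percentage of all awards.
        "top": [(name, 100.0 * n / total) for name, n in top],
        # Combined share of the top_n badges.
        "top_share_pct": 100.0 * sum(n for _name, n in top) / total,
    }
```

Calling `summarize(read_badges("badges.xml"))` would then return the distinct badge count, the total awards, the number of badge-holding users, and the combined share of the top eleven.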

Sorta interesting, but it is worth noting that most of these are handed out like parking tickets in Venice Beach. (Easy badges in italics.)  More interesting were the anomalies that only become visible when the awards are graphed over time.  This graph shows the daily awards of each of the top eleven badges as a percentage of all top-eleven awards that day.  This view is better at showing relative increases or decreases in badge awarding events.
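The relative view described above can be sketched roughly as follows. This is not the code behind the graph, just one way to compute the same kind of series; it assumes each award comes with a timestamp string beginning with YYYY-MM-DD, and `daily_shares` is a hypothetical helper name.

```python
from collections import Counter, defaultdict


def daily_shares(awards, top_badges):
    """Map each day to {badge: percent of that day's top-badge awards}.

    `awards` is an iterable of (date_string, badge_name) pairs. The date
    string is assumed to start with YYYY-MM-DD (an ISO-style timestamp).
    Awards of badges outside `top_badges` are ignored, so each day's
    percentages sum to 100 across the top badges only.
    """
    top_badges = set(top_badges)
    per_day = defaultdict(Counter)
    for date_string, name in awards:
        if name in top_badges:
            per_day[date_string[:10]][name] += 1

    shares = {}
    for day in sorted(per_day):
        counts = per_day[day]
        day_total = sum(counts.values())
        shares[day] = {
            name: 100.0 * counts[name] / day_total
            for name in sorted(top_badges)
        }
    return shares
```

Days on which none of the top badges were awarded simply do not appear in the result, which is how a gap like the one discussed below would show up.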

A few things caught my eye:

  • Beta Days: Things are pretty erratic in the beta days, but that is to be expected with a significantly smaller user base who were actively trying features out as they came online.
  • Days-Long Outage: There is a days-long gap in the data in mid-April.  No badges were handed out for about four or five days, but they were eventually awarded when the problem was fixed.  I did not see a mention of any failures on blog.stackoverflow.com, so the cause of this outage is a mystery to me.
  • Drastic drop in new Organizer badges: Once the outage was resolved, the relative number of Organizer badges dropped permanently by about two-thirds!  Clearly an Illuminati conspiracy to keep us SOpedians down.
    • ~28 Organizer badges are awarded per day for the three weeks prior to the outage
    • ~8 Organizer badges are awarded per day for the following three weeks

    (UPDATE: Geoff Dalgas, a coder at Stack Overflow, posts on The (unofficial) StackOverflow meta-Discussion Forum that this behavior is due to a database refactoring that allowed them to distinguish between question edits and tag edits.)

  • Number of Popular Question badges awarded daily grows over time: It starts out near zero and grows to be a considerable fraction of the total.  I guess this is to be expected as questions pick up more and more views over time.
  • No Popular Question badges awarded for 27 May: And, unlike the above outage, they do not seem to have been awarded retroactively.  The missing badges show in both the absolute graph (not shown, trust me) and the relative graph (close-up below).

    (UPDATE: I asked about this anomaly on the new Meta Stack Overflow site (No Popular Question badges awarded for 27 May?), and apparently (to be confirmed) the system's view counter was down that day, so no questions were incremented over the threshold for a badge.)
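The two checks behind the observations above (comparing the average daily rate of a badge across two windows, and spotting days with no awards at all) can be sketched like this. The function names are hypothetical, and the input is assumed to be a list of `datetime.date` values, one per award of a single badge.

```python
from datetime import timedelta


def count_by_day(award_days, start, end):
    """Count awards per calendar day from start to end inclusive.

    Days without any award are kept with a count of zero, so gaps
    are visible instead of silently missing.
    """
    counts = {}
    day = start
    while day <= end:
        counts[day] = 0
        day += timedelta(days=1)
    for award_day in award_days:
        if award_day in counts:
            counts[award_day] += 1
    return counts


def mean_per_day(counts, start, end):
    """Average number of awards per day within [start, end]."""
    window = [n for day, n in counts.items() if start <= day <= end]
    return sum(window) / len(window) if window else 0.0


def empty_days(counts):
    """Return the days, in order, on which nothing was awarded."""
    return [day for day, n in sorted(counts.items()) if n == 0]
```

Running `mean_per_day` over the three weeks before and the three weeks after the outage gives the kind of before/after comparison quoted for Organizer, and `empty_days` on the Popular Question counts would flag a day like 27 May.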