Monday, June 13, 2011

It's official: developers get better with age. And scarcer.

As a senior developer I sometimes get asked if the constant change of technology is making me, well, obsolete. Personally, I don't have a problem with the high pace of new technologies. I actually enjoy learning new stuff.

But the question remains: how do developers cope with the onslaught of new technologies as they age?

This kind of data is hard to come by, but thanks to the almighty Stackoverflow and their wise decision (thanks Joel) to make this data publicly available, we can mine it to our collective benefit.

With a simple bash script to download the data, a small Java program to extract the stats and Google Docs to make the graphs, I was able to produce some interesting stats.

I pulled in data on about 70,000 developers whose Stackoverflow reputation is over 100. On average 53% of them have their age listed in their profiles, so the sample was 37,400 users.

In the graphs I only included data if there were at least 100 developers in the age group. Full stats and interactive graphs are available here.
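The age-group filter described above can be sketched roughly as follows. This is an illustration in Python, not the Java program mentioned earlier; the record layout (dicts with an `age` key) and the function name `group_by_age` are assumptions made for the example.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 100  # threshold stated in the post


def group_by_age(users, min_group_size=MIN_GROUP_SIZE):
    """Group user records by age and keep only sufficiently large groups.

    `users` is an iterable of dicts with a hypothetical 'age' key;
    records without an age (None or missing) are skipped.
    """
    groups = defaultdict(list)
    for user in users:
        age = user.get("age")
        if age is None:
            continue  # profiles that list no age cannot be assigned to a group
        groups[age].append(user)
    # Drop age groups that are too small to give stable averages.
    return {
        age: members
        for age, members in sorted(groups.items())
        if len(members) >= min_group_size
    }
```

Averages per age group (reputation, answers, questions) would then be computed only over the groups this function returns.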



The first interesting statistic is how users are distributed by age. On the graph we can see a textbook example of a bell distribution curve. I knew that with age coders tend to switch careers, but I was surprised to see the size of the drop. After the peak age of 27, the number of developers halves every 6 to 7 years.
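To put the halving in numbers: if the count halves every h years after the peak, the count at a given age is the peak count times 0.5 raised to (age - 27) / h. Here is a minimal Python sketch, assuming a clean exponential decline (the real data is noisier) and a hypothetical half-life of 6.5 years, the midpoint of the 6 to 7 year range. The function name is made up for the example.

```python
PEAK_AGE = 27          # peak age reported in the post
HALF_LIFE_YEARS = 6.5  # assumed midpoint of the "6 to 7 years" range


def fraction_of_peak(age, peak_age=PEAK_AGE, half_life=HALF_LIFE_YEARS):
    """Fraction of the peak developer count remaining at `age`.

    Only meaningful for ages at or past the peak.
    """
    if age < peak_age:
        raise ValueError("model only covers ages at or after the peak")
    return 0.5 ** ((age - peak_age) / half_life)


for age in (27, 33.5, 40):
    print(age, fraction_of_peak(age))
```

Under these assumptions about half of the peak count remains at 33.5 and about a quarter at age 40.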

The second stat, the one I find most interesting, is how Stackoverflow reputation relates to age. There is a near-linear increasing trend: the older the developers are, the higher their SO reputation is. To see the reason behind this let's look at another graph:



Senior developers ask fewer questions and provide more answers. A 40-year-old coder provides about 100 answers, roughly double the number of a colleague half his age.

Now, does the quality of posts change with age? Do senior developers provide better answers?


Stackoverflow awards 10 reputation points for each answer upvote, while questions get only 5 points per upvote. To estimate upvotes per post I used this formula: upvotes per post = total rep / (10 x no. answers + 5 x no. questions).
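The formula can be written as a small function. This is only a sketch in Python; the function name and the sample numbers in the usage note are made up for illustration, and the formula ignores other sources of reputation such as accepted answers.

```python
ANSWER_UPVOTE_REP = 10   # reputation per answer upvote, as stated in the post
QUESTION_UPVOTE_REP = 5  # reputation per question upvote, as stated in the post


def upvotes_per_post(total_rep, num_answers, num_questions):
    """Apply the post's formula: total rep / (10 x answers + 5 x questions)."""
    weighted_posts = (
        ANSWER_UPVOTE_REP * num_answers + QUESTION_UPVOTE_REP * num_questions
    )
    if weighted_posts == 0:
        return None  # no posts at all, so the ratio is undefined
    return total_rep / weighted_posts
```

For a hypothetical user with 2000 reputation, 100 answers and 20 questions, `upvotes_per_post(2000, 100, 20)` gives about 1.8.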

With this we get a graph of upvotes per post:



From this graph we see that the quality of posts does not significantly change with age. The number of upvotes per post varies by about 10% across all ages. So, senior coders earn their higher reputation by providing more answers, not by having answers of (significantly) higher quality.

Coder stats - highlights:


  • The number of coders drops significantly with age. Developer numbers peak at age 27, then drop by half every 6-7 years.
  • Developers in their 40s answer roughly twice as much and ask half the questions compared to colleagues in their 20s. It seems the younger generation learns and the older generation teaches.
  • Quality of posts, i.e. upvotes earned per post, only slightly increases with age.
  • Seniors earn their high reputation by being more active than younger developers.

I'd like to see your take on this subject. Comments are welcome.

Until next time,

Peter Knego

77 comments:

  1. Maybe seniors have more spare time to answer questions in StackOverflow... ;)

  2. Spot on. A much better judge of a developer is their ratio of upvotes to total questions answered. You've shown that they answer more often but what would be interesting is to see the number of high quality developers per age group. Why not set a bar of 3x or above per answer as a discriminant.

  3. Once you have a family, spare time becomes scarce. Older developers are more likely to have a family. Personally I cut down on TV time.

  4. I can think of a slight skew that may be occurring. Younger developers are more likely to seek out a site like Stackoverflow. Many older developers seem to stay farther away from interactions on the web. So you may be comparing a wider set of young developers to a smaller (and more motivated than average) set of older ones.

    Just a thought.

    Paul.

  5. I agree with Paul W. I think the demographics of the Stack Overflow crowd are skewed toward users who know about the site... I know plenty of older developers who never use SO.

    @Peter: TV is the only thing that separates us from the savages!

    Replies
    1. I've been a developer for almost 30 years and can honestly say, at the ripe old age of 51, that I consider SO to be one of my most valuable resources.

    2. I agree with John M. If anything SO is for the old codgers. What I think happens, sadly, is that good coders move on to become (often, bad) managers because that's where the respect is.

    3. Is it about the respect as much as the money and opportunity (to do the projects you want) for some?

    4. If you are working as a developer and not getting respect, you need to change companies. Developers in many cases make more than managers or at least they should. Companies that depend on quality developers will give them respect.

  6. You might want to read the book 'Young Geniuses and Old Masters'. I think it's relevant to bifurcate developers into two groups like the book does.

    In one group, the artists develop their craft and over decades become very good at what they do. They innovate with the construction of color and how that represents images.

    In the other group, they are more experimental and play with the conceptualization of ideas more than their craft.

    37signals/Apple are definitely in the master craftsmen camp, while YC companies seem to fall into the young genius camp.

  7. Can you share the code? It would be more useful to learn how it was done than to just see a nice-looking chart.

  8. Interesting stats, but I think there are several issues with these results.

    First, by using average you're giving too much weight to users like Jon Skeet. If you removed him from his age bracket, the average drops from about 2300 to about 2000. Using median would give better results I think.

    Second, you seem to think that the primary reason why there are fewer older programmers is that they changed jobs. I think that the primary reason is that there was much smaller demand for programmers 30 years ago, and so far fewer people studied it then.

    Also, I'd like to see similar statistics based on tags. Is your code available, or are you planning on writing more posts like this one?

  9. Why did you chop the data off at age 49, given that the data was so strong for ages 45-49? What shows up for age 50-60?

  10. You forgot to take into account the creation date of the account. I believe that senior devs registered earlier, statistically, than newer devs.

  11. This is a quote from someone on the web, and it's the best quote I've heard so far:

    There are some programmers with 10 years of experience, but there are also those with 1 year experience repeated 10 times. The latter ones are the scary ones


    Still people who enjoy learning (like most developers should) will always get smarter with age.

  12. @Bud - I only included age groups with 100 or more developers. Things really get spiky beyond age 49. There is a link to the full data (just above the first image).

    Replies
    1. I'm 47, and was just at the right age to encounter computers at high school (13), so I took Computer Science O Level at age 14. The home computer boom, with Commodore Pet, BBC, Sinclair etc probably is one of the reasons that there are not too many developers over 50. One reason why there are probably fewer developers in the older half of the demographic, is that people move into management positions, which is easier than actually working :)

      A lot of my experience is out of date, there is not much call for 6502 or Z80 code now, but the skills learnt in programming applications in assembly language mean that I can produce efficient code in C (pointers are no problem) although I am now doing a lot of Python which is even higher level than C so even that has less relevance. To keep programming for 30 years, means you have to keep changing what you are working on, I started with assembly, moved to C/C++, then Java and now Python.

  13. @Sundar, @svick - yes, I'll post the code in a few minutes.

  14. The age distribution looks a bit more like a lognormal pdf than normal (bell-curve) one: http://www.wolframalpha.com/input/?i=lognormal+distribution+-1.0+0.6

  15. CODE:

    Simple bash scraper: http://clippy.cz.cc/index.php?show=462

    Simple stats in Java: http://clippy.cz.cc/index.php?show=464

    The Java code produces CSV, which I then copy-pasted into Google Docs.

    Note: SO only allows user data downloads of max 100 users per request. So I had to chunk it up and redirect it to a file, with chunks separated by "#-#-#-#-" on a new line.

    The code was meant to be written fast and for this particular task, not as an exercise in "correct" code writing.

    Replies
    1. The links seem to be obsolete, could you link the code again please?

  16. It's a Poisson distribution, not the Gaussian bell curve. A lot of older programmers come from non-CS backgrounds, like math or physics and they notice such mistakes ;-)

  17. I agree with most of the comments above, but would also add that the developers more likely to accept promotion are those with less passion for it, so the higher the age group, the better it is filtered. What do the results look like if you look at only the top 50 from each age bracket, ignoring the ones pulling the age group's average down?

  18. I am taking slight umbrage at the fact that you only consider data points where age <= 49

  19. It's actually a very simple case of selection bias: average developers aren't passionate about what they are doing, and are not good with keeping up with demands of this rather hard job - so they shift to the much easier career path in management.

    This leaves guys who were good to start with, and there aren't that many of them.

  20. @leed25d: You have to draw the line somewhere...

  21. Should really be using median rather than mean for something like "average reputation" where the maximum is unbounded

  22. Looking at the stats in the spreadsheet, what's up with age 91?

  23. To me (43) older just means more experience. When I do get involved in on-line forums, I answer far more questions than I ask...I just want to help people out with the same questions that someone answered for me 10 years ago.

    But I rarely code anymore. Now I supervise. My knowledge gets more and more out of date. This actually puts me into the position of answering MORE questions, because there are fewer and fewer experts.

  24. stack overflow is nothing but a bunch of tools. crybaby pansies.. running around, enforcing that opinions are verboten, I mean what is this BS?

  25. I wonder if there is some self-selection bias at play. I'm in the older category, and don't tend to fill in my age on sites like this. Younger people may be more comfortable sharing that kind of thing.

  26. @aaron, SO is built around the idea of *answering* questions. If you want to *discuss* something, SO is not for you and you should find some forum. They are made specifically for this purpose.

  27. That is a really cool dataset to play with! I think for a quick basic pass there's nothing wrong with the normal averages Peter took. I'd really like to tell my statistician friends about this. You can do a lot of research on that data...

  28. Another thing to take into consideration for a site like SO is that a HUGE number of users at sites like this are students there in hopes to pull in a programmer or two from the web to write code for them for assignments. I don't know how many questions I have seen where the user is clearly a student just looking to get an assignment written for him/her (or at least a good percentage of said assignment). These users often meet the 100 mean reputation that you have selected as a cut off (which appears to be very low considering the lowest point of any group is nearly 8x that high at 770...) but still ask several questions per class per semester in comparison to answering only a few.

    Another commenter noted above that older devs probably have less time due to families and other obligations that younger generations don't have. In my experience the exact opposite has been true. The younger generations (especially college students) that I have come into contact with have, generally speaking, much more excessive schedules than the older generations. For the college students that I mentioned, you have multiple classes, multiple homework assignments and projects for each of those classes, studying for tests, most now work at least part-time but probably a majority are required to work full-time out of necessity to pay for the college education with tuition costs inflating so rapidly these days to "balance budgets", new girlfriends/boyfriends (lets be honest, new relationships are far more demanding than their established counterparts), etc. Most aged developers that I come into contact with work their 40 hours (avg) a week and then go home to spend time with the family, something that in my experience is much more easy to postpone or reschedule in comparison to studying for a midterm or knocking out some code for a project due the following day. On top of this, as several have pointed out, the older developers are generally more passionate about their craft and much more willing to take time out of their own personal life to help less experienced peers.

    Just my .02...

  29. That's interesting data, however selection bias is involved: this sample is formed only by developers who use stack overflow.

  30. Possible bias aside, it's reassuring to know you're going to get better with age, especially if you're halfway through the graph. ;)

  31. Does reputation accumulate over time on SO? If so, should "average reputation" be taken as "average increase in reputation over the last N years" or "average average reputation"? Otherwise there'd be an immediate bias towards long term users.

  32. This is a comment from developed countries.
    In my area most people aged 40 and over didn't do much in the technical field; they worked on the business or management side, so they will post fewer answers and questions than younger people.

  33. See, now I feel guilty, in that I find it quite a hard site to get into. I'm 35, and have been doing this developer stuff for a long time, but every 2-4 years you find that to progress you need to move jobs, and very often that requires a new language. So, perhaps that's the reason I've not got a rating of more than 1. Rubbish me.

  34. Worrying: that sudden drop in all stats at age 49 (my age.) Uh oh! ;-)

  35. By far the most important reason there are fewer older developers, is that there simply weren't many careers in software 20+ years ago.

    I'm a 42 year old dev, and when I took my first job in 1991, there was no internet, I didn't have a PC on my desktop, and most business tasks were still done on paper. Things changed pretty quickly in a few years, but it wasn't until the late 90's and the first web boom that "coding" went from something esoteric that nerds did to something that was a mainstream career option.

    There were few jobs back then - I actually wanted to get a job programming C/C++, but ended up doing number crunching with SAS for a few years because that's what there was. In 1991, I was the only person I knew who programmed computers for a living. Now, even the smallest company needs someone to do IT stuff at some level, either manage a network, or work on a web site, or create ways to manage information.

  36. What Jamietre said is certainly true. I'm 37, and remember seeing Mosaic for the first time while in college. People graduating in the mid-nineties were the first group of programmers to be exposed to the web while in school.

    But what I'm really commenting to say is that if I knew nothing about the topic and only saw the graphs with labels, I would say that what the data you've presented really shows is that older people know about a broader range of topics. That would explain why they answer more questions, but the answers aren't upvoted at any significantly greater rate.

  37. Regarding @a's note on upvote percentage, i'd be interested in the average upvote for ANY question. What i mean is I don't think that the average # of upvotes is a good measure of someone's talent or knowledge, because the number of answers that get more than five or so upvotes is a tiny minority and often has to do with the answer being funny, or a particular topic that gets lots of views. It's a popularity contest, not an evaluation of quality.

  38. I agree with all your conclusions except for the first. The first data may very well mean that older programmers tend to stay away from stackoverflow or at least do not set their age. People tend to stumble on Stackoverflow while seeking answers and younger ones are always in dire need for that. ;-)

  39. I didn't see anyone say this (sorry if it was said), but you can earn 15 reputation if your answer is selected as the correct one. So, this formula:

    upvotes per post = total rep / (10 x no. answers + 5 x no. questions )

    Is not going to give accurate results, because it ignores that statistic. You might modify the formula like this:

    upvotes per post = (TotalRep - (TotalCorrectAnswers x 15)) / ((10 x No of answers) + (5 x no of questions))

    My suspicion is that even this approach is flawed.

    I think there are some basic things you can do to get reputation w/o making any posts, too. Everyone starts at 1--not zero, for example. And for filling out basic profile information you can get some rep...

  40. Everyone knows the real age for answers is 30-35 and not 45-50....Like..cmon...Jon Skeet has his age as 33.

  41. @bartoszmilewski It's not a Poisson distribution. A Poisson distribution is discrete, and age is a continuous variable. Some of us work with failure rates, and notice things like that :-)

    There are a few PDFs that give that sort of skewed bell curve bounded at zero but rising slowly from zero -- lognormal, as has been mentioned, but also the Wald (inverse Gaussian) and F distributions. Hmm, I wanted to learn the R package; I feel some statistical testing coming on...

    (Age 55, by the way).

  42. Very interesting. FWIW, like Kief, I'm an older person who did not provide her age information to SO, although I'm going to add it now.

  43. ** What is the distribution of age in your data-set without any other considerations? Are there more people of older age ready to enter in their age in the profile?

  44. That is sooooo interesting.
    We people (youngsters) never think a 50-year-old man can develop anything, and we treat these guys like dinosaurs, but statistics prove that we are the dinosaurs, and they are way better than us. :))

    I have some apologies to make now. :)

  45. In my opinion, one of the reasons why older developers spend more time teaching than younger ones is because younger ones are simply too busy with writing code.
    Interesting observations, thanks.

  46. In a programmers' forum, I answer if I can answer right now.
    I don't answer if I could only answer after a little study.

    In my workplace,
    I must answer for my coders even if I need more time.
    However, I can answer most of the questions much more quickly than young coders can.

    I am not sure my guess is right because it is based on one sample (me), but I guess old coders have a wide range of knowledge and know how to point out problems and guide other coders very quickly.

    In my office, old coders study or research continuously, say what coders must make, assess them, and shift their coding and debugging time to the younger ones :)

  47. @Hyunik Yes, I'm sure we all fully agree. It's all about experience and passing on what you know, regardless of age.

    I am 29 years old and have been programming for 11 years. I still ask questions, learn from my seniors at an automation firm of 4 years, and use Google as my guide in my business solutions projects.

    However, I am mentoring two 21-year-old CS students, and I am able to answer and assist with their questions and train them on techniques I know/used/need in my development projects.

    It should always be two way

  48. Average or below average developers quitting the job (which is evident by the decreasing number of devs) explains almost all the graphs and findings here. With time, only better programmers stay in programming, hence the average increases, even if the quality of those devs isn't improving.

  49. Nice drop.
    Given a big set of developers, only the very best continue developing 20 years later.
    The more we do something, the better, faster and more robustly we succeed in doing similar things: that applies perfectly to coding.

  50. Hey, what about software engineers like myself with 40 years of experience? I’m almost 70, in good health, on Social security and Medicare. Health benefit costs would be low for a potential employer. I am very good at the annoying product details such as front panel control. Easy low cost hardware interface between panel & “The Big Important Code That Earns the Big Bux.”

    Just for fun and giggles I wrote an ARM Cortex-M3 RTOS. It’s fast and has a low RAM/ROM footprint. I am the software/firmware side of http://themicrokit.com/ (Use contact info on that site for more info.)

    I am exploring using a $0.75 microcontroller and some ~$0.10 tri-color LEDs to make a new-age trinket. Not much into woo myself, but if interesting color patterns moving around a pendant will sell, I can do the firmware. Raw PCB should be nicely under $3 including button cell holder.

  51. One huge problem: the issue with age that everyone alludes to applies to why they wouldn't be answering questions. Developers in the prime of their career have very little interest in helping others. Seasoned developers in "safe" jobs have spare time.

    This obviously sounds awful; but i believe it to be the truth

  52. Hmmm... How many highly-competent folk left the field due to ageism? This one did. Hard enough to compete when you're young and female, but just wait until you're past 30. I went from being a shining star to having young male newhires just assume I was moronic, and headhunters tell me to drop my hottest skills off my resume, get plastic surgery, and lie about my age (35). I remember assumptions of dimwittedness re: much older males, as well.

  53. As a 55 year old developer with 7900 points at the moment on SO, I think that the reason older developers answer more questions is that we are more likely to come across questions that we know the answers to. That's because of broader experience. It would be interesting to see some analysis of tags, i.e. do older developers really answer across a broader selection of tags than younger ones? Also, it would be nice to see some analysis by date because I believe that it is getting harder and harder to earn high scores on SO. There are too many questions vying for attention, and the later adopters use SO differently than early adopters and are probably less likely to upvote.

    Replies
    1. I agree with this. I'm 54 and enjoy coding (I've been through Lisp, Fortran, Forth, C, C++, Java, Perl, and now Scala).

  54. The title of your post does not follow from your analysis. More accurate title:
    "Of developers who use stack overflow as they age, older developers answer more questions"

    Alternative title:
    "Developers show no measurable increase in answer quality on stack overflow as they age"

  55. I want to use that data to examine the presence of female programmers in SO. I don't know what I'm looking for, but I think it would be interesting.

  56. I think this is an obvious situation. Naturally older developers have higher scores; they know and understand MORE. Experience Counts, it is simple as that.

    Another reason the respondents 'drop off' as they age is that they are more concerned with career and family than these types of statistics. Maturity and EXPERIENCE give them more self confidence.

    Also, as 'developers' move on in life and career, lets face it, they code/develop less, consult and mentor more.

    Cheers,
    Skip Stein
    An Old (65) developer/code who has moved on!
    Management Systems Consulting, Inc.
    http://www.msc-inc.net

  57. The age range of interest to me doesn't appear in your study. I'm 68 and still writing code, and I love it.

  58. Could I use these graphs to introduce your post into Korean on my blog?

  59. Nice experiment.

    Could you give us the raw dataset ?

  60. My comment on June 13th links to source. You can simply use that to download the data yourself.

  61. This comment has been removed by the author.

  62. My view: Younger developers have young families they have to take care of more intensively, and only focus on getting info. Older devs spend part of their free time sharing experience with others.

  63. This one is really interesting. I used to think the younger ones are more active. I guess I never tried to see the age of the people who answered the questions on SO. My bad.

    Now that I know where the wisdom of sites like SO comes from, I feel a deep sense of respect :)

  64. The older folks do know better plus they have a lot more experience under their belt so that's an advantage for them.

  65. This comment has been removed by the author.

  66. Older developers do not disclose their age, so these statistics are not really relevant.

  67. Thx a lot for this article Peter, I would be glad if employers in Russia finally took it for granted - these are numbers, nobody can argue with numbers =)
    PS I have a little daughter too, and a big family to feed.

  68. The number of new developers that are entering the industry is growing. That's the reason that there are fewer older developers.
