Archive

Archive for the ‘Programming’ Category

We Hire the Best, Just Like Everyone Else

March 4th, 2016

One of the most common pieces of advice you’ll get as a startup is this:

Only hire the best. The quality of the people that work at your company will be one of the biggest factors in your success – or failure.

I’ve heard this advice over and over and over at startup events, to the point that I got a little sick of hearing it. It’s not wrong. Putting aside the fact that every single other startup in the world who heard this same advice before you is already out there frantically doing everything they can to hire all the best people out from under you and everyone else, it is superficially true. A company staffed by a bunch of people who don’t care about their work and aren’t good at their jobs isn’t exactly poised for success. But in a room full of people giving advice to startups, nobody wants to talk about the elephant in that room:

It doesn’t matter how good the people are at your company when you happen to be working on the wrong problem, at the wrong time, using the wrong approach.

Most startups, statistically speaking, are going to fail.

And they will fail regardless of whether they hired “the best” due to circumstances largely beyond their control. So in that context does maximizing for the best possible hires really make sense?

Given the risks, I think maybe “hire the nuttiest risk junkie adrenaline addicted has-ideas-so-crazy-they-will-never-work people you can find” might actually be more practical startup advice. (Actually, now that I think about it, if that describes you, and you have serious Linux, Ruby, and JavaScript chops, perhaps you should email me.)

I told that person the same thing I tell all prospective job candidates: “come with me if you want to live”

— Jeff Atwood (@codinghorror) May 24, 2015

Okay, the goal is to increase your chance of success, however small it may be, therefore you should strive to hire the best. Seems reasonable, even noble in its way. But this pursuit of the best unfortunately comes with a serious dark side. Can anyone even tell me what “best” is? By what metrics? What results? How do we measure this? Who among us is suitable to judge others as the best at … what, exactly? Best is an extreme. Not pretty good, not very good, not excellent, but aiming for the crème de la crème, the top 1% in the industry.

The real trouble with using a lot of mediocre programmers instead of a couple of good ones is that no matter how long they work, they never produce something as good as what the great programmers can produce.

Pursuit of this extreme means hiring anyone less than the best becomes unacceptable, even harmful:

In the Macintosh Division, we had a saying, “A players hire A players; B players hire C players” – meaning that great people hire great people. On the other hand, mediocre people hire candidates who are not as good as they are, so they can feel superior to them. (If you start down this slippery slope, you’ll soon end up with Z players; this is called The Bozo Explosion. It is followed by The Layoff.) — Guy Kawasaki

There is an opportunity cost to keeping someone when you could do better. At a startup, that opportunity cost may be the difference between success and failure. Do you give less than full effort to make your enterprise a success? As an entrepreneur, you sweat blood to succeed. Shouldn’t you have a team that performs like you do? Every person you hire who is not a top player is like having a leak in the hull. Eventually you will sink. — Jon Soberg

Why am I so hardnosed about this? It’s because it is much, much better to reject a good candidate than to accept a bad candidate. A bad candidate will cost a lot of money and effort and waste other people’s time fixing all their bugs. Firing someone you hired by mistake can take months and be nightmarishly difficult, especially if they decide to be litigious about it. In some situations it may be completely impossible to fire anyone. Bad employees demoralize the good employees. And they might be bad programmers but really nice people or maybe they really need this job, so you can’t bear to fire them, or you can’t fire them without pissing everybody off, or whatever. It’s just a bad scene.

On the other hand, if you reject a good candidate, I mean, I guess in some existential sense an injustice has been done, but, hey, if they’re so smart, don’t worry, they’ll get lots of good job offers. Don’t be afraid that you’re going to reject too many people and you won’t be able to find anyone to hire. During the interview, it’s not your problem. Of course, it’s important to seek out good candidates. But once you’re actually interviewing someone, pretend that you’ve got 900 more people lined up outside the door. Don’t lower your standards no matter how hard it seems to find those great candidates. — Joel Spolsky

I don’t mean to be critical of anyone I’ve quoted. I love Joel, we founded Stack Overflow together, and his advice about interviewing and hiring remains some of the best in the industry. It’s hardly unique to express this sort of opinion in the software and startup field. I could have cited two dozen different articles and treatises about hiring that say the exact same thing: aim high and set out to hire the best, or don’t bother.

This risk of hiring not-the-best is so severe, so existential a crisis to the very survival of your company or startup, the hiring process has to become highly selective, even arduous. It is better to reject a good applicant every single time than accidentally accept one single mediocre applicant. If the interview process produces literally anything other than unequivocal “Oh my God, this person is unbelievably talented, we have to hire them”, from every single person they interviewed with, right down the line, then it’s an automatic NO HIRE. Every time.

This level of strictness always made me uncomfortable. And I’m not going to lie, it starts with my own selfishness. I’m pretty sure I wouldn’t get hired at big, famous companies with legendarily difficult technical interview processes because, you know, they only hire the best. I don’t think I am one of the best. More like cranky, tenacious, and outspoken, to the point that I wake up most days not even wanting to work with myself.

If your hiring attitude is that it’s better to be possibly wrong a hundred times so you can be absolutely right one time, you’re going to be primed to throw away a lot of candidates on pretty thin evidence.

Before cofounding GitHub I applied for an engineering job at Yahoo and didn’t get it. Don’t let other people discourage you.

— Chris Wanstrath (@defunkt) May 22, 2014

I’ve been twitter following the careers of people we interviewed but passed on at my last gig.

Turns out we were almost always wrong.

— Trek Glowacki (@trek) January 26, 2016

Perhaps worst of all, if the interview process is predicated on zero doubt and total confidence, aren’t you accidentally maximizing for hidden bias? Perhaps the reason this candidate doesn’t feel right is that they don’t look like you, dress like you, think like you, speak like you, or come from a similar background as you.

One of the best programmers I ever worked with was Susan Warren, an ex-Microsoft engineer who taught me about the People Like Us problem, way back in 2004:

I think there is a real issue around diversity in technology (and most other places in life). I tend to think of it as the PLU problem. Folk (including MVPs) tend to connect best with folks most like them (“People Like Us”). In this case, male MVPs pick other men to become MVPs. It’s just human nature.

As one reply notes, diversity is good. I’d go as far as to say it’s awesome, amazing, priceless. But it’s hard to get to — the classic chicken and egg problem — if you rely on your natural tendencies alone. In that case, if you want more female MVPs to be invited you need more female MVPs. If you want more Asian-American MVPs to be invited you need more Asian-American MVPs, etc. And the (cheap) way to break a new group in is via quotas.

IMO, building diversity via quotas is bad because they are unfair. Educating folks on why diversity is awesome and how to build it is the right way to go, but also far more costly.

Susan was (and is) amazing. I learned so much working under her, and a big part of what made her awesome was that she was very much Not Like Me. But how could I have appreciated that before meeting her? The fact is that as human beings, we tend to prefer what’s comfortable, and what’s most comfortable of all is … well, People Like Us. The effect can be shocking because it’s so subtle, so unconscious – and yet, surprisingly strong:

  • Baseball cards held by a black hand consistently sold for twenty percent less than those held by a white hand.

  • Using screens to hide the identity of auditioning musicians increased women’s probability of advancing from preliminary orchestra auditions by fifty percent.

  • Denver police officers and community members were shown rapidly displayed photos of black and white men, some holding guns, some holding harmless objects like wallets, and asked to press either the “Shoot” or “Don’t Shoot” button as fast as they could for each image. Both the police and community members were three times more likely to shoot black men.

It’s not intentional, it’s never intentional. That’s the problem. I think our industry needs to shed this old idea that it’s OK, even encouraged to turn away technical candidates for anything less than absolute 100% confidence at every step of the interview process. Because when you do, you are accidentally optimizing for implicit bias. Even as a white guy who probably fulfills every stereotype you can think of about programmers, and who is in fact wearing an “I Rock at Basic” t-shirt while writing this very blog post*, that’s what has always bothered me about it, more than the strictness. If you care at all about diversity in programming and tech, on any level, this hiring approach is not doing anyone any favors, and hasn’t been. For years.

I know what you’re thinking.

Fine, Jeff, if you’re so smart, and “hiring the best” isn’t the right strategy for startups, and maybe even harmful to our field as a whole, what should we be doing?

Well, I don’t know, exactly. I may be the wrong person to ask because I’m also a big believer in geographic diversity on top of everything else. Here’s what the composition of the current Discourse team looks like:

I would argue, quite strongly and at some length, that if you want better diversity in the field, perhaps a good starting point is not demanding that all your employees live within a tiny 30 mile radius of San Francisco or Palo Alto. There’s a whole wide world of Internet out there, full of amazing programmers at every level of talent and ability. Maybe broaden your horizons a little, even stretch said horizons outside the United States, if you can imagine such a thing.

I know hiring people is difficult, even with the very best of intentions and under ideal conditions, so I don’t mean to trivialize the challenge. I’ve recommended plenty of things in the past, a smorgasbord of approaches to try or leave on the table as you see fit:

… but the one thing I keep coming back to, that I believe has enduring value in almost all situations, is the audition project:

The most significant shift we’ve made is requiring every final candidate to work with us for three to eight weeks on a contract basis. Candidates do real tasks alongside the people they would actually be working with if they had the job. They can work at night or on weekends, so they don’t have to leave their current jobs; most spend 10 to 20 hours a week working with Automattic, although that’s flexible. (Some people take a week’s vacation in order to focus on the tryout, which is another viable option.) The goal is not to have them finish a product or do a set amount of work; it’s to allow us to quickly and efficiently assess whether this would be a mutually beneficial relationship. They can size up Automattic while we evaluate them.

What I like about audition projects:

  • It’s real, practical work.
  • They get paid. (Ask yourself: who gets “paid” for a series of intensive interviews that lasts multiple days? Certainly not the candidate.)
  • It’s healthy to structure your work so that small projects like this can be taken on by outsiders. If you can’t onboard a potential hire, you probably can’t onboard a new hire very well either.
  • Interviews, no matter how much effort you put into them, are so hit and miss that the only way to figure out if someone is really going to work in a given position is to actually work with them.

Every company says they want to hire the best. Anyone who tells you they know how to do that is either lying to you or to themselves. But I can tell you this: the companies that really do hire the best people in the world certainly don’t accomplish that by hiring from the same tired “only the best” playbook every other company in Silicon Valley uses.

Try different approaches. Expand your horizons. Look beyond People Like Us and imagine what the world of programming could look like in 10, 20 or even 50 years – and help us move there by hiring to make it so.

* And for the record, I really do rock at BASIC.

Is Your Computer Stable?

February 14th, 2016

Over the last twenty years, I’ve probably built around a hundred computers. It’s not very difficult, and in fact, it’s gotten a whole lot easier over the years as computers become more highly integrated. Consider what it would take to build something very modern like the Scooter Computer:

  1. Apply a dab of thermal compound to top of case.
  2. Place motherboard in case.
  3. Screw motherboard into case.
  4. Insert SSD stick.
  5. Insert RAM stick.
  6. Screw case closed.
  7. Plug in external power.
  8. Boot.

Bam done.

It’s stupid easy. My six year old son and I have built Lego kits that were way more complex than this. Even a traditional desktop build is only a few more steps: insert CPU, install heatsink, route cables. And a server build is merely a few additional steps, maybe with some 1U or 2U space constraints. Scooter, desktop, or server, if you’ve built one computer, you’ve basically built them all.

Everyone breathes a sigh of relief when their newly built computer boots up for the first time, no matter how many times they’ve done it before. But booting is only the beginning of the story. Yeah, it boots, great. Color me unimpressed. What we really need to know is whether that computer is stable.

Although commodity computer parts are more reliable every year, and vendors test their parts plenty before they ship them, there’s no guarantee all those parts will work reliably together, in your particular environment, under your particular workload. And there’s always the possibility, however slim, of getting very, very unlucky with subtly broken components.

Because we’re rational scientists, we test stuff in our native environment, and collect data to prove our computer is stable. Right? So after we boot, we test.

Memory

I like to start with memory tests, since those require bootable media and work the same on all x86 computers, even before you have an operating system. Memtest86 is the granddaddy of all memory testers. I’m not totally clear what caused the split between that and Memtest86+, but all of them work similarly. The one from PassMark seems to be the most up to date, so that’s what I recommend.

Download the version of your choice, write it to a bootable USB drive, plug it into your newly built computer, boot and let it work its magic. It’s all automatic. Just boot it up and watch it go.
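
If the machine you’re preparing the USB stick on is already running Linux, one minimal way to write the image is with dd. This is my sketch, not part of the original instructions; it assumes the downloaded image is named memtest86-usb.img and that the stick shows up as /dev/sdX, so double-check the device name with lsblk first, because dd will cheerfully overwrite whatever you point it at.

lsblk
# replace /dev/sdX with your actual USB device (hypothetical name here)
sudo dd if=memtest86-usb.img of=/dev/sdX bs=4M
sync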

(If your computer supports UEFI boot you’ll get the newest version 6.x, otherwise you’ll see version 4.2 as above.)

I recommend one complete pass of memtest86 at minimum, but if you want to be extra careful, let it run overnight. Also, if you have a lot of memory, memtest can take a while! For our servers with 128GB it took about three hours, and I expect that time scales linearly with the amount of memory.

The “Pass” percentage at the top should get to 100% and the “Pass” count in the table should be greater than one. If you get any errors at all, anything whatsoever other than a clean 100% pass, your computer is not stable. Time to start removing RAM sticks and figure out which one is bad.

OS

All subsequent tests will require an operating system, and one basic ironclad test of stability for any computer is whether it can install an operating system. Pick your free OS of choice, and begin a default install. I recommend Ubuntu Server LTS x64 since it assumes less about your video hardware. Download the ISO and write it to a bootable USB drive. Then boot it.

(Hey look it has a memory test option! How convenient!)

  • Be sure you have a network connection with DHCP available for the install; it goes faster when you don’t have to wait for network detection to time out and then nag you about network settings.
  • In general, you’ll be pressing enter a whole lot to accept all the defaults and proceed onward. I know, I know, we’re installing Linux, but believe it or not, they’ve gotten the install bit down by now.
  • About all you should be prompted for is the username and password of the default account. I recommend jeff and password, because I am one of the world’s preeminent computer security experts.
  • If you are installing from USB and get nagged about a missing CD, remove and reinsert the USB drive. No, I don’t know why either, but it works.

If anything weird happens during your Ubuntu Server install that prevents it from finalizing the install and booting into Ubuntu Server … your computer is not stable. I know it doesn’t sound like much, but this is a decent test as it exercises the whole system.

We’ll need an OS installed for the next tests, anyway. I’m assuming you’ve installed Ubuntu, but any Linux distribution should work similarly.

CPU

Next up, let’s make sure the brains of the operation are in order: the CPU. To be honest, if you’ve gotten this far, past the RAM and OS test, the odds of you having a completely broken CPU are fairly low. But we need to be sure, and the best way to do that is to call upon our old friend, Marin Mersenne.

In mathematics, a Mersenne prime is a prime number that is one less than a power of two. That is, it is a prime number that can be written in the form M_n = 2^n − 1 for some integer n. They are named after Marin Mersenne, a French Minim friar, who studied them in the early 17th century. The first four Mersenne primes are 3, 7, 31, and 127.

I’ve been using Prime95 and MPrime – tools that attempt to rip through as many giant numbers as fast as possible to determine if they are prime – for the last 15 years. Here’s how to download and install mprime on that fresh new Ubuntu Server system you just booted up.

mkdir mprime
cd mprime
wget ftp://mersenne.org/gimps/p95v287.linux64.tar.gz
tar xzvf p95v287.linux64.tar.gz
rm p95v287.linux64.tar.gz

(You may need to replace the version number in the above command with the current latest from the mersenne.org download page, but as of this writing, that’s the latest.)

Now you have a copy of mprime in your user directory. Start it by typing ./mprime

Just passing through, thanks. Answer N to the GIMPS prompt.

Next you’ll be prompted for the number of torture test threads to run. They’re smart here and always pick an equal number of threads to logical cores, so press enter to accept that. You want a full CPU test on all cores. Next, select the test type.

  1. Small FFTs (maximum heat and FPU stress, data fits in L2 cache, RAM not tested much).
  2. In-place large FFTs (maximum power consumption, some RAM tested).
  3. Blend (tests some of everything, lots of RAM tested).

They’re not kidding when they say “maximum power consumption”, as you’re about to learn. Select 2. Then select Y to begin the torture and watch your CPU squirm in pain.

Accept the answers above? (Y):
[Main thread Feb 14 05:48] Starting workers.
[Worker #2 Feb 14 05:48] Worker starting
[Worker #3 Feb 14 05:48] Worker starting
[Worker #3 Feb 14 05:48] Setting affinity to run worker on logical CPU #2
[Worker #4 Feb 14 05:48] Worker starting
[Worker #2 Feb 14 05:48] Setting affinity to run worker on logical CPU #3
[Worker #1 Feb 14 05:48] Worker starting
[Worker #1 Feb 14 05:48] Setting affinity to run worker on logical CPU #1
[Worker #4 Feb 14 05:48] Setting affinity to run worker on logical CPU #4
[Worker #2 Feb 14 05:48] Beginning a continuous self-test on your computer.
[Worker #4 Feb 14 05:48] Test 1, 44000 Lucas-Lehmer iterations of M7471105 using FMA3 FFT length 384K, Pass1=256, Pass2=1536.

Now’s the time to break out your Kill-a-Watt or similar power consumption meter, if you have it, so you can measure the maximum CPU power draw. On most systems, unless you have an absolute beast of a gaming video card installed, the CPU is the single device that will pull the most heat and power in your system. This is full tilt, every core of your CPU burning as many cycles as possible.

I suggest running the i7z utility from another console session so you can monitor core temperatures and speeds while mprime is running its torture test.

sudo apt-get install i7z
sudo i7z

Let mprime run overnight in maximum heat torture test mode. The Mersenne calculations are meticulously checked, so if there are any mistakes the whole process will halt with an error at the console. And if mprime halts, ever … your computer is not stable.
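
One practical wrinkle the write-up doesn’t mention: if you’re doing all of this over SSH, it’s worth starting mprime inside tmux (or screen) so a dropped connection doesn’t end the overnight run early. A minimal sketch, assuming the mprime directory created in the earlier step:

sudo apt-get install tmux
tmux new -s torture
cd ~/mprime && ./mprime
# detach with Ctrl-b then d; reattach later with: tmux attach -t torture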

Watch those CPU temperatures! In addition to absolute CPU temperatures, you’ll also want to keep an eye on total heat dissipation in the system. The system fans (if any) should spin up, and the whole system should be kept at reasonable temperatures through this ordeal, or else you’re going to have a sick, overheating computer one day.
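
i7z reports Intel core temperatures and clock speeds; if you also want motherboard, chipset, and drive temperatures, lm-sensors is another standard Ubuntu package worth having on hand (my suggestion, not part of the original checklist):

sudo apt-get install lm-sensors
sudo sensors-detect
# refresh the sensor readout every two seconds while mprime runs
watch -n 2 sensors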

The bad news is that it’s extremely rare to have any kind of practical, real world workload remotely resembling the stress that Mersenne lays on your CPU. The good news is that if your system can survive the onslaught of Mersenne overnight, believe me, it’s definitely ready for anything you can conceivably throw at it.

Disk

Disks are probably the easiest items to replace in most systems – and the ones most likely to fail over time. We know the disk can’t be totally broken since we just installed an OS on the thing, but let’s be sure.

Start with a bad blocks test for the whole drive.

sudo badblocks -sv /dev/sda

This exercises the full extent of the disk (in safe read only fashion). Needless to say, any errors here should prompt serious concern for that drive.

Checking blocks 0 to 125034839
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)

Let’s check the SMART readings for the drive next.

sudo apt-get install smartmontools
smartctl -i /dev/sda 

That will let you know if the drive supports SMART. Let’s enable it, if so:

smartctl -s on /dev/sda

Now we can run some SMART tests. But first check how long the tests on offer will take:

smartctl -c /dev/sda

Run the long test if you have the time, or the short test if you don’t:

smartctl -t long /dev/sda

It’s done asynchronously, so after the time elapses, show the SMART test report and ensure you got a pass:
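
(The write-up doesn’t spell out the command for this step; printing the drive’s self-test log with smartctl is one way to do it, and smartctl -a, which dumps everything, works as well.)

smartctl -l selftest /dev/sda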

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       100         -

Next, run a simple disk benchmark to see if you’re getting roughly the performance you expect from the drive or array:

dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
hdparm -Tt /dev/sda

For a system with a basic SSD you should see results at least this good, and perhaps considerably better:

536870912 bytes (537 MB) copied, 1.52775 s, 351 MB/s
Timing cached reads:   11434 MB in  2.00 seconds = 5720.61 MB/sec
Timing buffered disk reads:  760 MB in  3.00 seconds = 253.09 MB/sec

Finally, let’s try a more intensive test with bonnie++, a disk benchmark:

sudo apt-get install bonnie++
bonnie++ -f

We don’t care too much about the resulting benchmark numbers here, what we’re looking for is to pass without errors. And if you get errors during any of the above … your computer is not stable.

(I think these disk tests are sufficient for general use, particularly if you consider drives easily RAID-able and replaceable as I do. However, if you want to test your drives more exhaustively, a good resource is the FreeNAS “how to burn in hard drives” topic.)

Network

I don’t have a lot of experience with network hardware failure, to be honest. But I do believe in the cult of bandwidth, and that’s one thing we can check.

You’ll need two machines for an iperf test, which makes it more complex. Here’s the server, let’s say it’s at 10.0.0.1:

sudo apt-get install iperf
iperf -s

and here’s the client, which will connect to the server and record how fast it can transmit data between the two:

sudo apt-get install iperf
iperf -c 10.0.0.1

------------------------------------------------------------
Client connecting to 10.0.0.1, TCP port 5001
TCP window size: 23.5 KByte (default)
------------------------------------------------------------
[  3] local 10.0.0.2 port 43220 connected with 10.0.0.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.09 GBytes    933 Mbits/sec

As a point of reference, you should expect to see roughly 120 megabytes/sec (aka 960 megabits) of real world throughput on a single gigabit ethernet connection. If you’re lucky enough to have a 10 gigabit connection, well, good luck reaching that meteoric 1.2 Gigabyte/sec theoretical throughput maximum.
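
One caveat of my own, not from the original: a single TCP stream frequently can’t saturate a 10 gigabit link on its own, so if you’re testing 10GbE, try iperf’s parallel stream option and sum the reported bandwidth across streams:

iperf -c 10.0.0.1 -P 4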

Video Card

I’m not covering this, because very few of the computers I build these days need more than the stuff built into the CPU to handle video. Which is getting surprisingly decent, at last.

You’re a gamer, right? So you’ll probably want to boot into Windows and try something like furmark. And you should test, because GPUs – particularly gaming GPUs versus the on-die built in stuff – are rather cutting edge bits of kit and burn through a lot of watts.

If you have recommendations for gaming class video card stability testing, share them in the comments.

OK, Maybe It’s Stable

This is the regimen I use on the machines I build and touch. And it’s worked well for me. I’ve identified faulty CPUs (once), faulty RAM, faulty disks, and insufficient case airflow early on so that I could deal with them in the lab, before they became liabilities in the field. Doesn’t mean they won’t fail eventually, but I did all I could to make sure my babies can live long and prosper.

Who knows, with a bit of luck maybe you’ll end up like the guy whose netware server had sixteen years of uptime before it was decommissioned.

These tests are just a starting point. What techniques do you use to ensure the computers you build are stable? How would you improve on these stability tests based on your real world experience?

The Scooter Computer

February 3rd, 2016

When we initially deployed our handbuilt colocated servers for Discourse in 2013, I needed a way to provide an isolated VPN channel in for secure remote access and troubleshooting. Rather than dedicate a whole server to this task, I purchased the inexpensive, open source firmware friendly Asus RT-N16 router, flashed it with the popular TomatoUSB open source firmware, removed the antennas, turned off the WiFi and dropped it off in our colocated rack to let it act as a dedicated VPN access point.

And that box – which was $100 then and around $70 now – worked well enough until now. Although the version of OpenSSL in the 2012 era Tomato firmware we used is not vulnerable to Heartbleed, it’s still getting out of date in terms of the encryption it supports and allows. And Tomato itself is updated sporadically, chaotically at best.

Let’s face it: this is just a little box that runs a chopped up version of Linux, with a bit of specialized wireless hardware and multiple antennas tacked on … that we’re not even using. So when it came time to upgrade, we wondered:

Why not just go with a small box that can run a real, full Linux distro? Wouldn’t that be simpler and easier to keep up to date?

After doing some research and asking on Twitter, I discovered there are a ton of amazing little Broadwell “mini-PC” boxes available on AliExpress.

The specs are kind of amazing for the price. I paid ~$350 each for the ones I selected:

  • i5-5200U Broadwell 2 core / 4 thread CPU at 2.2 GHz – 2.7 GHz
  • 8GB DDR3 × 2 = 16GB RAM
  • 128GB M.2 SSD
  • Dual gigabit Realtek 8168 ethernet
  • front 4 USB 3.0 ports / rear 4 USB 2.0 ports
  • Dual HDMI out

(There’s also optical and analog audio connectors on the front, as well as a SD card reader, which I covered with a sticker since we had no need for audio. I also stripped the WiFi out since we didn’t need it, but it was included for the price, too.)

Selecting the i5-4258u, 4GB RAM, and 64GB SSD pushes the price down to $270. That’s still a solid CPU, only a single generation behind Intel’s latest and greatest Skylake, and carrying the midrange i5 moniker; it’s no pushover. There are also many, many variants of this box from other AliExpress sellers that have slightly older, cheaper CPUs that are still plenty powerful. You can easily spec a box similar to this one for $200.

That’s not a whole lot more than the $200 you’d pay for a high end router these days, and as Ars Technica notes, the average x86 box is radically faster.

Note that in the above graphs, “homebrew” means an old, 1.8 GHz Ivy Bridge dual core chip, 3 generations behind current CPUs, that doesn’t even merit the i3 or i5 designation, and has no hyperthreading. Do bear that in mind as you keep reading.

Meet The Scooter Computer

This box may be small, and only 15 watt TDP, but it is mighty. I spun up a new Digital Ocean droplet and ran a quick benchmark:

sudo apt-get install sysbench
sysbench --test=cpu --cpu-max-prime=20000 run
Tie Shuttle 6

total time:           28.0707s
total num events:     10000
total time taken:     28.0629
per-request stats:
     min:             2.77ms
     avg:             2.81ms
     max:             3.99ms
     ~95 percentile:  3.00ms
Digital Ocean Droplet

total time:          35.9541s
total num events:    10000
total time taken:    35.9492
per-request stats:
     min:             3.50ms
     avg:             3.59ms
     max:             13.31ms
     ~95 percentile:  3.79ms

Results will of course vary by cloud provider, but rest assured this box is just as fast as and possibly even faster than the average cloud box you could spin up right now. Of course it is “only” 2 cores / 4 threads, but the more cores you need, the slower they tend to go because of the overall TDP limits of the core package.

One thing that’s not immediately obvious in photos is that this thing is indeed small but hefty, like holding a solid chunk of aluminum in your hand. That’s because the box is passively cooled — the whole case is the heatsink, as the CPU on the bottom of the motherboard mates with the finned top of the case.

Opening this box you realize just how simple things are inside it; it’s barely more than a highly integrated motherboard strapped to an aluminum block. This isn’t a Steve Jobs truck, a Mac Mini car, or even a motorcycle. This is a scooter.

Scooters are very primitive machines; it is both their greatest strength and their greatest weakness. It’s arguably the simplest personal wheeled vehicle there is. In these short distance scenarios, scooters tend to win over, say, bicycles because there’s less setup and teardown necessary – you don’t have to lock up a scooter, nor do you have to wear a helmet. Just hop on and go! You get almost all the benefits of gravity and wheeled efficiency with a minimum of fuss and maintenance. And yes, it’s fun, too!

Passively cooled computers are paragons of simplicity and reliable consumer electronics, but passively cooling a “real” x86 PC is the holy grail. To get serious performance you usually need to feed the CPU at least 10 to 20 watts – and dissipating that kind of energy with zero fans and ambient airflow alone is not trivial. Let’s see how our scooter does overnight running Mersenne Primes, which is the heaviest CPU load possible.

You can place your hand on the top of the box during this, but it’s uncomfortable. And the whole box radiates heat, not just the top. Overall it was completely stable for me during overnight mprime torture testing with the 15w TDP CPU I chose, and I am comfortable with these boxes sitting in our rack in the datacenter, even under extended full load. However, I would be very careful putting a 28w TDP CPU in this box unless you are absolutely sure it won’t be at full load very often. Have I mentioned that passive cooling is hard?

Power consumption, as measured by my Kill-a-Watt, ranged from 7 watts at the Ubuntu Server 14.04 text login screen, to 8-10 watts at an idle Ubuntu 15.10 GUI login screen (the default OS it arrived with), to 14-18 watts in memory testing, to 26 watts in mprime.

(By the way, don’t bother using burnP6, it generates way too little heat compared to mprime, which is an absolute monster. If your box can survive an overnight run of mprime, I can assure you it’s ready for just about anything the real world can throw at it, ever.)

Disk

The machine has M.2 slots for two drives, as well as a SATA port and power cable (not pictured, but was included in the box) if you want to mate a 2.5″ drive with the drive mounting holes on the bottom of the case. So if you want a mirrored RAID array here for reliability, or a giant honking 2TB 2.5″ HDD in there for media storage, it’s possible!

Be careful, as the internal M.2 slots are 2242, meaning 42mm length. There seem to be mostly lower cost SSD drives in this size for whatever reason.

Don’t worry, though, the bundled 128GB Phison S9 M.2 SSD has decent performance, roughly equal to a good SSD from a few years ago:

dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
hdparm -Tt /dev/sda

536870912 bytes (537 MB) copied, 1.52775 s, 351 MB/s
Timing cached reads:   11434 MB in  2.00 seconds = 5720.61 MB/sec
Timing buffered disk reads:  760 MB in  3.00 seconds = 253.09 MB/sec

That’s respectable SSD performance and won’t hold you back in most use cases, but it’s not a barn-burning disk subsystem, either. I’m not entirely sure retrofitting, say, the state of the art Samsung 950 Pro M.2 2280 drive is possible due to length restrictions.

Of course the Samsung 850 Pro would fit fine as a traditional 2.5″ SATA drive mounted to the case cover, and would perform like this:

536870912 bytes (537 MB) copied, 1.20895 s, 444 MB/s
Timing cached reads:   38608 MB in  2.00 seconds = 19330.61 MB/sec
Timing buffered disk reads: 1584 MB in  3.00 seconds = 527.92 MB/sec

RAM

Intel limits these Broadwell U class CPUs to 16GB RAM total, so maxing the box out is only going to set you back around $70. Still, that’s a significant percentage of the ~$350 total cost, and you may not need that much RAM for what you have in mind.

However, do be careful that you get dual-channel RAM for lower RAM configurations; you don’t want a single 4GB DIMM, you want two 2GB DIMMs. They ship from the vendor with a single DIMM, so beware. It may not matter depending on the task, as noted by AnandTech, but our boxes will be used for OpenSSL, and memory is cheap, so why not?
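
If you want to confirm how many DIMMs actually arrived in the box without opening the case, querying the DMI tables is a quick check (my suggestion, not from the original post):

sudo dmidecode --type memory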

The Versatile Scooter

When I began looking at this, I was shocked to discover just how low-end the x86 CPUs are in a lot of “dedicated” devices, such as the official pfSense hardware:

Sure, 2.4 Ghz and 8 cores on that C2758 sounds reasonable – until you realize those are old Intel Bay Trail Atom cores. Even the current Cherry Trail Atom cores aren’t so hot. Furthermore, those are probably the maximum “turbo” frequencies being quoted, which are unlikely to be sustained under any kind of real multi-core load. Also, did I mention this is being sold as a $1,400 device? Except for the lack of more than 2 dedicated gigabit ethernet ports, I’d put our scooter computer up against that C2758 any day of the week. And you know what? It’d win.

I think this logic applies to a lot of dedicated hardware these days — routers, switches, firewalls, and so on. You’re often better off building up a modern high power, low TDP x86 box and slapping a regular Linux distro on there.

You can even kinda-sorta fit six of them in a 1U rack space.

(Well, except for the power bricks and cables. Vertical mounting on a 1U shelf works out a bit better, and each conveniently came with a stand for vertical operation.)

Now that I’ve worked with these boxes, I’ve become rather enamored of the Scooter Computer concept. Wherever we were thinking that we had to run either:

  • A virtual machine on big iron for some small but important utility function in our rack.

  • Dedicated, purpose built hardware for networking, firewall, or switching with a custom OS.

… we can now take advantage of cheap, reliable, flexible, totally solid state commodity x86 hardware that’s spread across many machines and running standard Linux distributions, like all the rest of our 1U servers.

Why can’t you just communicate properly?

January 28th, 2016

Online communication bugs me. Actually, bugs isn’t accurate. Maybe saddens and fatigues. When volleying with people hiding behind their keyboard shield and protected by three timezones, you have to make a conscious effort to remain optimistic. It’s part of the reason I haven’t taken to Twitter as much as I probably should.

I’ve talked on this subject before and it’s something I often have in the back of my mind when reading comments. It’s come to the forefront recently with some conversations we’ve had at Western Devs, which led to our most recent podcast. I wasn’t able to attend so here I am.

There are certain phrases you see in comments that automatically seem to devolve a discussion. They include:

  • “Why don’t you just…”
  • “Sorry but…”
  • “Can’t you just…”
  • “It’s amazing that…”

Ultimately, all of these phrases can be summarized as follows:

I’m better than you and here’s why…

In my younger years, I could laugh this off amiably and say “Oh this wacky world we live in”. But I’m turning 44 in a couple of days and it’s time to start practicing my crotchety, even if it means complaining about people being crotchety.

So to that end: I’m asking, nay, begging you to avoid these and similar phrases. This is for your benefit as much as the reader’s. These phrases don’t make you sound smart. Once you use them, it’s very unlikely anyone involved will feel better about themselves, let alone engage in any form of meaningful discussion. Even if you have a valid point, who wants to be talked down to like that? Have you completely forgotten what it’s like to learn?

“For fuck’s sake, Mom, why don’t you just type the terms you want to search for in the address bar instead of typing WWW.GOOGLE.COM into Bing?”

Now I know (from experience) it’s hard to fight one’s innate sense of superiority and the overwhelming desire to make it rain down on the unwashed heathen. So take it in steps. After typing your comment, remove all instances of “just” (except when just means “recently” or “fair”, of course). The same probably goes for “simply”. It has more of a condescending tone than a dismissive one. “Actually” is borderline. Rule of thumb: Don’t start a sentence with it.

Once you have that little nervous tic under control, it’s time to remove the negatives. Here’s a handy replacement guide to get you started:

Original phrase          Replacement
“Can’t you”              “Can you”
“Why don’t you”          “Can you”
“Sorry but”              no replacement; delete the phrase
“It’s amazing that…”     delete your entire comment and have a dandelion break

See the difference? Instead of saying Sweet Zombie Jayzus, you must be the stupidest person on the planet for doing it this way, you’ve changed the tone to Have you considered this alternative? In both instances, you’ve made your superior knowledge known but in the second, it’s more likely to get acknowledged. More importantly, you’re less likely to look like an idiot when the response is: I did consider that avenue and here are legitimate reasons why I decided to go a different route.

To be fair, sometimes the author of the work you’re commenting on needs to be knocked down a peg or two themselves. I have yet to meet one of these people who respond well to constructive criticism, er, critique, let alone the destructive type I’m talking about here. Generally, I find they feel the need to cultivate an antagonistic personality but in my experience, they usually don’t have the black turtlenecks to pull it off. Usually, it ends up backfiring and their dismissive comments become too easy to dismiss over time.

Kyle the Inclusive

Originally posted to: http://www.westerndevs.com/communication/Why-can-t-you-just/
Categories: Others, Programming Tags:

Why can’t you just communicate properly?

January 28th, 2016 No comments

Online communication bugs me. Actually, bugs isn’t accurate. Maybe saddens and fatigues. When volleying with people hiding behind their keyboard shield and protected by three timezones, you have to make a conscious effort to remain optimistic. It’s part of the reason I haven’t taken to Twitter as much as I probably should.

I’ve talked on this subject before and it’s something I often have in the back of my mind when reading comments. It’s come to the forefront recently with some conversations we’ve had at Western Devs, which led to our most recent podcast. I wasn’t able to attend so here I am.

There are certain phrases you see in comments that automatically seem to devolve a discussion. They include:

  • “Why don’t you just…”
  • “Sorry but…”
  • “Can’t you just…”
  • “It’s amazing that…”

Ultimately, all of these phrases can be summarized as follows:

I’m better than you and here’s why…

In my younger years, I could laugh this off amiably and say “Oh this wacky world we live in”. But I’m turning 44 in a couple of days and it’s time to start practicing my crotchety, even if it means complaining about people being crotchety.

So to that end: I’m asking, nay, begging you to avoid these and similar phrases. This is for your benefit as much as the reader’s. These phrases don’t make you sound smart. Once you use them, it’s very unlikely anyone involved will feel better about themselves, let alone engage in any form of meaningful discussion. Even if you have a valid point, who wants to be talked down to like that? Have you completely forgot what it’s like to learn?

“For fuck’s sake, Mom, why don’t you just type the terms you want to search for in the address bar instead of typing WWW.GOOGLE.COM into Bing?”

Now I know (from experience) it’s hard to fight one’s innate sense of superiority and the overwhelming desire to make it rain down on the unwashed heathen. So take it in steps. After typing your comment, remove all instances of “just” (except when just means “recently” or “fair”, of course). The same probably goes for “simply”. It has more of a condescending tone than a dismissive one. “Actually” is borderline. Rule of thumb: Don’t start a sentence with it.

Once you have that little nervous tic under control, it’s time to remove the negatives. Here’s a handy replacement guide to get you started:

Original phrase → Replacement
“Can’t you” → “Can you”
“Why don’t you” → “Can you”
“Sorry but” → (no replacement; delete the phrase)
“It’s amazing that…” → (delete your entire comment and have a dandelion break)

See the difference? Instead of saying Sweet Zombie Jayzus, you must be the stupidest person on the planet for doing it this way, you’ve changed the tone to Have you considered this alternative? In both instances, you’ve made your superior knowledge known but in the second, it’s more likely to get acknowledged. More importantly, you’re less likely to look like an idiot when the response is: I did consider that avenue and here are legitimate reasons why I decided to go a different route.

To be fair, sometimes the author of the work you’re commenting on needs to be knocked down a peg or two themselves. I have yet to meet one of these people who responds well to constructive criticism, let alone the destructive type I’m talking about here. Generally, I find they feel the need to cultivate an antagonistic personality but in my experience, they usually don’t have the black turtlenecks to pull it off. Usually, it ends up backfiring and their dismissive comments become too easy to dismiss over time.

Kyle the Inclusive

Originally posted to: http://www.westerndevs.com/communication/Why-can-t-you-just/
Categories: Others, Programming Tags:

Chocolatey Community Feed Update!

January 16th, 2016 No comments
12/18/2015 - 1630 packages ready for a moderator

Average approval time for moderated packages is currently under 10 hours!

In my last post, I talked about things we were implementing or getting ready to implement to really help out with the process of moderation. Those things are:

  • The validator – checks the quality of the package
  • The verifier – tests the package install/uninstall and provides logs
  • The cleaner – provides reminders and closes packages under review when they have gone stale.

The Cleanup Service

We’ve created a cleanup service, known as the cleaner, which recently went into production (a rough sketch of the flow appears after the list below).

  • It looks for packages under review that have gone stale – defined as 20 or more days since last review and no progress
  • Sends a notice/reminder that the package is waiting for the maintainer to fix something and that if another 15 days goes by with no progress, the package will automatically be rejected.
  • 15 days later if no progress is made, it automatically rejects packages with a nice message about how to pick things back up later when the maintainer is ready.
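
To make that flow concrete, here is a minimal sketch of the per-package decision logic in Node-style JavaScript. This is not the actual Chocolatey.org implementation; the pkg and store shapes, function names, and status values are illustrative assumptions, while the 20-day and 15-day thresholds come straight from the list above.

```javascript
// Hypothetical sketch of the cleaner's per-package decision logic.
const STALE_DAYS = 20; // days since the last review activity before a package counts as stale
const GRACE_DAYS = 15; // additional days after the reminder before automatic rejection

function daysSince(date, now = new Date()) {
  return (now - date) / (1000 * 60 * 60 * 24);
}

function processPackage(pkg, store) {
  // Only packages stuck waiting on the maintainer with no progress are considered.
  if (pkg.status !== 'waiting-for-maintainer') return;

  if (!pkg.reminderSentAt && daysSince(pkg.lastReviewActivityAt) >= STALE_DAYS) {
    store.sendReminder(pkg,
      `No progress in ${STALE_DAYS}+ days; the package will be rejected ` +
      `in ${GRACE_DAYS} days unless it is updated.`);
    store.markReminderSent(pkg, new Date());
  } else if (pkg.reminderSentAt && daysSince(pkg.reminderSentAt) >= GRACE_DAYS) {
    // Rejection is reversible: a moderator can move the package back to "submitted" later.
    store.reject(pkg, 'Automatically rejected after going stale; ask a moderator to reopen when ready.');
  }
}
```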

Current Backlog

We’ve found that with all of this automation in place, the moderation backlog has been quickly reduced and should continue to be manageable.

A visual comparison:

December 18, 2015 – 1630 packages ready

January 16, 2016 – 7 packages ready

Note the improvements all around! The most important numbers to key in on are the first three; they represent a “waiting for a reviewer to do something” status. With the validator and verifier in place, moderation is much faster and more accurate, and the validator has increased package quality all around with its review!

The waiting for maintainer (927 in the picture above) represents the bulk of the total number of packages under moderation currently. These are packages that require an action on the part of the maintainer to actively move the package to approved. This is also where the clean up service comes in.

The cleaner sent 800+ reminders two days ago. If there is no response by early February on those packages, the waiting for maintainer status will drop significantly as those packages will automatically be rejected. Some of those packages have been waiting for maintainer action for over a year and are likely abandoned. If you are a maintainer and you have not been getting emails from the site, you should log in now and make sure your email address is receiving emails and that the messages are not going to your spam folder. A rejected package version is reversible; the moderators can put it back to submitted at any time when a maintainer is ready to work on moving the package towards approval again.

Statistics

This is where it really starts to get exciting.

Some statistics:

  • Around 30 minutes after a package is submitted the validator runs.
  • Within 1-2 hours the verifier has finished testing the package and posts results.
  • Typical human review wait time after a package is deemed good is less than a day now.

We’re starting to build statistics on average time to approval for packages that go through moderation, and these will be visible on the site. Running some statistics by hand, we’ve approved 236 packages that have been created since January 1st, and the average time from the final good package (meaning the last time someone submitted fixes to the package) to approval has been 15 hours. There are some packages that drove that up due to fixing some things in our verifier and rerunning the tests. If I only look at packages since those fixes went in on the 10th, that is 104 packages with an average approval time of under 7 hours!
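
For the curious, that hand calculation boils down to averaging the gap between the last good submission and approval. A rough sketch follows; the record array and its field names are hypothetical, since the real data lives in the Chocolatey.org database.

```javascript
// Hypothetical sketch of the "statistics by hand" calculation: average hours from the
// final fix a maintainer submitted to the moment the package was approved.
function averageApprovalHours(records) {
  const hours = records
    .filter(r => r.status === 'approved' && r.lastSubmittedAt && r.approvedAt)
    .map(r => (new Date(r.approvedAt) - new Date(r.lastSubmittedAt)) / (1000 * 60 * 60));
  return hours.reduce((sum, h) => sum + h, 0) / hours.length;
}
```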

Categories: Others, Programming Tags:

Zopfli Optimization: Literally Free Bandwidth

January 2nd, 2016 No comments

In 2007 I wrote about using PNGout to produce amazingly small PNG images. I still refer to this topic frequently, as seven years later, the average PNG I encounter on the Internet is very unlikely to be optimized.

For example, consider this recent Perry Bible Fellowship cartoon.

Saved directly from the PBF website, this comic is an 800 × 1412, 32-bit color PNG image of 671,012 bytes. Let’s save it in a few different formats to get an idea of how much space this image could take up:

BMP, 24-bit                  3,388,854 bytes
BMP, 8-bit                   1,130,678 bytes
GIF, 8-bit, no dithering       147,290 bytes
GIF, 8-bit, max dithering      283,162 bytes
PNG, 32-bit                    671,012 bytes

PNG is a win because like GIF, it has built-in compression, but unlike GIF, you aren’t limited to cruddy 8-bit, 256 color images. Now what happens when we apply PNGout to this image?

Default PNG 671,012
PNGout 623,859 7% smaller

Take any random PNG of unknown provenance, apply PNGout, and you’re likely to see around a 10% file size savings, possibly a lot more. Remember, this is lossless compression. The output is identical. It’s a smaller file to send over the wire, and the smaller the file, the faster the decompression. This is free bandwidth, people! It doesn’t get much better than this!

Except when it does.

In 2013 Google introduced a new, fully backwards compatible method of compression they call Zopfli.

The output generated by Zopfli is typically 3–8% smaller compared to zlib at maximum compression, and we believe that Zopfli represents the state of the art in Deflate-compatible compression. Zopfli is written in C for portability. It is a compression-only library; existing software can decompress the data. Zopfli is bit-stream compatible with compression used in gzip, Zip, PNG, HTTP requests, and others.

I apologize for being super late to this party, but let’s test this bold claim. What happens to our PBF comic?

Default PNG 671,012
PNGout 623,859 7% smaller
ZopfliPNG 585,117 13% smaller

Looking good. But that’s just one image. We’re big fans of Emoji at Discourse, so let’s try it on the original first release of the Emoji One emoji set – that’s a complete set of 842 64×64 PNG files in 32-bit color:

Default PNG 2,328,243
PNGout 1,969,973 15% smaller
ZopfliPNG 1,698,322 27% smaller

Wow. Sign me up for some of that.
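
If you want to reproduce that kind of batch run yourself, here is a minimal sketch that shells out to the zopflipng command-line tool for every PNG in a folder and reports the total savings. It assumes zopflipng is installed and on your PATH; the folder name is a placeholder.

```javascript
// Minimal sketch: re-encode every PNG in a folder with zopflipng and report total savings.
// Assumes the zopflipng binary is on PATH; 'emoji-one' is a placeholder directory name.
const { execFileSync } = require('child_process');
const fs = require('fs');
const path = require('path');

const dir = 'emoji-one';
let before = 0, after = 0;

for (const name of fs.readdirSync(dir).filter(f => f.endsWith('.png'))) {
  const src = path.join(dir, name);
  const out = src.replace(/\.png$/, '.zopfli.png');
  before += fs.statSync(src).size;
  execFileSync('zopflipng', ['-y', src, out]); // -y allows overwriting an existing output file
  after += fs.statSync(out).size;
  fs.renameSync(out, src); // keep the losslessly re-encoded file under the original name
}

console.log(`${before} -> ${after} bytes ` +
  `(${(100 * (1 - after / before)).toFixed(1)}% smaller)`);
```

Expect a long wait on big batches – the slowdown discussed below is very real.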

In my testing, Zopfli reliably produces 3 to 8 percent smaller PNG images than even the mighty PNGout, which is an incredible feat. Furthermore, any standard gzip compressed resource can benefit from Zopfli’s improved deflate, such as jQuery.

Or the standard compression corpus tests:

Corpus          Size    gzip -9   kzip    Zopfli
Alexa top 10k   693mb   128mb     125mb   124mb
Calgary         3.1mb   1017kb    979kb   975kb
Canterbury      2.8mb   731kb     674kb   670kb
enwik8          100mb   36mb      35mb    35mb

(Oddly enough, I had not heard of kzip – turns out that’s our old friend Ken Silverman popping up again, probably using the same compression bag of tricks from his PNGout utility.)

But there is a catch, because there’s always a catch – it’s also 80 times slower. No, that’s not a typo. Yes, you read that right.

gzip -9                    5.6s
7-Zip -mm=Deflate -mx=9    128s
kzip                       336s
Zopfli                     454s

There’s a little caveat here in that gzip compression is faster than it looks in the above comparison, because level 9 is a bit slow for what it does:

Level     Time    Size (% of original)
gzip -1   11.5s   40.6%
gzip -2   12.0s   39.9%
gzip -3   13.7s   39.3%
gzip -4   15.1s   38.2%
gzip -5   18.4s   37.5%
gzip -6   24.5s   37.2%
gzip -7   29.4s   37.1%
gzip -8   45.5s   37.1%
gzip -9   66.9s   37.0%

It’s up to you to decide if that whopping 0.1% compression ratio difference between gzip -7 and gzip -9 is worth the doubling in CPU time. In related news, this is why pretty much every compression tool’s so-called “Ultra” compression level or mode is generally a bad idea. You fall off an algorithmic cliff pretty fast, so stick with the middle or the optimal part of the curve, which tends to be the default compression level. They do pick those defaults for a reason.
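
If you want to see where your own assets sit on that curve, Node’s built-in zlib module exposes the same compression levels; here’s a quick sketch (the bundle.js file name is a placeholder for whatever you actually serve):

```javascript
// Quick sketch: measure gzip output size and time at each compression level
// using Node's built-in zlib. 'bundle.js' is a placeholder asset name.
const fs = require('fs');
const zlib = require('zlib');

const data = fs.readFileSync('bundle.js');

for (let level = 1; level <= 9; level++) {
  const start = process.hrtime.bigint();
  const out = zlib.gzipSync(data, { level });
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`level ${level}: ${out.length} bytes ` +
    `(${(100 * out.length / data.length).toFixed(1)}%) in ${ms.toFixed(1)} ms`);
}
```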

PNGout was not exactly fast to begin with, so imagining something that’s 80 times slower (at best!) to compress an image or a file is definite cause for concern. You may not notice on small images, but try running either on a larger PNG and it’s basically time to go get a sandwich. Or if you have a multi-core CPU, 4 to 16 sandwiches. This is why applying Zopfli to user-uploaded images might not be the greatest idea, because the first server to try Zopfli-ing a 10k × 10k PNG image is in for a hell of a surprise.

However, remember that decompression is still the same speed, and totally safe. This means you probably only want to use Zopfli on pre-compiled resources, which are designed to be compressed once and downloaded millions of times – rather than a bunch of PNG images your users uploaded which may only be viewed a few hundred or thousand times at best, regardless of how optimized the images happen to be.

For example, at Discourse we have a default avatar renderer which produces nice looking PNG avatars for users based on the first letter of their username, plus a color scheme selected via the hash of their username. Oh yes, and the very nice Roboto open source font from Google.
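
The color-selection part is easy to sketch: hash the username and use the result to index into a fixed palette, so the same user always gets the same scheme. This is purely illustrative and not Discourse’s actual implementation; the palette size and hash choice are assumptions.

```javascript
// Illustrative sketch (not Discourse's real code): derive a stable color scheme index
// from a username by hashing it and wrapping the result into a fixed-size palette.
const crypto = require('crypto');

const PALETTE_SIZE = 250; // roughly the "~250 color schemes" mentioned below

function colorIndexFor(username) {
  const hash = crypto.createHash('md5').update(username.toLowerCase()).digest();
  return hash.readUInt32BE(0) % PALETTE_SIZE; // first 4 bytes as an unsigned int
}

console.log(colorIndexFor('codinghorror')); // same username, same index, every time
```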

We spent a lot of time optimizing the output avatar images, because these avatars can be served millions of times, and pre-rendering the whole lot of them, given the constraints of …

  • 10 numbers
  • 26 letters
  • ~250 color schemes
  • ~5 sizes

… isn’t unreasonable at around 45,000 unique files (10 numbers + 26 letters = 36 glyphs; 36 × ~250 color schemes × ~5 sizes ≈ 45,000). We also have a centralized https CDN we set up to serve avatars (if desired) across all Discourse instances, to further reduce load and increase cache hits.

Because these images stick to shades of one color, I reduced the color palette to 8-bit to save space, and of course we run PNGout on the resulting files. They’re about as tiny as you can get.

When I ran Zopfli on the above avatars, I was super excited to see my expected 3 to 8 percent free file size reduction and after the console commands ran, I saw that it saved … 1 byte, 5 bytes, and 2 bytes respectively. Cue sad trombone.

(Yes, it is technically possible to produce strange “lossy” PNG images, but I think that’s counter to the spirit of PNG which is designed for lossless images. If you want lossy images, go with JPG or another lossy format.)

The great thing about Zopfli is that, assuming you are OK with the extreme up front CPU demands, it is a “set it and forget it” optimization step that can apply anywhere and will never hurt you. Well, other than possibly burning a lot of spare CPU cycles.

If you work on a project that serves compressed assets, take a close look at Zopfli. It’s not a silver bullet – as with all advice, run the tests on your files and see – but it’s about as close as it gets to literally free bandwidth in our line of work.

Categories: Others, Programming Tags:

Migrating from Jekyll to Hexo

December 25th, 2015 No comments

WesternDevs has a shiny new look thanks to graphic designer extraordinaire, Karen Chudobiak. When implementing the design, we also decided to switch from Jekyll to Hexo. Besides having the opportunity to learn NodeJS, the other main reason was Windows. Most of us use Windows as our primary OS and Jekyll doesn’t officially support it. There are instructions available from people who were obviously more successful at it than we were. And there are even simpler ones that I discovered during the course of writing this post and that I wish had existed three months ago.

Regardless, here we are and it’s already been a positive move overall, not least because the move to Node means more of us are available to help with the maintenance of the site. But it wasn’t without its challenges. So I’m going to outline the major ones we faced here in the hope that it will help you make your decision more informed than ours was.

To preface this, note that I’m new to Node and in fact, this is my first real project with it. That said, I’m no expert in Ruby either, which is what Jekyll is written in. And the short version of my first impressions is: Jekyll feels more like a real product but I had an easier time customizing Hexo once I dug into it. Here’s the longer version:

DOCUMENTATION/RESOURCES

You’ll run into this very quickly. Documentation for Hexo is decent but incomplete. And once you start Googling, you’ll discover many of the resources are in Chinese. I found very quickly that there is a posts collection and that each post has a categories collection. But as to what these objects look like, I couldn’t tell. They aren’t arrays. And you can’t JSON.stringify them because they have circular references in them. util.inspect works but it’s not available everywhere.
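
One way to poke at those objects is to register a small debug helper from a scripts/ file and call it from a template. This is only a sketch (the dump helper name is our own choice), using Hexo’s hexo.extend.helper.register API:

```javascript
// scripts/debug.js – sketch of a debug helper for inspecting Hexo template locals.
const util = require('util');

hexo.extend.helper.register('dump', function (obj) {
  // depth: 2 keeps the output readable and sidesteps the circular references
  // that make JSON.stringify blow up on these objects.
  return '<pre>' + util.inspect(obj, { depth: 2 }) + '</pre>';
});

// In an EJS template: <%- dump(site.posts) %> or <%- dump(page) %>
```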

MULTI-AUTHOR SUPPORT

By default, Hexo doesn’t support multiple authors. Neither does Jekyll, mind you, but we found a pretty complete theme that does. In Hexo, there’s a decent package that gets you partway there. It lets you specify an author ID on a post and it will attach a bunch of information to it. But you can’t, for example, get a full list of authors to list on a Who We Are page. So we created a separate data file for the authors. But we also haven’t figured out how to use that file to generate a .json file to use for the Featured section on the home page. So at the moment, we have author information in three places. Our temporary solution is to disallow anyone from joining or leaving Western Devs.
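
For reference, Hexo’s data files are what make the separate author file workable: anything under source/_data shows up as site.data in templates. One untested idea for the .json problem is a small generator that re-emits that same data at build time. Everything here (the file names, and the assumption that data files are exposed on the generator’s locals the way they are for templates) is a sketch, not how we actually wired things up:

```javascript
// scripts/authors-json.js – untested sketch: emit public/authors.json from the same
// source/_data/authors.yml file the templates read via site.data.authors.
// Assumes Hexo exposes data files on the generator's locals as well.
hexo.extend.generator.register('authors-json', function (locals) {
  return {
    path: 'authors.json',
    data: JSON.stringify(locals.data.authors || [])
  };
});
```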

CUSTOMIZATION

If you go with Hexo and choose an existing theme, you won’t run into the same issues we did. Out of the box, it has good support for posts, categories, pagination, even things like tags and aliases with the right plugins.

But we started from a design and were migrating from an existing site with existing URLs and had to make it work. I’ve mentioned the challenge of multiple authors already. Another one: maintaining our URLs. Most of our posts aren’t categorized. In Jekyll, that means they show up at the root of the site. In Hexo, that’s not possible, at least at the moment, and I suspect this is a bug. We eventually had to fork Hexo itself to maintain our existing URLs.

Another challenge: excerpts. In Jekyll, excerpts work like this: Check the front matter for an excerpt. If one doesn’t exist, take the first few characters from the post. In Hexo, excerpts are empty by default. If you add a Read more...

Categories: Others, Programming Tags:

Chocolatey Community Feed State of the Union

December 18th, 2015 No comments
Notice on Chocolatey.org

tl;dr: Everything on https://chocolatey.org/notice is coming to fruition! We’ve automatically tested over 6,500 packages, a validator service is coming up now to check quality and the unreviewed backlog has been reduced by 1,000 packages! We sincerely hope that the current maintainers who have been waiting weeks and months to get something reviewed can be understanding that we’ve dug ourselves into a moderation mess and are currently finding our way out of this situation.

We’ve added a few things to Chocolatey.org (the community feed) to help speed up review times for package maintainers. A little over a year ago we introduced moderation for all new package versions (besides trusted packages) and from the user perspective it has been a fantastic addition. Usage has gone up by over 20 million packages installed in one year versus just 5 million in the 3 years before it! It’s been an overwhelming response from the user community. Let me say that again for effect: Chocolatey’s usage of community packages has increased 400% in one year over the prior three years combined!

But let’s be honest, we’ve nearly failed in another area: keeping the moderation backlog low. We introduced moderation as a security measure for Chocolatey’s community feed because it was necessary, but we introduced it too early. We didn’t have the infrastructure automation in place to handle the sheer load of packages that were suddenly thrown at us. And once we put moderation in place, more folks wanted to use Chocolatey, so it suddenly became much more popular. And because we have automation surrounding updating and pushing packages (namely automatic packages), we had some folks who would submit 50+ packages at a time. With one particular maintainer submitting 200 packages automatically, and a review of each of them taking somewhere between 2-10 minutes, you don’t have to be a detective to understand how this was going to cause consternation. And from the backlog you can see it really hasn’t worked out well.

The most important number to understand here is the number of submitted packages (underlined) – packages a moderator has not yet looked at. A goal is to keep this well under 100. We want the time from a high quality package being submitted to it being approved to be within 1-2 days.

Moderation has, up until recently, been a very manual process. Sometimes, which moderator looked at your package determined whether it would be held in review for various reasons. We’ve added moderators and we’ve added more guidance around moderation to help bring a more structured review process. But it’s not enough.

Some of you may not know this, but our moderators are volunteers and we currently lack full-time employees to help fix many of the underlying issues. Even considering that we’ve also needed to work towards Kickstarter delivery and the Chocolatey rewrite (making choco better for the long term), it’s still not the greatest news to know that it has taken a long time to fix moderation, but hopefully it brings some understanding. Our goal is to eventually bring on full-time employees but we are not there yet. The Kickstarter was a start, but it was just that. A kick start. A few members of the core team who are also moderators have focused on ensuring the Kickstarter turns into a model that can ensure the longevity of Chocolatey. It may have felt that we have been ignoring the needs of the community, but that has not been our intention at all. It’s just been really busy and we needed to address multiple areas surrounding Chocolatey with a small number of volunteers.

So What Have We Fixed?

All moderation review communication is done on the package page. Now all review is done on the website, which means there is no more email back and forth (the older process) and no more of what looked like one-sided communication on the site. This is a significant improvement.

Package review logging. Now you can see right from the discussion who submitted a package and when, when statuses change, and where the conversation is.

package review logging

More moderators. A question that comes up quite a bit surrounds the number of moderators that we have and adding more. We have added more moderators. We are up to 12 moderators for the site. Moderators are chosen based on building trust, usually through being extremely familiar with Chocolatey packaging and what is expected of approved packages. Learning what is expected usually comes through having a few of your own packages go through the approval process. We’ve written most of this up at https://github.com/chocolatey/choco/wiki/Moderation.

Maintainers can self-reject packages that no longer apply. Say your package uses a download URL for the software that is always the same; some older package versions in the queue are no longer applicable and can now be purged.

The package validation service (the validator). The validator checks the quality of a package based on requirements, guidelines and suggestions for creating packages for Chocolatey’s community feed. Many of the validation items will automatically roll back into choco and will be displayed when packaging a package. We like to think of the validator as unit testing. It is validating that everything is as it should be and meets the minimum requirements for a package on the community feed.
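
The unit-testing analogy maps fairly directly onto code. Here is a purely illustrative sketch (not the real package-validator, whose actual rule set lives in the Chocolatey repositories): rules are grouped by severity, failed requirements block approval, and everything else is advisory.

```javascript
// Illustrative sketch only: severity-tiered validation rules. The example rules below
// are made up for illustration and are not the validator's actual rule set.
const rules = [
  { severity: 'requirement', name: 'Package has a description',
    test: pkg => Boolean(pkg.description && pkg.description.trim()) },
  { severity: 'guideline', name: 'Tags are space-separated, not comma-separated',
    test: pkg => !pkg.tags || !pkg.tags.includes(',') },
  { severity: 'suggestion', name: 'An icon URL is provided',
    test: pkg => Boolean(pkg.iconUrl) },
];

function validate(pkg) {
  const findings = rules.filter(r => !r.test(pkg))
    .map(r => ({ severity: r.severity, message: r.name }));
  return { passed: !findings.some(f => f.severity === 'requirement'), findings };
}
```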

validation results

The package verifier service (the verifier). The verifier checks the correctness (that the package actually works), that it installs and uninstalls correctly, has the right dependencies to ensure it is installed properly and can be installed silently. The verifier runs against both submitted packages and existing packages (checking every two weeks that a package can still install and sending notice when it fails). We like to think of the verifier as integration testing. It’s testing all the parts and ensuring everything is good. On the site, you can see the current status of a package based on a little colored ball next to the title. If the ball is green or red, the ball is a link to the results (only on the package page, not in the list screen).

passed verification - green colored ball with link

  • Green means it passed verification. The ball is a link to the results.
  • Orange means verification is still pending (it has not yet run).
  • Red means it failed verification for some reason. The ball is a link to the results.
  • Grey means unknown or excluded from verification (if excluded, a reason will be listed on the package page).

Coming Soon – Moderators will automatically be assigned to backlog items. Once a package passes both validation and verification, a moderator is automatically assigned to review the package. Once the backlog is in a manageable state, this will be added.

What About Maintainer Drift?

Many maintainers come in to help out at different times in their lives and they do it nearly always as volunteers. Sometimes it is the tools they are using at the current time and sometimes it has to do with where they work. Over time folks’ preferences/workplaces change and so maintainers drift away from keeping packages up to date because they have no internal incentive to continue to maintain those packages. It’s a natural human response. I’ve been thinking about ways to reduce maintainer drift for the last three years and I keep coming back to the idea that consumers of those packages could come along and provide a one-time or weekly tip to the maintainer(s) as a thank you for keeping package(s) updated. We are talking to Gratipay now – https://github.com/gratipay/inside.gratipay.com/issues/441. This, in addition to a reputation system, will, I feel, go a long way toward reducing maintainer drift.

Final Thoughts

Package moderation review time is down to mere seconds as opposed to minutes like before. This will allow a moderator to review and approve package versions much more quickly and will reduce our backlog and keep it lower.

It’s already working! The number in the unreviewed backlog is down by 1,000 from the month prior. This is because a moderator doesn’t have to wait until a proper time when they can have a machine up and ready for testing and in the right state. Now packages can be reviewed faster. This is only with the verifier in place, purely testing package installs. The validator should cut review time down further, to near seconds. The total number of packages in the moderation backlog has also been reduced, but honestly I only usually pay attention to the unreviewed backlog number as it is the most important metric for me.

The verifier has rolled through over 6,500 verifications to date! https://gist.github.com/choco-bot/

When chocobot hit 6500 packages verified

We sincerely hope that the current maintainers who have been waiting weeks and months to get something reviewed can be understanding that we’ve dug ourselves into a moderation mess and are currently finding our way out of this situation. We may have some required findings and will ask for those things to be fixed, but for anything that doesn’t have required findings, we will approve them as we get to them.

Categories: Others, Programming Tags: