Archives for HCoder.org

Book summary: See What I Mean (II)

This is the second half of my summary of “See What I Mean”, by Kevin Cheng. It covers chapter 6 to the end. See the first half on this blog.

Laying out the comic

Once the script is ready, you sketch the comic storyboard to answer these questions:

  • Composition of each panel (where characters go). See example on p.108. Tips: rule of thirds, writing speech bubbles first to use space better, avoid intersecting lines in foreground and background.
  • Perspective (how the audience will look at the characters). Use and be aware of perspective and distance (where the camera is). For inspiration, have a look at Wally Wood’s “22 panels that always work”.
  • Flow & progression (change of locations, how time passes, …). What happens between panels should be obvious. Take care of small details like which hand is used, or the side of something.

Drawing and refining

Resources to make higher-quality art, faster:

  • Reference materials: tracing over stuff is easy, quick and gives good results (eg. photographs, including ones you take yourself for the purpose, or avatar generators like IMVU or Xbox).
  • Templates: a couple are available on the net, but they tend to be limiting. Create your own templates?
  • Comic creation software: several options exist, but they seem too complex and/or expensive.
  • Online creation tools: websites like bitstrips.com and pixton.com seem interesting.

Applying comics

Possible uses of comics:

  • Requirements/vision: documents don’t get read, and if they do, they’re ambiguous. Comics are easy to read and explaining requirements through real use-cases often works better.
  • Good start for projects/companies: comics help you validate your ideas before you build anything, or decide exactly what to build. In these cases, make the person read the comic on her own, then have her explain it in her own words as she reads it again. That way, misunderstandings are easier to spot. Also, make people say how it relates to them: whether they or someone they know would use it.
  • Marketing materials. Explaining your product, or why it’s special, through comics.
  • Certain kinds of documentation.

It’s generally easier to get people to read comics than to read text descriptions of the same content.

Breaking Down the Barriers

When convincing bosses to approve the use of comics, there’s usually less resistance than people think. That said, understand who you’re convincing and which arguments to use (eg. some designers believe comics take relatively little time compared to alternatives; there’s also evidence suggesting that words + pictures help understanding and memory). Fidelity and polish in comics (as in any other medium) need to be higher for certain audiences, eg. bosses or corporate clients.

Useful templates and references

The appendix has ideas about how to show someone in front of a computer, interesting panels, a gesture dictionary and a facial expression dictionary:

(Image: facial expression dictionary)

Book summary: See What I Mean (I)

Oh, boy. It’s been a while, hasn’t it? This is the first post of the year, and it will be about the first book I’ve finished, “See What I Mean” by Kevin Cheng (which, by the way, I got from O’Reilly’s Blogger Review Program). It’s a book about using comics “for work”, to communicate ideas like product concepts, user stories, and such, more effectively.

This post will cover about half the book, from chapters 2 to 5. These notes are much more useful if you have the book to refer to the pictures, but hey, this is mostly for me ;-)

Properties of comics

Basic vocabulary for the anatomy of comics:

(Image: anatomy of a comic)

Properties of comics:

  1. Communication: comics don’t need words, or can express a lot without them (p. 23). They’re universal!
  2. Imagination: characters with few features let more readers relate to them. This can be applied to UI mockups/sketches, too: people focus less on design if it’s abstract (p. 25,26).
  3. Expression: give interpretation to words (“I’m sorry”/”Thank you” examples with different facial expressions on p.27). When combining text and pictures, the sum is bigger than the parts.
  4. Time: comics can express time passing by using empty or repeated panels. Also, via words in “narration” and reference objects (like burning candles, clocks, or day/night).

Drawing 101

Drawing faces is easy! Eyebrows and mouth go a long way to express mood. Body language helps quite a bit, too, and it’s easy to represent. See examples of combinations of eyebrows and mouths on p.47, 48. In faces, the eyes go in the middle, and dividing the bottom half into three gives the bottom of the nose and the mouth. Also see http://www.howtodrawit.com for tips on how to draw different things.

Approximate proportions for a person are two heads for the torso, one for hips/groin, and three for the legs. Body language basic guidelines: leaning forward shows interest, concentration or anger (it depends on arm position and context; curling the spine works, too); arm position can tell a lot (lifting one or both arms, hand on chin, arms in front of the body); head positions don’t change much, but facial expressions, or where the person is looking, do. When drawing body language, try to imagine the situation and exaggerate it. It often helps to start with a stick figure, then add detail.

Steps to create a comic

There’s no single correct way to create a comic. One possible approach:

  1. What’s your comic about? Why you’re using comics, what to include, who the product and comic are for. This chapter is about this step.
  2. Writing the story: create scripts in words, an outline, define characters, setting and dialogue.
  3. Laying out the comic: plan each panel, what to show and how much of it.
  4. Drawing/refining the comic.

What’s your comic about?

Don’t approach the question broadly and vaguely: break it down! Define goals (what to accomplish), length (3-8 panels encouraged; it should fit on a site homepage, a postcard or an e-mail; if longer, consider physical prints), audience (expertise level, context), and a representative use case (help your readers understand why they should care).

Writing the story

When writing a script, you can use a format similar to that of film scripts. Each panel needs four primary elements (see the made-up example after the list):

  1. Setting (defined up front, usually in bold). It can be time of day, location, or maybe what happens in the background. It depends heavily on the audience. The first panel can help with the setting (“establishing shot”). There are different graphical ways to convey a setting: the script’s description specifies a concrete one (eg. exterior of coffee shop vs. interior of coffee shop vs. close-up of a coffee cup being served).
  2. Characters (all caps, bold). There are several types: target audience, people who interact with them, and objects/locations that play a significant role (eg. the solution). Target audience characters are typically based on personas; go make them if you don’t have them already.
  3. Dialogue (regular font). It’s defined by more than the text itself: fonts, sizes, colours, bubble shapes or the split into different bubbles are very important, too! The text can be hard to get right: make it fit the character, keep it realistic (avoid marketing jargon and overly enthusiastic conversation). Captions can communicate time, setting, action, and even dialogue, but don’t add unnecessary information in them, and always try to speak from the character’s voice.
  4. Actions (usually italics). They’re what characters do, depicted in the panel art.
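
To make the format concrete, here’s a made-up single-panel script following those conventions (the scenario is invented, and the bold/caps/italics markup is only approximated in plain text):

COFFEE SHOP, MORNING. A long queue at the counter.

ANNA (our target persona) is queuing, phone in hand.

ANNA: Great, I can order ahead and skip the queue.

Anna taps her phone, walks past the queue and picks up her cup.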

How to tell a story: remove everything unnecessary. You can combine several points in a single panel. Show, don’t tell. See examples on p.98-100.

And that’s it for today. In a few days I’ll publish the rest of the summary.

Writing music, printing .gpx files

UPDATE 2012-10-27: I have updated the .gpx viewer to recognise silences! Download the updated version.

Note: if you’re only interested in printing .gpx files, download guitarpro-printer.zip and follow the instructions in the README.txt file.

I have been playing with my latest band for over half a year now. From the beginning the idea was to write and play our own material, but we had actually been mostly playing covers. After some time practising, and after having our first gig, we decided to start focusing more on writing our own songs. That’s something I had never done, so it sounded intriguing, challenging and a bit scary (as I essentially don’t know any music theory, and I don’t even own a guitar or keyboard, it seemed extra difficult for me to come up with ideas for new songs). So I decided to try it out, and that meant looking for a way to try out ideas and record them.

I tried many different programs both for Linux and for Android (I even tried a couple of Windows programs under emulation, but they seemed terribly complex), but nothing was really satisfactory. After searching a lot, I found Guitar Pro for Android (a tablature editor). It wasn’t really what I was thinking about at first, but I found that it was actually the best for my needs: thinking in terms of tabs is easier for me, as I don’t really know music but I have played a bit of guitar. Guitar Pro for Android is supposed to be mostly a tab viewer, but it does have a small feature for “notes”. The idea is that you’re on the bus or wherever, and come up with some musical idea: in that case, Guitar Pro allows you to quickly write it down. As you can listen to what you’re writing, I don’t need an actual guitar or bass to check whether what I’m writing really sounds the way I think it does.

Guitar Pro for Android works fairly well for my needs, but something really bugged me: you can only export the music you have written to the .gpx format, which didn’t seem to be supported by any open source program I knew of. That really pissed me off, because it looked like I would be forced to buy Guitar Pro for Linux in order to print the music I had written (I wanted to do so in order to distribute it to my bandmates). After searching the net for a while I found the excellent alphaTab library, but it seemed not to recognise many of the parts I had written, which was a disappointment. See below for the nerdy details, but long story short, I slightly improved alphaTab to support Guitar Pro mobile’s .gpx files, so now I can print all the music I write, w00t! You can download a little Guitar Pro file viewer/printer I wrote using alphaTab. It’s basically a webpage that can show Guitar Pro files; see the README.txt for details.

Now on to the technical details. You can skip the rest of the blog post if you’re not interested ;-) alphaTab is an open source library that can read several Guitar Pro formats. It can play them, render them in the browser, and do other things. It’s written in a language called Haxe, which compiles to, among other languages, Javascript. If you download the alphaTab distribution you’ll get a Javascript version of the software that you can use directly in your browser, which is really cool and already does a bunch of stuff, but there were two changes I wanted to make: fix the bug that made it not understand most of the files Guitar Pro mobile produced, and add an option to upload files from the browser (as distributed, the example programs read tabs directly from a path relative to the programs).

For the first, I debugged a bit and quickly realised that the problem was that alphaTab was being too strict: Guitar Pro mobile was producing some “empty” notes and beats, and that made alphaTab consider the file corrupt and not show it at all. Adding some code to ignore those empty notes seemed enough to make alphaTab render the file. I filed bug #31 on GitHub and added the ugly patch I made :-)

For the second, as the alphaTab API needed a URL to load the tablature file, I had to learn a bit more about the Javascript File API and be a bit creative, replacing the file loader in alphaTab with something that would load the local file using the FileReader object, as you can see here:

function handleFileSelect(evt) {
  // Fake the getLoader function to make it support data
  // URIs: apparently jQuery can't make an Ajax request to
  // a data URI so this was the best I could think of.
  alphatab.platform.PlatformFactory.getLoader = function() {
    return {
      loadBinary: function(method, file, success, error) {
        var reader = new FileReader();
        reader.onload = function(evt) {
          var r = new alphatab.platform.BinaryReader();
          r.initialize(evt.target.result);
          success(r);
        };
        reader.readAsBinaryString(file);
      }
    };
  };
  document.getElementById('files').style.display = 'none';
  $('div.alphaTab').alphaTab({file: evt.target.files[0]});
}
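
In case it helps, this is roughly how the handler above gets wired up (a hypothetical snippet: it assumes a file input with id “files”, the same element the handler hides after loading):

document.getElementById('files')
        .addEventListener('change', handleFileSelect, false);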

With these two changes, alphaTab finally does what I need, so I don’t need to buy Guitar Pro for Linux just to print tabs. I might buy it anyway for other reasons, but it’s nice to not be forced to do so ;-)

I hope this code and small program is useful to someone. If not, at least I have solved a pretty annoying problem for myself.

Exceptions in Node

Whoa, boy. It seems I haven’t written for a good while now. Let’s fix that. One of the things I had in my list of possible posts was my experiments (and frustrations) with Javascript exception classes under Node, so here we go:

I needed to have several exception classes in Javascript (concretely, for RoboHydra, which works under Node). My first attempt looked something like this:

function NaiveException(name, message) {
    Error.call(this, message);
    this.name = name;
    this.message = message;
}
NaiveException.prototype = new Error();

That seemed to work well, except that the stacktrace generated by such a class doesn’t contain the correct name or the message (notice how I even try to set the message after inheriting from Error, to no avail). My second-ish attempt was to try and cheat in the constructor, and not inherit but return the Error object instead:

function ReturnErrorException(name, message) {
    var e = Error.call(this, message);
    e.name = name;
    return e;
}
ReturnErrorException.prototype = new Error();

That did fix the stacktrace problem, but it breaks instanceof, as the object will be of class Error, not ReturnErrorException. That was kind of a big deal for me, so I kept trying different things until I arrived at this monster:

function WeirdButWorksException(name, message) {
    var e = new Error(message);
    e.name = name;
    this.stack = e.stack;
    this.name = name;
    this.message = message;
}
WeirdButWorksException.prototype = new Error();

This is the only code that seems to do what I want (except that the stack trace is slightly wrong, as it contains an extra line that shouldn’t be there). I tried it in both Node 0.6 and Node 0.8 and the behaviour seems to be the same in both. In case you’re interested, here’s my testing code showing the behaviour of the different approaches:

// NO WORKY (breaks stacktrace)
function NaiveException(name, message) {
    Error.call(this, message);
    this.name = name;
    this.message = message;
}
NaiveException.prototype = new Error();
 
// NO WORKY (breaks instanceof; also stacktrace w/ 2 extra lines)
function ReturnErrorException(name, message) {
    var e = Error.call(this, message);
    e.name = name;
    return e;
}
ReturnErrorException.prototype = new Error();
 
// WORKS (but has extra stacktrace line)
function WeirdButWorksException(name, message) {
    var e = new Error(message);
    e.name = name;
    this.stack = e.stack;
    this.name = name;
    this.message = message;
}
WeirdButWorksException.prototype = new Error();
 
[NaiveException,
 ReturnErrorException,
 WeirdButWorksException].forEach(function(eClass) {
    var e = new eClass('foo', 'bar');
 
    console.log(e.stack);
    console.log(e instanceof eClass);
    console.log(e instanceof Error);
});
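
A variant that might avoid the extra stacktrace line is V8’s non-standard Error.captureStackTrace (Node runs on V8). This is just a minimal sketch, assuming the API behaves the same way on Node 0.6/0.8; I haven’t tested it as thoroughly as the versions above:

// Relies on the V8-only Error.captureStackTrace API
function CaptureStackException(name, message) {
    this.name = name;
    this.message = message;
    // Fills in this.stack, omitting the frames for the
    // constructor itself (hence, no extra stacktrace line)
    Error.captureStackTrace(this, CaptureStackException);
}
CaptureStackException.prototype = new Error();

console.log(new CaptureStackException('foo', 'bar') instanceof Error); // true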

It feels really strange to have to do this to get more or less proper exceptions under Node, so I wonder if I’m doing anything wrong. Any tips, pointers or ideas welcome!

Gustav Vigeland’s sculpture park

Gustav Vigeland’s sculpture park in Frognerparken is without doubt my favourite part of Oslo. It’s “simply” a collection of sculptures of people doing different things, but I have been in love with the park ever since the first time I saw it. I have been there many times and I have taken many pictures of the sculptures, and when I went there again about a week ago I remembered how much I like it and decided it was about time I wrote about it and made my personal “ode” to it.

The most famous sculpture is the “Angry Boy”, a sculpture of a little boy crying. While it’s expressive, funny and original, I think it’s a pity that many visitors seem to only pay attention to that one, and miss the dozens of amazing sculptures around it.

The reason why I like these sculptures so much is that, in my view, they represent the essence of being human. They are completely stripped down, timeless and lacking unnecessary elements. Adding clothes to these sculptures wouldn’t work, because it would make them belong to a concrete time and culture, and thus lose their expressive power. I also like the nakedness because it reminds me of how clothes and many other social conventions often hide how similar we all are, and how we often forget what really matters and what doesn’t. Thus, it’s no surprise I get annoyed when people refer to it as the “park with the naked sculptures” :-) They’re indeed naked, but that description misses the point of the park miserably.

When I think about why I like these sculptures so much, I can’t help thinking about the book “Technopoly” and the book (and movie) “The Road“. I see all three as being about being human and about stopping for a second, forgetting about all the things you assume (as part of your everyday life in whatever society you live in) and considering what you think is actually important; what is “essentially human” and what is simply a detail of the current culture and time; what is strictly necessary and what “needs” are artificial.

If you have never been to the park, here’s a collection of pictures I have taken (the one above is by Dion Hinchcliffe). You can see the full version of these pictures and more on Flickr.

Book summary: Eating Animals

This is my summary of the book “Eating Animals” by Jonathan Safran Foer. Contrary to what the title may suggest, it’s not a “vegetarian book” defending some variant of the argument “animals are cute, don’t kill them”: it’s a book about factory farming (it’s true that one of the conclusions is that you essentially have to go vegetarian to avoid factory farming meat, but this book is for anyone interested in how food is produced). Sadly I had to skip many interesting stories and data in order to give the summary some continuity.

Edit: corrected statement “total collapse of all fished species in 50-100 years” to read “we have depleted large predatory fish communities worldwide by at least 90% over the past 50–100 years” (the article it linked had the second statement, not the first; although it does say “We conclude that today’s management decisions will determine whether we will enjoy biologically diverse, economically profitable fish communities 20 or 50 years from now, or whether we will have to look back on a history of collapse and extinction that was not reversed in time”). I think the original statement is true, but I couldn’t find a reference for it.

Factory farming

Factory farming (and industrial fishing) is a mindset: reducing production costs to the absolute minimum, ignoring or “externalising” costs such as environmental degradation, human disease or animal suffering. Nature becomes an obstacle to overcome.

Factory farming possibly accounts for more than 99% of all animals used for meat, milk or eggs. As for industrial fishing, we have depleted large predatory fish communities worldwide by at least 90% over the past 50–100 years (see also Sylvia Earle’s TED talk, not mentioned in the book but related). It doesn’t help that the so-called “bycatch” is often far larger than the targeted catch: typically 80% to 90% of what is caught (and up to around 98%) is bycatch, which is tossed back (dead) into the ocean.

Diseases/shit

There is scientific consensus that new viruses, which move between animals and humans, will be a major global threat into the foreseeable future. According to the WHO, the “World is ill-prepared for ‘inevitable’ flu pandemic”. Factory farm conditions encourage diseases in animals (some of them virtually unknown outside of factory farming), which end up in the actual food in the supermarkets. It’s even worse considering the animals are constantly fed antibiotics (livestock gets almost 6 times more antibiotics than humans… if you trust the industry’s own numbers!), making the resulting diseases much harder for humans to fight off. The whole of chapter 5 is filled with descriptions of filthy, dangerous and disgusting practices that are absolutely common and normal in (US, at least) factory farming.

Factory farming animal shit is a big problem, both because of its quantity and because it’s so poorly managed: it kills wildlife and pollutes air, water, and land in ways devastating to human health. Its polluting strength is 160 times greater than municipal sewage, and yet there’s almost no waste-treatment infrastructure for farmed animals. Ignoring these problems is part of why factory farming is so “efficient”. The problem, of course, is not the shit in itself but the desire to eat so much meat and pay very little for it.

Environment

Simply put, someone who eats factory farmed animals regularly can’t call herself an environmentalist.

It takes 6 to 26 calories fed to an animal to produce 1 calorie of animal flesh (the vast majority of the food produced in the US is fed to animals). The UN special envoy on food called using 100 million tons of grain and corn for biofuels a “crime against humanity”, but what about animal agriculture, which uses more than 700 million tons of grain and corn per year, much more than enough to feed the 1.4 billion humans in poverty?

The UN’s FAO summarised it in “Livestock’s Long Shadow — Environmental Issues and Options” (which has been criticised, BTW!):

The livestock sector emerges as one of the top two or three most significant contributors to the most serious environmental problems, at every scale from local to global. The findings of this report suggest that it should be a major policy focus when dealing with problems of land degradation, climate change and air pollution, water shortage and water pollution and loss of biodiversity. Livestock’s contribution to environmental problems is on a massive scale [...]

Factory farmer perspective

Some interesting comments from a factory farmer (note that I don’t find them convincing, but there are some good points that need to be explained or considered when proposing alternatives to factory farming):

In fact, we have a tremendous system. Is it perfect? No. [...] And if you find someone who tells you he has a perfect way to feed billions and billions of people, well, you should take a careful look. [...] If we go away from it, it may improve the welfare of the animal, it may even be better for the environment, but I don’t want to go back to [...] starving people.  [...] Sure, you could say that people should just eat less meat, but I’ve got news for you: people don’t want to eat less meat. [...] What I hate is when consumers act as if farmers want these things, when it’s consumers who tell farmers what to grow. They’ve wanted cheap food. We’ve grown it. [...] It’s efficient and that means it’s more sustainable.

Closing thoughts

We shouldn’t kid ourselves about the number of ethical eating options. There isn’t enough nonfactory pork in the US to serve New York City. Any ethical-meat advocate who is serious is going to be eating a lot of vegetarian fare.

Ending factory farming will help prevent deforestation, curb global warming, reduce pollution, save oil reserves, lessen the burden on rural America, decrease human rights abuses, improve public health, and help eliminate the most systematic animal abuse in world history.

A good number of people seem to be tempted to continue supporting factory farms while also buying meat outside that system when it is available. [...] Any plan that involves funnelling money to the factory farm won’t end factory farming [...] If anyone finds in this book encouragement to buy some meat from alternative sources while buying factory farm meat as well, they have found something that isn’t here.

Other quotes

I can’t count the number of times that upon telling someone I am vegetarian, he or she responded by pointing out an inconsistency in my lifestyle or trying to find a flaw in an argument I never made (I have often felt that my vegetarianism matters more to such people than it does to me).

Virtually all of us agree that it matters how we treat animals and the environment, and yet few of us give much thought to our most important relationship to [them]. Odder still, those who do choose to act in accordance with these uncontroversial values by refusing to eat animals [...] are often considered marginal or even radical.

It might sound naive to suggest that whether you order a chicken patty or a veggie burger is a profoundly important decision. Then again, it certainly would have sounded fantastic if in the 1950s you were told that where you sat in a restaurant or on a bus could begin to uproot racism.

We can’t plead ignorance, only indifference [...] We are the ones of whom it will be fairly asked, What did you do when you learned the truth about eating animals?

It shouldn’t be the consumer’s responsibility to figure out what’s cruel and what’s kind, what’s environmentally destructive and what’s sustainable. Cruel and destructive food products should be illegal. We don’t need the option of buying children’s toys made with lead paint, or aerosols with chlorofluorocarbons, or medicines with unlabelled side effects. And we don’t need the option of buying factory-farmed animals.

Small experiments with Cherokee

A couple of weeks ago I decided to move my wiki (see Wiki-Toki on GitHub) and my package repository (see Arepa on CPAN) over to a new machine. The idea was to move it to some infrastructure I “controlled” myself and was paying for (mainly inspired by the blog post “A Set of Tenets I Have Set Myself”). As I was curious about Cherokee and this was an excellent opportunity to learn it, I decided to use it as the web server.

I have to say I was pretty impressed by how easy it was to set up. Although I did have several small problems, most of them were less related to Cherokee itself and more to me not being very familiar with Node application configuration outside of Joyent’s environment, or with FastCGI configuration. In particular, the web-based configuration is brilliant: you don’t have to open or know the format of any configuration files, but instead configure everything from a pretty powerful UI (which in the end writes a text configuration file, of course, so you can always automate or generate the configuration if you need to). I already knew this, but seeing it in action was pretty impressive. To avoid security problems with people accessing that configuration interface, there’s this little tool called cherokee-admin that starts another web server with the configuration interface (tip: pass the -b option without parameters if you want to connect to it from a different machine, which is the case unless you’re installing Cherokee on your own PC). On start it generates a random admin password, which you use to log in.

Static content serving, CGI, FastCGI, specifying certain error codes for certain paths, and reverse proxying were all very easy to set up. There was only one small problem I bumped into: tweaking URLs in reverse-proxied responses. In my situation, I was doing reverse proxying from port 443 to port 3000. As the final application didn’t know about the proxy, it generated URL redirections to “http://…:3000/” instead of “https://…/”, so part of the process of proxying was fixing those URLs. Cherokee, of course, supports this out of the box, in a section called “URL Rewriting”. Each entry in that section takes a regular expression and a substitution string. My first attempt (“http://example.com:3000/” -> “https://example.com/”) didn’t work: all URL redirections were changed to “https://example.com/”, disregarding the rest of the URL. After some time trying different things, I decided to try “http://example.com:3000/(.*)” and “https://example.com/$1”. As it turns out, that worked like a charm! The documentation does mention that it uses Perl-compatible regular expressions, but I thought the HTTP reverse proxy documentation could have been more explicit in this regard.
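
If it helps to visualise the difference, here’s a toy Javascript model of the behaviour I observed (my mental model of Cherokee’s rewriting, not its actual code): the substitution string replaces the whole URL, with $1, $2, etc. expanded from the match.

function rewriteModel(url, regex, subst) {
    // Toy model: the substitution string replaces the WHOLE URL,
    // expanding $1, $2, ... from the regex match
    var m = url.match(regex);
    if (!m) return url;
    return subst.replace(/\$(\d)/g, function(whole, n) {
        return m[parseInt(n, 10)] || '';
    });
}

// Without a capture group, the rest of the URL is lost:
rewriteModel('http://example.com:3000/foo/bar',
             /http:\/\/example\.com:3000\//, 'https://example.com/');
// => 'https://example.com/'

// With "(.*)" and "$1", the path survives:
rewriteModel('http://example.com:3000/foo/bar',
             /http:\/\/example\.com:3000\/(.*)/, 'https://example.com/$1');
// => 'https://example.com/foo/bar'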

But apart from that detail, everything was very smooth and I’m very, very happy with it :-)

Book summary: Living with Complexity

This is my summary of the book “Living with Complexity”, by Donald A. Norman (of “The Design of Everyday Things” fame, among others). It’s a book about how it’s wrong to consider complexity (in design) a bad thing. I have skipped a fair deal of the text in the summary, mostly because it wasn’t applicable to my job. I have also reorganised the content a lot.

Main message

Complexity in itself is neither good nor bad; it’s confusion that is bad. Good design can help tame complexity, not by making things less complex, but by managing the complexity. There are two keys to coping with complexity: design that shows its underlying logic to make it understandable, and our own skills and understanding of that logic. People need to take the time to learn, understand, and practice, because systems that deal with complex activities will not be immediately understandable, no matter how well they’re designed.

The misguided cry for simplicity

The argument between simplicity and features is misguided: people want more capabilities and more ease of use, but not necessarily more features or more simplicity. What people want is understandable devices. It’s not even a trade-off between simplicity and complexity because (a) simplicity is not the goal, and (b) you don’t have to give up something in order to have more simplicity.

We seek rich and satisfying lives, which goes along with complexity. Too simple and it’s dull; too complex and it’s confusing. This happens in every subject (also music, movies, etc). The problem is when complexity is arbitrary and unnecessary.

What makes something simple or complex is the conceptual model of the person using it: many activities we think of as “intuitive” now, like swimming or riding a bike, took us years to learn and master. We can often use something by following instructions, without having a good conceptual model. But then we run into problems, and we complain that it’s too complicated. Comparisons with tools like hammers are wrong, because (1) even if the tool is simple, mastering its usage actually takes a long time; and (2) users of those tools typically need many of them, and each one has to be mastered separately.

On anti-social machines

Machines are often anti-social when things go wrong: they become unresponsive and unhelpful, like bureaucracies. The problem is that the designer typically focuses on correct behaviour (ie. not a lot of work goes into error conditions). This is worse when systems are designed by engineers who view things from their logical point of view and feel people “get in the way” (the “foolproof” systems syndrome).

We try to add intelligence to machines, but what they need (and this is seldom considered) is communication skills and etiquette. We often need to change our goals in the middle of an operation, or want to substitute or skip some steps, or do them in a different order, or are interrupted and have to finish a task at a later point. Most systems don’t allow this. Isolated, context-free intelligent tools can’t be sociable.

System thinking

System thinking (considering the entire process as one human-centred design) is the secret to success in services. This is part of what has made Apple so successful: (1) designing cohesive systems, not isolated products; (2) recognising that a system is only as good as its weakest link; and (3) designing for the total experience.

Desire lines (like the messy trails beside the designed paths in the real world) can often teach us something about what people want and how the design might have gone wrong. Neglect of usage patterns can turn simple, attractive items into complicated, ugly ones, and also turn simple components into a complicated combination when they are put together. Everyday life is often complex not because of a single complex activity, but because of many simple activities with different requirements or idiosyncrasies.

Waiting lines

Waiting lines have been studied from the efficiency point of view. But what about the experience? Principles to improve the latter:

  1. Provide a conceptual model (perhaps the most important principle): uncertainty is an important cause of emotional irritation; people need assurance when they have problems.
  2. Make the wait seem appropriate: people should know why they wait, and agree that it’s reasonable.
  3. Meet or exceed expectations: always overestimate waiting time.
  4. Keep people occupied: it’s psychological time, not physical, that is important. Give people things to do, keep lines moving and make them appear short.
  5. Be fair: emotion is heavily influenced by perceived causal agents.
  6. End strong, start strong: in order of importance, the end, the start, and the middle are the parts to take care of.

The memory of the whole event is more important than the details of what happened. When there are positive and negative components, it’s best to finish strong, segment the pleasure but combine the pain, and get bad experiences out of the way early.

Principles to manage complexity

Note that this list is heavily edited compared to the book.

  1. Make things understandable. Good conceptual models have to be communicated effectively. Jurg Nievergelt and J. Weydert argued for the importance of three knowledge states (sites, modes and trails), which can be translated into three needs: knowledge of the past (knowing how we got into the present state; many systems erase the past, so we may not know how we got there), of the present (knowing the current state: where we stand relative to the starting point and goals, and what actions are possible now) and of the future (knowing what to expect).
  2. Avoid error messages. They indicate the system is confused, not the user. Don’t scold her. Good design means never having to tell the user “that was wrong”.
  3. Structure. Divide the task into manageable modules that are easy to learn, or find a different way to frame the problem so it’s less complicated.
  4. Automation. Many tasks can be simplified by automating them, but this only helps if it’s reliable: when there are errors, it can be more complex to switch back and forth between automated and manual than to not have any automation at all.
  5. Nudges and defaults. Sometimes forcing functions (constraints that prevent unwanted behaviour) are too strong, and all that’s needed is a gentle nudge. Defaults are an effective way to simplify our interaction with the world, and a strong tool to drive people’s behaviour.
  6. Learning aids. Instruction manuals are rarely read. When people use a new service or system, they have a goal, they’re not going to read the instructions first. Most people want “just in time” learning and learn better when they need to. The best explanations are in context and show how the task is done (short video demonstrations work well).

And that’s it. Hope you enjoyed it.

Unit testing advice for seasoned hackers (2/2)

This is the second part of my unit testing advice. See the first part on this blog.

If you need any introduction you should really read the first part. I’ll just present the other three ideas I wanted to cover.

Focusing on common cases

This consists of testing only/mostly common cases. These tests rarely fail and give a false sense of security. Thus, tests are better when they also include less common cases, as they’re much more likely to break inadvertently. Common cases not only break far less often, but will probably be caught reasonably fast once someone tries to use the buggy code, so testing them has comparatively less value than testing less common cases.

The best example I found was in the wrap_string tests. The relevant example was adding the string “A test of string wrapping…”, which wraps not to two lines, but three (the wrapping is done only on spaces, so “wrapping…” is taken as a single unit; in this sense, my test case could have been clearer and used a very long word instead of a word followed by an ellipsis). Most of the cases we’ll deal with will simply wrap a given text in two lines, but wrapping in three must work, too, and it’s much more likely to break if we decide to refactor or rewrite the code in that function with the intention of keeping the functionality intact.
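
As a sketch of the same idea in Javascript (wrapString and its exact semantics are hypothetical, just to illustrate the shape of the tests):

var assert = require('assert');

// Hypothetical greedy wrapper that only breaks on spaces
function wrapString(text, width) {
    var lines = [], current = '';
    text.split(' ').forEach(function(word) {
        if (current === '') {
            current = word;
        } else if ((current + ' ' + word).length <= width) {
            current += ' ' + word;
        } else {
            lines.push(current);
            current = word;
        }
    });
    if (current !== '') lines.push(current);
    return lines;
}

// Common case (two lines): a failure here would be noticed quickly anyway
assert.deepEqual(wrapString('foo bar', 4), ['foo', 'bar']);

// Less common case (three lines, with an unbreakable last word):
// much more likely to break silently in a refactoring
assert.deepEqual(wrapString('A test of string wrapping...', 10),
                 ['A test of', 'string', 'wrapping...']);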

See other examples of this in aa20bce (no tests with more than one consecutive newline, no tests with lines of only non-printable characters), b248b3f (no tests with just dots, no valid cases with more than one consecutive slash, no invalid cases with content other than slashes), 5e771ab (no directories or hidden files), f8ecac5 (invalid hex characters don’t fail, but produce strange behaviour instead; this test actually discovered a bug), 7856643 (broken escaped content) and 87e9f89 (trailing garbage).

Not trying to make the tests fail

This is related to the previous one, but the emphasis is on trying to choose tests that we think will fail (either now or in the future). My impression is that people often fail to do this because they are trying to prove that the code works, which misses the point of testing. The point is trying to prove the code doesn’t work. And hope that you fail at it, if you will.

The only example I could find was in the strcasecmpend tests. Note how there’s a test that checks that the last three characters of the string “abcDEf” (ie. “DEf”) compare as less than “deg” case-insensitively. That’s almost pointless, because if we made that same comparison case-sensitively (in other words, if the “case” part of the function broke) the test would still pass! Thus it’s much better to compare the strings “abcdef” and “Deg”.
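
In Javascript terms (with a hypothetical stand-in for the comparison function), the difference looks like this:

var assert = require('assert');

// Hypothetical stand-in for a case-insensitive comparison
function caseInsensitiveCompare(a, b) {
    a = a.toLowerCase();
    b = b.toLowerCase();
    return a < b ? -1 : (a > b ? 1 : 0);
}

// Weak: this would still pass if the function stopped lowercasing,
// because 'DEf' sorts before 'deg' case-sensitively, too
assert(caseInsensitiveCompare('abcDEf'.slice(-3), 'deg') < 0);

// Better: this fails the moment the "case" part breaks
assert(caseInsensitiveCompare('abcdef'.slice(-3), 'Deg') < 0);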

Addendum: trying to cover all cases in the tests

There’s another problem I wanted to mention, one I have seen several times before, although not in the Tor tests. The problem is making complicated tests that try to cover many/all cases. This seems to stem from the idea that having more test cases is good by itself, when actually more tests are only useful when they increase the chances of catching bugs. For example, if you write tests for a “sum” function and you’re already testing [5, 6, 3, 7], it’s probably pointless to add a test for [1, 4, 6, 5]. A test that would increase the chances of catching bugs would probably look more like [-4, 0, 4, 5.6] or [].
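
A quick hypothetical sketch of that sum example (the sum() here is assumed, just to make the sketch runnable):

var assert = require('assert');

// Hypothetical sum() over an array of numbers
function sum(numbers) {
    return numbers.reduce(function(a, b) { return a + b; }, 0);
}

// Adds little: exercises exactly the same path as [5, 6, 3, 7]
assert.equal(sum([1, 4, 6, 5]), 16);

// More likely to catch bugs: negatives, zero, floats...
assert.equal(sum([-4, 0, 4, 5.6]), 5.6);
// ...and the empty array
assert.equal(sum([]), 0);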

So what’s wrong with having more tests than necessary? The problem is that they make the test suite slower, harder to understand at a glance and harder to review. If they don’t contribute anything to the chances of catching bugs anyway, why pay that price? But the biggest problem is when we try to cover so many cases that the test code generates the test data. In these cases, we have all the above problems, plus the test suite becomes almost as complex as production code. Such tests are much easier to introduce bugs into, harder to follow the flow of, etc. The tests are our safety net, so we should be fairly sure that they work as expected.

And that’s the end of the tips. I hope they were useful :-)

Unit testing advice for seasoned hackers (1/2)

When reviewing tests written by other people I see patterns in the improvements I would make. As I realise that these “mistakes” are also made by experienced hackers, I thought it would be useful to write about them. The extra push to write about this now was having concrete examples from my recent involvement in Tor, which will hopefully illustrate these ideas.

These ideas are presented in no particular order. Each of them has a brief explanation, a concrete example from the Tor tests, and, if applicable, pointers to other commits that illustrate the same idea. Before you read on, let me explicitly acknowledge that (1) I know that many people know these principles, but writing about them is a nice reminder; and (2) I’m fully aware that sometimes I need that reminder, too.

Edit: see the second part of this post.

Tests as spec

Tests are more useful if they can show how the code is supposed to behave, including safeguarding against future misunderstandings. Thus, it doesn’t matter if you know the current implementation will pass those tests or that those test cases won’t add more or different “edge” cases. If those test cases show better how the code behaves (and/or could catch errors if you rewrite the code from scratch with a different design), they’re good to have around.

I think the clearest example was the tests for the eat_whitespace* functions. Two of those functions end in _no_nl, and they only eat initial whitespace (except newlines). The other two functions eat initial whitespace, including newlines… but also eat comments. The tests from line 2280 on are clearly targeted at the second group, as they don’t really represent an interesting use case for the first. However, without those tests, a future maintainer could have thought that the _no_nl functions were supposed to eat that whitespace too, and break the code. That produces confusing errors and bugs, which in turn make people fear touching the code.
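
In Javascript terms, the kind of spec-pinning test I mean would look something like this (eatWhitespaceNoNl is a hypothetical analogue of Tor’s C function):

var assert = require('assert');

// Hypothetical analogue: eats initial spaces/tabs, but not newlines
function eatWhitespaceNoNl(s) {
    return s.replace(/^[ \t]+/, '');
}

// The obvious case:
assert.equal(eatWhitespaceNoNl('  \t foo'), 'foo');

// The spec-as-test case: newlines are NOT eaten. Without this, a
// future maintainer could "fix" the function into eating them
assert.equal(eatWhitespaceNoNl(' \nfoo'), '\nfoo');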

See other examples in commits b7b3b99 (escaped ‘%’, negative numbers, %i format string), 618836b (should an empty string be found at the beginning, or not found at all? does “\n” count as beginning of a line? can “\n” be found by itself? what about a string that expands more than one line? what about a line including the “\n”, with and without the haystack having the “\n” at the end?), 63b018ee (how are errors handled? what happens when a %s gets part of a number?), 2210f18 (is a newline only \r\n or \n, or any combination or \r and \n?) and 46bbf6c (check that all non-printable characters are escaped in octal, even if they were originally in hex; check that characters in octal/hex, when they’re printable, appear directly and not in octal).

Testing boundaries

Boundaries of different kinds are a typical source of bugs, and thus are among the best points of testing we have. It’s also good to test both sides of the boundaries, both as an example and because bugs can appear on both sides (and not necessarily at once!).

The best example is the tor_strtok_r_impl tests (for a function that is supposed to be compatible with strtok_r; that is, it chops a given string into “tokens”, separated by one of the given separator characters). In fact, these extra tests discovered an actual bug in the implementation (ie. an incompatibility with strtok_r). Those extra tests asked a couple of interesting questions, including “when a string ends in the token separator, is there an empty token at the end?” in the “howdy!” example. This test can also be considered valuable as in “tests as spec”, if you consider that the answer to the above question is not obvious and both answers could be considered correct.
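
Incidentally, Javascript’s own String.prototype.split answers the “howdy!” question with an empty final token, which shows how non-obvious the choice is:

var assert = require('assert');

// A trailing separator yields an empty token at the end
assert.deepEqual('howdy!'.split('!'), ['howdy', '']);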

See other examples in commits 5740e0f (checking if tor_snprintf correctly counts the number of bytes, as opposed to the characters, when calculating if something can fit in a string; also note my embarrassing mistake of testing snprintf, and not tor_snprintf, later in the same commit), 46bbf6c (check that character 21 doesn’t make a difference, but 20 does) and 725d6ef (testing 129 is very good, but even better with 128—or, in this case, 7 and 8).

Testing implementation details

Testing implementation details tends to be a bad idea. You can usually argue you’re testing implementation details if you’re not getting the test information from the APIs provided by whatever you’re testing. For example, if you test some API that inserts data in a database by checking the database directly, or if you test that the result of a method call was correct by checking the object’s internals or calling protected/private methods. There are two reasons why this is a bad idea: first, the more implementation details your tests depend on, the fewer implementation details you can change without breaking your tests; second, your tests are typically less readable because they’re cluttered with details, instead of meaningful code.

The only example of this I encountered in Tor was in the compression tests. In this case it wasn’t a big deal, really, but I have seen this before in much worse situations and I feel it illustrates the point well enough. The problem with that deleted line is that it’s not clear what its purpose is (it needs a comment), plus it uses a magic number, meaning that if someone ever changes that number by mistake, it’s not obvious whether the problem is in the code or in the test. Besides, we are already checking that the magic number is correct by calling detect_compression_method. Thus, the deleted memcmp doesn’t add any value, and makes our tests harder to read. Verdict: delete!
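
A Javascript sketch of the same smell (all names and magic values here are hypothetical stand-ins, not Tor’s actual code):

var assert = require('assert');

// Hypothetical stand-ins, just to make the example self-contained
function compress(s) { return '\x1f\x8b' + s; }  // fake gzip header
function detectCompressionMethod(data) {
    return data.charCodeAt(0) === 0x1f ? 'gzip' : 'unknown';
}

var data = compress('some payload');

// Brittle: checks the compressor's magic bytes directly, which is an
// implementation detail and an unexplained magic number in the test:
//     assert.equal(data.charCodeAt(0), 0x1f);

// Better: the public API already tells us what we want to know
assert.equal(detectCompressionMethod(data), 'gzip');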

I hope you liked the examples so far. My next post will contain the second half of the tips.