Negative Decrease from LinkedIn Not Misunderstood Incorrectly

This is more of a rant, but there is a point regarding cognitive friction in interfaces, in this case a simple email message.

The other day I received this email message, and yes, I’m pedantic enough to insist that they are “email messages”, not “emails”, or heaven forfend, “e-mails”. Oh, and on behalf of the C programmer community, thank you, gcc, for teaching us the word “pedantic”. I’m also pedantic enough to have written an application for spell-checking Javadoc and strings in Java code.

Back to my rant, here is the message:

So … what is a “-11% decrease”? Isn’t a negative decrease an increase? But that down arrow before the minus sign is bothersome as well. So, is it a down -11% decrease?

The subject of the email message was

Add skills like “Swing” to make your profile easier to find

Just very quickly: “like” doesn’t mean “such as”: “like” means “similar to but not x”, whereas “such as” means “x or similar”.

Second point: “Swing” should be in quotation marks only if it’s a quote. Quotation marks are not for emphasizing words. (Yeah, and it’s ironic that this WordPress theme (Reddle) precedes block quotes with an open quote, leaving me wondering whether posting this would fail with the message “error: missing terminating ‘"’ character”.)

Oh, and if you’re curious about my LinkedIn profile, here it is.

Efficient Way to Improve Console Productivity

One of the principles of efficiency when programming is to avoid unnecessary repetition (DRY), be it in code, or in physical and mental tasks. My goal is to write at least one tweak per day to my environment, whether a macro in my editor (Emacs), or a shell script or alias in my console environment, which is Z shell. (Note that this applies to other shells also.)

Reading Matthew Might’s article on console hacks inspired me to mine my shell history to quickly and easily find candidates for aliases and functions in Z shell.

The following command sorts the history by frequency of use:

% history -1000000 | cut -c8- | sort | uniq -c | sort -n | tail -15

The output from my recent work with DoctorJ is:

     11 cd ~ijdkproj
     12 gitpullall
     12 myprod.rb
     12 rake
     13 ..
     14 gitdfs
     15 c
     15 la
     20 sd
     24 scrub
     27 git status
     28 git push -u origin master
     37 git diff
     39 ls
     51 gct
     52 gitdelta.rb

(If you’re wondering what the “..” command is, that’s an alias for “cd ..”.)

In the frequency list the salient command in terms of length is “git push -u origin master”, executed 28 times. So we can consider an alias for it, such as “gpom”, and see whether it already exists as a function, alias, or program:

% which gpom
gpom not found

Since it doesn’t already exist, we can add it to the list of aliases (mine are in ~/System/Zsh/aliases.zsh):

alias gpom='git push -u origin master'

On GitHub I’ve posted my Z shell configuration, with the updated aliases file here.

Unit Testing Principles

Unit testing is highly emphasized for writing the initial version of code, the test-driven design approach. But I think that unit tests are even more important in what I consider to be the most difficult phase of a project: maintenance. As a project ages, its code base grows in size and complexity, and it is absolutely necessary to go back into a project and refactor it, eliminating obsolete code and merging common code.

That is where unit tests are at their most valuable: keeping the project stable. The tests provide something that is often no longer on the project: the original concept. That is, the programmer who wrote the first version years before may no longer even be at the same organization/company, so only the unit tests he wrote can “speak” for his intent.

Having rewritten DoctorJ and DiffJ lately, I was very thankful to have a full set of tests, which allowed me to refactor the projects with more confidence.

Unit tests should start with the most basic functionality, on the assumption that even the simplest code does not work. I thought it was overkill when I was writing tests like:

    Foo f = new Foo(x, y);
    assertEquals(y, f.getY());

until the first time I saw something more or less like:

    public int getX() {
        return x;
    }

    public int getY() {
        return x;
    }

That is one reason to avoid copying and pasting.
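Here is a minimal sketch of a test that catches this, assuming a hypothetical Foo with two int fields (the post does not show Foo itself, and FooGetterCheck is a name made up for illustration). The point is to pass distinct values for x and y: with new Foo(3, 3), the copy-and-paste bug would pass unnoticed.

```java
// Hypothetical class for illustration; the post does not show Foo's definition.
class Foo {
    private final int x;
    private final int y;

    Foo(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() {
        return x;
    }

    int getY() {
        return y;   // the copy-and-paste bug would return x here
    }
}

class FooGetterCheck {
    // Returns true when both getters return the values given to the constructor.
    static boolean gettersMatch(int x, int y) {
        Foo f = new Foo(x, y);
        return f.getX() == x && f.getY() == y;
    }

    public static void main(String[] args) {
        // Distinct values: a getY() that returned x would yield 3, not 7.
        System.out.println(gettersMatch(3, 7));
    }
}
```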

A guideline that I like is to try to have one test case per main (meaning non-test) class, and to have the overall test code exceed the size of the code being tested, in number of lines. That makes writing tests a goal in itself, as opposed to just a means.

One of the problems with some unit test frameworks – notably, in my experience, Ruby’s Test::Unit and Java’s JUnit – is that assertions tend to be so low level that one has to read the original code to understand the context. For example, this output from JUnit doesn’t provide enough information:

junit.framework.AssertionFailedError: expected:<8> but was:<null>
	at junit.framework.Assert.fail(Assert.java:50)
	at junit.framework.Assert.failNotEquals(Assert.java:287)
	at junit.framework.Assert.assertEquals(Assert.java:67)
	at junit.framework.Assert.assertEquals(Assert.java:74)
	at org.incava.ijdk.util.TestListExt.testGet(TestListExt.java:30)

All that is known from that output is that the ListExt.get() method was being tested, and that null was returned instead of 8. Looking at the source provides some more information:

        assertEquals(new Integer(8), ListExt.get(NUMS, -1));

Sort of. We have yet another step now, to look at the declaration of NUMS:

    public static final List<Integer> NUMS = java.util.Arrays.asList(new Integer[] { 2, 4, 6, 8 });

Thus it took three (or four) steps to get from the JUnit output to what the failure actually meant: that “using the list ‘2, 4, 6, 8’, ListExt.get(-1) returned null instead of the expected 8”.

The test case can be significantly improved by following the guidelines for tests, that their output consist of:

  • the input
  • the expected result
  • the actual result

(Actually, that goes for bug reporting as well, or any kind of testing.)

One problem with unit tests is that they tend to consist of repeated code, such as:

        assertEquals(new Integer(2), ListExt.get(NUMS, 0));
        assertEquals(new Integer(4), ListExt.get(NUMS, 1));
        assertEquals(new Integer(6), ListExt.get(NUMS, 2));
        assertEquals(new Integer(8), ListExt.get(NUMS, 3));

Custom assertions – each an assertion for testing a single method – reduce the repeated code:

    public <T> void assertListExtGet(T exp, List<T> list, int idx) {
        assertEquals(exp, ListExt.get(list, idx));
    }

        assertListExtGet(2, NUMS, 0);
        assertListExtGet(4, NUMS, 1);
        assertListExtGet(6, NUMS, 2);
        assertListExtGet(8, NUMS, 3);

Also note that with JUnit 3.x and Java 1.5+, autoboxing makes a call such as assertEquals(8, ListExt.get(NUMS, 3)) ambiguous between assertEquals(Object, Object) and assertEquals(int, int), which is why the first case uses the “new Integer(x)” calls. With the refactored code, the autoboxing is done in the call to assertListExtGet.
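To see the overload issue in isolation, here is a small sketch that uses two stand-in methods instead of JUnit itself. The names OverloadDemo, check and checkGeneric are made up for illustration; check only reports which overload the compiler picked, and Integer.valueOf stands in for new Integer.

```java
import java.util.Arrays;
import java.util.List;

class OverloadDemo {
    // Stand-ins for JUnit 3's assertEquals(int, int) and assertEquals(Object, Object);
    // they only report which overload the compiler selected.
    static String check(int expected, int actual) {
        return "int";
    }

    static String check(Object expected, Object actual) {
        return "Object";
    }

    // Like the custom assertion: T is a reference type, so an int argument is
    // boxed at the call site and only check(Object, Object) applies in here.
    static <T> String checkGeneric(T expected, T actual) {
        return check(expected, actual);
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(2, 4, 6, 8);

        // check(8, nums.get(3));  // does not compile: reference to check is ambiguous
        System.out.println(check(Integer.valueOf(8), nums.get(3)));  // both arguments boxed
        System.out.println(check(8, 8));                             // both arguments primitive
        System.out.println(checkGeneric(8, nums.get(3)));            // boxing done by the generic call
    }
}
```

The mixed call (an int literal with an Integer) is the only one the compiler rejects, because each overload needs a boxing or unboxing conversion and neither is more specific than the other.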

With that, we can refine the custom assertion to produce a more informative error message:

    public <T> void assertListExtGet(T exp, List<T> list, int idx) {
        String msg = "ListExt.get of '" + idx + "' in list: '" + list + "'";
        assertEquals(msg, exp, ListExt.get(list, idx));
    }

And the resulting output is very much like what we earlier had to discern by going through the code:

junit.framework.AssertionFailedError: ListExt.get of '-1' in list: '[2, 4, 6, 8]' expected:<8> but was:<null>
	at junit.framework.Assert.fail(Assert.java:50)
	at junit.framework.Assert.failNotEquals(Assert.java:287)
	at junit.framework.Assert.assertEquals(Assert.java:67)
	at org.incava.ijdk.util.TestListExt.assertListExtGet(TestListExt.java:21)
	at org.incava.ijdk.util.TestListExt.testGet(TestListExt.java:40)

The output is much more informative, and lets debugging start directly in the source being tested, bypassing the unit test code.

Update: I’ve expanded on this with a series of posts I wrote while refactoring code in DoctorJ, beginning with this post.

Error Messages

Error messages should state both the input being handled and the actual problem, as precisely as possible. Many (most?) error messages are just of the generic form “an error occurred” (sometimes that exact message), without any further information.

To illustrate the difference, some examples, from least to most informative:

	invalid account data
	Email invalid
	Email address invalid
	Email address in invalid format
	Email address not in the correct format
	Email address not in the correct format: foo.bar@baz
	Email address "foo.bar@baz" not in the correct format
	Email address "foo.bar@baz" not in the correct format: invalid domain name
	Email address "foo.bar@baz" not in the correct format: "baz" is not a valid domain name.
	User name invalid
	User name invalid: john gallagher
	User name "john gallagher" not valid: invalid character.
	User name "john gallagher" not valid: invalid character: " ".
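As a rough sketch of how the most informative form of the email message could be produced: the class and method names here (EmailCheck, quote, describeProblem) are hypothetical, and the validity rule is a deliberately simplified assumption, not a real email validator.

```java
class EmailCheck {
    // Wraps a value in double quotes so leading and trailing whitespace is visible.
    static String quote(String s) {
        return "\"" + s + "\"";
    }

    // Returns null if the address passes this simplified check, otherwise an
    // error message naming both the input and the specific problem.
    // Assumption for illustration: a domain is treated as valid only if it
    // contains a dot and neither starts nor ends with one.
    static String describeProblem(String address) {
        String prefix = "Email address " + quote(address) + " not in the correct format: ";
        int at = address.indexOf('@');
        if (at < 0) {
            return prefix + "missing \"@\".";
        }
        String domain = address.substring(at + 1);
        int dot = domain.indexOf('.');
        if (dot <= 0 || domain.endsWith(".")) {
            return prefix + quote(domain) + " is not a valid domain name.";
        }
        return null;
    }

    public static void main(String[] args) {
        // Prints the last of the email messages listed above.
        System.out.println(describeProblem("foo.bar@baz"));
    }
}
```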

Quoting the arguments is valuable, especially when they are strings and at the end of a message. For example, the second message below makes it clear that the cause of the error is that there are trailing spaces in the URL:

	Invalid URL: http://www.foo.com   
	Invalid URL: "http://www.foo.com   "

The principle of being precise means that error messages should not be of the format “x or y occurred”, for example, “The database could not be connected to or there was an error during initialization.”

As a programmer, I can clairvoyantly read that code:

    try {
        db.connect();
        db.initialize();
    }
    catch (Exception e) {
        throw new DBException("The database could not be connected to or there was an error during initialization.");
    }

That’s just irrational laziness, leading to user frustration (and calls to support). The basic principle is: give the user the information to solve their own problem.
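A more precise version might look something like the sketch below. Database and DBException are hypothetical stand-ins here (the post does not show the real types), as are DatabaseStartup and the name parameter. Each step gets its own try block, so the message names the step that failed, quotes the input, and passes along the underlying cause.

```java
// Hypothetical types for illustration; the post does not show Database or DBException.
interface Database {
    void connect() throws Exception;
    void initialize() throws Exception;
}

class DBException extends Exception {
    DBException(String message, Throwable cause) {
        super(message, cause);
    }
}

class DatabaseStartup {
    // Runs each step in its own try block so the message names the step that failed.
    static void start(Database db, String name) throws DBException {
        try {
            db.connect();
        }
        catch (Exception e) {
            throw new DBException("Could not connect to database \"" + name + "\": " + e.getMessage(), e);
        }
        try {
            db.initialize();
        }
        catch (Exception e) {
            throw new DBException("Connected to database \"" + name + "\", but initialization failed: " + e.getMessage(), e);
        }
    }
}
```

With this split, the user (or the support desk) at least knows which of the two things went wrong, and for which database.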