A Day Refactoring Code, Part 3: Naming

“Naming is the essence of programming.” -Fred Brooks, The Mythical Man-Month

Continuing from part 2 of this series, let’s now revisit our names, now that we have a better understanding of the behavior here.

PosString hasn’t grown on me as a name, and taking a bit of inspiration from the GString in Groovy, let’s rename it PString, which suggests “Processable String” and “Positionable String”. We’ll also improve the field names “pos” and “len” by renaming them as “position” and “length”. “str” is also quite short, but is a common name for a string, whereas “pos” and “len” are less common.

Going back to ParsingSpellChecker (which I’ll refer to as PSC), PString now has all of its low-level string processing functionality. There are methods in ParsingSpellChecker that seem more specific to its domain, such as skipLink().

The edge condition here would be a method such as skipThroughWord(), which advances to the next character that is not a letter or digit. As the descriptions of the methods get longer, they are more likely to be too specific for usage in a general class.

So that’s it for PString. Let’s start migrating ParsingSpellChecker to use it. First is that we’ll keep the instance fields as public, and just switch the
references:

            this.pstr = new PString(str);
            this.str = pstr.str;
            this.len = pstr.length;

Through the code we replace “this.pos” with “this.pstr.position”, so again the low-level behavior is the same, just with ParsingSpellChecker being highly coupled to PString, for now.

We’ll do likewise with the len/length fields, so now our PSC methods have code such as:

    protected void consumeTo(String what) {
        while (this.pstr.position < this.pstr.length && this.pstr.position + what.length() < this.pstr.length && !this.str.substring(this.pstr.position).startsWith(what)) {
            ++this.pstr.position;
        }
    }

We’re midstream on this task, but given our status (show by running the alias for “git diff –stat HEAD”) we see that we have 52 changed lines of code, which is a reasonable amount of code for a checkin:

~doctorj% gitdfs
 .../org/incava/text/spell/ParsingSpellChecker.java |   52 +++++++++-----------
 1 files changed, 23 insertions(+), 29 deletions(-)

We check it in and get back to work:

~doctorj% git commit -m "Began transition for ParsingSpellChecker to use PString." .
[master 95db8c1] Began transition for ParsingSpellChecker to use PString.
 1 files changed, 23 insertions(+), 29 deletions(-)

We remove this.str references, replacing them with “this.pstr.str”, then began “flipping” the methods in PSC to call their PString equivalents, such as this change (diff) for currentChar():

     protected Character currentChar() {
-        return this.str.charAt(this.pstr.position);
+        return this.pstr.currentChar();
     }

We make the transition a bit easier by renaming “consume()” to “consumeFrom()”, matching our PString method advanceFrom().

We go through the PSC code, replacing usage of PString fields with accessors and the equivalent mutators, such as in skipBlanks():

    // before

    protected void skipBlanks() {
        while (this.pstr.position + 2 < this.pstr.length && currentChar() != '<' && !this.str.substring(this.pstr.position, this.pstr.position + 2).equals("{@") && !Character.isLetterOrDigit(currentChar())) {
            ++this.pstr.position;
        }
    }

    // after

    protected void skipBlanks() {
        while (this.pstr.hasChar() && !this.pstr.isMatch("<") && !this.pstr.isMatch("{@") && !Character.isLetterOrDigit(currentChar())) {
            this.pstr.advancePosition();
        }
    }

That’s still the worst offender in terms of “width”, the number of characters wide a line is, which is a good way to find candidate code to be rewritten “skinnier”. (Other ways to find candidate code: by number of lines, and by depth of nested blocks, a variant of width).

In the next part of this series we’ll wrap it up with a look at performance and code quality.

Advertisements

A Day Refactoring Code, Part 2: Unit Testing

Continuing from part 1, we’ll begin writing test code from low-level functionality to higher-level, essentially paralleling the class as it grows in functionality as well.

It might seem like overkill to test very low-level code (such as getX()), but without low-level tests, a programmer may find himself analyzing the code at an improper level, mistaking for complex errors problems that are more fundamental.

Not testing low-level functionality led to a problem when I was prototyping a Javadoc parser (for DoctorJ). I wrote the original prototype (redundancy alert) in Ruby, then began migrating it to JRuby (intending to link it with Java code).

When I switched from Ruby to JRuby, there were failed unit tests, and I assumed that the problem was in my code, not in JRuby. (Yes, I adhere to the principle that “select ain’t broken”.)

Eventually, after much work, I discovered that a bug in the Regexp.last_match functionality in JRuby meant that the very high-level parsing functionality unit tests failed, and it took a lot of effort to drill down through the parsing code to discover that the issue was with the Regexp engine itself. I was thankful for the opportunity to contribute a fix for the issue, which was released with JRuby 1.6.5.

If I had had simpler tests, I probably would have found the issue sooner.

Back to unit testing in general, a good rule is to write tests in the order of the number of instance variables they test, either directly or indirectly. Thus the simplest tests would invoke methods that use only one instance variable (such as getX and setX). Subsequent tests should accrete in complexity, testing more instance variables.

This is essentially taking the concept from integration tests (multiple components) within unit testing itself, seeing variables (and sequences of methods and usage of multiple classes) as components as well. Thus unit testing begins with the smallest *unit* (variables), and expands to the point of being in the gray area where integration tests begin.

Back to this example, the next method to test is the one returning the character at the current position. Again we’ll write a custom assertion, then the test itself.

Implementing the tests such that it advances the position beyond the end of the string results in an exception:

        // out of range:
        str0.advancePosition(12);
        assertCurrentChar(null, str0);
java.lang.StringIndexOutOfBoundsException: String index out of range: 15
	at java.lang.String.charAt(String.java:694)
	at org.incava.text.PosString.currentChar(PosString.java:75)
	at org.incava.text.TestPosString.assertCurrentChar(TestPosString.java:36)
	at org.incava.text.TestPosString.testCurrentChar(TestPosString.java:84)

That shows that currentChar() should be updated to test that it is within bounds:

from:

        return this.str.charAt(this.pos);

to:

        return this.pos < this.len ? this.str.charAt(this.pos) : null;

The next test is for the hasMore method. What is important here is that we are testing the edge condition, for when pos is 1 less than len, is equal to len, and is 1 greater than it.

        str0.advancePosition();
        assertHasMore(true, str0);

        str0.advancePosition(12);
        assertHasMore(true, str0);

        str0.advancePosition();
        assertHasMore(false, str0);

        str0.advancePosition();
        assertHasMore(false, str0);

This seems to work … except that we didn't actually test the edge condition by hand. Why not? Laziness (false), where we just assumed that the string is 13 characters long.

Simplicity

This highlights another tenet of good unit testing: simplicity. The string we tested is so long ("this is a test" is my default string for string/regexp testing) that we didn't do the test "manually".

It's been a while that we've been rewriting the code, so we can consider checking this into our VCS, or we can just keep going.

When I'm on a multi-person project, I (nearly) never check in code that will fail the unit tests. In contrast, in my local environment, I find it easier to restart work after a break (especially when starting work in the morning) to have the project in the state of a failed unit test. This immediately gets me to the latest point of progress, and focuses my attention on the specific problem I was last working on. Buffer lists just aren't enough, since they only show me the latest files that I was working on, but not exactly what.

So we check in the code, then get back to the hasMore method. We made a mistake by, again, starting with overly complicated tests, in this case a string that was too long, so we'll go simple here, adding the field, and amending the method:

    private PosString abc;

    public void testHasMore() {
        assertHasMore(true, abc); // at 'a'
        // no change
        assertHasMore(true, abc);

        abc.advancePosition();  // at 'b'
        assertHasMore(true, abc);

        str0.advancePosition(); // at 'c'
        assertHasMore(true, abc);

        str0.advancePosition(); // off end
        assertHasMore(false, str0);
    }

Did you see the error? That was the first version that I wrote, which led to this failure:

junit.framework.AssertionFailedError: 'this is a test' (2) expected:<false> but was:<true>
	at org.incava.text.TestPosString.assertHasMore(TestPosString.java:44)
	at org.incava.text.TestPosString.testHasMore(TestPosString.java:123)    

‘this is a test’? This error was caused by my not doing a replace-all within the method, instead editing it “by hand”. (I’m convinced that copying and pasting, combined with search and replace, are responsible for 77.1% of software problems.)

We’ll fix our method:

    public void testHasMore() {
        assertHasMore(true, abc); // at 'a'
        // no change
        assertHasMore(true, abc);

        abc.advancePosition();  // at 'b'
        assertHasMore(true, abc);

        abc.advancePosition(); // at 'c'
        assertHasMore(true, abc);

        abc.advancePosition(); // off end
        assertHasMore(false, abc);
    }

Okay, that works, but should it? Is there “more” when the position is at the last character? I say no, so we change the hasMore code to call the method with num == 1:

    public boolean hasNumMore(int num) {
        return this.pos + num < this.len;
    }

We update the test:

        abc.advancePosition(); // at 'c'
        assertHasMore(false, abc);

Now the unit tests run without failure. Note that parameters to methods, and variables in general, have a length in characters proportionate to their scope. It's overkill to write "indexOfTheArray" in a small for-loop, whereas having instance variables named "td" are inadequately descriptive.

Writing the hasNumMore tests gives us a chance to use the Range class from the IJDK library, making for a bit terser and cleaner syntax:

    public void testHasNumMore() {
        for (int i : new Range(0, 10)) {
            assertHasNumMore(true, tenletters, i);
        }
    }

Here we're using another test string, "tenletters", which happens to be ten letters long. Fun, huh?

This is an excellent example of why custom assertions are valuable. Within a loop (checking a list of values), without a message for the assertEquals(), we would not know the value for which the assertion failed. But in this case, with the custom assertion, we get the output showing the value (10) that failed:

junit.framework.AssertionFailedError: pstr: ''tenletters' (0)'; num: 10 expected:<true> but was:<false>
	at org.incava.text.TestPosString.assertHasNumMore(TestPosString.java:51)
	at org.incava.text.TestPosString.testHasNumMore(TestPosString.java:131)

That actually shows that the test is incorrect: there are only 9 more characters from the initial position in the string. So we change the end of the range from 10 to 9 (that is, the range is 0 through 9), rerun, and the tests pass.

From the old class, we will pull the consume and consumeTo, amending their names and their code, starting with advanceTo, which is derived from the old consumeTo:

    public void advanceTo(String what) {
        while (this.pos < this.len && this.pos + what.length() < this.len && !this.str.substring(this.pos).startsWith(what)) {
            ++this.pos;
        }
    }

    public void testAdvanceTo() {
        assertCurrentChar('a', abc);
        abc.advanceTo("b");
        assertCurrentChar('b', abc);
        abc.advanceTo("c");
        assertCurrentChar('c', abc);
        abc.advanceTo("d");
        assertCurrentChar(null, abc);
    }

That raises another question: what happens when we advance to the end of the string? Should the current character be the last one, or should it be null?

We'll amend it to be null, and clean up the code, which has too long of a conditional (and redundant, with the "pos < len && pos + n < len" statements,
adding a new method, isMatch:

        while (this.pos < this.len && !isMatch(what)) {
            this.pos++;
        }

    public boolean isMatch(String what) {
        return this.pos + what.length() <= this.len && this.str.substring(this.pos).startsWith(what);
    }

Bringing over the old consumeFrom method gives us a chance to use the new isMatch method, for clearer code:

    protected boolean consume(String what) {
        skipBlanks();
        if (this.pos + what.length() < this.len && this.str.substring(this.pos).startsWith(what)) {
            this.pos += what.length();
            return true;
        }
        else {
            return false;
        }
    }

    public boolean advanceFrom(String what) {
        if (isMatch(what)) {
            advancePosition(what.length());
            return true;
        }
        else {
            return false;
        }
    }

This is in keeping with the principle of having fewer dots per line. more dots indicates a violation of the Law Of Demeter, with excessive low-level access to instance variables.

Notice that this method, as we've advanced through the class, is using no instance variables directly. Essentially the class is "coming up" from a low-level implementation closer to the point of how it will be used.

We'll bring over consumeFromTo, renaming it advanceFromTo. Again, we have no instance variables directly accessed:

    public void advanceFromTo(String from, String to) {
        if (advanceFrom(from)) {
            advanceTo(to);
        }
    }

We got all the way to here, and from looking at the code, we have an edge condition issue, as to what hasMore means. Does it mean "has more characters remaining to the end of the string", or does it mean "has a current character", that is, is not at the end?

Following the principle of when in doubt, copy, we'll imitate the behavior of the Java collections, specifically Iterable, for which hasNext() means "has a current element", and next() takes it.

So hasMore() will be amended to mean "has a current character", and will be renamed to hasChar(), which is less ambiguous:

    public boolean hasChar() {
        return this.pos < this.len;
    }

Notice that we slipped isMatch into the code, without adding tests for it, so let's fix that now, moving the tests before those for the advanceFrom/To methods, since isMatch is lower level, that is, is used by advanceFrom and advanceTo.

The implementation of assertIsMatch quotes the string. Against this can be useful, in cases where the string would be a whitespace character.

        String msg = "pstr: " + pstr + "; str: '" + str + "'";
        assertEquals(msg, exp, pstr.isMatch(str));

Repetition in Tests

We write the advance* methods, and end up with this one for testAdvanceFromTo():

    public void testAdvanceFromTo() {
        assertCurrentChar('a', abc);
        abc.advanceFromTo("a", "c");
        assertCurrentChar('c', abc);

        assertCurrentChar('t', tenletters);

        // no change
        tenletters.advanceFromTo("e", "n");
        assertCurrentChar('t', tenletters);

        // to first e
        tenletters.advanceFromTo("t", "e");
        assertCurrentChar('e', tenletters);
        assertPosition(1, tenletters);

        // to second e
        tenletters.advanceFromTo("e", "e");
        assertCurrentChar('e', tenletters);
        assertPosition(4, tenletters);

        // to second t
        tenletters.advanceFromTo("e", "t");
        assertCurrentChar('t', tenletters);
        assertPosition(5, tenletters);

        // to third t (next character)
        tenletters.advanceFromTo("t", "t");
        assertCurrentChar('t', tenletters);
        assertPosition(6, tenletters);
    }

By using the higher-level (custom) assertions the code is much cleaner, but there is obvious repetition, so we will extract the common code into a method that executes a subset of behavior and assertions.

    protected void doAdvanceFromToTest(Character expChar, int expPos, PosString pstr, String from, String to) {
        pstr.advanceFromTo(from, to);
        assertCurrentChar(expChar, pstr);
        assertPosition(expPos, pstr);
    }

Our testAdvanceFromTo method is now much cleaner:

    public void testAdvanceFromTo() {
        assertCurrentChar('a', abc);
        doAdvanceFromToTest('c', 2, abc, "a", "c");

        assertCurrentChar('t', tenletters);

        // no change
        doAdvanceFromToTest('t', 0, tenletters, "e", "n");

        // to first e
        doAdvanceFromToTest('e', 1, tenletters, "t", "e");

        // to second e
        doAdvanceFromToTest('e', 4, tenletters, "e", "e");

        // to second t
        doAdvanceFromToTest('t', 5, tenletters, "e", "t");

        // to third t (next character)
        doAdvanceFromToTest('t', 6, tenletters, "t", "t");

        // to end of string
        doAdvanceFromToTest('s', 9, tenletters, "t", "s");

        // off end of string
        doAdvanceFromToTest(null, 10, tenletters, "s", "j");
    }

A simple rule of economics is as cost decreases, demand (usage/consumption) increases, and that rule applied to testing is that as tests are easier (and more fun, frankly) to write, people will write more tests.

Note in the example above, with the low-level functionality and testing put into a separate method, the test code can focus on the edge conditions, such as “what happens if I’m on the same string as the one I want to advance to?” and “what happens when one advances to the end of the string?”

Edge conditions are where code will fail. Beware of thinking that more assertions are better than fewer; quality of assertions is better than sheer mass. (And “Sheer Mass” would be a good name for a rock band.)

In the next post, we’ll discuss naming, “the essence of programming”.

How to Enable Toggling a KVM with Linux Mint

I just got this KVM from Monoprice (for only $27!), and had difficulty setting it up with Linux Mint 11. Specifically, when running under X, hitting the Scroll Lock key twice resulted in no response. However, when running in a VT (that is, switching from X via ctrl-alt-F1), it worked, so my guess is that there is some issue with Gnome misdirecting or misinterpreting the key event.

I eliminated the KVM itself as the problem by running a very long command (“find /”), and hitting Scroll Lock did not stop the output from scrolling, when running in a Gnome terminal. As with the KVM, Scroll Lock did stop the command output when I was running in a VT.

Online there were examples of using xmodmap to enable the Scroll Lock key, but these didn’t work for me. I tried to:

xmodmap -pm
xmodmap -e 'add mod3 = Scroll_Lock'

But neither the KVM nor the command-output tests worked. Then I found this snippet:

xset led 3 && sleep 0.2 && xset -led 3 

I ran it, and lo and behold, it worked. So I wrote it to a script:

echo "xset led 3 && sleep 0.2 && xset -led 3" > ~/bin/toggle_kvm
chmod +x ~/bin/toggle_kvm

Then went to Keyboard Preferences, added that command, and gave it a shortcut of “Scroll Lock”.

So the only difference now is that in Windows, the Scroll Lock key has to be hit twice, but only once in Linux.

A Day Refactoring Code, Part 1: Rewriting

This post continues the overview of code being refactored.

We’ll pull the str, pos, len fields and build a new class around them. Naturally, the first step is writing a unit test, beginning by testing the lowest-level functionality, which set, get, and advance. That low-level functionality is only an intermediate step in getting the class we want, since eventually we’ll have some more expansive behavior.

So for now those methods are public, but we’ll probably eliminate them or reduce their accessibility, thus encouraging usage of higher-level methods instead.

For each method we’ll be testing, we will have a custom assertion, an assertion that essentially wraps the low-level assertEquals with a more expressive message associated with it.

Custom assertions should generally follow the form:

    public void assertMethodName(VType expectedValue, OType objTested, Object ... methodArguments) {
        String msg = "obj: " + objTested + "; arguments: " + methodArguments;
        assertEquals(msg, exp, objTested.methodName(methodArguments));
    }

for example, here is the custom assertion for testing substring, which takes an
integer argument:

    public void assertSubstring(String exp, PString pstr, int pos, int num) {
        String msg = "pstr: " + pstr + "; pos: " + pos + "; num: " + num;
        assertEquals(msg, exp, pstr.substring(pos, num));
    }

This produces the output:

junit.framework.ComparisonFailure: pstr: 'abc' (0); pos: 1; num: 1 expected:<b> but was:<b>
	at junit.framework.Assert.assertEquals(Assert.java:85)
	at org.incava.text.TestPString.assertSubstring(TestPString.java:44)
	at org.incava.text.TestPString.testSubstringWithPosition(TestPString.java:110)

In comparison, unit tests that use the low-level assertEquals such as the follow result in the subsequent unit test output:

    public void assertSubstring(String exp, PString pstr, int pos, int num) {
        assertEquals(exp, pstr.substring(pos, num));
    }
junit.framework.ComparisonFailure: expected:<b> but was:<b>
	at junit.framework.Assert.assertEquals(Assert.java:85)
	at junit.framework.Assert.assertEquals(Assert.java:91)
	at org.incava.text.TestPString.assertSubstring(TestPString.java:48)
	at org.incava.text.TestPString.testSubstringWithPosition(TestPString.java:114)

In the first example, we know the position (1), number (1), and information about PString itself (its wrapped string is “abc”, and it’s at position 0).

In the next post, we’ll delve more deeply into the unit tests.

A Day Refactoring Code, Part 0: Overview

I’ve been rewriting much of DoctorJ, which has been suffering some code rot after several years of existence, now at major release 5.

While poking through the code, I found the class ParsingSpellChecker, which as its name implies, spell-checks, in particular working on Javadoc comment blocks, such as skipping <pre> blocks and "{@link" tags:

package org.incava.text.spell;

import java.io.InputStream;
import java.util.*;
import org.incava.ijdk.util.MultiMap;

public class ParsingSpellChecker {
    /**
     * Whether we can spell check, which we can't until we get a word or a
     * dictionary (which is a thing with a lot of words).
     */
    private boolean canCheck;

    /**
     * The spell checker, which doesn't parse.
     */
    private final SpellChecker checker;

    /**
     * The current string we're working on.
     */
    private String str;

    /**
     * The length of the current string.
     */
    private int len;

    /**
     * The current position within the string.
     */
    private int pos;
    
    public ParsingSpellChecker(SpellChecker checker) {
        this(checker, false);
    }

    public ParsingSpellChecker(SpellChecker checker, boolean canCheck) {
        this.checker = checker;
        this.canCheck = canCheck;
    }

    public boolean addDictionary(String dictionary) {
        this.canCheck = this.checker.addDictionary(dictionary) || this.canCheck;
        return this.canCheck;
    }

    public boolean addDictionary(InputStream dictionary) {
        this.canCheck = this.checker.addDictionary(dictionary) || this.canCheck;
        return this.canCheck;
    }

    public void addWord(String word) {
        this.checker.addWord(word);
        this.canCheck = true;
    }

    public void check(String str) {
        if (this.canCheck) {
            this.str = str;
            this.len = this.str.length();
            this.pos = 0;
    
            while (hasMore()) {
                skipToWord();
                if (hasMore()) {
                    if (Character.isLetter(currentChar())) {
                        checkCurrentWord();
                    }
                    else {
                        // not at an alpha character. Might be some screwy formatting or
                        // a nonstandard tag.
                        skipThroughWord();
                    }
                }
            }
        }
    }

    /**
     * Consumes from one string to another.
     */
    protected void consumeFromTo(String from, String to) {
        if (consume(from)) {
            consumeTo(to);
        }
    }

    /**
     * Skips content between pairs of brackets, a la HTML, such as, oh,
     * "<html>foobar</html>." Doesn't handle slash-gt, such as "<html
     * style="lackthereof" />".
     */
    protected void skipBracketedSection(String section) {
        consumeFromTo("<" + section + ">", "</" + section + ">");
    }

    protected void skipLink() {
        consumeFromTo("{@link", "}");
    }
    
    protected void skipBlanks() {
        while (this.pos + 2 < this.len && currentChar() != '<' && !this.str.substring(this.pos, this.pos + 2).equals("{@") && !Character.isLetterOrDigit(currentChar())) {
            ++this.pos;
        }
    }
    
    protected void skipToWord() {
        skipBracketedSection("code");
        skipBracketedSection("pre");
        skipLink();
        consume("&;nbsp;");
    }

    protected void checkWord(String word, int position) {
        MultiMap<Integer, String> nearMatches = new MultiMap<Integer, String>();
        boolean valid = this.checker.checkCorrectness(word, nearMatches);
        if (!valid) {
            wordMisspelled(word, position, nearMatches);
        }
    }

    protected void wordMisspelled(String word, int position, MultiMap<Integer, String> nearMatches) {
        int nPrinted = 0;
        final int printGoal = 15;
        for (int i = 0; nPrinted < printGoal && i < 4; ++i) { // 4 == max edit distance
            Collection<String> matches = nearMatches.get(i);
            if (matches != null) {
                for (String match : matches) {
                    System.out.println("    near match '" + match + "': " + i);
                    ++nPrinted;
                }
            }
        }
    }

    protected boolean consume(String what) {
        skipBlanks();
        if (this.pos + what.length() < this.len && this.str.substring(this.pos).startsWith(what)) {
            this.pos += what.length();
            return true;
        }
        else {
            return false;
        }
    }

    protected void consumeTo(String what) {
        int len = this.str.length();
        while (this.pos < len && this.pos + what.length() < len && !this.str.substring(this.pos).startsWith(what)) {
            ++this.pos;
        }
    }

    protected Character currentChar() {
        return this.str.charAt(this.pos);
    }

    protected boolean hasMore() {
        return this.pos < this.len;
    }

    protected void checkCurrentWord() {
        StringBuffer word = new StringBuffer();
        word.append(currentChar());
        
        int startingPosition = this.pos;
        boolean canCheck = true;

        ++this.pos;

        // spell check words that do not have:
        //     - mixed case (varName)
        //     - embedded punctuation ("wouldn't", "pkg.foo")
        //     - numbers (M16, BR549)
        while (hasMore()) {
            char ch = currentChar();
            if (Character.isWhitespace(ch)) {
                break;
            }
            else if (Character.isLowerCase(ch)) {
                word.append(ch);
                ++this.pos;
            }
            else if (Character.isUpperCase(ch)) {
                skipThroughWord();
                return;
            }
            else if (Character.isDigit(ch)) {
                skipThroughWord();
                return;
            }
            else {
                // must be punctuation, which we can check it if there's nothing
                // but punctuation up to the next space or end of string.
                if (this.pos + 1 == this.len) {
                    // that's OK to check
                    break;
                }
                else {
                    ++this.pos;
                    while (hasMore() && !Character.isWhitespace(currentChar()) && !Character.isLetterOrDigit(currentChar())) {
                        // skipping through punctuation
                        ++this.pos;
                    }
                    if (this.pos == this.len || Character.isWhitespace(currentChar())) {
                        // punctuation ended the word, so we can check this
                        break;
                    }
                    else {
                        // punctuation did NOT end the word, so we can NOT check this
                        skipThroughWord();
                        return;
                    }
                }
            }
        }

        // has to be more than one character:
        if (canCheck && this.pos - startingPosition > 1) {
            checkWord(word.toString(), startingPosition);
        }
    }

    protected void skipThroughWord() {
        ++this.pos;
        while (hasMore() && !Character.isWhitespace(currentChar())) {
            ++this.pos;
        }
    }
}

The problem with ParsingSpellChecker is that the low-level functionality (going through a string by “position” (current index))and checking against string length) don’t match the more high-level behavior of the class.

Another red flag in the class is there are three instance fields (len, pos, and str) that are very tightly connected, with essentially each of them having to be used with one another.

It is also telling that the indentation levels within the class are very deep, also indicating poor abstraction. The fewer indentation levels, the more likely it is that the code is evenly abstracted at the proper level.

Code repetition is another issue, with statements such as this.pos++ found in many locations.

From running PMD, we know that StringBuilder is preferred over StringBuffer for performance reasons. Similarly, by building its own substrings (via StringBuffer.append()), the code does not take advantage of String.substring, which uses offsets in the newly-created String to refer to a substring within the lower-level char array.

That’s the overview. In the next post, we’ll begin rewriting this mess.

Building Ikea Furniture Ikeasily

This post is about how to build Ikea furniture more easily, in my experience cutting assembly time at least by half, and being much more fun. And it’s very simple: power tools and planning.

Update: Many people finding this page appear to be searching for the hex sizes that Ikea uses. According to this document, Ikea’s sizes are 2mm, 3mm, 4mm, 5mm, and 6mm. A quick look through a couple of Ikea instruction manuals shows that the 3mm and 5mm seem to be common. I’ve also updated the link to the hex driver set at Amazon.

My house is teeming with Ikea goods, which I’ve accumulated over the past 15 years, and I most recently outfitted my home office, including a Galant desk, file cabinet, Expedit bookcase, and the classic Ikea hack monitor stands.

First is planning: completely unpack your boxes, sort out the parts, then look at the directions. Match the parts to the ones in the instructions, which is made easier by comparing the quantity – for example, some of the Ikea screws are notoriously similar, so knowing that you should have 8 of part 123456 makes it easier to match up.

Then go through the directions, and visualize how the components should be put together. It is best to lay out the pieces, arranged as they will be assembled. And be very careful about checking finished versus unfinished sides and edges. Ikea furniture goes together well, but disassembling and reassembling it is a different story altogether.

For the assembly process, first item of business is to throw out those paper clip pseudo-Allen wrenches that come with the parts. Using them provides no leverage, and if your hands get sweaty (hint: they will, as you get more frustrated), you’ll lose the ability to get a decent grip.

Instead, get a cordless screwdriver. Huh? Aren’t all screwdrivers cordless? No, this refers to power screwdrivers, such as this one, which is essentially a watered-down (less powered) drill. If you haven’t used a cordless screwdriver before, you’re in for a treat. A couple of notes: get one with a light, which is essential when working inside (meaning both indoors and inside a piece of furniture). Mine is a Skil, which as a brand is at the lower end of the price spectrum, and has served me well for years.

Why not just use a drill? A couple of reasons: one is that drills have much more power (the one linked above is 3.6 volts, whereas drills are usually 12 to 18 volts). Using a drill is possible but requires a more delicate touch, and the soft metal of Ikea screws leads to them being stripped easily. The bulkiness of drills makes them more difficult to use in tight spaces, and their heavier weight can be fatiguing.

For your screwdriver you’ll want a set of driving bits, specifically the Allen wrench equivalents of the Ikea paper clips. These are usually sold in a set, such as this one, for $11. The important thing is that you’ll want an Allen wrench (hex) bits in sizes 3mm and 5mm. As noted above in the update, the smaller one, 3mm, is used by far the most with Ikea furniture, and the 5mm one is also common. I don’t recall any other sizes that Ikea uses.

The Allen wrench bits will work, of course, in the place of the Ikea paper clips, and you’ll also get some regular screwdriver bits (Phillips head) for the standard screws that Ikea uses. (Going back and forth between Allen and Phillips bits is a good reason to get a cordless screwdriver that can change bits easily, i.e., without tools. Or better yet is to have two cordless screwdrivers, and not have to switch bits at all.)

Again, when using power tools with Ikea, be gentle, especially when at the end of driving a screw. Their woods (MDF) can tend to splinter, especially between the laminated part and the underlying wood, and their screw heads can be stripped easily. If you do that with a Phillips screw (it doesn’t usually happen with Allen screws) then change your driver bit to the largest one you have that fits the newly-stripped screw head, push in very forcibly, then slowly unscrew the screw.

A good habit before driving screws in or out is to “test fire” the screwdriver, with a quick touch to confirm that it is spinning in the right direction. It can be frustrating to be in the wrong direction, and to unscrew a half-driven screw, or vice versa. Again, with Ikea furniture, you don’t want to assemble or disassemble it more than once.

Another tip is to magnetize the screwdriver bits. I have a fairly large refrigerator magnet (1.5″ x 0.5″ x 0.3″ or so) that I rub on the end of the driver bit a few times, thus adding a bit of “stickiness” that helps hold the screws to the bit. Properly magnetized, a screw will hold to the bit in any position, making it far less likely that you drop a screw. I learned this when building computers, where it can be maddening to drop a screw onto/into a motherboard, making it difficult to find and remove.

The last word of advice is that you may want to modify the Ikea instructions as makes sense to you, such as whether you are working on a piece by yourself, and the amount of space that you have. For example, for my Galant desk the instructions say to assemble the desk top-down on the floor, with the legs up, then flip the desk over, onto its legs, when done. In my 10′ x 10′ office, with only me working on this, space and human strength and agility would have made that very difficult, so instead I assembled the legs, set them up normally (feet on the floor), put the desktop on top of the legs, then crawled under the desk and fastened the desktop to the legs. Of course, you should always wear eye protection, but especially when you are under the point where you are working, thus susceptible to falling screws or wood particles.

Put a different way, think of this: you’re building furniture (thus using proper tools and technique), not simply following a set of instructions.