Object Calisthenics: Stretch and Twist …

Quite a while ago I first read Object Calisthenics, by Jeff Bay, in the Thoughtworks Anthology, and it has resulted in pivotal change in how I write software. A testimony to its value to me is that I’ve reread it several times, putting it into what I would consider my desert-island set of books.

A PDF of it is available here.

Initially some of the suggestions struck me as odd, but overall I agreed with them. With the 1.3.0 rewrite of DiffJ, I decided to make more of an effort to put all of the principles into practice, and analyze and review the experience.

The difference was amazing.

The idea of Object Calisthenics is that the high-level concepts and principles of good object-oriented programming can be distilled into low-level rules to follow.

The rules are:

  • One level of indentation per method
  • Don’t use the “else” keyword
  • Wrap all primitives and strings
  • First class collections
  • One dot per line
  • Don’t abbreviate
  • Keep all entities small
  • No classes with more than two instance variables
  • No getters/setters/properties

Some of those rules sound familiar (keeping entities small, avoiding getX/setX), but others seem bizarre: don’t use else? wrapping primitives? no more than two instance variables?

But they work, and make sense.

1. One level of indentation per method.

This rule is simple, and applied, greatly reduces complexity. I avoid long methods — never more than a vertical screenful — and “wide” methods should be equally avoided. A good metric is to consider the complexity of a method equal to the product of its length (lines of code) and its depth (indentation). (Maybe one of these days I’ll revive statj, for Java statistical summaries.)

This rule also reduces complexity because it pushes code out of the depths of the problematic method to the objects being acted on. Thus a refactoring process might go from the original code of:

    public void moveEntities(List<Entity> entities) {
        for (Entity entity : entities) {
            if (isOnScreen(entity)) {
                for (Entity other : entities) {
                    if (entity != other && entity.nextLocation().equals(other.location())) {
                        collide(entity, other);
                    }
                    else {
                        moveEntity(entity);
                    }
                }
            }
        }
    }

to:

    public void moveEntities(List<Entity> entities) {
        for (Entity entity : entities) {
            processEntityLocation(entity, entities);
        }
    }

    public void processEntityLocation(Entity entity, List<Entity> entities) {
        if (!isOnScreen(entity)) {
            return;
        }

        Entity other = getCollisionEntity(entity, entities);
        if (other == null) {
            moveEntity(entity);
        }
        else {
            collide(entity, other);
        }
    }

    public Entity getCollisionEntity(Entity entity, List<Entity> entities) {
        for (Entity other : entities) {
            if (entity != other && entity.location().equals(other.location())) {
                return other;
            }
        }
        return null;
    }

Note the two levels of indentation in the getCollisionEntity method. I haven’t found a way to do the for-if-return loop with only one level.

The primary motivation of this rule is that one is not dealing with multiple levels — both indentation and abstraction — at once.

The bottom line: it works. I’ve written much more straightforward code this way, and the result is small methods that can then be pushed out of the current class to the primary object being acted upon, which generally, is the first parameter of the method.

For example, we could move getCollisionEntity to the Entity class:

public class Entity {
    public Entity getCollisionEntity(List<Entity> entities) {
        for (Entity other : entities) {
            if (this != other && location().equals(other.location())) {
                return other;
            }
        }
        return null;
    }

    // ...
}

2. Don’t use the “else” keyword

The argument is to reduce the number of branches, keeping long conditional blocks out of the code.

This seems less justified than the other rules, but one way I have found it beneficial is to use the ternary operator instead of if-else, which has two results: very simple statements, and concise methods. Of course, this only works if the return types are compatible.

Modifying the code above, the result is:

    return other == null ? moveEntity(entity) : collide(entity, other);

I find that much easier to read, and it avoids the vertical code sprawl.

My spin on this rule is to avoid the “else if” sequence, that is, the long switches as seen below, which can oftentimes be refactored to be more polymorphic.

    if (x) {
        ....
    } else if (y) {
        ....
    } else if (z) {
        ....
    }

3. Wrap all primitives and strings

This I found difficult to get used to, but now I’m a vehement advocate, for a reason not cited in the original paper.

The reason he states is for type-checking and conveying information to another programmers about what the variable actually is.

If you have type checking, use it. A (the?) strength of Java is the compiler (OK, the JVM too), which can find our silly errors, such as:

    void launchMissile(boolean isNuclear, boolean isTest, Coordinates target);

    // run a non-nuclear missile test:
    launchMissile(false, target, false);

It’ll give us a nice helpful friendly message, such as:

Norad.java:13: launchMissile(boolean,boolean,java.lang.String) in mil.norad.Norad cannot be applied to (boolean,java.lang.String,boolean)

However, the compiler can’t help us beyond that. Such as:

    // run a non-nuclear missile test:
    launchMissile(true, false, target);

It might seem like overkill (pun intended) to wrap the boolean (or use enums), but look at the result:

    void launchMissile(MissileType missileType, TestStatus testStatus, Coordinates target);

    // run a non-nuclear missile test:
    launchMissile(MissileType.NON_NUCLEAR, TestStatus.IS_TEST, target);

And, going back to Ruby, because Ruby (unlike Java) allows subclassing of all built-in types, it’s easy to simply derive a new type, and eventually to expose the necessary methods of the superclasses as delegators in the new class, and then to un-derive the class from the primitive type.

The original paper highlights the benefit of wrapping primitive types as for conveying meaning (to the compiler and to the programmer), but I think there’s a more important reason, and benefit, for avoiding primitive types: the fact that it is nearly certain that the behavior of the primitive type will not match how a variable is used.

For example, in a given body of code, one might write this:

    String userName;

That is saying that a user name is a string. Okay, let’s see if that’s right:

  • Strings can have zero length. User names cannot.
  • The length of a String can be 2,147,483,647. User names cannot.
  • Strings can contain spaces. User names cannot.
  • … etc. …

Another problem with this is that Strings have functionality (methods) that makes no sense for a user name. lastIndexOf? replaceAll? split? java.lang.String has 69 constructors and instance methods, and a user name would use only a few of those.

Thus the declaration “String userName” is quite incorrect (as in “lying”). One might say that “a user name is a string, but with certain restrictions, and with only a subset of the functionality of java.lang.String, and with additional behavior and associations.” That is, there (at least conceptually, if not in the code) is a class UserName, which is approximately, but hardly exactly, a string. And the more approximate that equivalence is, the greater the burden on the programmer and on other code to perform the mapping.

4. First class collections

From the original: "any class that contains a collection should contain no other member variables." Said differently, that means that a collection should be wrapped in a class specific to how the collection is used.

Thus the logic for using a collection goes into the wrapper class, instead of in the "client" code. This feels more like working with an object instead of in it.

For example, a set of scores (mapped from a name to a list of scores) could be implemented as a simple class, with the average and total methods added:

class ScoreList
  def initialize
    @scores = Array.new
  end

  def << score
    @scores << score
  end
    
  def [] index
    @scores[index]
  end

  def total
    @scores.inject(0) { |sum, n| sum + n }
  end

  def size
    @scores.size
  end

  def to_s
    @scores.to_s
  end
end

class Scores
  def initialize
    @scores = Hash.new { |h, k| h[k] = ScoreList.new }
  end

  def add name, score
    @scores[name] << score
  end

  def scores name = nil
    name ? @scores[name] : @scores.values
  end

  def average
    @scores.empty? ? nil : total / count
  end

  def total
    sum_of_scores { |list| list.total }
  end

  def count
    sum_of_scores { |list| list.size }
  end

  def to_s
    @scores.inspect
  end

  def sum_of_scores &blk
    scores.inject(0) { |sum, list| sum + blk.call(list) }
  end
end

As method parameters in a type-checked language, it self-documents the method, and keeps the variable types shorter than using the equivalent Java generics:

    public void process(Map<String, List<Integer>> scores);

    public void process(Scores scores);

And with the earlier example, the List could be changed to an Entities class, containing the logic for a set of entities.

For example, my original draft of the Scores class had a Scores class, which was a Hash of Strings to Arrays of integers. When I reviewed this post, I realized that that version was exposing the collection type Array, so I rewrote it with ScoreList wrapping the Array class, as well as Scores wrapping the Map class.

As with primitives, another reason not to use collections directly is that because they are general purpose, they contain methods that very likely don’t apply to how you are using the collection. For example, a Java method returning ArrayList implies that all of the methods of ArrayList are valid, including methods to delete elements from the list.

This rule revolutionized my perspective, and is one that I wish I’d learned and begun practicing years ago. All code I’ve written since applying this is much cleaner and feels much more object-oriented than the legacy practice. Now it’s odd to work directly with a core collection. Maybe Java was right in putting their collections a bit out of the way, in java.util instead of java.lang, to discourage them from being used directly.

5. One dot per line

This is similar to the rule about the “depth” of method blocks, in essence keeping a single level of abstraction.

Applying this to the Ruby code above (Ruby tends to be “horizontal”, with methods chained) eliminates the @scores.values.flatten statements:

class ScoreList
  def average
    @scores.empty? ? nil : total / all_scores.length
  end

  def total
    all_scores.inject(0) { |sum, n| sum + n }
  end

  def all_scores
    scores.flatten
  end
end

It also tends to make one push the car.getEngine().setSpeed(55) chain into the appropriate classes, letting them do the necessary delegating.

Again, this fixes the problem of dealing with too many levels of abstraction at once, and of working through objects instead of on them.

6. Don’t abbreviate

I just saw a violation of this today, a method named “getPWD()”, to return a password. As a long-time Unix user, I read that as “print working directory”, a problem common with abbreviating: the user has have to understand the context, and switch their mind-set appropriately.

Another problem with abbreviations is that can become eclipsed. A piece of legacy code that I maintain is named “IM” (for interface module), obviously written prior to instant messaging, which claimed that abbreviation. (It is ironic that “eclipse” is a word that has — at least to many programmers — has its original definition eclipsed, thanks to the IDE.)

I also agree with the point that methods with long names tend to be misplaced. A common occurrence of that is when nouns are in the message name itself, such as:

    sendMessageToUser(message, user);

which should instead be

    message.sendTo(user);

In general, this rule reflects classic object-oriented design: classes should be nouns, and methods should be verbs.

The corollary for the no-nouns-in-methods subrule is that classes should not contain verbs (or nounified verbs), such as “UserManager” and “MessageSender”.

7. Keep all entities small

This rule seems extreme: no classes more than 50 lines, and no packages with more than 10 files.

I definitely like the concept, that large classes tend to become god classes, doing much more work than they should, instead of that behavior being pushed to the acted-upon classes. And I agree that long classes are difficult to read. Notice how IDEs collapse methods? That’s a sign of “fixing” a problem by hiding it.

My corollary to this is that classes should be some minimal length as well. If a class in a verbose language (Java, C++) is too short (10-20 lines or so), it is usually a C-style struct pretending to be a class (just with getX() instead of a public x). Thus it usually has little behavior defined within that class, putting the onus on client code to do the processing.

However, if this rule is applied rigorously, it could make the classes from the core libraries too simple. A string class certainly needs to have a full suite of behavior to be useful. java.lang.String has 69 methods, and the Ruby String class has 131 methods of its own, with another 45 inherited. A guideline is that code should be more concise and focused the closer it gets to the end user.

That said, I look at my projects with a simple metric: the number of lines per class. There should be a relatively narrow range among the line count, deviating little from the average. That is, a nice, narrow bell curve.

A bad sign — and seen all too often — are the projects with 75% of their files containing fewer than 50 lines of code, and 5% with more than 500 lines. Again, that indicates god classes.

Another problem with large classes is that they are simply too complex to test adequately, whereas small classes, with relatively self-contained behavior, are easy to test. And the simple law of economics applies: the lower the cost, the more the consumption. Thus the cheaper it is to write a test, the more that will be written.

8. No classes with more than two instance variables

This is probably the most bizarre rule of all of these, primarily because it goes against standard practice. However, if one applies the above rules, specifically about not using primitives and collections directly, it is much more feasible.

For example, a game could have a unit with the location and movement as:

public class Unit {
    private int x;
    private int y;
    private int direction;
    private int speed;
}

Applying this rule, we could define the Location and Velocity classes, replacing the above with:

public class Unit {
    private Location location;
    private Velocity velocity;
}

That’s another sign of good design, having variable names that closely match their types.

And as with the above, having the cohesively-defined Location and Velocity classes makes them easier to test, as well as adding behavior to them, such as range validation, such as that a location has to be within the coordinates of a rectangle (such as the screen).

9. No getters/setters/properties

An object should be “set” via its constructor, and low-level changes (meaning to its instance variables) thenceforth should be within the object itself.

Taking the example in rule 8, that would mean that Location could have a move(Velocity) method.

The principle here is that instead of directly manipulating an object (again, working in the object) the object should be “told” what to do, in the spirit of the original concepts of object-oriented programming, sending messages to an object.

The benefit of this rule is that it encourages thinking in terms of what the object does, instead of what the object has, that is, the “verbs” of the object instead of its “nouns”.

I highly encourage everyone to read the original, which is excellent in explaining the mapping from object-oriented principles to the programming practices.

If you liked this post — and who wouldn’t? — you might also like my series of posts about rewriting a class, using many of these rules.

Advertisements

5 thoughts on “Object Calisthenics: Stretch and Twist …

  1. Hey Jeff, great post. I just read the Object Calisthenics paper and was searching for more information on it. That concludes the story of how I found your blog. I’m curious what your thoughts are on Rule 9: No getters/setters/properties. I’ve tried not using getters to retrieve information about an object’s state, but it becomes difficult when you need to output the information, for example to display it. Each object needs to be coupled with each target that it may send information to. Allen Holub showed this with an Exporter and Importer interface that is passed into objects (in this article http://www.javaworld.com/javaworld/jw-01-2004/jw-0102-toolbox.html?page=1). However, his recommendation violates the corollary you mentioned in your post at the bottom of Rule 6: “The corollary for the no-nouns-in-methods subrule is that classes should not contain verbs (or nounified verbs), such as ‘UserManager’ and ‘MessageSender'”. Do you have any additional thoughts on the no getter/setter topic?

    • Thank you for your feedback, and great questions.

      I agree that those two idioms — no getters and setters, and no verbified classes — clash with each other, especially in MVC architectures. From my experience (most recently with Rails) a model is essentially just a data structure. The view defines all the getter functionality, and the controller, the setter code.

      As to his initial example, the Money class, in my view (no pun intended) that is incorrectly named, with its assumptions leading to later problems. If Money is initially defined with the assumption of dollars as being the currency, then instead of Money it should be named Dollars, or Dollars should be a subclass. I once wrote a C++ library that was exactly that, converting currency from Dollars to Marks to Colones, etc, all derived from a Currency class. That allowed client code to use whichever currency they wanted, which would be automatically (via implicit conversion) to the base currency, converted via the exchange rate.

      The option of the MVC architecture is, as Allen describes, causes a choice, between big classes, or with wide class hierarchies. Classes can be “big”, with their controller and view code defined within the class (along with the model, of course). This I generally dislike, because it makes those classes more complex and thus more difficult to test.

      “Wide” class hierarchies are defined with (what I call) parallel classes, such as EmployeeModel, EmployeeView, and EmployeeController (or Exporter and Importer, as he writes), which, as you say, are defining classes with verbs.

      I think what Allen is describing is that a class often isn’t simply a fully-defined class. Its functionality can be defined not only within the class itself, but within the context in which it is used. Thus an Employee isn’t simply an Employee, but has functionality within the context of the UI, the database, etc.

      Given that, I would make the exception of allowing classes to be defined as views and controllers, which, although it violates the verbs-in-class-names principle, is more pragmatic. However, I think that when a programmer is tempted to write a UserManager, he should step back and make certain that he is not simply moving the methods that would properly be defined on the User class, thus making the User class less self-aware, and making it necessary to understand two classes (User and UserManager) instead of just one.

      Pragmatism should rule over principles. Overly adhering to principles can lead to tortured code, as can be seen in the dogma of “only one return statement per method” resulting in code like:

          public boolean canExecute(Action action) {
              boolean executable = false;
              for (Role role : getRoles()) {
                  for (Right right : role.getRights()) {
                      if (right.permits(action)) {
                          executable = true;
                          break;
                      }
                  }
                  if (executable) {
                      break;
                  }
              }
              return executable;
          }
      

      As opposed to the simpler:

          public boolean canExecute(Action action) {
              for (Role role : getRoles()) {
                  for (Right right : role.getRights()) {
                      if (right.permits(action)) {
                          return true;
                      }
                  }
              }
              return false;
          }
      

      As with all of the principles in Object Calistentics, I think that they can lead to much better code, but not always, and this seems to be a case in point.

  2. > I haven’t found a way to do the for-if-return loop with only one level.

    I encountered the same problem when trying to apply Calisthenics rules.
    There is in fact a way to strictly follow rule #1, but:
    * it’s a bit ugly (especially with “enhanced for statements” that first need to be converted in their pre-Java5 format),
    * it sort of breaks rule #2 if ternary conditional operators are not allowed (you explicitly accept them in your article, but from what I’ve read not everyone does).

    Here is that way:

    import java.util.Iterator;
    
    public abstract class ForIfReturn<T, U> {
        public U forIfReturn(Iterable<T> objects) {
            for (T obj : objects) {
                if (someCondition(obj)) {
                    return someValue(obj);
                }
            }
            return defaultValue();
        }
    
        public U noIndentation(Iterable<T> objects) {
            T obj = null;
            boolean someCondition = false;
            for (Iterator<T> iterator = objects.iterator(); iterator.hasNext() && !(someCondition = someCondition(obj = iterator.next()));) {
            }
            return someCondition ? someValue(obj) : defaultValue();
        }
    
        abstract boolean someCondition(T obj);
    
        abstract U someValue(T obj);
    
        abstract U defaultValue();
    }
    

    I reckon there may be neater ways to achieve that, it’s just the one I figured out.
    However, because of the ugliness of that no-indentation format, when doing Calisthenics I allow myself to use the neater for-if-return pattern as the only exception to rule #1.

    ========
    Note that, thanks to Java 8 Streams, the for-if-return pattern can easily be written the functional way (for Collection however, not for Iterable):

    Optional<U> first = objects.stream().filter(this::someCondition).map(this::someValue).findFirst();
    return first.isPresent() ? first.get() : defaultValue();
    

    (I’m not using #orElse because #defaultValue could be heavy calculation.)
    But, hey, it’s not Object Calisthenics anymore, it’s Functional Calisthenics! 😉

    • Correction about the functional code: I was not aware of #orElseGet, so a better way (with call-by-name evaluation of #defaultValue) is simply:

      return objects.stream().filter(this::someCondition).map(this::someValue).findFirst().orElseGet(this::defaultValue);

      (Or with #map and #findFirst inverted.)

  3. Pingback: Object Calisthenics | alexpfx

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s