Stati of Projects

I’ve been jumping around from one project to another on an as-needed basis, and here is the current status of each.

RIEL – I’ve been updating the colorizing code, adding the functionality to use extended color codes on ANSI terminals, instead of the default 10.

Glark – In the midst of a major rewrite, for both code purity and functionality. The main changes so far are extensions to the path/file arguments, bringing in some functionality from find. Glark will also (with the imminent integration of RIEL) support extended colors.

PVN – This project was in heavy development until a month ago, when it went into a state of waiting for updates to Glark, since it will be using Glark for its seek (searching) subcommand.

DiffJ – This project I rewrote in JRuby, but that was a little too slow for a command-line application – even heavily tweaked, I couldn’t get the startup time under two seconds. So I re-rewrote it in Java, and intend to revisit it to add more intelligent code comparisons, such as understanding that “if (a == b) foo();” is the same as “if (a == b) { foo(); }”.

IJDK – Mostly dormant at this point. When I’ve rewritten Java projects, I’ve tried to extract the generic code from them and add them to IJDK, but I haven’t been heavily involved with any of my Java projects lately.

Java-Diff – A while ago this was brought up to date to use generics, and was refactored for code clarity. There hasn’t been any reason to update it since then.

DoctorJ – Alas, this is dormant. Some of its warnings have been integrated into the Java compiler, such as mismatched parameter names. It still goes beyond that level of pedantry, so it is most suitable for projects where documentation is paramount, such as APIs.


Improving Performance of DiffJ/JRuby

DiffJ is nearly ready for release, but I’ve not been content with the performance, which is significantly slower with the JRuby implementation than the pure Java version.

My changes were based on the recommendations on the JRuby wiki.

Before any optimization, a test run of diffj against a pair of Java files ran with the times:

user    :   4.84
system  :   0.19
cpu     : 184.20
total   :   2.72

Following the suggested changes, I added the -client argument to the Java process, which resulted in:

user    :   4.99
system  :   0.21
cpu     : 184.80
total   :   2.80

So performance actually worsened.

Next was passing the -Djruby.compile.mode=OFF argument to the Java process:

user    :   4.92
system  :   0.17
cpu     : 184.80
total   :   2.75

Again, performance worsened slightly.

With both the -client and -Djruby.compile.mode=OFF arguments, performance was still down, as one would expect now:

user    :   4.92
system  :   0.18
cpu     : 186.00
total   :   2.73

So then I more carefully went through my code, looking at the time to require each file, and found two salient problems.
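A minimal sketch of that kind of per-file timing, assuming nothing about how the measurements were actually taken: wrap each require in Benchmark.realtime and list the slowest first. The helper name timed_require and the library names are placeholders, not DiffJ code.

```ruby
require 'benchmark'

# Hypothetical helper: require a library and report how long the load took.
# Returns [name, seconds]. A library that is already loaded returns almost
# immediately, so the numbers only mean something in a fresh process.
def timed_require(name)
  seconds = Benchmark.realtime { require name }
  [name, seconds]
end

# Stand-in library names; a real run would list the project's own files.
timings = %w[set json optparse].map { |lib| timed_require(lib) }

# Print the slowest first.
timings.sort_by { |_, seconds| -seconds }.each do |name, seconds|
  printf("%-10s %8.4f s\n", name, seconds)
end
```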

The first is that I was dynamically creating several hundred methods in the RIEL ANSIColor class, for each combination of decorations and foreground and background colors (such as “bold_red_on_white”). I refined that code, the RIEL Log and Loggable classes, and the extensions to the Ruby String class to dynamically create the color methods as necessary.

Thus the dynamic definition of the method “bold” is truly a usage of the decorator pattern.
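As an illustration of creating color methods only when they are first used, here is a sketch built on method_missing. It is not RIEL's actual code: the class name LazyColor, the small table of codes, and the name parsing are all assumptions made for the example.

```ruby
# Hypothetical stand-in for a colorizing class; not RIEL's implementation.
class LazyColor
  # A small subset of ANSI codes, enough for the example.
  CODES = {
    'bold' => 1, 'underline' => 4,
    'red' => 31, 'green' => 32, 'white' => 37,
    'on_red' => 41, 'on_white' => 47
  }.freeze

  # Translate a name such as "bold_red_on_white" into ANSI codes,
  # or return nil if any part of the name is unknown.
  def self.codes_for(name)
    codes = name.to_s.scan(/on_[a-z]+|[a-z]+/).map { |part| CODES[part] }
    codes.empty? || codes.include?(nil) ? nil : codes
  end

  # Define the method the first time it is called, then dispatch to it.
  # Later calls go straight to the defined method.
  def self.method_missing(name, *args)
    codes = codes_for(name)
    return super unless codes
    define_singleton_method(name) do |text|
      "\e[#{codes.join(';')}m#{text}\e[0m"
    end
    send(name, *args)
  end

  def self.respond_to_missing?(name, include_private = false)
    !codes_for(name).nil? || super
  end
end

puts LazyColor.bold_red_on_white('warning')
```

The cost of defining several hundred combinations up front is replaced by a small cost on the first call of each combination that is actually used.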

That resulted in a slight improvement:

user    :   4.68
system  :   0.22
cpu     : 186.80
total   :   2.62

I wondered about the overhead of Rubygems, so I removed RIEL as a gem, and instead added it within the DiffJ source tree. Performance improved significantly:

user    :   3.74
system  :   0.20
cpu     : 184.80
total   :   2.13

Combining all of the above resulted in the best performance:

user    :   3.62
system  :   0.19
cpu     : 181.20
total   :   2.10

That is acceptable to me, and I’ll be releasing version 1.3.0 of DiffJ soon. Of course, it’s on GitHub, so feel free to download it and build it.

sort{ |a, b| a <=> b }.ing with Ruby

One idiom from Perl that I’ve missed with Ruby is the ability to chain comparisons together, such as:

my @a = qw{ this is a test };

$, = ", ";

print sort { substr($a, -1) cmp substr($b, -1) || length($a) <=> length($b) } @a;
print "\n";

Which results in the output:

a, is, this, test

In Ruby, it’s a little more complicated, since Perl evaluates a zero as false, but Ruby does not. However, the nonzero? method for all Ruby Numeric objects essentially performs this conversion, for use in a boolean evaluation, returning nil if it is zero, and the number otherwise. So in Ruby, the above code would be:

a = %w{ this is a test }

puts a.sort { |x, y| (x[-1] <=> y[-1]).nonzero? || x.length <=> y.length }.join(", ")
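To make the fall-through explicit, here is a small sketch of the same comparison pulled into a helper. The name by_last_then_length is made up for the example. The sort_by line at the end is a different idiom, not a translation of the Perl, shown only for comparison.

```ruby
# nonzero? returns nil for zero and the receiver otherwise, so a tie in
# the first comparison falls through to the second one via ||.
p 0.nonzero?
p(-1.nonzero?)

# Hypothetical helper wrapping the block from the text.
def by_last_then_length(x, y)
  (x[-1] <=> y[-1]).nonzero? || x.length <=> y.length
end

words = %w{ this is a test }
puts words.sort { |x, y| by_last_then_length(x, y) }.join(", ")

# Alternative idiom: arrays compare element by element, so sort_by with
# a [last character, length] key gives the same ordering here.
puts words.sort_by { |w| [w[-1], w.length] }.join(", ")
```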

One additional note: if you’re using this in a spaceship method (“<=>”) for the Comparable interface, remember that it must return a numeric value, so if you chain evaluations together, the final statement should be zero, since all previous evaluations were nil (meaning that they were equal). This bit me during some recent DiffJ work, and here is an example of a corrected method:

class Java::net.sourceforge.pmd.ast::Token
  include Comparable, Loggable

  # ...

  def <=>(other)
    # Fall through to the next comparison on a tie; return 0 if all are equal.
    (kind <=> other.kind).nonzero? ||
      (image <=> other.image).nonzero? ||
      0
  end
end

That’s DiffJ opening the PMD token Java class and adding the Ruby Comparable interface to it, so tokens can be sorted in Ruby collections.

On that note, DiffJ is in rough beta status now. I’m using it for my work (refactoring and cleaning up legacy Java code), and just corrected a glitch in the Token code, ironically enough, for supporting usage in Hash objects. I’d neglected to implement the eql? method, erroneously thinking that Hash uses the Comparable code.
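The Hash point can be shown with a plain Ruby class. This is a sketch, not the DiffJ fix itself: Tok is a made-up stand-in for the PMD token, and in plain Ruby the hash method has to be overridden along with eql?, which may not apply in the same way to a Java class opened from JRuby.

```ruby
# Hypothetical stand-in for the PMD token; only kind and image are modeled.
class Tok
  include Comparable
  attr_reader :kind, :image

  def initialize(kind, image)
    @kind = kind
    @image = image
  end

  def <=>(other)
    (kind <=> other.kind).nonzero? ||
      (image <=> other.image).nonzero? ||
      0
  end

  # Hash looks keys up with hash and eql?, not with <=> or ==.
  def eql?(other)
    other.is_a?(Tok) && (self <=> other) == 0
  end

  # Objects that are eql? must return the same hash value.
  def hash
    [kind, image].hash
  end
end

counts = Hash.new(0)
counts[Tok.new(1, 'foo')] += 1
counts[Tok.new(1, 'foo')] += 1
p counts.size   # 1: both tokens land on the same key
```

Without eql? and hash, the two equal tokens would be stored under separate keys, even though Comparable reports them as equal.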

With that fix, the JRuby implementation of DiffJ produces the same output as the Java implementation. It’s somewhat slower, so I’ve been investigating AOT compiling of it, but that doesn’t seem to have much of an effect.

I just realized that another feature from Perl that I’ve missed (and until writing that code above, hadn’t used for 10 years) is defining the array separator with the “$,” variable. Similarly, my RIEL library modifies the to_s method of Array to put “, ” between elements, since the default is to have no space between them.

How to Write JUnit Test Sets in Rake

As DiffJ has been rewritten in JRuby, it now has numerous tests, with a group of tests in which each test corresponds to an add/delete/change type, such as methods being added to a class declaration.

I found myself wanting to run subsets of tests, such as all those for methods. The organization of the tests is that pathnames match the scope of the code being checked, so, for example, src/test/ruby/diffj/type/method/parameters/typechange/test.rb is the test for checking for a change in the type of parameters of a method, which is part of a type, meaning an interface or class. (As a parenthetical aside, the test files themselves match the test pathnames, so the two files being tested here are src/test/resources/diffj/type/method/parameters/typechange/d0/Changed.java and src/test/resources/diffj/type/method/parameters/typechange/d1/Changed.java. And yes, this begs for an Emacs shortcut for toggling among the files, which I haven’t written yet.)

Although I’m a Rake novice, I had an idea of what I wanted: a way to specify in the Rakefile a subset of tests. Yes, I probably could have written test suites, but for whatever reason, it seemed more logical to do this as part of a build. And being a Z shell user, I also knew the pattern to apply. More about that in a minute.

So here’s what I wanted: to run “rake test:method”, and have that run all the unit tests under src/test/ruby/diffj/type/method. Or in glob-speak, “src/test/ruby/diffj/type/method/**/test*.rb”. (My testcases without their own tests are named “tc.rb”, so src/test/ruby/diffj/type/method/parameter/tc.rb has code common to the testcases under src/test/ruby/diffj/type/method/parameter.)

Enjoying the wonderful experience of writing object-oriented build scripts (bitterly said as a professional maintainer of Ant code), I subclassed the Rake test task:

class DiffJRakeTestTask < Rake::TestTask
  def initialize name, filter = name
    super(('test:' + name) => :testscompile) do |t|
      t.libs << $srcmainrubydir
      t.libs << $srctestrubydir
      t.pattern = "#{$srctestrubydir}/**/#{filter}/**/test*.rb"
      t.warning = true
      t.verbose = true
    end
  end
end

DiffJRakeTestTask.new('field')
DiffJRakeTestTask.new('method')
# ... and others

Going roughly line by line, that takes a parameter such as “method”, creates a test task named “test:method”, and adds a glob for test files (“test*.rb”) under a directory matching the filter, which defaults to the name. The “libs” lines add the directories containing my Ruby source and test code.
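To see what such a pattern selects, here is a sketch that builds a throwaway directory tree and globs it. The helper names test_pattern and matching_tests are made up for the example, and the tree is a simplified imitation of the layout described above, not the real DiffJ source tree.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper mirroring the pattern built in the task above.
def test_pattern(testdir, filter)
  "#{testdir}/**/#{filter}/**/test*.rb"
end

# Build a temporary tree, then return the files a filter would pick up,
# as sorted paths relative to the test directory.
def matching_tests(filter)
  Dir.mktmpdir do |dir|
    %w[
      diffj/type/method/parameters/typechange/test.rb
      diffj/type/method/parameters/tc.rb
      diffj/type/method/throws/zeroone/test.rb
      diffj/type/field/test.rb
    ].each do |rel|
      path = File.join(dir, rel)
      FileUtils.mkdir_p(File.dirname(path))
      FileUtils.touch(path)
    end
    # uniq is a precaution in case a path is reported more than once.
    Dir.glob(test_pattern(dir, filter)).map { |f| f.sub("#{dir}/", '') }.uniq.sort
  end
end

puts matching_tests('method')
puts matching_tests('method/throws/zeroone')
```

The tc.rb file is never selected, since it does not match “test*.rb”, and a filter that is a longer path narrows the run down to a single test file.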

Using the filter means we can also specify partial or full paths for a test subset (even just a single file), writing them as needed, such as when working on failing tests. My needed ones ended up being the following:

DiffJRakeTestTask.new('method/body/zeroone')
DiffJRakeTestTask.new('method/parameters/zeroone')
DiffJRakeTestTask.new('method/throws/zeroone')
DiffJRakeTestTask.new('method/parameters/reorder')
DiffJRakeTestTask.new('method/parameters/reorder/typechange')

And we also can use the “filter isn’t the same as the name” functionality with the “test:all” task:

DiffJRakeTestTask.new('all', '*')

Yes, that’s probably redundant with the default “tests” task, but I prefer the feeling of wielding more control.

It looks like I’ll have DiffJ fully implemented in JRuby by the end of the month. I’m only packaging it as Zip files, and would appreciate offers to repackage it for Ubuntu, Fedora and other distributions. I know nearly nothing about Eclipse, but have received interest in DiffJ as an Eclipse plugin, so that is another valuable contribution someone could make.

DoctorJ 5.2.0 and DiffJ 1.2.0 Released

DoctorJ 5.2.0 is an extensive rewrite of the code, with unit tests expanded and refined. It uses the latest version of PMD as the parsing and AST code, and is the first version of DoctorJ to use the IJDK module, leading to much more elegant code. In terms of functionality, this version adds spell-checking for strings of a minimal length (which defaults to 2). The project now builds with Gradle.

DiffJ 1.2.0 also has significantly rewritten code, and also uses the IJDK module. This version adds (on terminals that support it) colorization of differences. It too uses Gradle for its build.

Both projects are now distributed only in zip format, and I welcome offers to repackage them for various package managers.

The distributions are available for download.

Projects Now on GitHub

I am in the process of migrating some of my projects to my space at GitHub.

At this point, the following exist:

  • DiffJ – a Java-aware command-line utility for comparing Java files. This program ignores whitespace and comments, and detects added and removed classes, methods, and fields, as well as changes in code.
  • IJDK – extensions to the JDK, primarily motivated by functionality in various Groovy and Ruby libraries.