Fakir 0.0.1 released

The initial version of Fakir has been released.

This gem is somewhat small, but contains some functionality that I’ve needed in various projects, such as getting a random element from an array (thus Fakir::Array#rand), and getting non-repeating random numbers, i.e. where [1, 1, 1, 2, 2] is actually random, but doesn’t look random, in the same way that [heads, heads, tails, tails] looks less random than [heads, tails, tails, heads].

Fakir::Random#rand goes even beyond that, with support for a “window” in which a number will not be repeated. Thus with a window of 3, a number will not be repeated within 3 invocations of rand.

This gem was developed for seeding Cabin Notebook, and I hope it’s of beneficial use for others.

Advertisements

Cabin Notebook and Fakir

After postponing for years I’ve been working on a website that I haven’t found the equivalent of, Cabin Notebook, which is currently populated (seeded) only with random data.

I’ve been using the excellent Faker Ruby gem, but found its random data to be … too random. That is, I didn’t want truly repeating random data, such as its wide variety of prefix+first+middle+last+suffix names and phone number formats (“Dr. Millie Beatrice Houndhouse III”, “(812)-555-1213” and “1-312-555-9696”), and having the same random number be used repeatedly (such as subsequent labels of “Photo #2”, annotating different photos). Fakir will also support a more flexible version of Array#rand, so that a random element will not be returned within N invocations of that method.

So I’ve created the fakir gem, which supports data that are not quite so random. It’s in the beginning stages, but I plan to have it soon be fully-featured enough to seed the data at Cabin Notebook.

Rails testing and Emacs mode line

I’ve been diving/digging into the oh-so-excellent Rails Tutorial, by Michael Hartl, and am into chapter 3, where testing begins.

Imagine my surprise — go ahead, imagine it — when my Emacs mode line changed color.

Originally:

modeline_original

And when I ran “bundle exec rake test”, it succeeded with:

modeline_success

Then changing the test code to fail, after running “bundle exec rake test” again:

modeline_failed

Tracking down this code, I found that the guard-notiffany gem was doing this behavior, sending a change to the Emacs mode line via emacsclient, with this bit of code in lib/notiffany/notifier/emacs.rb:

        def notify(color, bgcolor)
          elisp = <<-EOF.gsub(/\s+/, " ").strip
            (set-face-attribute 'mode-line nil
                 :background "#{bgcolor}"
                 :foreground "#{color}")
          EOF
          emacs_eval(elisp)
        end

The issue there was that it just changes the mode line “permanently”. I’d prefer that the mode line colors change for a moment, then time out and revert to their previous settings.

I wrote a bit of code to do this, and modified lib/notiffany/notifier/emacs.rb locally, but there is a pending pull request that has a better implementation.

However, my hack is:

        def notify(color, bgcolor)
          elisp  = "(let ((orig-bg (face-attribute 'mode-line :background))"
          elisp << "      (orig-fg (face-attribute 'mode-line :foreground)))"
          elisp << "     (set-face-attribute 'mode-line nil"
          elisp << "                         :background \"#{bgcolor}\""
          elisp << "                         :foreground \"#{color}\")"
          elisp << "     (sit-for 3)"
          elisp << "     (set-face-attribute 'mode-line nil"
          elisp << "                         :background orig-bg"
          elisp << "                         :foreground orig-fg))"
          emacs_eval(elisp)
        end

The sit-for keeps the mode line as red or green for 3 seconds or when there is Emacs input. I prefer that, to clear the mode line more quickly, since I find the bright color distracting.

Rewriting Glark

For the PVN project, I’ve wanted to use Glark as a library for matching text (for the ‘seek’ subcommand), but I’d written Glark as a command-line application, and its design reflected that. It also, like so many “field-tested” programs, had very few tests, with the expectation that because
of being heavily used, flaws would easily surface. That’s relatively accurate: I’m fairly sure that I use Glark more than any other program not built into Unix (I even have grep aliased to “glark -g”).

I was once asked in a job interview what my opinion was of my own code. My response was that my code evidently sucks, because I’m always rewriting it. That was definitely true with Glark, being one of my first Ruby programs, migrating it from Perl, and not having touched it in a long time.

The problems with Glark were classics of bad programming: lack of tests, overly complicated code, use of global variables, poor class composition, and excessive coupling.

So I’ve been eager to rewrite Glark, but without tests, a program is much too brittle, so I knew that I’d first have to add tests. I simple can’t enjoy writing code without tests. However, with a comprehensive test suite, rewriting code is bliss. So I first wrote a few tests, then refined my test framework to the point that writing a unit test is as simple as this:

def test_simple
  fname = '/proj/org/incava/glark/test/resources/textfile.txt'
  expected = [
              "    3   -rw-r--r--   1 jpace jpace   45450 2010-12-04 15:24 02-TheMillersTale.txt",
              "   10   -rw-r--r--   1 jpace jpace   64791 2010-12-04 15:24 09-TheClerksTale.txt",
              "   20   -rw-r--r--   1 jpace jpace   49747 2010-12-04 15:24 19-TheMonksTale.txt",
              "   24   -rw-r--r--   1 jpace jpace   21141 2010-12-04 15:24 23-TheManciplesTale.txt",
             ]
  run_app_test expected, [ '--xor', '\b6\d{4}\b', 'TheM.*Tale' ], fname
end

That’s Glark matching 6nnnn ^ TheM*Tale. At this point, grep has added some of what made Glark quite distinct from it — highlighted/colorized matches and context — but Glark’s most unusual (and fun to program) feature is matching of compound expressions, such as:

% glark --and=3 write --or puts print **/*.rb

That is matching “write” within 3 lines of puts or print.

Do you need that very often? Nope. But it does come in handy, such as in the case of “I’m looking for where we are catching an InvalidArgumentException and logging it (within the next 5 lines) as an error:

% glark --and=5 'catch.*InvalidArgumentException' 'Log.error' **/*.java

Speaking of interviews, a friend of mine has a good practice of when he goes to a company to assess their software, he asks to see what they consider their worst code. Often that code is at the core of their project and is the oldest code, written early on by someone who may have left the company, and/or the code has been piled on with more and more complexity that it is difficult to detangle.

In Glark, the worst code of the entire application was the Options class, clocking in at 761 lines long, containing the 42 options in Glark. This class is a Singleton, which is the fancy Design Pattern way of saying Global Variable. (The worst by-product of the Gang of Four was the sanctifying of Singletons as being a good practice.)

Another sign that the Options class was written poorly is an insanely simple metric: wc. That is, running the “wc” command on all files, sorting them numerically, and looking at the largest files. There are the bottom, in all its corpulent glory:

% wc **/*.rb | sort -rn
    4     7   160 lib/glark.rb
  102   654  5213 lib/glark/help.rb
  183   527  4569 lib/glark/input.rb
  248   640  6064 lib/glark/exprfactory.rb
  266   681  6052 lib/glark/output.rb
  297   777  7392 lib/glark/glark.rb
  440  1048  9663 lib/glark/expression.rb
  761  2615 23377 lib/glark/options.rb
 2301  6949 62490 total

The Options class is used throughout Glark, so extracting it was quite challenging. I decomposed the Options class into smaller groups, and it just so happened that there was a design to follow, documented in, of all places, the help (man) page for Glark. That is, because there are so many options, for legibility and organization they are displayed in the man page in the groups “input”, “matching”, “output”, and “debugging/errors”. So I repackaged the Glark options into input, match, output, and info, and also used that as the organization for the modules within Glark. Thus the Glark::File class went into lib/glark/input/file.rb, and Grep::Lines went into lib/glark/output/grep_lines.rb.

I continued to refine the tests to the point that adding new ones was trivial. As the test coverage increased, this had the effect of making it an aberration when I worked on untested code.

On that note, here’s an easy way to test your test. That is to determine if your test really works, break the code that it is testing. (A “return nil if true” at the beginning a method works nicely.) If the test still passes, then it’s incomplete. If the test fails, then should add confidence that test coverage is adequate. This is also where it becomes fun to break code, and to break tests. As with anything, if it’s fun, we’ll do more of it, which is why it is essential to have a test framework that makes tests easy (ergo, fun) to write.

For years I’ve tracked my daily progress by the simple metric of lines of code, but after hearing this suggested on the Ruby Rogues podcast, I’ve begun the practice of adding one user-facing feature per day, such as a new subcommand or option to PVN. My definition of “feature” includes adding and refining documentation, and it also includes removing options, especially if they are confusing, redundant, unused or obsolete.

I track features with a Features.txt in the root directory of each project, of the form (from PVN):

Thu Oct 25 19:22:52 2012

  seek command: added [ -C --no-color ] option.

This keeps me on track by actually recording features that are added. A script I run on all {project}/Features.txt files shows whether I’ve added one for today, and when I’ve missed a day (only one since I started doing this).

My process for adding features feels like an extension of the TDD process, in short:

  • conceive of a feature
  • add it as a test
  • implement it
  • run tests
  • refactor the tests and code …
  • document the feature in the readme and help files
  • add the feature to the features file
  • commit with the feature description as the comment description.

That’s about it for this update on the coarse rewriting of Glark. You can track the progress of it on GitHub (https://github.com/jeugenepace/glark), and if you want to see the code before the rewrite, check out revision 4d10f192f46ec3df34f971f8b40e03f8df0aed27.

Adding setup and teardown to RubyUnit Test Suite

The tests for pvn raise an interesting question to me. The pvn subcommands that wrap/extend the svn subcommands process the svn output, so the tests need svn output to run.

I’ve considered doing mock objects for svn, but that became confounding, where I was essentially mocking all of the svn subcommands that I need. So I’m leaning in the direction of doing operations against a “real” Subversion repository. On a 234M Subversion repository, it takes around 0.4 seconds to do a backup (cp -r), when both the source and the target locations are on the SSD in my machine.

So it’s feasible for the test sequence to be:

  • back up the svn repository (0.4 seconds)
  • check out the svn repository checked out to working copy (4 seconds)
  • working copy modified, with added, changed and deleted files and directories (< 10 seconds for all tests)
  • svn repository restored to backed up version (1 second)

I’m still considering how to go about this, but in the meantime I’ve written the following for adding suite-wide setup and teardown for the pvn base testcase. Note that this is on a class:

require 'runit/testcase'

module PVN
  class TestCase < RUNIT::TestCase
    include Loggable

    WIQUERY_DIRNAME = "/Programs/wiquery/trunk"

    class << self
      def setup
        @@orig_location = Pathname.pwd
      end

      def teardown
        Dir.chdir @@orig_location
      end

      def suite
        @@cls = self

        ste = super

        def ste.run(*args)
          @@cls.setup
          super
          @@cls.teardown
        end
        ste
      end
    end

Project pvn

pvn, here on GitHub, is my newest project, a front-end to the Subversion command-line program “svn”. This program will unite my various svn programs, such as svndelta, and my various scripts.

I’ve been building it from scratch, with a core of commands wrapping those in svn, and new commands. At this point the only more-or-less fully working subcommand is log, which extends svn log to use “relative revisions” (explained below), and displays colorized output. I plan to add the functionality so that a user can set the format and colors for log output.

What is probably the most innovative functionality in pvn is that revisions may be “relative”, meaning that +N is the Nth revision, and -N is the Nth revision before the most recent one. In a large project, revision numbers can get into the six and seven digits, and it’s much easier to think of “the latest checkin I did”, as opposed to “my checkin with revision 317127”.

I don’t have a timeline, but my goal is to have the functionality of the log and diff subcommands implemented by the end of the year.