Extended Colors with Git

It’s well-known that Git supports colors for many operations, such as diff, and that these colors are customizable. Examples online show how to modify ~/.gitconfig to set various fields for git functions, such as this for the “diff” command:

    [color "diff"]
        meta = yellow bold
        frag = magenta bold
        old = red bold
        new = green bold

All other examples that I saw online used the same color names, that is, the ANSI colors for terminals: black, red, green, yellow, blue, magenta, cyan, white.

After rewriting Glark so that extended colors could be used for highlighting matches (terminals now support those colors), I wondered whether Git supported extended colors as well. Digging through the Git source code shows that a color, in addition to being one of the above color names, can also be a number between 0 and 255, per the ANSI escape codes.

The nit is that that’s not an RGB value; it’s an ANSI code that corresponds to a color, and for people accustomed to RGB, the ANSI code is dissimilar enough to be confusing.

An RGB value can be mapped to an ANSI code with a simple equation (this is from the Rainbow Ruby Gem):

# Scale each channel to 0..5, then index into the 6x6x6 color cube starting at code 16.
def to_code red, green, blue
  r, g, b = [ red, green, blue ].map { |v| (6 * v / 256.0).to_i }
  16 + 36 * r + 6 * g + b
end

Each of the values for red, green and blue is scaled to between 0 and 5, then offset for the ANSI version of RGB.
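
As a quick check of that equation, here is a small sketch that runs a few familiar RGB values through it. The method is repeated so the snippet runs on its own; the sample colors and their names are my own choices for illustration, not something from Rainbow:

```ruby
# Same mapping as above, repeated so this runs standalone.
def to_code(red, green, blue)
  r, g, b = [red, green, blue].map { |v| (6 * v / 256.0).to_i }
  16 + 36 * r + 6 * g + b
end

# Sample RGB values, chosen only for illustration.
samples = {
  'black' => [0, 0, 0],
  'red'   => [255, 0, 0],
  'green' => [0, 255, 0],
  'blue'  => [0, 0, 255],
  'white' => [255, 255, 255]
}

# Print each name in its own color, followed by the computed code.
samples.each do |name, rgb|
  code = to_code(*rgb)
  puts "\e[38;5;#{code}m#{name}\e[0m => #{code}"
end
```

Pure red comes out as 196 (16 + 36 * 5), and white as 231, the top of the 6x6x6 color cube. The equation never produces codes 0 through 15 (the standard colors) or 232 through 255 (the grayscale ramp).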

This Ruby snippet dumps the list of colors as foregrounds:

(0 .. 255).each do |c|
  puts if c > 0 && (c % 10) == 0
  printf "\e[38;5;#{c}mabc %3d\e[0m  ", c
end
puts
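
The background versions can be dumped the same way, since the escape sequence differs only in its first number: 38 selects the foreground and 48 the background. A sketch, where the helper name swatch is mine:

```ruby
# Builds one swatch; layer 38 colors the foreground, 48 the background.
def swatch(code, layer = 38)
  format("\e[%d;5;%dmabc %3d\e[0m", layer, code, code)
end

# Same ten-per-row layout as the foreground dump, but as backgrounds.
(0..255).each do |c|
  puts if c > 0 && (c % 10) == 0
  print swatch(c, 48), '  '
end
puts
```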

The colors, as foregrounds on white:

fg_on_white

As foregrounds on black:

fg_on_black

As backgrounds on white:

bg_on_white

And as backgrounds on black:

bg_on_black

Also note that the Git color fields are of the format "[attributes] foreground [background]", where if a second color is given, then it is used as the background.

The same is true for ANSI codes, so that the second ANSI code specified will be used as the background.

Attributes are the following: bold, blink, ul (for underline), reverse, and dim. More than one can be used.
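
Putting the pieces together, here is a rough sketch that builds a Git color value from RGB triples, using the mapping above. The helper git_color is hypothetical (it is not part of Git or Rainbow), and it accepts only the attributes listed here:

```ruby
# Attribute names as listed above.
ATTRIBUTES = %w[bold blink ul reverse dim].freeze

# Same RGB-to-ANSI mapping as the Rainbow-derived method earlier.
def to_code(red, green, blue)
  r, g, b = [red, green, blue].map { |v| (6 * v / 256.0).to_i }
  16 + 36 * r + 6 * g + b
end

# Hypothetical helper: formats "[attributes] foreground [background]".
# fg and bg are [r, g, b] arrays; bg is optional.
def git_color(attrs, fg, bg = nil)
  unknown = attrs - ATTRIBUTES
  raise ArgumentError, "unknown attributes: #{unknown.join(', ')}" unless unknown.empty?
  parts = attrs + [to_code(*fg)]
  parts << to_code(*bg) if bg
  parts.join(' ')
end

puts '[color "diff"]'
puts "    old = #{git_color(%w[bold], [255, 0, 0])}"
puts "    new = #{git_color(%w[bold], [0, 255, 0], [0, 0, 0])}"
```

That prints "old = bold 196" and "new = bold 46 16", the latter being bold green on black.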

Not that I recommend it, but a valid configuration could thus be:

    [color "diff"]
        meta = bold 190 22
        frag = blink 189 89
        old = blink bold 160 143
        new = reverse bold blink ul 52 227

Which looks like the following:

example_diff

Again, I don't recommend it. I'll post an update when I've settled on a color theme.

Review of LinuxMint 14 KDE

I finally upgraded my main personal machine to Linux Mint KDE 14. That machine was running Mint 11, for around two years, but when I began working with Scala, I discovered that Emacs 24 is much better for Scala support. Not finding Emacs 24 in the Mint 11 repositories, I finally took the time and effort to upgrade.

Now I wonder why I’d waited so long.

The first immediate improvement, albeit superficial, was that KDE uses blue as its primary color. It makes sense that Mint would of course choose, well, a minty green as its color, but that is one of my least favorite colors, reminding me of a 1970s refrigerator. I didn’t care for the brown of Ubuntu, and missed the blue of Fedora, so now I’m back, in a way.

I switched to the Oxygen theme, which is nicely dark, mostly dark greys. The other themes I looked at seemed to be excessively noisy, and I like a minimal desktop experience, with no peripheral distractions.

As I complained before, when I temporarily switched from Gnome to KDE (and back again), the fonts in KDE look, in a word, horrible. Absolutely horrible, if you will permit me two words.

This time I googled around a bit, and found this thread.

The summary of that is to go to the Fonts settings and set them all to Ubuntu 10 Regular, except for Fixed Width (Ubuntu Mono 12), Small (Ubuntu 9), and Windows title (Ubuntu 10 bold).

Set the following:

  • Anti-aliasing: enabled
  • Exclude range: unchecked
  • Sub-pixel rendering: RGB
  • Hinting style: slight

Install Windows fonts via: “sudo apt-get install ttf-mscorefonts-installer”. The command line app is necessary because you’ll need to accept the license agreement, which has no equivalent for the GUI-based package managers.

That has made a huge difference in the appearance. I do not understand why these would not be the default settings in KDE, so mark that as one advantage in the favor of Gnome. One advantage. I haven’t found a second one.

I also installed the Inconsolata font (the package “ttf-inconsolata”), which I tried out with Emacs after reading about it as being highly recommended. After a while I went with (back to, actually) DejaVu Sans Mono, font size 9, since I found Inconsolata characters to be too wide.

The KDE UI takes a little while to get used to, especially seeing all apps in the panel, not just the ones for the current workspace. I also set the shortcut for the start menu to alt-F1, after trying to re-map the Windows key, with a modicum of success.

This upgrade makes me feel like I’m back in my early Red Hat / Fedora days, with the UI clean and responsive. I haven’t yet tried out activities under KDE, but I am planning to.

On that note, being a KDE neophyte, I’m looking for a KDE book, and would appreciate any recommendations.

Recommended Books

This is a list of what I consider the best books relevant to programming, and perhaps, even to life.

Two of the older books, which shaped me during my C++ days, are Large-Scale C++ Software Design (by John Lakos) and Object-Oriented Design Heuristics (by Arthur Riel). The former is very relevant to Java projects, perhaps even more so now, since by its nature, Java makes it unclear as to the package hierarchy. That is, java.io and java.util are intertwined, with mutual dependencies. Having a clear package/module hierarchy can make it easier to understand the levels within a project, and how their behavior aggregates up through those levels.

Object-Oriented Design Heuristics is simply a must-read for anyone doing OO programming, which means essentially every programmer. Maybe even the functional programmers. It’s been a long time since I’ve read it, but one salient point that I remember is that managers are discouraged. That is, nonsense such as ResourcePoolManager, which often (usually? always?) acts as a god class, and breaks the OO design by centralizing behavior instead of having it tightly integrated with the client classes. The book goes extensively into inheritance, and how it is abused and distorted, where classes within a hierarchy often break down in terms of being properly decoupled.

If you program, you must, absolutely must, understand regular expressions. I cannot fathom how anyone doing text-based programming (which also means essentially everyone) could not use regular expressions. Every day. Every hour. Much of the magic of dealing with huge code bases is simply mastering regular expressions. So the must-read book on this topic, by another Jeff, is Mastering Regular Expressions, by Jeffrey Friedl. It’s a book where, if you read through chapter 3, you’ll understand 80% of regular expressions, an application of the 80/20 rule, AKA the Pareto principle.

For Scrum, the best book is Agile Software Development with Scrum, by Ken Schwaber and Mike Beedle. Unlike those released by the book-by-the-pound publishers, this one is concise and direct, clocking in at around 150 pages. As with the book above, reading just three chapters will give you 80% of the full understanding of Scrum.

Hackers and Painters, by Paul Graham, is another must-read, probably the best book I’ve read about the mindset of great programmers, and creators/artists in general. It makes clear that programming is more art than science, and more painting than engineering, contrary to the roots, and biases, of this field. It’s so good that I recommend it not only to programmers, but to those fortunate people who have a programmer in their lives, and want to understand their mental processes.

Delving into non-programming books, I suggest The Fifth Discipline, by Peter Senge. It’s more about businesses than about software, but one point that programmers should appreciate is that there are systemic behaviors, or accidental behaviors, as Fred Brooks might say, resulting from the organization of a system, such as a business, but also programming teams, and even the design of their code. Trying to work around an inappropriately designed system can be extremely frustrating, even to the point of futility.

This might be an odd choice, but I’m going to shoehorn it in here anyway, because I like it so much: The Gentle Art of Verbal Self-Defense, by Suzette Elgin. It discusses patterns in communication, such as loaded questions, those with invalid premises (“If you really cared about getting this feature done, you wouldn’t be wasting time refining the build process”). Especially in this era where multi-cultural teams are the norm, it is imperative that people understand the deeper significance of the words they choose. It is somewhat like The Fifth Discipline, such that communication itself results in systemic behaviors and dynamics within a team.

I don’t think it’s too controversial to say that programmers are gifted, and by that I don’t mean that they are superior to others – just different. Very different. As Paul Graham elucidates in Hackers and Painters, there is simply a different internal mechanism among programmers, which results in their (illogical, to some) obsession with “minor details”, that is, the core of their work. (Programming is mostly just minor details, aggregated into huge systems.) Programmers also tend to be highly sensitive, as could be expected of people who have to closely watch for even slight variations in behavior or performance. So I suggest The Gifted Adult, by Mary-Elaine Jacobsen, which offers insights into how those people think and behave, and why their levels of concentration can be easily disrupted. I especially think that software managers should read it, to understand why the cats that they are trying to herd behave so much unlike “normal people”.

Wrapping up my recommendations is another really odd choice: The Path Between the Seas, by David McCullough. It’s about the building of the Panama Canal, and how, after years of failed attempts, the canal was finally built thanks to two things: building railroads (to move workers and to remove dirt) and eliminating disease (yellow fever). John Stevens, the head engineer, devoted a significant amount of time (over a year, as I recall) to building railroads instead of digging. So once the digging phase began, resources could be moved much more quickly, and removing the dirt reduced the exposure to mudslides from the frequent heavy rains. Dr. William Gorgas also solved a resource problem by eliminating disease, thus keeping workers productive. How this fits with software is that much of our work isn’t the digging per se; it is building the system around the project itself, such as its resources, both material and human.

Glark 1.10.0

Glark 1.10.0 is ready to be released, after a couple of years of not being a high priority for me. I was inspired to rewrite it when I looked through the code, much of it written early in my Ruby days. It began as a Perl script, and retained that scriptitude through its life.

At times while rewriting Glark, I wished that I’d blogged that experience. The short description is that I had a few primary guidelines:

  • Test as thoroughly as possible, ideally one test (at least) per feature. A feature is essentially the same as an option. Each source file/class should have an equivalent test case.
  • Keep files and classes small, and relatively even in size.
  • Simplify the set of options.
  • Eliminate global variables.
  • Add one feature per day.

Following these principles resulted in a code base I am much more satisfied with.

Previously much, if not most, of Glark was “field-tested”, a euphemism for “I tried it out a while ago, and I think it worked then.” As the test suite grew, the code became much easier to refactor with confidence.

Regarding the size of files and classes, I used a simple metric:

% wc lib/**/*.rb | sort -n

And then I usually tackled what was at the bottom of the list.

The average file is now 59 lines long, with the largest being 201 lines, and the smallest, 10 lines. In the previous implementation, the smallest file was 102 lines, the largest, 761 lines, and the average, 328 lines.
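
For anyone who would rather get those numbers from Ruby than from wc, here is a minimal sketch. The method names and the default directory of lib are my assumptions for illustration; this is not code from Glark:

```ruby
# Summarizes line counts: smallest, largest, and (integer) average.
def size_summary(counts)
  lines = counts.values
  { smallest: lines.min, largest: lines.max, average: lines.sum / lines.size }
end

# Counts the lines of each Ruby file under a directory.
def line_counts(dir = 'lib')
  Dir.glob(File.join(dir, '**', '*.rb')).map do |path|
    [path, File.foreach(path).count]
  end.to_h
end

counts = line_counts
unless counts.empty?
  # Ascending order, so the largest files end up at the bottom.
  counts.sort_by { |_, n| n }.each { |path, n| printf "%5d %s\n", n, path }
  p size_summary(counts)
end
```

Given the line counts of the seven files under lib/glark in the previous implementation (102 through 761), size_summary returns a smallest of 102, a largest of 761, and an average of 328.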

Option processing was the major chunk of code tangled through the code base, primarily because there was a single Options class, a singleton used essentially everywhere throughout the code. I first split that into the subsets of options, such as those for the input options, for matching, and for output, with their equivalent submodules using only those option objects instead of the global/singleton.

This was further cleaned up by removing the option processing from what I eventually labeled their “specs”, the values that determined the behavior of the submodule. One idea that I had is that eventually those specs could be passed in from outside of Glark itself, for usage by external programs, such as PVN.
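
The shape of that change, in a much-simplified sketch: each submodule receives a small spec object holding only the values it needs, and nothing reaches for a global. The class and field names here are invented for illustration and are not the actual Glark classes:

```ruby
# Hypothetical specs: plain values, with no option parsing inside them.
MatchSpec  = Struct.new(:pattern, :ignore_case)
OutputSpec = Struct.new(:show_line_numbers)

# The matcher sees only its own spec, not a global Options singleton.
class Matcher
  def initialize(spec)
    flags = spec.ignore_case ? Regexp::IGNORECASE : 0
    @regexp = Regexp.new(spec.pattern, flags)
  end

  # Returns [line, index] pairs for the lines that match.
  def matches(lines)
    lines.each_with_index.select { |line, _| line =~ @regexp }
  end
end

# The formatter likewise depends only on the output spec.
class Formatter
  def initialize(spec)
    @spec = spec
  end

  def render(line, index)
    @spec.show_line_numbers ? "#{index + 1}: #{line}" : line
  end
end

matcher = Matcher.new(MatchSpec.new('glark', true))
formatter = Formatter.new(OutputSpec.new(true))
lines = ['Glark extends grep', 'find is separate', 'glark -g']
matcher.matches(lines).each { |line, idx| puts formatter.render(line, idx) }
```

Because the specs are ordinary objects, an external program could build them directly without going through command-line parsing, which is the point of separating the two.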

Adding one feature per day, which I’ve written about previously, motivated me to do some non-coding things, mainly writing documentation. I’ve realized that much of the distinguishing functionality of Glark hasn’t been well documented, and the man page has now increased from 927 lines to 1126.

It’s been an interesting evolution of the behavior of Glark, as much of its early functionality has been added into grep, such as colorized matches and context around them. Elaborating on what I wrote in the readme/man page:

Glark extends grep by matching complex expressions, such as “and”, “or”, and “xor”. This is useful in a case such as “I’m looking for ‘foo’ and ‘bar’, within three lines of each other.” Expressions can be nested arbitrarily, such as, also from the man page: “glark --and=5 Regexp --xor parse --and=3 boundary quote”, meaning: (within 5 lines of each other: (/Regexp/ and (/parse/ xor (within 3 lines of each other: /boundary/ and /quote/))))
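
To make the two basic operators concrete, here is a simplified sketch of my reading of them: “xor” as lines matching exactly one of two patterns, and “and” as two patterns matching within a given number of lines of each other. This is not Glark’s implementation, which builds nested expression objects; the method names are invented for illustration:

```ruby
# Indexes (0-based) of lines matching exactly one of the two patterns.
def xor_lines(lines, first, second)
  lines.each_index.select { |i| lines[i].match?(first) ^ lines[i].match?(second) }
end

# Pairs [i, j] where line i matches first, line j matches second,
# and the two lines are at most distance lines apart.
def and_within(lines, first, second, distance)
  a = lines.each_index.select { |i| lines[i].match?(first) }
  b = lines.each_index.select { |j| lines[j].match?(second) }
  a.product(b).select { |i, j| (i - j).abs <= distance }
end

lines = [
  'def parse(str)',        # 0
  '  # find the boundary', # 1
  '  quote = str[0]',      # 2
  'end',                   # 3
  '',                      # 4
  'Regexp.new(pattern)'    # 5
]

p and_within(lines, /boundary/, /quote/, 3)  # [[1, 2]]
p and_within(lines, /Regexp/, /parse/, 3)    # [] (the lines are 5 apart)
p xor_lines(lines, /parse/, /str/)           # [2]
```

Line 0 matches both /parse/ and /str/, so xor excludes it; only line 2 matches exactly one of the two.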

Glark handles file, directory, and path arguments, optionally recursing directories to a certain depth, and processing path arguments as a set of files and directories. .svn and .git subdirectories are automatically excluded.

I realize, with a bit of guilt, that that defies the Unix principle of keeping programs small, and with minimal overlapping functionality, since much of that is already done by the “find” command. However, some of this behavior was included in early Glark, and grep itself has the “-r” option for recursing directories, so I wanted to extend that to be more advanced, in part because when running Glark on Windows systems, there is no “find” command.

Binary files are excluded (by default), but can, in the case of compressed or archived files, have their extracted contents be searched.

This rolls into Glark behavior that I’d wanted for a while, mainly for searching Jar files for class names, which I previously did via shell scripting, such as:

for i in *.jar; do jar tf $i | glark --label=$i Exception; done

That now can be done with:

glark --binary-files=list Exception *.jar

Glark can use a per-project configuration file, so different projects can have their own Glark parameters, such as different files to include and exclude for searching, and different colors for pattern highlighting. My goal there is that it can add feedback for when one is working in different projects, such as highlighting matches in Java code differently than in Ruby. Colorizing is still only on a per-project basis, not on the file type itself, which I’m considering adding, since it might be helpful to distinguish matches in Ant build files from those in Gradle.

I am doing some final testing, and then the Ruby Gem should be available. I am looking for maintainers to repackage Glark as an RPM and a Debian package, although I will probably release unofficial packages for those within a few days as well.

Stati of Projects

I’ve been jumping around from one project to another on an as-needed basis, and here is the current status of each.

RIEL – I’ve been updating the colorizing code, adding the functionality to use extended color codes on ANSI terminals, instead of the eight default colors.

Glark – In the midst of a major rewrite, for both code purity and functionality. The main changes so far are extensions to the path/file arguments, bringing in some functionality from find. Glark will also (with the imminent integration of RIEL) support extended colors.

PVN – This project was in heavy development until a month ago, when it went into a state of waiting for updates to Glark, since it will be using Glark for its seek (searching) subcommand.

DiffJ – This project I rewrote in JRuby, but that was a little too slow for a command-line application -  even heavily tweaked, I couldn’t get the startup time under two seconds. So I re-rewrote it in Java, and intend to revisit it to add more intelligent code comparisons, such as understanding that “if (a == b) foo();” is the same as “if (a == b) { foo(); }”

IJDK – Mostly dormant at this point. When I’ve rewritten Java projects, I’ve tried to extract the generic code from them and add them to IJDK, but I haven’t been heavily involved with any of my Java projects lately.

Java-Diff – A while ago this was brought up to date to use generics, and was refactored for code clarity. There hasn’t been any reason to update it since then.

DoctorJ – Alas, this is dormant. Some of its warnings have been integrated into the Java compiler, such as mismatched parameter names. It still goes beyond that level of pedanticalness, so it is most suitable in the development of projects where documentation is paramount, such as APIs.

Tests Result in Better Coders

We know that tests result in better code. They also result in better programmers.

The first implementation of a project will often be utilitarian, complying with the edict of “just get it working”. Code will be written and tweaked until the tests pass, but there are few cycles, if any, devoted to refining the code to be higher quality. Thus the red/green/refactor cycle is limited to just red and green.

The refactoring phase is where the code goes from simply working to being well-crafted. As the mindset of the programmer changes from being focused on utility to being focused on quality, they program differently as a result.

With the principle of quality being foremost, and with the confidence from having a thorough set of tests, the programmer can also push the boundaries of their knowledge, such as delving into advanced object-oriented programming and metaprogramming, which can be dauntingly complex and risky. Those areas do not necessarily have an immediate impact on functionality, so when a programmer is in functionality/utility mode, they’re less likely to employ those techniques. However, in quality mode, a programmer will feel more justified, and confident, in using those techniques, which over the long term can cause a dramatic improvement to a project.

Often forgotten is that both code and tests should be refactored. Test code that is unclear, misleading, or just wrong can be very frustrating to someone trying to understand a body of code, since the tests are the best starting point.

The bottom line is that programmers must understand that testing includes refactoring (including of the test code itself), and that the refactoring phase is where programmers, and projects, can become vastly better.

Rewriting Glark

For the PVN project, I’ve wanted to use Glark as a library for matching text (for the ‘seek’ subcommand), but I’d written Glark as a command-line application, and its design reflected that. It also, like so many “field-tested” programs, had very few tests, with the expectation that because of being heavily used, flaws would easily surface. That’s relatively accurate: I’m fairly sure that I use Glark more than any other program not built into Unix (I even have grep aliased to “glark -g”).

I was once asked in a job interview what my opinion was of my own code. My response was that my code evidently sucks, because I’m always rewriting it. That was definitely true with Glark, being one of my first Ruby programs, migrating it from Perl, and not having touched it in a long time.

The problems with Glark were classics of bad programming: lack of tests, overly complicated code, use of global variables, poor class composition, and excessive coupling.

So I’ve been eager to rewrite Glark, but without tests, a program is much too brittle, so I knew that I’d first have to add tests. I simply can’t enjoy writing code without tests. However, with a comprehensive test suite, rewriting code is bliss. So I first wrote a few tests, then refined my test framework to the point that writing a unit test is as simple as this:

def test_simple
  fname = '/proj/org/incava/glark/test/resources/textfile.txt'
  expected = [
              "    3   -rw-r--r--   1 jpace jpace   45450 2010-12-04 15:24 02-TheMillersTale.txt",
              "   10   -rw-r--r--   1 jpace jpace   64791 2010-12-04 15:24 09-TheClerksTale.txt",
              "   20   -rw-r--r--   1 jpace jpace   49747 2010-12-04 15:24 19-TheMonksTale.txt",
              "   24   -rw-r--r--   1 jpace jpace   21141 2010-12-04 15:24 23-TheManciplesTale.txt",
             ]
  run_app_test expected, [ '--xor', '\b6\d{4}\b', 'TheM.*Tale' ], fname
end

That’s Glark matching 6nnnn ^ TheM*Tale. At this point, grep has added some of what made Glark quite distinct from it — highlighted/colorized matches and context — but Glark’s most unusual (and fun to program) feature is matching of compound expressions, such as:

% glark --and=3 write --or puts print **/*.rb

That is matching “write” within 3 lines of puts or print.

Do you need that very often? Nope. But it does come in handy, such as in the case of “I’m looking for where we are catching an InvalidArgumentException and logging it (within the next 5 lines) as an error”:

% glark --and=5 'catch.*InvalidArgumentException' 'Log.error' **/*.java

Speaking of interviews, a friend of mine has a good practice: when he goes to a company to assess their software, he asks to see what they consider their worst code. Often that code is at the core of their project and is the oldest code, written early on by someone who may have left the company, and/or the code has been piled on with so much complexity that it is difficult to untangle.

In Glark, the worst code of the entire application was the Options class, clocking in at 761 lines long, containing the 42 options in Glark. This class is a Singleton, which is the fancy Design Pattern way of saying Global Variable. (The worst by-product of the Gang of Four was the sanctifying of Singletons as being a good practice.)

Another sign that the Options class was written poorly is an insanely simple metric: wc. That is, running the “wc” command on all files, sorting them numerically, and looking at the largest files. There it is at the bottom, in all its corpulent glory:

% wc **/*.rb | sort -n
    4     7   160 lib/glark.rb
  102   654  5213 lib/glark/help.rb
  183   527  4569 lib/glark/input.rb
  248   640  6064 lib/glark/exprfactory.rb
  266   681  6052 lib/glark/output.rb
  297   777  7392 lib/glark/glark.rb
  440  1048  9663 lib/glark/expression.rb
  761  2615 23377 lib/glark/options.rb
 2301  6949 62490 total

The Options class is used throughout Glark, so extracting it was quite challenging. I decomposed the Options class into smaller groups, and it just so happened that there was a design to follow, documented in, of all places, the help (man) page for Glark. That is, because there are so many options, for legibility and organization they are displayed in the man page in the groups “input”, “matching”, “output”, and “debugging/errors”. So I repackaged the Glark options into input, match, output, and info, and also used that as the organization for the modules within Glark. Thus the Glark::File class went into lib/glark/input/file.rb, and Grep::Lines went into lib/glark/output/grep_lines.rb.

I continued to refine the tests to the point that adding new ones was trivial. As the test coverage increased, this had the effect of making it an aberration when I worked on untested code.

On that note, here’s an easy way to test your test. That is, to determine if your test really works, break the code that it is testing. (A “return nil if true” at the beginning of a method works nicely.) If the test still passes, then it’s incomplete. If the test fails, then that should add confidence that test coverage is adequate. This is also where it becomes fun to break code, and to break tests. As with anything, if it’s fun, we’ll do more of it, which is why it is essential to have a test framework that makes tests easy (ergo, fun) to write.
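
A minimal sketch of that idea, with a made-up method and a one-line check standing in for a real test framework (none of these names come from Glark):

```ruby
# Code under test (a hypothetical example method).
def shout(text)
  text.upcase + '!'
end

# The same method, deliberately broken with the early return described above.
def shout_broken(text)
  return nil if true
  text.upcase + '!'
end

# A tiny check standing in for a real unit test.
def test_passes?(impl)
  impl.call('glark') == 'GLARK!'
end

puts test_passes?(method(:shout))         # true
puts test_passes?(method(:shout_broken))  # false: the test notices the break
```

If the second call had also printed true, the check would not really be exercising the method at all.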

For years I’ve tracked my daily progress by the simple metric of lines of code, but after hearing this suggested on the Ruby Rogues podcast, I’ve begun the practice of adding one user-facing feature per day, such as a new subcommand or option to PVN. My definition of “feature” includes adding and refining documentation, and it also includes removing options, especially if they are confusing, redundant, unused or obsolete.

I track features with a Features.txt in the root directory of each project, of the form (from PVN):

Thu Oct 25 19:22:52 2012

  seek command: added [ -C --no-color ] option.

This keeps me on track by actually recording features that are added. A script I run on all {project}/Features.txt files shows whether I’ve added one for today, and when I’ve missed a day (only one since I started doing this).
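
That script isn’t shown here, but the core of such a check might look like the following sketch. It assumes the timestamp format in the entry above; the method names and the second sample entry are made up for illustration:

```ruby
require 'date'

# Timestamp lines look like "Thu Oct 25 19:22:52 2012".
STAMP = /\A[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}\z/

# Returns the distinct dates that have at least one feature entry.
def feature_dates(text)
  stamps = text.each_line.map(&:strip).grep(STAMP)
  stamps.map { |line| DateTime.strptime(line, '%a %b %d %H:%M:%S %Y').to_date }.uniq
end

# Days in the range first..last that have no entry.
def missed_days(dates, first, last)
  (first..last).reject { |day| dates.include?(day) }
end

sample = <<~TEXT
  Thu Oct 25 19:22:52 2012

    seek command: added [ -C --no-color ] option.

  Sat Oct 27 08:10:03 2012

    hypothetical entry, for illustration only.
TEXT

dates = feature_dates(sample)
puts missed_days(dates, Date.new(2012, 10, 25), Date.new(2012, 10, 27))
```

With that sample, the only missed day reported is 2012-10-26.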

My process for adding features feels like an extension of the TDD process, in short:

  • conceive of a feature
  • add it as a test
  • implement it
  • run tests
  • refactor the tests and code …
  • document the feature in the readme and help files
  • add the feature to the features file
  • commit with the feature description as the commit message.

That’s about it for this update on the coarse rewriting of Glark. You can track the progress of it on GitHub (https://github.com/jeugenepace/glark), and if you want to see the code before the rewrite, check out revision 4d10f192f46ec3df34f971f8b40e03f8df0aed27.

Tweaking Emacs: Snippets

Using snippets with Emacs has dramatically increased my productivity. I’m using YASnippet, and am thoroughly impressed by it.

Setup is easy, so I won’t repeat that here.

I put my own snippets in ~/.emacs.d/lisp/yasnippets/snippets/java-mode and …/ruby-mode, and you can see them here.

The obvious benefit with snippets is their brevity, usually replacing unnecessary or redundant boilerplate code, such as this one for creating a list in Java:

# name : List<String> list = new ArrayList<String>();
# key: list
# --
List<${1:String}> ${2:list} = new ArrayList<$1>();$0

So creating a list of Integers, named “numbers” is done with “list<tab>Integer<tab>numbers”.

Naturally, since this is Emacs, there is a real programming language available for use, so snippets can call Emacs Lisp code, such as this for creating a constructor in Java, which assumes the constructor name is the basename of the current file:

# name: constructor
# contributor: jeugenepace@gmail.com
# key: init
# --
public `(file-name-nondirectory
         (file-name-sans-extension
	  (or (buffer-file-name)
	   (buffer-name (current-buffer)))))`($1) {
        $0
    }

This is simply applying the principle of automating work as much as possible. In the words of Larry Wall: “The computer should be doing the hard work. That’s what it’s paid to do, after all.”

There are two other aspects about using snippets that I think are overlooked. At least I hadn’t expected them when I began using snippets.

The first is that snippets can essentially provide their own documentation. That is, the syntax ${1:String} above means that the default text “String” will be displayed, until and unless the programmer overwrites it with the different variable type. This is especially beneficial for what often confuses me, the order of the iterated variable, and the accumulator variable (injection below, and I’ve also seen it called memo).

With snippets, the default text describes the variable functionality. Voila and viola, no more confusion:

# name: inject(...) { |...| ... }
# key: inject
# --
inject(${1:0}) { |${2:injection}, ${3:element}| $0 }

The second is that snippets add consistency across programming languages. My two primary languages are Java and Ruby (day job, night job), and I’ve added snippets to Java and Ruby mode for common functionality. For example, the Java snippet “constructor” above has the shortcut “init”, as does my snippet for Ruby mode:

# name : initialize
# key: init
# --
def initialize $1
  $0
end

Thus when I’m writing a constructor in either language, all I need to type is “init<tab>”. I’ve done the same for initializing instance variables with the same name as a constructor parameter, these two “svar” snippets (“svar” == “set variable”):

# name : set_variable
# key: svar
# --
@$1 = $1

# name : set instance variable
# key: svar
# --
this.$1 = $1;

What drove me somewhat crazy when editing snippets was that Emacs defaults to adding a final newline at the end of a buffer when saving it. Thus when I used one of my snippets, an extra line would be added. (For some snippets, I do want a newline, such as shortcuts for adding methods and constructors, which should be separated from other methods by a blank line.)

I knew of one solution, changing mode-require-final-newline to nil, but I didn’t want to change the global settings. I figured that YASnippet might have its own mode for editing snippets, and indeed it does, and even better, in that mode, a final newline isn’t written.

I set snippet-mode for my snippets with this:

(add-to-list 'auto-mode-alist '("\\.emacs.d/lisp/yasnippet/snippets" . snippet-mode))

Please do yourself a favor and check out YASnippet.

Sharpening the Emacs Saw

Emacs has been my primary/only editor for around 20 years, and I’ve become more aware of how little of its power I actually use. Emacs Rocks has been very inspirational in motivating me to further explore the functionality of Emacs. Lately I’ve been applying the DRY principle to Emacs itself, where if I find myself repeating a sequence of commands (including characters) in Emacs, I’ll write a macro or snippet (using YASnippet, which I highly recommend).

Integrating this approach with the Pomodoro and/or Getting Things Done principles, whenever I find myself repeating, er, myself, I’ll stop and add that sequence as appropriate, usually as a change to either my Emacs environment or my shell (Z shell) setup. It might be a bit of a distraction, diverting my attention for a minute or two, but the benefit is that I can begin using the new shortcut immediately. In fact, to enforce the new shortcut, if I accidentally use the equivalent long sequence, I’ll back up (such as deleting the just-typed characters) and use the new shortcut instead. This helps to erase the old muscle memory and enforce good, new habits.

YASnippet is especially valuable in automatically generating the boilerplate so much required by programming languages, especially Java. Even Ruby has its own overhead, and I recently heard (I’m about eight months behind) on a Ruby Rogues podcast that essentially if a Ruby programmer is actually typing “end”, then their environment is not adequately customized. I’d add that the same would apply to curly braces in Java and other C-derived languages.

A rule of thumb I’ve read is to add one script per day, and I’d advise the same for one’s editor, whether it be Emacs or any other powerful editor that can be customized, as virtually all of them can be now. PragProg now has a new Vim book, Practical Vim. I don’t know of an equivalent book for Emacs, but it seems that there would be a market for it.

How to Copy and Paste

Imagine this challenge: not copying and pasting for a week. One week, without
ctrl-V (or ctrl-Y, or just plain old p).

Yes, challenging, and it’ll make you realize just how much we’re violating the
concept of reuse, copying and pasting the same 1 or 2 or 200 lines of code. Yes, I
recently saw a method that was obviously copied from the one just above it,
another 200 line method, with — get this — only four lines changed, for two
variables, one a boolean and the other a string.

So the first rule of copying and pasting properly is to avoid doing it. (An
interesting feature of an editor would be to monitor the copying and
pasting a programmer does during a coding session, and warn them when the metric
becomes excessive: “Another 38 lines copied … are you sure that shouldn’t be a
method?”)

The next step is to copy as little as possible. *Never* more than two lines.
(Forgiven is the boilerplate code some languages require, and the copyright
comments at the top of files.)

My metric with regular coding is one line: if I start to copy two lines, I
consider rewriting that as a method. (This is fun, when copy-and-paste is
replaced by cut-and-paste, meaning that there is some refactoring and reuse
afoot.)

Now this is the best part, and this little trick will avoid a lot of bugs that
result from bad copying and pasting: don’t copy the characters that you’ll be
replacing.

For example, we have the line of Java:

        Foo bar = new Foo(String.valueOf(args[0]), new Frobnitz(args[1], args[2]), 6, "six", "vi");

And there will be a similar variable, named “baz”, but initialized with args 3,
4, and 5, and with different numbers:

        Foo baz = new Foo(String.valueOf(args[3]), new Frobnitz(args[4], args[5]), 7, "seven", "vii");

The normal way to copy would be to copy the entire line, then go back and modify
the newly-pasted code, replacing “bar” with “baz”, 0 with 3, 1 with 4, etc. The
problem with that is that it is too easy (and thus common) to omit updating one
of the variables, leading to this:

        Foo bar = new Foo(String.valueOf(args[0]), new Frobnitz(args[1], args[2]), 6, "six", "vi");
        Foo baz = new Foo(String.valueOf(args[3]), new Frobnitz(args[4], args[2]), 7, "seven", "vii");

Note that “args[2]” is wrong above, not being updated after the second line was
pasted.

How to avoid this: only copy the code that will not be replaced. So in the
above, we would first copy up to (but not including) the “0”, paste that, type
“3”, then copy and paste the next string up through “args[”:

        Foo bar = new Foo(String.valueOf(args[
                                              0
                                               ]), new Frobnitz(args[

At that point, I'd just type the entire remainder of the line, since it's faster
(and more accurate) to type a few characters than to copy and paste them.

So the copy and paste rules, in summary:

  • Avoid it.
  • Keep it short.
  • Keep what you paste.