JRuby Issue with Regexp.last_match

In the DoctorJ project, I’ve been rewriting the Javadoc parser, and did the initial rewrite in Ruby. That was straightforward, so I then began migrating the code to JRuby, with the idea that it could gradually morph from Ruby to Java via JRuby, for example by reimplementing regular expressions with java.util.regex.Pattern instead of Regexp.

The first step of the Ruby to JRuby transition was simply to change the shebang line to “#!/usr/bin/jruby”. However, there were test failures, and finding their source was difficult because the tests were so high level: they checked the parsed Javadoc, not the result of applying the Javadoc regular expression to the Java comment.

Eventually it became clear that the issue was with JRuby itself, not with my code. The JRuby code is easy to read and well formatted, and it closely matches the C source code of Ruby itself, which makes issues even easier to diagnose.

In this case, the issue was in the RubyRegexp class in JRuby. When it sets the value that Regexp.last_match will return, it keeps a reference to the region (capture/group) for the current match. However, that reference is to a “live” object (as opposed to an immutable one), and subsequent matches for that regular expression update the same region object. As a result, the first MatchData returned by Regexp.last_match ends up with captures that are the same as those of the latest match.

Here is a RubySpec that describes this issue:

describe "JRUBY-6141: MatchData#captures" do
  before :all do
    # Keep the MatchData from the first iteration only
    "first, last".scan(Regexp.new('(first|last)')) do
      @firstmatch ||= Regexp.last_match
    end
    @lastmatch = Regexp.last_match
  end

  it "returns first value from Regexp.last_match after all String#scan iterations" do
    @firstmatch.captures[0].should == "first"
  end

  it "returns last value from Regexp.last_match after all String#scan iterations" do
    @lastmatch.captures[0].should == "last"
  end
end

The solution is to clone the Region object when setting the Regexp.last_match reference, so that each MatchData keeps its own copy of the region.
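
To make the aliasing concrete, here is a minimal sketch in plain Ruby. It is an analogy only, not JRuby's actual Java code: Region, FakeMatch, and scan_words are hypothetical names invented for this illustration. One region object is reused for every match, and each match object either shares it or takes its own copy.

```ruby
# Hypothetical stand-ins for illustration; these are NOT JRuby's classes.
Region = Struct.new(:start, :stop)

class FakeMatch
  def initialize(string, region)
    @string = string
    @region = region
  end

  # The captured text is computed from whatever the region holds right now.
  def capture
    @string[@region.start...@region.stop]
  end
end

# Scans for "first" or "last", reusing a single Region for every match.
# With copy_region: false each FakeMatch shares that live Region (the bug);
# with copy_region: true each FakeMatch gets its own copy (the fix).
def scan_words(string, copy_region:)
  region = Region.new(0, 0)
  matches = []
  pos = 0
  while (m = /first|last/.match(string, pos))
    region.start = m.begin(0)
    region.stop = m.end(0)
    matches << FakeMatch.new(string, copy_region ? region.dup : region)
    pos = m.end(0)
  end
  matches
end

shared = scan_words("first, last", copy_region: false)
copied = scan_words("first, last", copy_region: true)

puts shared.map(&:capture).inspect  # ["last", "last"]
puts copied.map(&:capture).inspect  # ["first", "last"]
```

With the shared region, the first match reports "last" because the second match overwrote the offsets it depends on. Copying the region at the moment the match is recorded preserves the first result.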

This issue was submitted as JRUBY-6141. JRuby uses RubySpec, so I provided the above test along with a Git patch, both of which were committed to the JRuby source.