JRuby Issue with Regexp.last_match

In the DoctorJ project, I’ve been rewriting the Javadoc parser, and did the initial rewrite in Ruby. That was straightforward, so I then began migrating the code to JRuby, with the idea that it could gradually morph from Ruby to Java via JRuby, for example by reimplementing regular expressions with java.util.regex.Pattern instead of Regexp.

The first step of the Ruby-to-JRuby transition was simply to change the shebang line to “#!/usr/bin/jruby”. However, some tests then failed, and tracking down the cause was difficult because the tests were so high level: they checked the parsed Javadoc, not the result of matching the Javadoc regular expression against the Java comment.

Eventually it became clear that the issue was in JRuby itself, not in my code. The JRuby source is easy to read and well formatted, and it closely mirrors the C source of Ruby itself, which makes issues easier to diagnose.

In this case, the issue was in the RubyRegexp class in JRuby. When it sets the value that Regexp.last_match will return, it stores a reference to the region (the capture groups) for the current match. However, that reference is to a “live”, mutable object rather than an immutable one, and subsequent matches with that regular expression update the same region object. As a result, the first MatchData returned by Regexp.last_match ends up with the same captures as the latest match.
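The effect of holding a live reference can be sketched in plain Ruby. This is only an illustration of the aliasing, not JRuby's code: Region and MatchSnapshot here are hypothetical stand-ins, and the offsets are hard-coded for the string "first, last".

```ruby
# Hypothetical stand-ins for illustration; not JRuby's actual classes.
Region = Struct.new(:beg, :fin)

# Keeps a reference to whatever region it is given, without copying it.
class MatchSnapshot
  def initialize(string, region)
    @string = string
    @region = region
  end

  def matched_text
    @string[@region.beg...@region.fin]
  end
end

text = "first, last"
live_region = Region.new(0, 5)                    # offsets of "first"
first_match = MatchSnapshot.new(text, live_region)
before = first_match.matched_text                 # read before the next match

# The matcher reuses the same region object for the next match ("last").
live_region.beg = 7
live_region.fin = 11
after = first_match.matched_text                  # same snapshot, read again

puts "before: #{before}, after: #{after}"
```

The snapshot itself never changes, yet what it reports does, because the object it points to was updated underneath it.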

Here is a RubySpec that describes this issue:

describe "JRUBY-6141: MatchData#captures" do
  before :all do
    "first, last".scan(Regexp.new('(first|last)')) do
      @firstmatch ||= Regexp.last_match
    end
    @lastmatch = Regexp.last_match
  end

  it "returns first value from Regexp.last_match after all String#scan iterations" do
    @firstmatch.captures[0].should == "first"
  end
  
  it "returns last value from Regexp.last_match after all String#scan iterations" do
    @lastmatch.captures[0].should == "last"
  end
end
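For a quick check without the RubySpec harness, the same scenario can be run as a standalone script. This is a sketch that mirrors the spec above; the variable names are mine, and it uses only String#scan and Regexp.last_match.

```ruby
first_match = nil

# Capture the MatchData from the first iteration only.
"first, last".scan(/(first|last)/) do
  first_match ||= Regexp.last_match
end

# After scan finishes, last_match refers to the final successful match.
last_match = Regexp.last_match

first_capture = first_match.captures[0]
last_capture  = last_match.captures[0]

puts "first capture: #{first_capture.inspect}"
puts "last capture:  #{last_capture.inspect}"
```

On an interpreter with the behavior described in this post, both lines would show the same capture; on one without the issue, they should differ.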

The fix is to clone the Region object when setting the value that Regexp.last_match returns, so that each MatchData keeps its own copy.
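The difference a copy makes can be shown in a few lines of Ruby. This is a minimal sketch of the idea only, not the actual patch: CaptureRegion is a hypothetical stand-in for the region object, and Struct#dup plays the role of the clone.

```ruby
# Hypothetical stand-in for illustration; not JRuby's actual Region class.
CaptureRegion = Struct.new(:beg, :fin)

text = "first, last"
working = CaptureRegion.new(0, 5)   # offsets of "first"

aliased = working        # same object as the matcher's working region
copied  = working.dup    # independent copy taken at match time

# The matcher moves on to "last" and overwrites its working region.
working.beg = 7
working.fin = 11

aliased_text = text[aliased.beg...aliased.fin]
copied_text  = text[copied.beg...copied.fin]

puts "aliased: #{aliased_text}, copied: #{copied_text}"
```

Only the copied region still describes the first match once the working region has been reused.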

This issue was submitted as JRUBY-6141. JRuby uses RubySpec, so I provided the above spec along with a Git patch, both of which were committed to the JRuby source.
