This Week on perl5-porters - 25 September-1 October 2006

"Well not since that time you thought that embedding perl was a good idea in the first place" -- Jonathan Stowe, commenting on Marc Lehmann saying that yes, he loved embedding perl in other applications and that no, he didn't do drugs.

Topics of Interest

Wrapping up EVAL handling in blead's regexp engine

Dave Mitchell checked in a couple of patches to fix the breakage that had crept in during the recent overhaul to the regexp engine. Yves Orton liked what he saw, and suggested a few ways to improve the maintainability.

Dave told him to go for it. And so he did.

  Mad scientists at work

pod errors in perlref.pod

David Landgren noticed some formatting errors when looking at perlref in blead. Rafael Garcia-Suarez wondered what podlator was being used, since the pod checked out as valid. The podlator in question was the bundled Pod::Perldoc, which led David to wonder why something like Russ Allbery's podlators code was not being used instead. Russ explained what was holding this up.

Nick Ing-Simmons

It was with great sadness that the porters learnt that Nick Ing-Simmons had died of a heart attack. Nick was one of the longest serving contributors to Perl's development, and we are all poorer for his loss. I can recall many a time when people puzzled over why things were done in a particular way in the code, until Nick would chip in with a cogent explanation. It will be harder without him.

The Perl Foundation ran the following obituary

  Thanks Nick

Unhandled exceptions in shared libraries

Kalpana Shetty explained a problem with shared libraries throwing an exception, and the perl program dumping core on a SIGABRT signal. No takers, possibly since the exact nature of the shared library wasn't specified.

reentr reshuffle

Jarkko Hietaniemi looked at bug #40256 and a few other bits and pieces, and rejigged to make everyone happy. In fact, Marc Lehmann was positively ecstatic.

  Get down

seek beyond length

Juerd Waalboer wanted to know whether it was reasonable to seek, upon a scalar opened for IO, beyond its length. And especially what happens as a result. Dave Mitchell confirmed that Bad Things were indeed happening, that probably should be fixed.


I think that this may be what inspired Jarkko to cook up the following patch:

  Don't talk to strangers

Finding the *correct* line number

Curtis "Ovid" Poe wanted to disambiguate two calls to a function on the same line with caller. Dave Mitchell mentioned that he had proposed to spend two bytes per op, which, at the expense of a certain increase in memory consumption, would clear these sorts of problems up, once and for all.
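The ambiguity is easy to reproduce. The following is a minimal sketch, not Ovid's original code; the helper name called_from is made up for illustration. It shows that caller reports the same line number for two calls made within one statement, so the line alone cannot tell them apart.

```perl
use strict;
use warnings;

# Hypothetical helper: report the line number of whoever called us.
sub called_from {
    my (undef, undef, $line) = caller;
    return $line;
}

# Two calls in the same statement, on the same line.
my ($first, $second) = (called_from(), called_from());

print "first call:  line $first\n";
print "second call: line $second\n";
```

Both lines of output carry the same number, which is the information gap that per-op bookkeeping would close.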

Fixing up lndir to improve source tree symlinking

Jim Cromie made a small change to lndir to make it behave more nicely in the presence of modified files (which could otherwise clobber the pristine base tree). This kicked off a discussion with John Peacock, Rafael Garcia-Suarez and Andy Dougherty about other tricks that can be used to deal with similar sorts of home experiments.

Error message changes in blead

Yves Orton mourned the passing of pseudo-hashes, and their attendant Can't coerce array into hash error message. Nicholas Clark reasoned that it should be possible to change the current Not a HASH reference to something like can't coerce ARRAY reference to HASH reference, but wondered if it would break anything.

Yves pointed out that since the message had already changed, the issue was moot. Nicholas agreed that an error message should explain why something was wrong, rather than just saying that it was wrong.

One may assume that afterwards, Yves went away to see what could be done to improve the situation, and came back with a new conclusion: perl's error messages suck. They are scattered throughout the source, which makes maintenance difficult, and localisation impossible. He proposed changing things like

        DIE(aTHX_ "Not a GLOB reference");



With the text of each message kept in one central place, a message can be changed in one single place. It would also allow commercial vendors to make perl's error messages fit in with their own frameworks (witness the recent z/OS efforts), and finally allow messages to be translated into other languages.

Jonathan Stowe wasn't exactly opposed to the idea, but mentioned the frequent idiom:

  eval { somethingnasty() };
  $@ and $@ =~ /Some nasty error/ and recover();

Rafael pointed out that there are messages with %s specifiers as well, but that it would surely benefit the maintenance of perldiag.
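To make the trade-off concrete, here is a minimal sketch in plain Perl rather than in the C of the perl source. The %MESSAGES catalogue, the message helper, the not_a_ref identifier and the German wording are all hypothetical and are not taken from Yves's proposal. It shows a template with a %s specifier living in one place, and the eval idiom above ceasing to match once that text is changed.

```perl
use strict;
use warnings;

# Hypothetical catalogue: message identifiers mapped to sprintf templates.
my %MESSAGES = (
    not_a_ref => 'Not a %s reference',
);

# Hypothetical helper: look up a template by identifier and fill it in.
sub message {
    my ($id, @args) = @_;
    my $template = $MESSAGES{$id} or die "Unknown message id '$id'\n";
    return sprintf $template, @args;
}

# The idiom from the thread: trap an error, then match on its text.
sub recovered {
    my $ok = 0;
    eval { die message(not_a_ref => 'GLOB'), "\n" };
    $@ and $@ =~ /Not a GLOB reference/ and $ok = 1;
    return $ok;
}

my $before = recovered();                     # the text still matches
$MESSAGES{not_a_ref} = 'Keine %s-Referenz';   # reworded in one single place
my $after  = recovered();                     # the regexp no longer matches

print "before=$before after=$after\n";
```

The single point of change is what makes translation feasible, and it is also what breaks callers that match on the message text.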

  Globales Symbol "x" erfordert ausdrücklichen Paketnamen

Patches of Interest

What Yves Orton did this week

Nicholas applied Yves's patch from last week, which prompted Yves to deliver another patch, this time sawing off six thread-shared variables. The result is that use re 'debug' is now lexically scoped. This latter patch was redone after Rafael had a bit of trouble with it. Yves also took the opportunity to throw in some tests for lexically scoped regexp debugging.


On to the main course: Yves then delivered a patch that introduced a new assertion that allows the writing of elegant expressions to match arbitrarily deeply nested pairs of tokens. For instance, one could parse XML documents with a single recursive pattern (along with the appropriate amount of hand waving).

Yves was critical of the implementation, in that it requires capturing (...) to be used, rather than grouping (?:...), but on the other hand it conforms with what Python and PCRE already do.
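As a minimal sketch of the idea, here is a pattern for balanced parentheses rather than XML. It assumes the (?1) recursion syntax and the possessive ++ quantifier as they later shipped in perl 5.10; the syntax in the patch as posted may have differed.

```perl
use strict;
use warnings;

# Group 1 matches one parenthesised unit: "(", then any mix of non-paren
# characters or nested units (via the recursion (?1)), then ")".
my $balanced = qr/ ^ ( \( (?: [^()]++ | (?1) )* \) ) $ /x;

for my $text ('(a(b)c)', '(a(b c)', '((()))') {
    printf "%-10s %s\n", $text,
        $text =~ $balanced ? 'balanced' : 'not balanced';
}
```

Note that the recursion refers to a capturing group by number, which is the design point Yves was unhappy about.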

Dave Mitchell was most impressed, and quizzed Yves about the behavioural semantics regarding backtracking. Rafael wanted to know if it was possible to extend the patch to allow case sensitivity changes by way of something like (?i(?1)). (Alas, no).

Robin Houston also expressed his delight at the patch and admitted to being the party who added this functionality to PCRE since he lacked the courage to attack Perl's regular expression internals. He too raised a couple of questions about behaviour in borderline areas.

Yves followed up with another version of the patch that added user documentation and more tests. He has also been reading Jeffrey Friedl's Mastering Regular Expressions, and has taken up the challenge to resolve as much as possible the areas in which Jeffrey finds Perl's regular expressions wanting. He laid out his roadmap in perltodo.

All applied.

  And the crowd goes wild

And as Yves was sweeping the floor of the lab clean from his previous efforts, he bundled up one last patch to tidy the regexp debugging output somewhat.

Improving Exporter documentation

Smylers caught up on the Exporter thread (thanks to the summary, heh) and argued in favour of ensuring that code was strict-compatible, even if use strict wasn't explicitly mentioned in the snippet. This means either using package variables or using our.
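A minimal sketch of what a strict-compatible snippet might look like follows. The package name My::Tools and the function double are made up for illustration, and the explicit import call stands in for a use statement only so that the example fits in a single file.

```perl
package My::Tools;   # hypothetical module name
use strict;
use warnings;
use Exporter ();

# Declared with "our", so the snippet compiles under strict.
our @ISA       = ('Exporter');
our @EXPORT_OK = ('double');

# The fully qualified spelling is also strict-clean:
# @My::Tools::EXPORT_OK = ('double');

sub double { return 2 * $_[0] }

package main;

# Roughly what "use My::Tools qw(double)" does at compile time.
My::Tools->import('double');

print double(21), "\n";
```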

Continuing to deal with some gcc warnings

Sadahiro Tomoyuki figured out why a recent patch of Jarkko's was causing things to go boom! on a perl compiled with -Duse64bitint. Patched.

File::Temp doesn't handle cmp overloading

Rafael Garcia-Suarez posted a short snippet demonstrating a problem with File::Temp overloading. When stringified (interpolated in a string), a File::Temp object returns the underlying file name.

This works fine until you try to compare the stringified object to a scalar, at which point perl starts looking for an eq operator and dies. So the fix was to overload cmp, and all was well, but Rafael wondered whether overload should synthesise a cmp from a stringification override.

Rick Delaney pointed out that it will already do this if you add the fallback => 1 attribute. John Peacock didn't like the idea of generating cmp automatically.
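The difference can be seen without File::Temp itself. In this minimal sketch the two classes, Strict::Name and Lenient::Name, are hypothetical stand-ins that overload only stringification, one without and one with fallback => 1.

```perl
use strict;
use warnings;

package Strict::Name;    # hypothetical: stringifies, no fallback
use overload '""' => sub { $_[0]{name} };
sub new { my ($class, $name) = @_; return bless { name => $name }, $class }

package Lenient::Name;   # hypothetical: the same, with fallback => 1
use overload '""' => sub { $_[0]{name} }, fallback => 1;
sub new { my ($class, $name) = @_; return bless { name => $name }, $class }

package main;

my $strict  = Strict::Name->new('/tmp/example');
my $lenient = Lenient::Name->new('/tmp/example');

# Without fallback, eq has no usable method and the comparison dies.
my $strict_ok  = eval { $strict eq '/tmp/example' };
my $strict_err = $@;

# With fallback => 1, perl stringifies the object and compares as usual.
my $lenient_ok = eval { $lenient eq '/tmp/example' };

print "without fallback: ", ($strict_err ? "died" : "ok"), "\n";
print "with fallback:    ", ($lenient_ok ? "equal" : "not equal"), "\n";
```

Overloading cmp explicitly, as in the fix, gives the class author control over the comparison; fallback => 1 simply lets perl do what it would have done for a plain string.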

cflags.SH: The Revenge of gcc -std=c89

Jarkko, ever the optimist, kicked through the rubble of his previous C89 configuration patch and worked up a scan to see how -ansi, -pedantic and -std=c89 behave in the face of weird system headers and other platform oddities. He asked for people on proprietary hardware to take it for a spin and see what comes of it.

  Don't take no for an answer

New and old bugs from RT

threads creation memory leak (#40416)

Santeri Paavolainen posted a short program that demonstrated a fairly definite leak when creating threads. Dave Mitchell noted that it was fixed in blead.

  Another reason

File::Find has issues with symlinks (#40417)

Ammon posted a very long and detailed report about the problems that File::Find has in relation to symbolic links. Code included.


Unicode Command Line Arguments (#40418)

Dale Gerdermann wanted to read Unicode arguments from the command line and it seemed to work in all cases until he hit a regular expression, and there he had to force a utf8::upgrade($arg) for it to work.

Dave Mitchell pointed out the -C command line switch that takes care of this issue.
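For readers unfamiliar with the underlying issue, here is a minimal sketch of why the argument needs treatment at all. It does not reproduce Dale's program: the byte string below merely stands in for an argument as it might arrive in @ARGV, and the explicit decode is an assumption about roughly what running with -CA arranges for you.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for a command line argument: "cafe" with an acute accent,
# as the five UTF-8 encoded bytes the shell would hand over.
my $raw = "caf\xc3\xa9";

# Decode the bytes into a four-character string.
my $arg = decode('UTF-8', $raw);

printf "raw:     length %d, all word characters: %s\n",
    length $raw, ($raw =~ /^\w+$/ ? 'yes' : 'no');
printf "decoded: length %d, all word characters: %s\n",
    length $arg, ($arg =~ /^\w+$/ ? 'yes' : 'no');
```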

Dale also found similar problems when using LWP with Unicode (#40432).

XML-Twig tests cause bleadperl to segfault (#40420)

Shlomi Fish reported that the test suite of XML-Twig version 3.26 causes segfaults on bleadperl, but forgot to say which tests were at fault.

Segfault in pack (#40427)

dgay showed how to provoke a core dump with a simple call to pack and provided a simple patch to fix up the problem. Unapplied.

  Should have a test, too

Perl5 Bug Summary

  A net increase of ten bugs this week

  Kill a bug today

New Core Modules

In Brief

Dave Mitchell applied an expedient hack to allow \x{NNN} in t/op/re_tests.

  So there's no excuse now

Mark Stosberg wanted to clarify the documentation for with a short snippet. H.Merijn Brand golfed it.

  And still readable

David Landgren made a small clarification in perlref concerning the interpolation of scalar references.

Christian Jaeger made a couple of suggestions to improve the navigation between Request Tracker and CPAN.

Jim Cromie wanted to start cooking with perl.gcov, so Steve Peters and Sébastien Aperghis-Tramoni gave him a couple of recipes.

  Got you covered

John E. Malmberg, responsible in large part for bringing Perl's VMS implementation kicking and screaming into the third millennium, had to say goodbye for now. He's out looking for a new job, and so had to bow out from the list for the time being.

  Thanks, and good luck

Yves Orton noticed that the ptree constants TODO tests are passing and wondered if that meant that they could be untodo'ed.

Jim Cromie proposed a -U patch, that stands for Unofficial Userhacking.

Jarkko thought that gcc's -pedantic should be renamed -useless.

About this summary

This summary was written by David Landgren.

Weekly summaries are published on the web and posted on a mailing list; an archive is also kept. Corrections and comments are welcome.

If you found this summary useful, please consider contributing to the Perl Foundation to help support the development of Perl.