This Week on perl5-porters - 29 May-4 June 2006

Funnily enough I developed a patch for this exact problem a few days ago. -- Dave Mitchell, commenting on bug #39252

Topics of Interest

On un-eval-able code

Peter Valdemar Mørch ran into a curious problem with dying code: when wrapped in an eval, the code apparently still died, yet nothing was captured in $@. This only occurs when perl runs out of file handles.

Dave Mitchell saw that Carp overwrites $@ all by itself if it fails to load Carp::Heavy, and so he put a stop to that, and also tweaked pp_ctl.c to deal with running out of file handles in a more graceful manner.

  Can you handle it

A better DynaLoader.t with less assumptions

The DynaLoader thread continued to attract a surprising amount of traffic, with an attempt by Jarkko Hietaniemi to provide a test suite that would work in a suitably safe cross-platform manner. Craig A. Berry weighed in with some information on the quirks that VMS brings to the party.


Redoing part of change #27374

Back in March, H.Merijn Brand checked in some changes to Porting/ to teach it about VMS. Abe Timmerman discovered that the DCL code in question doesn't run too happily on VAX/VMS 7.2 and suggested an improvement. However, he then saw compiler errors in globals.c due to conflicting definitions of variables.

Craig A. Berry was happy to see this level of care and attention being devoted to the VAX, and explained how to debug the problem. John E. Malmberg also provided some information on how to trace back from the error message to the source (which involves a certain amount of unwinding of included header files).

Pod::Html not safe with taint mode

Ivan Wills runs the DocPerl project on SourceForge, and noticed that Pod::Html doesn't play nicely with taint mode, due to its internal caching mechanism calling getcwd(), and wondered what to do about it.


  The p5p thread

Documenting how the regexp engine works

Having spent a fair amount of time under the hood of the regular expression engine recently, Yves Orton started to document his understanding of how it works, and asked for feedback. Several people made some suggestions.

Notably, Dave Mitchell wanted the documentation to make it clear that it was a "this is sort of how it works, more or less, today, maybe a bit" type document, and not something that people could construe as a guarantee, and therefore use it as an API, forever condemning the future porters to maintaining the behaviour in a compatible manner.

Marvin Humphrey hoped that the information provided would be sufficient for an XS author to figure out how to execute a regular expression directly from C.

  First cut

Yves rolled the suggestions into his prose, and delivered a new version.

  Second cut

Using the Aho-Corasick pattern matching algorithm

Yves then shipped a masterful patch that adds another alternative for the engine to use to find matches. The fact that he dared delve into intuit_start deserves a round of applause. But before people could do anything, Yves withdrew the patch, having spotted a way to improve it.

  It ain't done until it's done

What are legal characters inside $(xxxx)?

John E. Malmberg wanted to know what $(xxxx) could be expected to contain, since it turns out that on an OpenVMS ODS-5 filesystem, ./$(xxxx) is a perfectly valid filename, and he was worried that Perl would try to interpret it in odd ways, or rather, that a make-like utility might have a bit of trouble with it.

Craig A. Berry responded, saying that he thought that John had stumbled across a weird interaction between Perl and VMS's DCL parsing and macro expansion, and wanted to know where and why the problem arose.

John explained that, while the system had worked adequately up until now, the advent of ODS-5 filesystems changes things somewhat, and he wanted to see how a Unix filename encoded in UTF-8 would be handled in this brave new world.

  In a desert on a file with no name

tied scalar references (aren't)

Dave Mitchell reported a problem that had surfaced on Perlmonks: when a reference is tied, after the first call to FETCH(), pp_entersub and pp_rv2sv contain code that accidentally on purpose short-circuits FETCH, preventing it from being called again and thus stopping the usual tie mechanism from operating.

Nicholas Clark considered the issue, and concluded that while the current behaviour was broken, and while it was clear what the correct behaviour should look like, fixing it would not be easy.

The problem is that magic needs to be invoked only once when reading or writing a variable. For ordinary scalars, this works fine, but for references, it doesn't work, because there is nowhere to squirrel away the necessary flag to determine whether or not to invoke magic. And if fixing it means that magic might get called twice, the result is likely to be even worse than failing to call it once.

  By the pricking of my thumbs

Watching the smoke signals

Smoke [5.9.4] 28316 FAIL(F) openvms V8.2 (Alpha/3cpu)

This failure, on one of Abe Timmerman's boxes, was caused by Compress::Zlib's test suite: for some reason, a scalar that was supposed to be holding the name of a file in which to capture STDERR was empty. Paul Marquess asked for the verbose output of the tests, to see what was going wrong.

Nicholas Clark wondered whether the tests would fail in a similar fashion if the variable held a single newline character, as he recalled that on VMS, newlines will sometimes wind up where Unix people wouldn't expect to find them. Craig A. Berry ran the tests on some other equipment, but couldn't reproduce the failures. On the other hand, he did notice that autosplit.ix might not wind up in the right place.

  Up in smoke

New and old bugs from RT

More on memory leaks from eval "sub { \$foo = 22 " (#37231)

Nicholas Clark thanked Dave Mitchell for the eval leak plugging, and reported that the only way he was able to make perl leak on bad code was with 1; use abc and a downright evil BEGIN {$^H{a}++}; 1; use 6.

Dave Mitchell scoffed at this feeble attempt and added a SAVEDESTRUCTOR_X function to deal with cleaning up the parse stack if it fails prematurely. He was brave enough to announce that that should be the end of the eval leaks.

  The holey grail

More on readline of a non-newline-terminated last line results in Bad file descriptor (#39060)

Mark Martinec reported this problem in early May. Andreas Koenig moved the issue forward by using his binary-chop compilation technique to trace the problem back to a change made by Jarkko in 2003.

  Being silently happy

More on numeric comparison operands not treated consistently (#39062)

Andreas then wheeled the time machine out again, this time isolating the problem to a patch from Gurusamy Sarathy in 2000. In doing so, he discovered that an easy work-around consists of adding $Data::Dumper::Useperl=1, which makes the problem go away (that is, use the pure-Perl rather than XS version).

He also slimmed the test case down to reveal the peculiar differences between left and right operands in numeric comparisons more clearly.

  Not your usual number

Failure not always detected in IPC::Open2::open2 resolved (#39127)

Dave tightened up the documentation, and made the code deal correctly with exec failures.

  There's always IPC::Run

By a strange coincidence, "xdrudis", doing his work at tinet, was dealing with the same job. He had built an elaborate bug report to suggest a number of ways of handling it. Dave dealt with it in change #28347.

  The head was never found

IO::Socket::connect returns wrong errno on timeout resolved (#39178)

Steve Peters applied the fix that was suggested in the bug report.

Panic opt close (#39233)

Johan Vromans found that m/(?:(\w\w){2}){8}/ causes perl to panic with a Panic opt close in regex. David Landgren found that an even simpler pattern, m/(?:(a){1})?/, would also do the trick.

  No idea why, though

Core dump in Storable::store (#39246)

Mark-Jason Dominus managed to provoke Storable into dumping core, by using it as a backing cache for his Memoize module.

Steve Peters identified this problem as being related to, or a duplicate of, bug #21436.

defined-ness of substrings disappear over repeated calls (#39247)

Martin at datacash discovered a problem with substrings in subroutines losing their pPOK, which can lead to all sorts of bad things, the least of which being scalars passed to DBD::Mysql being treated as NULL when they are not.

Sadahiro Tomoyuki determined that change #20462, designed to fix bug #23207, was the origin of the problem. He was puzzled, however: reverting the change fixed Martin's immediate problem, yet the original problem that the patch was meant to fix didn't manifest itself, which would mean it had been fixed by another, unrelated patch.

  A twisty maze of changes, all alike

On the other hand, when Martin did the same, the problem did show up, which made him wonder a bit, until Tomoyuki explained that the newly-unmasked bug shows up on 5.9.0, but no longer on versions 5.9.1 through 5.9.3. He also found the source of confusion: bug #24816, which was fixed in blead with #22074, and later backported as change #27391 in the maintenance track, solves both problems and thus #20462 can well and truly be reverted.

  Confused yet?

All this made Martin wonder whether DBD::Mysql should be taking more care with what gets passed to it, specifically, testing for magic as well as SvOK, which would make it more robust.

Tomoyuki thought that perl's internals are far too complicated for XS authors to be expected to deal with in all circumstances, and that a better solution in the present case would be to store the substring into a temporary scalar and pass that, instead of passing the substring directly.

  Substrings are deeply magical

handy.h does not guard against being #included twice (#39251)

Gabriel Nazar noticed that handy.h doesn't have the #ifdef/#define/#endif trick to stop the compiler from going bananas when it gets included more than once. H.Merijn Brand wanted to know what circumstances led to this happening in the first place.
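For readers who haven't met the trick, here is a minimal sketch of an include guard, simulating a double inclusion inside a single file. The names DEMO_HANDY_H, demo_flags and demo_guard_works are invented for illustration and are not taken from perl's handy.h.

```c
/* Simulated header body, "included" twice in one translation unit.
 * All DEMO_* and demo_* names are hypothetical, not from handy.h. */

/* ---- first inclusion: guard macro undefined, body is compiled ---- */
#ifndef DEMO_HANDY_H
#define DEMO_HANDY_H
enum demo_flags { DEMO_FALSE = 0, DEMO_TRUE = 1 };
#endif /* DEMO_HANDY_H */

/* ---- second inclusion: guard macro already defined, body is skipped.
 * Without the guard, the repeated enum definition would be a
 * compile-time error. ---- */
#ifndef DEMO_HANDY_H
#define DEMO_HANDY_H
enum demo_flags { DEMO_FALSE = 0, DEMO_TRUE = 1 };
#endif /* DEMO_HANDY_H */

/* Compiles only because the enum above was defined exactly once. */
int demo_guard_works(void)
{
    return DEMO_TRUE;
}
```

Removing the two #ifndef/#endif pairs should make the compiler complain about the redefinition of demo_flags, which is the kind of banana-going the bug report is about.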

  I want handy

stat() doesn't work on dirhandles (#39261)

Mark-Jason Dominus posted a short snippet that showed how you can't stat a handle opened by opendir and get anything useful back. Peter Dintelmann thought that stat was documented as not doing anything useful with directory handles.

Steve Peters had some code on the back-burner that could deal with this issue, and others, so he dusted it off to see what could be done.

  Would be handy

Perl make failure on hpux system (#39269)

Brian Shields was having considerable difficulty in compiling a version of Perl for HP-UX, which boiled down to getting ndbm and/or gdbm working correctly.

Dominic Dunlop tried to explain what was going wrong, and the steps Brian should take to get everything working. H.Merijn pointed to his HP-UX repository, where Brian might be able to find some binary packages for the stuff he needed.

  The HP-UX repo

Perl5 Bug Summary

Two fewer bugs this week than last week. Next stop, 1300.

  10 opened - 12 closed = 1488

  And the rest of the gang

New Core Modules

  • Encode version 2.18 released by Dan Kogai, the Encode maintainer.

    Ruud Affijn pointed to an NNTP article about unexpected encoding behaviour, but the summariser has unexpected decoding troubles in trying to read it.

In Brief

Jarkko Hietaniemi tweaked t/op/incfilter.t to play nicely when doing the make clean; make miniperl; make minitest dance,

  Edit, compile, run

and also gave PerlIO_cleanup() the responsibility of freeing PL_perlio_fd_refcnt, since nothing else appeared to be doing so.

  A dirty job, but someone's gotta do it

Yves Orton (re)?discovered that C is not Perl and AVs don't just magically free themselves.

  DWIM, dammit!

Jarkko Hietaniemi posted a link to the Open Watcom compilers, and wondered if anyone had tried them. Steve Peters said that he had had a brief look in the past and had got as far as figuring out what make command to use, and then ran out of tuits.

  What compiles

Andreas Koenig noticed a couple of files missing from MANIFEST relating to test files.

Brendan O'Dea figured out what the problem was with Debian mis-packaging perl which caused List::Util to end up lacking a shared object file. Rafael applied Brendan's fix.

Juerd Waalboer delivered an updated version of perlunitut, a tutorial on dealing with Unicode in Perl, and this was added to blead by Rafael.

  Unicode redecoded

He also pinged the list again about an issue with Encode and encoding, to see if anyone had some ideas on the matter. Johan Vromans seemed quite taken with the concept (of what to do when both source and destination use different encodings). Rafael thought that patches would be nice.

  Scratch that itch

Daniel Frederick Crisman took another stab at three minor fixes related to perlop. Rafael thought that the first fix fixed something that wasn't broken, that the second dealt with a file that was auto-generated (perhaps meaning that the source document needs to be patched instead), and that the third fix had POD errors.

  podchecker is your friend

Sadahiro Tomoyuki gave perlunicode.pod a workover to improve the narrative. Applied by Rafael.

Jarkko found that his AIX compiler had considerable trouble with an #ifdef inside a macro definition, and sought a way to fix it. This wasn't as simple as first thought, and in the end John E. Malmberg pointed out how things could still go wrong on OpenVMS when using Bash.

He also noticed that pp_ctl.c uses Latin-1 characters (or in any event, ASCII characters beyond 127), which would also be a portability issue.

And finally, Jarkko noted some weird errors with tests under ext/B, which Abe Timmerman thought was probably due to Test::Smoke not keeping up with the changes to Test::Harness. Smokers were advised to upgrade their smoking implements.

  Stick that in your pipe

