This Week on perl5-porters - 24-30 April 2006

Lots of work consting, compiling and testing perl this week.

Topics of Interest

Running hard just to stand still

Nicholas Clark commented on Jerry D. Hedden's cry for help concerning the slow application of his threads patches, and wondered how other open source languages, such as Lua, Python, Ruby and Tcl, cope with the day-to-day drudgery of sifting through bug reports, shepherding the changes to the code base, and generally making sure that the project keeps moving forward.

  Holding the fort

New taint tests cause threaded FreeBSD to hang

Nicholas Clark saw that some new taint tests were causing perl to hang on recent versions of FreeBSD, and wondered what to do about it. Steve Peters reproduced the problem on OpenBSD as well.

On the other hand, Dominic Dunlop gave Mac OS/X Darwin a clean bill of health (despite the fact that it is derived from FreeBSD).

Anton Berezin (the man for Perl on FreeBSD) suggested linking with a different threads library. Everything worked when Nicholas tried that, leading him to conclude that hints/ could do with an update.

  I'll give you a hint

Rebuilding XS modules for threaded perl

Peter Scott wanted some "best practices" advice on how to build a threaded and non-threaded perl, and have them coexist peacefully, falling back on the unthreaded perl for all the ancillary tools such as perldoc and perlbug. Specifically, and hence the title of his message, he wanted to know if he could get away with using the same compilation of XS modules for both perl instances.

Jan Dubois clarified the situation, explaining that threaded and unthreaded perls are binary incompatible, so modules with XS components need to be recompiled, and need to be installed in separate locations (so that the associated perl picks up the right libraries for itself).
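A quick way to see which kind of perl you are holding is to ask its Config. This is only a sketch: the helper name build_summary is made up for illustration, and the values printed depend entirely on how each perl was configured.

```perl
use strict;
use warnings;
use Config;

# Describe the running perl: threaded or not, and where it keeps
# its architecture-specific (XS) libraries.
sub build_summary {
    my $threaded = $Config{useithreads} ? 'threaded' : 'unthreaded';
    my $sitearch = defined $Config{sitearch} ? $Config{sitearch} : '(unset)';
    return sprintf "%s perl, archname=%s, sitearch=%s",
        $threaded, $Config{archname}, $sitearch;
}

print build_summary(), "\n";
```

Run against both builds, the two lines of output should differ in archname and library path if the installs have been kept apart as Jan describes.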

Andy Dougherty welcomed Peter's suggestion to patch the documentation with his findings.

  We build them here, we build them there, we build these modules everywhere

Using / and X in unpack

Peter Dintelmann had a question concerning the ability to walk back the stream with unpack. He was trying to understand how to do it but was struggling with the paucity of information. Sadahiro Tomoyuki explained exactly how it worked, even going so far as to examine the C source and realise that the error checking could do with a little tightening up.

Tomoyuki's explanation was just what Peter needed.
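For readers who have not met these two template characters, here is a small sketch of their documented behaviour. The data strings are invented for the example and are not taken from the thread.

```perl
use strict;
use warnings;

# "X" backs up one byte, so "X2" rewinds two bytes and the same
# data is read a second time.
my ($first, $again) = unpack 'a2 X2 a2', 'abcd';

# "/" uses the item before it as the length of the item after it:
# here a one-byte count (C) followed by that many bytes of string.
my ($name) = unpack 'C/a', "\x04Gurusamy";

# Both together: read the count, back up over it, then let "/"
# consume the count again along with the string it describes.
my ($len, $str) = unpack 'C X C/a', "\x03abcdef";

print "$first $again $name $len $str\n";
```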

Refitting t/op/*.t files to use

David Landgren started to work on refitting t/op/pat.t to use, which would make it simpler and easier to add new tests. He had a couple of questions, mainly concerning how to modernise some fairly archaic tests.

Andy Lester was thrilled. Nicholas Clark recalled that one useful feature that does not have but should, is the management of temporary files. Being able to create them on demand, and clean them up automatically at the end of the script. Apparently this sort of make-work code is scattered throughout the test suite.

  Into the third millennium we go
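The sort of helper Nicholas describes might look something like the following. This is a hypothetical sketch, not code from the test library: the names temp_file_name and @temp_files are invented here, and a real version would need more care over name collisions.

```perl
use strict;
use warnings;

my @temp_files;          # every name handed out so far
my $temp_counter = 0;

# Hypothetical helper: return a fresh file name for this process
# and remember it for later clean-up.
sub temp_file_name {
    my $name = sprintf 'tmp_%d_%03d', $$, ++$temp_counter;
    push @temp_files, $name;
    return $name;
}

# Runs as the script exits and removes whatever still exists.
END { unlink grep { -e $_ } @temp_files }

my $file = temp_file_name();
open my $fh, '>', $file or die "Cannot create $file: $!";
print {$fh} "scratch data\n";
close $fh or die "Cannot close $file: $!";
```

With something like this in one place, individual test files would no longer need their own open/unlink boilerplate.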

David also had a look at a section in t/op/pat.t for patterns that warn when compiled and wondered whether a particularly intricate collection of closures was in fact a no-op. No takers.

  The sound of chain-saw warming up

The new version was finally delivered. It generated a lot of discussion, but alas, remained unapplied.

Use of eval in t/op/append.t

David then looked at t/op/append.t and saw that some of the test results were eval'ed (rather than the thing being tested), and also wondered why the tests were tested with regexps instead of comparison operators. Sadahiro Tomoyuki explained that the tests were there to see if concatenation stops on \0 (binary zero), and regexps were used instead of comparison operators, in case the comparisons also had an equivalent bug which stopped them scanning on \0.

Which led David to wonder if there wasn't a just as likely possibility for regexps to also stop on \0, in which case something like rindex might be better, since it would never encounter any dubious characters.
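To make the idea concrete, here is a rough sketch of a concatenation check that avoids both eq and regexps. The strings are invented for illustration and this is not the code from t/op/append.t.

```perl
use strict;
use warnings;

my $left   = "ab\0cd";          # embedded NUL in the middle
my $joined = $left . "ef";

# If concatenation had stopped at the NUL, the result would be
# shorter than the sum of its parts.
my $length_ok = length($joined) == length($left) + 2;

# rindex searches backwards from the end, so finding "ef" at the
# expected offset shows the tail really was appended after the NUL.
my $tail_at = rindex $joined, "ef";

print "length ok: $length_ok, tail at offset $tail_at\n";
```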

Tomoyuki thought that the eval was no longer necessary (although it does mean that things happen at run-time rather than at compile-time, which may have an influence).

Elsewhere, David looked at the tests in t/op/grep.t and was surprised to discover that some tests can pass, and print ok, or they can fail, and print... nothing.

  As long as everything works

Perl on Win98

David Bree asked for advice on compiling perl on Windows 98 using MinGW. He had in fact compiled it, but was wondering whether it was safe to install, since about five percent of the tests failed. Which doesn't seem too shabby. Unfortunately, no-one was able to offer any advice.

  Maybe if we knew what had failed

Merging wince and win32 directories

Yves Orton started to tackle the long-talked-about, much-awaited merge of the wince and win32 directories, which contain the platform-specific extensions for the Windows CE platform on one hand, and Windows in general on the other.

Vadim Vladimirovich Konovalov (hereafter referred to as Vadim) promised to add some improvements to the WinCE side, and was pleased to see that someone was taking the necessary time to clean up the source tree. He also mentioned the existence of a project, but that it is pretty calm at the moment.

Yves's approach was to create a new win directory, starting with the contents of win32, and then see what happened as he pushed and poked bits of wince into it, until there was nothing left over.

  Where do you want Perl to go today?

t/op/stat.t failures on z/OS

Mohammad Yaseen reported the difficulties he was having with 5.8.7 on z/OS (the IBM mainframe OS). In this instance, t/op/stat.t was failing with messages about /dev/tty, depending on whether the test was run in single-user or multi-user mode.

Jim Cromie asked for some specific details, to help understand what single-user and multi-user means in the context of z/OS.

Clawing scarce bits back from the magic vtable structure

As Nicholas happily integrated a change from blead to maint concerning bitmask flags on a field in the magic vtable structure, he realised that there weren't a whole lot of spare bits that remained, and even though code with the first definitions of those flags has been out in the wild for some time, he wondered if some of the newer bitmask flags could be reeled back in.

Dave Mitchell thought that three flags could be freed up, since they were either marked as being wildly experimental, or not documented at all.

  Peak bits

taint and fork on Win32

David Golden had been taking a VanillaPerl on Win32 for a spin, and noticed that Test::WWW::Mechanize kept dying a horrible death. Carl Franks and he narrowed it down to a delightfully simple snippet, perl -Te fork, and then searched the bug database but found nothing similar. So he wanted to know whether to file a bug report, or whether the failure mode was common knowledge. No takers.

  Can't have your fork and taint it

Is a FileHandle IO::Seekable?

Adam Kennedy related the problems he was having after having taken over the maintenance of Archive::Zip. In a nutshell, he needs to have seekable file handles. And yet it turns out that at one point in the test suite, a file handle winds up being connected to a pipe, which definitely is not seekable. So Adam wanted to know if something was truly seekable, as opposed to merely having a seek method.

The best Randy W. Sims could come up with was a kluge with file test operators.

  Seeking advice
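Randy's actual suggestion is not reproduced here, but a file-test kluge along these lines might look like the following sketch. The function name looks_seekable is invented, and the test is only a heuristic: it guesses from the type of the underlying file rather than proving that a seek will succeed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Heuristic only: treat plain files and block devices as seekable,
# and everything else (pipes, sockets, terminals) as not.
sub looks_seekable {
    my ($fh) = @_;
    return ( -f $fh || -b $fh ) ? 1 : 0;
}

my ($file_fh, $path) = tempfile( UNLINK => 1 );
pipe( my $reader, my $writer ) or die "Cannot create pipe: $!";

printf "file: %d, pipe: %d\n",
    looks_seekable($file_fh), looks_seekable($reader);
```

Both handles here would happily accept a call to seek as a method, which is exactly why checking for the method alone tells Adam nothing.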

Patches of Interest

The continuing saga of the improvements to the regexp engine

Yves Orton repatched his trie work to simplify the sharing of the code base between blead and maint.

  Nice trie

5.8.0 could be supported as well, if there was some way to fake SAVEBOOL via ppport.h.

Reini Urban showed what needed to be done to get it to compile with the ActivePerl source.

consting goodness

Andy Lester took a crack at adding some consting goodness to doop.c, but Sadahiro Tomoyuki pointed out a mistake. So Andy had another go at it. Not applied, as far as I can see.

He then killed a modal boolean that controlled the behaviour of S_glob_2inpuv in sv.c, preferring to split the functionality into two different routines. Applied.

Andy then carried on, inlining a couple of static routines in perl.c. Also applied.

Andy found reasons to use the NOOP macro, part one.

Moving right along, he removed some unused (or rather, useless) casts in regcomp.c and regexec.c, which prompted Yves Orton to comment that when he was working on adding trie support to the regexp engine, he found those casts to be quite helpful in figuring out what was going on.

He had more luck removing unused context in sv.c.

And another context parameter in pp_ctl.c, with some consting goodness thrown in for good measure.

And some more places in which he could tweak the usage of SvREFCNT_inc calls.

  In mg.c

  and pp_ctl.c

And some compiler warnings in perlio.c.

  And that's what Andy did this week

Using v?snprintf/strlcpy/strlcat when useful

Jarkko Hietaniemi patched a number of files to use the C run-time equivalent string routines (that have been vetted as working correctly at build time) rather than alternative, less safe routines. Jarkko asked for eyeballs to look carefully at the patch in case there were any glaring errors.

Steve Peters took the patch for a spin on the platforms he had lying around. In the meantime, Jarkko found a section in pp_ctl.c that he had missed.

H.Merijn Brand wondered what the status was concerning these routines, since on some proprietary platforms, the C source to these routines had some fairly fussy copyright notices attached. Steve said he had dug up the source to the Perl routines from an old copy of inn, and believed that they were in the public domain.

Russ Allbery said that he authored inn, and that Steve was perfectly entitled to take them for use in perl. This prompted H.Merijn to suggest removing the #ifdef trickery surrounding the calls to these routines, and instead use them everywhere, and bundle Russ's public domain code to be used on platforms that don't provide them.

  New uses for old

Jarkko continued on his dVAR quest with -DPERL_GLOBAL_STRUCT_PRIVATE to move Perl C run-time data to the heap. As he marked a previously const variable as non-constant, Andy Lester insisted on a full report. Jarkko explained what had happened, and Andy thought of another approach. Jarkko urged Andy to proceed cautiously, as it concerns only the Symbian port and is also rather fragile. Any changes must be tested with both PERL_GLOBAL_STRUCT_PRIVATE and PERL_GLOBAL_STRUCT.

  Do not meddle in the affairs of Jarkko

And then he sent in another dVAR patch, this time to make perl's malloc happy, plus some corrections to fix up some signed/unsigned mismatches that had crept in when no-one was looking.

gcc -ansi -pedantic noise reduction

The patches kept rolling in from way up north, this time cleaning up the worst of the damage when one lets gcc -ansi -pedantic loose on the source (it screams a lot). He recommended that porters build perl from time to time using these two compiler options, and take the appropriate action based on the fallout. Steve Peters applied the patch. Andy Lester was pleased.

  I dare you

Another handy tip for the code police

And wrapping up, Jarkko offered a handy tip for people who enjoyed torture-testing the compilation of the codebase, to use the -O gcc compiler switch in conjunction with -Wall (warn about everything). It turns out that some code checks don't kick in until the optimizer is brought to bear on the code. Andy Lester wanted to know if different -O levels kicked in successive classes of warnings.

  Cheap trick

Andy posted the armada of compilation switches that he uses to compile the source.

  -Wextra is good for a laugh, too

Patching various t/op test files to use

David Landgren sent in some patches for files in t/op to use, instead of their home-grown each-one-different methods. One main benefit is that now the tests have names, so when test 43 fails, it becomes trivial to locate it in the source file.

In the process, he also found a nice bug of a string comparison using numeric equality in t/op/loopctl.t.
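The class of bug is easy to demonstrate. The sketch below uses Test::More purely as a stand-in that offers named ok() and is() tests, and the strings are invented; it is not the code from t/op/loopctl.t.

```perl
use strict;
use warnings;
use Test::More tests => 3;

my ($got, $expected) = ('foo', 'bar');

# The bug pattern: "==" converts both strings to the number 0, so
# the comparison is true even though the strings differ.
my $numeric_says_equal = do { no warnings 'numeric'; $got == $expected };
ok( $numeric_says_equal, 'numeric == treats different strings as equal' );

# String comparison sees the difference.
ok( $got ne $expected, 'string ne sees the difference' );

# A named test: if it fails, the name points straight at it.
is( lc('ABC'), 'abc', 'lc lowercases ASCII letters' );
```

A test written with == in this way can never fail on non-numeric strings, which is presumably how it survived unnoticed.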

  t/op/ (or what I did over my lunch hours this week)

(Summariser's note: At first I wasn't sure how I was going to summarise this, but Jim Cromie sort of forced my hand).

  It's all Jim's fault

Clarifying the documentation

Tom Regner had a discussion with Ilya Zakharevich on comp.lang.perl.modules about Fatal, and the result of the discussion was a POD patch. Unapplied.

Watching the smoke signals

Smoke [5.9.4] 28020 FAIL(M) MSWin32 Win2000 SP4 (x86/1 cpu)

Abe Timmerman identified this failure as the result of the Windows part of the code base not finding change #27992 to its taste.

Smoke [5.9.4] 28009 FAIL(Xm) irix 6.2 (IP22/1 cpu)

In this report, it appears that something had gone wrong with the gcc C run-time library during the configuration phase.

New and old bugs from RT

Many bizarre bugs

Animator discovered that the bizarre code leads to bizarre ARRAY assignment (bug #9374) had been fixed in 5.8.8 and 5.9.3. Andreas Koenig identified change #25828 as being the one responsible for the fix.

In a similar bug report (#3112), Animator simplified the snippet that causes a bizarre copy of ARRAY in last. No-one attempted a fix for the time being,

  for (1) { push @a, last; }

and bug #9540 still produced Bizarre copy of HASH errors,

  @{my %self; %self}{1} = 1;

while #22238 continued to give you a Bizarre copy of HASH in leave

  @{ my %self; %self }{1} = 1;

and for the last in a series of bizarre bug reports, Animator noted that bug #36229 (Bizarre copy of IO) is a duplicate of an earlier bug (#3314).

  $a = ${*STDOUT{IO}}

So don't do that.

unpack fails on utf-8 strings (#33734)

Nicholas Clark returned to the problem raised by Marc A. Lehmann back in early 2005 and explained what he had done with Ton's code, which fixed up the problem in a perhaps non-backwards-compatible manner. Glenn Linderman wondered whether the new pack/unpack semantics could not be bundled into a loadable module and made available via a feature pragma.

Marc Lehmann thought that unpack should simply be fixed (the problem is how unpack behaves in the face of UTF-8 data) once and for all. If the fix is too major, then it will appear in 5.10. If it can be arranged to not break anything already relying on the current behaviour, then it can be fixed in 5.8.

Perl build problem (failing File::Basename) (#38891)

Dominic Dunlop nudged this bug report, requesting more information. No response received as we went to the press.

double free detected (#38943)

Ulrich Windl coaxed a *** glibc detected *** double free or corruption out of the debugger. Neither Dave Mitchell nor Nicholas Clark was able to reproduce the problem on the perls and platforms they had on hand.

Michael Shroeder remembered that Ulrich had already reported a similar problem with Term-ReadLine-Gnu, and wondered if the two problems were related, and Ulrich concurred that it appeared to be a problem concerning the C libreadline library in his SuSE distribution.

debugger can't breakpoint until module loaded (#38966)

Albert Cahalan was debugging some unfamiliar code, and wanted to set a breakpoint in some as-yet-unseen code, and wondered why the debugger was refusing to do so, given that gdb will prompt for more information in similar circumstances when debugging C code.

Rafael wondered if b postpone subname would do the trick. Albert agreed that that was just the ticket, but thought that the debugger could be tweaked to ask whether one wished to postpone the breakpoint if the routine was currently unknown to the debugger.

People missed the point, and started making fun of Albert, suggesting that if he didn't know what he wanted to debug, he should connect the debugger to Symbol::Approx::Sub in order to breakpoint on a routine that most closely matched.

Albert replied calmly with an example involving gdb that made perfect sense. The main obstacle to adding this functionality is that everyone suddenly appears to be very busy Doing Other Stuff when it comes to patching the debugger.

  Wouldn't it be nice

t/op/getppid.t fails in a Solaris Zone (#39010)

Mark Suter reported a failure in t/op/getppid.t, which is due to a changed assumption concerning the owning pid of an orphaned process (hint: it is not 1). It turns out that Sun had been kind enough to loan the porters a Solaris machine with zones set up, and the bug had been found and fixed four weeks ago.

  Coming to a 5.8.9 near you

Attempt to free unreferenced scalar (#39012)

Salvador Fandiño filed a bug report with a nice short snippet showing how to provoke an Attempt to free unreferenced scalar error. After a bit of thought, Dave Mitchell showed how it could be simplified even further. And then after a bit more thought, patched scope.c to fix the problem, and rolled the snippet into a test for t/op/local.t.

  All in a day's work

Segfault in functional streams/bug in closure allocation? (#39017)

christian@pflanze tried to create an infinite stream with closures but the only thing he managed to create were core dumps. Dave Mitchell explained that it wasn't actually closure-related, but yet another manifestation of the recursive implementation used to free recursive data structures.

  Make a deep structure, eat up all your C stack
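One common way to sidestep this kind of problem is to dismantle a deep structure by hand, one node at a time, so that no single free has to walk the whole chain. The sketch below is an illustration only, with invented names (build_list, free_list); it is not Christian's code nor Dave's suggested alternative, and whether a given perl needs the workaround at all depends on its version.

```perl
use strict;
use warnings;

# Build a singly linked list of array refs: [ value, next ].
sub build_list {
    my ($length) = @_;
    my $head;
    $head = [ $_, $head ] for reverse 1 .. $length;
    return $head;
}

# Unlink one node per iteration, so each node is freed on its own
# with nothing hanging off it.
sub free_list {
    my ($head_ref) = @_;
    my $freed = 0;
    while ( my $node = $$head_ref ) {
        $$head_ref = $node->[1];   # advance the head to the next node
        $node->[1] = undef;        # detach, so freeing $node stops here
        $freed++;
    }
    return $freed;
}

my $list  = build_list(100_000);
my $freed = free_list( \$list );
print "freed $freed nodes\n";
```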

Christian returned with a follow-up, showing how he had managed to get it to almost but not quite work (but at least it no longer crashed).

The thread returned under the title of Garbage collection issues later on, where Christian wondered about a couple of issues he was having with his code. Dave Mitchell explained precisely what was happening, and then asked Christian to explain what he was really trying to do. Once Dave understood what Christian wanted, he realised the proposed code was far too complex, and sketched out an alternative approach, with negligible memory requirements to boot.

Yves thought that what Christian was trying to do was shoehorn Lisp idioms into Perl, and suggested he take a look at Mark-Jason Dominus's Higher Order Perl.

  HOP to it

Pod::Html generates incorrect href's in some html anchors (#39020)

"bgstewart" filed a problem with some incorrectly generated HTML from POD. Steve Peters explained that it had already been fixed. Tels thought there was a bug in the source POD, and backed up his assertion with a report from podcheck. Russ thought the POD was valid, just somewhat unusual.

  Merely plain, not simple

Tie::Memoize::EXISTS not caching the value (#39026)

This was reported as a bug, but no-one commented on it.

Perl5 Bug Summary

Another decline in the number of tickets noted this week.

  13 created + 27 closed (yay!) = 1525

  What outstanding bugs

New Core Modules

In Brief

The select((select(OUTPUT_HANDLE), $| = 1)[0]) thread evolved into a discussion about sequence points in Perl.
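For anyone puzzled by the one-liner, it can be unrolled into three ordinary statements. The sketch below wraps them in a helper whose name, autoflush_on, is made up for the example.

```perl
use strict;
use warnings;

# Long-hand version of select((select($fh), $| = 1)[0]):
# turn on autoflush for $fh without disturbing the selected handle.
sub autoflush_on {
    my ($fh) = @_;
    my $previous = select $fh;   # make $fh current, remember the old one
    $| = 1;                      # $| applies to the currently selected handle
    select $previous;            # put the old handle back
    return;
}

autoflush_on( \*STDERR );
print "this still goes to the previously selected handle\n";
```

The compact form only works if the inner select runs before the assignment to $|, which is where the question of evaluation order comes in.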

Rafael looked at Marcus Holland-Moritz's patch to clean up 212 warnings emitted by gcc-4.2 and applied 4 out of 5 parts, the last part seeming to be superfluous. Marcus commented in more detail on the reasons behind the patches.

Craig Berry made a belated followup to Ken Williams's File::Spec changes for VMS, giving it a clean bill of health.

Nicholas dropped some recent tweaks to blead's version of Test::Harness in order to keep it perfectly synchronised with the latest CPAN version.

  Stamping out gratuitous differences

The problem concerning multi-line attributes and Attribute::Handlers noted by Bas van Sisseren (#38475), and its required fix to toke.c was applied by Rafael, and the example was turned into a regression test.

  Won't see that again

Rafael also applied the fix suggested in opening |- triggers unjustified taint check. (#38709)

Jerry D. Hedden continued to sync threads with blead. Applied.

And then he consolidated some XS functions. Also applied.

Here's one that was missed in maint:

And another that was caught:

Nicholas Clark revived the POSIX build fails in bleadperl and RTMAX patch bug (#36951). David Dyck confirmed that it is still alive and kicking,

and closed out the utf8 overload stringify bug (#34297), saying that it was fixed in blead and that he hoped to backport it to 5.8.9.

He then showed how uc/lc/ucfirst/lcfirst fail on typeglobs on 5.8.8 in bug #39019. The bug doesn't manifest itself on blead, but that is because its typeglob implementation is quite different these days.

Joshua ben Jore wondered how to interpret the meanings of the MAD keys.

  No-one else knew, either

Nicholas Clark wondered why lc and friends don't taint empty strings when locales are in effect (bug #39028). Sadahiro Tomoyuki gave his thoughts on the matter, explaining that perl was probably Doing The Right Thing.

Abe Timmerman fixed up snprintf and vsnprintf problems for MSVC on the Windows platform.

About this summary

This summary was written by David Landgren.

If you want a bookmarklet approach to viewing bugs and change reports, there are a couple of bookmarklets that you might find useful on my page of Perl stuff:

Weekly summaries are published on and posted on a mailing list, (subscription: The archive is at Corrections and comments are welcome.

If you found this summary useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl.