=head1 TITLE Organization and Rationalization of Perl State Variables =head1 VERSION Maintainer: Steve Simmons Date: 3 Aug 2000 Mailing List: perl6-language@perl.org Number: 17 Version: 2 Status: Developing =head1 ABSTRACT Perl currently contains a large number of state and status variables with names that resemble line noise (henceforth called C<$[linenoise]> variables). Some of these are (or should be) deprecated, and the naming methods for the rest are obscure. Since we are (potentially) adding, removing, and changing the functionality of these variables with Perl6, we should seize the opportunity to rationalize the names and organization of these variables as well. These variables need to be made available with mnemonic names, categorized by module, be capable of introspective use, and proper preparation made for deprecation and translation. =head1 DESCRIPTION C allows for the runtime examination and setting of a number of variables. These existing C<$[linenoise]> variables are horrible names and need cleanup, re-organization, and syntactic sugar. Different variables have different problems, usually one or more of the following: =over =item * unused (as far as we know) =item * useless, but still appear in older programs =item * duplicated by other features =item * standing in the way of advance because of backwards compatibility =back In the pre-RFC discussion of this issue, it was also pointed out that these variables are hard to deprecate without nagging the crap out of users running the programs. The proposed solution was broadly applicable, and has been spun off into RFC 3. The use of the C module is an attempt to solve the anti-mnemonic features of these variables. A better solution is to do it right in the first place, with a number of attendant wins: =over =item * no need for Yet Another Module to be loaded =item * promotes code readability =item * probably promotes core Perl maintainability =item * grouping sets of variables together by function gives the code writer strong hints as to variables that might affect each other =item * should we move formats (or any similar current core functionality) to a loadable module, this neatly encapsulates all the data with a single referenced name =item * solves problems with C<$[linenoise]> variable proliferation =item * potentially promotes introspection - see below. =back In addition, many features which are now (re)set by other calls should set appropriate state variables as well. Thus a C script which contains: use strict foo; might set a var C<$PERL_STRICT{foo}>, and so forth (this is probably a poor example). Credit where it is due: the idea of putting related values together into an appropriately tagged hash is shamelessly ripped off from common C usage. =head3 Caveat - Global State Variables Are Dangerous Having the setting of simple variables modify the function of a broad set of things is an inherently dangerous and inelegant way of doing things. Mark-Jason Dominus has proposed that they simply be removed whenever possible. I tend to agree with his arguement, and join in urging that the problems he addresses in message C<20000805171023.24408.qmail@plover.com> be addressed by the core teams working in the areas that use such variables. The thread started my Mark-Jason in C has a good discussion of the issues. Notwithstanding, should the core team continue to allow global variables for some purposes, the names and categorization should be improved. =head2 Advantages and Non-Loses =head3 Clean Backwards Compatibility To promote backward compatibility, one could write a C module which would alias the old names to the new ones. That Would Be Wrong, but someone will probably do it - so it may as well be us. Obviously we cannot provide backwards compatibility for a variable whose meaning has changed or which has vanished, but most of the rest can be captured cleanly. =head3 Promoting Removable Core Modules It has been strongly proposed that `core C' be broken down internally into a number of modules such that one could build smaller or larger custom perls. This feature would ease that work, and possibly set a standard for how well-behaved non-core modules should implement such things. =head3 Provide Possible Guidelines To Core-able Modules The discussion of removable core modules has strongly implied (sometimes explicitly stated) that sites could take modules which are not currently in the core and move them there. Having a standard which those modules could follow for variable setting and exposure would be a major win. =head2 Disadvantages =head3 Backwards Compatibility Existing C code which makes use of C<$[linenoise]> variables which had not changed meaning in C will fail if (as proposed) the variables are completely replaced by the names proposed here. This is, at best, a minor objection because =head3 Increased Typing Considered Painful A number of people have expressed great joy at the brevity of the current names. =head3 Loss of Distinctiveness Currently when one sees a C<$[linenoise]> variable, they notice its specialness and have a visual clue to take care. C<$[linenoise]> stands out. Since some of these variables have wide-reaching side-effects, any solution should try to maintain that distinctiveness. =over 4 =item * Few if any scripts match this description =item * The proposed C to C translation tool should catch this and do the appropriate renaming. =back be modified to use the C modules described above in I. =head2 Other Possible Features It has been suggested by Alan Burlison that this allow for localizable settings. Thus module A might turns warnings off when its features are in use, while module B is unaffected by the setting done by module A. This is a change in the functionality of such settings, and has the potential for broadly changing what happens in C. I suggest that this issue is actually independent of how the variables are named, and should be taken to another RFC if anyone is interested. =head1 IMPLEMENTATION =head2 Internal Implementation The internal representation of these is largely irrelevant(!). This RFC prescribes what the I interface looks like; the implementation team should select whatever mechanism they prefer for internal use. It's probably a maintenance win if the same is used, but the proposers don't intend to dictate to the developers. =head2 External Implementation - Proposals =head3 Overall Variables should be sorted into functional areas which may or may not have a one-to-one correspondence with internal (core) There have been four suggestions for how the variables should be named. In the examples below, I have capitalized the higher-level portions of the names. This capitalization is not a requirement of the RFC, and is done purely to make the new names stand out in this document. =head3 Well-Named Global Hashes And Keys For each collection of variables, a well-named pseudo-hash with well-named keys: $PERL_CORE{warnings} vs $^W $PERL_CORE{version} vs $^V $PERL_FORMATS{name} vs $^ $PERL_FORMATS{lines_left} vs $- $PSEUDO_HASHES{strict} vs (none) An additional variable should be provided, $PERL_CORE{variables} which contains a list of all the settable hash names (eg, $PERL_CORE{variables} = qw(PERL_CORE PERL_FORMATS PSEUDO_HASHES ...) Advantages: Only a single point in the namespace is used for each variable. The hashes can be handed around en masse efficiently via references. Pseudo-hash member access is efficient. Names are free from module dependency. Introspection with hashes is more powerful (see below). Disadvantages: Pseudo-hashes are new to most programmers. =head3 Well-Named Module Variable Sets This has an almost one-to-one naming match to the first suggestion, but used module naming: $PERL::CORE::Warnings vs $^W $PERL::CORE::Version vs $^V $PERL::FORMATS::Name vs $^ $PERL::FORMATS::LinesLeft vs $- $PSEUDOHASH::CONTROL::Strict vs (none) Advantages: Looks like individual variables again. Modules could be moved (compiled) in and out of the perl core and, so long as a script did the appropriate C statement, the script would require no other changes. Disadvantages: Locks variables into given modules. The second-level names may become difficult to do sensibly. =head3 Well-Named Per-Module Hashes And Keys This is a hybrid of the first two: $PERL::CORE{warnings} vs $^W $PERL::CORE{version} vs $^V $PERL::FORMATS{name} vs $^ $PERL::FORMATS{lines_left} vs $- $PSEUDOHASH::CONTROL{Strict} (none) Advantages: Looks like individual variables again. Modules could be moved (compiled) in and out of the perl core and, so long as a script did the appropriate C statement, the script would require no other changes. Introspection and variable discovery via C is still a win. Disadvantages: Locks variables into given modules. The second-level names may become difficult to do sensibly. Introspection becomes more difficult because one must find the second-level name(s) for each class of item (eg, PERL::CORE and PERL::FORMATS). Sometimes there should not be More Than One Way To Do It - a single mechanism makes it easier to notice, find, use, and understand such var. =head3 Introspection With Hashes The use of hashes and pseudo-hashes leads to a straightforward and `natural' mechanism by which programmers can discover all the relevant variables for a given item: foreach my $key ( %PERL_CORE ) { print "\%PERL_CORE{$key} is $PERL_CORE{$key}\n"; } =head3 Summary The RFC maintainer thinks the first suggestion, I, is the best choice. While it has potential problems for module removal should a hash be shared between several modules, these are (IMHO) worth the flexibility of I locking a given hash to a given module and providing a single, consistent mechanism programmers would use to obtain value settings. The community has not (as of version 1.1) expressed a strong preference in either direction. =head2 Value Protection A value can be set and reset, but it's existence should be protected. Thus one could set the variables, but not undefine or delete them. If hashes are chosen, the user should not be allowed to add or remove key/value pairs from the hash. Further, values such as $C should be read-only. =head2 Backwards Compatibility Issues and Suggestions Backwards compatibility is an issue in two ways: =over4 =item * Moving old scripts to C =item * Moving old coders to C =back There are two solution sets for this, both of which should be implemented as part of C =head3 Variable Translation The proposed script translator should catch all uses of pre-perl6 C<$[linenoise]> variables and translate them to the new names according to the guidelines below. =head3 Variable Remapping An C module could be provided. Like the current C module, this would attempt to map the new names into the old C<$[linenoise]> variables for those programmers absolutely addicted to the old names. It should handle the variables according to the guidelines below. =head3 Permanent Aliasing? Several comments have been made (including one claimed to be second-hand from Larry Wall) that they just don't like typing long variable names. If this is a sufficently strong problem, the contributors to this RFC would have no objection to having both variable namesets present simultaneously. This would have the advantage that the variables would still be reorganized and grouped, and some additional introspection possible, but the folks who are really married to the ultra-short names would still have them available with no typing overhead. =head3 Guidelines for variable translation and remapping The C conversion script should silently translate old variable names to the new names when the meaning of the variable is essentially unchanged. The C module should provide the old name as an alias or map to the new name. =head4 Variables with changed meaning The C conversion script should translate old variable names to the new names when there seems to be a reasonable but not perfect mapping between the meanings of variables. A one-time warning should be issued at compile time for each use of such variables, similar to the manner in which the current C warns of first use of undeclared variables. A comment should be inserted into the code above the use of the variable saying something like: # Comment inserted by PERL6_trans on # Variable does not exist in perl6. We have translated # it to below, but and are not completely # equivalent. Please check your usage and make the appropriate # changes. In the best of all worlds, the comment might give hints as to what to think about when changing that var. The C module should provide no mapping for these variables. =head4 Variables which have been removed The translator should leave removed variables which have been removed in place, but a well-tagged C statement should be placed above The C module should provide no mapping for these variables. =head4 New variables Should the translator notice use of an existing variable with the same name, it should ?print a warning and stop? ?print a warning and go one? ?automaticly rename to prior var to l_? TBD. The C module should provide no mapping for these variables. =head2 Deprecation Warnings In Usage Programmers using hand-translated scripts or who attempt to run untranslated scripts with C should get compile-time warnings and errors, depending on severity of mis-use. At a minimum, C and C should warn of the use of deprecated variable names. That deprecation warning should be I, not simply C is deprecated>. Messages should identify the old var, new var, and at least imply an action. Some possible samples include: changing var $foo no longer has any effect, see the value of var $bar no longer has any meaning, see var $baz has been replaced by $FOO_BAR{baz} and so forth. In addition, it should be possible to have the programmer detect and control these messages on a case by case basis. RFC 3 proposes a mechanism by which programmers could take more direct control of this; should that RFC be accepted then that mechanism should be used. Should it fail, we will make a more specific recommendation here. =head1 CONTRIBUTIONS Contributions ranging from voluminous commentary to pretty decent wisecracks have been received from: Ted Ashton "J. David Blackstone" Corwin Brust Alan Burlison Piers Cawley Tom Christiansen Mark-Jason Dominus Ken Fox Chaim Frenkel Jarkko Hietaniemi Tim Jenness Bart Lateur Peter Scott William Setzer Dan Sugalski "Bryan C. Warnock" Nathan Wiger and others who I probably missed. In addition, it's been pointed out that this suggestion was raised in the past; unfortunately the fingerprints have long since worn away. =head1 REFERENCES RFC 3 on Run-Time Error Message Control