Implementation of Threads in Perl
Maintainer: Bryan C. Warnock <bwarnock@gtemail.net>
Date: 1 Aug 2000
Last Modified: 28 September 2000
Mailing List: perl6-internals@perl.org
Number: 1
Version: 4
Status: Frozen
No significant changes. Freezing.
Perl 6 should be built around threads from the beginning.
Perl 5 attempted (with relatively good success) to implement threads atop the current architecture. Unfortunately, it left several gaps, traps, and "features" that show up under heavy concurrency. These weaknesses could be fixed if Perl were built with threading from the start.
All Perl programs are threaded. Most just have only one thread.
Impatience, Hubris, and Laziness, in that order.
Attempt to build thread constructs into the internals, and allow a Thread module to add user thread constructs safely and robustly, without making things worse for the single-threaded folks.
The summary is based on the current Perl 5 architecture. As the internal structure changes, like using vtables, the thread design will have to change.
There has been very little direct discussion about threads or this RFC. The bulk of the discussion centered around whether thread-global or program-global global variables should be the default.
There has been a bit of indirect discussion on threads, like how various structures will fit into a multi-threaded program. No attempt has been made to cross-reference these fringe discussions.
$main::foo == $foo within one thread, while $main::foo != $main::foo in different threads. There need be no way to specify the particular thread-space, as it should be visible only to the owning thread.
Rationale: This allows the bulk of the modules, which don't rely on sharing data across multiple threads, to work as written for a single-threaded program.
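The thread-global default can be illustrated with the threads module that later shipped with Perl 5. This is only a sketch of the behavior described above, not the Threads module this RFC proposes, and it assumes a perl built with ithreads (check with perl -V:useithreads). The variable name and values are made up for illustration.

```perl
use strict;
use warnings;
use threads;   # Perl 5 ithreads, an assumption; not this RFC's Threads module

our $foo = 'set by main';   # $main::foo, an ordinary package variable

# The child thread receives its own copy of $main::foo when it is created.
my $child = threads->create(sub {
    $main::foo = 'set by child';   # changes the child's copy only
    return $main::foo;
});

my $seen_by_child = $child->join;

print "child saw: $seen_by_child\n";
print "main sees: $main::foo\n";
```

The main thread should still see its own value after the child has finished, which is the property that lets unmodified single-threaded modules keep working.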
global keyword or function that explicitly accesses a variable in the program-global stash.
global $main::foo = $foo; # Let another thread know what my $foo is.
global $main::foo = \$foo; # Share my local foo. Dangerous!
$foo = global $main::foo; # Localize this instance of $main::foo.
Rationale: There needs to be some way to distinguish between the program-global stash and the thread-global stashes. It should undeniably mark the data that is being shared.
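The global keyword proposed here does not exist in Perl 5. As a rough analogue only, the later threads::shared module makes sharing explicit with a :shared attribute. The sketch below assumes a thread-enabled perl; the counter and worker count are hypothetical.

```perl
use strict;
use warnings;
use threads;
use threads::shared;   # stand-in for the proposed "global" marker (assumption)

my $total :shared = 0;   # explicitly marked as shared between threads
my $private = 0;         # unmarked: each thread works on its own copy

my @workers;
for (1 .. 4) {
    push @workers, threads->create(sub {
        for (1 .. 1000) {
            lock($total);   # released at the end of this loop body
            $total++;
        }
        $private++;         # invisible to the main thread
    });
}
$_->join for @workers;

print "total:   $total\n";
print "private: $private\n";
```

Only the variable that was visibly marked as shared accumulates the work of all four threads, and only that variable needs locking. The unmarked one stays at its original value in the main thread.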
Rationale: We don't want to bog single-threaded programs with needless mutex operations, let alone attempt to do so on platforms that have no such beastie.
use Threads may set up the thread constructs, but threads will not be spawned until runtime.
Rationale: It would probably be too difficult to have the interpreter try to figure out what thread is which, without actual threads. A cart and horse problem.
used. As this occurs during the compile, there is only one thread in existence to receive the module. Perl should continue to track these modules (as it currently does to prevent multiple inclusions), although it may need its own bin. Subsequent threads should then reslurp these modules back in on their start up.
Discussion: This is a potential problem child. Your main thread uses a module, but may have its own data. It may or may not have changed the module's data before a secondary thread was started. The second thread can't simply copy the first thread's space, because it may have thread-specific information contained there. But you can't blindly globalize each module, because then you lose the ability for a standard module to work without its own explicit and robust reentrant handling. Therefore, each thread needs to re-use the original modules upon creation.
#!/your/path/to/perl -w
use English; # Lexical feel. Will work across all threads.
use Threads;
use Foo;
use Bar;
# the main thread has all four above in its arena
my $thread2 = Threads->new(\&start_thread2);
...
sub start_thread2
{
...
# Before this sub was called, the second thread was created, and it
# reused English, Threads, Foo, and Bar, pulling them into its spaces.
}
This, of course, could lead to massive program bloat, as ten threads may each have their own copy of the same 50 modules. Here, however, I'm hoping that better engineering will take place, and programmers won't needlessly use modules globally. See item 7.
Rationale: This is consistent with items 1, 4, and 7.
use and no their own modules, outside of the global ones. This allows each thread to only use the modules they need, saving on the global system bloat described above, and giving each thread the most control over its environment - such as letting two threads use two different versions of the same module.
Discussion: Threads are mainly used in one of two ways: parallel processing of the same dataspace, such as loop processing with non-iteration-dependent data members; or "assembly line" parallel processing, with each thread doing a different function on a database, like thread signals conversion, with a buffer read, byte/nybble/bit swapping, data conversion, and buffer write split across multiple threads. I won't pretend to know which is truly more common, or more deserving of the threading model, but the second is certainly easier to use from an end-user perspective, and the first can be converted to the second with relative ease. This model allows different threads to do different things with little impact on the footprint of the program, by allowing each thread to use its own modules.
use English; # In main thread at compilation
use Threads; # In main thread at compilation
Threads->use("Foo"); # Loads Foo into main thread at runtime.
my $thread2 = Threads->new(\&start_thread2);
...
sub start_thread2
{
...
# Before this sub was called, the second thread was created, and it
# reused English and Threads.
Threads->use("Bar"); # Remember, this is this thread's Threads
}
Rationale: Consistent with above, with some pleasant side-effects. With inclusion rules being the same, a thread won't use something it already received globally, and modules themselves can use the original, standard syntax. Since the inclusion is deferred to runtime, a use Module will load into the thread-space of the thread that invoked its use method. The downside, of course, is that module inclusion is deferred to runtime. Perhaps a separate RFC on program invariant checking is necessary....
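Threads->use is part of this proposal and has no direct Perl 5 equivalent. A loose approximation of per-thread runtime inclusion is a plain require inside the thread body, again using the later Perl 5 threads module on a thread-enabled perl. Note one difference from the discussion above: ithreads clone the parent's already-loaded modules rather than re-using them from scratch. File::Basename and Text::ParseWords are arbitrary core modules chosen for illustration.

```perl
use strict;
use warnings;
use threads;
use File::Basename ();   # loaded at compile time, before any thread exists

my $worker = threads->create(sub {
    require Text::ParseWords;   # loaded at runtime, in this thread only
    my @words = Text::ParseWords::shellwords('alpha "beta gamma"');
    return join ',',
        scalar(@words),
        (exists $INC{'File/Basename.pm'} ? 'inherited' : 'missing');
});
my $report = $worker->join;

my $main_has_it = exists $INC{'Text/ParseWords.pm'} ? 'yes' : 'no';

print "worker: $report\n";
print "main has Text::ParseWords: $main_has_it\n";
```

The worker should report that it inherited the compile-time module, while the module it loaded for itself never appears in the main thread's %INC.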
BEGIN and END blocks will continue to be single-threaded compile-time constructs.
use Thread and the semantics it would add. See the notes below about module inclusion. (Obviously, other changes to the language notwithstanding.)
RFC 86: IPC Mailboxes for Threads and Signals
RFC 178: Lightweight Threads
RFC 185: Thread Programming Model
RFC 293: MT-Safe Autovariables in perl 5.005 Threading