perl5260delta (1) Linux Manual Page
NAME
perl5260delta – what is new for perl v5.26.0
DESCRIPTION
This document describes the differences between the 5.24.0 release and the 5.26.0 release.
Notice
This release includes three updates with widespread effects:
- "." no longer in @INC
For security reasons, the current directory (".") is no longer included by default at the end of the module search path (@INC). This may have widespread implications for the building, testing and installing of modules, and for the execution of scripts. See the section "Removal of the current directory (".") from @INC" for the full details.
- "do" may now warn
"do" now gives a deprecation warning when it fails to load a file which it would have loaded had "." been in @INC.
- In regular expression patterns, a literal left brace "{" should be escaped
See "Unescaped literal "{" characters in regular expression patterns are no longer permissible".
Core Enhancements
Lexical subroutines are no longer experimental
Using the "lexical_subs" feature introduced in v5.18 no longer emits a warning. Existing code that disables the "experimental::lexical_subs" warning category that the feature previously used will continue to work. The "lexical_subs" feature has no effect; all Perl code can use lexical subroutines, regardless of what feature declarations are in scope.
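As a minimal sketch of what this allows (the names "outer" and "helper" are hypothetical and chosen only for illustration), a lexical subroutine is declared with "my sub" and is visible only within the enclosing block:

```perl
use v5.26;      # from 5.26 on, no feature declaration is needed for "my sub"
use warnings;

sub outer {
    # helper() is lexically scoped: it is not installed in the package
    # symbol table and cannot be called from outside outer().
    my sub helper {
        my ($n) = @_;
        return $n * 2;
    }
    return helper(21);
}

say outer();
```

Because "helper" never enters the symbol table, it cannot clash with a package subroutine of the same name defined elsewhere.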
Indented Here-documents
This adds a new modifier "~" to here-docs that tells the parser that it should look for "/^\s*$DELIM\n/" as the closing delimiter.
These syntaxes are all supported:
<<~EOF;
<<~\EOF;
<<~'EOF';
<<~"EOF";
<<~`EOF`;
<<~ 'EOF';
<<~ "EOF";
<<~ `EOF`;
The "~" modifier will strip, from each line in the here-doc, the same whitespace that appears before the delimiter.
Newlines will be copied as-is, and lines that don’t include the proper beginning whitespace will cause perl to croak.
For example:
    if (1) {
        print <<~EOF;
            Hello there
            EOF
    }
prints “Hello there\n” with no leading whitespace.
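A slightly larger sketch may help show how the stripping interacts with lines that are indented further than the delimiter. The helper name "usage_text" and the message are hypothetical; the body is indented with spaces only:

```perl
use v5.26;
use warnings;

# Hypothetical helper that builds a usage message. The closing END is
# preceded by eight spaces, so eight spaces are removed from every line
# of the body; any indentation beyond that is kept.
sub usage_text {
    my ($name) = @_;
    return <<~"END";
        Usage: $name [options]
          -h  show help
        END
}

print usage_text("demo");
```

The second body line keeps its two extra spaces, so relative indentation inside the here-doc survives.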
New regular expression modifier /xx
Specifying two "x" characters to modify a regular expression pattern does everything that a single one does, but additionally TAB and SPACE characters within a bracketed character class are generally ignored and can be added to improve readability, like "/[ ^ A-Z d-f p-x ]/xx". Details are at “/x and /xx” in perlre.
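For illustration, here is a small sketch using "/xx" to space out the ranges in a character class. The names "$hex" and "is_hex" are hypothetical:

```perl
use v5.26;
use warnings;

# Under /xx the blanks inside the bracketed class are ignored, so the
# three ranges can be separated visually. Under a single /x the blanks
# would be members of the class.
my $hex = qr/ \A [ 0-9 a-f A-F ]+ \z /xx;

sub is_hex { return $_[0] =~ $hex ? 1 : 0 }

say is_hex("DEADbeef");     # only hex digits
say is_hex("dead beef");    # contains a space, which is not in the class
```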
@{^CAPTURE}, %{^CAPTURE}, and %{^CAPTURE_ALL}
"@{^CAPTURE}" exposes the capture buffers of the last match as an array. So $1 is "${^CAPTURE}[0]". This is a more efficient equivalent to code like "substr($matched_string,$-[0],$+[0]-$-[0])", and you don’t have to keep track of the $matched_string either. This variable has no single character equivalent. Note that, like the other regex magic variables, the contents of this variable is dynamic; if you wish to store it beyond the lifetime of the match you must copy it to another array.
"%{^CAPTURE}" is equivalent to "%+" (i.e., named captures). Other than being more self-documenting there is no difference between the two forms.
"%{^CAPTURE_ALL}" is equivalent to "%-" (i.e., all named captures). Other than being more self-documenting there is no difference between the two forms.
Declaring a reference to a variable
As an experimental feature, Perl now allows the referencing operator to come after "my()", "state()", "our()", or "local()". This syntax must be enabled with "use feature 'declared_refs'". It is experimental, and will warn by default unless "no warnings 'experimental::declared_refs'" is in effect; assigning through the declared reference additionally falls under the "experimental::refaliasing" category. It is intended mainly for use in assignments to references. For example:
use experimental 'refaliasing', 'declared_refs';
my \$a = \$b;
See “Assigning to References” in perlref for more details.
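The following sketch spells out the feature and warning declarations by hand instead of relying on the "experimental" module. The helper name "bump_via_alias" is hypothetical:

```perl
use v5.26;
use warnings;
use feature qw(refaliasing declared_refs);
no warnings qw(experimental::refaliasing experimental::declared_refs);

# Hypothetical helper: $alias is declared and aliased to $total in one
# statement, so modifying $alias modifies $total.
sub bump_via_alias {
    my ($start) = @_;
    my $total = $start;
    my \$alias = \$total;
    $alias += 5;
    return $total;
}

say bump_via_alias(10);
```

Both features remain experimental, so this is best kept out of code that must run unchanged on future perls.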
Unicode 9.0 is now supported
A list of changes is at <http://www.unicode.org/versions/Unicode9.0.0/>. Modules that are shipped with core Perl but not maintained by p5p do not necessarily support Unicode 9.0. Unicode::Normalize does work on 9.0.
Use of \p{script} uses the improved Script_Extensions property
Unicode 6.0 introduced an improved form of the Script ("sc") property, and called it Script_Extensions ("scx"). Perl now uses this improved version when a property is specified as just "\p{script}". This should make programs more accurate when determining if a character is used in a given script, but there is a slight chance of breakage for programs that very specifically needed the old behavior. The meaning of compound forms, like "\p{sc=script}", is unchanged. See “Scripts” in perlunicode.
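As a sketch of the difference, consider U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK, whose Script value is Common but whose Script_Extensions value lists both Hiragana and Katakana. The choice of character and the helper name "matches" are illustrative only:

```perl
use v5.26;
use warnings;

my $mark = "\x{30FC}";   # KATAKANA-HIRAGANA PROLONGED SOUND MARK

sub matches { my ($re) = @_; return $mark =~ $re ? 1 : 0 }

say matches(qr/\p{Hiragana}/);       # single form: now Script_Extensions
say matches(qr/\p{scx=Hiragana}/);   # explicit Script_Extensions
say matches(qr/\p{sc=Hiragana}/);    # explicit Script: the mark is Common
```

Code that relied on the single form excluding such shared characters would need to switch to the explicit "sc=" compound form.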
Perl can now do default collation in UTF-8 locales on platforms that support it
Some platforms natively do a reasonable job of collating and sorting in UTF-8 locales. Perl now works with those. For portability and full control, Unicode::Collate is still recommended, but now you may not need to do anything special to get good-enough results, depending on your application. See "Category "LC_COLLATE": Collation: Text Comparisons and Sorting" in perllocale.
Better locale collation of strings containing embedded NUL characters
In locales that have multi-level character weights, "NUL"s are now ignored at the higher priority ones. There are still some gotchas in some strings, though. See "Collation of strings containing embedded "NUL" characters" in perllocale.
CORE subroutines for hash and array functions callable via reference
The hash and array functions in the "CORE" namespace ("keys", "each", "values", "push", "pop", "shift", "unshift" and "splice") can now be called with ampersand syntax ("&CORE::keys(\%hash)") and via reference ("my $k = \&CORE::keys; $k->(\%hash)"). Previously they could only be used when inlined.
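A short sketch of calling these functions through references follows. The wrapper names "keys_via_ref" and "push_via_ref" are hypothetical; note that the hash or array is passed as a reference:

```perl
use v5.26;
use warnings;

# Hypothetical wrappers around references to the built-ins.
sub keys_via_ref {
    my ($hash) = @_;
    my $keys = \&CORE::keys;
    my @found = $keys->($hash);
    return sort @found;
}

sub push_via_ref {
    my ($array, @items) = @_;
    my $push = \&CORE::push;
    return $push->($array, @items);   # new element count, as with push
}

my @queue = (1, 2);
push_via_ref(\@queue, 3, 4);
say "@queue";
say join ",", keys_via_ref({ red => 1, green => 2 });
```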
New Hash Function For 64-bit Builds
We have switched to a hybrid hash function to better balance performance for short and long keys.
For short keys, 16 bytes and under, we use an optimised variant of One At A Time Hard, and for longer keys we use Siphash 1-3. For very long keys this is a big improvement in performance. For shorter keys there is a modest improvement.
Security
Removal of the current directory (.) from @INC
The perl binary includes a default set of paths in @INC. Historically it has also included the current directory (".") as the final entry, unless run with taint mode enabled ("perl -T"). While convenient, this has security implications: for example, where a script attempts to load an optional module when its current directory is untrusted (such as /tmp), it could load and execute code from under that directory.
Starting with v5.26, "." is always removed by default, not just under tainting. This has major implications for installing modules and executing scripts.
The following new features have been added to help ameliorate these issues.
- Configure -Udefault_inc_excludes_dot
There is a new Configure option, "default_inc_excludes_dot" (enabled by default) which builds a perl executable without "."; unsetting this option using "-U" reverts perl to the old behaviour. This may fix your path issues but will reintroduce all the security concerns, so don’t build a perl executable like this unless you’re really confident that such issues are not a concern in your environment.
- "PERL_USE_UNSAFE_INC"
There is a new environment variable recognised by the perl interpreter. If this variable has the value 1 when the perl interpreter starts up, then "." will be automatically appended to @INC (except under tainting).
This allows you to restore the old perl interpreter behaviour on a case-by-case basis. But note that this is intended to be a temporary crutch, and this feature will likely be removed in some future perl version. It is currently set by the "cpan" utility and "Test::Harness" to ease installation of CPAN modules which have not been updated to handle the lack of dot. Once again, don’t use this unless you are sure that this will not reintroduce any security concerns.
- A new deprecation warning issued by "do".
While it is well-known that "use" and "require" use @INC to search for the file to load, many people don’t realise that "do "file"" also searches @INC if the file is a relative path. With the removal of ".", a simple "do "file.pl"" will fail to read in and execute "file.pl" from the current directory. Since this is commonly expected behaviour, a new deprecation warning is now issued whenever "do" fails to load a file which it otherwise would have found if a dot had been in @INC.
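To see which of these settings apply to a given interpreter, a diagnostic along the following lines may be useful. It is only a sketch; the helper name "has_dot" is hypothetical, and the Config key is assumed to carry the same name as the Configure option:

```perl
use v5.26;
use warnings;
use Config;

# Hypothetical helper: true if the given search path list contains ".".
sub has_dot { return scalar grep { $_ eq '.' } @_ }

say "default_inc_excludes_dot: ", $Config{default_inc_excludes_dot} // 'unknown';
say "PERL_USE_UNSAFE_INC:      ", $ENV{PERL_USE_UNSAFE_INC} // 'unset';
say "'.' currently in \@INC:    ", has_dot(@INC) ? "yes" : "no";
```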
Here are some things script and module authors may need to do to make their software work in the new regime.
- Script authors
If the issue is within your own code (rather than within included modules), then you have two main options. Firstly, if you are confident that your script will only be run within a trusted directory (under which you expect to find trusted files and modules), then add "." back into the path; e.g.:
    BEGIN {
        my $dir = "/some/trusted/directory";
        chdir $dir or die "Can't chdir to $dir: $!\n";
        # safe now
        push @INC, '.';
    }
    use Foo::Bar;     # may load /some/trusted/directory/Foo/Bar.pm
    do "config.pl";   # may load /some/trusted/directory/config.pl
On the other hand, if your script is intended to be run from within untrusted directories (such as /tmp), then your script suddenly failing to load files may be indicative of a security issue. You most likely want to replace any relative paths with full paths; for example,
    do "foo_config.pl"
might become
    do "$ENV{HOME}/foo_config.pl"
If you are absolutely certain that you want your script to load and execute a file from the current directory, then use a "./" prefix; for example:
    do "./foo_config.pl"
- Installing and using CPAN modules
If you install a CPAN module using an automatic tool like "cpan", then this tool will itself set the "PERL_USE_UNSAFE_INC" environment variable while building and testing the module, which may be sufficient to install a distribution which hasn’t been updated to be dot-aware. If you want to install such a module manually, then you’ll need to replace the traditional invocation:
    perl Makefile.PL && make && make test && make install
with something like
    (export PERL_USE_UNSAFE_INC=1; \
     perl Makefile.PL && make && make test && make install)
Note that this only helps build and install an unfixed module. It’s possible for the tests to pass (since they were run under "PERL_USE_UNSAFE_INC=1"), but for the module itself to fail to perform correctly in production. In this case, you may have to temporarily modify your script until a fixed version of the module is released. For example:
    use Foo::Bar;
    {
        local @INC = (@INC, '.');
        # assuming read_config() needs '.' in @INC
        $config = Foo::Bar->read_config();
    }
This is only rarely expected to be necessary. Again, if doing this, assess the resultant risks first.
- Module Authors
If you maintain a CPAN distribution, it may need updating to run in a dotless environment. Although "cpan" and other such tools will currently set the "PERL_USE_UNSAFE_INC" during module build, this is a temporary workaround for the set of modules which rely on "." being in @INC for installation and testing, and this may mask deeper issues. It could result in a module which passes tests and installs, but which fails at run time.
During build, test, and install, it will normally be the case that any perl processes will be executing directly within the root directory of the untarred distribution, or a known subdirectory of that, such as t/. It may well be that Makefile.PL or t/foo.t will attempt to include local modules and configuration files using their direct relative filenames, which will now fail.
However, as described above, automatic tools like cpan will (for now) set the "PERL_USE_UNSAFE_INC" environment variable, which introduces dot during a build.
This makes it likely that your existing build and test code will work, but this may mask issues with your code which only manifest when used after install. It is prudent to try and run your build process with that variable explicitly disabled:
    (export PERL_USE_UNSAFE_INC=0; \
     perl Makefile.PL && make && make test && make install)
This is more likely to show up any potential problems with your module’s build process, or even with the module itself. Fixing such issues will ensure both that your module can again be installed manually, and that it will still build once the "PERL_USE_UNSAFE_INC" crutch goes away.
When fixing issues in tests due to the removal of dot from @INC, reinsertion of dot into @INC should be performed with caution, for this too may suppress real errors in your runtime code. You are encouraged wherever possible to apply the aforementioned approaches with explicit absolute/relative paths, or to relocate your needed files into a subdirectory and insert that subdirectory into @INC instead.
If your runtime code has problems under the dotless @INC, then the comments above on how to fix for script authors will mostly apply here too. Bear in mind though that it is considered bad form for a module to globally add a dot to @INC, since it introduces both a security risk and hides issues of accidentally requiring dot in @INC, as explained above.
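One way to follow the advice about explicit paths is to resolve the file against a known base directory before handing it to "do". The sketch below assumes a hypothetical helper "load_config" and a file that returns a hash reference; neither comes from the release notes:

```perl
use v5.26;
use warnings;
use File::Spec;

# Hypothetical helper: run a Perl file located under a known base
# directory. The path given to do() is absolute, so @INC is not searched
# and the presence or absence of "." makes no difference.
sub load_config {
    my ($base_dir, $name) = @_;
    my $path   = File::Spec->rel2abs(File::Spec->catfile($base_dir, $name));
    my $config = do $path;
    unless (defined $config) {
        die "Could not parse $path: $@" if $@;
        die "Could not read $path: $!";
    }
    return $config;
}
```

In a script, the base directory might be derived from the script's own location, for example with "$FindBin::Bin" from the core FindBin module, rather than from the current working directory.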
Escaped colons and relative paths in PATH
On Unix systems, Perl treats any relative paths in the "PATH" environment variable as tainted when starting a new process. Previously, it was allowing a backslash to escape a colon (unlike the OS), consequently allowing relative paths to be considered safe if the PATH was set to something like "/\:.". The check has been fixed to treat "." as tainted in that example.
New -Di switch is now required for PerlIO debugging output
This is used for debugging of code within PerlIO to avoid recursive calls. Previously this output would be sent to the file specified by the "PERLIO_DEBUG" environment variable if perl wasn’t running setuid and the "-T" or "-t" switches hadn’t been parsed yet.
If perl performed output at a point where it hadn’t yet parsed its switches this could result in perl creating or overwriting the file named by "PERLIO_DEBUG" even when the "-T" switch had been supplied.
Perl now requires the "-Di" switch to be present before it will produce PerlIO debugging output. By default this is written to "stderr", but can optionally be redirected to a file by setting the "PERLIO_DEBUG" environment variable.
If perl is running setuid or the "-T" switch was supplied, "PERLIO_DEBUG" is ignored and the debugging output is sent to "stderr" as for any other "-D" switch.
Incompatible Changes
Unescaped literal { characters in regular expression patterns are no longer permissible
To match a literal LEFT CURLY BRACKET, you now have to write something like "\{" or "[{]"; otherwise, it is a fatal pattern compilation error. This change will allow future extensions to the language.
These have been deprecated since v5.16, with a deprecation message raised for some uses starting in v5.22. Unfortunately, the code added to raise the message was buggy and failed to warn in some cases where it should have. Therefore, enforcement of this ban for these cases is deferred until Perl 5.30, but the code has been fixed to raise a default-on deprecation message for them in the meantime.
Some uses of literal "{" occur in contexts where we do not foresee the meaning ever being anything but the literal, such as the very first character in the pattern, or after a "|" meaning alternation. Thus
qr/{fee|{fie/
matches either of the strings "{fee" or "{fie". To avoid forcing unnecessary code changes, these uses do not need to be escaped, and no warning is raised about them, and there are no current plans to change this.
But it is always correct to escape "{", and the simple rule to remember is to always do so.
See "Unescaped left brace in regex is illegal here" in perldiag.
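The escaping rule can be sketched as follows. The pattern and the helper names "set_size" and "contains_literal" are hypothetical; "\Q...\E" is shown as a convenient way to escape braces that arrive in a variable:

```perl
use v5.26;
use warnings;

# Braces escaped with backslashes so they match literally.
my $set = qr/\Aset\{(\d+)\}\z/;

sub set_size { return $_[0] =~ $set ? $1 : undef }

# \Q...\E quotes every metacharacter in $literal, including "{".
sub contains_literal {
    my ($text, $literal) = @_;
    return $text =~ /\Q$literal\E/ ? 1 : 0;
}

say set_size("set{12}");
say contains_literal("a{1,2}b", "{1,2}");
```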
scalar(%hash) return signature changed
The value returned for "scalar(%hash)" will no longer show information about the buckets allocated in the hash. It will simply return the count of used keys. It is thus equivalent to "0+keys(%hash)".
A form of backward compatibility is provided via "Hash::Util::bucket_ratio()" which provides the same behavior as "scalar(%hash)" provided in Perl 5.24 and earlier.
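A brief sketch of both calls, with hypothetical wrapper names "key_count" and "ratio_of":

```perl
use v5.26;
use warnings;
use Hash::Util qw(bucket_ratio);

# Hypothetical wrappers taking a hash reference.
sub key_count { my ($h) = @_; return scalar(%$h) }        # number of keys
sub ratio_of  { my ($h) = @_; return bucket_ratio(%$h) }  # "used/allocated"

my %stock = (apple => 3, pear => 0, plum => 7);
say key_count(\%stock);
say ratio_of(\%stock);
```

The exact bucket figures depend on the hash implementation, so code should not rely on particular values.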
keys returned from an lvalue subroutine
"keys" returned from an lvalue subroutine can no longer be assigned to in list context.
    sub foo : lvalue { keys(%INC) }
    (foo) = 3;    # death
    sub bar : lvalue { keys(@_) }
    (bar) = 3;    # also an error
This makes the lvalue sub case consistent with "(keys %hash) = ..." and "(keys @_) = ...", which are also errors. [perl #128187] <https://rt.perl.org/Public/Bug/Display.html?id=128187>
The ${^ENCODING} facility has been removed
The special behaviour associated with assigning a value to this variable has been removed. As a consequence, the encoding pragma’s default mode is no longer supported. If you still need to write your source code in encodings other than UTF-8, use a source filter such as Filter::Encoding on CPAN or encoding’s "Filter" option.
POSIX::tmpnam() has been removed
The fundamentally unsafe "tmpnam()" interface was deprecated in Perl 5.22 and has now been removed. In its place, you can use, for example, the File::Temp interfaces.
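As one possible replacement, the sketch below uses "tempfile()" from File::Temp. The helper name "write_scratch" and the file name template are hypothetical:

```perl
use v5.26;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write data to a newly created temporary file and
# return its name. tempfile() creates and opens the file in one step, so
# there is no gap between choosing a name and opening it.
sub write_scratch {
    my ($data) = @_;
    my ($fh, $filename) = tempfile("scratchXXXXXX", TMPDIR => 1, UNLINK => 1);
    print {$fh} $data;
    close $fh or die "Cannot close $filename: $!";
    return $filename;
}

my $file = write_scratch("temporary data\n");
say "wrote $file";
```

With "UNLINK => 1" the file is removed when the program exits.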
require ::Foo::Bar is now illegal.
Formerly, "require ::Foo::Bar" would try to read /Foo/Bar.pm. Now any bareword require which starts with a double colon dies instead.
Literal control character variable names are no longer permissible
A variable name may no longer contain a literal control character under any circumstances. These previously were allowed in single-character names on ASCII platforms, but have been deprecated there since Perl 5.20. This affects things like "$
