perl5280delta (1) Linux Manual Page
NAME
perl5280delta – what is new for perl v5.28.0
DESCRIPTION
This document describes differences between the 5.26.0 release and the 5.28.0 release.
If you are upgrading from an earlier release such as 5.24.0, first read perl5260delta, which describes differences between 5.24.0 and 5.26.0.
Core Enhancements
Unicode 10.0 is supported
A list of changes is at <http://www.unicode.org/versions/Unicode10.0.0>.
delete on key/value hash slices
"delete" can now be used on key/value hash slices, returning the keys along with the deleted values. [perl #131328] <https://rt.perl.org/Ticket/Display.html?id=131328>
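A minimal sketch of the new form, assuming a throwaway hash (the names %h and %removed are illustrative only, and the code needs perl 5.28 or later):

```perl
use strict;
use warnings;

# Hypothetical data, for illustration only.
my %h = (a => 1, b => 2, c => 3);

# Key/value slice delete: removes the listed keys and returns them
# interleaved with the values they used to hold.
my %removed = delete %h{qw(a b)};

printf "removed: %s\n", join ", ", map { "$_=$removed{$_}" } sort keys %removed;
printf "left:    %s\n", join ", ", map { "$_=$h{$_}" } sort keys %h;
```

Assigning the result to a hash is only one option; the return value is an ordinary list of key/value pairs.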
Experimentally, there are now alphabetic synonyms for some regular expression assertions
If you find it difficult to remember how to write certain of the pattern assertions, there are now alphabetic synonyms.
CURRENT   NEW SYNONYMS
-------   ------------
(?=...)   (*pla:...) or (*positive_lookahead:...)
(?!...)   (*nla:...) or (*negative_lookahead:...)
(?<=...)  (*plb:...) or (*positive_lookbehind:...)
(?<!...)  (*nlb:...) or (*negative_lookbehind:...)
(?>...)   (*atomic:...)
These are considered experimental, so using any of these will raise (unless turned off) a warning in the "experimental::alpha_assertions" category.
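As a small illustration, the sketch below compiles the same lookahead three ways and checks that they agree. The sample strings are made up for the example, and the warning category is switched off explicitly.

```perl
use v5.28;
use warnings;
no warnings 'experimental::alpha_assertions';

my $classic = qr/foo(?=bar)/;                    # traditional spelling
my $spelled = qr/foo(*positive_lookahead:bar)/;  # long synonym
my $short   = qr/foo(*pla:bar)/;                 # short synonym

# Illustrative inputs: one that satisfies the lookahead, one that does not.
for my $text ("foobar", "foobaz") {
    my @hits = map { $text =~ $_ ? 1 : 0 } $classic, $spelled, $short;
    say "$text: @hits";
}
```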
Mixed Unicode scripts are now detectable
A mixture of scripts, such as Cyrillic and Latin, in a string is often the sign of a spoofing attack. A new regular expression construct now allows for easy detection of these. For example, you can say
qr/(*script_run: \d+ )/x
And the digits matched will all be from the same set of 10. You won’t get a look-alike digit from a different script that has a different value than what it appears to be.
Or:
qr/(*sr: \w+ )/x
makes sure that all the characters come from the same script.
You can also combine script runs with "(?>...)" (or "(*atomic:...)").
Instead of writing:
(*sr:(?>...))
you can now write:
(*asr:...)
# or
(*atomic_script_run:...)
This is considered experimental, so using it will raise (unless turned off) a warning in the "experimental::script_run" category.
See “Script Runs” in perlre.
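A hedged sketch of how such a check might look in practice. The helper name single_script and the sample strings are assumptions made for this example, not part of perl.

```perl
use v5.28;
use warnings;
no warnings 'experimental::script_run';

# Hypothetical helper: true if $word is one run of word characters
# that all come from a single script.
sub single_script {
    my ($word) = @_;
    return $word =~ /\A(*script_run:\w+)\z/ ? 1 : 0;
}

my $latin = "paypal";
my $mixed = "p\x{0430}ypal";   # U+0430 is CYRILLIC SMALL LETTER A

say "latin: ", single_script($latin);
say "mixed: ", single_script($mixed);
```

Anchoring the pattern with \A and \z matters here: without the anchors the regex engine is free to match a shorter substring that does form a script run.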
In-place editing with perl -i is now safer
Previously in-place editing ("perl -i") would delete or rename the input file as soon as you started working on a new file.
Without backups this would result in loss of data if there was an error, such as a full disk, when writing to the output file.
This has changed so that the input file isn’t replaced until the output file has been completely written and successfully closed.
This works by creating a work file in the same directory, which is renamed over the input file once the output file is complete.
Incompatibilities:
- Since this renaming needs to happen only once, if you create a thread or child process, that renaming will only happen in the original thread or process.
- If you change directories while processing a file, and your operating system doesn’t provide the "unlinkat()", "renameat()" and "fchmodat()" functions, the final rename step may fail.
[perl #127663] <https://rt.perl.org/Public/Bug/Display.html?id=127663>
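The same write-then-rename idea can be imitated in user code. The following is only a sketch of the general technique, not perl's internal implementation: the helper rewrite_atomically is hypothetical, and it ignores details such as preserving file permissions or removing the work file after a failure.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: pass every line of $path through $transform and
# replace the original only after the new contents are safely written.
sub rewrite_atomically {
    my ($path, $transform) = @_;

    open my $in, '<', $path or die "Cannot read $path: $!";

    # The work file lives in the same directory so that rename()
    # stays on one filesystem.
    my ($out, $tmp) = tempfile(DIR => dirname($path), UNLINK => 0);

    while (my $line = <$in>) {
        print {$out} $transform->($line)
            or die "Write to $tmp failed: $!";
    }
    close $in;

    # Errors such as a full disk may only surface at close time.
    close $out or die "Close of $tmp failed: $!";

    rename $tmp, $path or die "Cannot rename $tmp to $path: $!";
    return;
}
```

A call such as rewrite_atomically("notes.txt", sub { uc $_[0] }) would upper-case a file in place; the file name here is illustrative.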
Initialisation of aggregate state variables
A persistent lexical array or hash variable can now be initialized by an expression such as "state @a = qw(x y z)". Initialization of a list of persistent lexical variables is still not possible.
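For instance, a sub can now keep a pre-filled array between calls. This is a sketch with made-up names (next_colour, @colours):

```perl
use v5.28;
use warnings;

# Hypothetical example: cycle through a fixed list, remembering the
# position between calls.
sub next_colour {
    state @colours = qw(red green blue);   # initialised once, on the first call
    state $calls   = 0;
    return $colours[ $calls++ % @colours ];
}

say join " ", map { next_colour() } 1 .. 4;
```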
Full-size inode numbers
On platforms where inode numbers are of a type larger than perl’s native integer numerical types, stat will preserve the full content of large inode numbers by returning them in the form of strings of decimal digits. Exact comparison of inode numbers can thus be achieved by comparing with "eq" rather than "==". Comparison with "==", and other numerical operations (which are usually meaningless on inode numbers), work as well as they did before, which is to say they fall back to floating point, and ultimately operate on a fairly useless rounded inode number if the real inode number is too big for the floating point format.
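A minimal sketch of an exact comparison, assuming a hypothetical helper named same_file. It compares both the device and the inode fields as strings:

```perl
use strict;
use warnings;

# Hypothetical helper: true if both paths refer to the same file.
sub same_file {
    my ($path_a, $path_b) = @_;
    my @a = stat $path_a or return 0;
    my @b = stat $path_b or return 0;

    # Element 0 is the device number, element 1 the inode number.
    # String comparison stays exact even when stat returns a decimal
    # string for an inode too large for a native integer.
    return ($a[0] eq $b[0] && $a[1] eq $b[1]) ? 1 : 0;
}

print same_file('.', '.') ? "same file\n" : "different files\n";
```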
The sprintf %j format size modifier is now available with pre-C99 compilers
The actual size used depends on the platform, so remains unportable.
Close-on-exec flag set atomically
When opening a file descriptor, perl now generally opens it with its close-on-exec flag already set, on platforms that support doing so. This improves thread safety, because it means that an "exec" initiated by one thread can no longer cause a file descriptor in the process of being opened by another thread to be accidentally passed to the executed program.
Additionally, perl now sets the close-on-exec flag more reliably, whether it does so atomically or not. Most file descriptors were getting the flag set, but some were being missed.
String- and number-specific bitwise ops are no longer experimental
The new string-specific ("&. |. ^. ~.") and number-specific ("& | ^ ~") bitwise operators introduced in Perl 5.22 that are available within the scope of "use feature 'bitwise'" are no longer experimental. Because the number-specific ops are spelled the same way as the existing operators that choose their behaviour based on their operands, these operators must still be enabled via the “bitwise” feature, in either of these two ways:
use feature "bitwise";
use v5.28; # "bitwise" now included
They are also now enabled by the -E command-line switch.
The “bitwise” feature no longer emits a warning. Existing code that disables the “experimental::bitwise” warning category that the feature previously used will continue to work.
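To make the distinction concrete, here is a short sketch with arbitrarily chosen operands, showing the two operator families applied to the same pair of strings:

```perl
use v5.28;          # also enables the "bitwise" feature
use warnings;

my ($x, $y) = ("12", "3");

my $numeric = $x |  $y;   # operands treated as numbers: 12 | 3
my $string  = $x |. $y;   # operands treated as strings, byte by byte

say "numeric: $numeric";  # 15
say "string:  $string";   # 32
```

The string result comes from OR-ing "1" (0x31) with "3" (0x33), which gives "3", and carrying over the unmatched "2" from the longer operand.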
One caveat that module authors ought to be aware of is that the numeric operators now pass a fifth TRUE argument to overload methods. Any methods that check the number of operands may croak if they do not expect so many. XS authors in particular should be aware that this:
SV *
bitop_handler (lobj, robj, swap)
may need to be changed to this:
SV *
bitop_handler (lobj, robj, swap, ...)
Locales are now thread-safe on systems that support them
These systems include Windows starting with Visual Studio 2005, and POSIX 2008 systems.
The implication is that you are now free to use locales and change them in a threaded environment. Your changes affect only your thread. See “Multi-threaded operation” in perllocale.
New read-only predefined variable ${^SAFE_LOCALES}
This variable is 1 if the Perl interpreter is operating in an environment where it is safe to use and change locales (see perllocale). This variable is true when the perl is unthreaded, or compiled on a platform that supports thread-safe locale operation (see previous item).
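A program might consult the variable before touching locale settings, roughly as in this sketch (the message strings are illustrative):

```perl
use v5.28;
use warnings;

# Pick a message based on whether locale changes are safe here.
my $mode = ${^SAFE_LOCALES}
    ? "locale changes are safe in this interpreter"
    : "locale changes may affect other threads";

say $mode;
```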
Security
[CVE-2017-12837] Heap buffer overflow in regular expression compiler
Compiling certain regular expression patterns with the case-insensitive modifier could cause a heap buffer overflow and crash perl. This has now been fixed. [perl #131582] <https://rt.perl.org/Public/Bug/Display.html?id=131582>
[CVE-2017-12883] Buffer over-read in regular expression parser
For certain types of syntax error in a regular expression pattern, the error message could either contain the contents of a random, possibly large, chunk of memory, or could crash perl. This has now been fixed. [perl #131598] <https://rt.perl.org/Public/Bug/Display.html?id=131598>
[CVE-2017-12814] $ENV{$key} stack buffer overflow on Windows
A possible stack buffer overflow in the %ENV code on Windows has been fixed by removing the buffer completely since it was superfluous anyway. [perl #131665] <https://rt.perl.org/Public/Bug/Display.html?id=131665>
Default Hash Function Change
Perl 5.28.0 retires various older hash functions which are not viewed as sufficiently secure for use in Perl. We now support four general purpose hash functions: Siphash (2-4 and 1-3 variants), Zaphod32, and StadtX hash. In addition we support SBOX32 (a form of tabular hashing) for hashing short strings, in conjunction with any of the other hash functions provided.
By default Perl is configured to support SBOX hashing of strings up to 24 characters, in conjunction with StadtX hashing on 64 bit builds, and Zaphod32 hashing for 32 bit builds.
You may control these settings with the following options to Configure:
-DPERL_HASH_FUNC_SIPHASH
-DPERL_HASH_FUNC_SIPHASH13
-DPERL_HASH_FUNC_STADTX
-DPERL_HASH_FUNC_ZAPHOD32
To disable SBOX hashing you can use
-DPERL_HASH_USE_SBOX32_ALSO=0
To set the maximum string length for which SBOX32 hashing is used:
-DSBOX32_MAX_LEN=16
The maximum length allowed is 256. There probably isn’t much point in setting it higher than the default.
Incompatible Changes
Subroutine attribute and signature order
The experimental subroutine signatures feature has been changed so that subroutine attributes must now come before the signature rather than after. This is because attributes like ":lvalue" can affect the compilation of code within the signature, for example:
sub f :lvalue ($a = do { $x = "abc"; return substr($x,0,1) }) {
    ...
}
Note that this is the second time they have been flipped:
sub f :lvalue ($a, $b) { ... };   # 5.20; 5.28 onwards
sub f ($a, $b) :lvalue { ... };   # 5.22 - 5.26
Comma-less variable lists in formats are no longer allowed
Omitting the commas between variables passed to formats is no longer allowed. This has been deprecated since Perl 5.000.
The :locked and :unique attributes have been removed
These have been no-ops and deprecated since Perl 5.12 and 5.10, respectively.
\N{} with nothing between the braces is now illegal
This has been deprecated since Perl 5.24.
Opening the same symbol as both a file and directory handle is no longer allowed
Using "open()" and "opendir()" to associate both a filehandle and a dirhandle to the same symbol (glob or scalar) has been deprecated since Perl 5.10.
Use of bare << to mean <<"" is no longer allowed
Use of a bare terminator has been deprecated since Perl 5.000.
Setting $/ to a reference to a non-positive integer is no longer allowed
This used to work like setting it to "undef", but has been deprecated since Perl 5.20.
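For contrast, a reference to a positive integer remains the supported way to request fixed-size records. A small sketch using an in-memory filehandle and made-up data:

```perl
use strict;
use warnings;

my $data = "abcdefghij";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = \4;        # read records of at most 4 bytes each
    @records = <$fh>;
}
close $fh;

print join("|", @records), "\n";   # abcd|efgh|ij
```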
Unicode code points with values exceeding IV_MAX are now fatal
This was deprecated since Perl 5.24.
The B::OP::terse method has been removed
Use "B::Concise::b_terse" instead.
Use of inherited AUTOLOAD for non-methods is no longer allowed
This was deprecated in Perl 5.004.
Use of strings with code points over 0xFF is not allowed for bitwise string operators
Code points over 0xFF do not make sense for bitwise operators and such an operation will now croak, except for a few remaining cases. See perldeprecation.
This was deprecated in Perl 5.24.
Setting ${^ENCODING} to a defined value is now illegal
This has been deprecated since Perl 5.22 and a no-op since Perl 5.26.
Backslash no longer escapes colon in PATH for the -S switch
Previously the "-S" switch incorrectly treated backslash (“\”) as an escape for colon when traversing the "PATH" environment variable. [perl #129183] <https://rt.perl.org/Ticket/Display.html?id=129183>
The -DH (DEBUG_H) misfeature has been removed
On a perl built with debugging support, the "H" flag to the "-D" debugging option has been removed. This was supposed to dump hash values, but has been broken for many years.
Yada-yada is now strictly a statement
By the time of its initial stable release in Perl 5.12, the "..." (yada-yada) operator was explicitly intended to serve as a statement, not an expression. However, the original implementation was confused on this point, leading to inconsistent parsing. The operator was accidentally accepted in a few situations where it did not serve as a complete statement, such as
... . "foo";
... if $a < $b;
The parsing has now been made consistent, permitting yada-yada only as a statement. Affected code can use "do{...}" to put a yada-yada into an arbitrary expression context.
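The two accepted shapes might look like this in practice; the sub names todo and pick are invented for the sketch:

```perl
use strict;
use warnings;

# Yada-yada as a complete statement: always accepted.
sub todo { ... }

# Inside an expression, wrap it in do { } instead.
sub pick {
    my ($ready) = @_;
    return $ready ? "done" : do { ... };
}

print pick(1), "\n";

# Reaching the placeholder at run time throws an exception.
my $ok = eval { pick(0); 1 };
print "pick(0) died: $@" unless $ok;
```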
Sort algorithm can no longer be specified
Since Perl 5.8, the sort pragma has had subpragmata "_mergesort", "_quicksort", and "_qsort" that can be used to specify which algorithm perl should use to implement the sort builtin. This was always considered a dubious feature that might not last, hence the underscore spellings, and they were documented as not being portable beyond Perl 5.8. These subpragmata have now been deleted, and any attempt to use them is an error. The sort pragma otherwise remains, and the algorithm-neutral "stable" subpragma can be used to control sorting behaviour. [perl #119635] <https://rt.perl.org/Ticket/Display.html?id=119635>
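As of this release, code that only needs a stable sort can say so without naming an algorithm. A brief sketch with an arbitrary word list:

```perl
use strict;
use warnings;
use sort 'stable';   # request stability, not a particular algorithm

my @words  = qw(pear fig apple kiwi plum);

# Words of equal length keep their original relative order.
my @sorted = sort { length($a) <=> length($b) } @words;

print "@sorted\n";   # fig pear kiwi plum apple
```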
Over-radix digits in floating point literals
Octal and binary floating point literals used to permit any hexadecimal digit to appear after the radix point. The digits are now restricted to those appropriate for the radix, as digits before the radix point always were.
Return type of unpackstring()
The return types of the C API functions "unpackstring()" and "unpack_str()" have changed from "I32" to "SSize_t", in order to accommodate datasets of more than two billion items.
Deprecations
Use of vec on strings with code points above 0xFF is deprecated
Such strings are represented internally in UTF-8, and "vec" is a bit-oriented operation that will likely give unexpected results on those strings.
Some uses of unescaped { in regexes are no longer fatal
Perl 5.26.0 fatalized some uses of an unescaped left brace, but an exception was made at the last minute, specifically crafted to be a minimal change to allow GNU Autoconf to work. That tool is heavily depended upon, and continues to use the deprecated usage. Its use of an unescaped left brace is one where we have no intention of repurposing "{" to be something other than itself.
That exception is now generalized to include various other such cases where the "{" will not be repurposed.
Note that these uses continue to raise a deprecation message.
Use of unescaped { immediately after a ( in regular expression patterns is deprecated
Using unescaped left braces is officially deprecated everywhere, but it is not enforced in contexts where their use does not interfere with expected extensions to the language. A deprecation is added in this release when the brace appears immediately after an opening parenthesis. Before this, even if the brace was part of a legal quantifier, it was not interpreted as such, but as the literal characters, unlike other quantifiers that follow a "(" which are considered errors. Now, their use will raise a deprecation message, unless turned off.
Assignment to $[ will be fatal in Perl 5.30
Assigning a non-zero value to $[ has been deprecated since Perl 5.12, but was never given a deadline for removal. This has now been scheduled for Perl 5.30.
hostname() won’t accept arguments in Perl 5.32
Passing arguments to "Sys::Hostname::hostname()" was already deprecated, but didn’t have a removal date. This has now been scheduled for Perl 5.32. [perl #124349] <https://rt.perl.org/Ticket/Display.html?id=124349>
Module removals
The following modules will be removed from the core distribution in a future release, and will at that time need to be installed from CPAN. Distributions on CPAN which require these modules will need to list them as prerequisites.
The core versions of these modules will now issue "deprecated"-category warnings to alert you to this fact. To silence these deprecation warnings, install the modules in question from CPAN.
Note that these are (with rare exceptions) fine modules that you are encouraged to continue to use. Their disinclusion from core primarily hinges on their necessity to bootstrapping a fully functional, CPAN-capable Perl installation, not usually on concerns over their design.
- B::Debug
- Locale::Codes and its associated Country, Currency and Language modules
Performance Enhancements
- The start up overhead for creating regular expression patterns with Unicode properties ("\p{...}") has been greatly reduced in most cases.
- Many string concatenation expressions are now considerably faster, due to the introduction internally of a "multiconcat" opcode which combines multiple concatenations, and optionally a "=" or ".=", into a single action. For example, apart from retrieving $s, $a and $b, this whole expression is now handled as a single op:
$s .= "a=$a b=$b "
As a special case, if the LHS of an assignment is a lexical variable or "my $s", the op itself handles retrieving the lexical variable, which is faster.
In general, the more the expression includes a mix of constant strings and variable expressions, the longer the expression, and the more it mixes together non-utf8 and utf8 strings, the more marked the performance improvement. For example on a "x86_64" system, this code has been benchmarked running four times faster:
my $s;
my $a = "ab
