perl5300delta - what is new for perl v5.30.0
This document describes differences between the 5.28.0 release and the 5.30.0 release.
If you are upgrading from an earlier release such as 5.26.0, first read perl5280delta, which describes differences between 5.26.0 and 5.28.0.
sv_utf8_(downgrade|decode) are no longer marked as experimental. [GH #16822].
Using a variable-length lookbehind assertion (like (?<=foo?) or (?<!ba{1,9}r)) previously would generate an error and refuse to compile. Now it compiles (if the maximum lookbehind is at most 255 characters), but raises a warning in the new experimental::vlb warnings category. This is to caution you that the precise behavior is subject to change based on feedback from use in the field.
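As a minimal sketch of the behavior described above (the sample tags and variable names are hypothetical, and Perl 5.30 or later is assumed), the following pattern uses a lookbehind whose length varies between one and three characters and silences the experimental warning:

```perl
use v5.30;
use warnings;
# The feature is experimental in 5.30, so silence its warning category.
no warnings 'experimental::vlb';

# Hypothetical sample data: capture a number only when it follows "v" or "ver".
my @tags = ('v12', 'ver7', 'x99');
my @versions = map { /(?<=\bv(?:er)?)(\d+)/ ? $1 : () } @tags;

print "@versions\n";   # prints "12 7"
```

Without the `no warnings` line the pattern should still compile, but a warning in the experimental::vlb category is expected.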
See "(?<=pattern)" in perlre and "(?<!pattern)" in perlre.
The upper limit "n" specifiable in a regular expression quantifier of the form "{m,n}" has been doubled to 65534
The meaning of an unbounded upper quantifier "{m,}" remains unchanged. It matches 2**31 - 1 times on most platforms, and more on ones where a C language short variable is more than 4 bytes long.
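A short sketch of the new limit, assuming Perl 5.30 or later (the pattern and test strings are made up for illustration):

```perl
use strict;
use warnings;

# 65534 is the documented maximum for "n" in {m,n} as of 5.30.
my $bounded = qr/\A(?:ab){2,65534}\z/;

# One repetition is too few; two or more (up to the limit) match.
my @results = map { $_ =~ $bounded ? 1 : 0 } ('ab', 'abab', 'ab' x 100);
print "@results\n";   # prints "0 1 1"
```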
Because of a change in Unicode release cycles, Perl jumps from Unicode 10.0 in Perl 5.28 to Unicode 12.1 in Perl 5.30.
For details on the Unicode changes, see https://www.unicode.org/versions/Unicode11.0.0/ for 11.0; https://www.unicode.org/versions/Unicode12.0.0/ for 12.0; and https://www.unicode.org/versions/Unicode12.1.0/ for 12.1. (Unicode 12.1 differs from 12.0 only in the addition of a single character, that for the new Japanese era name.)
The Word_Break property, as in past Perl releases, remains tailored to behave more in line with expectations of Perl users. This means that sequential runs of horizontal white space characters are not broken apart, but kept as a single run. Unicode 11 changed from past versions to be more in line with Perl, but it left several white space characters as causing breaks: TAB, NO BREAK SPACE, and FIGURE SPACE (U+2007). We have decided to continue to use the previous Perl tailoring with regard to these.
You can now do something like this in a regular expression pattern
qr! \p{nv= /(?x) \A [0-5] \z / }!
which matches all Unicode code points whose numeric value is between 0 and 5 inclusive. So, it could match the Thai or Bengali digits whose numeric values are 0, 1, 2, 3, 4, or 5.
This marks another step in implementing the regular expression features the Unicode Consortium suggests.
Most properties are supported, with the remainder planned for 5.32. Details are in "Wildcards in Property Values" in perlunicode.
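The example above can be tried with a few sample characters. This is a sketch only: it assumes Perl 5.30 or later, that the feature warns in the experimental::uniprop_wildcards category, and the sample hash is made up for illustration.

```perl
use v5.30;
use warnings;
# Assumption: the wildcard feature warns in this experimental category.
no warnings 'experimental::uniprop_wildcards';

# Match a single character whose Numeric_Value is 0 through 5.
my $small_number = qr!\A\p{nv=/\A[0-5]\z/}\z!;

my %samples = (
    ascii_three  => "3",
    ascii_seven  => "7",
    thai_three   => "\x{0E53}",   # THAI DIGIT THREE
    bengali_five => "\x{09EB}",   # BENGALI DIGIT FIVE
    thai_nine    => "\x{0E59}",   # THAI DIGIT NINE
);

for my $name (sort keys %samples) {
    printf "%-12s %s\n", $name,
        ($samples{$name} =~ $small_number ? 'matches' : 'no match');
}
```

A delimiter other than "/" is used for the outer pattern so that the slashes around the subpattern need no escaping.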
Previously it was an error to evaluate a named character \N{...}
within a single quoted regular expression pattern (whose evaluation is deferred from the normal place). This restriction is now removed.
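A minimal sketch, assuming Perl 5.30 or later (the sample string is hypothetical): single-quote delimiters suppress variable interpolation, but the named character is still resolved when the pattern is compiled.

```perl
use v5.30;
use warnings;

my $text = "caf\x{E9}";   # "cafe" with an acute accent on the final letter

# With m'...' nothing is interpolated, yet \N{...} is understood
# by the regular expression compiler.
my $matched = $text =~ m'caf\N{LATIN SMALL LETTER E WITH ACUTE}\z' ? 1 : 0;
print "$matched\n";
```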
Turkic languages have different casing rules than other languages for the characters "i" and "I". The uppercase of "i" is LATIN CAPITAL LETTER I WITH DOT ABOVE (U+0130); and the lowercase of "I" is LATIN SMALL LETTER DOTLESS I (U+0131). Unicode furnishes alternate casing rules for use with Turkic languages. Previously, Perl ignored these, but now, it uses them when it detects that it is operating under a Turkic UTF-8 locale.
Previously, these calls were only used when the perl was compiled to be multi-threaded. To always enable them, add
-Accflags='-DUSE_THREAD_SAFE_LOCALE'
to your Configure flags.
This macro is still defined but no longer used in core.
-Drv now means something on -DDEBUGGING builds
Now, adding the verbose flag (-Dv) to the -Dr flag turns on all possible regular expression debugging.
Assigning non-zero to $[ is fatal
Setting $[ to a non-zero value has been deprecated since Perl 5.12 and now throws a fatal error. See "Assigning non-zero to $[ is fatal" in perldeprecation.
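The following sketch, assuming Perl 5.30 or later, shows the failure being trapped and one portable alternative: keep arrays zero-based and apply the offset explicitly. The variable names and data are hypothetical.

```perl
use strict;
use warnings;

# A string eval traps the error whether it is raised at compile time
# or at run time, so this script itself keeps running.
my $ok = eval q{ $[ = 1; 1 };
my $error = $ok ? '' : $@;
print "assignment to \$[ succeeded: ", ($ok ? 'yes' : 'no'), "\n";

# Alternative: leave arrays zero-based and subtract the offset yourself.
my @months = qw(Jan Feb Mar);
my $month_number = 2;                     # hypothetical one-based input
my $name = $months[ $month_number - 1 ];
print "$name\n";                          # prints "Feb"
```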
See "Use of unassigned code point or non-standalone grapheme for a delimiter." in perldeprecation.
Some formerly deprecated uses of an unescaped left brace "{" in regular expression patterns are now illegal
But to avoid breaking code unnecessarily, most instances that issued a deprecation warning remain legal and now have a non-deprecation warning raised. See "Unescaped left braces in regular expressions" in perldeprecation.
Calling sysread(), syswrite(), send() or recv() on a :utf8 handle, whether applied explicitly or implicitly, is now fatal. This was deprecated in perl 5.24.
There were two problems with calling these functions on :utf8 handles:
All four functions only paid attention to the :utf8 flag. Other layers were completely ignored, so a handle with an :encoding(UTF-16LE) layer would be treated as UTF-8. Other layers, such as compression, are completely ignored with or without the :utf8 flag.
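One way to adapt code affected by this change is to read raw bytes and decode them explicitly. The sketch below assumes Perl 5.30 or later and uses a scratch file; the file contents and variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

# Write two UTF-8 encoded characters (four bytes) to a scratch file.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "\xC3\xA9\xC3\xA8";
close $out or die "close: $!";

# sysread() on a handle carrying the :utf8 flag now dies.
open my $layered, '<:utf8', $path or die "open: $!";
my $ok = eval { sysread($layered, my $buf, 4); 1 };
print "sysread on :utf8 handle: ", ($ok ? "ok" : "died"), "\n";
close $layered;

# Read the raw bytes instead, then decode them in a separate step.
open my $raw, '<:raw', $path or die "open: $!";
my $n = sysread($raw, my $bytes, 4);
die "sysread: $!" unless defined $n;
close $raw;

my $chars = decode('UTF-8', $bytes);
printf "%d bytes, %d characters\n", $n, length $chars;
```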
Declarations such as my $x if 0 are no longer permitted.
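Code that relied on this construct to keep a value between calls can usually be rewritten with a state variable. A minimal sketch, assuming Perl 5.30 or later (the sub name next_id is hypothetical):

```perl
use v5.30;     # also enables the "state" feature
use warnings;

# Formerly written with:  my $count if 0;
sub next_id {
    state $count = 0;   # initialised once, kept between calls
    return ++$count;
}

my @ids = (next_id(), next_id(), next_id());
say "@ids";    # prints "1 2 3"

# The old form no longer compiles; a string eval shows this safely.
my $still_compiles = eval q{ my $x if 0; 1 } ? 1 : 0;
say "old form compiles: $still_compiles";
```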
These special variables, long deprecated, now throw exceptions when used.
The dump() function, long discouraged, may no longer be used unless it is fully qualified, i.e., CORE::dump().
The File::Glob::glob() function, long deprecated, has been removed and now throws an exception which advises use of File::Glob::bsd_glob() instead.
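A short sketch of the replacement call, using a temporary directory and hypothetical file names:

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Spec;
use File::Temp qw(tempdir);

# Create a scratch directory containing three empty files.
my $dir = tempdir(CLEANUP => 1);
for my $name ('a.txt', 'b.txt', 'notes.md') {
    open my $fh, '>', File::Spec->catfile($dir, $name) or die "open: $!";
    close $fh;
}

# bsd_glob() takes a single pattern and returns the matching paths.
my @matches = bsd_glob(File::Spec->catfile($dir, '*.txt'));
my @names = sort map { (File::Spec->splitpath($_))[2] } @matches;
print "@names\n";   # prints "a.txt b.txt"
```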
pack() no longer can return malformed UTF-8
It croaks if it would otherwise return a UTF-8 string that contains malformed UTF-8. This protects against potential security threats. This is considered a bug fix as well. [GH #16035].
There are several sets of digits in the Common script. [0-9] is the most familiar. But there are also [\x{FF10}-\x{FF19}] (FULLWIDTH DIGIT ZERO - FULLWIDTH DIGIT NINE), and several sets for use in mathematical notation, such as the MATHEMATICAL DOUBLE-STRUCK DIGITs. Any of these sets should be able to appear in script runs of, say, Greek. But the design of 5.28 overlooked all but the ASCII digits [0-9], so the design was flawed. This has been fixed, so is both a bug fix and an incompatibility. [GH #16704].
All digits in a run still have to come from the same set of ten digits.
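The digit rules above can be illustrated with the (*script_run:...) construct. This is a sketch under stated assumptions: Perl 5.30 or later, where script runs still warn as experimental, and made-up sample strings.

```perl
use v5.30;
use warnings;
# Script runs are still marked experimental in 5.30.
no warnings 'experimental::script_run';

my $run = qr/\A(*script_run:\w+)\z/;

my %samples = (
    greek_fullwidth => "\x{3B1}\x{3B2}\x{FF11}\x{FF12}",  # alpha beta, fullwidth 1 2
    mixed_digits    => "\x{3B1}1\x{FF12}",                # ASCII 1 with fullwidth 2
    greek_cyrillic  => "\x{3B1}\x{431}",                  # alpha with Cyrillic be
);

for my $name (sort keys %samples) {
    printf "%-16s %s\n", $name,
        ($samples{$name} =~ $run ? 'script run' : 'not a script run');
}
```

Only the first sample is expected to qualify: the second mixes two digit sets, and the third mixes two scripts.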
As JSON::XS 4.0 changed its policy and enabled allow_nonref by default, JSON::PP also enabled allow_nonref by default.
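A brief sketch of the new default, assuming JSON::PP 4.0 or later (as bundled with Perl 5.30). Code that depends on the old behavior can switch it back with allow_nonref(0).

```perl
use strict;
use warnings;
use JSON::PP;

# Non-reference values (strings, numbers, ...) are accepted by default.
my $json   = JSON::PP->new;
my $text   = $json->encode("hello");
my $number = $json->decode('42');
print "$text $number\n";

# Restoring the old, stricter policy makes the same input an error.
my $strict = JSON::PP->new->allow_nonref(0);
my $ok = eval { $strict->decode('42'); 1 };
print "strict decoder accepted a bare number: ", ($ok ? 'yes' : 'no'), "\n";
```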
This deprecation was scheduled to become fatal in 5.30, but has been delayed to 5.32 due to problems that showed up with some CPAN modules. For details of what's affected, see perldeprecation.
ord("\x7fff") now requires 12% fewer instructions than before. The performance of checking that a sequence of bytes is valid UTF-8 is similarly improved, again by using a DFA.
IV to UV conversions. [GH #16761].
qr/[^a]/ is significantly sped up, where a is any ASCII character. Other classes can get this speed up, but which ones is complicated and depends on the underlying bit patterns of those characters, so differs between ASCII and EBCDIC platforms, but all case pairs, like qr/[Gg]/ are included, as is [^01].
USE_THREAD_SAFE_LOCALE.
Data::Dumper now avoids leaking when croaking.
OUTLIST parameters are no longer incorrectly included in the automatically generated function prototype. [GH #16746].
$File::Find::dont_use_nlink now defaults to 1 on all platforms. [GH #16759].
Variables $Is_Win32 and $Is_VMS are being initialized.
Silence Cwd warning on Android builds if targetsh is not defined.
Adds support for IO::Uncompress::Zstd and IO::Uncompress::UnLzip.
The BinModeIn and BinModeOut options are now no-ops. ALL files will be read/written in binmode.
JSON::PP, like JSON::XS 4.0, enables allow_nonref by default.
bnok() now supports the full Kronenburg extension. [cpan #95628].
Changes to B::Op_private and Config
Properly clean up temporary directories after testing.
Debugging threaded code no longer deadlocks in DB::sub nor DB::lsub.
Warnings enabled by setting the WARN_ON_ERR flag in $PerlIO::encoding::fallback are now only produced if warnings are enabled with use warnings "utf8"; or setting $^W.
Storable no longer probes for recursion limits at build time. [GH #16780] and others.
Metasploit exploit code was included to test for CVE-2015-1592 detection; this caused anti-virus detections on at least one AV suite. The exploit code has been removed and replaced with a simple functional test. [GH #16778]
Added support for extra tracing of locking; this requires a -DDEBUGGING build and extra compilation flags.
vars.pm no longer disables non-vars strict when checking if strict vars is enabled. [GH #15851].
The following modules will be removed from the core distribution in a future release, and will at that time need to be installed from CPAN. Distributions on CPAN which require these modules will need to list them as prerequisites.
The core versions of these modules will now issue "deprecated"-category warnings to alert you to this fact. To silence these deprecation warnings, install the modules in question from CPAN.
Note that these are (with rare exceptions) fine modules that you are encouraged to continue to use. Their disinclusion from core primarily hinges on their necessity to bootstrapping a fully functional, CPAN-capable Perl installation, not usually on concerns over their design.
B::Debug.
Locale::Codes [GH #16660].
We have attempted to update the documentation to reflect the changes listed in this document. If you find any we have missed, send email to perlbug@perl.org.
AvFILL() was wrongly listed as deprecated. This has been corrected. [GH #16586]
tr when the delimiter is an apostrophe has been clarified. In particular, hyphens aren't special, and \x{} isn't interpolated. [GH #15853]
reset EXPR.
ref(qr/xx/) returns Regexp rather than REGEXP and why. [GH #16801].
The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see perldiag.
<-- HERE in m/%s/" has been changed to the non-deprecation warning "Unescaped left brace in regex is passed through in regex; marked by <-- HERE in m/%s/".
\o{} without anything between the braces now yields the fatal error message "Empty \o{}". Previously it was "Number with no digits". This means the same wording is used for this kind of error as with similar constructs such as \p{}.
use re 'strict', specifying \x{} without anything between the braces now yields the fatal error message "Empty \x{}". Previously it was "Number with no digits". This means the same wording is used for this kind of error as with similar constructs such as \p{}. It is legal, though not wise, to have an empty \x outside of re 'strict'; it silently generates a NUL character.
Attempts to push, pop, etc. on a hash or glob now produce this message rather than complaining that they no longer work on scalars. [GH #15774].
The file and line number is now reported for this error. [GH #16697]
-Dr (or use re 'Debug') the compiled regex engine program is displayed. It used to use two different spellings for infinity, INFINITY, and INFTY. It now uses the latter exclusively, as that spelling has been around the longest.
PROTOTYPES: ENABLE) would include OUTLIST parameters, but these aren't arguments to the perl function. This has been rectified. [GH #16746].
-Accflags='-DUSE_THREAD_SAFE_LOCALE' option to Configure.
separate error for push, etc. on hash/glob.
Add test for goto &sub in overload leaking.
An obscure problem in pack() when compiling with HP C-ANSI-C has been fixed by disabling optimizations in pp_pack.c.
Perl's build and testing process on Mac OS X for -Duseshrplib builds is now compatible with Mac OS X System Integrity Protection (SIP).
SIP prevents binaries in /bin (and a few other places) being passed the DYLD_LIBRARY_PATH environment variable. For our purposes this prevents DYLD_LIBRARY_PATH from being passed to the shell, which prevents that variable being passed to the testing or build process, so running perl couldn't find libperl.dylib.
To work around that, the initial build of the perl executable expects to find libperl.dylib in the build directory, and the library path is then adjusted during installation to point to the installed library.
Some support for Minix3 has been re-added.
Cygwin doesn't make cuserid visible.
C99 math functions are now available.
USE_CPLUSPLUS build option which has long been available in win32/Makefile (for nmake) and win32/makefile.mk (for dmake) is now also available in win32/GNUmakefile (for gmake).
CCTYPE since there is no obvious choice of which modern version to default to instead. Failure to specify CCTYPE will result in an error being output and the build will stop.
(The dmake and gmake makefiles will automatically detect which compiler is being used, so do not require CCTYPE to be set. This feature has not yet been added to the nmake makefile.)
sleep() with warnings enabled for a USE_IMP_SYS build no longer warns about the sleep timeout being too large. [GH #16631].
$! if the protocol, address family and socket type combination is not found. [GH #16849].
"my_strtod" in perlapi or its synonym, Strtod(), is now available with the same signature as the libc strtod(). It provides strtod() equivalent behavior on all platforms, using the best available precision, depending on platform capabilities and Configure options, while handling locale-related issues, such as if the radix character should be a dot or comma.
newSVsv_nomg() to copy a SV without processing get magic on the source. [GH #16461].
PTRDIFF_T_MAX bytes. Much code (including C optimizers) assumes that all data structures will not be larger than this, so this catches such attempts before overflow happens.
EXACT_ONLY8, and EXACTFU_ONLY8. They're equivalent to EXACT and EXACTFU, except that they contain a code point which requires UTF-8 to represent/match. Hence, if the target string isn't UTF-8, we know it can't possibly match, without needing to try.
print_bytes_for_locale() is now defined if DEBUGGING is defined. Previously, it was not defined unless LC_COLLATE was defined on the platform.
-DPERL_MEM_LOG and -DNO_LOCALE have been fixed.
index() optimization when comparing to -1 (or indirectly, e.g. >= 0). When this optimization was triggered inside a when clause it caused a warning ("Argument %s isn't numeric in smart match"). This has now been fixed. [GH #16626]
pack "u", "invalid uuencoding" now properly NUL terminates the zero-length SV produced. [GH #16343].
-Dm. [GH #16653].
$^X, Perl failed to fall back to the generic technique when the platform-specific one fails (for example, a Linux system with /proc not mounted). This was a regression in Perl 5.28.0. [GH #16715].
binmode($fh); or binmode($fh, ':raw'); now properly removes the :utf8 flag from the default :crlf I/O layer on Win32. [GH #16730].
\(@a[3,5,7]) = \(....);
was being interpreted as:
local \(@a[3,5,7]) = \(....);
sort SUBNAME within an eval EXPR when EXPR was UTF-8 upgraded could panic if the SUBNAME was non-ASCII. [GH #16979].
errno on success so that the modification isn't visible to the perl user, since realloc() is called implicitly by the interpreter. This modification is permitted by the C standard, but has only been observed on FreeBSD 13.0-CURRENT. [GH #16907].
getcwd as Internals::getcwd() if available. This is intended for use by Cwd.pm during bootstrapping and may be removed or changed without notice. This fixes some bootstrapping issues while building perl in a directory where some ancestor directory isn't readable. [GH #16903].
pack() no longer can return malformed UTF-8. It croaks if it would otherwise return a UTF-8 string that contains malformed UTF-8. This protects against potential security threats. [GH #16035].
$^R. This could result in length($^R) returning an incorrect value.
This can prevent stack overflow when processing extremely deep op trees.
\p{} properties (see "User-Defined Character Properties" in perlunicode) has been rewritten to be in C (instead of Perl). This speeds things up, but in the process several inconsistencies and bug fixes are made.
An EXACTFish regnode has a finite length it can hold for the string being matched. If that length is exceeded, a second node is used for the next segment of the string, for as many regnodes as are needed. Care has to be taken where to break the string, in order to deal with multi-character folds in Unicode correctly. If we want to break a string at a place which could potentially be in the middle of a multi-character fold, we back off one (or more) characters, leaving a shorter EXACTFish regnode. This backing off mechanism contained an off-by-one error. [GH #16806].
eof call with no previous file handle now returns true. [GH #16786]
$?) is zero, perl will now treat the in-place edit as successful, replacing the input file with any output produced.
This allows code like:
perl -i -ne 'print "Foo"; last'
to replace the input file, while code like:
perl -i -ne 'print "Foo"; die'
will not. Partly resolves [GH #16748].
close(STDIN); open(CHILD, "|wc -l")'
because the child's stdin would be closed on exec. This has now been fixed.
-DNO_LOCALE_NUMERIC and -DNO_LOCALE_COLLATE. [GH #16771].
/di nodes ending or beginning in s are now EXACTF. We do not want two EXACTFU to be joined together during optimization, and to form a ss, sS, Ss or SS sequence; they are the only multi-character sequences which may match differently under /ui and /di.
Perl 5.30.0 represents approximately 11 months of development since Perl 5.28.0 and contains approximately 620,000 lines of changes across 1,300 files from 58 authors.
Excluding auto-generated files, documentation and release tools, there were approximately 510,000 lines of changes to 750 .pm, .t, .c and .h files.
Perl continues to flourish into its fourth decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.30.0:
Aaron Crane, Abigail, Alberto Simões, Alexandr Savca, Andreas König, Andy Dougherty, Aristotle Pagaltzis, Brian Greenfield, Chad Granum, Chris 'BinGOs' Williams, Craig A. Berry, Dagfinn Ilmari Mannsåker, Dan Book, Dan Dedrick, Daniel Dragan, Dan Kogai, David Cantrell, David Mitchell, Dominic Hargreaves, E. Choroba, Ed J, Eugen Konkov, François Perrad, Graham Knop, Hauke D, H.Merijn Brand, Hugo van der Sanden, Jakub Wilk, James Clarke, James E Keenan, Jerry D. Hedden, Jim Cromie, John SJ Anderson, Karen Etheridge, Karl Williamson, Leon Timmermans, Matthias Bethke, Nicholas Clark, Nicolas R., Niko Tyni, Pali, Petr Písař, Phil Pearl (Lobbes), Richard Leach, Ryan Voots, Sawyer X, Shlomi Fish, Sisyphus, Slaven Rezic, Steve Hay, Sullivan Beck, Tina Müller, Tomasz Konojacki, Tom Wyant, Tony Cook, Unicode Consortium, Yves Orton, Zak B. Elep.
The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of most of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Noteworthy in this release were the large number of bug fixes made possible by Sergey Aleynikov's high quality perlbug reports for issues he discovered by fuzzing with AFL.
Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.
For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.
If you find what you think is a bug, you might check the perl bug database at https://rt.perl.org/. There may also be information at http://www.perl.org/, the Perl Home Page.
If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team.
If the bug you are reporting has security implications which make it inappropriate to send to a publicly archived mailing list, then see "SECURITY VULNERABILITY CONTACT INFORMATION" in perlsec for details of how to report the issue.
If you wish to thank the Perl 5 Porters for the work we have done in Perl 5, you can do so by running the perlthanks program:
perlthanks
This will send an email to the Perl 5 Porters list with your show of thanks.
The Changes file for an explanation of how to view exhaustive details on what changed.
The INSTALL file for how to build Perl.
The README file for general stuff.
The Artistic and Copying files for copyright information.