I have managed to determine that perl’s substitution operator no longer considers “\n” to be matched with the “.” regular expression character. Since when is \n not part of “any” character?! WTF??? Actually, it’s even worse than that; having “\n” in a string just seems to break regular expression matching altogether. This makes NO sense and should NOT be happening. I just rebooted too.
I am quite literally in the process of trying to remove “Elvis.” from appearing twice in a string. Don’t ask, it’s not important. What’s important is perl isn’t working how perl has worked for 10 years. Or I’m crazy. Or I’m missing something obvious:
Here’s my little debug code: it prints out $s, then checks whether it has Elvis in it and says what’s up, then checks whether it has TWO Elvises in it and says what’s up.
print "s is originally \"$s\"!<BR>";
if ($s =~ /Elvis/i) { print "\"$s\" HAS ELVIS!!!<BR>\n"; }
if ($s =~ /Elvis.*Elvis/i) { print "\"$s\" HAS TWO ELVISES: /Elvis.*Elvis/ TOO!<BR>\n"; }
# what I was originally trying to do: $s =~ s/(Elvis\..*)Elvis/$1/ig;
print "S IS FINALLY: \"$s\"!!!<BR><BR>";
So basically, you feed it $s, and it tells you if it has elvis, and it tells you if it has 2 elvises.
For various values of $s …
$s = "Elvis Elvis"; # succeeds
$s = "Elvis. Elvis."; # succeeds
$s = "Elvis.\nElvis.\n"; # fails to detect 2 Elvises
$s = "Elvis Elvis.\n"; # succeeds
$s = "Elvis b Elvis.\n"; # succeeds
$s = "Elvis \n Elvis.\n"; # fails - including checking for [.\n]*
$s = "Elvis. asdfklajsdfasdf Elvis.\n"; # succeeds
$s = "Elvis. asdfkl\njsdfasdf Elvis.\n"; # fails
There is no fucking way that the presence of a \n character should break a regex that includes “.*”, or “any string of any length comprised of any character”. The set of “Any characters” includes “\n”!
This is making ZERO sense. This is like… programming in basic, and waking up one day to discover that the GOTO command doesn’t work for odd numbers. This just makes NO sense.
Furthermore, I changed my line of code to this:
if ($s =~ /Elvis[.\n]*Elvis/i) { print "\"$s\" HAS TWO ELVISES: /Elvis.*Elvis/ TOO!<BR>\n"; }
Notice the [.\n]: I specifically made a regular expression saying “any character that is either any character or a \n”. I mean, you can’t get much more explicit than that. And it STILL DIDN’T DETECT THE 2ND ELVIS.
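A note on that [.\n] attempt: inside a bracketed character class the dot is just a literal period, so [.\n]* means “any run of periods and newlines”, not “any character or a newline”. That would explain why "Elvis \n Elvis.\n" still fails: the spaces match neither. Here is a minimal sketch of the difference, using [\s\S] as one common way to spell “really any character”. The sub names are made up for illustration.

```perl
use strict;
use warnings;

# Inside [...] the dot is a literal period, so this only allows
# periods and newlines between the two names.
sub matches_dot_class {
    my ($s) = @_;
    return $s =~ /Elvis[.\n]*Elvis/i ? 1 : 0;
}

# [\s\S] is "whitespace or non-whitespace", i.e. any character,
# newline included.
sub matches_any_class {
    my ($s) = @_;
    return $s =~ /Elvis[\s\S]*Elvis/i ? 1 : 0;
}

for my $s ("Elvis.\nElvis.\n", "Elvis \n Elvis.\n") {
    (my $shown = $s) =~ s/\n/\\n/g;    # make newlines visible in the output
    my $dot = matches_dot_class($s) ? "match" : "no match";
    my $any = matches_any_class($s) ? "match" : "no match";
    print "\"$shown\": [.\\n]* -> $dot, [\\s\\S]* -> $any\n";
}
```

The first string matches both patterns because only a period and a newline sit between the two names; the second matches only the [\s\S] version.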
I’m fucking confused. Guess no pictures will be uploaded today.
This is perl, v5.8.4 built for MSWin32-x86-multi-thread
(with 3 registered patches, see perl -V for more detail)
Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
Platform:
osname=MSWin32, osvers=4.0, archname=MSWin32-x86-multi-thread
uname=”
config_args=’undef’
hint=recommended, useposix=true, d_sigaction=undef
usethreads=undef use5005threads=undef useithreads=define usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc=’cl’, ccflags =’-nologo -Gf -W3 -MD -Zi -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DNO_HASH_SEED -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX’,
optimize=’-MD -Zi -DNDEBUG -O1′,
cppflags=’-DWIN32′
ccversion=”, gccversion=”, gccosandvers=”
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10
ivtype=’long’, ivsize=4, nvtype=’double’, nvsize=8, Off_t=’__int64′, lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld=’link’, ldflags =’-nologo -nodefaultlib -debug -opt:ref,icf -libpath:”c:\Perl\lib\CORE” -machine:x86′
libpth=C:\PROGRA~1\MICROS~3\VC98\lib
libs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib wsock32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib
perllibs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib wsock32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib
libc=msvcrt.lib, so=dll, useshrplib=yes, libperl=perl58.lib
gnulibc_version=’undef’
Dynamic Linking:
dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=’ ‘
cccdlflags=’ ‘, lddlflags=’-dll -nologo -nodefaultlib -debug -opt:ref,icf -libpath:”c:\Perl\lib\CORE” -machine:x86′
Characteristics of this binary (from libperl):
Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT PERL_IMPLICIT_SYS
Locally applied patches:
ActivePerl Build 810
22751 Update to Test.pm 1.25
21540 Fix backward-compatibility issues in if.pm
Built under MSWin32
Compiled at Jun 1 2004 11:52:21
%ENV:
PERL=”perl”
PERL4=”c:\util\perl4.exe”
@INC:
C:/perl/lib
C:/perl/site/lib
.
December 1, 2007 at 4:02 PM
I just refuse to believe that all of a sudden, I don’t know how to do regular expressions in Perl. Especially since I’ve been coding in it this past week! WTF!
December 1, 2007 at 4:08 PM
http://perldoc.perl.org/perlre.html
We have to use “/s” to have a regex match past a newline character?! Since when?!?!?!
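For anyone landing here with the same problem, this is roughly what the /s fix looks like. By default “.” matches any character except "\n"; with the /s modifier it matches "\n" as well. The sketch below reruns the failing strings from the post with and without /s. The drop_second_elvis sub is a hypothetical variation on the original substitution (non-greedy, and it removes the trailing period too), not the exact code from the post.

```perl
use strict;
use warnings;

# With /s, "." also matches "\n", so the two names can sit on different lines.
sub has_two_elvises {
    my ($s) = @_;
    return $s =~ /Elvis.*Elvis/is ? 1 : 0;
}

# Hypothetical helper: keep the first "Elvis." and drop the next one.
# .*? is non-greedy so it stops at the nearest following "Elvis.".
sub drop_second_elvis {
    my ($s) = @_;
    $s =~ s/(Elvis\..*?)Elvis\./$1/is;
    return $s;
}

for my $s ("Elvis.\nElvis.\n", "Elvis \n Elvis.\n", "Elvis. asdfkl\njsdfasdf Elvis.\n") {
    (my $shown = $s) =~ s/\n/\\n/g;    # make newlines visible in the output
    my $without = $s =~ /Elvis.*Elvis/i ? "match" : "no match";
    my $with    = has_two_elvises($s)   ? "match" : "no match";
    print "\"$shown\": without /s -> $without, with /s -> $with\n";
}
```

All three strings report “no match” without /s and “match” with it. Note that /s only changes what “.” matches; it is separate from /m, which changes where ^ and $ are allowed to match.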
My global picture caption substitutions have been working pretty damn well despite newlines in them all this time!
I’m not sure how I haven’t run into the situation of needing “/s” in 10 yrs of Perl programming, considering most of it CENTERS around regular expressions.
I really am flabbergasted at how I existed NOT knowing this. WTF?
December 1, 2007 at 4:08 PM
CONFUSION SUCKS!