regex - How can I remove all spaces between two XML tags?


I have a Perl script that I want to amend to remove the spaces between two XML tags.

Example XML:

<tag> <tag1><tag2>abc 123 def 456 ... </tag2></tag1><tag1><tag2>xyz 987 ... </tag> 

I'd like to remove all occurrences of spaces between the tag2 tags. I tried the following:

$vmodstrg =~ s/(<tag2>(.*?)<\/tag2>)/<tag2>zzzzzz<\/tag2>/g; 

But that replaces the whole match with zzzzzz. How can I tell Perl to remove only the spaces inside each matched occurrence of tag2?

Regular expressions are a bad tool for this job, because parsing XML requires recursion. You can do that with newer versions of regex, but at best it leads to complicated, hard-to-read regular expressions, and ones with edge cases where they'll break.

See: Why it's not possible to use regex to parse HTML/XML: a formal explanation in layman's terms
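For completeness, here is roughly what the regex route would look like: a minimal sketch, not a recommendation. It assumes every tag2 is written exactly as <tag2>...</tag2> with no attributes, no nesting, and no child elements or CDATA inside. The helper name strip_tag2_spaces is made up for illustration. The /e modifier evaluates the replacement as Perl code, and the captures are copied first because the inner substitution would otherwise reset $1 and $3.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): remove all whitespace
# between each <tag2> and its closing </tag2>, leaving everything else alone.
sub strip_tag2_spaces {
    my ($xml) = @_;

    # /g: every occurrence; /s: let . match newlines; /e: run the
    # replacement as code and substitute its final value.
    $xml =~ s{(<tag2>)(.*?)(</tag2>)}{
        # Copy the captures before running another regex, which would
        # otherwise overwrite the capture variables.
        my ( $open, $text, $close ) = ( $1, $2, $3 );
        $text =~ s/\s+//g;
        "$open$text$close";
    }gse;

    return $xml;
}

my $vmodstrg = '<tag1><tag2>abc 123 def 456</tag2></tag1>';
print strip_tag2_spaces($vmodstrg), "\n";
# prints: <tag1><tag2>abc123def456</tag2></tag1>
```

Note that this also strips whitespace out of any markup that happens to sit inside tag2 (a child element's text or attributes, for example), which is exactly the kind of edge case that makes it fragile.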

So use a parser instead. To remove 'spaces between <tag2> elements':

#!/usr/bin/env perl
use strict;
use warnings;

use XML::Twig;

# Parse data from our DATA filehandle.
# You might want "parsefile('somefilename.xml')" instead.
my $twig = XML::Twig -> parse ( \*DATA );

# Iterate over every text node directly below a "tag2", anywhere in the document.
foreach my $tag ( $twig -> get_xpath ('//tag2/#text') ) {
    # Modify the text: strip all whitespace (s///r needs Perl 5.14 or later).
    $tag -> set_text( $tag -> text =~ s/\s+//gr );
}

# Set output options.
$twig -> set_pretty_print('indented_a');

# Print to STDOUT. You might want:
# print {$output_fh} $twig -> sprint;
$twig -> print;

__DATA__
<root>
   <tag2>words with spaces</tag2>
   <tag2>
       <child>wordswordswords more words        </child>
   </tag2>
   <tag1>some more words spaces</tag1>
   <tag2>something here
       <another_child att="fish" />
   </tag2>
</root>

This outputs:

<root>
  <tag2>wordswithspaces</tag2>
  <tag2>
    <child>wordswordswords more words        </child>
  </tag2>
  <tag1>some more words spaces</tag1>
  <tag2>somethinghere<another_child att="fish" /></tag2>
</root>

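So you can see that it's correctly modifying only the text directly inside <tag2> elements, and leaving everything else untouched, including the text of the nested <child>. And for bonus points, it's at least as clear what it's doing as the equivalent regex would be!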

