regex - Insertion of characters in strings in R
I would like to insert "&" between letters (upper-case and lower-case), but not before or after the letters, replace each lower-case letter x with tt$X==0 and each upper-case letter X with tt$X==1, replace each + with )|(, and put an opening bracket and a closing bracket around the entire string, so that the expression can be evaluated in R. For example, I have the string

    st <- "AbC + de + FGHIJ"

The result should look like this:

    "(tt$A==1 & tt$B==0 & tt$C==1) | (tt$D==0 & tt$E==0) | (tt$F==1 & tt$G==1 & tt$H==1 & tt$I==1 & tt$J==1)"

Could I do this with the gsub() function?
A bunch of regexps can be elegant, but it is hard to debug. The above regexp solution fails if there's not the exact spacing between the elements:

    > tt("aBc+b")
    [1] "(tt$A==0 & tt$B==1 & tt$C==0+tt$B==0)"
    > tt("aBc + b")
    [1] "(tt$A==0 & tt$B==1 & tt$C==0) | (tt$B==0)"

Sometimes you have to split things into bits and process them. Here's my solution:

    # One letter -> "tt$X==1" (upper-case) or "tt$X==0" (lower-case)
    dochar = Vectorize(
      function(c){
        sprintf("tt$%s==%s", toupper(c), ifelse(c %in% LETTERS, "1", "0"))
      }
    )

    # One word -> its letters joined with " & " and wrapped in brackets
    doword = Vectorize(function(w){
      cs = strsplit(w, "")[[1]]
      paste0("(", paste(dochar(cs), collapse=" & "), ")")
    })

    # Whole string -> split on "+", drop spaces, join the words with " | "
    processstring = function(st){
      parts = strsplit(st, "\\+")[[1]]
      parts = gsub(" ", "", parts)
      paste0(doword(parts), collapse=" | ")
    }

There are many ways to make this better, but it has the benefit of being a bit easier to debug (you can test the parts) and it looks less like line noise :)
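As a rough sketch of what "testing the parts" might look like: the checks below are only an illustration, not part of the original answer. The helpers are repeated so the block runs on its own, and unname() is used only to drop the names that Vectorize attaches to its result.

```r
# Helpers repeated from the answer so this block is self-contained.
dochar <- Vectorize(function(c) {
  sprintf("tt$%s==%s", toupper(c), ifelse(c %in% LETTERS, "1", "0"))
})
doword <- Vectorize(function(w) {
  cs <- strsplit(w, "")[[1]]
  paste0("(", paste(dochar(cs), collapse = " & "), ")")
})

# Check the single-letter step first: upper-case gives 1, lower-case gives 0.
stopifnot(identical(unname(dochar("A")), "tt$A==1"))
stopifnot(identical(unname(dochar("b")), "tt$B==0"))

# Then check the word step, which builds on the letter step.
stopifnot(identical(unname(doword("aB")), "(tt$A==0 & tt$B==1)"))
```

If one of these checks fails, you know which layer to look at before touching the string splitting.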
For the sample string given, it returns the same as the tt function, which is a function wrapper of the regexp solution:

    > tt(st)==processstring(st)
    [1] TRUE

But it handles spacing:

    > processstring("abc + def") == processstring("abc+def")
    [1] TRUE

It's a good idea to write code that is a bit flexible in the inputs it accepts. You might notice that the tt part of the output elements appears only once, so if you want the output to be foo$A instead of tt$A there's only one change needed. The regexp solution has it in three places (or maybe four if I've missed one!).
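To illustrate the single-change point, here is a minimal sketch in which the data frame name becomes an argument. The function name make_expr, the argument df_name, and the data frame foo are all made up for this example and are not part of the original answer. It also shows one way the resulting string might be evaluated, using eval(parse(text = ...)).

```r
# Hypothetical variant of processstring: `df_name` is an added argument.
make_expr <- function(st, df_name = "tt") {
  # One letter -> "<df_name>$X==1" (upper-case) or "<df_name>$X==0" (lower-case)
  dochar <- function(ch) {
    paste0(df_name, "$", toupper(ch), "==", ifelse(ch %in% LETTERS, "1", "0"))
  }
  # One word -> its letters joined with " & " and wrapped in brackets
  doword <- function(w) {
    cs <- strsplit(w, "")[[1]]
    paste0("(", paste(dochar(cs), collapse = " & "), ")")
  }
  parts <- strsplit(st, "\\+")[[1]]  # split on the literal "+"
  parts <- gsub(" ", "", parts)      # tolerate any spacing around "+"
  paste0(vapply(parts, doword, character(1), USE.NAMES = FALSE),
         collapse = " | ")
}

# Made-up data frame, purely for illustration.
foo <- data.frame(A = c(1, 0), B = c(0, 1), C = c(1, 1))

expr <- make_expr("AbC + c", df_name = "foo")
expr
# [1] "(foo$A==1 & foo$B==0 & foo$C==1) | (foo$C==0)"

result <- eval(parse(text = expr))
result
# [1]  TRUE FALSE
```

The name is written in exactly one place (inside dochar), which is the property the answer is pointing at.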