Extracting NLP part-of-speech labels of customers' reviews in R
I have the following data frame, which contains reviews that customers have left on a restaurant website:
    id <- c(1, 2, 3, 4, 5, 6)
    review <- c("the food was very delicious and hearty - perfect to warm up during a freezing winters day",
                "excellent service as usual",
                "love this place!",
                "service and quality of food first class",
                "customer services was exceptional by all staff",
                "excellent services")
    df <- data.frame(id, review)

Now I am looking for a way (preferably without using a for loop) to find the part-of-speech labels in each customer's review in R.
This is a pretty straightforward adaptation of the example on the Maxent_POS_Tag_Annotator help page.
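    # Keep the reviews as character strings rather than factors
    df <- data.frame(id, review, stringsAsFactors = FALSE)

    library(NLP)
    library(openNLP)

    review.pos <-
        sapply(df$review, function(ii) {
            # Treat the whole review as a single sentence
            a2 <- Annotation(1L, "sentence", 1L, nchar(ii))
            # Add word tokens, then part-of-speech tags
            a2 <- annotate(ii, Maxent_Word_Token_Annotator(), a2)
            a3 <- annotate(ii, Maxent_POS_Tag_Annotator(), a2)
            # Keep only the word annotations and pull out their POS feature
            a3w <- subset(a3, type == "word")
            tags <- sapply(a3w$features, `[[`, "POS")
            sprintf("%s/%s", as.String(ii)[a3w], tags)
        })

This results in the following output: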
    #[[1]]
    # [1] "the/DT"       "food/NN"      "was/VBD"      "very/RB"      "delicious/JJ"
    # [6] "and/CC"       "hearty/NN"    "-/:"          "perfect/JJ"   "to/TO"
    #[11] "warm/VB"      "up/RP"        "during/IN"    "a/DT"         "freezing/JJ"
    #[16] "winters/NNS"  "day/NN"
    #
    #[[2]]
    #[1] "excellent/JJ" "service/NN"   "as/IN"        "usual/JJ"
    #
    #[[3]]
    #[1] "love/VB"  "this/DT"  "place/NN" "!/."
    #
    #[[4]]
    #[1] "service/NNP" "and/CC"      "quality/NN"  "of/IN"       "food/NN"
    #[6] "first/JJ"    "class/NN"
    #
    #[[5]]
    #[1] "customer/NN"    "services/NNS"   "was/VBD"        "exceptional/JJ"
    #[5] "by/IN"          "all/DT"         "staff/NN"
    #
    #[[6]]
    #[1] "excellent/JJ" "services/NNS"

It should be relatively straightforward to adapt this to whatever format you want.
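As one possible adaptation, the "word/TAG" strings can be split into a long data frame with one row per token. This is only a sketch in base R, so it runs without openNLP: the sample `review.pos` is copied from three of the results above (with the tags in their usual upper-case Penn Treebank form), and `ids`, `to_rows` and `pos_long` are names made up for illustration.

```r
# Sample shaped like review.pos, copied from reviews 2, 3 and 6 above
review.pos <- list(
  c("excellent/JJ", "service/NN", "as/IN", "usual/JJ"),
  c("love/VB", "this/DT", "place/NN", "!/."),
  c("excellent/JJ", "services/NNS")
)
ids <- c(2, 3, 6)

# Turn one review's "word/TAG" strings into rows of a data frame.
# Splitting on the LAST "/" keeps tokens that themselves contain "/" intact.
to_rows <- function(id, tagged) {
  data.frame(id   = id,
             word = sub("/[^/]*$", "", tagged),
             tag  = sub("^.*/", "", tagged),
             stringsAsFactors = FALSE)
}

# Map() pairs each id with its tagged tokens; rbind stacks the pieces
pos_long <- do.call(rbind, Map(to_rows, ids, review.pos))
rownames(pos_long) <- NULL

print(pos_long)
```

With the real data you would pass `df$id` and `review.pos` to `Map()` instead of the samples. From the long format, something like `table(pos_long$tag)` gives tag counts across all reviews.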