You've got a large array argument in this function: func stringInArray(a string, list [214]string) bool { for _, b := range list { if b == a { return true } } return false } The array of stopwords gets copied each time you call this function, because Go passes arrays by value. Mostly in Go, you should use...
I agree a dictionary would be a more natural solution to this problem, but if you need your pos_tags in order, a more explicit solution would be: for word, pos in pos_names: for i, (tagged_word, tagged_pos) in enumerate(pos_tags): if word == tagged_word: pos_tags[i] = (word, pos) (A dictionary would probably be...
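To make the comparison concrete, here is a minimal sketch of both approaches side by side. The sample data and the function names retag_with_loop and retag_with_dict are made up for illustration; the original question's data is not shown here.

```python
# Hypothetical sample data; the original question's lists are not shown.
pos_tags = [("increasing", "VBG"), ("of", "IN"), ("mutation", "NN")]
pos_names = [("of", "PREP"), ("mutation", "NOUN")]


def retag_with_loop(pos_tags, pos_names):
    """Explicit nested-loop version, mirroring the answer above."""
    result = list(pos_tags)  # copy so the caller's list is left untouched
    for word, pos in pos_names:
        for i, (tagged_word, _tagged_pos) in enumerate(result):
            if word == tagged_word:
                result[i] = (word, pos)
    return result


def retag_with_dict(pos_tags, pos_names):
    """Dictionary lookup version; the order of pos_tags is still preserved."""
    replacements = dict(pos_names)
    # Fall back to the existing tag when the word has no replacement.
    return [(word, replacements.get(word, pos)) for word, pos in pos_tags]


print(retag_with_loop(pos_tags, pos_names))
print(retag_with_dict(pos_tags, pos_names))
```

Both versions keep the tokens in their original order. The dictionary version does one pass over pos_tags instead of one pass per entry in pos_names, which may matter for long inputs.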
python,nlp,tuples,part-of-speech
I'm not really sure if this is what you are looking for because I don't know if the order of the patterns is always the same in rule1, rule2 and rule3, but try this: def function(): sent_pos = [('increasing', 'VBG'), ('of', 'IN'), ('mutation', 'NN')] rule1 = [('', 'VBG'), ('', 'IN'),...
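Since the answer above is cut off, the following is only a rough sketch of one way such a check could look, not the original code. It assumes a rule is a list of (word, tag) pairs of the same length as the sentence, and that an empty word means "any word". The helper matches_rule and both example rules are hypothetical.

```python
def matches_rule(sent_pos, rule):
    """Return True if every (word, tag) pair in sent_pos fits the rule.

    Assumption: an empty string in the rule's word slot is a wildcard.
    """
    if len(sent_pos) != len(rule):
        return False
    for (word, tag), (rule_word, rule_tag) in zip(sent_pos, rule):
        if tag != rule_tag:
            return False
        if rule_word and rule_word != word:
            return False
    return True


sent_pos = [("increasing", "VBG"), ("of", "IN"), ("mutation", "NN")]

# Hypothetical rules for illustration only.
example_rule = [("", "VBG"), ("", "IN"), ("", "NN")]
other_rule = [("", "NN"), ("", "IN"), ("", "NN")]

print(matches_rule(sent_pos, example_rule))
print(matches_rule(sent_pos, other_rule))
```

If the patterns can appear in a different order or at an offset inside a longer sentence, this positional comparison would need to be replaced by a sliding-window search.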
python,nltk,stemming,part-of-speech
Why don't you try it out? Here's an example: >>> from nltk.stem import PorterStemmer >>> from nltk import word_tokenize, pos_tag >>> sent = "This is a messed up sentence from the president's Orama and it's going to be sooo good, you're gonna laugh." This is the outcome of tokenizing. >>>...
This isn't really about CoreNLP, it's about whether you are using the Stanford POS tagger or the Stanford Parser (the PCFG parser) to do the POS tagging. (The PCFG parser usually does POS tagging as part of its parsing algorithm, although it can also use POS tags given from elsewhere.)...
python,string,nlp,nltk,part-of-speech
How about changing print tagged to print [(word, tag) for word, tag in tagged if tag in ('NN', 'VB')] ...
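A self-contained version of that filter might look like the sketch below. The tagged list is made-up data shaped like the output of a POS tagger, and the helper name keep_tags is hypothetical. It is written with Python 3's print function.

```python
# Hypothetical tagger output: a list of (word, tag) pairs.
tagged = [
    ("The", "DT"),
    ("dog", "NN"),
    ("will", "MD"),
    ("bark", "VB"),
    ("loudly", "RB"),
]


def keep_tags(tagged, wanted=("NN", "VB")):
    """Keep only the pairs whose tag is exactly one of the wanted tags."""
    return [(word, tag) for word, tag in tagged if tag in wanted]


print(keep_tags(tagged))
```

Note that the exact comparison skips related Penn Treebank tags such as NNS or VBD. If those should be kept too, the condition could be changed to tag.startswith(("NN", "VB")).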
python,nlp,stanford-nlp,pos-tagger,part-of-speech
We use the tag set of the (Penn/LDC/Brandeis/UC Boulder) Chinese Treebank. See here for details on the tag set: http://www.cis.upenn.edu/~chinese/ This was documented in the parser FAQ, but I'll add it to the tagger FAQ....