When defining the conditional probability, he took a shortcut:
So I took a shortcut: I defined a trivial model that says all known words of edit distance 1 are infinitely more probable than known words of edit distance 2, and infinitely less probable than a known word of edit distance 0. By "known word" I mean a word that we have seen in the language model training data -- a word in the dictionary. We can implement this strategy as follows:
    def known(words):
        return set(w for w in words if w in NWORDS)

    def correct(word):
        candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
        return max(candidates, key=NWORDS.get)
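For context, here is my own toy sketch (not from the article) of how I understand Python's `or` chain to behave: an empty set is falsy, so evaluation falls through to the next alternative until something non-empty turns up.

```python
def first_nonempty(*candidate_sets):
    """Return the first non-empty set, mimicking a chain of `or`s."""
    for s in candidate_sets:
        if s:  # empty sets are falsy
            return s
    return None

# `or` returns its first truthy operand:
print(set() or {"spelling"} or {"spellings"})        # -> {'spelling'}
print(first_nonempty(set(), {"spelling"}, {"spellings"}))
```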
I don't see how this code implements his strategy. To me, the last line just returns the word with the highest count/prior, instead of following the priority list in his model.
And also, in defining his word-counting dictionary:
    def train(features):
        model = collections.defaultdict(lambda: 1)
        for f in features:
            model[f] += 1
        return model
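To illustrate what puzzles me, here is a small experiment of my own (the toy word list is made up) comparing the two default factories:

```python
import collections

words = ["the", "the", "cat"]  # hypothetical toy corpus

model1 = collections.defaultdict(lambda: 1)  # counts start at 1
model0 = collections.defaultdict(int)        # counts start at 0
for w in words:
    model1[w] += 1
    model0[w] += 1

print(model1["the"], model0["the"])        # 3 2
print(model1["unseen"], model0["unseen"])  # 1 0 (the lookup inserts the default)
```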
Why didn't he start from 0? I mean, shouldn't the default_factory be (lambda: 0), or simply int?
Can anyone explain? You can find the full article here: http://norvig.com/spell-correct.html