machine-learning,probability,mle,language-model

The likelihood function gives the probability of generating a set of training data under a given set of parameters, and maximizing it yields the parameters under which the training data is most probable. You can create the likelihood function for a subset of the training data, but that wouldn't be represent...
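
To make the idea of maximizing the likelihood concrete, here is a minimal sketch for a unigram model. The toy sentence and the function names `log_likelihood` and `mle_unigram` are my own illustrative assumptions. For a unigram model the maximum-likelihood parameters are the relative word frequencies, so the sketch compares them against a uniform distribution over the same vocabulary.

```python
import math
from collections import Counter


def log_likelihood(tokens, probs):
    """Log-probability of the token sequence under a unigram model."""
    return sum(math.log(probs[t]) for t in tokens)


def mle_unigram(tokens):
    """Maximum-likelihood unigram parameters: relative frequencies."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


# Toy training data (an assumption for illustration only).
tokens = "the cat sat on the mat".split()

mle = mle_unigram(tokens)
# A competing parameter setting: uniform over the observed vocabulary.
uniform = {word: 1 / len(mle) for word in mle}

ll_mle = log_likelihood(tokens, mle)
ll_uniform = log_likelihood(tokens, uniform)
print(f"MLE: {ll_mle:.3f}  uniform: {ll_uniform:.3f}")
```

Any other valid parameter setting, such as the uniform one here, gives the training data a log-likelihood no higher than the relative-frequency estimate.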

The most common data structures in language models are tries and hash tables. For more detail, see Kenneth Heafield's paper on his language model toolkit KenLM, which describes the data structures used by KenLM and related packages.
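
As a rough sketch of the two structures, the snippet below stores bigram counts both in a hash table keyed by word tuples and in a trie whose nodes hold prefix counts. The class names `TrieNode` and `NgramTrie` and the toy sentence are hypothetical, and this is not KenLM's actual memory layout, only the general idea.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}  # word -> TrieNode
        self.count = 0      # how often the prefix ending here was added


class NgramTrie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, ngram):
        """Insert an n-gram, incrementing the count of each of its prefixes."""
        node = self.root
        for word in ngram:
            node = node.children.setdefault(word, TrieNode())
            node.count += 1

    def count(self, ngram):
        """Return the stored count for an n-gram prefix, or 0 if unseen."""
        node = self.root
        for word in ngram:
            node = node.children.get(word)
            if node is None:
                return 0
        return node.count


# Toy data (an assumption for illustration only).
tokens = "the cat sat on the mat".split()
bigrams = list(zip(tokens, tokens[1:]))

# Hash table: one entry per distinct n-gram.
table = Counter(bigrams)

# Trie: n-grams sharing a first word share a node.
trie = NgramTrie()
for bigram in bigrams:
    trie.add(bigram)

print(table[("the", "cat")], trie.count(("the", "cat")), trie.count(("the",)))
```

The hash table gives direct lookup of a full n-gram, while the trie shares common prefixes and so can answer queries about shorter contexts from the same structure.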