

    A statistical language model is a probability distribution over sequences of words, P(w1, ..., wn). Language models are used in many NLP applications, including part-of-speech tagging, parsing, speech recognition, machine translation and information retrieval. Because phrases and sentences can be arbitrarily long, most word sequences never occur in any training corpus, so their probabilities cannot be estimated reliably from counts alone (the data sparseness problem). For this reason, these models are most often approximated by smoothed N-gram models built from unigrams, bigrams and/or trigrams.
    In speech recognition, a language model captures the statistics of how a language is generated and is used to predict the next word in a speech sequence.
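
    As a rough illustration of a smoothed N-gram approximation, the sketch below builds a bigram model with add-one (Laplace) smoothing, which is only one of several possible smoothing schemes. The toy corpus, the boundary markers <s> and </s>, and the function names are all assumptions made for this example and do not come from the article.

```python
from collections import Counter

# Toy corpus (hypothetical data, chosen only for illustration).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat saw the dog",
]

BOS, EOS = "<s>", "</s>"  # assumed sentence-boundary markers


def tokenize(sentence):
    """Split on whitespace and add boundary markers."""
    return [BOS] + sentence.split() + [EOS]


context_counts = Counter()  # how often each word appears as a bigram context
bigram_counts = Counter()   # how often each (previous, current) pair appears
for sentence in corpus:
    tokens = tokenize(sentence)
    context_counts.update(tokens[:-1])
    bigram_counts.update(zip(tokens, tokens[1:]))

# Words that can be predicted: everything except the start marker.
vocab = sorted({w for s in corpus for w in tokenize(s)} - {BOS})
V = len(vocab)


def bigram_prob(prev, word):
    """P(word | prev) with add-one smoothing over the known vocabulary."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + V)


def sentence_prob(sentence):
    """Approximate P(w1, ..., wn) as a product of bigram probabilities."""
    tokens = tokenize(sentence)
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        p *= bigram_prob(prev, word)
    return p


def predict_next(prev):
    """Return the most probable next word after `prev`."""
    return max(vocab, key=lambda w: bigram_prob(prev, w))


if __name__ == "__main__":
    print(sentence_prob("the cat sat on the mat"))
    print(sentence_prob("the dog saw the cat"))  # contains an unseen bigram
    print(predict_next("sat"))
```

    Smoothing gives every bigram over the known vocabulary a non-zero probability, so a sentence containing a word pair never seen in the corpus still receives a small positive score instead of zero.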

    When used in information retrieval, a language model is associated with a document in a collection. With query Q as input, retrieved documents are ranked based on the probability that the document's language model would generate the terms of the query, P(Q|Md).
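
    The ranking by P(Q|Md) can be sketched as follows, assuming a unigram language model per document. The interpolation with collection-wide statistics (Jelinek-Mercer smoothing), the weight LAMBDA, the toy documents and all function names are assumptions introduced for this example; the article does not prescribe a particular smoothing method.

```python
import math
from collections import Counter

# Toy collection (hypothetical documents, for illustration only).
docs = {
    "d1": "language models assign probabilities to word sequences",
    "d2": "speech recognition systems predict the next word",
    "d3": "information retrieval ranks documents for a query",
}

doc_counts = {doc_id: Counter(text.split()) for doc_id, text in docs.items()}

collection_counts = Counter()
for counts in doc_counts.values():
    collection_counts.update(counts)
collection_len = sum(collection_counts.values())

LAMBDA = 0.5  # assumed weight given to the document model


def term_prob(term, doc_id):
    """P(term | Md): document estimate mixed with the collection estimate."""
    counts = doc_counts[doc_id]
    doc_len = sum(counts.values())
    p_doc = counts[term] / doc_len
    p_coll = collection_counts[term] / collection_len
    return LAMBDA * p_doc + (1 - LAMBDA) * p_coll


def query_log_likelihood(query, doc_id):
    """log P(Q | Md), treating query terms as independent draws."""
    score = 0.0
    for term in query.split():
        p = term_prob(term, doc_id)
        if p == 0.0:
            # Term absent from the whole collection: no document can generate it.
            return float("-inf")
        score += math.log(p)
    return score


def rank(query):
    """Return document ids ordered from most to least likely to generate the query."""
    return sorted(docs, key=lambda d: query_log_likelihood(query, d), reverse=True)


if __name__ == "__main__":
    print(rank("next word"))
```

    Log probabilities are summed rather than multiplying raw probabilities, which avoids numerical underflow on longer queries and leaves the ranking unchanged.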


    This article is licensed under the GNU Free Documentation License [copyleft]. It uses material from the Wikipedia article "Language model".