Semantic analysis (machine learning)

In machine learning, semantic analysis of a text corpus is the task of building structures that approximate concepts from a large set of documents. It generally does not involve prior semantic understanding of the documents.

Semantic analysis strategies include:

Metalanguages based on first-order logic, which can analyze the speech of humans.^[1]^: 93-
Symbol grounding, which treats understanding the semantics of a text as linking its symbols to what they refer to: if language is grounded, understanding it is equivalent to recognizing a machine-readable meaning. A computer-based language understanding system has been demonstrated for the restricted domain of spatial analysis.^[2]^: 123
Latent semantic analysis (LSA), a class of techniques where documents are represented as vectors in a term space. A prominent example is probabilistic latent semantic analysis (PLSA).
Latent Dirichlet allocation, which involves attributing document terms to topics.
n-grams and hidden Markov models, which work by representing the term stream as a Markov chain, in which each term depends on a small number of preceding terms.
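
As a minimal sketch of the Markov-chain idea in the last item, assuming a first-order (bigram) chain and a made-up toy corpus, transition probabilities between terms can be estimated from counts of adjacent term pairs. The function name train_bigram_model and the corpus are illustrative assumptions and do not come from the cited sources.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Estimate P(next term | current term) by relative frequency of adjacent pairs."""
    pair_counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        pair_counts[current][following] += 1

    model = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        # Normalize counts so the probabilities for each current term sum to 1.
        model[current] = {term: count / total for term, count in followers.items()}
    return model


# Toy corpus, invented for illustration only.
tokens = "the cat sat on the mat and the cat slept".split()
model = train_bigram_model(tokens)

# Transition probabilities out of the term "the".
print(model["the"])
```

In this toy corpus "the" is followed by "cat" twice and by "mat" once, so the estimated probabilities are 2/3 and 1/3. Practical n-gram models typically add smoothing for unseen pairs, which this sketch omits.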

Stochastic semantic analysis

Stochastic semantic analysis is an approach used in computer science as a semantic component of natural language understanding.

Stochastic models generally define segments of words as the basic semantic units of the semantic model, and in some cases involve a two-layered approach.^[3]

Applications cover a wide range. In machine translation, stochastic semantic analysis has been applied to the translation of spontaneous conversational speech between different languages.^[4] In spoken language understanding, spoken sentences often do not follow the grammar of a language and contain self-corrections, repetitions, and other irregularities. Stochastic semantic analysis has been suggested as a natural fit for this setting, because it is robust to the noise introduced by the spontaneous nature of spoken language.^[5]

References

^ Nitin Indurkhya; Fred J. Damerau (22 February 2010). Handbook of Natural Language Processing. CRC Press. ISBN 978-1-4200-8593-8.
^ Michael Spranger (15 June 2016). The evolution of grounded spatial language. Language Science Press. ISBN 978-3-946234-14-2.
^ F. Pla, et al. (2001). Language Understanding Using Two-Level Stochastic Models. Springer Lecture Notes in Computer Science. ISBN 978-3-540-42557-1.
^ W. Minker, M. Gavaldà and A. Waibel (April 1999). Stochastically-based semantic analysis for machine translation. Computer Speech & Language. Volume 13, Issue 2, pages 177-194.
^ R. De Mori et al. (May 2008). Spoken language understanding. IEEE Signal Processing Magazine. Volume 25, Issue 3, pages 50-58. ISSN 1053-5888.
