Semantic analysis (machine learning)
In machine learning, semantic analysis of a text corpus is the task of building structures that approximate concepts from a large set of documents. It generally does not involve prior semantic understanding of the documents.
Semantic analysis strategies include:
- Metalanguages based on first-order logic, which can analyze the speech of humans.[1]: 93
- Symbol grounding: understanding the semantics of a text means grounding its language, and grounded language is equivalent to a machine-readable meaning. For the restricted domain of spatial analysis, a computer-based language understanding system has been demonstrated.[2]: 123
- Latent semantic analysis (LSA), a class of techniques where documents are represented as vectors in a term space. A prominent example is probabilistic latent semantic analysis (PLSA).
- Latent Dirichlet allocation, which involves attributing document terms to topics.
- n-grams and hidden Markov models, which work by representing the term stream as a Markov chain, in which each term is derived from preceding terms.
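The latent semantic analysis technique listed above can be sketched in a few lines of NumPy: documents become columns of a term-document count matrix, and a truncated singular value decomposition projects them into a low-dimensional latent space. The toy corpus, the vocabulary construction, and the choice of k = 2 latent dimensions are illustrative assumptions, not part of any standard recipe.

```python
import numpy as np

# Toy corpus: two documents about space, two about baking.
docs = [
    "rocket launch orbit satellite",
    "satellite orbit space rocket",
    "recipe oven bake bread",
    "bread oven recipe flour",
]

# Build a term-document count matrix (terms as rows, documents as columns).
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1

# LSA: a truncated SVD keeps only the k strongest latent dimensions,
# so each document becomes a k-dimensional vector in the latent space.
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one row per document

def cos(a, b):
    """Cosine similarity between two latent document vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Documents sharing a topic end up close together in the latent space.
print(cos(doc_vecs[0], doc_vecs[1]))  # high: both about space
print(cos(doc_vecs[0], doc_vecs[2]))  # low: different topics
```

In practice the counts are usually reweighted (e.g. with tf-idf) before the decomposition, and k is far larger; the SVD step itself is unchanged.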
Stochastic semantic analysis
Stochastic semantic analysis is an approach used in computer science as a semantic component of natural language understanding.
Stochastic models generally use the definition of segments of words as basic semantic units for the semantic models, and in some cases involve a two layered approach.[3]
Example applications are wide-ranging. In machine translation, stochastic semantic analysis has been applied to translating spontaneous conversational speech between languages.[4] In spoken language understanding, spoken sentences often do not follow the grammar of a language and contain self-corrections, repetitions, and other irregularities; stochastic semantics has therefore been suggested as a natural fit for achieving robustness to the noise inherent in spontaneous speech.[5]
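A minimal sketch of the stochastic approach described above: hidden Markov model states stand for semantic segment labels, and Viterbi decoding recovers the most likely label sequence for a word stream. The flight-query domain, the label set, and every probability below are hand-picked illustrative assumptions, not trained values or any published model.

```python
import numpy as np

# Hidden semantic labels for a toy flight-query domain.
states = ["CITY", "DATE", "OTHER"]
vocab = {"boston": 0, "monday": 1, "to": 2, "on": 3}

start = np.array([0.2, 0.2, 0.6])   # P(first state)
trans = np.array([                  # P(next state | current state)
    [0.2, 0.3, 0.5],                # from CITY
    [0.3, 0.2, 0.5],                # from DATE
    [0.4, 0.4, 0.2],                # from OTHER
])
emit = np.array([                   # P(word | state)
    [0.7, 0.1, 0.1, 0.1],           # CITY mostly emits "boston"
    [0.1, 0.7, 0.1, 0.1],           # DATE mostly emits "monday"
    [0.1, 0.1, 0.4, 0.4],           # OTHER emits function words
])

def viterbi(words):
    """Most likely semantic label sequence (log-space Viterbi decoding)."""
    obs = [vocab[w] for w in words]
    T = len(obs)
    delta = np.log(start) + np.log(emit[:, obs[0]])
    back = np.zeros((T, len(states)), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(trans)   # scores[i, j]: i -> j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(emit[:, obs[t]])
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [states[i] for i in reversed(path)]

print(viterbi(["to", "boston", "on", "monday"]))
# → ['OTHER', 'CITY', 'OTHER', 'DATE']
```

A two-layered system of the kind cited above would add a second stochastic layer on top, grouping these word-level labels into larger semantic segments; the decoding idea is the same.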
See also
- Explicit semantic analysis
- Information extraction
- Semantic similarity
- Stochastic semantic analysis
- Ontology learning
References
- ^ Nitin Indurkhya; Fred J. Damerau (22 February 2010). Handbook of Natural Language Processing. CRC Press. ISBN 978-1-4200-8593-8.
- ^ Michael Spranger (15 June 2016). The evolution of grounded spatial language. Language Science Press. ISBN 978-3-946234-14-2.
- ^ F. Pla; et al. (2001). Language Understanding Using Two-Level Stochastic Models. Springer Lecture Notes in Computer Science. ISBN 978-3-540-42557-1.
- ^ W. Minker; M. Gavaldà; A. Waibel (April 1999). "Stochastically-based semantic analysis for machine translation". Computer Speech & Language. 13 (2): 177–194.
- ^ R. De Mori; et al. (May 2008). "Spoken language understanding". IEEE Signal Processing Magazine. 25 (3): 50–58. ISSN 1053-5888.