Statistical Learning

Statistical learning techniques construct statistical models of an entity from surface features drawn from a large corpus of examples. These techniques generally operate independently of specific domain knowledge, training instead on a set of features that characterize each input example. In the domain of natural language, for example, statistics of language usage (e.g., word trigram frequencies) are compiled from large collections of input documents and are used to categorize or make predictions about new text. Systems trained through statistical learning have the advantage of not requiring human-engineered domain modeling. Their strong dependence on the input corpus, however, limits their applicability to new domains: each domain of interest requires access to a large corpus of examples and a retraining step. Statistical techniques thus tend to achieve high precision within a domain at the cost of generality across domains.
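The trigram example can be made concrete with a small sketch. The Python code below compiles word trigram counts per category from a toy corpus and then categorizes new text by which category's counts explain it best. The function names (`trigrams`, `train`, `classify`), the two-category corpus, and the use of add-one smoothing are all assumptions made for illustration; they are not taken from any particular system.

```python
from collections import Counter
import math


def trigrams(text):
    """Return the word trigrams in `text` (lowercased, whitespace-split)."""
    words = text.lower().split()
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def train(corpus):
    """Compile trigram counts per label from a {label: [documents]} mapping."""
    return {
        label: Counter(t for doc in docs for t in trigrams(doc))
        for label, docs in corpus.items()
    }


def classify(model, text):
    """Return the label whose trigram counts give `text` the highest score.

    Assumption: add-one smoothing, so unseen trigrams get a small
    non-zero probability instead of zero.
    """
    vocab = set().union(*model.values())
    v = len(vocab) + 1  # one extra slot stands in for any unseen trigram

    def score(counts):
        total = sum(counts.values())
        return sum(
            math.log((counts[t] + 1) / (total + v)) for t in trigrams(text)
        )

    return max(model, key=lambda label: score(model[label]))


# Hypothetical toy corpus; a real system would use large document collections.
corpus = {
    "weather": ["the rain will fall tomorrow", "the rain will stop tonight"],
    "finance": ["the stock price will rise tomorrow", "the stock price fell today"],
}
model = train(corpus)

print(classify(model, "the rain will fall tonight"))
print(classify(model, "the stock price will fall"))
```

No domain rules are written by hand here: the model consists only of counts taken from the corpus. The same property accounts for the limitation described above. Text from a domain absent from `corpus` contains only unseen trigrams, so its categorization carries little meaning until the model is retrained on examples from that domain.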

