Chomsky and the Two Cultures of Statistical Learning (2011)

5 days ago norvig.com

Last updated: 11 hours ago

The discussion centers on Chomsky's critique of purely statistical methods in machine learning, particularly as applied to linguistics, which he faults for emphasizing empirical success over underlying explanatory principles. A core argument of the critique is that engineering success (accurate prediction) is irrelevant to scientific understanding, which requires insight into generative mechanisms.

Counterarguments assert that engineering success is evidence of a scientifically successful model, and that gathering data is a dominant mode of science, not just "butterfly collecting." Large statistical models can be opaque, but insight can still be gained by analyzing their behavior and failures. The essay argues that natural language interpretation is inherently probabilistic, and that complex, trained models capture linguistic facts better than older, categorical, rule-based systems.

The critique further claims that statistical models cannot handle the complexity of language because of limitations such as dependence on short sequences, as in Markov chains. The response is that modern probabilistic models address these limitations. Ultimately, the essay sees the preference for simple, interpretable models over models that accurately reflect the messiness of actual language use as prioritizing abstract mathematical formalism over empirical reality, akin to a Platonic view that ignores observable phenomena.
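
To make the Markov chain limitation concrete, here is a minimal sketch of a first-order (bigram) model in Python. The toy corpus and the function names `train_bigram` and `next_word_probs` are illustrative assumptions, not taken from the essay or the discussion. Because the model conditions only on the single previous word, it cannot see a subject that sits several words back, so it learns the wrong agreement pattern from two perfectly grammatical sentences.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count next-word frequencies conditioned on the single previous word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Relative-frequency estimate of P(next | prev), with no smoothing."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}


# Hypothetical toy corpus: the verb agrees with the subject ("dog"/"dogs"),
# which is four words before the verb and invisible to a bigram model.
corpus = (
    "the dog near the cats barks . "
    "the dogs near the cat bark ."
).split()

model = train_bigram(corpus)

# The model associates the verb with the adjacent noun, not the subject.
probs_after_cats = next_word_probs(model, "cats")
probs_after_cat = next_word_probs(model, "cat")
print(probs_after_cats)
print(probs_after_cat)
```

On this corpus the model puts all of its probability on "barks" after "cats" and on "bark" after "cat", the reverse of what agreement with the true subject requires. Models that condition on longer or structured context are one way probabilistic approaches get past this particular limitation.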