Corpus analysis can be expanded and scaled up by incorporating
computational methods from natural language processing. This
Element shows how text classification and text similarity models
can extend our ability to undertake corpus linguistics across very
large corpora. These computational methods are becoming
increasingly important as corpora grow too large for more
traditional types of linguistic analysis. We draw on five case
studies to show how and why to use computational methods, ranging
from usage-based grammar to authorship analysis to corpus-based
sociolinguistics built on social media data. Each section is
accompanied by an interactive code notebook that shows how to
implement the analysis in Python. A stand-alone Python package is
also available to help readers use these methods with their own
data. Because large-scale analysis introduces new ethical problems,
this Element pairs each new methodology with a discussion of
potential ethical implications.