The most common document formalisation for text classification is
the vector space model founded on the bag of words/phrases
representation. The main advantage of the vector space model is
that it can readily be employed by classification algorithms.
However, the bag of words/phrases representation is suited to
capturing only word/phrase frequency; structural and semantic
information is ignored. It has been established that structural
information plays an important role in classification accuracy [14].
An alternative to the bag of words/phrases representation is a
graph-based representation, which intuitively possesses much more
expressive power. However, this representation introduces an
additional level of complexity in that the calculation of the
similarity between two graphs is significantly more computationally
expensive than between two vectors (see for example [16]). Some
work (see for example [12]) has been done on hybrid representations
to capture both structural elements (using the graph model) and
significant features using the vector model. However, the
computational resources required to process this hybrid model are
still extensive.
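
The loss of structural information in the bag of words representation can be illustrated with a small sketch. This is a toy example only, not the method of any of the cited works: the helper names (bag_of_words, cosine_similarity, adjacency_edges) and the two sample sentences are assumptions made for illustration, and the set of adjacent word pairs is a deliberately simplified stand-in for a graph-based representation.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a text to word -> frequency; word order is discarded."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjacency_edges(text):
    """Hypothetical, simplified graph view: directed edges between adjacent words."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


# Two sentences with the same words but opposite meanings.
doc_a = "the dog bit the man"
doc_b = "the man bit the dog"

bag_a, bag_b = bag_of_words(doc_a), bag_of_words(doc_b)

# The bags are identical, so the vector space model cannot tell the texts apart.
print("same bag of words:", bag_a == bag_b)
print("cosine similarity:", cosine_similarity(bag_a, bag_b))

# The word-adjacency edges differ, so some structural information survives.
print("same edge set:", adjacency_edges(doc_a) == adjacency_edges(doc_b))
```

Comparing the two vectors takes a single pass over the shared words, which is why the vector space model is so convenient for classification algorithms. Comparing general graphs typically requires some form of graph matching, which is far more expensive than the simple set comparison used in this sketch.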