Technical Publication Aggregator Enhancement
One of our partners is an international publishing house. Part of their enterprise is a web-based aggregator that makes available technical publications across a wide range of topics within a number of engineering disciplines. One of the aggregator’s search tools used a taxonomy that the publishing house had developed through a time-consuming, labor-intensive manual process. The success of the aggregator depends on a researcher’s ability to sort through the tremendous number of documents available and retrieve the relatively few that relate to their specific interest. Using large-scale text classification tools to analyze our partner’s data and taxonomy, we were able to discover a significant number of elements (discovered terms) in the natural language of the corpus as a whole. These terms were mapped, by virtue of the latent semantic content of the natural language in which they were found, back to our partner’s existing taxonomy. This improved the search results and made the aggregator more effective in an automated, efficient and demonstrable way.
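As a rough illustration of mapping a discovered term back to a taxonomy entry, the sketch below compares the words that surround each term in the corpus. It is not the tooling used on the project: plain co-occurrence counts with cosine similarity stand in for a real latent semantic method, and the sample documents, the taxonomy terms, and the function names (context_vector, cosine, map_term_to_taxonomy) are all hypothetical.

```python
import math
from collections import Counter


def context_vector(term, documents):
    """Count the words that share a document with `term` (toy stand-in for latent semantics)."""
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        if term in tokens:
            counts.update(t for t in tokens if t != term)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_term_to_taxonomy(term, taxonomy_terms, documents):
    """Return the taxonomy term whose context is most similar to that of `term`, with its score."""
    term_vec = context_vector(term, documents)
    scores = {
        t: cosine(term_vec, context_vector(t, documents)) for t in taxonomy_terms
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical miniature corpus and taxonomy, for illustration only.
documents = [
    "turbine blade fatigue under cyclic thermal load",
    "compressor blade fatigue and crack growth under cyclic load",
    "antenna gain and impedance matching for microwave circuits",
    "waveguide impedance matching in microwave filters",
]
taxonomy_terms = ["fatigue", "microwave"]

for discovered in ["waveguide", "crack"]:
    match, score = map_term_to_taxonomy(discovered, taxonomy_terms, documents)
    print(f"{discovered} -> {match} (similarity {score:.2f})")
```

In this toy corpus, "waveguide" lands under "microwave" and "crack" under "fatigue", because each shares more surrounding vocabulary with that taxonomy entry. A production system would likely add tokenization, stop-word handling, and a dimensionality-reduction step in place of raw counts.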