The Unreasonable Effectiveness of Data
http://www.computer.org/portal/cms_docs_intelligent/intelligent/homepage/2009/x2exp.pdf
Halevy, Alon, Peter Norvig and Fernando Pereira
IEEE Intelligent Systems
2009
Abstract:
Follow the data. Choose a representation that can use unsupervised learning on unlabeled data, which is so much more plentiful than labeled data. Represent all the data with a nonparametric model rather than trying to summarize it with a parametric model.

Of course, we’ll find immense opportunities to create interesting data sets if we can automatically combine data from multiple tables in this collection. This is an area of active research. Another opportunity is to combine data from multiple tables with data from other sources, such as unstructured Web pages or Web search queries.