Document worth reading: “A Survey on Data Collection for Machine Learning: a Big Data – AI Integration Perspective”
Data collection is a major bottleneck in machine learning and an active research topic in multiple communities. There are largely two reasons data collection has recently become a critical issue. First, as machine learning becomes more widely used, we are seeing new applications that do not necessarily have enough labeled data. Second, unlike traditional machine learning, where feature engineering is the bottleneck, deep learning techniques automatically generate features, but instead require large amounts of labeled data. Interestingly, recent research in data collection comes not only from the machine learning, natural language, and computer vision communities, but also from the data management community, due to the importance of handling large amounts of data. In this survey, we perform a comprehensive study of data collection from a data management point of view. Data collection largely consists of data acquisition, data labeling, and improvement of existing data or models. We provide a research landscape of these operations, provide guidelines on which technique to use when, and identify interesting research challenges. The integration of machine learning and data management for data collection is part of a larger trend of Big Data and Artificial Intelligence (AI) integration and opens many opportunities for new research.