Published: Oct 28, 2020

Modern ML models are very complex and require massive amounts of labeled training data to learn from. Snorkel started at the Stanford AI Lab with the realization that manually labeling data is very expensive and often not feasible. Snorkel allows us to label training data programmatically by writing labeling functions: small heuristics that each assign a label to a data point or abstain. Rather than relying on hand-labeled ground truth, Snorkel uses weak supervision techniques to estimate the accuracies and correlations of these labeling functions, and then reweights and combines their output labels, leading to high-quality training labels. The creators of Snorkel have since come out with a commercial platform called Snorkel Flow. While Snorkel itself is no longer actively developed, it's still significant for its ideas on the use of weakly supervised methods to label data.
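To make the idea of labeling functions concrete, here is a minimal sketch in plain Python. It does not use the Snorkel library: the function names, the spam/ham task and the sample texts are all hypothetical, and the unweighted majority vote is a deliberately simplified stand-in for Snorkel's approach, which learns a weight for each labeling function from its estimated accuracy instead of counting every vote equally.

```python
from collections import Counter

# Label values are assumptions for this sketch; ABSTAIN means "no opinion".
ABSTAIN, HAM, SPAM = -1, 0, 1


# Hypothetical labeling functions: each encodes one noisy heuristic.
def lf_contains_link(text):
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_mentions_prize(text):
    return SPAM if "prize" in text.lower() else ABSTAIN


def lf_short_message(text):
    return HAM if len(text.split()) <= 3 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_message]


def apply_lfs(texts):
    """Build a label matrix: one row per text, one column per labeling function."""
    return [[lf(text) for lf in LABELING_FUNCTIONS] for text in texts]


def majority_vote(votes):
    """Combine one row of votes; abstain when nobody votes or the top vote is tied."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


texts = [
    "Claim your prize at http://example.com now",
    "See you soon",
    "Quarterly report attached for your review today",
]

label_matrix = apply_lfs(texts)
labels = [majority_vote(row) for row in label_matrix]
print(labels)  # [1, 0, -1]
```

The third text receives no label because every heuristic abstains. In a weak supervision workflow such points are typically left out of the training set or covered by writing additional labeling functions.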