Three of the best use cases for models and humans to work together on workflows today.
This is the second article in a two-part series. In Part 1, we discussed the limitations and challenges of using machine learning (ML) models in labeling and annotation.
In this post, we’ll cover the best use cases for models and humans to work together on workflows today.
Labeling and annotating data is crucial for training ML models and provides the essential ground truth for accurate predictions. Although humans are typically more accurate, particularly for complex or ambiguous scenarios, ML models are faster and can handle large datasets at scale without requiring significant resources or driving up costs.
That’s why models are a good fit for pre-annotation workflows. In these workflows, client images are first processed through an ML model to generate pre-annotations that help pinpoint which data is most valuable to label and annotate for the client model, saving human labelers time compared to starting from scratch. These techniques generally fall under the umbrella of data curation.
These pre-annotations, however, often have gaps or errors and are usually insufficient on their own to produce the rich training data that significantly enhances a client model’s capabilities. Humans can then validate the predictions and make adjustments, such as removing incorrect bounding boxes or adding missing elements in the case of false negatives.
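To make the division of labor concrete, here is a minimal sketch of what such a workflow might look like in code. Everything in it is an illustrative assumption rather than any particular product’s API: `run_detector` stands in for whatever model generates the pre-annotations, `review_threshold` is an arbitrary cutoff, and the review format is hypothetical. The model produces candidate annotations, uncertain items are routed to humans, and the reviewer’s corrections are merged back in.

```python
# A minimal sketch of a pre-annotation workflow. `run_detector(image)` is a
# hypothetical callable that returns (label, box, confidence) tuples; it does
# not correspond to any specific tool.

from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Annotation:
    label: str
    box: Box
    confidence: float = 1.0  # human-added annotations default to full confidence


def pre_annotate(
    images: Iterable[Tuple[str, object]],
    run_detector: Callable[[object], List[Tuple[str, Box, float]]],
    review_threshold: float = 0.5,
) -> Tuple[Dict[str, List[Annotation]], Dict[str, List[Annotation]]]:
    """Generate pre-annotations and flag the images most in need of human labeling."""
    confident, needs_review = {}, {}
    for image_id, image in images:
        preds = [Annotation(label, box, conf) for label, box, conf in run_detector(image)]
        # Empty or low-confidence predictions are the most valuable to label by hand.
        if not preds or min(p.confidence for p in preds) < review_threshold:
            needs_review[image_id] = preds
        else:
            confident[image_id] = preds
    return confident, needs_review


def apply_review(
    pre_annotations: List[Annotation],
    rejected_indices: List[int],
    added: List[Annotation],
) -> List[Annotation]:
    """Merge a reviewer's corrections: drop false positives, add missed objects."""
    kept = [a for i, a in enumerate(pre_annotations) if i not in rejected_indices]
    return kept + added
```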
Another way ML tools can help is by relieving human labelers and annotators of tasks that are cognitively taxing but don’t add significant new value.
This is where large language models (LLMs) especially come into play, for example, improving a response so that it is more concise or better structured. In this case, humans in the loop extract or inject new raw data, while the LLM performs the more cognitively strenuous work of repackaging and organizing it, perhaps even offering suggestions on areas of improvement. Humans can then edit and refine the output.
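As a rough illustration, the split of responsibilities might look like the sketch below. `call_llm` is a placeholder for whichever LLM client is actually in use (assumed here to take a prompt string and return text), and the prompt wording is only an example, not a prescribed template.

```python
# A sketch of a human-in-the-loop refinement step. `call_llm` is a
# placeholder for the LLM client in use; it is assumed to accept a prompt
# string and return the model's text response.

def refine_response(raw_response: str, call_llm) -> str:
    """Ask an LLM to repackage a human-drafted response so it is more
    concise and better structured, without changing its factual content."""
    prompt = (
        "Rewrite the following response to be more concise and better "
        "structured. Do not add or remove factual claims.\n\n"
        f"{raw_response}"
    )
    # The human annotator still reviews and edits this draft before it is
    # accepted as training data.
    return call_llm(prompt)
```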
ML models are also extremely helpful for other repetitive tasks like labeling for object tracking. Instead of humans making minor bounding box adjustments from frame to frame, an ML model can perform the work, with a human validating the results.
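One simple form of this assistance is filling in boxes between human-labeled keyframes. The sketch below assumes roughly linear motion and is only meant to illustrate the idea; real tracking models are more sophisticated, but the human’s role is the same, namely spot-checking and correcting the generated frames.

```python
# A sketch of interpolating bounding boxes between two human-labeled
# keyframes, a simple stand-in for model-assisted object tracking.

from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def interpolate_box(start: Box, end: Box, frame: int, start_frame: int, end_frame: int) -> Box:
    """Estimate the box at `frame`, assuming roughly linear motion between keyframes."""
    t = (frame - start_frame) / (end_frame - start_frame)
    return tuple(s + t * (e - s) for s, e in zip(start, end))


# Frames 0 and 10 were labeled by hand; frames 1-9 are filled in automatically
# and then validated (and corrected where needed) by a human.
boxes: Dict[int, Box] = {0: (10.0, 10.0, 50.0, 50.0), 10: (30.0, 20.0, 70.0, 60.0)}
for f in range(1, 10):
    boxes[f] = interpolate_box(boxes[0], boxes[10], f, 0, 10)
```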
While ML models are invaluable for speeding up the data labeling and annotation process, especially when handling large-scale datasets and repetitive tasks, human oversight remains essential for ensuring high-quality results. This collaboration between ML and human expertise optimizes the process, balancing speed and accuracy to produce quality training data.
At Sama, we are always exploring, prototyping, and productizing tools that blend cutting-edge models with a deep understanding of where humans still provide the most value in labeling and annotation processes.
Learn more about our perspective on automation in our free e-book, Machines Still Need Us.
Image credit: Yutong Liu & Kingston School of Art / Better Images of AI / Talking to AI 2.0 / Licensed by CC-BY 4.0