AI of the Week: Let’s not forget the humble data annotator

Keeping up with an industry as rapidly changing as AI is a challenge. So until AI can do it for you, here's a quick roundup of recent stories in the world of machine learning, as well as notable research and experiments that we didn't cover independently.

This week in AI, we'd like to spotlight labeling and annotation startups such as Scale AI, which is reportedly in talks to raise new funding at a valuation of $13 billion. Labeling and annotation platforms may not get the same attention as flashy new generative AI models like OpenAI's Sora. But they are essential. Without them, modern AI models arguably wouldn't exist.

The data on which many models are trained must be labeled. Why? Labels, or tags, help the model understand and interpret the data during the training process. For example, labels for training image recognition models may take the form of markings around objects (“bounding boxes”) or captions that refer to each person, place, or object depicted in the image.
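To make the idea concrete, here is a minimal sketch of what the labels for one image might look like in Python. The field names, the file name, and the (x_min, y_min, x_max, y_max) pixel convention are assumptions chosen for illustration, not the schema of any particular annotation platform.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BoundingBox:
    """One labeled object; the coordinate convention is an illustrative assumption."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def area(self) -> int:
        # Width times height, in square pixels.
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


@dataclass
class ImageAnnotation:
    """All labels attached to a single image (hypothetical schema)."""
    image_path: str
    caption: str
    boxes: List[BoundingBox] = field(default_factory=list)


annotation = ImageAnnotation(
    image_path="street_scene.jpg",  # hypothetical file name
    caption="A person walking a dog past a parked car.",
    boxes=[
        BoundingBox("person", 40, 30, 120, 260),
        BoundingBox("dog", 130, 180, 210, 260),
    ],
)

# The object classes an annotator marked in this image.
labels = [box.label for box in annotation.boxes]
```

Every box and caption in a record like this is the product of a person's judgment, which is why a data set with millions of images represents so much human work.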

Label accuracy and quality have a significant impact on the performance and reliability of the trained model. Annotation is also a large-scale task, requiring thousands to millions of labels for the larger and more sophisticated data sets used.

So you might think that data annotators would be well treated, paid a living wage, and given the same benefits as the engineers who build the models. However, the opposite is often true, a product of the harsh working conditions that many annotation and labeling startups foster.

Multibillion-dollar companies like OpenAI have relied on annotators in third-world countries who are paid just a few dollars an hour. Some of these annotators are exposed to highly disturbing content, such as graphic imagery, yet are not given time off (usually because they are contractors) or access to mental health resources.

A great article in NY Mag peels back the curtain on Scale AI in particular, which recruits annotators in places as far-flung as Nairobi, Kenya. Some of Scale AI's tasks take labelers multiple eight-hour workdays, with no breaks, and pay as little as $10. And these workers are at the mercy of the platform's whims. Annotators can go long stretches without receiving any work, or they can be abruptly removed from Scale AI, as recently happened to contractors in Thailand, Vietnam, Poland and Pakistan.

Some annotation and labeling platforms claim to offer “fair trade” work. In fact, they've made it a central part of their branding. But as Kate Kaye of MIT Tech Review points out, there are no regulations, only weak industry standards, on what constitutes ethical labeling practice, and companies' own definitions vary widely.

So what's to be done? Unless there are major technological advances, the need to annotate and label data for AI training will not go away. We can hope that platforms self-regulate, but the more realistic solution seems to be policymaking. That in itself is a difficult prospect, but I would argue that it is the best bet we have for changing things for the better. Or at least starting to.

Here are some other notable AI stories from the past few days.

OpenAI builds a voice cloner: OpenAI is previewing a new AI-powered tool called Voice Engine, which lets users clone a voice from a 15-second recording of someone speaking. However, the company has chosen not to release it widely (yet), citing the risk of misuse and abuse.

Amazon doubles down on Anthropic: Amazon has invested an additional $2.75 billion in growing AI powerhouse Anthropic, following through on the option it left open last September.

Google.org launches an accelerator: Google.org, Google's philanthropic arm, is launching a new $20 million, six-month program to help fund nonprofits developing technology that leverages generative AI.

A new model architecture: AI startup AI21 Labs has released Jamba, a generative AI model that employs a novel(ish) model architecture, state space models (SSMs), to improve efficiency.

Databricks launches DBRX: In other model news, Databricks this week released DBRX, a generative AI model akin to OpenAI's GPT series and Google's Gemini. The company claims it achieves state-of-the-art results on a number of popular AI benchmarks, including several that measure reasoning.

Uber Eats and UK AI regulation: Natasha writes about Uber Eats delivery drivers' struggle with AI bias, which shows how difficult it is to get justice under the UK's AI regulations.

EU election security guidelines: The European Union on Tuesday published draft election security guidelines aimed at the roughly two dozen platforms regulated under the Digital Services Act, including guidelines on preventing content recommendation algorithms from spreading generative AI-based disinformation (also known as political deepfakes).

Grok gets upgraded: X's Grok chatbot will soon get an upgraded underlying model, Grok-1.5. At the same time, all Premium subscribers on X will gain access to Grok. (Grok was previously exclusive to X Premium+ customers.)

Adobe expands Firefly: This week, Adobe unveiled Firefly Services, a set of more than 20 new generative and creative APIs, tools, and services. It also launched Custom Models, which lets businesses fine-tune Firefly models based on their assets, as part of Adobe's new GenStudio suite.

More machine learning

What's the weather like? AI is increasingly able to tell us. A few months ago, we mentioned some work on hourly, weekly, and century-scale forecasting, but like all things in AI, this field is changing rapidly. The team behind MetNet-3 and GraphCast has published a paper describing a new system called SEEDS, for Scalable Ensemble Envelope Diffusion Sampler.

An animation showing how more predictions create a more even distribution of weather forecasts.

SEEDS uses diffusion to generate an “ensemble” of plausible weather outcomes for an area based on inputs (perhaps radar readings or orbital images) much faster than physics-based models. With a larger ensemble, you can cover more edge cases (such as an event that occurs in only one of 100 possible scenarios) and be more confident about the more likely situations.
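A rough back-of-the-envelope calculation shows why ensemble size matters. Assuming, purely for illustration, that each ensemble member is an independent draw and that the rare event has a 1% chance per draw (neither assumption comes from the SEEDS paper), the chance that at least one member captures the event is 1 - (1 - p)^n:

```python
def chance_event_appears(p: float, n_members: int) -> float:
    """Probability that at least one of n independent members shows an event of probability p.

    Independence between members is a simplifying assumption for this sketch.
    """
    return 1.0 - (1.0 - p) ** n_members


# Compare a small, medium, and large ensemble for a 1-in-100 event.
for n in (10, 100, 1000):
    print(n, round(chance_event_appears(0.01, n), 3))
```

Under these assumptions, a 10-member ensemble catches the 1-in-100 event roughly 10% of the time and a 100-member ensemble roughly 63% of the time, while a 1,000-member ensemble almost always does. That is why cheaper ensemble generation is valuable.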

Fujitsu also hopes to better understand the natural world by applying AI image processing technology to underwater images and LIDAR data collected by underwater autonomous vehicles. Improving image quality allows other less sophisticated processes (such as 3D transformations) to work better on the target data.

Image credit: Fujitsu

The idea is to build a “digital twin” of a body of water that can help simulate and predict new developments. We are far from there, but we have to start somewhere.

Researchers have found that LLMs mimic intelligence in an even simpler way than expected: with linear functions. Frankly, the math is beyond me (it involves vectors in many dimensions), but this write-up at MIT makes it pretty clear that the recall mechanism of these models is quite basic.

“These models are very complex nonlinear functions, trained on large amounts of data, and very difficult to understand, but there may be very simple mechanisms operating under the hood. This is one example of that,” said co-lead author Evan Hernandez. If you're interested in something more technical, check out the paper here.

One way these models can fail is by not understanding context or feedback. Even a really capable LLM might not “get it” if you tell it your name is pronounced a certain way, because it doesn't actually know or understand anything. In cases where that matters, such as human-robot interaction, it could put people off if the robot behaves that way.

Disney Research has been studying automated character interactions for a long time, and recently published a paper on name pronunciation and reuse. It may seem obvious, but a smart approach is to extract the phonemes when someone introduces themselves and encode those rather than just the written name.

Image credit: Disney Research

Finally, as AI and search increasingly overlap, it is worth reassessing how these tools are used and whether this unholy union poses new risks. Safiya Umoja Noble has been an important voice in AI and search ethics for many years, and her opinions are always enlightening. She gave a great interview to the UCLA news team about how her work has evolved and why we need to stay alert when it comes to bias and bad habits in search.
