
AI of the Week: Let’s not forget the humble data annotator

By TechBrunch | March 30, 2024 | 8 Mins Read


Keeping up with an industry as rapidly changing as AI is a challenge. So until AI can do it for you, here's a quick roundup of recent stories in the world of machine learning, as well as notable research and experiments that we didn't cover independently.

This week in AI, we'd like to spotlight labeling and annotation startups, such as Scale AI, which is reportedly in talks to raise new funding at a valuation of $13 billion. Labeling and annotation platforms may not get the same attention as flashy new generative AI models like OpenAI's Sora. But they are essential. Without them, modern AI models probably wouldn't exist.

The data on which many models are trained must be labeled. Why? Labels, or tags, help the model understand and interpret the data during the training process. For example, labels for training an image recognition model may take the form of markings around objects (“bounding boxes”) or captions that refer to each person, place, or object depicted in an image.
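To make the idea concrete, here is a minimal sketch of what a single labeled image might look like as data. The class and field names (`BoundingBox`, `LabeledImage`, `caption`) and the sample values are hypothetical, chosen only for illustration; real annotation platforms each define their own formats.

```python
from dataclasses import dataclass, field


@dataclass
class BoundingBox:
    """A rectangle an annotator draws around one object, in pixel coordinates."""
    label: str  # what the annotator says the object is, e.g. "dog"
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def area(self) -> int:
        """Area of the box in square pixels."""
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


@dataclass
class LabeledImage:
    """One training example: an image plus the labels a human attached to it."""
    image_path: str
    caption: str
    boxes: list[BoundingBox] = field(default_factory=list)

    def labels(self) -> list[str]:
        """All object labels in the order the boxes were drawn."""
        return [box.label for box in self.boxes]


# Hypothetical example record (file name and coordinates are made up).
example = LabeledImage(
    image_path="park.jpg",
    caption="A dog chasing a ball in a park",
    boxes=[
        BoundingBox("dog", 40, 60, 200, 220),
        BoundingBox("ball", 230, 180, 260, 210),
    ],
)
print(example.labels())
```

Every box and caption in a record like this is typed or drawn by a person, which is why a data set with millions of images translates into so much human labor.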

Label accuracy and quality have a significant impact on the performance and reliability of the trained model. Annotation is also a large-scale task, requiring thousands to millions of labels for the larger and more sophisticated data sets used.

So you might think that data annotators would be treated well, paid a living wage, and given the same benefits as the engineers who build the models. However, the opposite is often true, a product of the harsh working conditions that many annotation and labeling startups foster.

Multibillion-dollar companies like OpenAI have relied on annotators in third-world countries who are paid just a few dollars an hour. Some of these annotators are exposed to highly disturbing content, such as graphic images, yet are not given time off (usually because they are contractors) or access to mental health resources.

A great article in NY Mag peels back the curtain on Scale AI in particular, which recruits annotators in places as far-flung as Nairobi, Kenya. Some of the tasks on Scale AI take labelers multiple eight-hour shifts, with no breaks, and pay as little as $10. And these workers are at the mercy of the platform's whims. Annotators sometimes go long stretches without receiving any work, or they are unceremoniously booted off Scale AI, as recently happened to contractors in Thailand, Vietnam, Poland and Pakistan.

Some annotation and labeling platforms claim to offer “fair trade” work. In fact, they've made it a central part of their branding. But as Kate Kaye of MIT Tech Review points out, there are no regulations, only weak industry standards, on what constitutes ethical labeling work, and companies' own definitions vary widely.

So what can be done? Barring a major technological breakthrough, the need to annotate and label data for AI training will not go away. We can hope that platforms self-regulate, but a more realistic solution seems to be policymaking. That in itself is a difficult prospect, but I would argue that it is the best shot we have at changing things for the better. Or at least starting to.

Here are some other notable AI stories from the past few days.

OpenAI builds voice clones: OpenAI is previewing a new AI-powered tool called Voice Engine. The tool lets users clone a voice from a 15-second recording of someone speaking. However, the company has chosen not to make it widely available (yet) due to the risk of misuse and abuse.

Amazon doubles down on Anthropic: Amazon invested an additional $2.75 billion in growing AI powerhouse Anthropic, following through on the option it left open last September.

Google.org launches accelerator: Google.org, Google's philanthropic arm, is launching a new $20 million, six-month program to help fund nonprofits developing technology that leverages generative AI.

New model architecture: AI startup AI21 Labs has released Jamba, a generative AI model that employs a novel(ish) model architecture, the state-space model (SSM), to improve efficiency.

Databricks launches DBRX: In other model news, Databricks this week released DBRX, a generative AI model similar to OpenAI's GPT series and Google's Gemini. The company claims to have achieved state-of-the-art results on a number of popular AI benchmarks, including several that measure reasoning.

Uber Eats and UK AI regulation: Natasha writes about how an Uber Eats courier's fight against AI bias shows how difficult it is to get justice under UK AI regulation.

EU election security guidelines: The European Union on Tuesday published draft election security guidelines for around two dozen platforms regulated under the Digital Services Act. These include guidelines on preventing content recommendation algorithms from spreading generative AI-based disinformation (also known as political deepfakes).

Grok gets upgraded: X's Grok chatbot will soon get an upgraded base model, Grok-1.5. At the same time, all Premium subscribers on X will gain access to Grok. (Grok was previously exclusive to X Premium+ customers.)

Adobe expands Firefly: This week, Adobe announced Firefly Services, a set of over 20 new generative and creative APIs, tools, and services. It also launched Custom Models, which lets companies fine-tune Firefly models on their own assets, as part of Adobe's new GenStudio suite.

More machine learning

What's the weather like? AI is increasingly able to tell us. A few months ago, we mentioned some work on hourly, weekly, and century-scale forecasting, but like all things in AI, this field is changing rapidly. The team behind MetNet-3 and GraphCast has published a paper describing a new system called SEEDS, for Scalable Ensemble Envelope Diffusion Sampler.

An animation showing how more predictions create a more even distribution of weather forecasts.

SEEDS uses diffusion to generate an “ensemble” of plausible weather outcomes for an area based on inputs (perhaps radar readings or orbital images), much faster than physics-based models. As the ensemble count increases, it can cover more edge cases (such as an event that occurs in only 1 of 100 possible scenarios) and be more confident about the more likely situations.
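A rough way to see why larger ensembles catch more edge cases is the sketch below. It does not use diffusion or any part of SEEDS: `sample_ensemble` is a toy random generator standing in for a real forecast model, the function names are hypothetical, and it assumes each ensemble member is independent, which real forecasts are not guaranteed to be.

```python
import random


def chance_event_appears(n_members: int, event_prob: float = 0.01) -> float:
    """Probability that at least one of n independent members shows the event."""
    return 1.0 - (1.0 - event_prob) ** n_members


def sample_ensemble(n_members: int, event_prob: float = 0.01, seed: int = 0) -> list[bool]:
    """Toy stand-in for a forecast generator: True means a member shows the rare event."""
    rng = random.Random(seed)
    return [rng.random() < event_prob for _ in range(n_members)]


# Compare small and large ensembles for a 1-in-100 event.
for size in (10, 100, 1000):
    members = sample_ensemble(size)
    print(
        f"{size} members: {sum(members)} show the event; "
        f"chance of seeing it at all is about {chance_event_appears(size):.3f}"
    )
```

Under these assumptions, a 10-member ensemble will usually miss a 1-in-100 event entirely, while a 1,000-member ensemble almost certainly includes it, which is why a faster way to generate members matters.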

Fujitsu also hopes to better understand the natural world by applying AI image processing technology to underwater images and LIDAR data collected by autonomous underwater vehicles. Improving image quality allows other, less sophisticated processes (such as 3D conversion) to work better on the target data.

Image credit: Fujitsu

The idea is to build a “digital twin” of a body of water that can help simulate and predict new developments. We are far from there, but we have to start somewhere.

Researchers have found that LLMs mimic intelligence in an even simpler way than expected: with linear functions. Frankly, the math is beyond me (it involves vectors in many dimensions), but this write-up at MIT makes it pretty clear that the recall mechanism of these models is very basic.

“These models are very complex nonlinear functions, trained on large amounts of data, and very difficult to understand, but there may be very simple mechanisms operating under the hood. This is one example of that,” said co-lead author Evan Hernandez. If you're interested in something more technical, check out the paper here.
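For readers who want a feel for what “linear function” means here, the toy sketch below applies one to a small vector. It is not the researchers' method or code: the function name `apply_linear_function`, the matrix, and the two-dimensional vectors are made up for illustration, and the vectors inside a real model have thousands of dimensions.

```python
def apply_linear_function(weights, bias, vector):
    """Return weights * vector + bias, computed with plain Python lists."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


# Made-up numbers: a 2x2 weight matrix, a bias, and a 2-dimensional input.
weights = [[1.0, 0.0], [0.0, 2.0]]
bias = [1.0, 1.0]
vector = [3.0, 4.0]  # stand-in for a vector inside the model

print(apply_linear_function(weights, bias, vector))  # [4.0, 9.0]
```

The whole operation is multiplication and addition, with no branching and no nested nonlinearities, which is what makes such a mechanism simple compared with the model as a whole.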

One way these models can fail is by not understanding context or feedback. Even a really capable LLM might not “get it” if you tell it your name is pronounced a certain way, because it doesn't actually know or understand anything. In cases where that matters, such as human-robot interaction, people may be put off if the robot behaves that way.

Disney Research has been studying automated character interactions for a long time, and recently published a paper on name pronunciation and reuse. It may seem obvious, but a smart approach is to extract the phonemes when someone introduces themselves and encode those rather than just the written name.
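The idea of storing sounds rather than spelling can be sketched in a few lines. This is an illustration only, not Disney Research's implementation: the names `RememberedName`, `remember`, and `pronunciation_key` are hypothetical, the phoneme symbols are ARPAbet-style placeholders, and the step that extracts phonemes from speech is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass
class RememberedName:
    written: str         # how the name is spelled
    phonemes: list[str]  # how the person actually said it


def remember(written: str, heard_phonemes: list[str]) -> RememberedName:
    """Store the pronunciation captured at introduction time alongside the spelling."""
    return RememberedName(written=written, phonemes=list(heard_phonemes))


def pronunciation_key(name: RememberedName) -> str:
    """What a speech synthesizer would be handed: the phonemes, not the spelling."""
    return " ".join(name.phonemes)


# Two guests who spell their name the same way but say it differently.
guest_a = remember("Lea", ["L", "IY"])        # sounds like "lee"
guest_b = remember("Lea", ["L", "EY", "AH"])  # sounds like "lay-uh"

print(pronunciation_key(guest_a))
print(pronunciation_key(guest_b))
```

A system that kept only the string "Lea" would have to guess, and would get at least one of these two guests wrong every time.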

Image credit: Disney Research

Finally, as AI and search increasingly overlap, it is worth reassessing how these tools are used and whether this unholy union poses new risks. Safiya Umoja Noble has been an important voice in AI and search ethics for many years, and her opinions are always enlightening. She gave a great interview to the UCLA news team about how her work has evolved and why we need to stay alert when it comes to bias and bad habits in search.


