TechBrunch

Following the success of AgentGPT, Reworkd pivots to web scraping AI agents

By TechBrunch | July 24, 2024 | 8 Mins Read


The Reworkd founders launched AgentGPT last year, a free tool for building AI agents that went viral on GitHub and gained over 100,000 daily users in a week. This led them to be accepted into Y Combinator's Summer 2023 class, but the co-founders quickly realized that building a general-purpose AI agent was too broad a scope. So now Reworkd is a web scraping company, specifically building AI agents that extract structured data from the public web.

AgentGPT offered a simple in-browser interface that allowed users to create autonomous AI agents, and soon everyone was hyping agents as the future of computing.

When the tool took off, Asim Shrestha, Adam Watkins, and Srijan Subedi were still living in Canada and Reworkd didn't exist. The huge influx of users caught them off guard. Subedi, now COO at Reworkd, said the tool's API calls were costing them $2,000 a day. So they created Reworkd and had to raise funds, fast. One of the most common use cases for AgentGPT was creating web scrapers, a relatively simple but high-volume task, so Reworkd specialized in that.

Web scrapers have become invaluable in the AI era. According to a new report from Bright Data, the top reason organizations will use public web data in 2024 will be to build AI models. The problem is that web scrapers are traditionally built by humans and must be customized for specific web pages, making them costly. But Reworkd's AI agents can scrape more of the web with less human intervention.

Customers provide Reworkd with a list of hundreds or even thousands of websites to scrape and specify the type of data they're interested in. Reworkd's AI agents then convert this into structured data using multi-modal code generation. The agents generate unique code to scrape each website and extract that data for the customer to use as they wish.

For example, say you want stats for every player in the NFL, but each team's website has a different layout. Instead of building a scraper for each website yourself, you give Reworkd a link and a description of the data you want to extract, and its agents do the rest. With 32 teams, that could save you hours, but with 1,000 teams, it could save you weeks.
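To make the idea concrete, here is a minimal Python sketch of what "one scraper per layout, one shared output format" could look like. It is an illustration only, not Reworkd's code: the two page snippets, the `PlayerStat` record, and the hand-written `scrape_team_a` and `scrape_team_b` functions are all hypothetical stand-ins for the per-site code an agent would generate, and the regular expressions are only suitable for these toy snippets.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PlayerStat:
    """The structured record the customer asked for (hypothetical schema)."""
    name: str
    touchdowns: int


# Two made-up team pages that publish the same facts in different layouts.
TEAM_A_HTML = """
<table>
  <tr><td class="name">Alex Doe</td><td class="td">7</td></tr>
  <tr><td class="name">Sam Roe</td><td class="td">3</td></tr>
</table>
"""

TEAM_B_HTML = """
<ul>
  <li data-player="Jo Poe" data-touchdowns="5"></li>
</ul>
"""


def scrape_team_a(html: str) -> list[PlayerStat]:
    # Layout A: one table row per player, values in classed cells.
    pattern = r'<td class="name">(.*?)</td><td class="td">(\d+)</td>'
    return [PlayerStat(name, int(tds)) for name, tds in re.findall(pattern, html)]


def scrape_team_b(html: str) -> list[PlayerStat]:
    # Layout B: one list item per player, values in data attributes.
    pattern = r'data-player="(.*?)" data-touchdowns="(\d+)"'
    return [PlayerStat(name, int(tds)) for name, tds in re.findall(pattern, html)]


# Each site is paired with the scraper written for its layout.
SITES = [(TEAM_A_HTML, scrape_team_a), (TEAM_B_HTML, scrape_team_b)]


def collect() -> list[PlayerStat]:
    """Run every site-specific scraper and merge results into one list."""
    rows: list[PlayerStat] = []
    for html, scraper in SITES:
        rows.extend(scraper(html))
    return rows


if __name__ == "__main__":
    for row in collect():
        print(row)
```

The point of the sketch is the shape of the problem: the extraction logic differs for every site, while the output schema stays the same. Writing the per-site functions by hand is the part that grows with the number of websites.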

Reworkd has raised $2.75 million in new seed funding from investors including Paul Graham, AI Grant (Nat Friedman and Daniel Gross' startup accelerator), SV Angel, General Catalyst and Panache Ventures, the startup told TechCrunch exclusively. Combined with $1.25 million in pre-seed funding from Panache Ventures and Y Combinator last year, this brings Reworkd's total funding to date to $4 million.

AI that can utilize the Internet

Shortly after founding Reworkd and relocating to San Francisco, the team hired Rohan Pandey as a founding research engineer, who now lives at AGI House SF, one of the Bay Area's most popular AI-era hacker houses. One investor described Pandey as “a one-man lab within Reworkd.”

“We see it as the culmination of a 30-year-old dream of the Semantic Web,” Pandey said in an interview with TechCrunch, referencing World Wide Web inventor Tim Berners-Lee's vision of the entire internet being readable by computers. “Some websites don't have markup, but LLMs can understand websites the same way a human can, so we can basically expose any website as an API. So in a way, Reworkd is like a universal API layer for the internet.”

Reworkd says it can capture the long tail of customer data needs, meaning its AI agents are particularly well suited to scraping the thousands of small, public websites that larger competitors often ignore. Other companies, such as Bright Data, have already built scrapers for large websites like LinkedIn and Amazon, but having humans build a scraper for every small website may not be worth the effort. Reworkd addresses this concern, but it may raise others.

What exactly is “public” web data?

Web scrapers have been around for decades, but they have stirred up controversy in the AI era. Unrestricted scraping of huge amounts of data has landed OpenAI and Perplexity in legal trouble. Press and media outlets have alleged that AI companies are extracting intellectual property from paid content and widely reproducing it without compensation. Reworkd takes precautions to avoid such issues.

“We see this as increasing accessibility to publicly available information,” Reworkd co-founder and CEO Shrestha told TechCrunch in an interview. “We're only allowing publicly available information, and we're not going through any sign-in barriers or anything like that.”

Going a step further, Reworkd avoids scraping news altogether and is selective about who it works with. Watkins, the company's CTO, said news is not the company's focus because there are better tools available for aggregating news content.

As an example, Reworkd described its work with Axis, a company that helps policy teams comply with government regulations. Axis uses Reworkd's AI to extract data from thousands of government regulatory documents across many countries in the European Union. Axis then trains and fine-tunes AI models based on this data and offers them as products to its customers.

Starting a web scraping company today can mean treading into risky territory, according to Aaron Fiske, a partner at Silicon Valley-based law firm Gunderson Dettmer. Right now, the situation is somewhat fluid, and the jury is still out on how “public” web data really is for AI models. But Reworkd's approach, in which clients decide which websites to scrape, could potentially protect the company from legal liability, Fiske says.

“It's kind of like how the photocopier was invented, and it turns out that the use of making copies is very valuable economically, but very questionable legally,” Fiske told TechCrunch in an interview. “Web scrapers providing services to AI companies aren't necessarily risky, but working with AI companies that are genuinely interested in harvesting copyrighted content is probably problematic.”

That's why Reworkd is being careful about who it works with. So far, web scrapers have largely escaped blame in potential AI-related copyright infringement cases. In the OpenAI case, Fiske notes, The New York Times sued the company that allegedly reproduced its articles, not the web scraper that collected them. But even in that case, it remains undecided whether OpenAI's actions were indeed copyright infringement.

In the midst of the AI boom, there is more evidence that web scraping is legally permissible. A court recently ruled in favor of Bright Data, which had scraped Facebook and Instagram profiles from the web. One example cited in the case was a dataset of 615 million records of Instagram user data that Bright Data was selling for $860,000. Meta sued the company, claiming that this violated its terms of use. However, the court ruled that the data was publicly available and therefore scrapable.

Investors think Reworkd can become as big as the big players

Reworkd has attracted big-name early investors, from Y Combinator and Paul Graham to Daniel Gross and Nat Friedman, some of whom say that's because Reworkd's technology promises to improve and get cheaper with new models. The startup says that OpenAI's GPT-4o is currently the best fit for multimodal code generation, and that much of Reworkd's technology wasn't possible until just a few months ago.

“I think as a founder, you're going to struggle if you're trying to compete with the rate of advancement in technology, instead of building on top of it,” General Catalyst's Viet Le said in an interview with TechCrunch. “Reworkd has the mindset of building its solutions around the rate of advancement.”

Reworkd is building AI agents that address a specific gap in the market. AI is advancing rapidly, so companies need more data, and fine-tuning models requires large amounts of high-quality, structured data. As more companies build custom AI models specific to their business, Reworkd stands to gain more customers.

Reworkd says its approach is “self-healing,” meaning its web scrapers won't break when a webpage is updated. The company also claims that because its agents generate code to scrape websites, it avoids the hallucination problems that typically plague AI models. The AI can still make mistakes and pull incorrect data from websites, so the Reworkd team created an open-source evaluation framework, Banana-lyzer, that periodically evaluates its accuracy.
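One way to picture a “self-healing” scraper is as a validate-and-regenerate loop, sketched below in Python. This is a guess at the general pattern, not Reworkd's implementation and not Banana-lyzer's API: the required fields, the `is_valid` check, and the `fake_regenerate` function (which stands in for asking a model to write a new scraper) are all assumptions made for illustration, and the "pages" are plain text rather than real HTML.

```python
from typing import Callable

Record = dict[str, str]
Scraper = Callable[[str], list[Record]]

# Hypothetical schema: every extracted row must carry these fields.
REQUIRED_FIELDS = {"title", "price"}


def is_valid(records: list[Record]) -> bool:
    """Pass only if rows came back and each has every required, non-empty field."""
    return bool(records) and all(
        REQUIRED_FIELDS.issubset(row) and all(row[f] for f in REQUIRED_FIELDS)
        for row in records
    )


def run_with_healing(
    page: str,
    scraper: Scraper,
    regenerate: Callable[[str], Scraper],
    max_repairs: int = 1,
) -> tuple[list[Record], int]:
    """Run the scraper; if its output fails validation, regenerate and retry.

    Returns the records and the number of repairs that were needed.
    """
    for repairs in range(max_repairs + 1):
        try:
            records = scraper(page)
        except Exception:
            records = []  # a crash counts as a failed scrape
        if is_valid(records):
            return records, repairs
        if repairs < max_repairs:
            scraper = regenerate(page)
    raise RuntimeError("scraper could not be repaired")


def old_scraper(page: str) -> list[Record]:
    # Written for an older page format: "title|price" on each line.
    return [
        dict(zip(("title", "price"), line.split("|")))
        for line in page.splitlines()
        if "|" in line
    ]


def fake_regenerate(page: str) -> Scraper:
    # Stand-in for code generation: returns a scraper for the new "title,price" format.
    def new_scraper(p: str) -> list[Record]:
        return [
            dict(zip(("title", "price"), line.split(",")))
            for line in p.splitlines()
            if "," in line
        ]
    return new_scraper


# The site changed its layout, so the old scraper now finds nothing.
UPDATED_PAGE = "Widget,9.99\nGadget,19.50"
records, repairs = run_with_healing(UPDATED_PAGE, old_scraper, fake_regenerate)
```

In this toy run the old scraper returns no rows for the updated page, validation fails, and a single regeneration produces a scraper that passes. A check like `is_valid` only catches structural breakage; it would not notice a scraper that returns well-formed but wrong values, which is the kind of error an accuracy evaluation against known-good data is meant to catch.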

Reworkd doesn't have many employees — just four people on its team — but there are significant inference costs to run its AI agents. The startup expects its prices to become more competitive as these costs fall. OpenAI just released GPT-4o mini, a miniature version of its industry-leading model with competitive benchmarks. Innovations like this could make Reworkd even more competitive.

Paul Graham and AI Grant did not respond to TechCrunch's request for comment.


