TechBrunch

Following the success of AgentGPT, Reworkd pivots to web scraping AI agents

By TechBrunch | July 24, 2024 | 8 Mins Read


The Reworkd founders launched AgentGPT last year, a free tool for building AI agents that went viral on GitHub and gained over 100,000 daily users in a week. This led them to be accepted into Y Combinator's Summer 2023 class, but the co-founders quickly realized that building a general-purpose AI agent was too broad a scope. So now Reworkd is a web scraping company, specifically building AI agents that extract structured data from the public web.

AgentGPT offered a simple in-browser interface that allowed users to create autonomous AI agents, and soon everyone was hyping agents as the future of computing.

When the tool took off, Asim Shrestha, Adam Watkins, and Srijan Subedi were still living in Canada and Reworkd didn't exist. The huge influx of users caught them off guard. Subedi, now COO at Reworkd, said the tool's API calls were costing them $2,000 a day. So they created Reworkd and had to raise funds, fast. One of the most common use cases for AgentGPT was creating web scrapers, a relatively simple but high-volume task, so Reworkd specialized in that.

Web scrapers have become invaluable in the AI era. According to a new report from Bright Data, the top reason organizations will use public web data in 2024 will be to build AI models. The problem is that web scrapers are traditionally built by humans and must be customized for specific web pages, making them costly. But Reworkd's AI agents can scrape more of the web with less human intervention.

Customers provide Reworkd with a list of hundreds or even thousands of websites to scrape and specify the type of data they're interested in. Reworkd's AI agents then convert this into structured data using multi-modal code generation. The agents generate unique code to scrape each website and extract that data for the customer to use as they wish.

For example, say you want stats for every player in the NFL, but each team's website has a different layout. Instead of building a scraper for each website yourself, you give Reworkd a link and a description of the data you want to extract, and its agents do the rest. With 32 teams, that could save you hours, but with 1,000 teams, it could save you weeks.
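The per-site problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Reworkd's implementation: the two page layouts, the `PlayerStat` schema, the `team-a.example` and `team-b.example` hosts, and both scraper functions are hypothetical, and the scrapers are hand-written stand-ins for code that an agent would generate per site.

```python
import re
from dataclasses import dataclass


@dataclass
class PlayerStat:
    """Target schema: every site's data is normalized into this shape."""
    name: str
    position: str
    touchdowns: int


# Two hypothetical team pages that present the same facts in different layouts.
TEAM_A_HTML = """
<table>
  <tr><td class="name">Jane Doe</td><td class="pos">QB</td><td class="td">12</td></tr>
  <tr><td class="name">Sam Roe</td><td class="pos">WR</td><td class="td">7</td></tr>
</table>
"""

TEAM_B_HTML = """
<ul>
  <li data-player="Alex Poe" data-position="RB" data-touchdowns="9"></li>
</ul>
"""


def scrape_team_a(html: str) -> list[PlayerStat]:
    # Site-specific code for the table layout (stand-in for generated code).
    pattern = re.compile(
        r'<td class="name">(.*?)</td><td class="pos">(.*?)</td><td class="td">(\d+)</td>'
    )
    return [PlayerStat(n, p, int(t)) for n, p, t in pattern.findall(html)]


def scrape_team_b(html: str) -> list[PlayerStat]:
    # Site-specific code for the data-attribute layout (stand-in for generated code).
    pattern = re.compile(
        r'data-player="(.*?)" data-position="(.*?)" data-touchdowns="(\d+)"'
    )
    return [PlayerStat(n, p, int(t)) for n, p, t in pattern.findall(html)]


# One scraper per site; in the approach the article describes, this mapping
# would be filled in by the agent rather than by a developer.
SCRAPERS = {
    "team-a.example": (TEAM_A_HTML, scrape_team_a),
    "team-b.example": (TEAM_B_HTML, scrape_team_b),
}


def scrape_all() -> list[PlayerStat]:
    """Run every site's scraper and merge the results into one structured list."""
    records: list[PlayerStat] = []
    for _site, (html, scraper) in SCRAPERS.items():
        records.extend(scraper(html))
    return records
```

Calling `scrape_all()` returns three `PlayerStat` records in one uniform shape, even though the two pages share no markup. The manual cost in this sketch is the two `scrape_team_*` functions, which is the part the article says Reworkd's agents take over.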

Reworkd has raised $2.75 million in new seed funding from investors including Paul Graham, AI Grant (Nat Friedman and Daniel Gross' startup accelerator), SV Angel, General Catalyst and Panache Ventures, the startup told TechCrunch exclusively. Combined with $1.25 million in pre-seed funding from Panache Ventures and Y Combinator last year, this brings Reworkd's total funding to date to $4 million.

AI that can utilize the Internet

Shortly after founding Reworkd and relocating to San Francisco, the team hired Rohan Pandey as a founding research engineer, who now lives at AGI House SF, one of the Bay Area's most popular AI-era hacker houses. One investor described Pandey as “a one-man lab within Reworkd.”

“We see it as the culmination of a 30-year-old dream of the Semantic Web,” Pandey said in an interview with TechCrunch, referencing World Wide Web inventor Tim Berners-Lee's vision of the entire internet being readable by computers. “Some websites don't have markup, but LLMs can understand websites the same way a human can, so we can basically expose any website as an API. So in a way, Reworkd is like a universal API layer for the internet.”

Reworkd says it can capture the long tail of customer data needs, which means its AI agents are particularly well-suited to scraping the thousands of small, public websites that larger competitors often ignore. Other companies, such as Bright Data, have already built scrapers for large websites like LinkedIn and Amazon, but having humans build a scraper for each small website may not be worth the effort. Reworkd addresses this concern, but it may raise others.

What exactly is “public” web data?

Web scrapers have been around for decades, but they have stirred up controversy in the AI era. Unrestricted scraping of huge amounts of data has landed OpenAI and Perplexity in legal trouble. Press and media outlets have alleged that AI companies are extracting intellectual property from paid content and widely reproducing it without compensation. Reworkd takes precautions to avoid such issues.

“We see this as increasing accessibility to publicly available information,” Reworkd co-founder and CEO Shrestha told TechCrunch in an interview. “We're only allowing publicly available information, and we're not going through any sign-in barriers or anything like that.”

Going a step further, Reworkd avoids news scraping altogether and is selective about who it works with. Watkins, the company's CTO, said news scraping is not the company's focus because better tools are available for aggregating news content.

As an example, Reworkd described its work with Axis, a company that helps policy teams comply with government regulations. Axis uses Reworkd's AI to extract data from thousands of government regulatory documents across many countries in the European Union. Axis then trains and fine-tunes AI models based on this data and offers them as products to its customers.

Starting a web scraping company today can be treading into risky territory, according to Aaron Fiske, a partner at Silicon Valley-based law firm Gunderson Dettmer. Right now, the situation is somewhat fluid, and the jury is still out on how “exposed” web data really is to AI models. But Reworkd's approach, in which clients decide which websites to scrape, could potentially protect them from legal liability, Fiske says.

“It's kind of like how the photocopier was invented, and it turns out that the use of making copies is very valuable economically, but very questionable legally,” Fiske told TechCrunch in an interview. “Web scrapers providing services to AI companies aren't necessarily risky, but working with AI companies that are genuinely interested in harvesting copyrighted content is probably problematic.”

That's why Reworkd is being careful about who it works with. Web scrapers have previously blurred who is liable in potential AI-related copyright infringement cases. In the OpenAI case, Fiske notes, the New York Times sued the company that allegedly copied the articles, not the web scraper that collected them. But even in that case, it remains undecided whether OpenAI's actions were indeed copyright infringement.

In the midst of the AI boom, there is more evidence that web scrapers are legally OK. A court recently ruled in favor of Bright Data, which scraped Facebook and Instagram profiles over the web. One example cited in the case was a dataset of 615 million records of Instagram user data that Bright Data was selling for $860,000. Meta sued the company, claiming that this violated its terms of use. However, the court ruled that the data was publicly available and therefore scrapable.

Investors think Reworkd can become as big as the big players

Reworkd has attracted big-name early investors, from Y Combinator and Paul Graham to Daniel Gross and Nat Friedman, some of whom say that's because Reworkd's technology promises to improve and get cheaper with new models. The startup says that OpenAI's GPT-4o is currently the best fit for multimodal code generation, and that much of Reworkd's technology wasn't possible until just a few months ago.

“I think as a founder, you're going to struggle if you're trying to compete with the rate of advancement in technology, instead of building on top of it,” General Catalyst's Viet Le said in an interview with TechCrunch. “Reworkd has the mindset of building our solutions around the rate of advancement.”

Reworkd creates AI agents that address specific gaps in the market. AI is advancing rapidly, so companies need more data. As more companies build custom AI models specific to their business, Reworkd will gain more customers. Fine-tuning the models requires large amounts of quality, structured data.

Reworkd says its approach is “self-healing,” meaning its web scrapers won't break with webpage updates. The company claims that Reworkd's agents generate the code to scrape websites, avoiding the hallucination problems that traditionally plague AI models. While the AI can make mistakes and pull incorrect data from websites, the Reworkd team created an open-source evaluation framework, Banana-lyzer, that periodically evaluates its accuracy.
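The article does not say how the “self-healing” behavior or the accuracy checks are implemented, so the following Python sketch is only one plausible reading of the idea. Everything in it is an assumption made for illustration: the product-page layouts, the `name` and `price` fields, and the `is_healthy`, `accuracy`, `regenerate` and `scrape_with_healing` helpers are hypothetical and are not part of Banana-lyzer or any Reworkd API. The `regenerate` function stands in for an agent writing fresh scraping code after a layout change.

```python
import re
from typing import Callable

Record = dict[str, str]
Scraper = Callable[[str], list[Record]]

REQUIRED_FIELDS = ("name", "price")  # hypothetical target schema


def is_healthy(records: list[Record]) -> bool:
    """A scrape is healthy if it found rows and every required field is filled."""
    return bool(records) and all(
        record.get(field) for record in records for field in REQUIRED_FIELDS
    )


def accuracy(extracted: list[Record], expected: list[Record]) -> float:
    """Fraction of hand-labeled records that the scraper reproduced exactly."""
    if not expected:
        return 1.0
    return sum(record in extracted for record in expected) / len(expected)


def old_scraper(html: str) -> list[Record]:
    # Written for the page's previous layout, which used <span> elements.
    rows = re.findall(
        r'<span class="name">(.*?)</span><span class="price">(.*?)</span>', html
    )
    return [{"name": n, "price": p} for n, p in rows]


def regenerate(html: str) -> Scraper:
    # Stand-in for code generation: returns a scraper for the new layout.
    # The html argument is unused here; a real generator would inspect it.
    def new_scraper(page: str) -> list[Record]:
        rows = re.findall(r'<div data-name="(.*?)" data-price="(.*?)">', page)
        return [{"name": n, "price": p} for n, p in rows]

    return new_scraper


def scrape_with_healing(html: str, scraper: Scraper) -> tuple[list[Record], Scraper]:
    """Run the scraper; if its output fails the health check, regenerate once."""
    records = scraper(html)
    if not is_healthy(records):
        scraper = regenerate(html)
        records = scraper(html)
    return records, scraper


# The site changed its markup, so the old scraper no longer matches anything.
UPDATED_PAGE = '<div data-name="Widget" data-price="9.99"></div>'
EXPECTED = [{"name": "Widget", "price": "9.99"}]  # hand-labeled fixture

records, active_scraper = scrape_with_healing(UPDATED_PAGE, old_scraper)
score = accuracy(records, EXPECTED)
```

The two checks do different jobs. `is_healthy` needs no labels and can run on every scrape to detect breakage, while `accuracy` compares output against a small labeled fixture, which is closer to the periodic evaluation the article attributes to Banana-lyzer.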

Reworkd's team is small, just four people, but running its AI agents incurs significant inference costs. The startup expects its prices to become more competitive as these costs fall. OpenAI just released GPT-4o mini, a miniature version of its industry-leading model with competitive benchmarks. Innovations like this could make Reworkd even more competitive.

Paul Graham and AI Grant did not respond to TechCrunch's request for comment.


