OpenAI's ChatGPT is starting to integrate with other apps on your computer.
On Thursday, the startup announced that its ChatGPT desktop app for macOS can now read code in a handful of developer-focused coding apps, including VS Code, Xcode, TextEdit, Terminal, and iTerm2.
This means developers no longer need to copy and paste their code into ChatGPT, a common workflow with chatbots. When the feature is enabled, ChatGPT automatically receives the section of code you are working on as context, alongside your prompt.
However, unlike popular AI coding tools like Cursor and GitHub Copilot, ChatGPT currently cannot write code directly into developer apps on your behalf.
The feature, called Work with Apps, is far from an AI agent, but OpenAI says making ChatGPT understand other apps is a “key building block” toward building an agent system. One of the biggest challenges facing AI agents today is getting them to understand the rest of a computer screen, rather than just prompts and their own responses.
OpenAI says it is starting this feature with coding apps. This is likely because AI coding assistants have established themselves as one of the most popular use cases for LLMs. The feature is currently available to Plus and Teams users and will roll out to Enterprise and Edu users in the coming weeks. OpenAI says it eventually wants ChatGPT to work with other types of apps, specifically text-based apps used for writing tasks.
You can now select from a few coding apps to work with ChatGPT (Image: OpenAI)
In a demo for TechCrunch, an OpenAI employee opened an Xcode environment alongside the ChatGPT app, with a simple project modeling the solar system that was missing Earth. The employee selected the Xcode tab within ChatGPT, giving the chatbot access to the app's contents, and prompted it to “add the missing planet.” ChatGPT completed the task, writing a line of code to represent Earth that matched the format of the rest of the project. However, the employee still had to paste ChatGPT's answer into the environment.
To read from various apps, OpenAI primarily relies on the macOS Accessibility API to read text and pass it to ChatGPT, according to OpenAI desktop product lead Alexander Embiricos. The macOS screen reader, which powers Apple's VoiceOver feature, has been around for nearly 20 years. It is generally considered fairly reliable for most popular apps, but not all.
For some apps, such as Microsoft's VS Code, Work with Apps requires users to install a special extension so that ChatGPT can query the app's content. Also, as the name suggests, Apple's screen reader can only read text, so it won't help ChatGPT understand visual elements like photos, how objects are arranged on screen, or video.
For certain apps, Work with Apps sends the last 200 lines of code to ChatGPT along with every prompt. For others, whatever code is in the frontmost window is used as input to the chatbot. You can highlight sections of code or text to help ChatGPT focus on the right part of your project, but ChatGPT still receives the text around it as context. All in all, this seems likely to use a lot of input tokens.
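To make the context behavior described above concrete, here is a minimal Python sketch. It is not OpenAI's implementation: the function names, the payload layout, and the four-characters-per-token estimate are all assumptions made for illustration. Only the 200-line tail, the frontmost-window fallback, and the fact that a highlighted selection travels with its surrounding text come from the article.

```python
from typing import List

MAX_TAIL_LINES = 200   # from the article: "the last 200 lines of code"
CHARS_PER_TOKEN = 4    # assumption: a rough rule of thumb, not a real tokenizer


def tail_lines(window_text: str, limit: int = MAX_TAIL_LINES) -> List[str]:
    """Return at most the last `limit` lines of the window's text."""
    lines = window_text.splitlines()
    return lines[-limit:] if limit > 0 else []


def build_payload(prompt: str, window_text: str, selection: str = "",
                  tail_only: bool = True) -> str:
    """Assemble a hypothetical payload: optional focus, context, then prompt.

    tail_only=True models apps where only the last 200 lines are sent;
    tail_only=False models apps where the whole frontmost window is sent.
    """
    lines = tail_lines(window_text) if tail_only else window_text.splitlines()
    parts = []
    if selection:
        # The highlight narrows the focus, but the context below is still sent.
        parts.append("FOCUS:\n" + selection)
    parts.append("CONTEXT:\n" + "\n".join(lines))
    parts.append("PROMPT:\n" + prompt)
    return "\n\n".join(parts)


def estimate_tokens(text: str) -> int:
    """Very rough input-token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


if __name__ == "__main__":
    # A fake 500-line source file standing in for the frontmost window.
    window = "\n".join(f"let planet{i} = Planet()" for i in range(1, 501))
    payload = build_payload("add the missing planet", window,
                            selection="let planet450 = Planet()")
    print(estimate_tokens(payload), "tokens (rough estimate) for one prompt")
```

Because the context is attached to every prompt, a cost estimate like the one above would be multiplied by the number of messages in a session, which is why the token usage adds up quickly.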
ChatGPT working in Xcode (Image: OpenAI)
It's unclear how OpenAI plans to extend this feature to other apps that aren't compatible with Apple's screen reader. One of OpenAI's competitors, Anthropic, has released an AI system that analyzes screenshots of a user's desktop to understand and use other apps. Frankly, Anthropic's approach leaves a lot to be desired as it stands. It is slow and makes many mistakes. However, it is a more general-purpose take on an AI agent, one that doesn't rely on APIs and does more than just read text in a separate window.
“This is not intended to be an agent, but a way to get started in conjunction with coding tools, with more coming soon,” OpenAI desktop product lead Alexander Embiricos said in a press briefing with TechCrunch. “On the agent side, I think this is a really important component. The idea is that ChatGPT can understand and manipulate all of a user's content, so it can help with that.”
This work on agents is particularly notable given that OpenAI is reportedly close to releasing a general-purpose AI agent codenamed Operator, according to Bloomberg. The tool is expected to arrive in early 2025 and will rival other early attempts at general-purpose AI agents, such as Anthropic's computer use feature and Google's reported “Jarvis” agent.
OpenAI is releasing this feature on macOS first, shortly before Apple's ChatGPT integration launches in December. It's unclear when Work with Apps will come to Windows, the operating system made by Microsoft, OpenAI's biggest backer.