OpenAI releases Operator, an artificial intelligence assistant

On January 23 local time, OpenAI launched an artificial intelligence assistant called Operator that can browse the web on its own. The tool, currently available only to ChatGPT Pro users in the U.S., represents a step toward AI assistants that can work autonomously.

Operator uses GPT-4o’s vision capabilities to “see” websites and interact with them based on screenshots, clicking, typing, and scrolling through web pages without any special integration with the website itself. It is driven by a new AI model called Computer-Using Agent (CUA), which combines vision, reasoning, and the ability to interact with a graphical user interface (GUI).

The user simply tells Operator what they want to accomplish, and it takes care of the rest in a separate browser window within the ChatGPT interface. Users can tailor the experience by adding custom instructions, either for a specific site or across all sites. Frequently used prompts can be saved on the home page for easy access, and users can run multiple tasks simultaneously in different chat windows.

Under the hood, Operator runs on the CUA model, which works by processing the raw pixel content of the screen and controlling a virtual mouse and keyboard. The model combines GPT-4o’s ability to process images with advanced reasoning skills developed through reinforcement learning.

The system operates in three stages. First, it takes a screenshot of what is currently on the screen. It then uses chain-of-thought reasoning to decide what to do next, taking into account what it currently sees and what it has done before.

This “inner monologue” helps it reduce errors and improve accuracy, much like OpenAI’s o-series models. Finally, it acts by clicking, scrolling, or typing until the task is completed or the user needs to step in.
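
The three stages can be pictured as a simple loop. The following Python sketch is only an illustration under stated assumptions: OpenAI has not published Operator's implementation here, and every name in it (Action, run_agent, take_screenshot, decide, perform, run_demo) is hypothetical. The "screenshot" is a small dictionary standing in for pixels, and the decision step is a hand-written rule standing in for the model's reasoning.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Action:
    """One step chosen by the agent (hypothetical structure)."""
    kind: str            # "click", "scroll", "type", "done", or "ask_user"
    argument: str = ""


def run_agent(
    take_screenshot: Callable[[], Dict],
    decide: Callable[[Dict, List[Action]], Action],
    perform: Callable[[Action], None],
    max_steps: int = 20,
) -> Tuple[str, List[Action]]:
    """Repeat the cycle: look at the screen, reason, then act."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = take_screenshot()          # stage 1: capture the screen
        action = decide(screen, history)    # stage 2: reason over screen + history
        history.append(action)
        if action.kind == "done":
            return "completed", history
        if action.kind == "ask_user":
            return "needs_user", history    # the user has to step in
        perform(action)                     # stage 3: click, scroll, or type
    return "step_limit", history


def run_demo() -> Tuple[str, List[Action]]:
    """Toy environment: fill in a text box, then press a submit button."""
    page = {"text": "", "submitted": False}

    def take_screenshot() -> Dict:
        return dict(page)                   # a copy, like a frozen image

    def decide(screen: Dict, history: List[Action]) -> Action:
        if screen["submitted"]:
            return Action("done")
        if not screen["text"]:
            return Action("type", "hello")
        return Action("click", "submit")

    def perform(action: Action) -> None:
        if action.kind == "type":
            page["text"] = action.argument
        elif action.kind == "click" and action.argument == "submit":
            page["submitted"] = True

    return run_agent(take_screenshot, decide, perform)


if __name__ == "__main__":
    status, steps = run_demo()
    print(status, [step.kind for step in steps])
```

In this toy run the loop types into the box, clicks submit, and then stops once a fresh screenshot shows the form has been submitted. The step limit is an added safeguard in the sketch, not something the announcement describes.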

The CUA model can also correct its own errors. When it runs into a problem, Operator can analyze the situation and adjust its strategy independently; if the problem still cannot be solved, it hands control back to the user so that the task can be completed.
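
One way to picture this behavior is a fallback chain: try one approach, switch to another if it fails, and return control to the user once the options run out. The Python sketch below is a hypothetical illustration, not OpenAI's actual mechanism; the function name attempt_with_fallback, the strategy names, and the use of RuntimeError to signal a failed attempt are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Strategy = Tuple[str, Callable[[], bool]]


def attempt_with_fallback(strategies: List[Strategy]) -> Tuple[str, List[str]]:
    """Try each strategy in order; hand off to the user if all of them fail."""
    tried: List[str] = []
    for name, strategy in strategies:
        tried.append(name)
        try:
            if strategy():                  # True means the step succeeded
                return "completed", tried
        except RuntimeError:
            pass                            # treat an error as a failed attempt
    return "handed_to_user", tried          # nothing worked: return control


if __name__ == "__main__":
    # Hypothetical strategies: the first one fails, the second succeeds.
    plan = [
        ("click_button", lambda: False),
        ("use_keyboard_shortcut", lambda: True),
    ]
    print(attempt_with_fallback(plan))
```

The design choice worth noting is that the hand-off is an explicit result rather than an exception, so the caller always knows whether the task finished or is waiting on the user.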

The release of Operator is not only a new product launch but also marks a new stage in the development of AI technology. Operator signals that OpenAI has moved from Level 2 to Level 3, meaning that its AI technology has entered the stage of performing tasks. At Level 2, AI is primarily passive, answering questions or solving specific problems. At Level 3, AI begins to take the initiative to perform tasks and is no longer limited to a single field; it can draw on various capabilities to complete complex chains of tasks.
