This technology allows Gemini to perform actions in apps such as Uber or Starbucks. Users simply give a command in natural language. The AI then carries out the necessary steps itself.
The development marks an important step in the evolution of AI assistants. Where previous assistants mainly answered questions, Gemini can now actually perform digital tasks.
According to initial tests, the function works surprisingly well. It is the first time that an AI assistant visibly navigates through apps and performs actions as if a user were tapping the screen themselves.
What is Gemini task automation?
Gemini task automation lets the AI operate apps in a kind of virtual window. The AI sees the same interface as the user and goes through the steps itself.
For example, users can say:
- “Order an Uber to the airport.”
- “Order a coffee and croissant at Starbucks.”
- “Arrange a ride to my appointment.”
Gemini then opens the appropriate app and automatically fills in fields.
During this process the user can:
- watch live
- pause the task
- take over the process
- cancel the command
The AI does not blindly perform actions. The user always remains in control.
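The control model described above can be sketched as a small state machine. Everything here (class, state, and method names) is an assumption for illustration, not Google's actual implementation:

```python
from enum import Enum

class TaskState(Enum):
    RUNNING = "running"
    PAUSED = "paused"
    HANDED_OVER = "handed_over"   # user took over the process
    CANCELLED = "cancelled"

class AgentTask:
    """Hypothetical wrapper around an agent task: the user can pause,
    take over, or cancel at any point while watching live."""

    def __init__(self, steps):
        self.steps = list(steps)
        self.state = TaskState.RUNNING
        self.completed = []

    def run_next_step(self):
        # The agent only acts while the task is actually running.
        if self.state is not TaskState.RUNNING or not self.steps:
            return None
        step = self.steps.pop(0)
        self.completed.append(step)
        return step

    def pause(self):
        self.state = TaskState.PAUSED

    def resume(self):
        if self.state is TaskState.PAUSED:
            self.state = TaskState.RUNNING

    def take_over(self):
        # Remaining steps are left for the user to perform manually.
        self.state = TaskState.HANDED_OVER

    def cancel(self):
        self.state = TaskState.CANCELLED
        self.steps.clear()
```

The point of such a design is that the agent never advances a step the user has not allowed it to take.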
Practical example: Order Uber via Gemini
In an initial test, Gemini was instructed to order an Uber to the airport.
The AI first asked a logical follow-up question: which airport the user wanted to travel to. Then Gemini automatically performed the following steps:
- Open the Uber app
- Enter the destination
- Select relevant options
- Prepare the ride
Before the ride was finally ordered, Gemini asked for confirmation.
This last point is important: Google prevents the AI from making payments on its own, without the user's permission.
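That confirmation requirement can be sketched as a simple gate: the agent may prepare everything, but a sensitive step (like placing a paid order) only runs after explicit approval. The names below are hypothetical, not part of any real Gemini API:

```python
# Minimal sketch of a confirmation gate, assuming a step is a dict
# with a "sensitive" flag; all names here are illustrative only.
def execute_step(step, confirm):
    """Run a step, but block sensitive ones (like a payment)
    unless the user's confirm() callback explicitly approves."""
    if step.get("sensitive") and not confirm(step):
        return "blocked"
    return "executed"
```

In such a design the agent can fill in every field and select every option, yet the final paid action still needs a human "yes".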
Gemini even chooses how your croissant is heated
A second test went one step further.
The assignment was simple: order a coffee and croissant at Starbucks.
Gemini had to:
- scroll through the menu
- choose a drink
- add a croissant
- select options
During the process, the AI was even faced with a small choice: should the croissant be heated?
Gemini itself decided that the pastry was better served warm. According to the tester, that turned out to be exactly the right choice.
It shows that AI not only carries out steps, but can also make small decisions.
AI assistants are changing from chatbots to digital employees
This development fits into a larger trend. AI assistants are evolving from chatbots to digital task performers.
Until now, users always had to operate apps themselves. At most, AI could provide advice.
With task automation, that model fundamentally changes.
The AI can now directly:
- schedule appointments
- order groceries
- book trips
- arrange services
The goal is for users to need fewer and fewer manual actions on their smartphones.
Why this step is important for AI
Many tech companies have been trying for years to build an AI assistant that can actually perform tasks.
Until now, these projects often got stuck due to technical problems.
Apps are designed for human interaction, not for software that presses buttons on its own.
Gemini solves that problem by simply using the interface as a human would. The AI looks at the screen and performs actions.
That makes the technology much more flexible.
New integrations can be added relatively quickly without having to adapt apps specifically for AI.
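The screen-driven approach described above can be sketched as a perceive-act loop: the agent sees only what is on screen and emits the same taps and keystrokes a user would. Every name below (`screenshot`, `next_action`, `perform`) is an assumption for illustration:

```python
# Hypothetical perceive-act loop for screen-based automation.
# The agent needs no app-specific integration: it looks at pixels
# and acts through the same interface a human uses.
def run_agent(goal, screen, model, max_steps=20):
    for _ in range(max_steps):
        observation = screen.screenshot()              # what a human would see
        action = model.next_action(goal, observation)  # e.g. {"type": "tap", "x": ..., "y": ...}
        if action["type"] == "done":
            return True
        screen.perform(action)                         # tap, type, scroll, ...
    return False                                       # give up after too many steps
```

Because the loop only assumes "a screen" and "an action", new apps work without any AI-specific adaptation, which is exactly what makes this approach flexible.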
First impressions: impressive but still experimental
The function is currently still in a testing phase.
The first experiences are positive, but the system needs to be tested more extensively.
More complex assignments in particular can still cause problems, for example when multiple apps are needed at the same time or when choices are ambiguous.
Yet experts see this development as an important step towards the next generation of digital assistants.
The smartphone is slowly turning into a device that performs tasks instead of just showing information.
