Baidu, known as the Google of China, has unveiled an artificial intelligence (AI) agent for use in daily life, work, and learning. Baidu's new AI agent system, 'GenFlow 2.0', integrates AI into every stage of content handling, from creation, editing, storage, and management to search, utilization, and sharing. Notably, its AI camera lets users capture and upload content regardless of format, which simplifies their work. Of the AI experience, Baidu stated, "There is nothing it cannot do and no place where it is not applicable (ubiquitous and omnipotent)."

Wang Ying, Baidu's vice president, announces new services at the AI Day on Oct. 10. /Beijing = Lee Eun-young, correspondent

On the afternoon of the 10th, Baidu held a joint technology demonstration of 'Baidu Wenku' and 'Baidu Drive' at its 'AI Day' event near its headquarters in Haidian District, Beijing. Baidu Wenku is Baidu's AI-based document creation tool, which supports the generation of documents, PPTs, posters, and more. It also supports search across more than 1.4 billion document resources, including practical documents, educational materials, qualification exam resources, and contract templates. The service operates in conjunction with Baidu Drive.

Baidu Wenku has significantly improved its content generation capabilities by introducing R1, a generative AI model from DeepSeek. By incorporating GenFlow 2.0, the multi-agent system developed by Baidu, it has been reborn as an AI agent optimized for daily life, work, and learning.

GenFlow 2.0, unveiled that day, is an AI system that intelligently orchestrates users' complex and diverse instructions. It structures user data into short-term and long-term memory and draws on users' past tasks and preferences as it works. Because it understands the context of users' work instructions, it can handle a variety of complex tasks efficiently in an optimal flow. It supports integration not only with Baidu Wenku and Baidu Drive but also with external platforms.

The AI camera operated by Baidu Drive. After capturing real estate documents and entering 'summarize', the results appeared within 2-3 seconds. All of Baidu's generative AI features are available on the left camera screen. /Baidu Drive app capture

Baidu also introduced an 'AI camera' that day. According to Baidu, the AI camera is the industry's first 'multimodal input-processing-output integrated system' covering daily life, learning, and work. When users input different kinds of data, such as text, photos, and videos, through the camera, the AI analyzes and processes the data and delivers the result in the required form. Baidu explained, "With just a single capture using the camera, storage and automatic classification, search, scanning, correction, and printing can all be done in a one-stop content management solution."

It can assist with generating and editing text and documents, summarizing, and translating, as well as drawing and learning. For instance, beyond AI-based photo editing, it can recognize photographed math solutions to create error notebooks, or organize paper documents such as receipts into Excel. Detailed features include inserting signatures into documents, removing watermarks, converting file formats, and producing identification photos tailored to the intended use and required size.

The biggest advantage is that all these features are consolidated in one place. Users can open Baidu Wenku or Baidu Drive and simply press the camera button to capture anything or retrieve their work, gaining access to every function without switching between different mobile applications.

Wang Ying, Baidu's vice president, announces the achievements of the Baidu Wenku AI service on Oct. 10. /Courtesy of Baidu

According to Baidu, the number of monthly active users (MAU) of Baidu Wenku's AI services has reached 97 million. Baidu Drive's MAU has surpassed 150 million, ranking first among applications in China. Baidu Drive has also obtained four international certifications for the protection of information uploaded to it.

Wang Ying, Baidu's vice president overseeing the Baidu Wenku and Drive businesses, noted, "Simply having an excellent large language model (LLM) is not enough for AI services to be competitive. Many models perform well, but the actual service success rate is not even half. The response is slow and usability is poor," adding, "Ultimately, what matters is a deep understanding and observation of users. Understanding and observation are the most crucial factors that enabled us to accomplish this work."

Vice President Wang stated, "Our ultimate goal is to establish an automated pipeline for input, processing, and output based on AI. This pipeline will be provided to users through various channels such as the Drive, document tools, portals, smartphone photo albums, and Internet of Things (IoT) devices," and remarked, "Baidu aims to create an environment where users can naturally experience AI functionalities anytime and anywhere, through any device."