The 7th World New Energy Vehicle Conference runs from September 27 to 29, 2025, in Haikou, Hainan. Under the theme "Industrial Transformation and Sustainable Development", the conference focuses on international cooperation, industrial development, and technological progress in new energy vehicles, and brings together representatives from government, industry, academia, and research institutions in multiple countries for dialogue and exchange.
On September 28, at the conference's special forum on "AI Empowering Automotive Software and Hardware Integration Innovation", Xie Tian, General Manager of the Baidu Maps Business Unit, delivered a keynote speech sharing Baidu Maps' thinking and innovative practices in automotive intelligence, and released Baidu Maps' newly upgraded travel intelligent agent, Xiaodu Thinking 2.0. He said, "The competition in automotive intelligence has evolved from traditional in-car voice assistants to truly intelligent AI agents that can understand, reason, and act. In-car intelligence is no longer just a conversation tool, but a partner that understands you and serves you. The tipping point for large-scale deployment of intelligent agents in vehicles has arrived!"
The industry's first travel intelligent agent to deeply integrate an end-to-end speech-language model
Baidu Maps relies on three core advantages, an advanced foundation model, massive and scarce data, and a large user base, to achieve breakthroughs at the key stages of deploying intelligent agents in vehicles, turning surging user demand into a comprehensive upgrade of the travel experience. On the foundation model side, Baidu's latest-generation Wenxin large model X1.1 delivers a leap in factual accuracy, instruction following, and agent capabilities, providing a solid base for in-vehicle semantic understanding and intelligent decision-making. On the data side, Baidu Maps has built up its business over 20 years and holds ultra-large-scale spatiotemporal data that is scarce in the industry, covering real lane-level data for more than 360 cities nationwide and over 300 million high-quality POI records, which gives its intelligent agent distinctive spatiotemporal understanding and scene perception capabilities. On the user side, Baidu Maps is a nationwide AI travel application serving hundreds of millions of users. At the same time, Baidu AI voice has been deployed in mass production in over 15 million cars across more than 800 models, supporting over 500 million in-car voice interactions per month, an industry-leading advantage validated by real users and large-scale practice. From digital travel to intelligent agent travel, Baidu Maps is driving a paradigm shift in the travel experience.
Xie Tian said, "On this basis, Baidu Maps launched Xiaodu Thinking 1.0 in April this year, the world's first travel intelligent agent with the full-chain capability of memory, reasoning, and decision-making. Now, the newly upgraded Xiaodu Thinking 2.0 has been officially released. As the industry's first travel intelligent agent to deeply integrate an end-to-end speech-language model, it can not only listen, see, and understand, but also coordinate efficiently among different intelligent agents to turn user needs into optimal solutions, truly making travel simpler and making AI understand you better."
The core advantages of the Xiaodu Thinking 2.0 product architecture fall into three areas. First, it introduces a dedicated map and travel knowledge base together with Baidu real-time search data, further strengthening its understanding of and reasoning about complex travel intentions. Second, it builds cross-device memory that supports scene memory, session memory, persistent memory, and system memory, keeping services consistent and continuous across mobile phones, in-vehicle systems, and scene switches. Third, it comprehensively upgrades end-to-end cross-modal interaction: the industry's first end-to-end speech-language model can efficiently coordinate multiple vertical intelligent agents in the vehicle from multi-dimensional input. Compared with traditional speech models, it offers more comprehensive functions, faster responses, and a more immersive experience.
The industry's first travel intelligent agent to support cross-device memory
As intelligent agents evolve, what truly determines their level of intelligence is not only the speed and naturalness of interaction, but also whether they have sustained memory and comprehension. Based on this judgment, Xiaodu Thinking 2.0 introduces cross-device memory for the first time, aiming to give users a consistent and continuous memory experience across terminals and scenarios such as mobile phones, in-vehicle systems, and the cloud, so that services break through device boundaries and keep meeting user needs.
Xie Tian said, "The core value of cross-device memory is that it can not only record every user interaction, but also deeply understand the intentions and interests behind user behavior. With the help of large models, it can intelligently analyze and summarize users' habits and infer potential needs that have not been explicitly expressed." This means that memory is no longer passive information storage, but active, intelligent insight.
In practical scenarios, cross-device memory works at three levels: real-time actions, recent history, and long-term preferences. For real-time actions, after a user searches for a destination on their phone, they can simply say "navigate to the location I just searched" to the in-vehicle system, and the system picks up seamlessly and quickly starts navigation. For recent history, if a user asks to "take me to the restaurant from last week and follow the route I am familiar with", the intelligent agent can recall and reuse the recent travel trajectory to provide a tailored service. For long-term preferences, the agent learns users' interests through long-term accumulation, so a request such as "find a highly rated restaurant within 3 kilometers based on my taste" yields personalized recommendations. Cross-device memory means that the more Xiaodu Thinking 2.0 is used, the better it works and the better it understands the user.
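The article does not describe how this memory is implemented. As a purely illustrative sketch of the three memory levels described above, the toy store below keeps events from all devices in one place and answers one query per level. Every name here (`Event`, `CrossDeviceMemory`, the field names, the 30-minute and 14-day windows) is a hypothetical assumption, not part of any Baidu product or API.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Event:
    """One user interaction recorded on any device (all fields hypothetical)."""
    device: str      # e.g. "phone" or "car"
    kind: str        # "search" or "visit"
    place: str       # name of the place involved
    category: str    # e.g. "restaurant"
    tag: str         # e.g. cuisine type
    when: datetime


class CrossDeviceMemory:
    """Toy shared store; a real system would presumably sync through the cloud."""

    def __init__(self) -> None:
        self.events: List[Event] = []

    def record(self, event: Event) -> None:
        self.events.append(event)

    def last_search(self, now: datetime,
                    within: timedelta = timedelta(minutes=30)) -> Optional[str]:
        """Real-time level: the place searched most recently on any device."""
        hits = [e for e in self.events
                if e.kind == "search" and timedelta(0) <= now - e.when <= within]
        return max(hits, key=lambda e: e.when).place if hits else None

    def recent_visit(self, category: str, now: datetime,
                     days: int = 14) -> Optional[str]:
        """Recent-history level: the latest visit in a category within `days`."""
        hits = [e for e in self.events
                if e.kind == "visit" and e.category == category
                and timedelta(0) <= now - e.when <= timedelta(days=days)]
        return max(hits, key=lambda e: e.when).place if hits else None

    def preferred_tag(self, category: str) -> Optional[str]:
        """Long-term level: the tag seen most often across all visits."""
        counts = Counter(e.tag for e in self.events
                         if e.kind == "visit" and e.category == category)
        return counts.most_common(1)[0][0] if counts else None
```

In this sketch, "navigate to the location I just searched" would map to `last_search`, "the restaurant from last week" to `recent_visit`, and "based on my taste" to `preferred_tag`; a production system would use far richer signals than simple counts and time windows.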
Four major scenarios upgrade the travel experience for the intelligent era
Xiaodu Thinking 2.0 brings intelligent capabilities to the entire travel process, from itinerary planning and departure to companionship on the road, covering four core scenarios: AI search, AI travel, AI navigation, and AI companion, and giving users a smarter and more attentive travel experience.
In the AI search scenario, Xiaodu Thinking 2.0 combines large models with content from across the web, along with users' historical behavior and remembered preferences, to deliver efficient place-finding tailored to each individual user. Users can not only ask "which restaurant serves the best instant-boiled mutton", but also make scenario-based requests such as "a good place nearby to watch the sunset", and the system can understand them accurately and give the best recommendations. AI search is no longer cold keyword matching, but a process of understanding needs and proactively offering the best solution.
In the AI travel scenario, which covers holiday and leisure trips, Xiaodu Thinking 2.0 applies AI capabilities across the entire travel chain. Homepage recommendations proactively present personalized content based on user profiles, real-time weather, and current location; travel planning combines user preferences, travel big data, and authoritative rankings to generate customized plans; and during the journey, it can dynamically recommend scenic spots and restaurants along the route. The travel experience is evolving from "people searching for information" to "information understanding people". It does not just tell you where to eat and what to do, but understands your preferences and proactively plans, recommends, and reminds, making every trip easier and more attentive.
In the AI navigation scenario, Xiaodu Thinking 2.0 can deeply understand natural-language instructions and adjust routes flexibly. Based on real lane-level data and learning algorithms, it dynamically recommends the optimal lane, supporting following and lane changes within seconds. At the same time, it combines dynamic road information to provide a full range of dynamic and static safety alerts that safeguard driving. This is not only an upgrade to navigation, but also a demonstration of real-time AI understanding and decision-making in the intelligent cockpit, allowing the driving experience to evolve from "route guidance" to full-process safety protection.
In the AI companion scenario, Xiaodu Thinking 2.0 for the first time combines speech recognition, semantic understanding, emotional speech synthesis, and knowledge of over 300 million map POIs, deeply integrating an end-to-end speech-language model to achieve real-time dialogue and emotional interaction. Whether commuting or traveling, users can talk naturally with the AI and receive smarter, more attentive companionship. From now on, Xiaodu is no longer a cold tool, but a genuinely warm travel companion.
Xiaodu Thinking 2.0 is now officially and fully open to intelligent vehicles
As pioneers of automotive intelligence, new energy vehicles lead the direction of intelligent development for the entire automotive industry. Map navigation and cockpit voice are highly visible to users and used very frequently, so they directly affect the driving experience and have become the front line of development and competition in automotive intelligence. In the past, however, map navigation and cockpit voice were treated as two relatively independent subfields. Although they were combined, they did not achieve deep data integration or functional fusion, so the user experience remained fragmented. On the traditional technology path, deeply integrating the two poses great technical challenges.
Relying on its large-model capabilities, Baidu Maps has deeply integrated and rebuilt map navigation and cockpit voice interaction, bridging this technological gap, and has officially released Xiaodu Thinking 2.0, a travel intelligent agent fully open to the automotive industry, leading the innovation of future travel paradigms and creating a new experience of intelligent travel interaction.
In recent years, Baidu Maps has made long-term strategic investments in the intelligent automotive industry and has become the most trusted partner for its customers. In the future, Baidu Maps will continue to open up more AI capabilities and work with automakers and ecosystem partners to accelerate the democratization of AI and the evolution of intelligent cars, ushering in a new era of intelligent travel.