This article is based on an interview with Changan Automobile.
1. Changan Automobile Business Scenario
Chongqing Changan Automobile Co., Ltd. is one of China’s four major automobile groups. It develops, manufactures, and sells a full range of passenger cars and commercial vehicles. Its main products include passenger cars, small commercial vehicles, light trucks, minivans, large and medium-sized buses, and a full range of engines. Changan Automobile takes "leading automobile civilization and benefiting human life" as its mission, keeps customers at the center and products as its main line, and continuously provides high-quality products and services. It is pursuing its third venture, an innovation-driven business plan, to transform itself into an intelligent, low-carbon mobility technology company and to become a world-class automobile enterprise.
Digital technology has deeply reshaped the automobile. The car has gradually evolved from a mechanical product equipped with electronic functions into an electronic product equipped with mechanical functions. The integration of cloud, data, and AI technology is turning the car into a large intelligent mobile terminal, a data acquisition carrier, an energy storage unit, and a mobile multifunctional space. In short, the smart car is becoming a wheeled mobile robot with a multifunctional cabin. Against this background, national policy strongly encourages the development of intelligent vehicles and autonomous driving through industry-university-research collaboration, and leading domestic OEMs and new entrants are making intelligent connected vehicles or autonomous driving their core business.
Changan Automobile focuses on the development of intelligent vehicles. Its application landscape includes intelligent vehicle control (remote control analysis), intelligent cockpit (in-vehicle event tracking, vehicle user behavior analysis), IoT data access (access detection, vehicle activation monitoring), and IoT vehicle condition management (vehicle condition monitoring, power research, Internet of Vehicles dashboards). Business-facing applications include the eye-catching system, the vehicle remote diagnosis and remote early warning system, and the remote debugging system.
To support the construction of intelligent vehicles, Changan Automobile has built an intelligent vehicle data platform. This big data processing platform is divided into five layers: data access, data storage, resource scheduling, computing engine, and business presentation. IoTDB is currently used mainly in the data storage layer to manage the massive time series data of the Internet of Vehicles.
Industrial scenarios for intelligent vehicles involve collecting a large amount of time series data from vehicle equipment and sensors, which places heavy demands on the completeness and efficiency of the time series data solution. Changan Automobile’s previous time series data solution had obvious limitations, so the company looked for a better way to write, store, query, and analyze time series data in the intelligent vehicle domain.
Based on the characteristics and advantages of the IoTDB time series database, Changan Automobile chose IoTDB as the time series data solution for its large fleet of intelligent connected vehicles. The result was flexible scaling of massive data writing and storage, improved query performance, and lower equipment and operations costs.
2. Pain points of business requirements
2.1 150 million data measuring points, with over 10 million new data points per second
As one of China’s four major automobile groups, Changan Automobile has a huge business volume and must handle a large number of vehicles and a large amount of data. At present, its time series data management system for connected vehicle condition data has about 570,000 vehicles connected, with a total of about 150 million data measuring points and more than 10 million data points added every second. At this data volume, Changan Automobile places high demands on the real-time write, compression, and storage capabilities of a time series database.
2.2 High acquisition frequency for high-speed signals
The vehicle signals collected by Changan Automobile fall into two groups: high-speed signals and conventional signals. High-speed signals must be sampled at millisecond intervals, and conventional signals every 3 to 4 seconds. The time series data solution must support both acquisition frequencies at the same time and keep the real-time, low-latency acquisition of high-speed signals running continuously.
2.3 Low latency data query
Vehicle data queries at Changan Automobile mainly consist of efficient queries over multiple time series of a single vehicle and queries for the latest point of all time series of a single vehicle. These are the classic real-time (latest) vehicle condition and offline (historical) vehicle condition queries of the Internet of Vehicles scenario. The time series data solution needs to support both real-time queries and queries over a large amount of stored historical data with low latency.
Under the previous solution, vehicle condition data was written to HBase. To analyze it later, the incremental data had to be fully exported from HBase, and this batch range read took a long time. For example, to process yesterday’s data offline, the operational data had to be exported from HBase and loaded into Hadoop for analysis the following morning. In addition, as business volume grew, batch reads from HBase took longer and longer, which made the approach unsuitable for rapidly expanding data volumes.
2.4 High cost and maintenance difficulty
Changan Automobile initially adopted HBase for time series data management. Faced with tens of millions of new data points per second, the HBase cluster needed 25 data nodes to handle writes, and so many nodes made the system difficult and costly to maintain.
3. Reasons for selecting IoTDB
3.1 The data model ensures high scalability, low cost, and high stability
To handle Changan Automobile’s large existing data volume and high rate of new data, IoTDB’s native IoT time series model organizes time series data hierarchically by device and measuring point (sensor). As data volume grows, the hardware of the query nodes can be expanded directly without interrupting normal operation, which allows scaling within seconds and reduces management and operations costs.
3.2 Write throughput of tens of millions of data points per second
To meet Changan Automobile’s need for frequent time series writes, IoTDB can write tens of millions of data points per second and handle billions of data points across many devices. The write rate does not drop as data volume grows and stays at a stable, high level.
3.3 Support for efficient detail queries and latest-value queries
Changan Automobile uses IoTDB for detail queries over massive connected vehicle condition data in the vehicle fault alarm scenario. IoTDB’s storage architecture combines the timestamp of each data point with a multi-level storage path that starts at the root node. Time series data of different dimensions can therefore be classified and stored effectively, and a query can locate the target time series in IoTDB quickly and unambiguously, so query efficiency is maintained even for detail queries.
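The idea that a full path plus a timestamp identifies exactly one data point can be illustrated with a small in-memory model. This is a toy sketch, not IoTDB’s storage engine: the class name, method names, and the example path are all hypothetical, and it only shows why a time-ordered list per path makes both range reads and latest-value reads cheap.

```python
import bisect
from collections import defaultdict


class ToySeriesStore:
    """Toy model: each full path owns a time-ordered list of points."""

    def __init__(self):
        self._times = defaultdict(list)   # path -> sorted timestamps
        self._values = defaultdict(list)  # path -> values aligned with timestamps

    def write(self, path, timestamp, value):
        times, values = self._times[path], self._values[path]
        i = bisect.bisect_left(times, timestamp)
        if i < len(times) and times[i] == timestamp:
            values[i] = value  # same path and timestamp: overwrite the point
        else:
            times.insert(i, timestamp)
            values.insert(i, value)

    def range_query(self, path, t1, t2):
        """Return (time, value) pairs with t1 < time < t2."""
        times = self._times.get(path, [])
        values = self._values.get(path, [])
        lo = bisect.bisect_right(times, t1)
        hi = bisect.bisect_left(times, t2)
        return list(zip(times[lo:hi], values[lo:hi]))

    def last(self, path):
        """Return the newest (time, value) pair, or None if the path is empty."""
        times = self._times.get(path)
        if not times:
            return None
        return times[-1], self._values[path][-1]


# Hypothetical path and values, for illustration only.
store = ToySeriesStore()
speed = "root.can_condition.tuid_001.can101_vehicle_speed"
for ts, v in [(30, 64.0), (10, 60.0), (20, 62.5)]:
    store.write(speed, ts, v)

detail = store.range_query(speed, 10, 30)
latest = store.last(speed)
```

Because each path keeps its points sorted by time, the range read is a pair of binary searches and the latest-value read is a single lookup at the end of the list.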
3.4 Faster batch reads and real-time analysis
To meet Changan Automobile’s need for fast batch reads before data analysis, IoTDB can synchronize its underlying data files directly, thanks to its time index. Because IoTDB integrates with other big data systems, the Spark engine can analyze the stored TsFile files directly and in real time. Compared with the original solution, this reduces the number of exported copies needed for analysis and improves the efficiency of analysis and computation.
3.5 Timely operations support and active validation
The Changan Automobile project team believes that when a business scenario poses new challenges to a time series data solution, the speed and approach of the response are very important. The team behind a good time series data solution needs to solve problems quickly and extend the database with the performance improvements and features the business scenario requires. When Changan Automobile encounters problems in the production environment, the IoTDB R&D team quickly coordinates the relevant R&D resources to help. As a team with more than ten years of experience in researching and serving industrial users, its members are keen to validate IoTDB in industrial Internet scenarios, so that project success and product maturity reinforce each other. The Changan Automobile project team expressed its gratitude to the IoTDB development team.
The technical advantages of IoTDB address the pain points of Changan Automobile’s time series data management, so Changan Automobile chose IoTDB to build the data storage layer of the Changan smart car data platform. The following sections describe the solution architecture, the time series model design, and the query performance of the large-scale time series data management system that Changan Automobile built on IoTDB.
4. Solution architecture
The Internet of Vehicles is a typical Internet of Things scenario. The main data in IoT scenarios is time series data, whose life cycle is divided into six stages: collection, caching, processing, storage, query and analysis, and visualization.
In Changan’s Internet of Vehicles scenario, devices and sensors such as the TBOX and THU collect vehicle data (such as engine EFI data, engine speed, and vehicle speed). Once the data reaches the cloud, messages are ingested over Changan Automobile’s private TCP protocol by a gateway built on Netty. After passing through the CLB into the TU-GW application running on K8s, each message is parsed and sent to the message queue, and the time series messages required by different services are distributed to different storage endpoints. After the data is written to the storage engine, the Changan Automobile TSP business system and app query the time series data for the latest values and for historical vehicle conditions.
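The parse, queue, and distribute steps of this pipeline can be sketched in a few lines. This is a simplified illustration under stated assumptions: the real messages use Changan Automobile’s private TCP protocol, which is not described here, so the comma-separated record format, the signal names, the store names, and both function names below are invented for the example.

```python
from collections import defaultdict
from queue import Queue


def parse_message(raw: str) -> dict:
    """Parse a hypothetical 'tuid,signal,timestamp,value' text record."""
    tuid, signal, timestamp, value = raw.split(",")
    return {"tuid": tuid, "signal": signal,
            "time": int(timestamp), "value": float(value)}


def dispatch(queue: Queue, routes: dict) -> dict:
    """Drain the queue and hand each message to every store subscribed to its signal."""
    stores = defaultdict(list)
    while not queue.empty():
        msg = queue.get()
        for store_name in routes.get(msg["signal"], []):
            stores[store_name].append(msg)
    return dict(stores)


# Hypothetical messages and routing table, for illustration only.
message_queue = Queue()
for raw in ["tuid_001,vehicle_speed,1000,62.5",
            "tuid_001,engine_rpm,1000,2100"]:
    message_queue.put(parse_message(raw))

routes = {
    "vehicle_speed": ["history_store", "latest_store"],
    "engine_rpm": ["history_store"],
}
stores = dispatch(message_queue, routes)
```

Keeping the routing table separate from the parsing step means a new consumer can be added by editing the table only, which is the main reason to place a message queue between the gateway and the storage endpoints.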
The architecture of Changan Automobile’s large-scale time series data management has two versions, 1.0 and 2.0. In version 1.0, the time series storage engine for vehicle condition scenarios was HBase, which handled the writing of historical vehicle condition data. Because of the huge volume of historical data, 25 HBase data nodes were needed to handle cluster writes, and the version 1.0 HBase cluster was also configured with 10 RegionServers. Because the HBase storage architecture could not support latest vehicle condition queries on its own, those queries were implemented on Redis.
Version 2.0 writes data into IoTDB through Kafka. The test scenario uses a single high-I/O machine with large memory (about 384 GB) and all-SSD storage (about 50 TB). With a well-designed IoTDB data schema, one IoTDB machine replaced the write function of the 25 HBase nodes and has run stably for more than one year. At present, the test load is about 1.5 million records per second, and each record contains 16 to 17 measuring points on average, so the system stably sustains an overall write volume on the order of tens of millions of data points per second.
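As a rough check, the record rate and the average number of measuring points per record can be multiplied to estimate the write rate in data points. The sketch below uses only the two figures quoted in this section; the function name is illustrative.

```python
def points_per_second(records_per_second: int, points_per_record: int) -> int:
    """Estimate data points written per second from the record rate."""
    return records_per_second * points_per_record


# Figures quoted in the text: about 1.5 million records per second,
# each with 16 to 17 measuring points on average.
low = points_per_second(1_500_000, 16)
high = points_per_second(1_500_000, 17)
print(f"{low:,} to {high:,} data points per second")
```

The estimate works out to roughly 24 to 25.5 million data points per second on the single machine.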
The query capability of IoTDB also allows version 2.0 to support both the time range query for a single vehicle (real-time query) and the latest-point query over all time series of a single vehicle (latest vehicle condition query) with a single engine, and both types of query stably return results within milliseconds.
With the improved write and query capability, the IoTDB-based architecture greatly reduces the complexity of the original HBase+Redis solution and increases the number of devices and the volume of data that can be handled. At present, a single IoTDB instance at Changan Automobile serves about 570,000 connected devices and manages about 150 million time series.
5. Time series model and query application
The IoTDB time series model used here has four levels of nodes: root node, service name, device layer, and sensor layer. At the device layer, Changan Automobile uses the TUID of the TBOX, that is, the device ID, as the third-level node. At the sensor (measuring point) layer, the fourth-level node is the CANID joined to the signal name with an underscore. This storage structure makes it convenient to use the select last * from root.CANID command supported by IoTDB to query all the latest data values of a single vehicle under a given CANID, so that IoTDB can serve Changan Automobile’s main query scenarios at the same time: real-time vehicle condition and historical vehicle condition queries.
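A minimal sketch of how paths and query strings could be assembled under this four-level model is shown below. It only builds strings and does not connect to IoTDB. The service name can_condition follows the sample statement in this article; the device ID, CANID, signal name, the upper time bound t2, and all three helper functions are hypothetical, and spelling out the device path in the last-value query is an assumption about how the abbreviated command would look for a concrete device.

```python
def series_path(service: str, tuid: str, can_id: str, signal: str) -> str:
    """Full path: root -> service -> device (TUID) -> CANID_signal."""
    return f"root.{service}.{tuid}.{can_id}_{signal}"


def time_range_query(service: str, tuid: str, can_id: str, signal: str,
                     t1: int, t2: int) -> str:
    """Historical query for one signal of one device (t2 is an assumed upper bound)."""
    return (f"select {can_id}_{signal} from root.{service}.{tuid} "
            f"where time > {t1} and time < {t2}")


def latest_query(service: str, tuid: str) -> str:
    """Latest value of every time series under one device."""
    return f"select last * from root.{service}.{tuid}"


# Hypothetical identifiers, for illustration only.
path = series_path("can_condition", "tuid_001", "can101", "vehicle_speed")
range_sql = time_range_query("can_condition", "tuid_001", "can101",
                             "vehicle_speed", 1000, 2000)
latest_sql = latest_query("can_condition", "tuid_001")
print(path)
print(range_sql)
print(latest_sql)
```

Putting the TUID above the measuring points keeps all series of one vehicle under a single prefix, so both query types address the same device path.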
With the support of this data structure, Changan Automobile uses the following statements to run common query scenarios:
1. Query the time range of a single device
The SQL statement is as follows:
select canid_signal from root.can_condition.tuid where time > t1 and time …
6. Future prospect
Since 2020, Changan Automobile and IoTDB have worked together for two years and have built a stable and effective remote monitoring system for intelligent connected vehicles. Because the intelligent connected vehicle business is at the start of rapid growth, the number of vehicles and the density of collected data in the Internet of Vehicles are expected to expand greatly, and the amount of time series data will grow exponentially. Handling massive source data remains the main challenge that Internet of Vehicles systems will face.
In the future, Changan Automobile and IoTDB will work together to further expand the volume of vehicle data access, enrich the applications and functions of IoTDB in Internet of Vehicles scenarios, and manage the time series data of more Internet of Vehicles services effectively.