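Project Background
With advances in battery technology and its industrial rollout, China's new energy vehicle (NEV) industry has entered a period of rapid growth. Governments at all levels have issued successive policies in continued support of NEV technology and industry, and automakers worldwide are enthusiastic about developing and deploying NEVs, continually exploring and experimenting. Compared with conventional vehicles, NEVs are more electrified, intelligent, connected, and shared. They yield richer data and can support multifaceted, in-depth data analysis.
At the same time, amid the new wave of information technology change, connected-vehicle and big data technologies bring new momentum to NEV data collection, operation analysis, battery management, and related fields.
This project analyzes NEV operating data provided by the Shanghai New Energy Vehicle Public Data Collection and Monitoring Research Center. The aim is to find the key factors that affect battery state and energy consumption, and to assess usage risk from users' driving behavior.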
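Data Description
The dataset consists of two CSV files:
SHEVDC_OV6N7709.csv contains operating data for a pure electric vehicle
SHEVDC_0C023H25.csv contains operating data for a hybrid vehicle
The fields are defined as follows:
Data is sampled once every 10 s.
I. Data Import and Preprocessing
1 Data Import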
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Electric vehicle data
data_electric = pd.read_csv('SHEVDC_OV6N7709.csv')
data_electric.head()
                  time  vehiclestatus  chargestatus  runmodel  speed  summileage  ...
0  2019-01-10 01:12:00              1             4         1    0.0     39938.0  ...
1  2019-01-10 01:12:10              1             4         1    0.0     39938.0  ...
2  2019-01-10 01:12:20              1             4         1    0.0     39938.0  ...
3  2019-01-10 01:12:30              1             4         1    0.0     39938.0  ...
4  2019-01-10 01:12:40              1             4         1    0.0     39938.0  ...
5 rows × 24 columns
# Hybrid vehicle data
data_hybrid = pd.read_csv('SHEVDC_0C023H25.csv')
data_hybrid.head()
                  time  vehiclestatus  chargestatus  runmodel  speed  summileage  ...
0  2019-01-06 15:36:27              1             3         1   79.7     69788.0  ...
1  2019-01-06 15:36:37              1             3         1   78.6     69789.0  ...
2  2019-01-06 15:36:47              1             3         1   74.2     69789.0  ...
3  2019-01-06 15:36:57              1             3         1   81.8     69789.0  ...
4  2019-01-06 15:37:07              1             3         1   74.1     69789.0  ...
5 rows × 27 columns
2 Data Checks
2.1 Checking for Null Values
# Electric vehicle
data_electric.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6231 entries, 0 to 6230
Data columns (total 24 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 time 6231 non-null object
1 vehiclestatus 6231 non-null int64
2 chargestatus 6231 non-null int64
3 runmodel 6231 non-null int64
4 speed 6231 non-null float64
5 summileage 6231 non-null object
6 sumvoltage 6231 non-null float64
7 sumcurrent 6231 non-null float64
8 soc 6231 non-null int64
9 dcdcstatus 6231 non-null int64
10 gearnum 6231 non-null int64
11 insulationresistance 6231 non-null int64
12 max_volt_num 6231 non-null int64
13 max_volt_cell_id 6231 non-null int64
14 max_cell_volt 6231 non-null float64
15 min_volt_num 6231 non-null int64
16 min_volt_cell_id 6231 non-null int64
17 min_cell_volt 6231 non-null float64
18 max_temp_num 6231 non-null int64
19 max_temp_probe_id 6231 non-null int64
20 max_temp 6231 non-null int64
21 min_temp_num 6231 non-null int64
22 min_temp_probe_id 6231 non-null int64
23 min_temp 6231 non-null int64
dtypes: float64(5), int64(17), object(2)
memory usage: 1.1+ MB
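The electric vehicle data contains 6,231 records with no null values, but the summileage field has dtype object, so we convert it to float64 for the analysis that follows.
# Convert summileage to float64; entries that cannot be parsed become NaN
data_electric['summileage'] = pd.to_numeric(data_electric['summileage'], errors='coerce')
# Forward-fill each NaN with the last valid reading
data_electric['summileage'] = data_electric['summileage'].ffill()
data_electric['summileage']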
0 39938.0
1 39938.0
2 39938.0
3 39938.0
4 39938.0
...
6226 40152.0
6227 40152.0
6228 40152.0
6229 40152.0
6230 40152.0
Name: summileage, Length: 6231, dtype: float64
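To make the effect of `errors='coerce'` followed by a forward fill concrete, here is a minimal sketch on made-up readings. The sample strings, including the `'N/A'` entry, are assumptions for illustration; the text does not show which values forced the column to load as object.

```python
import pandas as pd

# Hypothetical mileage readings; 'N/A' stands in for whatever non-numeric
# text made pandas load the real column as object.
raw = pd.Series(['39938.0', '39938.0', 'N/A', '39939.0'])

# errors='coerce' turns unparseable entries into NaN instead of raising.
numeric = pd.to_numeric(raw, errors='coerce')

# Forward fill replaces each NaN with the last valid reading before it.
filled = numeric.ffill()

print(numeric.isna().sum())  # how many entries could not be parsed
print(filled.tolist())
```

Counting the NaN values before filling is a cheap way to see how many readings were actually replaced. For a cumulative odometer, carrying the previous reading forward is a reasonable choice because the value changes slowly between samples.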
# Hybrid vehicle
data_hybrid.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3121 entries, 0 to 3120
Data columns (total 27 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 time 3121 non-null object
1 vehiclestatus 3121 non-null int64
2 chargestatus 3121 non-null int64
3 runmodel 3121 non-null int64
4 speed 3121 non-null float64
5 summileage 3121 non-null float64
6 sumvoltage 3121 non-null float64
7 sumcurrent 3121 non-null float64
8 soc 3121 non-null int64
9 dcdcstatus 3121 non-null int64
10 gearnum 3121 non-null int64
11 insulationresistance 3121 non-null int64
12 enginestatus 1689 non-null float64
13 grankshaftspeed 1689 non-null float64
14 enginefuelconsumptionrate 1689 non-null float64
15 max_volt_num 3121 non-null int64
16 max_volt_cell_id 3121 non-null int64
17 max_cell_volt 3121 non-null float64
18 min_volt_num 3121 non-null int64
19 min_volt_cell_id 3121 non-null int64
20 min_cell_volt 3121 non-null float64
21 max_temp_num 3121 non-null int64
22 max_temp_probe_id 3121 non-null int64
23 max_temp 3121 non-null int64
24 min_temp_num 3121 non-null int64
25 min_temp_probe_id 3121 non-null int64
26 min_temp 3121 non-null int64
dtypes: float64(9), int64(17), object(1)
memory usage: 658.5+ KB
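The hybrid vehicle data contains 3,121 records. Three fields, enginestatus, grankshaftspeed, and enginefuelconsumptionrate, are partly null (1,689 non-null values each). According to the data description, these three fields describe the engine state, so they are empty whenever the hybrid runs in electric mode. This is expected and needs no special handling.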
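The claim that the engine fields are empty only in electric mode can be checked rather than taken on trust. The sketch below uses a toy frame with invented values in place of `data_hybrid`. It also assumes that `runmodel` code 1 means pure-electric operation, which the field definitions would need to confirm.

```python
import numpy as np
import pandas as pd

engine_cols = ['enginestatus', 'grankshaftspeed', 'enginefuelconsumptionrate']

# Toy stand-in for data_hybrid; all values are invented.
# Assumption: runmodel == 1 denotes pure-electric operation.
toy = pd.DataFrame({
    'runmodel': [1, 1, 2, 3],
    'enginestatus': [np.nan, np.nan, 1.0, 1.0],
    'grankshaftspeed': [np.nan, np.nan, 1500.0, 1800.0],
    'enginefuelconsumptionrate': [np.nan, np.nan, 3.2, 4.1],
})

# Rows where all three engine fields are missing, and rows where any is.
all_missing = toy[engine_cols].isna().all(axis=1)
any_missing = toy[engine_cols].isna().any(axis=1)

# If the fields only ever go missing as a group, the two masks agree.
missing_as_group = bool((all_missing == any_missing).all())

# Count of engine-less rows per run mode.
missing_by_mode = all_missing.groupby(toy['runmodel']).sum()

print(missing_as_group)
print(missing_by_mode)
```

On the real data, nonzero counts for run modes other than the electric one would suggest that some nulls are true gaps rather than expected blanks.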
2.2 Data Collection Period
# Electric vehicle
print("Earliest time:", data_electric['time'].min())
print("Latest time:", data_electric['time'].max())
Earliest time: 2019-01-10 01:12:00
Latest time: 2019-01-11 12:16:18
# Hybrid vehicle
print("Earliest time:", data_hybrid['time'].min())
print("Latest time:", data_hybrid['time'].max())
Earliest time: 2019-01-06 15:36:27
Latest time: 2019-01-07 00:31:28
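Taking min and max of the raw strings works here because timestamps in `YYYY-MM-DD HH:MM:SS` form sort the same way as text and as dates. Parsing them also makes it possible to measure the span and look for gaps. This matters because the electric vehicle's 6,231 records at 10 s each cover only about 17 hours, while the first and last timestamps are roughly 35 hours apart. The sketch below shows one way to check this, using invented timestamps.

```python
import pandas as pd

# Invented timestamps: three regular samples, then a long pause.
toy = pd.DataFrame({'time': ['2019-01-10 01:12:00', '2019-01-10 01:12:10',
                             '2019-01-10 01:12:20', '2019-01-10 03:00:00']})

ts = pd.to_datetime(toy['time'])

# Total span between the first and last record.
span = ts.max() - ts.min()

# Time elapsed between consecutive records.
gaps = ts.diff().dropna()
typical_gap = gaps.median()

# Number of intervals longer than the nominal 10 s sampling period.
n_long_gaps = int((gaps > pd.Timedelta(seconds=10)).sum())

print(span, typical_gap, n_long_gaps)
```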
2.3 Descriptive Statistics
# Electric vehicle
data_electric.describe()
       vehiclestatus  chargestatus  runmodel        speed    summileage   sumvoltage   sumcurrent          soc  ...
count    6231.000000   6231.000000    6231.0  6231.000000   6231.000000  6231.000000  6231.000000  6231.000000  ...
mean        1.700690      1.563633       1.0    10.126705  40083.661692   363.703418     0.943861    58.357407  ...
std         0.457993      0.889976       0.0    21.666992     49.314093    17.014157    26.379983    26.393123  ...
min         1.000000      1.000000       1.0     0.000000  39938.000000   322.200000  -113.100000     7.000000  ...
25%         1.000000      1.000000       1.0     0.000000  40091.000000   348.500000    -9.200000    35.000000  ...
50%         2.000000      1.000000       1.0     0.000000  40091.000000   363.000000    -8.800000    64.000000  ...
75%         2.000000      3.000000       1.0     0.000000  40091.000000   377.700000     0.800000    80.000000  ...
max         2.000000      4.000000       1.0   100.700000  40152.000000   397.500000   240.300000   100.000000  ...
8 rows × 23 columns
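As shown above:
·The electric vehicle's top speed was 100.7 km/h, and its odometer rose from 39,938 km to 40,152 km (214 km driven)
·While driving, total voltage ranged from 322.2 V to 397.5 V and total current from -113.1 A to 240.3 A
·SOC (state of charge) ranged from 7% to 100%, averaging about 58%
·Cell voltage ranged from 3.35 V to 4.15 V, and battery temperature from 5 to 13 °C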
# Hybrid vehicle
data_hybrid.describe()
       vehiclestatus  chargestatus     runmodel        speed    summileage   sumvoltage   sumcurrent          soc  ...
count    3121.000000   3121.000000  3121.000000  3121.000000   3121.000000  3121.000000  3121.000000  3121.000000  ...
mean        1.458827      1.947453     1.736302    13.635854  69853.980455   361.539250    -0.641974    57.543095  ...
std         0.498382      0.928775     0.454316    23.723204     42.582988    15.284665    17.985884    26.051087  ...
min         1.000000      1.000000     1.000000     0.000000  69788.000000   330.000000  -108.000000    18.000000  ...
25%         1.000000      1.000000     1.000000     0.000000  69822.000000   348.200000    -8.300000    32.000000  ...
50%         1.000000      2.000000     2.000000     0.000000  69833.000000   359.500000    -5.700000    60.000000  ...
75%         2.000000      3.000000     2.000000    22.600000  69909.000000   374.700000     1.600000    81.000000  ...
max         2.000000      3.000000     3.000000   103.600000  69909.000000   390.200000   105.100000   100.000000  ...
8 rows × 26 columns
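As shown above:
·The hybrid vehicle's top speed was 103.6 km/h, and its odometer rose from 69,788 km to 69,909 km (121 km driven)
·While driving, total voltage ranged from 330.0 V to 390.2 V and total current from -108.0 A to 105.1 A (the peak current is markedly lower than the electric vehicle's 240.3 A)
·SOC (state of charge) ranged from 18% to 100%, averaging 57.5%
·Cell voltage ranged from 3.42 V to 4.07 V (a narrower range than the electric vehicle's), and battery temperature from 22 to 33 °C (markedly higher than the electric vehicle's)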
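The comparisons above are read off two separate `describe()` tables. Putting the key figures side by side can make them easier to check. The sketch below is one way to do that. The helper `key_stats` and the toy frames are hypothetical, with invented values; on the real data the frames would be `data_electric` and `data_hybrid`.

```python
import pandas as pd

def key_stats(df, cols=('speed', 'sumcurrent', 'soc')):
    """Return min, max and mean of the chosen columns, one row per column."""
    return df[list(cols)].agg(['min', 'max', 'mean']).T

# Toy frames with invented values, standing in for the two vehicles.
toy_electric = pd.DataFrame({
    'speed': [0.0, 40.0, 80.0],
    'summileage': [1000.0, 1005.0, 1012.0],
    'sumcurrent': [-10.0, 50.0, 200.0],
    'soc': [90, 80, 70],
})
toy_hybrid = pd.DataFrame({
    'speed': [0.0, 30.0, 90.0],
    'summileage': [2000.0, 2003.0, 2008.0],
    'sumcurrent': [-20.0, 10.0, 100.0],
    'soc': [60, 55, 50],
})

# Side-by-side table with one column group per vehicle.
comparison = pd.concat(
    {'electric': key_stats(toy_electric), 'hybrid': key_stats(toy_hybrid)},
    axis=1,
)

# Distance driven is the change in the cumulative odometer reading.
distance_electric = toy_electric['summileage'].max() - toy_electric['summileage'].min()
distance_hybrid = toy_hybrid['summileage'].max() - toy_hybrid['summileage'].min()

print(comparison)
print(distance_electric, distance_hybrid)
```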
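3 Data Preprocessing
A 10 s sampling interval is too fine for the analysis that follows, so we truncate the time field to the hour and average the records within each hour.
# Electric vehicle
def hour(time):
    # Keep 'MM-DD HH' from 'YYYY-MM-DD HH:MM:SS'
    return time[5:13]
data_electric['time'] = data_electric['time'].apply(hour)
electric_group = data_electric.groupby('time').mean()
electric_group.head()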
          vehiclestatus  chargestatus  runmodel      speed    summileage  sumvoltage  sumcurrent        soc  ...
time
01-10 01       1.000000      3.093190       1.0  14.955914  39940.243728  392.696416   12.277419  97.379928  ...
01-10 02       1.000000      2.864865       1.0  42.981081  39974.094595  371.701351   27.295495  78.216216  ...
01-10 03       1.000000      2.808989       1.0  43.995787  40014.907303  352.339607   29.968258  55.339888  ...
01-10 04       1.000000      2.816667       1.0  46.776944  40057.422222  340.896667   35.086111  29.002778  ...
01-10 05       1.707246      1.539130       1.0  10.651014  40089.739130  337.776522    2.398261   9.944928  ...
5 rows × 23 columns
electric_group
          vehiclestatus  chargestatus  runmodel      speed    summileage  sumvoltage  sumcurrent        soc  ...
time
01-10 01       1.000000      3.093190       1.0  14.955914  39940.243728  392.696416   12.277419  97.379928  ...
01-10 02       1.000000      2.864865       1.0  42.981081  39974.094595  371.701351   27.295495  78.216216  ...
01-10 03       1.000000      2.808989       1.0  43.995787  40014.907303  352.339607   29.968258  55.339888  ...
01-10 04       1.000000      2.816667       1.0  46.776944  40057.422222  340.896667   35.086111  29.002778  ...
01-10 05       1.707246      1.539130       1.0  10.651014  40089.739130  337.776522    2.398261   9.944928  ...
01-10 06       2.000000      1.000000       1.0   0.000000  40091.000000  344.259824   -9.364223  16.454545  ...
01-10 07       2.000000      1.000000       1.0   0.000000  40091.000000  347.098291   -9.257835  24.185185  ...
01-10 08       2.000000      1.000000       1.0   0.000000  40091.000000  348.884644   -9.294757  31.902622  ...
01-10 09       2.000000      1.000000       1.0   0.000000  40091.000000  350.999251   -9.327715  39.928839  ...
01-10 10       2.000000      1.000000       1.0   0.000000  40091.000000  353.709705   -9.232911  47.611814  ...
01-10 11       2.000000      1.000000       1.0   0.000000  40091.000000  358.276364   -9.192727  55.724242  ...
01-10 12       2.000000      1.000000       1.0   0.000000  40091.000000  364.793910   -9.020833  63.089744  ...
01-10 13       2.000000      1.000000       1.0   0.000000  40091.000000  371.663333   -8.827000  71.036667  ...
01-10 14       2.000000      1.000000       1.0   0.000000  40091.000000  377.689337   -8.700865  78.190202  ...
01-10 15       2.000000      1.000000       1.0   0.000000  40091.000000  384.202194   -8.329154  85.435737  ...
01-10 16       2.000000      1.000000       1.0   0.000000  40091.000000  390.633010   -8.200324  92.022654  ...
01-10 17       1.092105      2.714912       1.0  27.128509  40100.767544  386.642544   16.198684  91.315789  ...
01-10 18       1.000000      2.838889       1.0  25.690000  40130.005556  373.081667   14.096667  77.366667  ...
01-11 08       1.000000      2.921053       1.0  14.306140  40142.903509  364.174561   17.146491  69.570175  ...
01-11 09       1.795556      1.395556       1.0   3.412444  40151.773333  362.735111   -3.976889  64.471111  ...
01-11 10       2.000000      1.000000       1.0   0.000000  40152.000000  370.540553   -8.768664  71.603687  ...
01-11 11       2.000000      1.000000       1.0   0.000000  40152.000000  376.910744   -8.672314  78.942149  ...
01-11 12       2.000000      1.000000       1.0   0.000000  40152.000000  380.900000   -8.561446  83.445783  ...
23 rows × 23 columns
# Check the shape of the aggregated data
electric_group.shape
(23, 23)
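To see what the hourly aggregation does to individual records, the following sketch applies the same slicing and `groupby` to a toy frame. The timestamps and speeds are invented for illustration.

```python
import pandas as pd

def hour(time):
    # 'YYYY-MM-DD HH:MM:SS' -> 'MM-DD HH'
    return time[5:13]

# Invented records: two in the same hour, one in the next hour,
# and one in the same clock hour on the following day.
toy = pd.DataFrame({
    'time': ['2019-01-10 01:12:00', '2019-01-10 01:12:10',
             '2019-01-10 02:00:00', '2019-01-11 01:00:00'],
    'speed': [10.0, 20.0, 30.0, 50.0],
})

toy['time'] = toy['time'].apply(hour)

# One row per distinct 'MM-DD HH' key, holding the mean of each column.
toy_group = toy.groupby('time').mean()

print(toy_group)
```

Because the key keeps the month and day as well as the hour, the same clock hour on different days stays in separate groups. Hours with no records produce no row at all, which is consistent with the 23 rows above skipping from 01-10 18 to 01-11 08.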
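The hybrid dataset is smaller, so we aggregate it in 15-minute windows instead.
# Hybrid vehicle
def quarter(time):
    # Quarter-hour index within the hour: 1 for minutes 0-14, up to 4 for minutes 45-59
    m = int(time[14:16]) // 15 + 1
    return time[5:13] + ' ' + str(m)
data_hybrid['time'] = data_hybrid['time'].apply(quarter)
hybrid_group = data_hybrid.groupby('time').mean()
hybrid_group.head()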
            vehiclestatus  chargestatus  runmodel      speed    summileage  sumvoltage  sumcurrent        soc  ...
time
01-06 15 3            1.0      2.711538  1.269231  35.171154  69791.730769  361.069231    7.448077  69.769231  ...
01-06 15 4            1.0      2.844444  1.755556   9.181***  *****.433333  360.187778    2.195556  66.111111  ...
...
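The labels in the index combine the date, the hour, and a quarter-hour number from 1 to 4. The sketch below traces `quarter` on a few invented timestamps. It also shows a possible alternative, not used in this analysis, that floors parsed timestamps to 15-minute boundaries with `dt.floor('15min')`.

```python
import pandas as pd

def quarter(time):
    # Quarter-hour index within the hour: 1 for minutes 0-14, up to 4 for 45-59
    m = int(time[14:16]) // 15 + 1
    return time[5:13] + ' ' + str(m)

# Invented timestamps chosen to land in different quarter-hours.
samples = ['2019-01-06 15:36:27', '2019-01-06 15:45:00', '2019-01-06 16:14:59']

# String labels as produced by the slicing approach.
labels = [quarter(s) for s in samples]

# Alternative: parse to datetimes and floor to the start of each window.
floored = pd.to_datetime(pd.Series(samples)).dt.floor('15min')

print(labels)
print(floored.dt.strftime('%H:%M').tolist())
```

The string labels are compact but sort as text and carry no year. Floored timestamps stay true datetimes, which tends to be more convenient for plotting on a time axis.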