Using topic modeling to develop multi-level descriptions of naturalistic driving data from drivers with and without sleep apnea |
| |
Affiliation: | 1. University of Wisconsin-Madison, 1415 Engineering Drive, Madison, WI 53706, United States; 2. University of Iowa Hospitals and Clinics, 200 Hawkins Drive, Iowa City, IA 52242, United States; 3. University of Iowa, 145 N. Riverside Drive, Iowa City, IA 52242, United States; 4. University of Nebraska Medical Center, 42nd and Emile, Omaha, NE 68198, United States |
| |
Abstract: | One challenge in using naturalistic driving data is producing a holistic analysis of these highly variable datasets. Typical analyses focus on isolated events, such as large g-force accelerations indicating a possible near-crash. Examining isolated events is ill-suited for identifying patterns in continuous activities such as maintaining vehicle control. We present an alternative approach that converts driving data into a text representation and uses topic modeling to identify patterns across the dataset. This approach enables the discovery of non-linear patterns, reduces the dimensionality of the data, and captures subtle variations in driver behavior. In this study, topic models were used to concisely describe patterns in trips from drivers with and without untreated obstructive sleep apnea (OSA). The analysis included 5000 trips (50 trips from each of 100 drivers; 66 drivers with OSA; 34 comparison drivers). Trips were treated as documents, and speed and acceleration data from the trips were converted to “driving words.” The identified patterns, called topics, were determined based on regularities in the co-occurrence of the driving words within the trips. This representation was used in random forest models to predict the driver condition (i.e., OSA or comparison) for each trip. Models with 10, 15 and 20 topics had better accuracy in predicting the driver condition, with a maximum AUC of 0.73 for a model with 20 topics. Trips from drivers with OSA were more likely to be defined by topics for smaller lateral accelerations at low speeds. The results demonstrate topic modeling as a useful tool for extracting meaningful information from naturalistic driving datasets. |
| |
Keywords: | Machine learning Topic modeling Driver behavior Drowsiness Sleep apnea Naturalistic driving data |
This document is indexed in ScienceDirect and other databases. |
|
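The abstract's first step, converting speed and acceleration samples into "driving words" and treating each trip as a document, might be sketched as follows. This is only an illustration: the bin edges, the token format, and the function names (`to_driving_word`, `trip_to_document`, `bag_of_words`) are hypothetical, since the abstract does not specify how the study discretized the signals.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges; the abstract does not give the study's discretization.
SPEED_EDGES_MPS = [5.0, 15.0, 25.0]   # 4 speed bins
LAT_ACCEL_EDGES_G = [-0.1, 0.1]       # 3 lateral-acceleration bins
LON_ACCEL_EDGES_G = [-0.1, 0.1]       # 3 longitudinal-acceleration bins


def to_driving_word(speed_mps, lat_g, lon_g):
    """Map one (speed, lateral accel, longitudinal accel) sample to a token."""
    s = bisect_right(SPEED_EDGES_MPS, speed_mps)
    la = bisect_right(LAT_ACCEL_EDGES_G, lat_g)
    lo = bisect_right(LON_ACCEL_EDGES_G, lon_g)
    return f"s{s}_lat{la}_lon{lo}"


def trip_to_document(samples):
    """Treat a trip (a sequence of samples) as a document of driving words."""
    return [to_driving_word(*sample) for sample in samples]


def bag_of_words(document):
    """Count word occurrences; topic models work on these per-document counts."""
    return Counter(document)


# A made-up four-sample trip, for illustration only.
trip = [(3.0, 0.0, 0.05), (4.0, 0.02, 0.15), (12.0, -0.2, 0.0), (12.5, -0.15, 0.0)]
document = trip_to_document(trip)
counts = bag_of_words(document)
print(document)
print(counts)
```

Per-trip counts like these could then be stacked into a document-term matrix and passed to a topic model (for example latent Dirichlet allocation; the abstract does not name the specific model), with the resulting per-trip topic proportions serving as features for a random forest classifier.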