A Dual-input deep learning architecture for classification and latency estimation in ABR signals

Darahem, Youssef; Yılmaz, Oğuz; Saldırım, Halil; Mutlu, Berna; Ateş, Hasan; Güntürk, Bahadır

doi:10.3389/fmed.2025.1693921

Darahem Y., Yılmaz O., Saldırım H. B., Mutlu B. Ö., Ateş H. F., Güntürk B. K.

FRONTIERS IN MEDICINE, vol.12, pp.1-10, 2025 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 12
Publication Date: 2025
DOI: 10.3389/fmed.2025.1693921
Journal Name: FRONTIERS IN MEDICINE
Journal Indexes: Scopus, Science Citation Index Expanded (SCI-EXPANDED), EMBASE, Directory of Open Access Journals
Page Numbers: pp.1-10
Istanbul Medipol University Affiliated: Yes

Abstract

Introduction: The auditory brainstem response (ABR) is an objective neurophysiological test that measures the electrical activity generated by the auditory nerve and brainstem in response to auditory stimulation, recording synchronous neural activity as it propagates along the auditory pathway. The response is characterized by several distinct waves, most notably waves I, III, and V. Wave V plays a central clinical role because its presence and latency are routinely used to assess a patient's hearing status. However, manual identification and localization of wave V are time-consuming and subjective. Previous work has explored automated detection methods to reduce this burden.

Methods: In this paper, we make two primary contributions. First, we propose a multi-task deep learning pipeline that simultaneously (i) detects the presence of wave V and (ii) predicts its latency, eliminating the need for separate manual interpretation steps and enhancing clinical usability. Second, inspired by the audiologist's practice of comparing responses at multiple click intensities (specifically, using responses at high intensities, where waves are more prominent, as a reference), we introduce a paired-signal approach. Each input to our deep learning model consists of the test signal together with its corresponding 80 dB reference from the same recording session. This provides the model with richer contextual information, and we show that the paired-signal approach improves over the single-input approach. For multi-task learning, we design a network consisting of a backbone and two branches: one for latency prediction and one for classifying whether wave V is present. We first train a latency-prediction network and then freeze its feature-extraction layers to initialize a classification branch. Finally, we fine-tune the entire network using a joint loss function that balances the classification and regression objectives.
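
The paired-signal input and the joint loss described in the Methods can be illustrated with a minimal Python sketch. The abstract does not give the exact form of the loss, so everything here is an assumption made for illustration: binary cross-entropy for the wave V presence branch, mean squared error for the latency branch (computed only on signals where wave V is present), and a hypothetical weight `alpha` that balances the two terms. The function names `make_paired_input` and `joint_loss` are likewise hypothetical, and the staged training (pre-training the latency network, then freezing its feature-extraction layers) is not shown.

```python
import math


def make_paired_input(test_signal, reference_80db):
    """Stack a test signal with its 80 dB reference as a two-channel input.

    Assumption: both signals come from the same session and have equal length.
    """
    if len(test_signal) != len(reference_80db):
        raise ValueError("test and reference signals must have equal length")
    return [list(test_signal), list(reference_80db)]


def binary_cross_entropy(p, y, eps=1e-7):
    """Cross-entropy for one predicted probability p and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def joint_loss(probs, labels, pred_latencies, true_latencies, alpha=0.5):
    """Weighted sum of a classification term and a regression term.

    Assumption: the latency error only counts where wave V is present
    (label == 1), so entries of true_latencies may be None elsewhere.
    """
    n = len(labels)
    cls_term = sum(binary_cross_entropy(p, y) for p, y in zip(probs, labels)) / n
    sq_errors = [
        (pred - true) ** 2
        for pred, true, y in zip(pred_latencies, true_latencies, labels)
        if y == 1
    ]
    reg_term = sum(sq_errors) / len(sq_errors) if sq_errors else 0.0
    return alpha * cls_term + (1.0 - alpha) * reg_term


# Toy batch: one signal with wave V at 5.5 ms, one without wave V.
x = make_paired_input([0.1, 0.3, 0.2], [0.4, 0.9, 0.5])
loss = joint_loss(
    probs=[0.9, 0.1],
    labels=[1, 0],
    pred_latencies=[5.5, 0.0],
    true_latencies=[5.5, None],
)
```

Masking the regression term is one plausible design choice, since a latency target is undefined when wave V is absent; the published model may handle this case differently.
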
Results: Experimental results demonstrate that our joint model outperforms conventional single-task approaches. For classification, it achieves an F1-score of 0.92; for latency regression, it attains an R² of 0.90.

Discussion: Our findings highlight the promise of convolutional neural networks for enhancing ABR analysis and underscore their potential to streamline clinical workflows in the diagnosis of auditory disorders.
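
For readers less familiar with the two reported metrics, the following Python sketch computes them from their standard definitions. It assumes the F1-score is the binary F1 for the "wave V present" class and that the regression metric is the usual coefficient of determination; the abstract does not state which variants were used, and the toy values below are illustrative only, not data from the study.

```python
def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall for class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Toy values, not taken from the paper.
f1 = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
r2 = r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

An R² of 0 corresponds to a model that always predicts the mean latency, while 1 corresponds to perfect prediction, which gives some context for the reported value of 0.90.
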