Executive Summary : | Technological advances have led to an increase in cybercrime, which in turn requires digital forensics to analyze digital evidence. This research focuses on forensic voice comparison, a subtask of forensic speaker recognition. Traditional approaches (auditory, spectrographic, acoustic-phonetic, and semi-automatic) are being replaced by automatic techniques. The research aims to identify suspect voices by matching vowels and to compute similarity indexes in the time and frequency domains using deep learning techniques. The Forensic Voice Comparison (FVC) system analyzes speech patterns in recordings of unknown speakers and known suspects to compute these similarity indexes. It uses machine learning and deep learning approaches to analyze the evidence and determine whether the same-speaker or the different-speaker hypothesis is supported. The data come from the Forensic Voice Comparison Laboratory datasets: Australian English (500 speakers), forensic eval_01, and Standard Chinese (68 female speakers). Preprocessing consists of speech enhancement, segmentation, and feature extraction. A hybrid speech segmentation algorithm is implemented for this research; it breaks continuous speech into sequences of words or subword units. |
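
To make the idea of a time-domain and a frequency-domain similarity index concrete, the following is a minimal sketch and not the method used in this research. The summary does not define the indexes, so this sketch assumes a normalized cross-correlation peak for the time domain and a cosine similarity of magnitude spectra for the frequency domain. The function names, the 16 kHz sampling rate, and the synthetic "vowels" (sums of sinusoids at rough formant frequencies) are all hypothetical stand-ins for real segmented speech and for the deep learning models described above.

```python
import numpy as np


def time_domain_similarity(x, y):
    """Peak of the normalized cross-correlation of two segments, in [0, 1].

    Assumed stand-in for a time-domain similarity index.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    if denom == 0.0:
        return 0.0
    corr = np.correlate(x, y, mode="full")  # correlation at every lag
    return float(np.max(np.abs(corr)) / denom)


def frequency_domain_similarity(x, y, n_fft=1024):
    """Cosine similarity of Hann-windowed magnitude spectra, in [0, 1].

    Assumed stand-in for a frequency-domain similarity index.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mag_x = np.abs(np.fft.rfft(x * np.hanning(x.size), n=n_fft))
    mag_y = np.abs(np.fft.rfft(y * np.hanning(y.size), n=n_fft))
    denom = np.linalg.norm(mag_x) * np.linalg.norm(mag_y)
    if denom == 0.0:
        return 0.0
    return float(np.dot(mag_x, mag_y) / denom)


def synth_vowel(formants_hz, sr=16000, n_samples=800, phase=0.0,
                noise=0.0, seed=0):
    """Hypothetical toy vowel: three sinusoids at the given frequencies."""
    t = np.arange(n_samples) / sr
    amps = (1.0, 0.5, 0.25)
    sig = sum(a * np.sin(2 * np.pi * f * t + phase)
              for a, f in zip(amps, formants_hz))
    rng = np.random.default_rng(seed)
    return sig + noise * rng.standard_normal(n_samples)


if __name__ == "__main__":
    # Rough textbook formant values, used only for illustration.
    known_a = synth_vowel([730, 1090, 2440])
    questioned_a = synth_vowel([730, 1090, 2440], phase=0.7, noise=0.05, seed=1)
    questioned_i = synth_vowel([270, 2290, 3010], noise=0.05, seed=2)

    for name, seg in [("same vowel", questioned_a), ("other vowel", questioned_i)]:
        print(name,
              "time:", round(time_domain_similarity(known_a, seg), 3),
              "freq:", round(frequency_domain_similarity(known_a, seg), 3))
```

In a full system, scores like these would be one input among many. Deciding between the same-speaker and different-speaker hypotheses requires calibration against a relevant reference population, which this sketch does not attempt.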