Invited Papers.- Preserving Language Heritage Through Speech Technology: The Case of Upper Sorbian.- Retrospective and Perspectives of TTS & STT Technology Development and Implementation for South Slavic Under-Resourced Languages.-
Automatic Speech Recognition.- Comparison of Well- and Lower-Resourced Self-Training in ASR.- Towards a Livvi-Karelian End-to-End ASR System.- Advances in OpenASR21 Evaluation with Increased Temporal Resolution for Speech Self-Supervised Learning Models.- Benchmarking Whisper under Diverse Audio Transformations and Real-time Constraints.- AutoMode-ASR: Learning to Select ASR Systems for Better Quality and Cost.- Pre-Training and Adverse Audio Samples for Data-Efficient Wake Word Detection.- Cross-Lingual Summarization of Speech-to-Speech Translation: A Baseline.-
Speech and Language Resources.- The ParlaSpeech Collection of Automatically Generated Speech and Text Corpora from Parliamentary Proceedings.- ESC Corpus of Spoken Russian: Everyday Student Conversations Captured through Continuous Speech Recording in Natural Communicative Environments.- OpenAV: Bilingual Dataset for Audio-Visual Voice Control of a Computer for Hand Disabled People.- Bulgarian Speech Resources in the CHILDES System.- Multiword Units in Russian Everyday Speech: Empirical Classification and Corpus-Based Studies.- Neurophysiological Correlates of Textual Modulation in Visual Stimuli: An Experimental Study of Russian and English Memes.-
Speech Synthesis and Perception.- End-to-End Speech Synthesis for the Serbian Language Based on Tacotron.- ChildTinyTalks (CTT): A Benchmark Dataset and Baseline for Expressive Child Speech Synthesis.- Multidimensional Rhythm: Comparing Rhythmic Properties of Australian and New Zealand Monologues.- Influence of Linguistic and Sociolinguistic Factors on Speech Rate Perception.- Human and Machine Keyphrase Perception in Russian Text and Speech.- Assessment of Children’s Ability to Manifest Emotions in Facial Expressions, Voice and Speech by Humans, Automatic, and on a Likert Scale.-
Speech Processing for Medicine.- Investigating the Utility of wav2vec 2.0 Hidden Layers for Detecting Multiple Sclerosis.- Cross-Cultural Automatic Depression Detection based on Audio Signals.- Depression Classification using Token Merging-based Speech Spectrotemporal Transformer.- Detecting Depression from Audio Data.- Binary and Multiclass Classification of Dysphonia Using Whisper Encoder and One-Dimensional Convolutional Neural Network.- Approach to Assessing the Quality of Syllable Pronunciation by Patients in the Process of Speech Rehabilitation Based on Comparison with Healthy Speakers.- A Comparative Study for Contextualized Spoken Answer Classification in German Medical Questionnaires.