BYR Achieve · Mirror Forum

The Centre for Vision, Speech and Signal Processing (CVSSP) at the University of Surrey is recruiting two PhD students for autumn entry, with scholarships provided. If you are interested, please contact me by email and send your CV to xubo.liu@surrey.ac.uk
(1) Deep learning for audio-visual object separation (Prof. Wenwu Wang)
The aim is to build deep models that characterise the coherence between the audio and visual modalities. This allows the activity of audio objects in video (e.g. speech, music, or other environmental sounds) to be detected, and the objects to be separated from sound mixtures with guidance from the video.
More information: https://ai4me.surrey.ac.uk/training
(2) Automatic sound labelling for broadcast audio (Prof. Mark D. Plumbley)
The aim of this project is to develop new methods for automatic labelling of sound environments and events in broadcast audio, assisting production staff to find and search through content, and helping the general public access archive content.
The project will undertake a combination of interviews and user profiling, analysis of audio search datasets, and categorisation by audio experts to determine the most useful terminology for two user groups: production staff and the general public. It will develop a taxonomy of labels and examine the similarities and differences between the groups.
The project will also investigate the application of a labelled library in a production environment, examining workflows with common broadcast tools, then integrating and evaluating prototype systems. In addition, it will investigate methods for automatic subtitling of non-speech sounds, such as end-to-end encoder-decoder models with alignment, which map the acoustic signal directly to text sequences.
Working with BBC R&D, the student will develop software tools to demonstrate the results, especially for broadcasting and the management of audiovisual archive data, and benchmark the results against human-assigned tags and descriptions of audio content. Using archive data provided by BBC R&D, the student will engage with audio production and research experts through Expert Panels, and with potential end users through Focus Groups.
As part of this PhD, you will have the opportunity for close day-to-day collaboration with the BBC as a member of the R&D Audio Team.
Application Deadline: 1 August 2021
More information and how to apply: https://www.surrey.ac.uk/fees-and-funding/studentships/automatic-sound-labelling-broadcast-audio
-----------------------------------------------------------------------------------
CVSSP aims to lead research and training in AI and Machine Perception for the benefit of society and is home to the largest cluster of research activity in Audio-Vision Machine Perception in Europe, with a project portfolio of £24M of funding from EPSRC, EU, InnovateUK, charity and industry. The Centre is internationally unique in bringing together expertise in both audio and visual machine perception, with the central goal of creating machines that can see and hear to understand the world around them.
CVSSP is a thriving community of 170 researchers with a shared mission of advancing the state of the art in audio-visual signal processing, computer vision and machine perception to enable intelligent sensing technologies. The Centre has state-of-the-art acoustic capture and analysis facilities, a Visual Media Lab with video and audio capture facilities supporting research in real-time video and audio processing and visualisation, and an extensive computing infrastructure for audio-visual processing and storage.
The Centre has an outstanding track record of technology transfer, licensing, and collaboration with industry, and has produced eight successful spin-out companies.

[PhD Application] University of Surrey CVSSP is recruiting two PhD students for autumn entry