Framework for creation and performance testing of machine learning based models for the assessment of transmission network impact on speech quality for mobile packet-switched voice services
Publication Date: 1 January 2020
This Recommendation (see footnote 1) specifies a framework in the form of constraints, performance criteria and methods for the development of intrusive, parametric, machine learning (ML) based models for the assessment of transmission network impact on speech quality for mobile packet-switched voice services.
Models developed according to this Recommendation estimate speech quality from the IP bit-stream and the temporal distribution of speech in the reference speech sample. The models account for the adaptiveness of the jitter buffer in the end client, as well as the IP transport and underlying transport behaviour of typical voice services. These services include high-definition voice over Internet protocol (HD VoIP), voice over LTE (VoLTE), covering narrowband (NB), wideband (WB), super-wideband (SWB) and fullband (FB) operation, and over-the-top (OTT) services (e.g., WhatsApp, Skype, Viber and WeChat, among others).
This Recommendation specifies techniques that use machine learning to predict speech quality based on what the model has learnt in the controlled and verified environment of the framework. Continuous learning based on real-time adaptation of the ML algorithm's coefficients is not used. In addition, the Recommendation explains how the framework should be used and which requirements an ML-based predictor must meet in order to conform to this Recommendation. Test datasets are provided and must be used to prove that models developed with the framework meet the minimum performance that the framework defines. The Recommendation also specifies conditions and requirements for an additional, independent validation of models developed with the framework.
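As an illustration only, the conformance idea described above (a trained model with fixed coefficients is scored on a provided test dataset and compared against minimum performance limits) could be sketched as follows. The function names, the choice of RMSE and Pearson correlation as metrics, and the threshold parameters `max_rmse` and `min_correlation` are assumptions made for this sketch; the actual performance criteria and their values are those defined by the Recommendation itself.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference quality scores."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def meets_minimum_performance(predicted, reference, max_rmse, min_correlation):
    """Hypothetical check: both limits are placeholders, not values
    taken from the Recommendation."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    return (rmse(predicted, reference) <= max_rmse
            and pearson(predicted, reference) >= min_correlation)
```

In this sketch, `predicted` would hold the scores produced by the frozen model for each test condition and `reference` the corresponding target scores from the test dataset. Because the coefficients are not adapted at run time, the same inputs always give the same verdict.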
1 This Recommendation includes electronic attachments containing detailed descriptions of generic jitter files and a reference speech sample (see Annex F).