TIA/EIA TSB 102.BABD
APCO Project 25 Vocoder Selection Process
Publication Date: 1 May 1996
This document describes the evaluation procedure to be employed by the TIA in its assessment of the digital voice coding technology proposals for Project 25. The choice of a digital voice coder (vocoder) is important since the speech quality performance of the entire communication system depends upon the ability of the voice coder to operate satisfactorily in the land mobile radio environment. However, vocoder complexity and throughput delay are also important since they will impact the acceptability of the equipment to the user. The purpose of this evaluation is to provide a comparison of these attributes among the various candidate vocoder proposals.
The evaluation procedure is designed to give a fair and objective comparison among the candidate coder proposals. This is accomplished by examining the various vocoder attributes and by conducting a subjective listening test in which the relative performance of the proposals is measured quantitatively. The purpose of the evaluation is to enable the recommendation of the best vocoder proposal for APCO/NASTD Project 25.
The evaluation procedure is based on an examination of the complexity, throughput delay and speech quality performance of the four proposals. Complexity translates into both equipment cost and power consumption. Power consumption in turn translates into battery life for portable equipment. Throughput delay is a less serious consideration in a push-to-talk radio environment than in a full duplex telephone system, but it still has an impact on user acceptability of the system. These particular attributes can be specified through a detailed examination of the proposals. Speech quality performance is more difficult to quantify since it is a subjective issue. There are no known objective measurements that can be performed to rate the acceptability of the speech coder performance to a human listener. Because of this, the performance evaluation must rely upon subjective testing. The subjective testing involves the use of a panel of listeners who rate the coders' performance on a 5 point scale. Since listeners' opinions will vary, the results from a number of listeners are obtained and averaged to obtain an overall score.
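The averaging step described above can be illustrated with a minimal sketch. The ratings, the function name and the use of a standard error as a measure of listener-to-listener variation are assumptions made for illustration only; they are not taken from the test plan.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Average a panel's 1-5 ratings; return (score, standard error).

    Hypothetical helper: the standard error is included only to show
    how variation among listeners might be summarized.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two listener ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 5 point scale (1 to 5)")
    score = mean(ratings)
    std_err = stdev(ratings) / sqrt(len(ratings))
    return score, std_err


# Hypothetical ratings from an eight-listener panel for one coder.
ratings = [4, 3, 5, 4, 4, 3, 4, 5]
mos, se = mean_opinion_score(ratings)
print(f"overall score = {mos:.2f}, standard error = {se:.2f}")
```

A larger panel reduces the standard error, which is one reason the results from a number of listeners are combined rather than relying on any single opinion.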
To determine the acceptability of the overall speech quality, it is necessary to conduct the evaluation in a controlled manner so that unintentional variation in the scoring is avoided. The purpose of the testing is to determine differences in performance among the candidates. The confidence we have that the measured differences in performance are meaningful will depend upon how well we prevent differences from occurring in the testing of the various candidate proposals. In addition, the judgement as to the effect of various impairments upon the candidates will depend upon how well the introduction of those impairments is controlled.
The purpose of the listening test is to evaluate the proposals under a variety of test conditions. The test conditions were chosen to be representative of those expected to be experienced in a land mobile radio environment. The test is organized so that each test condition tests only one aspect of system performance. Hence only a limited number of operating conditions are tested. Testing all possible operating conditions would lead to a test that would be too unwieldy to conduct.
The acceptability of the proposals under various channel impairments, talkers and speaking volumes is determined through the use of a Mean Opinion Score (MOS) test. This test will be conducted to evaluate the various proposals. For the various background noise conditions a Degraded MOS (DMOS) test is conducted. For this test the input audio is degraded, and the purpose is to see what additional degradation is introduced by the vocoder. The DMOS allows this by asking the listeners to score the quality difference between the unprocessed degraded speech and the same speech signal after processing. The background noise conditions will be chosen to be representative of the actual environment in which these systems will be used. However, some noise conditions will not be tested because it would be difficult for the listeners to score them. For example, a shotgun blast would be disconcerting to the listeners and may affect their responses for some time afterward. The listener may also not know how to quantify the quality in such a situation. For such anomalous conditions, a characterization test will be conducted after the formal listening tests. In the characterization test, the purpose is not to quantify the differences among proposals but to ensure that no anomalous behavior is exhibited.
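As a rough illustration of how DMOS votes might be tabulated, the sketch below groups listener scores by candidate and background noise condition and averages each group. The candidate labels, noise condition names, scores and function name are all hypothetical, and the sketch assumes each vote is a 1 to 5 score of the difference between the unprocessed degraded speech and the processed version.

```python
from collections import defaultdict
from statistics import mean


def dmos_by_condition(votes):
    """Average difference scores per (candidate, noise condition).

    votes: iterable of (candidate, condition, score) tuples, where
    score is a listener's 1-5 rating of the processed speech relative
    to the unprocessed degraded reference (assumed scale).
    """
    grouped = defaultdict(list)
    for candidate, condition, score in votes:
        grouped[(candidate, condition)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


# Hypothetical votes; labels and scores are placeholders only.
votes = [
    ("A", "siren", 4), ("A", "siren", 3),
    ("B", "siren", 2), ("B", "siren", 3),
    ("A", "car", 5), ("A", "car", 4),
]
table = dmos_by_condition(votes)
for (candidate, condition), score in sorted(table.items()):
    print(f"candidate {candidate}, {condition} noise: DMOS = {score:.2f}")
```

Keeping the scores separated by noise condition reflects the intent that each test condition examines only one aspect of performance.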
Chapter 2 discusses the speech data base that was chosen to be used in the vocoder testing and evaluation. Chapter 3 contains the details of the vocoder testing as well as the speech data base used for the testing. Chapter 4 discusses the design of the listening tests themselves. Chapter 5 discusses the overall evaluation procedure of the vocoders, including the effects of speech quality, vocoder complexity and throughput delay. Chapter 6 describes the responsibilities of the organizations conducting the tests. Chapter 7 details the financial obligation of the vocoder participants for this evaluation. Lastly, appendices are attached which contain additional detail and copies of documents which form the basis for this document.