Objective quality scoring explained.
The differences between ITU objective quality scoring standards and why you should care.
These standards have their origins in ITU-T's family of full-reference objective voice quality measurements, which started in 1997 with P.861 (PSQM); that was superseded by P.862 (PESQ) in 2001 [Wikipedia]. PESQ was originally developed to test narrowband networks. For WebRTC and IP-based calls, ITU-T P.862 (PESQ) was effectively superseded in 2010 by ITU-T P.863 (POLQA).
For modern test requirements, the use of POLQA is strongly recommended, because it measures wideband and super-wideband audio accurately and is suited to advanced IP-based networks.
When testing modern IP networks, the only reason a customer would test with PESQ is to compare new results with earlier PESQ results. Even then, they should be aware that scores measured over IP services may be unreliable. For modern stacks, use POLQA for objective measurement; it is appropriate for WebRTC quality scoring.
"POLQA is the correct method of testing when working with wideband (WB) and super wideband (SWB) codecs such as Opus."
— John Mitchem, Co-founder Operata.
Simply put, for VoIP and WebRTC testing, stick with POLQA. PESQ can produce false positives, and its scores should be considered erroneous when used on wideband networks. When comparing quality monitoring and testing tools, make sure they support POLQA for objective quality scoring.
To learn more, take a deep dive into the research that compares the algorithms.