System Requirements
Minimum Configuration:
Microsoft® Windows® 95, 98 or NT 4.0 (or higher)
90 MHz (or faster) Intel® Pentium® or compatible processor
8 Megabytes system RAM
25 Megabytes hard disk space
Windows-compatible sound hardware with line level inputs and outputs
About Sound Hardware:
Lexington’s STI Program is designed to work with most Windows-compatible sound cards. The sound hardware must support full-duplex (simultaneous record and playback) operation, 16-bit resolution, and a 44.1 kHz sample rate. While most integrated sound devices and plug-in sound cards support these options, certain devices may not perform properly because their device drivers cannot maintain the data rate required by the STI Program.
STI Program Files
The STI Program distribution contains seven files, all of which must reside in the same directory.
Files required by STI:
Filename | Description | Size |
sti.exe | STI application | 215K |
wmtx.bin | DFT coefficients (binary) | 168K |
mfcorr.bin | Calibration data (binary) | 1K |
level.wav | TEST Mode stimulus (10 second WAV) | 862K |
sti.wav | MEASURE Mode stimulus (2 minute WAV) | 10,753K |
sti.hlp | Windows Help | 36K |
sti.cnt | Windows Help contents | 1K |
Files created by STI:
Filename | Description |
swave.wav | TEST & MEASURE Mode response (WAV) |
sti.log | STI Measurement Results (text) |
sti.fts | Windows Help output |
sti.gid | Windows Help output (hidden) |
These files are created in the directory from which sti.exe is run.
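Because all seven distribution files must sit in the same directory as sti.exe, a quick check before running a measurement can be useful. The following is a minimal sketch in Python and is not part of the STI distribution; the function name `missing_sti_files` is hypothetical, and the file names are taken from the "Files required by STI" table.

```python
from pathlib import Path

# File names from the "Files required by STI" table.
REQUIRED_FILES = (
    "sti.exe", "wmtx.bin", "mfcorr.bin", "level.wav",
    "sti.wav", "sti.hlp", "sti.cnt",
)


def missing_sti_files(directory):
    """Return the required file names that are absent from `directory`.

    Names are compared case-insensitively, as on Windows file systems.
    """
    present = {p.name.lower() for p in Path(directory).iterdir() if p.is_file()}
    return [name for name in REQUIRED_FILES if name not in present]
```

Calling `missing_sti_files(".")` from the installation directory should return an empty list when the distribution is complete.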
Equipment Setup
The STI Program is capable of measuring the speech transmission quality of many different systems. While systems differ in their application, the STI hardware setup is essentially identical for each. The only difference will be the external equipment connected to the PC’s sound hardware.
Typical systems that can be tested with the STI Program:
System Type | Primary Transmission Type |
Public Address | acoustic |
Directional Mic | acoustic |
FM system | radio frequency (FM) |
Infra-red system | infra-red (IR) |
To test the STI Program and the PC’s sound hardware, a loopback connection, which requires no external hardware, can be made. With the loopback connection in place, the program should report an STI of 1.0.
Setup Examples:
Figure 1: Loudspeaker/Microphone STI Measurement
Figure 2: Personal FM System STI Measurement
All WAV files used by the STI Program are in mono format. Sound cards play mono files on both the left and right (TIP and RING) channels, so equipment used to present the stimulus may be connected to either the TIP or the RING segment of the LINE OUT jack. Sound cards record mono files from the left channel only, so external equipment used to record the response must be connected to the TIP segment of the LINE IN jack.
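If a stimulus or response file is suspected of being in the wrong format, its header can be inspected with Python's standard `wave` module. This is only a sketch: the function name `check_sti_wav` is hypothetical, and it assumes the WAV files use the same 16-bit, 44.1 kHz format that the sound hardware is required to support, in addition to being mono as stated above.

```python
import wave


def check_sti_wav(path):
    """Return a list of format problems found in a WAV file (empty if none).

    Assumed format: mono, 16-bit samples, 44.1 kHz sample rate.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:  # sample width is reported in bytes
            problems.append(f"expected 16-bit, found {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() != 44100:
            problems.append(f"expected 44100 Hz, found {wav.getframerate()} Hz")
    return problems
```

For example, `check_sti_wav("swave.wav")` returns an empty list when the recorded response matches the assumed format.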
STI Program Output
The STI Program calculates effective signal-to-noise ratios (SNRs) for seven one-octave frequency bands. In each band there are fourteen modulation frequencies. An example of the STI Program output is shown below. The header contains the Title, Date, and Comment for the most recent measurement. The large grid of values (14 rows by 7 columns, one row per modulation frequency and one column per octave band) holds the effective SNRs calculated from the modulation reduction factors. These values always lie between -15 and +15 dB. Following the Effective SNR table are the Average SNRs, calculated for each column. The last entry in the file is the STI. The STI ranges from 0 to 1, where 0 represents the worst case (completely unintelligible) and 1 represents the best case (perfectly intelligible).
Example Output: STI.LOG
Lexington Center/School for the Deaf
SPEECH TRANSMISSION INDEX – Version 2.0 M. Steele
Data From: Wednesday, January 5, 2000, 06:55
Comment: Example STI Program Output
Octave Band Center Frequencies:
125 250 500 1000 2000 4000 8000
Effective SNR:
-4.576 15.000 15.000 15.000 15.000 15.000 15.000
-4.437 15.000 15.000 15.000 15.000 15.000 15.000
-4.310 11.092 14.758 15.000 15.000 15.000 15.000
-4.100 14.008 15.000 15.000 15.000 15.000 15.000
-5.390 15.000 15.000 15.000 15.000 15.000 15.000
-3.229 15.000 15.000 15.000 15.000 15.000 15.000
-5.940 15.000 13.084 15.000 15.000 15.000 15.000
-3.841 14.558 13.797 15.000 15.000 15.000 15.000
-3.062 15.000 11.828 15.000 15.000 15.000 15.000
-2.294 15.000 15.000 15.000 15.000 15.000 15.000
-4.177 14.589 15.000 15.000 15.000 15.000 15.000
-3.918 15.000 15.000 15.000 15.000 15.000 15.000
-2.766 14.502 11.981 15.000 15.000 15.000 15.000
-3.657 15.000 12.903 15.000 15.000 15.000 15.000
Average SNR:
-3.978 14.554 14.168 15.000 15.000 15.000 15.000
STI = 0.913
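The Average SNR line can be recomputed from the grid, which is a handy sanity check when post-processing sti.log. The sketch below is not part of the STI Program; the function names are hypothetical, and the parser assumes the log layout matches the example above (an "Effective SNR:" line, rows of whitespace-separated numbers, then an "Average SNR:" line).

```python
def parse_effective_snr(log_text):
    """Extract the Effective SNR grid (a list of rows) from sti.log text."""
    rows, in_grid = [], False
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Effective SNR"):
            in_grid = True
        elif stripped.startswith("Average SNR"):
            break
        elif in_grid and stripped:
            rows.append([float(value) for value in stripped.split()])
    return rows


def column_averages(rows):
    """Average each column (octave band) over all modulation frequencies."""
    return [sum(column) / len(column) for column in zip(*rows)]


# First two and last two grid rows of the example log, for illustration only.
sample_log = """Effective SNR:
-4.576 15.000 15.000 15.000 15.000 15.000 15.000
-4.437 15.000 15.000 15.000 15.000 15.000 15.000
-2.766 14.502 11.981 15.000 15.000 15.000 15.000
-3.657 15.000 12.903 15.000 15.000 15.000 15.000
Average SNR:
"""
averages = column_averages(parse_effective_snr(sample_log))
```

Applied to the full 14-row grid of the example, the column averages agree with the Average SNR line to within rounding of the third decimal place.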
STI Theory
Lexington’s Speech Transmission Index program is based on the Modulation Transfer Function (MTF). The MTF, traditionally used in the study of room acoustics, has been successfully applied to the measurement of speech transmission quality (1). The MTF is a measure of the reduction in modulation of a test signal due to additive noise and/or temporal and non-linear distortion resulting from transmission through the system under test. Figure 3 shows a graphical representation of the MTF measurement process. A more detailed description of the MTF can be found in (3) and (4).
The stimulus signal used by the STI Program consists of a band-limited white noise carrier modulated by low-frequency sine waves. Fourteen modulation frequencies are used in the STI stimulus file, sti.wav. The modulating sine waves range in frequency from 0.63 Hz to 12.7 Hz in 1/3-octave steps. The stimulus is designed to simulate intensity distributions found in running speech. The upper portion of Figure 3 shows one segment of the stimulus and the response of a typical system. Note that the response has been corrupted by additive noise. Other sources of corruption may include temporal distortion (blurring) or non-linear distortion (clipping). This corruption results in a response with lower modulation factors than the stimulus. The MTF is a measure of the amount of modulation reduction versus modulation frequency, Fm.
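To make the idea of modulation reduction concrete, the sketch below lists fourteen modulation frequencies in 1/3-octave steps starting at 0.63 Hz and estimates the modulation depth of a synthetic intensity envelope with a single-frequency DFT. It is an illustration under stated assumptions, not the STI Program's implementation: the function names, the sinusoidal envelope model I(t) = 1 + m·cos(2πFm·t), and the constant-intensity noise are all hypothetical choices made for the example.

```python
import cmath
import math


def modulation_frequencies(f_start=0.63, count=14):
    """Modulation frequencies (Hz) in 1/3-octave steps starting at f_start."""
    return [f_start * 2 ** (k / 3) for k in range(count)]


def modulated_envelope(m, fm, sample_rate, seconds):
    """Synthetic intensity envelope I(t) = 1 + m*cos(2*pi*fm*t)."""
    count = int(round(sample_rate * seconds))
    return [1 + m * math.cos(2 * math.pi * fm * n / sample_rate)
            for n in range(count)]


def modulation_depth(intensity, fm, sample_rate):
    """Estimate the modulation depth of an intensity envelope at frequency fm.

    m = 2 * |sum(I[n] * exp(-j*2*pi*fm*n/fs))| / sum(I[n]).
    The envelope should span a whole number of modulation periods.
    """
    acc = sum(x * cmath.exp(-2j * math.pi * fm * n / sample_rate)
              for n, x in enumerate(intensity))
    return 2 * abs(acc) / sum(intensity)


freqs = modulation_frequencies()

# Fully modulated stimulus envelope: 2 Hz, 5 s at 100 samples/s (10 periods).
stimulus = modulated_envelope(m=1.0, fm=2.0, sample_rate=100, seconds=5)
# Additive noise of equal mean intensity raises the floor of the envelope.
response = [x + 1.0 for x in stimulus]
reduction = (modulation_depth(response, 2.0, 100)
             / modulation_depth(stimulus, 2.0, 100))
```

In this toy case the noise intensity equals the mean signal intensity, so the modulation depth of the response is half that of the stimulus.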
Figure 3: Calculating the Modulation Transfer Function
The STI Program calculates modulation reduction factors for seven one-octave spectral bands that are important to the transmission of speech. These bands range from 125 Hz to 8000 Hz, in one-octave steps. Within each of the seven octave bands, fourteen modulation reduction factors are calculated. These modulation reduction factors are then converted to effective signal-to-noise ratios. These SNRs are weighted and averaged to obtain the STI. A detailed description of the conversion, weighting, and averaging process can be found in (3).
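The conversion, weighting, and averaging steps can be sketched as follows. This documentation does not give the formulas or weights the STI Program uses, so everything here is an assumption drawn from the general STI literature: the conversion SNR = 10·log10(m / (1 - m)) limited to ±15 dB, the linear mapping of SNR onto a 0 to 1 transmission index, and the octave-band weights in `BAND_WEIGHTS`. The function names are hypothetical.

```python
import math

# Octave-band weights for 125 Hz ... 8000 Hz. ASSUMPTION: classic weights from
# the Steeneken and Houtgast literature; the STI Program's own weights are not
# listed in this documentation.
BAND_WEIGHTS = (0.13, 0.14, 0.11, 0.12, 0.19, 0.17, 0.14)


def effective_snr(m):
    """Convert a modulation reduction factor m (0..1) to effective SNR in dB,
    limited to the -15..+15 dB range stated in the documentation."""
    if m <= 0.0:
        return -15.0
    if m >= 1.0:
        return 15.0
    snr = 10.0 * math.log10(m / (1.0 - m))
    return max(-15.0, min(15.0, snr))


def sti_from_band_snr(avg_snr, weights=BAND_WEIGHTS):
    """Weighted STI from the per-band average effective SNRs."""
    # Map each SNR from [-15, +15] dB onto a transmission index in [0, 1].
    indices = [(snr + 15.0) / 30.0 for snr in avg_snr]
    return sum(w * ti for w, ti in zip(weights, indices))


# "Average SNR" line from the example sti.log.
example = [-3.978, 14.554, 14.168, 15.0, 15.0, 15.0, 15.0]
print(f"STI = {sti_from_band_snr(example):.3f}")  # prints "STI = 0.913"
```

With these assumed weights the sketch reproduces the STI of 0.913 reported in the example log, and an undistorted loopback (every SNR at +15 dB) yields an STI of 1.0, but that agreement does not confirm how the program itself performs the calculation.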
References
1) Steeneken, H.J.M. and Houtgast, T. (1973). “The Modulation Transfer Function in Room Acoustics as a Predictor of Speech Intelligibility,” Acustica 28, 66-73.
2) Steeneken, H.J.M. and Houtgast, T. (1979). “A physical method for measuring speech-transmission quality,” Institute for Perception TNO, Soesterberg, the Netherlands.
3) Steeneken, H.J.M. and Houtgast, T. (1984). “A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria,” Institute for Perception TNO, Soesterberg, the Netherlands.
4) Studebaker, G.A. and Matesich, J.S. (1992). “A derivation of the Modulation Transfer Function,” Memphis State University, Memphis, Tennessee.
5) Rife, D.D. (1992). “Modulation Transfer Function Measurement with Maximum Length Sequences,” DRA Laboratories, Sterling, Virginia.
Lexington’s STI Program and documentation written by:
Michael Steele
Sr. Research Engineer
Lexington Center/School for the Deaf
30th Ave and 75th Street
Jackson Heights, NY 11370