3.1. Basics of Digital Audio
Digitization of Sound
Introduction to MIDI
Reference: K.C. Pohlmann, "Principles of Digital Audio", 3rd ed., McGraw-Hill, 1995.
Reference: Chapter 3 of Steinmetz and Nahrstedt
Facts about Sound
- Sound is a continuous wave that travels through the air.
- The wave is made up of pressure differences. Sound is detected by
measuring the pressure level at a location.
- Sound waves have normal wave properties (reflection, refraction,
diffraction, etc.).
- Human ears can hear in the range of 16 Hz to about 20 kHz. This range
narrows with age.
With a speed of sound in air of about 340 m/s, the corresponding wavelengths
vary from 21.3 m down to 1.7 cm.
- The intensity of sound can be measured in terms of Sound Pressure Level
(SPL) in decibels (dB).
intensity level = 10 log10 (P / P0) dB,
where P and P0 are values of acoustic power, and P0 is the
intensity of sound at the threshold of hearing, which is 10^-12
W/m^2 (watts per square meter).
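As a rough illustration of the intensity-level formula above, the following sketch converts an acoustic intensity to decibels. The function name and the error check are illustrative choices, not part of the notes; only the reference value 10^-12 W/m^2 comes from the text.

```python
import math

P0 = 1e-12  # threshold of hearing in W/m^2, as given in the text


def intensity_level_db(p, p0=P0):
    """Return 10 * log10(p / p0) in dB for an intensity p in W/m^2."""
    if p <= 0:
        raise ValueError("intensity must be positive")
    return 10.0 * math.log10(p / p0)
```

Under this definition the threshold of hearing itself sits at 0 dB, and every tenfold increase in intensity adds 10 dB.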
Digitization in General
- Microphones and video cameras produce analog signals
(continuous-valued voltages).
- To get audio or video into a computer, we must digitize it (convert
it into a stream of numbers).
So, we have to understand discrete sampling (in both time and voltage).
- Sampling -- divide the horizontal axis (the time dimension) into
discrete pieces. Uniform sampling is ubiquitous.
- Quantization -- divide the vertical axis (signal strength) into
pieces. Sometimes, a non-linear function is applied.
- 8-bit quantization divides the vertical axis into 256 levels. 16-bit
quantization gives 65536 levels.
Digitizing Audio
- Questions for producing digital audio (Analog-to-Digital Conversion):
- How often do you need to sample the signal?
- How good is the signal?
- How is audio data formatted?
Nyquist Theorem
- Suppose we are sampling a sine wave. How often do we need to sample it to
figure out its frequency?
- If we sample only once per cycle, we may think the signal is a constant.
- If we sample at another low rate, e.g., 1.5 times per cycle, we may think
it's a lower frequency sine wave --> Alias
- Nyquist rate -- It can be proved that a bandwidth-limited signal
can be fully reconstructed from its samples if the sampling rate is at least
twice the highest frequency in the signal.
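The aliasing cases described above can be checked with a small sketch. It assumes an ideal sampler and a pure sine input. The function name is hypothetical, and the folding rule used here is the standard one rather than something stated in these notes.

```python
def apparent_frequency(f_signal, f_sample):
    """Frequency a sine at f_signal appears to have when sampled at f_sample.

    The result is f_signal folded into the band [0, f_sample / 2].
    """
    if f_sample <= 0:
        raise ValueError("sampling rate must be positive")
    f = f_signal % f_sample
    return min(f, f_sample - f)
```

For example, a 1000 Hz sine sampled once per cycle (1000 Hz) folds to 0 Hz, a constant. Sampled 1.5 times per cycle (1500 Hz) it folds to 500 Hz, a lower-frequency alias. Sampled at 3000 Hz it is reported correctly as 1000 Hz.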
Signal to Noise Ratio (SNR)
- In any analog system, some of the voltage is what you want to measure
(signal), and some of it is random fluctuations (noise).
- The ratio of the power of the two is called the signal to noise ratio
(SNR). SNR is a measure of the quality of the signal.
- SNR is usually measured in decibels (dB):
SNR = 10 log10 (P_signal / P_noise) = 20 log10 (V_signal / V_noise)
Signal to Quantization Noise Ratio (SQNR)
- The precision of the digital audio sample is determined by the number of
bits per sample, typically 8 or 16 bits.
The quality of the quantization can be measured by the Signal to
Quantization Noise Ratio (SQNR).
- The quantization error (or quantization noise) is the
difference between the actual value of the analog signal at the sampling time
and the nearest quantization interval value.
The largest (worst) quantization error is half of the interval.
- Given N to be the number of bits per sample, the range of the digital
signal is -2^(N-1) to 2^(N-1) - 1.
Taking the peak signal value as 2^(N-1) and the worst-case quantization error
as 1/2,
SQNR = 20 log10 (2^(N-1) / (1/2)) = 20 N log10 2 = 6.02N dB.
In other words, each bit adds about 6 dB of resolution, so 16 bits enable a
maximum SQNR = 96 dB.
(** The above is for the worst case. If we assume instead that the input
signal is sinusoidal, and that the quantization error is statistically
independent with a magnitude uniformly distributed between 0 and half of the
interval, then
SQNR = 6.02N + 1.76 dB. [Pohlmann95, p. 37])
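A quick numerical check of the sinusoidal-input formula is sketched below. It quantizes a near-full-scale sine with a uniform quantizer (step size 1) and compares the measured SQNR with 6.02N + 1.76. The function names, the test frequency of 0.01234567 cycles per sample, and the sample count are arbitrary choices made for this illustration.

```python
import math


def sqnr_formula_db(n_bits):
    """SQNR predicted for a full-scale sine wave: 6.02 N + 1.76 dB."""
    return 6.02 * n_bits + 1.76


def sqnr_measured_db(n_bits, n_samples=20000):
    """Quantize a near-full-scale sine to n_bits and measure the SQNR in dB."""
    amplitude = 2 ** (n_bits - 1) - 1  # largest positive sample value
    signal_power = 0.0
    noise_power = 0.0
    for k in range(n_samples):
        # Arbitrary frequency, chosen so samples do not repeat quickly.
        x = amplitude * math.sin(2 * math.pi * 0.01234567 * k)
        q = round(x)  # uniform quantizer: nearest integer level
        signal_power += x * x
        noise_power += (x - q) ** 2
    return 10 * math.log10(signal_power / noise_power)
```

The measured value should land close to the formula (within a fraction of a dB for 8 bits). The small remaining gap comes from using an amplitude slightly below full scale and from the error not being perfectly uniform.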
Linear and Non-linear Quantization
- Samples are typically stored as raw numbers (linear format), or as
logarithms (u-law, or A-law in Europe).
- Logarithmic quantization approximates the non-uniformity of human
perception: quiet signals get finer quantization steps than loud ones.
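To make the logarithmic mapping concrete, here is a minimal sketch of the continuous u-law (mu-law) companding curve, F(x) = sgn(x) ln(1 + mu|x|) / ln(1 + mu). The formula and the value mu = 255 are the commonly quoted ones and are not given in these notes. Telephone codecs use a segmented 8-bit approximation of this curve, which this sketch does not reproduce.

```python
import math

MU = 255  # commonly quoted value for 8-bit u-law; an assumption here


def mu_law_compress(x, mu=MU):
    """Map a sample x in [-1, 1] to [-1, 1] on a logarithmic scale."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)
```

With these definitions an input of 0.01 (1% of full scale) is mapped to more than 0.2, so small amplitudes occupy a much larger share of the available code values than they would in a linear format.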
Typical Audio Formats
- Popular audio file formats include .au (Unix workstations), .aiff (Mac,
SGI), and .wav (PC, DEC workstations).
- A simple and widely used audio compression method is Adaptive Differential
Pulse Code Modulation (ADPCM). Based on past samples, it predicts the next
sample and encodes the difference between the actual value and the predicted
value.
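The predict-and-encode-the-difference idea can be sketched in its simplest form, where the prediction is just the previous sample. This is plain (lossless) differential coding, not ADPCM itself: real ADPCM also quantizes the differences and adapts the quantizer step size. All names below are illustrative.

```python
def dpcm_encode(samples):
    """Encode each sample as its difference from the predicted value.

    The predictor here is simply the previous sample (0 before the first).
    """
    diffs = []
    prediction = 0
    for s in samples:
        diffs.append(s - prediction)
        prediction = s
    return diffs


def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    samples = []
    prediction = 0
    for d in diffs:
        prediction += d
        samples.append(prediction)
    return samples
```

For a slowly varying signal such as [10, 12, 13, 11] the differences [10, 2, 1, -2] are mostly small, which is what lets a real codec spend fewer bits on them.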
Audio Quality vs. Data Rate
Quality     Sample Rate   Bits per   Mono/    Data Rate            Frequency
            (kHz)         Sample     Stereo   (if Uncompressed)    Band
---------   -----------   --------   ------   ------------------   ------------
Telephone   8             8          Mono     8 KBytes/sec         200-3,400 Hz
AM Radio    11.025        8          Mono     11.0 KBytes/sec
FM Radio    22.050        16         Stereo   88.2 KBytes/sec
CD          44.1          16         Stereo   176.4 KBytes/sec     20-20,000 Hz
DAT         48            16         Stereo   192.0 KBytes/sec     20-20,000 Hz
DVD Audio   192           24         Stereo   1,152.0 KBytes/sec   20-20,000 Hz
- Telephone uses u-law encoding, others use linear. So the dynamic
range of digital telephone signals is effectively 13 bits rather than 8 bits.
- CD quality stereo sound --> 10.6 MB / min.
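The uncompressed data rates in the table follow from sample rate x bytes per sample x number of channels. A small sketch of that arithmetic follows. The function name is made up for this example, and 1 KByte is taken as 1000 bytes, which is what the table's figures imply.

```python
def data_rate_bytes_per_sec(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed audio data rate in bytes per second."""
    return sample_rate_hz * channels * bits_per_sample / 8


# CD quality: 44.1 kHz, 16 bits, stereo
cd_rate = data_rate_bytes_per_sec(44100, 16, 2)  # 176400 bytes/sec
cd_per_minute = cd_rate * 60                     # 10584000 bytes, about 10.6 MB
```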
Synthetic Sounds
- FM (Frequency Modulation) Synthesis -- used in low-end Sound Blaster cards,
OPL-4 chip
- Wavetable synthesis -- wavetable generated from sound waves of real
instruments
- FM Synthesis is good for creating new sounds. Wavetables can store
sounds of existing instruments nicely.
- The wavetables are stored in memory on the sound card and they can be
manipulated by software.
- To save memory space, a variety of special techniques, such as sample
looping, pitch shifting, mathematical interpolation, and polyphonic digital
filtering can be applied.
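For readers who want to hear what FM synthesis does, here is a minimal sketch of the textbook two-oscillator form, y(t) = sin(2 pi fc t + I sin(2 pi fm t)), where fc is the carrier frequency, fm the modulator frequency, and I the modulation index. This formula is the standard one and is not given in these notes. It is not a model of how any particular sound card chip implements FM, and the parameter names and defaults are assumptions.

```python
import math


def fm_samples(carrier_hz, modulator_hz, index, sample_rate=44100, n=1000):
    """Return n samples of sin(2*pi*fc*t + index * sin(2*pi*fm*t))."""
    out = []
    for k in range(n):
        t = k / sample_rate
        phase = 2 * math.pi * carrier_hz * t
        phase += index * math.sin(2 * math.pi * modulator_hz * t)
        out.append(math.sin(phase))
    return out
```

With index = 0 the output is a plain sine at the carrier frequency. Raising the index adds sidebands around the carrier, which is how a pair of oscillators can produce a wide range of new timbres.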
Further Exploration
CD audio file formats
Definition of MIDI: a protocol that enables computers, synthesizers,
keyboards, and other musical devices to communicate with each other.
1. Terminology:
Synthesizer:
- It is a sound generator (various pitch, loudness, tone color).
- A good (musician's) synthesizer often has a microprocessor, keyboard,
control panels, memory, etc.
Sequencer:
- It can be a stand-alone unit or a software program for a personal
computer. (It used to be a storage server for MIDI data. Nowadays it is more a
software music editor on the computer.)
- It has one or more MIDI INs and MIDI OUTs.
Track:
- A track in a sequencer is used to organize the recordings.
- Tracks can be turned on or off during recording or playback.
Channel:
- MIDI channels are used to separate information in a MIDI system.
- There are 16 MIDI channels in one cable.
- Channel numbers are coded into each MIDI message.
Timbre:
- The quality of the sound, e.g., flute sound, cello sound, etc.
- Multitimbral -- capable of playing many different sounds at the same time
(e.g., piano, brass, drums, etc.)
Pitch:
- musical note that the instrument plays
Voice:
- Voice is the portion of the synthesizer that produces sound.
- Synthesizers can have many (16, 20, 24, 32, 64, etc.) voices.
- Each voice works independently and simultaneously to produce sounds of
different timbre and pitch.
Patch:
- the control settings that define a particular timbre.
2. Hardware Aspects of MIDI
MIDI connectors:
-- three 5-pin ports found on the back of every MIDI unit
- MIDI IN: the connector via which the device receives all MIDI data.
- MIDI OUT: the connector through which the device transmits all the MIDI
data it generates itself.
- MIDI THROUGH: the connector by which the device echoes the data it receives
from MIDI IN.
Note: Only the MIDI IN data is echoed by MIDI THROUGH. All the
data generated by the device itself is sent through MIDI OUT.
A Typical MIDI Sequencer Setup:
- MIDI OUT of synthesizer is connected to MIDI IN of sequencer.
- MIDI OUT of sequencer is connected to MIDI IN of synthesizer and "through"
to each of the additional sound modules.
- During recording, the keyboard-equipped synthesizer is used to send MIDI
messages to the sequencer, which records them.
- During playback, messages are sent out from the sequencer to the sound
modules and the synthesizer, which play back the music.
3. MIDI Messages
-- MIDI messages are used by MIDI devices to communicate with each other.
Structure of MIDI messages:
- A MIDI message includes a status byte and up to two data bytes.
- Status byte
- The most significant bit of the status byte is set to 1.
- The 4 low-order bits identify which channel it belongs to (four bits
produce 16 possible channels).
- The 3 remaining bits identify the message.
- The most significant bit of a data byte is set to 0.
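The bit layout of the status byte can be unpacked as in the sketch below. The function name is hypothetical. The sketch reports channels as 1 to 16 (the low four bits plus one). It only makes sense for channel messages: for system messages (status bytes &HF0 and above) the low four bits select the message rather than a channel.

```python
def parse_status_byte(status):
    """Split a channel-message status byte into (message bits, channel 1-16)."""
    if not 0x80 <= status <= 0xFF:
        raise ValueError("a status byte has its most significant bit set to 1")
    message_bits = (status >> 4) & 0x07  # the 3 bits after the MSB
    channel = (status & 0x0F) + 1        # low 4 bits: 0-15 -> channel 1-16
    return message_bits, channel
```

For instance, &H9C (binary 1001 1100) yields message bits 001 and channel 13.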
Classification of MIDI messages:
                                          |---- voice messages
                  |---- channel messages -|
                  |                       |---- mode messages
MIDI messages ----|
                  |                       |---- common messages
                  |---- system messages --|---- real-time messages
                                          |---- exclusive messages
A. Channel messages:
-- messages that are transmitted on individual channels rather than globally
to all devices in the MIDI network.
A.1. Channel voice messages:
- Instruct the receiving instrument to assign particular sounds to its voices
- Turn notes on and off
- Alter the sound of the currently active note or notes
Voice Message             Status Byte   Data Byte1          Data Byte2
-----------------------   -----------   -----------------   ------------------
Note off                  &H8x          Key number          Note Off velocity
Note on                   &H9x          Key number          Note on velocity
Polyphonic Key Pressure   &HAx          Key number          Amount of pressure
Control Change            &HBx          Controller number   Controller value
Program Change            &HCx          Program number      None
Channel Pressure          &HDx          Pressure value      None
Pitch Bend                &HEx          LSB                 MSB
Notes: `x' in the status byte hex value stands for the channel number minus
one (channels 1-16 are coded as 0-F).
Example: a Note On message is followed by two bytes, one to identify the
note, and one to specify the velocity.
To play note number 80 with maximum
velocity on channel 13, the MIDI device would send these three hexadecimal byte
values: &H9C &H50 &H7F
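The Note On example can be reproduced with a few lines of code. This is only a sketch of the byte layout described above. The function name and the range checks are illustrative, and nothing here sends data to a real MIDI port.

```python
def note_on(channel, key, velocity):
    """Build the three bytes of a Note On message (channel given as 1-16)."""
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1-16")
    if not 0 <= key <= 127 or not 0 <= velocity <= 127:
        raise ValueError("data bytes must be 0-127 (most significant bit 0)")
    status = 0x90 | (channel - 1)  # &H9x, with x = channel - 1
    return bytes([status, key, velocity])


message = note_on(13, 80, 127)  # bytes 9C 50 7F
```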
A.2. Channel mode messages:
- Channel mode messages are a special case of the Control Change message
(&HBx or 1011nnnn). A Control Change message and a Channel Mode message share
the same status byte value; the difference is in the first data byte. Data
byte values 121 through 127 have been reserved in the Control Change message
for the channel mode messages.
- Channel mode messages determine how an instrument will process MIDI voice
messages.
1st Data Byte Description Meaning of 2nd Data Byte
------------- ---------------------- ------------------------
&H79 Reset all controllers None; set to 0
&H7A Local control 0 = off; 127 = on
&H7B All notes off None; set to 0
&H7C Omni mode off None; set to 0
&H7D Omni mode on None; set to 0
&H7E Mono mode on (Poly mode off) **
&H7F Poly mode on (Mono mode off) None; set to 0
** if value = 0 then the number of channels used is determined by the
receiver; all other values set a specific number of channels, beginning with the
current basic channel.
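Since Control Change and Channel Mode messages share the &HBx status byte, a receiver has to look at the first data byte to tell them apart. The sketch below uses the reserved range 121 through 127 exactly as stated in these notes. The function name and return strings are made up for the example.

```python
def classify_bx_message(status, data1):
    """Classify an &HBx message by its first data byte."""
    if status & 0xF0 != 0xB0:
        raise ValueError("not a Control Change / Channel Mode status byte")
    if not 0 <= data1 <= 127:
        raise ValueError("data bytes must be 0-127")
    # 121-127 (&H79-&H7F) are reserved for channel mode messages (per the text).
    return "channel mode" if 121 <= data1 <= 127 else "control change"
```

For example, &HB0 &H7B (All notes off) is classified as a channel mode message, while &HB0 &H07 is an ordinary Control Change.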
B. System Messages:
- System messages carry information that is not channel specific, such as
timing signals for synchronization, positioning information in pre-recorded
MIDI sequences, and detailed setup information for the destination device.
B.1. System real-time messages:
- messages related to synchronization
System Real-Time Message Status Byte
------------------------ -----------
Timing Clock &HF8
Start Sequence &HFA
Continue Sequence &HFB
Stop Sequence &HFC
Active Sensing &HFE
System Reset &HFF
B.2. System common messages:
- contain the following unrelated messages
System Common Message Status Byte Number of Data Bytes
--------------------- ----------- --------------------
MIDI Time Code &HF1 1
Song Position Pointer &HF2 2
Song Select &HF3 1
Tune Request &HF6 None
B.3. System exclusive message:
- (a) Messages related to things that cannot be standardized, and
(b) additions to the original MIDI specification.
- It is just a stream of bytes, all with their high bits set to 0, bracketed
by a pair of system exclusive start and end messages (&HF0 and &HF7).
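The framing rule for system exclusive data (a start byte, data bytes with the high bit clear, an end byte) can be sketched as follows. The function and constant names are hypothetical, and the payload in the usage line is arbitrary placeholder data rather than a real device message.

```python
SYSEX_START = 0xF0
SYSEX_END = 0xF7


def frame_sysex(payload):
    """Wrap data bytes in a system exclusive start/end pair."""
    for b in payload:
        if not 0 <= b <= 0x7F:
            raise ValueError("sysex data bytes must have their high bit set to 0")
    return bytes([SYSEX_START]) + bytes(payload) + bytes([SYSEX_END])


framed = frame_sysex([0x01, 0x02, 0x03])  # bytes F0 01 02 03 F7
```

Rejecting any byte with the high bit set matters because a receiver would otherwise interpret it as the status byte of a new message.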
4. General MIDI
- MIDI + Instrument Patch Map + Percussion Key Map --> a piece of MIDI
music sounds the same anywhere it is played
- Instrument patch map is a standard program list consisting of 128 patch
types.
- Percussion map specifies 47 percussion sounds.
- Key-based percussion is always transmitted on MIDI channel 10.
- Requirements for General MIDI Compatibility:
- Support all 16 channels.
- Each channel can play a different instrument/program (multitimbral).
- Each channel can play many voices (polyphony).
- Minimum of 24 fully dynamically allocated voices.
Appendix
A1. General MIDI Instrument Patch Map
A2. General MIDI Percussion Key Map
Further Exploration
Try some good sources for locating internet sound/music materials at
A tutorial on MIDI and wavetable music synthesis
YAHOO's Multimedia:Sound Page