[JPG image artwork that is a companion to the text]

How to Create a Wave File using Scilab

introduction
the frequencies
making harmonic components
the sound envelope
creating the wave file
summary

introduction

Scilab can be used to create a wave file which you might then use in your MIDI synth soundfont table.

This wave file will be created using the additive synthesis technique, that is, by adding together frequency components.

the frequencies

We saw on the first page, basic sound generation, how to make a sine wave with Scilab. The frequencies for those sine waves were arbitrary, mathematical frequencies. I just used sin(x) as the fundamental frequency, sin(2*x) as the first harmonic, etc. The frequency doesn't mean anything until we set a sample rate, which is the number of times per second that the computer will read values from the file.

The basic mathematical argument to the sine function is an angle, which really signifies an angle measured around a circle. This angle is in dimensionless units called radians. Once around the circle, which is equivalent to one whole period of the fundamental frequency, is 2pi. So the first harmonic would be 4pi, the second 6pi, etc.

When we use the actual frequency of a sound, we must pass the time and the frequency, in cycles per second, to the sine function. But we must make the time and frequency come out like an angle, because the sine function takes an angle as its argument. We want the time it takes to go once around the circle to be the period of one oscillation. So the time and frequency are multiplied together and then scaled by 2pi. Here is an example of this in Scilab.

x = 2 * %pi * f * t;

So, instead of an array of angle values for the sine function, we will create an array of time values. But there is one more step. The machine, the sound adapter card, will read the audio samples we create at a certain speed called the sample rate, so we must match up the values in the sine argument with the machine's sample rate. Since the values are actually discrete, digital values, we can make the time argument equal to a time increment times a sample number:

t = time_increment * N

But the time increment is just the reciprocal of the sample rate, Fs.

t = N / Fs

So, the argument to the sine function written out in Scilab script is

x = 2 * %pi * (f / Fs) * N;

Now the sine wave values that we create will be reproduced correctly by the sound adapter card.
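
As a quick sanity check of this formula, here is a minimal sketch. The test frequency of 441 Hz is my own choice, not part of the sound we are building. I picked it so that one cycle is exactly 100 samples long at Fs = 44100. The names f_test, samples_per_cycle and mismatch are made up for this example.

```scilab
// Sketch: check that x = 2*%pi*(f/Fs)*N repeats every Fs/f samples.
// f_test = 441 is an assumed value chosen so that Fs/f_test is a whole number.
Fs = 44100;                            // sample rate, samples per second
f_test = 441;                          // test frequency, cycles per second
N = 0 : 199;                           // 200 sample numbers
t = N / Fs;                            // time of each sample, in seconds
x = 2 * %pi * (f_test / Fs) * N;       // phase angle in radians, same as 2*%pi*f_test*t
y = sin(x);
samples_per_cycle = Fs / f_test;       // 100 samples per cycle
// The second cycle should match the first, apart from rounding error.
mismatch = max(abs(y(1:100) - y(101:200)));
mprintf("samples per cycle: %g, mismatch: %g\n", samples_per_cycle, mismatch);
```

If the mismatch is not tiny, the frequency and the sample rate are not being combined the way the formula says.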

making harmonic components

In Scilab, we will create a sine wave like we did on the first page, basic sound generation, but this time the array will be the sample numbers, N, as in the statement above. Let's make the sound file 2 seconds long, so the array of sample numbers needs to hold 2 seconds' worth of samples. The most common sample rate for sound files is 44100 samples per second, so the array needs 88200 elements. Counting from 0, it runs from 0 to 88199. Let's choose concert A, 440 cycles per second, as our fundamental frequency. Enter the following commands in Scilab:

N = 0 : 88199;
Fs = 44100;
f = 440;
x = 2 * %pi * (f / Fs) * N;

Remember to put the percent sign in front of pi in the command above. Now we can create a real frequency component that, when played back by a sound adapter, will actually produce a 440 Hz sine wave. Enter the following command into Scilab:

y = sin(x);

This array, y, is just a sine wave. You can plot it with the following command:

plot(x, y)
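
Two seconds of a 440 Hz tone holds 880 cycles, so a plot of the whole array tends to look like a solid band. A minimal sketch for viewing just the first few cycles against time in seconds follows. The names t and n_show and the choice of five cycles are mine, for illustration only.

```scilab
// Sketch: plot only the first five cycles of the 440 Hz wave against time.
Fs = 44100;                        // sample rate, samples per second
f = 440;                           // fundamental frequency, cycles per second
n_show = round(5 * Fs / f);        // number of samples in about five cycles
N = 0 : n_show - 1;                // just enough sample numbers for the plot
x = 2 * %pi * (f / Fs) * N;        // phase angle in radians
y = sin(x);
t = N / Fs;                        // time axis in seconds
plot(t, y);
xtitle("First five cycles of the 440 Hz sine wave", "time (s)", "amplitude");
```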

If we use additive synthesis to create the harmonics, like we did on the first page, basic sound generation, we would repeat the above step and multiply the phase angle argument, x, by integer numbers:

y2 = sin(2*x);
y3 = sin(3*x);
y4 = sin(4*x);

You can add as many as you want, that is, as many as it takes to get the sound you are after. But with additive synthesis, you need to choose the right harmonics with just the right amplitudes. So for this example we can try staggering the amplitudes. Sounds from real musical instruments usually, but not always, have higher frequencies with smaller amplitudes than the lower frequencies. There is a basic reason for that: higher frequencies take more energy to create, so you have to work harder to get them. So for amplitudes let's choose 1, 1/3, 1/5, 1/7.

Another trick is to slightly detune the harmonics. Instead of using an exact integer to multiply the fundamental frequency, use a number that is a fraction of a percent away from the integer. An ideal resonator, such as an ideal guitar string, produces harmonics at exact integer multiples of the fundamental frequency, and a real string comes very close. Slight detuning simulates the wave interference that happens in chorusing. So, for our actual harmonics enter the following Scilab commands:

y2 = (1/3)*sin(2.002*x);
y3 = (1/5)*sin(2.995*x);
y4 = (1/7)*sin(4.003*x);

Now do additive synthesis and add the harmonics with the following Scilab command:

z = y + y2 + y3 + y4;
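
The same sum can also be written as a loop over a table of multipliers and amplitudes, which makes it easier to add or retune components later. This is only a sketch. The names mult, amp and z_loop are made up for this example, and a short 0.1 second array is used to keep it quick.

```scilab
// Sketch: additive synthesis driven by a table of components.
Fs = 44100;                            // sample rate, samples per second
f = 440;                               // fundamental frequency, cycles per second
N = 0 : 4409;                          // 0.1 second of samples, for a quick test
x = 2 * %pi * (f / Fs) * N;
mult = [1, 2.002, 2.995, 4.003];       // frequency multipliers from the text
amp  = [1, 1/3, 1/5, 1/7];             // matching amplitudes from the text
z_loop = zeros(1, length(x));
for k = 1 : length(mult)
    z_loop = z_loop + amp(k) * sin(mult(k) * x);
end
// Compare with the four terms written out by hand.
z_hand = sin(x) + (1/3)*sin(2.002*x) + (1/5)*sin(2.995*x) + (1/7)*sin(4.003*x);
disp(max(abs(z_loop - z_hand)));       // zero, or rounding error at most
disp(mult * f);                        // component frequencies in cycles per second
```

The last line shows how far each detuned component sits from its exact multiple of 440.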

Now let's see what the wave looks like. Plot the wave with the following Scilab command:

plot(x, z)

and you will get a graph that looks like this:

the sound envelope

A real musical note played by an instrument has an envelope to the overall sound. The beginning, middle and ending amplitudes of the note have a particular shape that we recognize as different for the guitar than for the trombone. This is why early analog synthesizers needed voltage controlled amplifiers to produce an ADSR envelope (attack, decay, sustain, release).

On a computer an envelope is not hard to create. Creating a realistic envelope that actually mimics a real musical instrument is an art and a science. But we can do at least as well as the commercial synthesizers with Scilab, and other software, once we know how.

The envelope is created by multiplying each sound sample by a fractional number (between 0 and 1). In electrical engineering this is called modulation. The easy way to do this with our Scilab frequencies is to create another array, with the same number of elements as the harmonics array, then multiply the two arrays element-by-element.
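
As a tiny illustration of element-by-element multiplication, here is a sketch with two made-up four-element arrays. The names s, g and m are hypothetical and are not part of the sound we are building.

```scilab
// Sketch: scale each sample by its own envelope value with the .* operator.
s = [1, -1, 1, -1];        // pretend sound samples
g = [0, 0.5, 1, 0.5];      // pretend envelope values, each between 0 and 1
m = s .* g;                // element-by-element product
disp(m);                   // values: 0, -0.5, 1, -0.5
```

A plain star without the dot asks for a matrix product instead, which is not defined for two row arrays of the same length.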

The attack, or rise time, of the note will typically be between 0.01 seconds and 0.2 seconds. If we choose a rise time of 0.05 seconds, then we must create a line that has the same length as 0.05 seconds of the harmonics array. So, 0.05 seconds multiplied by 44100 samples per second is 2205 samples. So in Scilab, create an array of numbers, of length 2205, that starts at 0 and goes to 1:

line1 = linspace(0, 1, 2205);

Let's make a smooth, rather than hard, sound. We will therefore hold the peak amplitude for a significant fraction of a second, say 0.1 seconds, before letting the sound decay. Using the same formula as above, time multiplied by sample rate, we find that we need a length of 4410 samples to hold the amplitude:

line2 = linspace(1, 1, 4410);

Then the amplitude will drop a little, say by 10 percent, over the same time interval, before the decay commences:

line3 = linspace(1, 0.9, 4410);

Now let the sound gradually decay for the remaining duration of the note, say by 50 percent. But let's stop the note cleanly by pinching it off with the reverse of an attack rise time, that is, with a turn-off envelope. So the last 2205 elements must be reserved for this turn off time. The time duration of the middle, decaying, part of the note is just the remainder from the other times: 2 seconds - 0.05 - 0.1 - 0.1 - 0.05 = 1.7 seconds. And 1.7 seconds corresponds to 74970 samples.

line4 = linspace(0.9, 0.45, 74970);

Now the turn-off slope will last for the same time as the attack, 2205 samples.

line5 = linspace(0.45, 0, 2205);

The Scilab function above, linspace, creates an array with the number of array elements given by the third argument. Its first value will be the value in the first argument, and the last value in the array will be the second argument to the function.
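
A small case makes the behavior of linspace easy to see. This is just a sketch, and the names ramp and flat are made up for the example.

```scilab
// Sketch: linspace(first, last, count) on a small case.
ramp = linspace(0, 1, 5);  // five evenly spaced values from 0 to 1
disp(ramp);                // values: 0, 0.25, 0.5, 0.75, 1
disp(length(ramp));        // 5
flat = linspace(1, 1, 3);  // equal end points give a constant array: 1, 1, 1
disp(flat);
```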

Make sure the total number of samples in the five arrays we created above is equal to the length of the harmonics arrays: 88200.

line 1: 2205 samples
line 2: 4410 samples
line 3: 4410 samples
line 4: 74970 samples
line 5: 2205 samples

total samples = 2205 + 4410 + 4410 + 74970 + 2205 = 88200
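
This bookkeeping can also be left to Scilab. The following sketch derives each segment length from its duration in seconds and then checks the total. The names dur, lev, n and env_check are made up for this example. It assumes each duration times the sample rate comes out close to a whole number, which is why round is used.

```scilab
// Sketch: derive segment lengths from durations and check the total.
Fs = 44100;                            // sample rate, samples per second
dur = [0.05, 0.1, 0.1, 1.7, 0.05];     // segment durations in seconds, from the text
lev = [0, 1, 1, 0.9, 0.45, 0];         // amplitude at each segment boundary
n = round(dur * Fs);                   // samples per segment
env_check = [];
for k = 1 : length(n)
    // Each segment runs from one boundary level to the next.
    env_check = [env_check, linspace(lev(k), lev(k+1), n(k))];
end
disp(n);                               // 2205, 4410, 4410, 74970, 2205
disp(sum(n));                          // 88200
disp(length(env_check));               // 88200
```

Changing one duration in dur then changes the total automatically, so a mismatch with the harmonics array shows up right away.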

Now build and plot the total envelope in the Scilab workspace.

envp = [line1, line2, line3, line4, line5];
plot(x, envp)

You should get a plot that looks like this:

The very top ledge of the initial pulse aligns exactly with the border of the graph window, so to see it better you could use the more advanced Scilab command,

plot2d(x, envp, rect=[0,0,5529,1.1])

which will look like this:

Now to apply the amplitude envelope to the harmonic components we just multiply the two arrays together:

w = z .* envp;

Notice that the multiplication was performed with a dot in front of the star. This is necessary for multiplying arrays element-by-element. Now if we plot the result, it looks like this,

creating the wave file

We can now save the sound data we just created into a wave file. To do this we will use the Scilab command called wavwrite. To use wavwrite we need to keep in mind that wavwrite expects the sound data to be arranged in a matrix. I didn't mention this before, but the arrays of data we have been creating are row arrays rather than column arrays. The difference is that a row array is printed out all on one line, like

                1  2  3  4  5  6  .....   N

and a column array is printed out all in one column, like

                1
                2
                3
                .
                N

This only matters for using Scilab and has nothing to do with sound and music synthesis. Arrays in Scilab are set up this way because Scilab is a mathematics software program, and the difference between row arrays and column arrays is important in matrix algebra. This little diversion was necessary because the wavwrite function expects the arrays to be arranged in rows. Changing a row array to a column array, or the reverse, is called a transpose operation (a term from matrix algebra). In Scilab, you can transpose an array by typing it with a single quote character after it:

y_transpose = y';
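
To see which way an array is arranged, you can ask Scilab for its size. The sketch below uses a made-up five-element array. The names r and c are hypothetical.

```scilab
// Sketch: a row array and its transpose have swapped dimensions.
r = [1, 2, 3, 4, 5];       // row array: 1 row, 5 columns
c = r';                    // column array: 5 rows, 1 column
disp(size(r));             // 1 row, 5 columns
disp(size(c));             // 5 rows, 1 column
```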

One more detail is that the wavwrite function expects the sound data to be in the range of -1 to +1. This is also just a Scilab detail. The actual sound data values in a 16-bit wave file are integer numbers that range from -32768 to +32767. Scilab uses the fractional values between -1 and +1 because this is usually a more convenient range in numerical mathematics. The wavwrite function will convert the -1 to +1 range of values that we give to it and create a wave file with data in the -32768 to +32767 range. So, the way to normalize our data (another mathematics word) is to divide every number by the maximum absolute value. This is easily done with one line of code in Scilab:

wn = w / max(abs(w));
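
Here is the same normalization on a small made-up array, followed by a rough picture of the integer range it maps onto. The names w_demo, wn_demo and as_int16 are hypothetical. The last step is only an illustration of the range: wavwrite does its own conversion, and I am not claiming it uses exactly this scaling or rounding.

```scilab
// Sketch: normalize a small made-up array and show the 16-bit range it maps onto.
w_demo = [0.2, -1.5, 0.75, 3];             // hypothetical un-normalized samples
wn_demo = w_demo / max(abs(w_demo));       // now within -1 to +1
disp(max(abs(wn_demo)));                   // 1
// Illustration only: wavwrite performs its own conversion internally,
// and its exact scaling and rounding may differ from this line.
as_int16 = round(wn_demo * 32767);
disp(as_int16);
```

If the array were all zeros, max(abs(w)) would be 0 and the division would not give usable data, so a longer script might check for that first.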

We might as well make the wave file stereo. So we will just replicate the array and create two rows of data, one per channel, to put into the wavwrite function:

Ws = [wn ; wn];

In the above step, we combined two rows of data into a single two-row matrix.

The final step is to pass the matrix of sound data to the wavwrite function. We must specify the sample rate, Fs, and we will name the file "first_wave.wav":

wavwrite(Ws, Fs, 'first_wave.wav');
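
One way to confirm that a file was written as intended is to read it back with Scilab's wavread function. The sketch below writes a short test tone and reads it back. The file name check_tone.wav, the use of the TMPDIR folder, and the names tone, stereo and fname are assumptions made for this example, so it does not touch first_wave.wav.

```scilab
// Sketch: write a short stereo test tone, then read it back to confirm the header.
// The file name check_tone.wav and the use of TMPDIR are assumptions for this example.
Fs = 44100;                                    // sample rate, samples per second
N = 0 : 4409;                                  // 0.1 second of samples
tone = 0.5 * sin(2 * %pi * (440 / Fs) * N);    // already within -1 to +1
stereo = [tone ; tone];                        // two rows, one per channel
fname = fullfile(TMPDIR, "check_tone.wav");
wavwrite(stereo, Fs, fname);
[y_back, Fs_back, bits_back] = wavread(fname); // data, sample rate, bits per sample
disp(size(y_back));                            // channels and samples as stored
disp(Fs_back);                                 // should equal Fs
disp(bits_back);                               // bits per sample in the file
```

The values read back will differ very slightly from the ones written, because they were rounded to integers in the file.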

Now you can listen to first_wave.wav by playing it through your sound adapter. It's not going to win any awards, but it was simple enough for a tutorial.

summary

What we did in this tutorial was to create sound by programming a computer. But this was not MIDI programming, which is usually what musicians think of when programming is mentioned. This kind of programming is what the Csound program does. Whereas MIDI programming is about arranging notes into a song, Csound actually generates the sound. Csound will also let you write the notes for your song, which go in its score file. But the kind of programming we did in this tutorial, that is, sound programming, is written into the Csound instrument file.

These are the important techniques we learned in this tutorial:

We created sound by combining sine waves, that is, with additive synthesis. This is analogous to combining sine wave oscillator signals with a mixer in the old analog synthesizers.
We gave the sound an ADSR amplitude envelope by modulating the sound wave, that is, by multiplying each sound sample by a corresponding fraction between 0 and 1. This is analogous to the function that voltage controlled amplifiers performed in the old analog synthesizers.
We learned how to convert the arbitrary, mathematical representation of the sine wave function, y = sin(x), into the form needed by your sound adapter with the conversion formula
- x = 2 * pi * (f / Fs) * N
This is a fundamental formula used in digital signal processing, and you will need it, badly.
We learned how to avoid unwanted clicking noises when the sound starts and stops by providing the envelope with sufficient rise time and providing a turn-off slope.
We learned the basics of using Scilab to create sound.
