Synthesized sound is just making waves. Computers have two basic ways of recreating sound: one is digitized sound (sample it, then play it back later), and the other is synthesized sound (make a waveform that approximates what you want). Think of it like taking a picture versus sketching or painting one. Synthesizing is the latter: creating pressure waves by algorithm rather than recording them.
Sampling is great, and easy; it is basically just digital recording. The downside is that sampling takes a lot of memory. If you care about the math, or don't believe me, it is easy to calculate how much. The formula for the data rate is sample size x sample rate x channels. Plugging in real-world values: 16 bits (2 bytes) per sample, about 44,000 samples per second (CD audio uses 44,100), and two channels (stereo) comes to about 172K per second, or roughly 10 megabytes per minute of sound, or 600 megabytes per hour. We can use a technique called compression to fit that sound in about 1/4th to 1/8th the space, and we can also lower the quality of the sound to reduce the size further, but digitized sound still takes a lot of memory. So instead of recreating sound by sampling, we can synthesize the sound.
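For those who want to check the arithmetic, here is a minimal sketch in Python. It assumes uncompressed audio at the exact CD rate of 44,100 samples per second (the rounded figure of 44,000 gives nearly the same answers); the function name and its parameters are made up for this illustration.

```python
def sampled_audio_bytes(seconds, sample_rate=44_100, bytes_per_sample=2, channels=2):
    """Uncompressed size in bytes: sample size x sample rate x channels x duration."""
    return bytes_per_sample * sample_rate * channels * seconds

per_second = sampled_audio_bytes(1)
per_minute = sampled_audio_bytes(60)
per_hour = sampled_audio_bytes(60 * 60)

# Report in binary units (1 KB = 1024 bytes, 1 MB = 1024 KB).
print(f"{per_second / 1024:.0f} KB per second")
print(f"{per_minute / 1024**2:.1f} MB per minute")
print(f"{per_hour / 1024**2:.0f} MB per hour")
```

Dropping to one channel or to 1 byte per sample halves the result each time, which is the "lower the quality" trade-off in numbers.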
Synthesis is creating a sound from various parts (or descriptions of the wave). The wave's frequency, level, and shape, along with many secondary and tertiary effects on those waves, all describe the "synthesized" sounds created. Then you just describe what notes or tones to play using those synthesized instruments. You end up with something like a sheet (or sheets) of music, which is a highly efficient way of telling the orchestra or band what they should play. This is kind of like a compression that takes 1/100th the space of the original music, or less.
It can get pretty complex, and synthesis does not work well for just any sound. For example, it is very hard to synthesize a human voice so that it sounds human. But synthesis can be done pretty well for most instruments, and for music. And there are some serious advantages to synthesized music: you can substitute instruments after the fact, change the tempo, alter individual instruments separately from the rest, and so on. Just like a conductor, you can change the way the orchestra plays, whereas with digitized sounds you pretty much get to listen to them the way they were recorded.
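To make the "sheet music" idea concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the score layout, the instrument names, the helper functions, and the use of equal-temperament tuning with A4 at 440 Hz. The point is only that the same small description can be replayed at a different tempo or with a different instrument, which a recording cannot do.

```python
# Semitone offsets from A4 (440 Hz) for a few note names (equal temperament assumed).
SEMITONES_FROM_A4 = {"C4": -9, "D4": -7, "E4": -5, "F4": -4, "G4": -2, "A4": 0}

def note_frequency(name):
    """Frequency in Hz of a named note."""
    return 440.0 * 2 ** (SEMITONES_FROM_A4[name] / 12)

# A hypothetical score: (instrument, note, length in beats).
score = [("piano", "C4", 1), ("piano", "E4", 1), ("piano", "G4", 2)]

def schedule(score, tempo_bpm, instrument=None):
    """Turn the score into (instrument, frequency in Hz, seconds) events.

    Passing instrument overrides the one written in the score.
    """
    seconds_per_beat = 60.0 / tempo_bpm
    return [(instrument or inst, note_frequency(note), beats * seconds_per_beat)
            for inst, note, beats in score]

slow = schedule(score, tempo_bpm=60)
fast_flute = schedule(score, tempo_bpm=120, instrument="flute")
for event in fast_flute:
    print(event)
```

The score itself is a few dozen bytes, while the sound it describes would take hundreds of kilobytes as samples.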
The basic wave tones (notes) are simple (or compound) continuous frequencies that are easy to describe mathematically. The basic ones include the sine wave, the triangle wave, and the square wave.
All are pretty easy to describe. However, these synthetic waves don't look exactly like the wave an instrument makes when it plays a note. A perfect tone may look like a sine wave, and some tones may look similar to triangle waves or square waves, but never exactly. The synthetic sounds are "too pure" and so sound electronic. Real instruments have complex waves, with slop that makes them sound natural but harder to recreate.
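To show just how easy these waves are to describe, here is a short sketch in Python. The function names, the 0-to-1 "phase" convention, and the 44,100 samples-per-second rate are assumptions of this example, not part of any particular sound system.

```python
import math

# Each shape maps a phase (position within one cycle, 0.0 <= phase < 1.0)
# to a level between -1.0 and +1.0.

def sine(phase):
    return math.sin(2 * math.pi * phase)

def square(phase):
    # High for the first half of the cycle, low for the second half.
    return 1.0 if phase < 0.5 else -1.0

def triangle(phase):
    # Rises in a straight line from -1 to +1, then falls back to -1.
    return 4 * phase - 1 if phase < 0.5 else 3 - 4 * phase

def render(shape, frequency, seconds, sample_rate=44_100):
    """Return a list of samples of the given shape at the given frequency."""
    count = round(seconds * sample_rate)
    return [shape((i * frequency / sample_rate) % 1.0) for i in range(count)]

tone = render(sine, 440.0, 0.01)  # 10 milliseconds of a 440 Hz sine wave
```

Each wave is one line of arithmetic, which is why these tones are cheap to generate and also why they sound so pure.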
Wave Table -- We can make a compromise between sampled sound and synthesized sound. We can sample an individual wave form for an instrument. Then we use that wave (repeatedly) to represent the instrument. By speeding up (or slowing down) the rate at which that wave is played, we can increase and decrease the pitch, and so play different notes (tones) with that same wave (instrument). This is all done with tables that describe the wave, hence the name "Wave Table Synthesis". It is still a sample, but it is only one, or a few, wave forms long.
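The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 64-entry table is computed here as a stand-in for a wave form that would really be sampled from an instrument, and the function name, the linear interpolation between entries, and the 44,100 samples-per-second rate are all choices made for this example.

```python
import math

TABLE_SIZE = 64
# Stand-in for one sampled cycle of an instrument: a sine plus one overtone.
wave_table = [math.sin(2 * math.pi * i / TABLE_SIZE)
              + 0.5 * math.sin(4 * math.pi * i / TABLE_SIZE)
              for i in range(TABLE_SIZE)]

def play_table(table, frequency, seconds, sample_rate=44_100):
    """Loop through the table; the step per output sample sets the pitch."""
    step = frequency * len(table) / sample_rate  # table entries per output sample
    position = 0.0
    out = []
    for _ in range(round(seconds * sample_rate)):
        index = int(position)
        frac = position - index
        following = table[(index + 1) % len(table)]
        # Blend neighboring entries when the position falls between them.
        out.append(table[index] + frac * (following - table[index]))
        position = (position + step) % len(table)
    return out

a4 = play_table(wave_table, 440.0, 0.1)  # the note A
a5 = play_table(wave_table, 880.0, 0.1)  # one octave up: same table, double the step
```

Both notes come from the same 64 numbers; only the speed at which the table is walked changes.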
This technique produces sounds far more like the original instrument than traditional synthesized sounds, and it takes far less memory than digitized sound. Sampling the whole musical piece still gives a richer and more realistic experience, but it also requires far more memory and gives you far less control to alter it. Each way has its strengths and weaknesses, but in most cases digitized sound is much better if you have the memory to do it.
The wave form is only part of a sound -- it is the tone generated by an instrument. But instruments do not just create a tone (tones don't start and stop perfectly). They vary in volume as each note is created.
For example, a piano note has many stages. There is the initial sound (the hammer hitting the string and releasing); after the strike, the note drops to a sustained level and slowly fades; and then, when the pedal is released, the sound cuts off (but not instantly). A similar thing happens with wind instruments: there is a surge of sound as the note is first made, then the note drops to its sustain level, and then comes the release as the musician plays out or releases the valve and goes on to the next note.
We have to describe those features of a sound to synthesize it well. Those stages are described as ADSR -- Attack-Decay-Sustain-Release. The attack is the initial rise to the note's starting level; decay is how far and how quickly it drops to the sustain level; sustain is how level the tone is and how long it lasts; and finally there is the release, which can be sharp or shallow, depending on the instrument or musician.
The tone (wave form) of the instrument gets mixed with this ADSR description of the sound (note), and the result is a pretty good representation of the way instruments sound in real life; and it helps make the difference from playing notes to making music.
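Here is a minimal sketch of that mixing in Python. It assumes the simplest possible envelope, built from straight-line segments, and a plain sine wave as the tone; the function names and the default timings are invented for this example, and the note is assumed to be held at least as long as the attack plus the decay.

```python
import math

def adsr(t, note_length, attack=0.02, decay=0.1, sustain=0.6, release=0.2):
    """Envelope level (0.0 to 1.0) at t seconds after the note starts.

    attack, decay, and release are durations in seconds; sustain is a level.
    note_length is when the key is let go (assumed >= attack + decay).
    """
    if t < 0:
        return 0.0
    if t < attack:                    # rise from silence to full level
        return t / attack
    if t < attack + decay:            # fall from full level to the sustain level
        return 1.0 - (1.0 - sustain) * (t - attack) / decay
    if t < note_length:               # hold while the key is down
        return sustain
    if t < note_length + release:     # fade out after the key is let go
        return sustain * (1.0 - (t - note_length) / release)
    return 0.0

def shaped_note(frequency, note_length, release=0.2, sample_rate=44_100):
    """Multiply a sine tone by the envelope, sample by sample."""
    count = round((note_length + release) * sample_rate)
    return [adsr(i / sample_rate, note_length, release=release)
            * math.sin(2 * math.pi * frequency * i / sample_rate)
            for i in range(count)]

note = shaped_note(440.0, 0.5)
```

Changing only the four envelope values can make the same tone sound plucked (short attack, no sustain) or bowed (slow attack, high sustain).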
So synthesis is constructing sounds (music) from many different parts. Not only do you want to tell the computer what notes to play, what instrument to use, and what speed to play those notes back at, but you also want to be able to tell the computer about the wave form and the ADSR of the tone.
A little history
For those who care, historically, the Macs were far more advanced or pioneering than PCs. Macs did both digitized sounds and powerful wave table sound since 1984. And Macs quickly moved up in quality and went for stereo output and input. It took years for PCs to get both, and when they did, they weren't as good. In the interim PCs had simple tone generation. Commodore Amigas took synthesized sounds and wave forms to the next level around 1986 or 1987, and later PCs and Macs borrowed from those concepts.
The Mac had a software-based synthesizer and digitized sound, so it could adapt quickly; each program could modify capabilities or do its own thing, and upgrades could be done with just a new program or software patch. The PCs were usually hardware based, which took less computing time, but you had different capabilities depending on which card you had, and it took forever for any standards to be made. The Mac's sound architecture was better, more compatible from version to version, and more versatile, but it took more CPU overhead.
Ironically, the powerful digitized sound of Macs, and the fact that software synthesis took as much (or more) processing time than digital playback, meant that people didn't use wave tables as much and just opted for digitized sound. PCs, meanwhile, had memory and bandwidth limitations that made them inferior for digital sound. Plus, PCs had hardware synthesizers that took less processing power and I/O bandwidth than digital sound, so this pushed more people toward wave-table based computer sounds.
Nowadays, both platforms seem to have roughly similar capabilities.
I believe this article will give you a very good understanding of synthesized sound, and how it works in a computer. This article was in response to a reader who innocently asked for an explanation of how Wave Table sound works; to explain that, I first had to make sure that all the other "foundation" was explained as well. Ask a simple question, and get a long-winded answer.
The Mac was the first mainstream computer I know of to include all the support necessary to create good sampled sounds on a computer. It was also the first computer to include a microphone to digitize sounds (input them). The Mac changed computer sounds forever -- from the pathetic little beep of the PCs, to the rich CD-quality sound of our computers today.
Some computers before Macs had tone generators, and through some fancy manipulation you could do 1-bit sound sampling (not very clear). I remember writing such code back in the days of the early Apple IIs or the Commodore PETs (1979). I was so impressed that I could play back a few seconds of scratchy and distorted, but recognizable, audio. Why, it had almost the quality and fidelity of a telephone. A few years later, I remember the first games with a few seconds of digitized sound. Now, 20 years later, almost all music is recorded and managed digitally and with our computers. And now you have an idea of how it works.