Introduction
This is the manual for Synthizer, a library for 3D audio and synthesis with a focus on games and VR applications.
There are 3 ways to learn Synthizer:
- This manual acts as a conceptual overview, as well as an API and object reference. If you have worked with audio libraries before, you can likely read the concepts section and hit the ground running. In particular, Synthizer is much like a more limited (but faster) form of WebAudio.
- The Python bindings are now extended only through community contribution, but their repository contains a number of examples and tutorials.
- Finally, the official repository contains a number of C examples.
Where are the Python Bindings?
The Python bindings are found at this repository, which now also contains their docs and a set of examples; they are maintained separately.
The Synthizer C API
Synthizer has the following headers:
- synthizer.h: all library functions.
- synthizer_constants.h: constants, i.e. the very large property enum.
These headers are separated because it is sometimes easier to machine-parse synthizer_constants.h when writing bindings for languages which don't have good automatic generation, then handle synthizer.h manually.
The Synthizer C API returns errors and writes results to out parameters. Out parameters are always the first parameters of a function, and errors are always nonzero. Note that error codes are currently not defined; they will be, once things are more stable.
It is possible to get information on the last error using these functions:
SYZ_CAPI syz_ErrorCode syz_getLastErrorCode(void);
SYZ_CAPI const char *syz_getLastErrorMessage(void);
These functions are thread-local: the last error message is valid until the next call into Synthizer from the same thread.
Logging, Initialization, and Shutdown
The following excerpts from synthizer.h specify the logging and initialization API. Explanation follows:
enum SYZ_LOGGING_BACKEND {
SYZ_LOGGING_BACKEND_NONE,
SYZ_LOGGING_BACKEND_STDERR,
};
enum SYZ_LOG_LEVEL {
SYZ_LOG_LEVEL_ERROR = 0,
SYZ_LOG_LEVEL_WARN = 10,
SYZ_LOG_LEVEL_INFO = 20,
SYZ_LOG_LEVEL_DEBUG = 30,
};
struct syz_LibraryConfig {
unsigned int log_level;
unsigned int logging_backend;
const char *libsndfile_path;
};
SYZ_CAPI void syz_libraryConfigSetDefaults(struct syz_LibraryConfig *config);
SYZ_CAPI syz_ErrorCode syz_initialize(void);
SYZ_CAPI syz_ErrorCode syz_initializeWithConfig(const struct syz_LibraryConfig *config);
SYZ_CAPI syz_ErrorCode syz_shutdown();
Synthizer can be initialized in two ways. The simplest is syz_initialize, which will use reasonable library defaults for most apps. The second is (excluding error checking):
struct syz_LibraryConfig config;
syz_libraryConfigSetDefaults(&config);
syz_initializeWithConfig(&config);
In particular, the latter approach allows for enabling logging and loading Libsndfile.
Currently Synthizer can only log to stderr and logging is slow enough that it shouldn't be enabled in production. It mostly exists for debugging. In future these restrictions will be lifted.
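For example, a minimal sketch (error checking omitted) which enables debug logging to stderr:
struct syz_LibraryConfig config;
syz_libraryConfigSetDefaults(&config);
config.log_level = SYZ_LOG_LEVEL_DEBUG;
config.logging_backend = SYZ_LOGGING_BACKEND_STDERR;
syz_initializeWithConfig(&config);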
For more information on Libsndfile support, see the dedicated chapter.
Handles and Userdata
Synthizer objects are referred to via reference-counted handles, with an optional void * userdata pointer that can be associated with them to link them to application state:
SYZ_CAPI syz_ErrorCode syz_handleIncRef(syz_Handle handle);
SYZ_CAPI syz_ErrorCode syz_handleDecRef(syz_Handle handle);
SYZ_CAPI syz_ErrorCode syz_handleGetObjectType(int *out, syz_Handle handle);
SYZ_CAPI syz_ErrorCode syz_handleGetUserdata(void **out, syz_Handle handle);
typedef void syz_UserdataFreeCallback(void *);
SYZ_CAPI syz_ErrorCode syz_handleSetUserdata(syz_Handle handle, void *userdata, syz_UserdataFreeCallback *free_callback);
Basics of Handles
All Synthizer handles start with a reference count of 1. When the reference count reaches 0, the object is scheduled for deletion, but may not be deleted immediately. Uniquely among Synthizer functions, syz_handleIncRef and syz_handleDecRef can be called after library shutdown in order to allow languages like Rust to implement infallible cloning and freeing. The object lifetime issues introduced by the fact that Synthizer objects may stay around for a while can be dealt with via userdata support, as described below.
No interface except syz_handleDecRef decrements reference counts in a way which should be visible to the application, provided that the application is itself using reference counts correctly.
Synthizer objects are like classes: they have "methods" and "bases". For example, all generators support a common set of operations named with a syz_generatorXXX prefix.
The reserved config argument
Many constructors take a void *config argument. This must be set to NULL, and is reserved for future use.
Userdata
Synthizer makes it possible to associate application data with an object via a void * pointer which shares the object's actual lifetime rather than the lifetime of the handle to the object. This is useful for allowing applications to store state, but also helps deal with the lifetime issues introduced by the mismatch between the last reference to the object dying and the object actually dying. For example, the Rust and Python bindings use userdata to attach buffers to objects when streaming from memory, so that the actual underlying resource stays around until Synthizer is guaranteed to no longer use it.
Getting and setting userdata pointers is done in one of two ways. All Synthizer constructors take two additional parameters to set the userdata and the free callback. Alternatively, an application can go through syz_handleGetUserdata and syz_handleSetUserdata. These are a threadsafe interface which associates a void * argument with the object. This interface acts as if the operations were wrapped in a mutex internally, though they complete with no syscalls in all reasonable cases of library usage.
The free_callback parameter to syz_handleSetUserdata is optional. If present, it will be called on the userdata pointer when the object is destroyed or when a new userdata pointer is set. Due to limitations of efficient audio programming, this free happens in a background thread and may occur up to hundreds of milliseconds after the object no longer exists.
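As a sketch, attaching a heap-allocated application struct to an existing handle (the struct and handle names are hypothetical; requires stdlib.h for malloc and free):
struct my_state { int entity_id; };  /* hypothetical application state */
struct my_state *state = malloc(sizeof(*state));
state->entity_id = 42;
/* free() will be called on state from a background thread after the object dies. */
syz_handleSetUserdata(handle, state, free);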
Most bindings will bind userdata support in a friendlier way. For example, Python provides a get_userdata and set_userdata pair which work on normal Python objects.
Basics of Audio in Synthizer
This section explains how to get audio into and out of Synthizer. The following objects must be used by every application:
- Generators produce audio, for example by reading a buffer of audio data.
- Sources play audio from one or more generators.
- Contexts represent audio devices and group objects for the same device together.
The most basic flow of Synthizer is to create a context, source, and generator, then connect the generator to the source. For example, you might combine BufferGenerator and DirectSource to play a stereo audio file to the speakers, or swap DirectSource for Source3D to place the sound in the 3D environment.
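For example, a sketch of that flow in C (error checking omitted; the constructors shown follow the pattern described in this manual, with an out parameter first, a reserved config argument, and the userdata pair, but consult synthizer.h for the exact prototypes):
syz_Handle context = 0, generator = 0, buffer = 0, source = 0;
syz_initialize();
syz_createContext(&context, NULL, NULL);
syz_createBufferGenerator(&generator, context, NULL, NULL, NULL);
syz_createBufferFromFile(&buffer, "music.mp3", NULL, NULL);
syz_setO(generator, SYZ_P_BUFFER, buffer);
syz_createDirectSource(&source, context, NULL, NULL, NULL);
syz_sourceAddGenerator(source, generator);
/* ...play for a while, then release the references: */
syz_handleDecRef(buffer);
syz_handleDecRef(generator);
syz_handleDecRef(source);
syz_handleDecRef(context);
syz_shutdown();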
The Context
Contexts represent audio devices and the listener in 3D space. They:
- Figure out the output channel format necessary for the current audio device and convert audio to it.
- Offer the ability to globally set gain.
- Let users set the position of the listener in 3D space.
- Let users set defaults for other objects, primarily the distance model and panning strategies. In particular, if your application wants HRTF on by default, it is done on the context by setting SYZ_P_PANNER_STRATEGY.
For more information on 3D audio, see the dedicated section.
Almost all objects in Synthizer require a context to be created and must be used only with the context they're associated with.
A common question is whether an app should ever have more than one context. Though this is possible, contexts are very expensive objects that directly correspond to audio devices. Having 2 or 3 is the upper limit of what is reasonable, but it is by far easiest to only have one as this prevents running into issues where you mix objects from different contexts together.
Introduction to Generators
generators are how audio first enters Synthizer. They can do things like play
a buffer, generate
noise, or stream audio
data. By themselves, they're
silent and don't do anything, so they must be connected to
sources via syz_sourceAddGenerator
.
Generators are like a stereo without speakers: you have to plug them into something else before they're audible. In this case the "something else" is a source. Synthizer only supports using a generator with one source at a time, but every source can have multiple generators. That is, given generators g1 and g2 and sources s1 and s2, g1 and g2 could both be connected to s1, or g1 to s1 and g2 to s2, but g1 cannot be connected to both s1 and s2 at the same time.
Introduction to Sources
Sources are how generators are made audible. Synthizer offers 3 main kinds of source:
- The DirectSource plays audio directly, and can be used for things like background music. This is the only source type which won't convert audio to mono before using it.
- The ScalarPannedSource and AngularPannedSource allow for manual control over pan, either by azimuth/elevation or via a scalar from -1 to 1 where -1 is all left and 1 is all right.
- The Source3D allows for positioning audio in 3D space.
Every source offers the following functions:
SYZ_CAPI syz_ErrorCode syz_sourceAddGenerator(syz_Handle source, syz_Handle generator);
SYZ_CAPI syz_ErrorCode syz_sourceRemoveGenerator(syz_Handle source, syz_Handle generator);
Every source will mix audio from as many generators as are connected to it and then feed the audio through to the output of the source and to effects. See the section on channel mixing for how audio is converted to various different output formats, and effects and filters for information on how to do more with this API than simply playing audio.
Controlling Object Properties
Basics
Most interesting audio control happens through properties, which are like knobs on hardware controllers or dials in your DAW. Synthizer picks up property values on the next audio tick and automatically handles crossfading and graceful changes as your app drives the values. Every property is identified by a SYZ_P constant in synthizer_constants.h. In bindings, SYZ_P_MY_PROPERTY will generally become my_property or MyProperty etc. depending on the dominant style of the language, and then become either an actual settable property or a get_property and set_property pair, depending on whether the language in question supports customized properties that aren't just member variables (e.g. @property in Python, properties in C#).
All properties are of one of the following types:
- int or double, identified by an i or d suffix in the property API; the standard C primitive types.
- double3 and double6, identified by d3 and d6 suffixes: vectors of 3 doubles and 6 doubles respectively. Primarily used to set position and orientation.
- object, identified by an o suffix: used to set object properties, such as the buffer to use for a buffer generator.
- biquad: configuration for a biquad filter. Used on effects and sources to allow filtering audio.
No property constant represents a property of two types. For example, SYZ_P_POSITION is on both Context and Source3D but is a d3 in both cases. Generators use SYZ_P_PLAYBACK_POSITION, which is always a double. Synthizer will always maintain this constraint.
The Property API is as follows:
SYZ_CAPI syz_ErrorCode syz_getI(int *out, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setI(syz_Handle target, int property, int value);
SYZ_CAPI syz_ErrorCode syz_getD(double *out, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setD(syz_Handle target, int property, double value);
SYZ_CAPI syz_ErrorCode syz_setO(syz_Handle target, int property, syz_Handle value);
SYZ_CAPI syz_ErrorCode syz_getD3(double *x, double *y, double *z, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setD3(syz_Handle target, int property, double x, double y, double z);
SYZ_CAPI syz_ErrorCode syz_getD6(double *x1, double *y1, double *z1, double *x2, double *y2, double *z2, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setD6(syz_Handle handle, int property, double x1, double y1, double z1, double x2, double y2, double z2);
SYZ_CAPI syz_ErrorCode syz_getBiquad(struct syz_BiquadConfig *filter, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setBiquad(syz_Handle target, int property, const struct syz_BiquadConfig *filter);
Property accesses happen without syscalls and are usually atomic operations and enqueues on a lockfree queue.
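For example, a sketch of setting and reading properties on an existing source handle:
/* Set the source's gain: a double property. */
syz_setD(source, SYZ_P_GAIN, 0.5);
/* Set a 3D position atomically: a double3 property. */
syz_setD3(source, SYZ_P_POSITION, 10.0, 0.0, 2.0);
/* Reads may lag behind writes; see the note on reading below. */
double gain;
syz_getD(&gain, source, SYZ_P_GAIN);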
Object Properties Are Weak
Object properties do not increment the reference count of the handle associated with them. There isn't much to say here, but it is important enough that it's worth calling out with a section. For example, if you set the buffer on a buffer generator and then decrement the buffer's reference count to 0, the generator will stop playing audio rather than keeping the buffer alive.
A Note on Reading
Property reads need to be further explained. Because audio programming requires not blocking the audio thread, Synthizer internally uses queues for property writes. This means that any read may be outdated by some amount, even if the thread making the read just set the value. Typically, reads should be reserved for properties that Synthizer also sets (e.g. SYZ_P_PLAYBACK_POSITION) or used for debugging purposes.
syz_getO is not offered by this API because it would require a mutex, which the audio thread also can't afford. Additionally, object lifetime concerns make it more difficult for such an interface to do something sane.
Though the above limitations prevent this anyway, it is in general an antipattern to store application state in your audio library. Even if reads were always up to date, it would still be slow to get data back out. Applications should keep things like object position around and update Synthizer, rather than asking Synthizer what the last value was.
Setting Gain/Volume
All objects which play audio (generators, sources, contexts) offer a SYZ_P_GAIN property, a double scalar between 0.0 and infinity which controls object volume. For example, 2.0 is twice the amplitude and 0.5 is half the amplitude. This works as you'd expect: if set on a generator it affects only that generator, if on a source it affects everything connected to the source, and so on. If a generator is set to 0.5 and the source it's on is also 0.5, the output volume of the generator is 0.25, because both gains apply in order.
This means that it is possible to control the volume of generators relative to each other when all connected to the same source, then control the overall volume of the source.
A Note on Human Perception
Humans don't perceive amplitude changes as you'd expect. For example, moving from 1.0 to 2.0 will generally sound like a large jump in volume, but from 2.0 to 3.0 much less so, and so on. Most audio applications that expose volume sliders to humans expose them as decibels and convert to an amplitude factor internally. If you're just writing a game, you can mostly ignore this, but if you're doing something more complicated a proper understanding of decibels is important. In decibels, a gain of 1.0 is 0 dB, and every increase or decrease of 1 dB sounds like roughly the same change in loudness as any other. The specific formulas to convert to and from a gain are as follows:
decibels = 20 * log10(gain)
gain = 10**(db/20)
Where ** is exponentiation.
The obvious question is of course "why not expose this as decibels?" The problem with decibels is that gains over 1.0 will clip in most applications, but a gain of 1.0 is 0 dB. If two incredibly loud sounds both with a gain of 1.0 play at the same time, the overall gain is effectively 2.0, which can clip in the same way; but 0 dB + 0 dB is still 0 dB, even though the correct gain is 2.0. This gets worse for gains below 1.0. Consider 0.5, which is roughly -6 dB. 0.5 + 0.5 is 1, but -6 + -6 is -12 dB, which isn't only wrong, it even moved in the wrong direction altogether.
As a consequence Synthizer always uses multiplicative factors on the amplitude, not decibels. Unless you know what you're doing, you should convert to gain as soon as possible and reason about how this works as a multiplier.
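If your UI does work in decibels, a minimal sketch of the conversions in C:
#include <math.h>
double gain_to_db(double gain) { return 20.0 * log10(gain); }
double db_to_gain(double db) { return pow(10.0, db / 20.0); }
/* Example: db_to_gain(-6.0) is roughly 0.5. */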
Pausing and Resuming Playback
All objects which play audio offer the following two functions:
SYZ_CAPI syz_ErrorCode syz_pause(syz_Handle object);
SYZ_CAPI syz_ErrorCode syz_play(syz_Handle object);
Which do exactly what they seem like they do.
In bindings, these are usually bound as instance methods, e.g. myobj.pause().
Configuring Objects to Continue Playing Until Silent
By default, Synthizer objects become silent when their reference counts go to 0, but this isn't always what you want. Sometimes, it is desirable to be able to continue playing audio until the object is "finished", for example for gunshots or other one-off effects. Synthizer calls this lingering, and offers the following API to configure it:
struct syz_DeleteBehaviorConfig {
int linger;
double linger_timeout;
};
SYZ_CAPI void syz_initDeleteBehaviorConfig(struct syz_DeleteBehaviorConfig *cfg);
SYZ_CAPI syz_ErrorCode syz_configDeleteBehavior(syz_Handle object, struct syz_DeleteBehaviorConfig *cfg);
To use it, call syz_initDeleteBehaviorConfig on an empty syz_DeleteBehaviorConfig struct, fill out the struct, and call syz_configDeleteBehavior. The fields have the following meanings:
- linger: if 0, die immediately, which is the default. If 1, keep the object around until it "finishes". What this means depends on the object and is documented in the object reference, but it generally "does what you'd expect". For some examples: BufferGenerator will stop any looping and play until the end of the buffer, or die immediately if paused; all sources will keep going until all their generators are no longer around.
- linger_timeout: if nonzero, sets an upper bound on the amount of time an object may linger for. This is useful as a sanity check in your application.
These functions only configure what happens when the last reference to an object goes away; they do not destroy the object or manipulate the reference count in any other way. It is valid to call them immediately after object creation if desired. No Synthizer interface besides syz_handleDecRef will destroy an object unless otherwise explicitly documented.
Lingering doesn't keep related objects alive. For example, a BufferGenerator that is lingering still goes silent if the buffer attached to it is destroyed.
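As a sketch, configuring an existing generator to linger for at most 10 seconds (the timeout is assumed to be in seconds):
struct syz_DeleteBehaviorConfig cfg;
syz_initDeleteBehaviorConfig(&cfg);
cfg.linger = 1;
cfg.linger_timeout = 10.0;
syz_configDeleteBehavior(generator, &cfg);
/* The generator now finishes playing after its last reference drops: */
syz_handleDecRef(generator);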
As with pausing, bindings usually make this an instance method.
Decoding Audio Data
The Quick Overview
Synthizer supports mp3, wav, and flac. If you need more formats, then you can load Libsndfile or decode the data yourself.
If you need to read from a file, use e.g. syz_createBufferFromFile. If you need to read from memory, use e.g. syz_createBufferFromEncodedData. If you need to just shove floats at Synthizer, use syz_createBufferFromFloatArray. StreamingGenerator has a similar set of methods. In general you can find out what methods are available in the object reference. Everything supports some function that's equivalent to syz_createBufferFromFile.
These functions are the most stable interface because they can be easily supported across incompatible library versions. If your app can use them, it should do so.
Streams
Almost all of these methods wrap and hide something called a stream handle, which can be created with e.g. syz_createStreamHandleFromFile, then used with e.g. syz_createBufferFromStreamHandle. Bindings expose this to you, usually with classes or your language's equivalent (e.g. in Python this is StreamHandle). This is used to get data from custom sources, for example the network or encrypted asset stores. For info on writing your own streams, see the dedicated section.
In addition to getting streams via specific methods, Synthizer also exposes a generic interface:
SYZ_CAPI syz_ErrorCode syz_createStreamHandleFromStreamParams(syz_Handle *out, const char *protocol, const char *path, void *param);
Using the generic interface, streams are referred to with:
- A protocol, for example "file", which specifies the kind of stream it is. Users can register their own protocols.
- A path, for example to a file on disk. This is protocol-specific.
- And a void * param, which is passed through to the underlying stream implementation and currently ignored by Synthizer.
So, for example, you might get a file by:
syz_createStreamHandleFromStreamParams(&stream, "file", path, NULL);
Streams don't support raw data; they're always an encoded asset. So for example mp3 streams are a thing, but floats-at-44100 streams aren't. Synthizer will offer a better interface for raw audio data pending there being enough demand and a reason to go beyond syz_createBufferFromFloatArray.
Loading Libsndfile
Synthizer supports 3 built-in audio formats: wav, mp3, and flac. For apps which need more, Synthizer supports loading Libsndfile. To do so, use syz_initializeWithConfig and set libsndfile_path to the absolute path of a Libsndfile shared object (.dll, .so, etc). Libsndfile will then automatically be used where possible, replacing the built-in decoders.
Unfortunately, due to Libsndfile limitations, Libsndfile can only be used on seekable streams of known length. All Synthizer-provided methods of decoding currently support this, but custom streams may opt not to do so, for example if they're reading from the network. In this case, Libsndfile will be skipped. To see if this is happening, enable debug logging at library initialization and Synthizer will log what decoders it's trying to use.
Because of licensing incompatibilities, Libsndfile cannot be statically linked with Synthizer without effectively changing Synthizer's license to LGPL. Consequently dynamic linking with explicit configuration is the only way to use it. Your app will need to arrange to distribute a Libsndfile binary as well and use the procedure described above to load it.
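A sketch of loading Libsndfile at initialization; the path is hypothetical and platform-specific:
struct syz_LibraryConfig config;
syz_libraryConfigSetDefaults(&config);
config.libsndfile_path = "/usr/lib/libsndfile.so";
syz_initializeWithConfig(&config);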
Implementing Custom Streams and Custom Stream Protocols
Synthizer supports implementing custom streams in order to read from places that aren't files or memory: encrypted asset stores, the network, and so on. This section explains how to implement them.
Before continuing, carefully consider whether you need this. Implementing a stream in a higher-level language and forcing Synthizer to go through it has a small but likely noticeable performance hit. It'll work fine, but the built-in functionality will certainly be faster and more scalable. Implementing a stream in C is a complex process. If your app can use the already-existing functionality, it is encouraged to do so.
A Complete Python Example
The rest of this section will explain in detail how streams work from the C API, but this is a very complex topic and most of the infrastructure which exists for it exists to make it possible to write convenient bindings. Consequently, here is a complete and simple custom stream which wraps a Python file object, registered as a custom protocol:
import sys
import synthizer

class CustomStream:
    def __init__(self, path):
        self.file = open(path, "rb")

    def read(self, size):
        return self.file.read(size)

    def seek(self, position):
        self.file.seek(position)

    def close(self):
        self.file.close()

    def get_length(self):
        # Remember the position, measure the file, then restore the position.
        pos = self.file.tell()
        length = self.file.seek(0, 2)
        self.file.seek(pos)
        return length

def factory(protocol, path, param):
    return CustomStream(path)

synthizer.register_stream_protocol("custom", factory)
# ctx is an existing synthizer.Context.
gen = synthizer.StreamingGenerator.from_stream_params(ctx, "custom", sys.argv[1])
Your bindings will document how to do this; for example in Python see help(synthizer.register_stream_protocol). It's usually going to be this level of complexity when doing it from a binding. The rest of this section explains what's going on from the C perspective, but non-C users are still encouraged to read it because it explains the general idea and offers best practices for efficient and stable stream usage.
It's important to note that though this example demonstrates StreamingGenerator, buffers have similar methods to decode themselves from streams. Since StreamingGenerator has a large latency for anything but the initial start-up, the primary use case is actually likely to be buffers.
The C Interface
To define a custom stream, the following types are used:
typedef int syz_StreamReadCallback(unsigned long long *read, unsigned long long requested, char *destination, void *userdata, const char ** err_msg);
typedef int syz_StreamSeekCallback(unsigned long long pos, void *userdata, const char **err_msg);
typedef int syz_StreamCloseCallback(void *userdata, const char **err_msg);
typedef void syz_StreamDestroyCallback(void *userdata);
struct syz_CustomStreamDef {
syz_StreamReadCallback *read_cb;
syz_StreamSeekCallback *seek_cb;
syz_StreamCloseCallback *close_cb;
syz_StreamDestroyCallback *destroy_cb;
long long length;
void *userdata;
};
SYZ_CAPI syz_ErrorCode syz_createStreamHandleFromCustomStream(syz_Handle *out, struct syz_CustomStreamDef *callbacks);
typedef int syz_StreamOpenCallback(struct syz_CustomStreamDef *callbacks, const char *protocol, const char *path, void *param, void *userdata, const char **err_msg);
SYZ_CAPI syz_ErrorCode syz_registerStreamProtocol(const char *protocol, syz_StreamOpenCallback *callback, void *userdata);
The following sections explain how these functions work.
Ways To Get A Custom Stream
There are two ways to get a custom stream. You can:
- Fill out the callbacks in syz_CustomStreamDef and use syz_createStreamHandleFromCustomStream.
- Write a function which will fill out syz_CustomStreamDef from the standard stream parameters, and register a protocol with syz_registerStreamProtocol.
The difference between these is scope: if you don't register a protocol, only your app can access the custom stream, presumably via a module that produces them. This is good because it keeps things modular. If registering a protocol, however, the protocol can be used from anywhere in the process, including other libraries and modules. For example, a C library implementing an "encrypted_sqlite3" protocol could be used to add that protocol to any language.
Protocol names must be unique; the behavior is undefined if they aren't. A good way of ensuring this is to namespace them, for example "ahicks92.my_super_special_protocol".
The void *param parameter is reserved for your implementation, and is passed to the factory callback if using the stream parameters approach. It's assumed that implementations going through syz_createStreamHandleFromCustomStream already have a way to move this information around.
Non-callback syz_CustomStreamDef Parameters
These are:
- length, which must be set and known for seekable streams. If the length of the stream is unknown, set it to -1.
- userdata, which is passed as the userdata parameter to all stream callbacks.
The Stream Callbacks
Streams have the following callbacks, with mostly self-explanatory parameters:
- If going through the protocol interface, the open callback is called when the stream is first opened. If going through syz_createStreamHandleFromCustomStream, it is assumed that the app already opened the stream and has put whatever it is going to need into the userdata field.
- After that, the read and (if present) seek callbacks are called until the stream is no longer needed. The seek callback is optional.
- The close callback is called when Synthizer will no longer use the underlying asset.
- The destroy callback, which is optional, is called when it is safe to free all resources the stream is using.
For more information on why we offer both the close and destroy callbacks, see below on error handling.
All callbacks should return 0 on success, and (if necessary) write to their out parameters.
The read callback must always read exactly as many bytes as requested, never more. If it reads fewer bytes than requested, Synthizer treats this as an end-of-stream condition. If the end of the stream has already been reached, the read callback should claim that it read no bytes.
The seek callback is optional. Streams don't need to support seeking, but this disables seeking in StreamingGenerator. It also disables support for Libsndfile if Libsndfile was loaded, and additionally support for decoding wav files. In order to be seekable, a stream must:
- Have a seek callback; and
- Fill out the length field with a positive value, the length of the stream in bytes.
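As an illustration only, not Synthizer's own code, here is a sketch of a seekable file-backed custom stream using the callbacks above; error messages are string constants, so no destroy callback is needed:
#include <stdio.h>
#include <synthizer.h>

static int file_read_cb(unsigned long long *read, unsigned long long requested,
    char *destination, void *userdata, const char **err_msg) {
  FILE *f = userdata;
  size_t got = fread(destination, 1, (size_t)requested, f);
  if (got < requested && ferror(f) != 0) {
    *err_msg = "read failed";
    return 1;
  }
  /* Reading fewer bytes than requested signals end of stream. */
  *read = got;
  return 0;
}

static int file_seek_cb(unsigned long long pos, void *userdata, const char **err_msg) {
  if (fseek(userdata, (long)pos, SEEK_SET) != 0) {
    *err_msg = "seek failed";
    return 1;
  }
  return 0;
}

static int file_close_cb(void *userdata, const char **err_msg) {
  (void)err_msg;
  fclose(userdata);
  return 0;
}

int make_file_stream(syz_Handle *out, const char *path) {
  FILE *f = fopen(path, "rb");
  if (f == NULL) return 1;
  /* Measure the file so the stream can be seekable. */
  fseek(f, 0, SEEK_END);
  long long length = ftell(f);
  fseek(f, 0, SEEK_SET);
  struct syz_CustomStreamDef def = {0};
  def.read_cb = file_read_cb;
  def.seek_cb = file_seek_cb;
  def.close_cb = file_close_cb;
  def.length = length;
  def.userdata = f;
  return syz_createStreamHandleFromCustomStream(out, &def);
}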
Error Handling
To indicate an error, callbacks should return a non-zero return value and (optionally) set their err_msg parameter to a string representation of the error. Synthizer will log these errors if logging is enabled. For more complex error handling, apps are encouraged to ferry the information from streams to their main threads themselves. If a stream callback fails, Synthizer will generally stop the stream altogether. Consequently, apps should do their best to recover and never fail the stream. Synthizer takes the approach of assuming that any error is likely unrecoverable, and expects that implementations already did their best to succeed.
If the read callback fails, the position of the stream isn't updated. If the seek callback fails, Synthizer assumes that the position didn't move.
The reason that Synthizer offers a destroy callback in addition to one for closing is so that streams may use non-static strings as error messages. Synthizer may not be done logging these when the stream is closed, so apps doing this should make sure the strings live at least until the destroy callback, after which Synthizer promises to never use anything related to the stream again.
The simplest way to handle error messages for C users is to just use string constants, but for other languages such as Python it is useful to be able to convert errors to strings and attach them to the binding's object so that these can be logged. The destroy callback primarily exists for this use case.
Synthizer makes one more guarantee on the lifetime required of err_msg strings: they need only live until the next time a stream callback is called. This means that, for example, the Python binding only keeps the most recent error string around and replaces it as necessary.
Thread Safety
Streams will only ever be used by one thread at a time, but may be moved between threads.
Channel Upmixing and Downmixing
Synthizer has built-in understanding of mono (1 channel) and stereo (2 channels) audio formats. It will mix other formats to these as necessary. Specifically, we:
- If converting from mono to any other format, broadcast the mono channel to all of those in the other format.
- If going to mono, sum and normalize the channels in the other format.
- Otherwise, either drop extra channels or fill extra channels with silence.
Synthizer will be extended to support surround sound in future, which will give it a proper understanding of 4, 6, and 8 channels. Since Synthizer is aimed at non-experimental home media applications, we assume that the channel count is sufficient to know what the format is going to be. For example, there is no real alternative to 5.1 audio in the home environment if the audio has 6 channels. If you need more complex multichannel handling, you can pre-convert your audio to something Synthizer understands. Otherwise, other libraries may be a better option.
3D Audio, Panning, and HRTF
Introduction
Synthizer supports panning audio through two interfaces.
First is AngularPannedSource and ScalarPannedSource, which provide simple azimuth/elevation controls and the ability to pan based off a scalar, a value between -1 (all left) and 1 (all right). In this case the user application must compute these values itself.
The second way is to use Source3D, which simulates a 3D environment when fed positional data. This section concerns itself with proper use of Source3D, which is less straightforward for those who haven't had prior exposure to these concepts.
There are two mandatory steps to using Source3D, as well as a few optional ones. The two mandatory steps are these:
- On the context, update SYZ_P_POSITION and SYZ_P_ORIENTATION with the listener's position and orientation.
- On the source, update SYZ_P_POSITION with the source's position.
And optionally:
- Configure the default distance model to control how rapidly sources become quiet.
- Emphasize that sources have become close to the listener with the closeness boost.
- Add effects (covered in a dedicated section).
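For example, a sketch of the mandatory per-frame updates, assuming the application tracks positions in variables like listener_x:
/* Run once per frame, or whenever things move. */
syz_setD3(context, SYZ_P_POSITION, listener_x, listener_y, listener_z);
/* Listener facing positive y with up along positive z, e.g. a top-down 2D game. */
syz_setD6(context, SYZ_P_ORIENTATION, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0);
syz_setD3(source, SYZ_P_POSITION, source_x, source_y, source_z);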
Don't Move Sources Through the Head
People frequently pick up Synthizer, try to move the source through the center of the listener's head, then ask why it's weird and didn't work. It is important to realize that this is a physical simulation of reality, and that the reason you can move the source through the listener's head in the first place is that this isn't an easily detectable case. If you aren't driving Synthizer in a way connected to physical reality--for example if you are attempting to use x as a way to pan sources from left to right, not linked to a position--then you probably want one of the raw panned sources instead.
Setting Global Defaults
In addition to controlling the listener, the context offers the ability to set defaults for all values discussed below. This is done through a set of SYZ_P_DEFAULT_* properties which match the names of those on sources. This is of particular interest to those wishing to use HRTF, which is off by default.
Synthizer's Coordinate System and Orientations
The short version for those who are familiar with libraries that need position and orientation data: Synthizer uses a right-handed coordinate system and configures orientation through double6 properties, so that it is possible to atomically set all 6 values at once. This is represented as a packed at and up vector pair in the format (at_x, at_y, at_z, up_x, up_y, up_z) on the context under the SYZ_P_ORIENTATION property. As with every other library doing a similar thing, these are unit vectors.
The short version for those who want a 2D coordinate system where positive y is north, positive x is east, and player orientations are represented as degrees clockwise of north: use only x and y of SYZ_P_POSITION, and set SYZ_P_ORIENTATION as follows:
(sin(angle * PI / 180), cos(angle * PI / 180), 0, 0, 0, 1)
The longer version is as follows.
Listener positions are represented through 3 values: the position, the at vector, and the up vector. The position is self-explanatory. The at vector points in the direction the listener is looking at all times, and the up vector points out the top of the listener's head, as if there were a pole up the spine. By driving these 3 values, it is possible to represent any position and orientation a listener might assume.
Synthizer uses a right-handed coordinate system, which means that if the at vector is pointed at positive x and the up vector at positive y, positive z moves sources to the right. This is called a right-handed coordinate system because of the right-hand rule: if you point your fingers along positive x and curl them toward positive y, your thumb points at positive z. This isn't a standard; every library that does something with object positions tends to choose a different convention. If combining Synthizer with other non-2D components, it may be necessary to convert between coordinate systems. Resources on how to do this may easily be found through Google.
The at and up vectors must always be orthogonal, that is forming a right angle with each other. In order to facilitate this, Synthizer uses double6 properties so that both values can and must be set at the same time. If we didn't, then there would be a brief period where one was set and the other wasn't, in which case they would temporarily be invalid. Synthizer doesn't try to validate that these vectors are orthogonal and generally tries to do its best when they aren't, but nonetheless behavior in this case is undefined.
Finally, the at and up vectors must be unit vectors: vectors of length 1.
Panning Strategies and HRTF
The panning strategy specifies how sources are to be panned. Synthizer supports the following panning strategies:
| Strategy | Channels | Description |
| --- | --- | --- |
| SYZ_PANNER_STRATEGY_HRTF | 2 | An HRTF implementation, intended for use via headphones. |
| SYZ_PANNER_STRATEGY_STEREO | 2 | A simple stereo panning strategy assuming speakers are at -90 and 90 degrees. |
When a source is created, the panning strategy it is to use is passed via the constructor function and cannot be changed. A special value, SYZ_PANNER_STRATEGY_DELEGATE, allows the source to delegate this to the context, and can be used in cases where the context's configuration should be preferred. The vast majority of applications will do this configuration via the context and SYZ_PANNER_STRATEGY_DELEGATE; other values should be reserved for cases in which you wish to mix panning types.
By default Synthizer is configured to use a stereo panning strategy, which simply pans between two speakers. This is because stereo panning strategies work on all devices from headphones to 5.1 surround sound systems, and it is not possible for Synthizer to reliably determine if the user is using headphones or not. HRTF provides a much better experience for headphone users but must be enabled by your application through setting the default panner strategy or doing so on individual sources.
Since panning strategies are per source, it is possible to have sources using different panning strategies. This is useful for two reasons: HRTF is expensive enough that you may wish to disable it if dealing with hundreds or thousands of sources, and it is sometimes useful to let UI elements use a different panning strategy. An example of this latter case is an audio gauge which pans from left to right.
Distance Models
The distance model controls how quickly sources become quiet as they move away from the listener. This is controlled through the following properties:
- SYZ_P_DISTANCE_MODEL: which of the distance model formulas to use.
- SYZ_P_DISTANCE_MAX: the maximum distance at which the source will be audible.
- SYZ_P_DISTANCE_REF: if you assume your source is a sphere, what's the radius of it?
- SYZ_P_ROLLOFF: with some formulas, how rapidly does the sound get quieter? Generally, configuring this to a higher value makes the sound drop off more immediately near the head, then have more subtle changes at further distances.
It is not possible to provide generally applicable advice for what you should set the distance model to. A game using meters needs very different settings than one using feet or light years. Furthermore, these don't have concrete physical correspondences. Of the things Synthizer offers, this is possibly the least physically motivated and the most artistic from a game design perspective. In other words: play with different values and see what you like.
The concrete formulas for the distance models are as follows. Let d be the distance to the source, d_ref the reference distance, d_max the max distance, and r the roll-off factor. Then the gain of the source is computed as a linear scalar using one of the following formulas:

| Model | Formula |
| --- | --- |
| SYZ_DISTANCE_MODEL_NONE | 1.0 |
| SYZ_DISTANCE_MODEL_LINEAR | 1 - r * (clamp(d, d_ref, d_max) - d_ref) / (d_max - d_ref) |
| SYZ_DISTANCE_MODEL_EXPONENTIAL when d_ref == 0.0 | 0.0 |
| SYZ_DISTANCE_MODEL_EXPONENTIAL when d_ref > 0.0 | (max(d_ref, d) / d_ref) ** -r |
| SYZ_DISTANCE_MODEL_INVERSE when d_ref == 0.0 | 0.0 |
| SYZ_DISTANCE_MODEL_INVERSE when d_ref > 0.0 | d_ref / (d_ref + r * (max(d, d_ref) - d_ref)) |
The Closeness Boost
Sometimes, it is desirable to make sources "pop out" of the background environment. For example, if the player approaches an object with which they can interact, making it noticeably louder as the boundary is crossed can be useful. This is of primary interest to designers of audiogames, a genre of games for the blind, as it can be used to emphasize features of the environment in non-realistic but informative ways.
This is controlled through two properties:
- SYZ_P_CLOSENESS_BOOST: a value in dB controlling how much louder to make the sound. Negative values are allowed.
- SYZ_P_CLOSENESS_BOOST_DISTANCE: when the source is closer than this distance, begin applying the closeness boost.
When the source is closer than the configured distance, the normal gain computation still applies, but an additional factor, the number of dB in the closeness boost, is added. This means that it is still possible for players to tell whether they are getting closer to the source.
The reason that the closeness boost is specified in dB is that otherwise it would require values greater than 1.0, and it is primarily going to be fed from artists and map developers. If we discover that this is a problem in the future, it will be patched in a major Synthizer version bump.
Note that closeness boost has not gotten a lot of use yet. Though we are unlikely to remove the interface, the internal algorithms backing it might change.
Filters and Effects
Synthizer supports filters and effects in order to add environmental audio and do more than just playing sources in a vacuum. These sections explain how this works.
Filters
Synthizer supports a filter property type, as well as filters on effect sends. The API for this is as follows:
struct syz_BiquadConfig {
...
};
SYZ_CAPI syz_ErrorCode syz_getBiquad(struct syz_BiquadConfig *filter, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setBiquad(syz_Handle target, int property, const struct syz_BiquadConfig *filter);
SYZ_CAPI syz_ErrorCode syz_biquadDesignIdentity(struct syz_BiquadConfig *filter);
SYZ_CAPI syz_ErrorCode syz_biquadDesignLowpass(struct syz_BiquadConfig *filter, double frequency, double q);
SYZ_CAPI syz_ErrorCode syz_biquadDesignHighpass(struct syz_BiquadConfig *filter, double frequency, double q);
SYZ_CAPI syz_ErrorCode syz_biquadDesignBandpass(struct syz_BiquadConfig *filter, double frequency, double bandwidth);
See properties for how to set filter properties and effects for how to apply filters to effect sends.
The struct syz_BiquadConfig is an opaque struct whose fields are only exposed to allow allocating them on the stack. It represents configuration for a biquad filter, designed using the Audio EQ Cookbook. It's initialized with one of the above design functions.
A suggested default for q is 0.7071135624381276, which gives Butterworth lowpass and highpass filters. For those not already familiar with biquad filters, q controls resonance: higher values of q will cause the filter to ring for some period of time.
All sources support filters, which may be installed in 3 places:
- SYZ_P_FILTER: applies to all audio traveling through the source.
- SYZ_P_FILTER_DIRECT: applied after SYZ_P_FILTER to audio going directly to the speakers/through panners.
- SYZ_P_FILTER_EFFECTS: applied after SYZ_P_FILTER to audio going to effects.
This allows filtering the audio to effects separately, for example to cut high frequencies out of reverb on a source-by-source basis.
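For example, a sketch which cuts highs out of an existing source's effect sends only:
struct syz_BiquadConfig filter;
syz_biquadDesignLowpass(&filter, 2000.0, 0.7071135624381276);
syz_setBiquad(source, SYZ_P_FILTER_EFFECTS, &filter);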
Additionally, all effects support a SYZ_P_FILTER_INPUT, which applies to all input audio to the effect. So, you can either have:
source filter -> direct path filter -> speakers
Or:
source filter -> effects filter outgoing from source -> filter on effect send -> input filter to effect -> effect
In future, Synthizer will stabilize the syz_BiquadConfig struct and use it to expose more options, e.g. automated filter modulation.
Effects and Effect Routing
Users of the Synthizer API can route any number of sources to any number of global effects, for example echo. This is done through the following C API:
struct syz_RouteConfig {
double gain;
double fade_time;
struct syz_BiquadConfig filter;
};
SYZ_CAPI syz_ErrorCode syz_initRouteConfig(struct syz_RouteConfig *cfg);
SYZ_CAPI syz_ErrorCode syz_routingConfigRoute(syz_Handle context, syz_Handle output, syz_Handle input, struct syz_RouteConfig *config);
SYZ_CAPI syz_ErrorCode syz_routingRemoveRoute(syz_Handle context, syz_Handle output, syz_Handle input, double fade_out);
SYZ_CAPI syz_ErrorCode syz_routingRemoveAllRoutes(syz_Handle context, syz_Handle output, double fade_out);
Routes are uniquely identified by the output object (Source3D, etc) and input object (Echo, etc). There is no route handle type, nor is it possible to form duplicate routes.
In order to establish or update the parameters of a route, use syz_routingConfigRoute. This will form a route if there wasn't already one, and update the parameters as necessary.
It is necessary to initialize syz_RouteConfig with syz_initRouteConfig before using it, but this need only be done once. After that, reusing the same syz_RouteConfig for a route without reinitializing it is encouraged.
Gains are per route and apply after the gain of the source. For example, you might feed 70% of a source's output to something (gain = 0.7).
Filters are also per route and apply after any filters on sources. For example, this can be used to change the filter on a per-reverb basis for a reverb zone algorithm that feeds sources to more than one reverb at a time.
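As a sketch, routing 70% of a source's output to a hypothetical reverb handle (fade_time is assumed to be in seconds):
struct syz_RouteConfig route_cfg;
syz_initRouteConfig(&route_cfg);
route_cfg.gain = 0.7;
route_cfg.fade_time = 0.03;
syz_routingConfigRoute(context, source, reverb, &route_cfg);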
In order to remove a route, use syz_routingRemoveRoute. Alternatively, syz_routingRemoveAllRoutes can remove all routes from a source.
Many effects involve feedback and/or other long-running audio as part of their intended function. But while in development, it is often useful to reset an effect. Synthizer exposes a function for this purpose:
SYZ_CAPI syz_ErrorCode syz_effectReset(syz_Handle effect);
This will work on any effect (at worst, it does nothing). As with things like property access this is slow, and it's also not going to sound good, but it can do things like clear out the feedback paths of a reverb at the Python shell for interactive experimentation purposes.
Events
Synthizer supports receiving events. Currently, this is limited to knowing when buffer/streaming generators have looped and/or finished. Note that the use case of destroying objects only after they have stopped playing is better handled with lingering.
The API for this is as follows:
struct syz_Event {
int type;
syz_Handle source;
syz_Handle context;
};
SYZ_CAPI syz_ErrorCode syz_contextEnableEvents(syz_Handle context);
SYZ_CAPI syz_ErrorCode syz_contextGetNextEvent(struct syz_Event *out, syz_Handle context, unsigned long long flags);
SYZ_CAPI void syz_eventDeinit(struct syz_Event *event);
To begin receiving events, an application should call syz_contextEnableEvents. This cannot be undone. After a call to syz_contextEnableEvents, events will begin to fill the event queue and must be retrieved with syz_contextGetNextEvent. Failure to call syz_contextGetNextEvent will slowly fill the event queue, so applications should be sure to incorporate this into their main UI/game update loops. After the application is done with an event struct, it should call syz_eventDeinit on the event structure; failure to do so leaks handles.
The flags argument of syz_contextGetNextEvent is reserved and must be 0.
Events have a type, context, and source. The type is the kind of event. The context is the context from which the event was extracted. The source is the source of the event; sources here are not sources as in the Synthizer object, and are actually most commonly generators.
Event type constants are declared in synthizer_constants.h with all other constants. Currently Synthizer only offers SYZ_EVENT_TYPE_FINISHED and SYZ_EVENT_TYPE_LOOPED, which do exactly what they sound like: finished fires when a generator which isn't configured to loop finishes, and looped fires every time a looping generator resets.
A special event type constant, SYZ_EVENT_TYPE_INVALID, is returned by syz_contextGetNextEvent when there are no events in the queue. To write a proper event loop (excluding error handling):
struct syz_Event evt;

while (1) {
  syz_contextGetNextEvent(&evt, context, 0);
  if (evt.type == SYZ_EVENT_TYPE_INVALID) {
    break;
  }
  // Handle the event, then release the handles it borrows:
  syz_eventDeinit(&evt);
}
Synthizer will never return an event if any handle to which the event refers is invalid at the time the event was extracted from the queue. This allows applications to delete handles without having to concern themselves with whether or not an event refers to a deleted handle.
In order to also offer thread safety, Synthizer event handling will temporarily increment the reference counts of any handles to which an event refers, then decrement them when syz_eventDeinit is called. This allows applications the ability to delete objects on threads other than the thread handling the event, at the cost of extending the lifetimes of these handles slightly. It is possible for an application to call syz_handleIncRef if the application wishes to keep one of these handles around.
The Automation API
NOTE: While the intent is that the following is stable, this is provisional and subject to change until things have had a chance to settle.
Introduction
Synthizer implements a WebAudio-style automation API which allows for the automation of double, d3, and d6 properties. This functions through a command queue of automation events, each with an associated time.
In order for applications to know when automation is getting low, or for other synchronization purposes, it is also possible to enqueue a command which sends an event to your application.
The C API for this functionality is very large. Readers should refer to synthizer.h for function and struct prototypes.
A Note on Accuracy
This is the first cut of the automation API. Consequently there's a big limitation: it's not possible to perfectly synchronize automation with the beginning of a generator/source's playback. That is, as things stand, attempts to build instruments and keyboards are bound to be disappointing. Future versions of Synthizer will likely improve this case, but the primary use for the automation API is fadeouts and crossfades, both of which can be done as things stand today. Rather than fix this now, this API is being made available to gain experience in order to determine how we want to address this deficiency in the future.
Improvement for accuracy and the ability to do instrument-level automation is being tracked here.
Overview of Usage
The flow of the automation API looks like this:
- Get the current time for an object by reading one of the following properties, either from the object itself or the context:
  - SYZ_P_CURRENT_TIME, plus a small amount of latency, in order to let the application determine how long it needs to get commands enqueued; or
  - SYZ_P_SUGGESTED_AUTOMATION_TIME, for Synthizer's best suggestion of when to enqueue new commands.
- Build an array of struct syz_AutomationCommand:
  - Set the type of the command to a SYZ_AUTOMATION_COMMAND_XXX constant.
  - Set the time to the time of the command.
  - Set the target to the object to be automated.
  - Set the appropriate command payload in the params union.
  - Leave flags at 0, for now.
- Get an automation batch with syz_createAutomationBatch.
- Add commands to it with syz_automationBatchAddCommands.
- Execute the batch with syz_automationBatchExecute.
- Destroy the batch with syz_handleDecRef.
The above is involved primarily because this is C: bindings can and should offer easier, builder-like interfaces that aren't nearly so difficult.
What's going on?
Automation allows users to build a timeline of events and property values, relative to the context's current time. These events can then be enqueued for execution as time advances, and will happen within one "block" of the current time, accurate to about 5ms. Events can be scheduled in the past, and will contribute to any fading that's going on.
The best way to view this is as an approximation of a function. That is, if an event is enqueued at time 0 to set property x to 5, and then at time 10 to set it to 10, then at time 5 the property might be at 5 if the interpolation type is set to linear. The key insight is that it doesn't matter when the time 0 event was scheduled: if it was added at time 4, the value is still 5.
If events are scheduled early, they happen immediately. This is in line with the intended uses of using them to know when automation is nearing its end or is past specific times.
Time
Time in Synthizer is always advancing unless the context is paused. When the context is created, time starts at 0, then proceeds forward. It is possible to read the time using SYZ_P_CURRENT_TIME, which is available on all objects you'd expect: generators, sources, effects, etc. By building relative to these object-local times, applications are prepared for a future in which these times may no longer all defer to the context, even though they currently do.
In most cases, though, automating relative to the current time will cause commands to arrive late, and execute in the past. To deal with this, Synthizer offers SYZ_P_SUGGESTED_AUTOMATION_TIME, the time that Synthizer suggests you enqueue commands relative to in order for them to arrive on time. This also advances forward, but is in the future relative to SYZ_P_CURRENT_TIME. Currently, this is a simple addition, but the goal is to make the algorithm smarter in the future.
SYZ_P_SUGGESTED_AUTOMATION_TIME is also quite latent. It is perfectly acceptable for applications to query the current time, then add their own smaller offset.
There is one typical misuse of APIs such as this: don't re-read the current time when building a continuing timeline. Since the current time is always advancing, re-reading it will cause your timeline to be "jagged". That is:
current_time = # get current time
enqueue(current_time + 5)
current_time = # get current time
enqueue(current_time + 10)
Will not produce events 5 seconds apart, but instead 5 seconds and a bit. Correct usage is:
current_time = # get current time
enqueue_command(current_time + 5)
enqueue_command(current_time + 10)
A good rule of thumb is that time should be read when building an automation batch, then everything for that batch done relative to the time that was read at the beginning.
Finally, as implied by the above, automation doesn't respect pausing unless it's on the context: pausing a generator with automation attached will still have automation advance even while it is paused. This is currently unavoidable due to internal limitations which require significant work to lift.
Automation Batches
In order to enqueue commands, it is necessary to put them somewhere. We also want an API which allows for enqueueing more than one command at a time for efficiency. In order to do this, Synthizer introduces the concept of the automation batch, created with syz_createAutomationBatch.
Automation batches are one-time-use command queues, to which commands can be appended with syz_automationBatchAddCommands. Uniquely in Synthizer, they are not threadsafe: you must provide synchronization yourself. Once all the commands you want to add are added, syz_automationBatchExecute executes the batch. Afterwards, due to internal limitations, the batch may not be reused.
Note that the intent is that batches can automate multiple objects at once. Using a different object with every command is perfectly fine.
Commands
Automation offers the following command types:
- SYZ_AUTOMATION_COMMAND_APPEND_PROPERTY: add a value to the timeline for a property, e.g. "use linear interpolation to get to 20.0 by 10 seconds".
- SYZ_AUTOMATION_COMMAND_SEND_USER_EVENT: add an event to the timeline, to be sent approximately when the timeline crosses the specified time.
- SYZ_AUTOMATION_COMMAND_CLEAR_PROPERTY: clear all events related to one specific property on one specific object.
- SYZ_AUTOMATION_COMMAND_CLEAR_EVENTS: clear all scheduled events for an object.
- SYZ_AUTOMATION_COMMAND_CLEAR_ALL_PROPERTIES: clear automation for all properties for a specific object.
Commands which clear data don't respect time, and take effect immediately. Typically, they're put at the beginning of the automation batch in order to clean up before adding new automation.
Parameters to commands are specified as the variants of syz_AutomationCommand's params union, and are mostly self-explanatory. Property automation is discussed below.
Automating Property Values
Using SYZ_AUTOMATION_COMMAND_APPEND_PROPERTY, it is possible to append values to the timeline for a property on a specific object. This is done via the syz_AutomationPoint struct:
struct syz_AutomationPoint {
unsigned int interpolation_type;
double values[6];
unsigned long long flags;
};
The fields are as follows:
- The interpolation type specifies how to get from the last value (if any) to the current value in the point:
  - SYZ_INTERPOLATION_TYPE_NONE: do nothing until the point is crossed, then jump immediately.
  - SYZ_INTERPOLATION_TYPE_LINEAR: use linear interpolation from the last point.
- The values array specifies the values. The number of indices used depends on the property: double properties use only the first index, d3 properties use the first 3, and d6 properties use all 6.
- Flags is reserved and must be 0.
The timeline for a specific property represents the approximation of a continuous function, not a list of commands. As alluded to above, if points are early they still take effect by changing the continuous function. This is done so that things like audio "animations" can execute accurately even if the application experiences a delay.
Property automation interacts with normal property sets by desugaring normal property sets to SYZ_INTERPOLATION_TYPE_NONE points scheduled "as soon as possible". Users are encouraged to use either automation or property setting rather than both at the same time, but combining them has consistent behavior.
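As a concrete illustration, the following hypothetical sketch fades a generator's gain to 0.0 over 5 seconds. The field and union member names here are assumptions based on this section's description of syz_AutomationCommand; consult synthizer.h for the authoritative layout:
/* requires <string.h> for memset */
double now;
syz_getD(&now, context, SYZ_P_SUGGESTED_AUTOMATION_TIME);
struct syz_AutomationCommand cmd;
memset(&cmd, 0, sizeof(cmd));
cmd.type = SYZ_AUTOMATION_COMMAND_APPEND_PROPERTY;
cmd.target = generator;
cmd.time = now + 5.0;
cmd.params.append_to_property.property = SYZ_P_GAIN;
cmd.params.append_to_property.point.interpolation_type = SYZ_INTERPOLATION_TYPE_LINEAR;
cmd.params.append_to_property.point.values[0] = 0.0;
syz_automationBatchAddCommands(batch, 1, &cmd); /* batch from syz_createAutomationBatch */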
Events
Events are much simpler: you specify an unsigned long long parameter, and get it back when the event triggers. unsigned long long is used because it can losslessly store a pointer on all platforms we support, without having to worry about the size changing on 32-bit platforms.
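Continuing the hypothetical sketch above, a pointer can be round-tripped through the event parameter (uintptr_t avoids sign and width pitfalls; the send_user_event member name is an assumption, as before):
cmd.type = SYZ_AUTOMATION_COMMAND_SEND_USER_EVENT;
cmd.time = now + 2.0;
cmd.params.send_user_event.param = (unsigned long long)(uintptr_t)my_object;
/* When the event arrives, recover the pointer from the event's parameter: */
struct my_state *obj = (struct my_state *)(uintptr_t)param;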
Examples
Synthizer's repository contains some examples demonstrating common automation uses in the examples directory:
- simple_automation.c: build a simple envelope around a waveform.
- automation_circle.c: automate the position of a source to move it in circles.
- play_notes.c: play some notes every time the user presses enter (this works because the waveform is constant, but isn't ideal, as explained above).
Stability and Versioning
Synthizer uses pre-1.0 semantic versioning. This means:
- Major is always 0.
- Minor is incremented for incompatible API changes.
- Patch is incremented for new features and/or bug fixes.
Synthizer is intended to be production ready software, but has not seen wide usage. It's somewhere between beta and 1.0: not as many features as you might want, but also not crashing at the drop of a hat. If you find bugs, please report them against the official repository.
API breakage is still expected. This manual attempts to document where API breakage may occur. These are referred to as provisional features.
Context
Constructors
syz_createContext
SYZ_CAPI syz_ErrorCode syz_createContext(syz_Handle *out, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);
Creates a context configured to play through the default output device.
Properties
Enum | Type | Default | Range | Description |
---|---|---|---|---|
SYZ_P_GAIN | double | 1.0 | value >= 0.0 | The gain of the context |
SYZ_P_POSITION | double3 | (0, 0, 0) | any | The position of the listener. |
SYZ_P_ORIENTATION | double6 | (0, 1, 0, 0, 0, 1) | Two packed unit vectors | The orientation of the listener as (atx, aty, atz, upx, upy, upz) . |
SYZ_P_DEFAULT_DISTANCE_MODEL | int | SYZ_DISTANCE_MODEL_LINEAR | any SYZ_DISTANCE_MODEL | The default distance model for new sources. |
SYZ_P_DEFAULT_DISTANCE_REF | double | 1.0 | value >= 0.0 | The default reference distance for new sources. |
SYZ_P_DEFAULT_DISTANCE_MAX | double | 50.0 | value >= 0.0 | The default max distance for new sources. |
SYZ_P_DEFAULT_ROLLOFF | double | 1.0 | value >= 0.0 | The default rolloff for new sources. |
SYZ_P_DEFAULT_CLOSENESS_BOOST | double | 0.0 | any finite double | The default closeness boost for new sources in dB. |
SYZ_P_DEFAULT_CLOSENESS_BOOST_DISTANCE | double | 0.0 | value >= 0.0 | The default closeness boost distance for new sources. |
SYZ_P_DEFAULT_PANNER_STRATEGY | int | SYZ_PANNER_STRATEGY_STEREO | any SYZ_PANNER_STRATEGY | The default panner strategy for new sources. |
Functions
syz_contextGetNextEvent
See events.
Linger Behavior
None.
Remarks
The context is the main entrypoint to Synthizer, responsible for the following:
- Control and manipulation of the audio device.
- Driving the audio threads.
- Owning all objects that play together.
- Representing the listener in 3D space.
All objects which are associated with a context take a context as part of their constructors. Two objects associated with different contexts should never interact. For efficiency, Synthizer doesn't validate that two objects come from the same context, and the behavior of mixing them is undefined.
All objects associated with a context become useless once the context is destroyed. Calls to them will still work, but they can't be reassociated with a different context and no audible output will result.
Most programs create one context and destroy it at shutdown.
For the time being, all contexts output stereo audio, and it is not possible to specify the output device. These restrictions will be lifted in future.
For information on the meaning of the distance model properties, see 3D Audio.
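A minimal setup might look like the following sketch (error checking omitted; syz_setD3 and syz_setD6 are the standard property setters from synthizer.h):
syz_Handle ctx;
syz_createContext(&ctx, NULL, NULL);
/* Place the listener at the origin, facing +Y, with +Z up (the defaults). */
syz_setD3(ctx, SYZ_P_POSITION, 0.0, 0.0, 0.0);
syz_setD6(ctx, SYZ_P_ORIENTATION, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0);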
Buffer
Constructors
syz_createBufferFromFile
SYZ_CAPI syz_ErrorCode syz_createBufferFromFile(syz_Handle *out, const char *path, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);
Create a buffer from a file using a UTF-8 encoded path.
syz_createBufferFromStreamParams
SYZ_CAPI syz_ErrorCode syz_createBufferFromStreamParams(syz_Handle *out, const char *protocol, const char *path, void *param, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);
Create a buffer from stream parameters. See decoding for information on streams.
This call will decode the stream in the calling thread, returning errors as necessary. Synthizer will eventually offer a BufferCache which supports background decoding and caching, but for the moment the responsibility of background decoding is placed on the calling program.
syz_createBufferFromEncodedData
SYZ_CAPI syz_ErrorCode syz_createBufferFromEncodedData(syz_Handle *out, unsigned long long data_len, const char *data, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);
Create a buffer from encoded audio data in RAM, for example an ogg file read from disk. This also works with memory-mapped pointers. As with all other decoding, Synthizer will autodetect the format from the data. The pointer must live for the duration of the call.
syz_createBufferFromFloatArray
SYZ_CAPI syz_ErrorCode syz_createBufferFromFloatArray(syz_Handle *out, unsigned int sr, unsigned int channels, unsigned long long frames, const float *data, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);
Create a buffer from an array of float data generated by the application. The array must contain channels * frames elements.
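For example, the following sketch builds one second of a 440 Hz sine wave and hands it to Synthizer (assuming, as with the other buffer constructors, that the data need only live for the duration of the call):
#include <math.h>
#include <stdlib.h>

unsigned int sr = 44100, channels = 1;
unsigned long long frames = sr; /* one second of audio */
float *data = malloc(sizeof(float) * channels * frames);
for (unsigned long long i = 0; i < frames; i++) {
  data[i] = (float)sin(2.0 * 3.141592653589793 * 440.0 * (double)i / sr);
}
syz_Handle buffer;
syz_createBufferFromFloatArray(&buffer, sr, channels, frames, data, NULL, NULL);
free(data);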
syz_createBufferFromStreamHandle
SYZ_CAPI syz_ErrorCode syz_createBufferFromStreamHandle(syz_Handle *out, syz_Handle stream, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);
Create a buffer from a stream handle. Usually used with custom streams. Decodes in the calling thread. The lifetime of the stream's underlying asset need only be as long as this call.
syz_bufferGetSizeInBytes
SYZ_CAPI syz_ErrorCode syz_bufferGetSizeInBytes(unsigned long long *size, syz_Handle buffer);
Get the approximate size of this buffer's in-memory representation in bytes.
Properties
None.
Functions
Getters
SYZ_CAPI syz_ErrorCode syz_bufferGetChannels(unsigned int *out, syz_Handle buffer);
SYZ_CAPI syz_ErrorCode syz_bufferGetLengthInSamples(unsigned int *out, syz_Handle buffer);
SYZ_CAPI syz_ErrorCode syz_bufferGetLengthInSeconds(double *out, syz_Handle buffer);
The self-explanatory getters. These aren't properties because they can't be written and they shouldn't participate in the property infrastructure.
Remarks
Buffers hold audio data, as a collection of contiguous chunks. Data is resampled to the Synthizer samplerate and converted to 16-bit PCM.
Buffers are one of the few Synthizer objects that don't require a context. They may be used freely with any object requiring a buffer, from any thread. In order to facilitate this, buffers are immutable after creation.
The approximate memory usage of a buffer in bytes is 2 * channels * duration_in_seconds * 44100
. Loading large assets
into buffers is not recommended. For things such as music tracks, use StreamingGenerators.
Note that on 32-bit architectures, some operating systems only allow a 2 gigabyte address space. Synthizer avoids
allocating buffers as contiguous arrays in part to allow efficient use of 32-bit address spaces, but this only goes so
far. If on a 32-bit architecture, expect to run out of memory from Synthizer's perspective well before decoding 2
Gigabytes of buffers simultaneously due to the inability to find consecutive free pages.
Operations Common to All Sources
Constructors
None.
Properties
Enum | Type | Default | Range | Description |
---|---|---|---|---|
SYZ_P_GAIN | double | 1.0 | any double > 0 | An additional gain factor applied to this source. |
SYZ_P_FILTER | biquad | identity | any | A filter which applies to all audio leaving the source, before SYZ_P_FILTER_DIRECT and SYZ_P_FILTER_EFFECTS . |
SYZ_P_FILTER_DIRECT | biquad | identity | any | A filter which applies after SYZ_P_FILTER but not to audio traveling to effect sends. |
SYZ_P_FILTER_EFFECTS | biquad | identity | any | A filter which runs after SYZ_P_FILTER but only applies to audio traveling through effect sends. |
Functions
syz_sourceAddGenerator
, syz_sourceRemoveGenerator
SYZ_CAPI syz_ErrorCode syz_sourceAddGenerator(syz_Handle source, syz_Handle generator);
SYZ_CAPI syz_ErrorCode syz_sourceRemoveGenerator(syz_Handle source, syz_Handle generator);
Add/remove a generator from a source. Each generator may be added once and duplicate add calls will have no effect. Each generator should only be used with one source at a time.
Remarks
Sources represent audio output. They combine all generators connected to them, apply any effects if necessary, and feed the context. Subclasses of Source add panning and other features.
All sources offer filters via SYZ_P_FILTER
, SYZ_P_FILTER_DIRECT
and
SYZ_P_FILTER_EFFECTS
. First, SYZ_P_FILTER
is applied, then the audio is
split into two paths: the portion heading directly to the speakers gets
SYZ_P_FILTER_DIRECT
, and the portion heading to the effect sends gets
SYZ_P_FILTER_EFFECTS
. This can be used to simulate occlusion and perform
other per-source effect customization.
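For example, occlusion can be approximated by lowpassing only the direct path, leaving audio sent to effects untouched. This sketch assumes the filter design helpers syz_biquadDesignLowpass and syz_biquadDesignIdentity and the syz_setBiquad setter from synthizer.h:
struct syz_BiquadConfig filter;
/* Muffle the portion heading directly to the speakers. */
syz_biquadDesignLowpass(&filter, 1000.0, 0.7071);
syz_setBiquad(source, SYZ_P_FILTER_DIRECT, &filter);
/* Later, clear the occlusion by restoring the identity filter. */
syz_biquadDesignIdentity(&filter);
syz_setBiquad(source, SYZ_P_FILTER_DIRECT, &filter);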
DirectSource
Constructors
syz_createDirectSource
SYZ_CAPI syz_ErrorCode syz_createDirectSource(syz_Handle *out, syz_Handle context, void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Creates a direct source.
Properties
Inherited from Source only.
Linger Behavior
Lingers until the timeout or until all generators have been destroyed.
Remarks
A direct source is for music and other audio assets that shouldn't participate in panning and should instead be linked directly to the speakers.
Audio is converted to the Context's channel count and passed directly through.
AngularPannedSource and ScalarPannedSource
Constructors
syz_createAngularPannedSource
SYZ_CAPI syz_ErrorCode syz_createAngularPannedSource(syz_Handle *out, syz_Handle context, int panner_strategy,
double azimuth, double elevation, void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Creates an angular panned source, which can be controlled through azimuth and elevation.
syz_createScalarPannedSource
SYZ_CAPI syz_ErrorCode syz_createScalarPannedSource(syz_Handle *out, syz_Handle context, int panner_strategy,
double panning_scalar, void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Creates a scalar panned source, controlled via the panning scalar.
Properties
Enum | Type | Default | Range | Description |
---|---|---|---|---|
SYZ_P_AZIMUTH | double | from constructor | 0.0 to 360.0 | The azimuth of the panner. See remarks. |
SYZ_P_ELEVATION | double | from constructor | -90.0 to 90.0 | See remarks. |
SYZ_P_PANNING_SCALAR | double | from constructor | -1.0 to 1.0 | See remarks. |
Linger Behavior
Lingers until all generators have been destroyed.
Remarks
The panned sources give direct control over a panner, which is controlled either via azimuth/elevation in degrees or via a panning scalar. Which properties you use depends on which type of source you create (angular for azimuth/elevation, scalar for the panning scalar).
If using azimuth/elevation, 0.0 azimuth is forward and positive angles are clockwise. Elevation ranges from -90 (down) to 90 (up).
Some applications want to control panners through a panning scalar instead, e.g. for UI purposes. If using panning scalars, -1.0 is full left and 1.0 is full right.
For information on panning, see 3D Audio.
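For example (a sketch; ctx is an existing context, and SYZ_PANNER_STRATEGY_HRTF is one of the SYZ_PANNER_STRATEGY values):
syz_Handle src;
/* Start 90 degrees clockwise from forward, level with the listener. */
syz_createAngularPannedSource(&src, ctx, SYZ_PANNER_STRATEGY_HRTF, 90.0, 0.0, NULL, NULL, NULL);
/* Later, move the sound to the listener's left. */
syz_setD(src, SYZ_P_AZIMUTH, 270.0);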
Source3D
Constructors
syz_createSource3D
SYZ_CAPI syz_ErrorCode syz_createSource3D(syz_Handle *out, syz_Handle context, int panner_strategy, double x, double y,
double z, void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Creates a Source3D positioned at the given coordinates and with no associated generators.
Properties
Enum | Type | Default | Range | Description |
---|---|---|---|---|
SYZ_P_POSITION | double3 | from constructor | any | The position of the source. |
SYZ_P_ORIENTATION | double6 | (0, 1, 0, 0, 0, 1) | Two packed unit vectors | The orientation of the source as (atx, aty, atz, upx, upy, upz) . Currently unused. |
SYZ_P_DISTANCE_MODEL | int | from Context | any SYZ_DISTANCE_MODEL | The distance model for this source. |
SYZ_P_DISTANCE_REF | double | From Context | value >= 0.0 | The reference distance. |
SYZ_P_DISTANCE_MAX | double | From Context | value >= 0.0 | The max distance for this source. |
SYZ_P_ROLLOFF | double | From Context | value >= 0.0 | The rolloff for this source. |
SYZ_P_CLOSENESS_BOOST | double | From Context | any finite double | The closeness boost for this source in DB. |
SYZ_P_CLOSENESS_BOOST_DISTANCE | double | From Context | value >= 0.0 | The closeness boost distance for this source. |
Linger Behavior
Lingers until all generators are destroyed.
Remarks
A Source3D represents an entity in 3D space. For explanations of the above properties, see 3D Audio.
When created, Source3D reads all of its defaults from the Context's corresponding properties. Changes to the Context versions don't affect already created sources. A typical use case is to configure the Context to the defaults of the game, and then create sources.
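In practice this usually reduces to creating the source once and mirroring the game entity's coordinates every frame (a sketch; ctx and the entity coordinates are assumed):
syz_Handle src;
syz_createSource3D(&src, ctx, SYZ_PANNER_STRATEGY_HRTF, 5.0, 0.0, 0.0, NULL, NULL, NULL);
/* Each frame, keep the source in sync with the entity it represents: */
syz_setD3(src, SYZ_P_POSITION, entity_x, entity_y, entity_z);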
Operations Common to All Generators
Generators generate audio, and are how Synthizer knows what to play through sources.
Properties
All generators support the following properties:
Enum | Type | Default | Range | Description |
---|---|---|---|---|
SYZ_P_GAIN | double | 1.0 | value >= 0.0 | The gain of the generator. |
SYZ_P_PITCH_BEND | double | 1.0 | 0.0 <= value <= 2.0 | Pitch bend of the generator as a multiplier (2.0 is +1 octave, 0.5 is -1 octave, etc) |
Remarks
Not all generators support SYZ_P_PITCH_BEND
because it doesn't necessarily
make sense for them to do so, but it can always be set.
BufferGenerator
Constructors
syz_createBufferGenerator
SYZ_CAPI syz_ErrorCode syz_createBufferGenerator(syz_Handle *out, syz_Handle context, void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Creates a BufferGenerator. The buffer is set to NULL and the resulting generator will play silence until one is associated.
Properties
Enum | Type | Default Value | Range | Description |
---|---|---|---|---|
SYZ_P_BUFFER | Object | 0 | Any Buffer handle | The buffer to play |
SYZ_P_PLAYBACK_POSITION | double | 0.0 | value >= 0.0 | The position in the buffer. |
SYZ_P_LOOPING | int | 0 | 0 or 1 | Whether playback loops at the end of the buffer. |
Linger behavior
Disables looping and plays until the buffer ends.
Remarks
BufferGenerators play Buffers. This is the most efficient way to play audio.
SYZ_P_PLAYBACK_POSITION
is reset if SYZ_P_BUFFER
is modified.
SYZ_P_PLAYBACK_POSITION
can be set past the end of the buffer. If
SYZ_P_LOOPING = 0
, the generator will play silence. Otherwise, the position
will immediately loop to the beginning.
More than one BufferGenerator can use the same underlying Buffer.
If the buffer being used by this generator is destroyed, this generator immediately begins playing silence until another buffer is associated.
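Putting the pieces together, a typical playback chain looks like the following sketch (ctx and source are assumed to exist; syz_setO is the object property setter):
syz_Handle buffer, gen;
syz_createBufferFromFile(&buffer, "sound.wav", NULL, NULL);
syz_createBufferGenerator(&gen, ctx, NULL, NULL, NULL);
syz_setO(gen, SYZ_P_BUFFER, buffer);
syz_sourceAddGenerator(source, gen);
/* More than one generator may share this buffer. */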
FastSineBankGenerator
Generate basic waveforms which can be expressed as the sum of sine waves (e.g. square, triangle).
Constructors
syz_createFastSineBankGenerator
struct syz_SineBankWave {
double frequency_mul;
double phase;
double gain;
};
struct syz_SineBankConfig {
const struct syz_SineBankWave *waves;
unsigned long long wave_count;
double initial_frequency;
};
SYZ_CAPI void syz_initSineBankConfig(struct syz_SineBankConfig *cfg);
SYZ_CAPI syz_ErrorCode syz_createFastSineBankGenerator(syz_Handle *out, syz_Handle context,
struct syz_SineBankConfig *bank_config, void *config,
void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Create a sine bank which evaluates itself by summing sine waves at specific multiples of a fundamental frequency. See remarks for specifics on what this means and what the values in the configuration structs should be.
Most applications will want to use the helpers which configure the bank with specific well-known waveforms.
You own the memory pointed to by syz_SineBankConfig
, and it may be freed immediately after the constructor call.
Pointing it at values on the stack is fine.
Specific waveform helpers
SYZ_CAPI syz_ErrorCode syz_createFastSineBankGeneratorSine(syz_Handle *out, syz_Handle context,
double initial_frequency, void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
SYZ_CAPI syz_ErrorCode syz_createFastSineBankGeneratorTriangle(syz_Handle *out, syz_Handle context,
double initial_frequency, unsigned int partials,
void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
SYZ_CAPI syz_ErrorCode syz_createFastSineBankGeneratorSquare(syz_Handle *out, syz_Handle context,
double initial_frequency, unsigned int partials,
void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
SYZ_CAPI syz_ErrorCode syz_createFastSineBankGeneratorSaw(syz_Handle *out, syz_Handle context, double initial_frequency,
unsigned int partials, void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Create waveforms of specific types, e.g. what you'd get from a digital synthesizer. Most applications will wish to use these functions. See remarks for additional notes on quality.
Properties
Enum | Type | Default Value | Range | Description |
---|---|---|---|---|
SYZ_P_FREQUENCY | double | set by constructor | any positive value | The frequency of the waveform. |
Linger behavior
Fades out over a few ms.
Remarks
This implements a fast sine wave generation algorithm which is on the order of single-digit clock cycles per sample, at the cost of slight accuracy. The intended use is to generate chiptune-quality waveforms. For those not familiar, most waveforms may be constructed of summed sine waves at specific frequencies, so this functions as a general-purpose wave generator. Note that attempts to use this to run a Fourier transform will not end well: the slight inaccuracy combined with being O(samples*waves) will cause wave counts over a few hundred to rapidly become impractically slow and low quality. In the best case (right processor, your compiler likes Synthizer, etc.), the theoretical execution time per sample for 32 waves is around 5-10 clock cycles, so it can be pushed pretty far.
In order to specify the waveforms to use, you must provide 3 parameters per wave (a sketch follows this list):
- frequency_mul: the multiplier on the base frequency for this wave. For example, frequency_mul = 1.0 is a sine wave which plays back at whatever frequency the generator is set to, 2.0 is the first harmonic, etc. Fractional values are permitted.
- phase: the phase of the sinusoid in the range 0 to 1. This is slightly odd because there are enough approximations of PI out there depending on language; to get to what Synthizer wants, take your phase in radians and divide by whatever approximation of PI you're using.
- gain: self-explanatory. Negative gains are valid in order to allow converting from mathematical formulas that use them.
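For instance, a hypothetical custom bank with a fundamental and a half-gain first harmonic:
struct syz_SineBankWave waves[2] = {
    {1.0, 0.0, 1.0}, /* the fundamental */
    {2.0, 0.0, 0.5}, /* first harmonic at half gain */
};
struct syz_SineBankConfig cfg;
syz_initSineBankConfig(&cfg);
cfg.waves = waves;
cfg.wave_count = 2;
cfg.initial_frequency = 440.0;
syz_Handle gen;
/* cfg may point at the stack; it need only live for the duration of this call. */
syz_createFastSineBankGenerator(&gen, ctx, &cfg, NULL, NULL, NULL);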
In order to provide a more convenient interface, the helper functions for the various waveform types may be used. These specify the number of partials to generate, which does not exactly equate to harmonics because not all waveforms contain every harmonic. The easiest way to deal with this is simply to play with the value until it sounds good; for most applications, no more than 30 should be required. More specifically, the square wave makes a good concrete example of how partials differ from harmonics, because it only includes odd harmonics. So:
partials | harmonics included |
---|---|
1 | 1.0 (fundamental), 3.0 |
2 | 1.0 (fundamental), 3.0, 5.0 |
3 | 1.0 (fundamental), 3.0, 5.0, 7.0 |
The reason that you might wish to use fewer partials is aliasing. Extremely high partial counts will alias if they go above Nyquist, currently 22050 Hz. If you are playing high frequencies, lowering the partial count may be called for. By contrast, intentionally forcing the generator to alias can produce a more "chiptune"-quality sound. The CPU usage of more partials should be unnoticeable for all practical values; if this turns out not to be the case, you are encouraged to open an issue.
This generator does not allow introducing a DC term. If you need one for some reason, open an issue instead of trying to hack it in with a sine wave at 0 Hz and the appropriate phase and gain.
NoiseGenerator
Inherits from Generator.
Constructors
syz_createNoiseGenerator
SYZ_CAPI syz_ErrorCode syz_createNoiseGenerator(syz_Handle *out, syz_Handle context, unsigned int channels,
void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Creates a NoiseGenerator configured for uniform noise with the specified number of output channels. The number of output channels cannot be configured at runtime. Each channel produces decorrelated noise.
Properties
Enum | Type | Default Value | Range | Description |
---|---|---|---|---|
SYZ_P_NOISE_TYPE | int | SYZ_NOISE_TYPE_UNIFORM | any SYZ_NOISE_TYPE | The type of noise to generate. See remarks. |
Linger Behavior
Fades out over a few milliseconds.
Remarks
NoiseGenerators generate noise. This is most useful when filtered via the source, and can produce plausible (if low-quality) wind and whistling effects.
Synthizer allows setting the algorithm used to generate noise to one of the following options. Note that these are more precisely named than white/pink/brown; the sections below document the equivalent in the more standard nomenclature.
SYZ_NOISE_TYPE_UNIFORM
A uniform noise source. From an audio perspective this is white noise, but is sampled from a uniform rather than Gaussian distribution for efficiency.
SYZ_NOISE_TYPE_VM
This is pink noise generated with the Voss-McCartney algorithm, which consists of a number of summed uniform random number generators which are run at different rates. Synthizer adds an additional random number generator at the top of the hierarchy in order to improve the color of the noise in the high frequencies.
SYZ_NOISE_TYPE_FILTERED_BROWN
This is brown noise generated with a -6 dB filter.
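For example, switching a one-channel generator to pink noise (a sketch; syz_setI is the standard int property setter):
syz_Handle noise;
syz_createNoiseGenerator(&noise, ctx, 1, NULL, NULL, NULL);
syz_setI(noise, SYZ_P_NOISE_TYPE, SYZ_NOISE_TYPE_VM);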
StreamingGenerator
Constructors
syz_createStreamingGeneratorFromFile
SYZ_CAPI syz_ErrorCode syz_createStreamingGeneratorFromFile(syz_Handle *out, syz_Handle context, const char *path,
void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Create a StreamingGenerator from a UTF-8 encoded path.
syz_createStreamingGeneratorFromStreamParams
SYZ_CAPI syz_ErrorCode syz_createStreamingGeneratorFromStreamParams(syz_Handle *out, syz_Handle context,
const char *protocol, const char *path, void *param,
void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Create a StreamingGenerator from the standard stream parameters.
syz_createStreamingGeneratorFromStreamHandle
SYZ_CAPI syz_ErrorCode syz_createStreamingGeneratorFromStreamHandle(syz_Handle *out, syz_Handle context,
syz_Handle stream, void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Create a StreamingGenerator
from a stream handle.
Properties
Enum | Type | Default Value | Range | Description |
---|---|---|---|---|
SYZ_P_PLAYBACK_POSITION | double | 0.0 | value >= 0.0 | The position of the stream. |
SYZ_P_LOOPING | int | 0 | 0 or 1 | Whether playback loops |
Linger Behavior
Disables looping and continues until the stream ends.
Remarks
StreamingGenerator
plays streams, decoding and reading on demand. The typical
use case is for music playback.
Due to the expense of streaming from disk and other I/O sources, having more than a few StreamingGenerators going will cause a decrease in audio quality on many systems, typically manifesting as drop-outs and crackling. StreamingGenerator creates one background thread per instance and does all decoding and I/O in that thread.
At startup, StreamingGenerator's background thread eagerly decodes a relatively large amount of data in order to build up a buffer which prevents underruns. Thereafter, it picks up property changes every time the background thread wakes up to add more data to the buffer. This means that most operations are high latency, currently on the order of 100 to 200 ms. The least latent operation is the initial start-up, which begins playing as soon as enough data is decoded. How long that takes depends on the format and I/O characteristics of the stream, as well as the user's machine and the current load of the system.
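Typical music playback looks like the following sketch (ctx and a DirectSource created as in the sources section are assumed):
syz_Handle music;
syz_createStreamingGeneratorFromFile(&music, ctx, "music.ogg", NULL, NULL, NULL);
syz_setI(music, SYZ_P_LOOPING, 1);
syz_sourceAddGenerator(direct_source, music);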
Operations Common to All Effects
Properties
Enum | Type | Default | Range | Description |
---|---|---|---|---|
SYZ_P_GAIN | double | usually 1.0 | value >= 0.0 | The overall gain of the effect. |
SYZ_P_FILTER_INPUT | biquad | usually identity; if not, documented with the effect. | any | A filter which applies to the input of this effect. Runs after filters on effect sends. |
Functions
syz_effectReset
SYZ_CAPI syz_ErrorCode syz_effectReset(syz_Handle effect);
Clears the internal state of the effect. Intended for design/development purposes. This function may produce clicks and other artifacts and is slow.
Remarks
For more information on how effects work, see the dedicated section.
GlobalEcho
Constructors
syz_createGlobalEcho
SYZ_CAPI syz_ErrorCode syz_createGlobalEcho(syz_Handle *out, syz_Handle context, void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Creates an echo effect.
Functions
syz_globalEchoSetTaps
struct syz_EchoTapConfig {
double delay;
double gain_l;
double gain_r;
};
SYZ_CAPI syz_ErrorCode syz_globalEchoSetTaps(syz_Handle handle, unsigned int n_taps, struct syz_EchoTapConfig *taps);
Configure the taps for this echo. Currently, delay must be no greater than 5 seconds. To clear the taps, pass an array of 0 elements.
Properties
None
Linger Behavior
Lingers until the delay line is empty, that is until no more echoes can possibly be heard.
Remarks
This is a stereo tapped delay line, with a one-block crossfade when taps are reconfigured. The max delay is currently fixed at 5 seconds, but this will be made user configurable in future.
This implementation offers precise control over the placement of taps, at the cost of not being able to have indefinitely long echo effects. It's most useful for modeling discrete, panned echo taps. Some ways this is useful are:
- Emphasize footsteps off walls in large spaces, by computing the parameters for the taps off level geometry.
- Emphasize openings or corridors.
- Pair it with a reverb implementation to offer additional, highly controlled early reflection emphasis.
This is effectively discrete convolution for 2 channels, implemented using an
algorithm designed for sparse taps. In other words, the cost of any echo effect
is O(taps)
per sample. Anything up to a few thousand discrete taps is
probably fine, but beyond that the cost will become prohibitive.
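For example, a small three-tap pattern (a sketch; the routing calls, syz_initRouteConfig and syz_routingConfigRoute, are assumed from the effects routing API described in the dedicated effects section):
struct syz_EchoTapConfig taps[3] = {
    {0.1, 1.0, 0.3}, /* 100 ms, biased left */
    {0.2, 0.3, 1.0}, /* 200 ms, biased right */
    {0.3, 0.5, 0.5}, /* 300 ms, centered */
};
syz_Handle echo;
syz_createGlobalEcho(&echo, ctx, NULL, NULL, NULL);
syz_globalEchoSetTaps(echo, 3, taps);
/* Route an existing source through the echo: */
struct syz_RouteConfig route;
syz_initRouteConfig(&route);
syz_routingConfigRoute(ctx, source, echo, &route);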
GlobalFdnReverb
A reverb based off a feedback delay network.
Inherits from GlobalEffect.
Constructors
syz_createGlobalFdnReverb
SYZ_CAPI syz_ErrorCode syz_createGlobalFdnReverb(syz_Handle *out, syz_Handle context, void *config, void *userdata,
syz_UserdataFreeCallback *userdata_free_callback);
Creates a global FDN reverb with default settings.
Properties
See remarks for a description of what these do and how to use them effectively.
In addition to the below, FdnReverb defaults its gain to 0.7. Gains of 1.0 are almost never what you want, since that makes the reverb as loud as the non-reverb audio paths.
Enum | Type | Default | Range | Description |
---|---|---|---|---|
SYZ_P_FILTER_INPUT | biquad | Lowpass Butterworth at 2000 Hz | any biquad | A filter that applies to the audio at the input of the reverb. |
SYZ_P_MEAN_FREE_PATH | double | 0.1 | 0.0 to 0.5 | The mean free path of the simulated environment. |
SYZ_P_T60 | double | 0.3 | 0.0 to 100.0 | The T60 of the reverb |
SYZ_P_LATE_REFLECTIONS_LF_ROLLOFF | double | 1.0 | 0.0 to 2.0 | A multiplicative factor on T60 for the low frequency band |
SYZ_P_LATE_REFLECTIONS_LF_REFERENCE | double | 200.0 | 0.0 to 22050.0 | Where the low band of the feedback equalizer ends |
SYZ_P_LATE_REFLECTIONS_HF_ROLLOFF | double | 0.5 | 0.0 to 2.0 | A multiplicative factor on T60 for the high frequency band |
SYZ_P_LATE_REFLECTIONS_HF_REFERENCE | double | 500.0 | 0.0 to 22050.0 | Where the high band of the equalizer starts. |
SYZ_P_LATE_REFLECTIONS_DIFFUSION | double | 1.0 | 0.0 to 1.0 | Controls the diffusion of the late reflections as a percent. |
SYZ_P_LATE_REFLECTIONS_MODULATION_DEPTH | double | 0.01 | 0.0 to 0.3 | The depth of the modulation of the delay lines on the feedback path in seconds. |
SYZ_P_LATE_REFLECTIONS_MODULATION_FREQUENCY | double | 0.5 | 0.01 to 100.0 | The frequency of the modulation of the delay lines in the feedback paths. |
SYZ_P_LATE_REFLECTIONS_DELAY | double | 0.03 | 0.0 to 0.5 | The delay of the late reflections relative to the input in seconds. |
Linger behavior
Lingers for slightly longer than t60
.
Remarks
This is a reverb composed of a feedback delay network with 8 internal delay lines. The algorithm proceeds as follows:
- Audio is fed through the input filter, a lowpass. Use this to eliminate high frequencies, which can be quite harsh when fed to reverb algorithms.
- Then, audio is fed into a series of 8 delay lines, connected with a feedback
matrix. It's essentially a set of parallel allpass filters with some
additional feedbacks, but inspired by physics.
- Each of these delay lines is modulated, to reduce periodicity.
- On each feedback path, the audio is fed through an equalizer to precisely control the decay rate in 3 frequency bands.
- Two decorrelated channels are extracted. This will be increased to 4 when surround sound support is added.
- Finally, the output is delayed by the late reflections delay.
The current reverb model is missing spatialized early reflections. Practically speaking this makes very little difference when using an FDN, because the FDN simulates them effectively on its own, but the SYZ_P_EARLY_REFLECTIONS_* namespace is reserved for that purpose. The plan is to feed them through HRTF in order to attempt to capture the shape of the room, possibly with a per-source model.
The reverb is also missing the ability to pan late reflections; this is on the roadmap.
The default configuration is something to the effect of a medium-sized room. Presets will be added in future. The following sections explain considerations for reverb design with this algorithm:
A Note On Property Changes
The FdnReverb effect involves a large amount of feedback and is therefore impossible to crossfade efficiently. To that end, we don't try. Expect most property changes, save for t60 and the hf/lf frequency controls, to cause clicking and other artifacts.
To change properties smoothly, it's best to create a new reverb, set all its parameters, connect all the sources to it, and then disconnect all the sources from the old one, in that order. Synthizer may eventually do this internally, but that necessitates a large permanent allocation cost as well as significant implementation work, so for the moment it doesn't. A sketch of this flow appears at the end of this section.
In practice, this doesn't matter. Most environments don't change reverb characteristics. A good flow is as follows:
- Design the reverb in your level editor/other environment.
- When necessary, use syz_effectReset for interactive experimentation.
- When distributing/launching for real, use the above crossfading instructions.
It is of course possible to use more than one reverb at a time as well, and to fade sources between them at different levels. Note, however, that reverbs are relatively expensive.
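The crossfade flow described above might look like the following sketch. The routing calls (syz_initRouteConfig, syz_routingConfigRoute, syz_routingRemoveRoute) are assumed from the effects routing API; consult the dedicated effects section for their exact semantics:
syz_Handle new_reverb;
syz_createGlobalFdnReverb(&new_reverb, ctx, NULL, NULL, NULL);
syz_setD(new_reverb, SYZ_P_T60, 1.5); /* set all parameters before connecting */
struct syz_RouteConfig route;
syz_initRouteConfig(&route);
/* Connect the source to the new reverb first... */
syz_routingConfigRoute(ctx, source, new_reverb, &route);
/* ...then fade the old route out over 50 ms. */
syz_routingRemoveRoute(ctx, source, old_reverb, 0.05);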
The Input Filter
Most reverb algorithms have a problem: high frequencies are emphasized. Synthizer's is no different. To solve this, we introduce an input lowpass filter which can cut out the higher frequencies. This is SYZ_P_FILTER_INPUT, available on all effects, but defaulted by the reverb to a lowpass at 2000 Hz (see the property table above) because most of the negative characteristics of reverbs occur when high frequencies are overemphasized.
Changing this cutoff filter is the strongest tool available for coloring the reverb. Low cutoffs are great for rooms with sound dampening, high cutoffs for concrete walls. It can be disabled, but doing so will typically cause metallic and periodic artifacts to be noticeable.
It's also possible to swap it with other filter types. Lowpass filters are effectively the only filter type that aligns with the real world in the context of a reverb, but other filter types can produce interesting effects.
Choosing the mean free path and late reflections delay
These two values are most directly responsible for controlling how big a space feels. Intuitively, the mean free path is the average distance from wall to wall, and the late reflections delay is the time it takes for audio to hit something for the first time. In general, get the mean free path by dividing the average distance between the walls by the speed of sound, and set the late reflections delay to something in the same order of magnitude.
A good approximation for the mean free path is 4 * volume / surface_area
.
Mathematically, it's the average time sound travels before reflection off an
obstacle. Very large mean free paths produce many discrete echoes. For
unrealistically large values, the late reflections won't be able to converge at
all.
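As a worked example (hypothetical numbers, not from the original text): a 10 m by 10 m room with a 3 m ceiling has a volume of 300 m^3 and a surface area of 2 * (10*10 + 10*3 + 10*3) = 320 m^2. The approximation gives 4 * 300 / 320 = 3.75 m between reflections, and dividing by the speed of sound (roughly 343 m/s) yields a mean free path of about 0.011 seconds.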
Choosing T60 and controlling per-band decay
The t60 and related properties control the gains and configuration of a filter on the feedback path.
The t60 of a reverb is defined as the time it takes for the reverb to decay by -60 dB. Effectively this can be thought of as how long until the reverb is completely silent. 0.2 to 0.5 is a particularly reverberant and large living room, 1.0 to 2.0 is a concert hall, 5.0 is an amazingly large cavern, and values larger than that quickly become unrealistic and metallic.
Most environments don't have the same decay time for all frequency bands, so the FdnReverb actually uses a 3-band equalizer instead of raw gains on the feedback paths. The bands are as follows:
- 0.0 to SYZ_P_LATE_REFLECTIONS_LF_REFERENCE
- SYZ_P_LATE_REFLECTIONS_LF_REFERENCE to SYZ_P_LATE_REFLECTIONS_HF_REFERENCE
- SYZ_P_LATE_REFLECTIONS_HF_REFERENCE to Nyquist
SYZ_P_T60 controls the decay time of the middle frequency band. The lower band is t60 * lf_rolloff, and the upper t60 * hf_rolloff. This allows you to simply change T60 and let the rolloff ratios control coloration.
Intuitively, rooms with carpet on all the walls have a rather low hf reference and rolloff, and giant stone caverns are close to equal in all frequency bands. The lf reference/rolloff pairing can be used primarily for non-natural bass boosting. When the reverb starts, all frequencies are relatively equal, but as the audio is continually fed back through the feedback paths, the equalizer emphasizes or deemphasizes the 3 frequency bands at different rates. To use this effectively, treat the hf/lf controls as defining the materials of the walls, then move t60.
Note that the amount of coloration you can get from the equalizer is limited especially for short reverbs. To control the perception of the environment more bluntly and independently of t60, use the input filter.
Diffusion
The diffusion of the reverb is how fast the reverb tail transitions from discrete echoes to a continuous reverberant response. Synthizer exposes this to you as a percent-based control, since it's not conveniently possible to tie anything to a real physical quantity in this case. Typically, diffusion at 1.0 (the default) is what you want.
Another way to think of diffusion is how rough the walls are, how many obstacles there are for sound to bounce off of, etc.
Delay Line modulation
A problem with feedback delay networks and/or other allpass/comb filter reverb designs is that they tend to be obviously periodic. To deal with this, modulation of the delay lines on the feedback path is often introduced. The final stage of designing an FdnReverb is to decide on the values of the modulation depth and frequency.
The trade-off here is this:
- At low modulation depth/frequency, the reverb likes to sound metallic.
- At high modulation depth/frequency, the reverb gains very obvious nonlinear effects.
- At very high modulation depth/frequency, the reverb doesn't sound like a reverb at all.
FdnReverb tries to default to universally applicable settings, but it might still be worth adjusting these. To disable modulation altogether, set the depth to 0.0; due to internal details, setting the frequency to 0.0 is not possible.
The artifacts introduced by large modulation depth/frequency values are least noticeable with percussive sounds and most noticeable with constant tones such as pianos and vocals. Inversely, the periodic artifacts of no or little modulation are most noticeable with percussive sounds and least noticeable with constant tones.
In general, the best way to not need to touch these settings is to use realistic t60, as the beginning of the reverb isn't generally periodic.
Audio EQ Cookbook
The following is the Audio EQ Cookbook, containing the most widely used formulas for biquad filters. Synthizer's internal implementation of most filters either follows these exactly or is composed of cascaded/parallel sections.
There are several versions of this document on the web. This version is from http://music.columbia.edu/pipermail/music-dsp/2001-March/041752.html.
Cookbook formulae for audio EQ biquad filter coefficients
---------------------------------------------------------------------------
by Robert Bristow-Johnson <rbj at gisco.net> a.k.a. <robert at audioheads.com>
All filter transfer functions were derived from analog prototypes (that
are shown below for each EQ filter type) and had been digitized using the
Bilinear Transform. BLT frequency warping has been taken into account
for both significant frequency relocation and for bandwidth readjustment.
First, given a biquad transfer function defined as:
b0 + b1*z^-1 + b2*z^-2
H(z) = ------------------------ (Eq 1)
a0 + a1*z^-1 + a2*z^-2
This shows 6 coefficients instead of 5 so, depending on your architechture,
you will likely normalize a0 to be 1 and perhaps also b0 to 1 (and collect
that into an overall gain coefficient). Then your transfer function would
look like:
(b0/a0) + (b1/a0)*z^-1 + (b2/a0)*z^-2
H(z) = --------------------------------------- (Eq 2)
1 + (a1/a0)*z^-1 + (a2/a0)*z^-2
or
1 + (b1/b0)*z^-1 + (b2/b0)*z^-2
H(z) = (b0/a0) * --------------------------------- (Eq 3)
1 + (a1/a0)*z^-1 + (a2/a0)*z^-2
The most straight forward implementation would be the Direct I form (using Eq 2):
y[n] = (b0/a0)*x[n] + (b1/a0)*x[n-1] + (b2/a0)*x[n-2]
- (a1/a0)*y[n-1] - (a2/a0)*y[n-2] (Eq 4)
This is probably both the best and the easiest method to implement in the 56K.
Now, given:
sampleRate (the sampling frequency)
frequency ("wherever it's happenin', man." "center" frequency
or "corner" (-3 dB) frequency, or shelf midpoint frequency,
depending on which filter type)
dBgain (used only for peaking and shelving filters)
bandwidth in octaves (between -3 dB frequencies for BPF and notch
or between midpoint (dBgain/2) gain frequencies for peaking EQ)
_or_ Q (the EE kind of definition)
_or_ S, a "shelf slope" parameter (for shelving EQ only). when S = 1,
the shelf slope is as steep as it can be and remain monotonically
increasing or decreasing gain with frequency. the shelf slope, in
dB/octave, remains proportional to S for all other values.
First compute a few intermediate variables:
A = sqrt[ 10^(dBgain/20) ]
= 10^(dBgain/40) (for peaking and shelving EQ filters only)
omega = 2*PI*frequency/sampleRate
sin = sin(omega)
cos = cos(omega)
alpha = sin/(2*Q) (if Q is specified)
= sin*sinh[ ln(2)/2 * bandwidth * omega/sin ] (if bandwidth is specified)
beta = sqrt(A)/Q (for shelving EQ filters only)
= sqrt(A)*sqrt[ (A + 1/A)*(1/S - 1) + 2 ] (if shelf slope is specified)
= sqrt[ (A^2 + 1)/S - (A-1)^2 ]
Then compute the coefficients for whichever filter type you want:
The analog prototypes are shown for normalized frequency.
The bilinear transform substitutes:
1 1 - z^-1
s <- -------------- * ----------
tan(omega/2) 1 + z^-1
and makes use of these trig identities:
sin(w)
tan(w/2) = ------------
1 + cos(w)
1 - cos(w)
(tan(w/2))^2 = ------------
1 + cos(w)
LPF: H(s) = 1 / (s^2 + s/Q + 1)
b0 = (1 - cos)/2
b1 = 1 - cos
b2 = (1 - cos)/2
a0 = 1 + alpha
a1 = -2*cos
a2 = 1 - alpha
HPF: H(s) = s^2 / (s^2 + s/Q + 1)
b0 = (1 + cos)/2
b1 = -(1 + cos)
b2 = (1 + cos)/2
a0 = 1 + alpha
a1 = -2*cos
a2 = 1 - alpha
BPF (constant skirt gain): H(s) = s / (s^2 + s/Q + 1)
b0 = Q*alpha
b1 = 0
b2 = -Q*alpha
a0 = 1 + alpha
a1 = -2*cos
a2 = 1 - alpha
BPF (constant peak gain): H(s) = (s/Q) / (s^2 + s/Q + 1)
b0 = alpha
b1 = 0
b2 = -alpha
a0 = 1 + alpha
a1 = -2*cos
a2 = 1 - alpha
notch: H(s) = (s^2 + 1) / (s^2 + s/Q + 1)
b0 = 1
b1 = -2*cos
b2 = 1
a0 = 1 + alpha
a1 = -2*cos
a2 = 1 - alpha
APF: H(s) = (s^2 - s/Q + 1) / (s^2 + s/Q + 1)
b0 = 1 - alpha
b1 = -2*cos
b2 = 1 + alpha
a0 = 1 + alpha
a1 = -2*cos
a2 = 1 - alpha
peakingEQ: H(s) = (s^2 + s*(A/Q) + 1) / (s^2 + s/(A*Q) + 1)
b0 = 1 + alpha*A
b1 = -2*cos
b2 = 1 - alpha*A
a0 = 1 + alpha/A
a1 = -2*cos
a2 = 1 - alpha/A
lowShelf: H(s) = A * (s^2 + beta*s + A) / (A*s^2 + beta*s + 1)
b0 = A*[ (A+1) - (A-1)*cos + beta*sin ]
b1 = 2*A*[ (A-1) - (A+1)*cos ]
b2 = A*[ (A+1) - (A-1)*cos - beta*sin ]
a0 = (A+1) + (A-1)*cos + beta*sin
a1 = -2*[ (A-1) + (A+1)*cos ]
a2 = (A+1) + (A-1)*cos - beta*sin
highShelf: H(s) = A * (A*s^2 + beta*s + 1) / (s^2 + beta*s + A)
b0 = A*[ (A+1) + (A-1)*cos + beta*sin ]
b1 = -2*A*[ (A-1) + (A+1)*cos ]
b2 = A*[ (A+1) + (A-1)*cos - beta*sin ]
a0 = (A+1) - (A-1)*cos + beta*sin
a1 = 2*[ (A-1) - (A+1)*cos ]
a2 = (A+1) - (A-1)*cos - beta*sin
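To make the application of these formulas concrete, here is a small self-contained C sketch (illustrative only, not Synthizer's implementation) that designs the LPF above and runs the Direct Form I difference equation (Eq 4):
#include <math.h>

struct biquad {
  double b0, b1, b2, a1, a2; /* coefficients, normalized so that a0 == 1 */
  double x1, x2, y1, y2;     /* input/output history */
};

/* LPF: H(s) = 1 / (s^2 + s/Q + 1), per the cookbook section above. */
static void biquad_design_lowpass(struct biquad *f, double sample_rate, double frequency, double q) {
  const double omega = 2.0 * 3.141592653589793 * frequency / sample_rate;
  const double sn = sin(omega), cs = cos(omega);
  const double alpha = sn / (2.0 * q);
  const double a0 = 1.0 + alpha;
  f->b0 = ((1.0 - cs) / 2.0) / a0;
  f->b1 = (1.0 - cs) / a0;
  f->b2 = ((1.0 - cs) / 2.0) / a0;
  f->a1 = (-2.0 * cs) / a0;
  f->a2 = (1.0 - alpha) / a0;
  f->x1 = f->x2 = f->y1 = f->y2 = 0.0;
}

/* Direct Form I (Eq 4): one sample in, one sample out. */
static double biquad_tick(struct biquad *f, double x) {
  const double y = f->b0 * x + f->b1 * f->x1 + f->b2 * f->x2 - f->a1 * f->y1 - f->a2 * f->y2;
  f->x2 = f->x1;
  f->x1 = x;
  f->y2 = f->y1;
  f->y1 = y;
  return y;
}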