Introduction

This is the manual for Synthizer, a library for 3D audio and synthesis with a focus on games and VR applications.

There are 3 ways to learn Synthizer:

  • This manual acts as a conceptual overview, as well as an API and object reference. If you have worked with audio libraries before, you can likely read the concepts section and hit the ground running. In particular, Synthizer is much like a more limited (but faster) form of WebAudio.
  • The Python bindings are no longer officially extended except through community contribution, but their repository contains a number of examples and tutorials.
  • Finally, the official repository contains a number of C examples.

Where are the Python Bindings?

The Python bindings are found at this repository, which now also contains the docs and a set of examples. They are maintained separately from Synthizer itself.

The Synthizer C API

Synthizer has the following headers:

  • synthizer.h: All library functions
  • synthizer_constants.h: Constants, i.e. the very large property enum.

These headers are separated because, when writing bindings for languages which don't have good automatic binding generation, it is often easier to machine-parse synthizer_constants.h and then handle synthizer.h manually.

The Synthizer C API returns errors and writes results to out parameters. Out parameters are always the first parameters of a function, and errors are always nonzero. Note that error codes are currently not defined; they will be, once things are more stable.

It is possible to get information on the last error using these functions:

SYZ_CAPI syz_ErrorCode syz_getLastErrorCode(void);
SYZ_CAPI const char *syz_getLastErrorMessage(void);

Error information is thread-local. The last error message is valid until the next call into Synthizer on the same thread.
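
One common pattern is to wrap every call in a macro which prints the last error message and aborts on failure. A minimal sketch (the CHECKED macro is application code, not part of Synthizer):

#include <stdio.h>
#include <stdlib.h>

#include "synthizer.h"

/* Abort with a diagnostic if a Synthizer call fails. */
#define CHECKED(x) do { \
    if ((x) != 0) { \
        fprintf(stderr, "%s failed: %s\n", #x, syz_getLastErrorMessage()); \
        exit(1); \
    } \
} while (0)

/* Usage: CHECKED(syz_initialize()); */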

Logging, Initialization, and Shutdown

The following excerpts from synthizer.h specify the logging and initialization API. Explanation follows:

enum SYZ_LOGGING_BACKEND {
    SYZ_LOGGING_BACKEND_NONE,
    SYZ_LOGGING_BACKEND_STDERR,
};

enum SYZ_LOG_LEVEL {
    SYZ_LOG_LEVEL_ERROR = 0,
    SYZ_LOG_LEVEL_WARN = 10,
    SYZ_LOG_LEVEL_INFO = 20,
    SYZ_LOG_LEVEL_DEBUG = 30,
};

struct syz_LibraryConfig {
    unsigned int log_level;
    unsigned int logging_backend;
    const char *libsndfile_path;
};

SYZ_CAPI void syz_libraryConfigSetDefaults(struct syz_LibraryConfig *config);

SYZ_CAPI syz_ErrorCode syz_initialize(void);    
SYZ_CAPI syz_ErrorCode syz_initializeWithConfig(const struct syz_LibraryConfig *config);
SYZ_CAPI syz_ErrorCode syz_shutdown();

Synthizer can be initialized in two ways. The simplest is syz_initialize, which uses reasonable library defaults for most apps. The second is (excluding error checking):

struct syz_LibraryConfig config;

syz_libraryConfigSetDefaults(&config);
syz_initializeWithConfig(&config);

In particular, the latter approach allows for enabling logging and loading Libsndfile.
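
For example, to enable debug logging to stderr (a minimal sketch, excluding error checking):

struct syz_LibraryConfig config;

syz_libraryConfigSetDefaults(&config);
config.log_level = SYZ_LOG_LEVEL_DEBUG;
config.logging_backend = SYZ_LOGGING_BACKEND_STDERR;
syz_initializeWithConfig(&config);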

Currently Synthizer can only log to stderr and logging is slow enough that it shouldn't be enabled in production. It mostly exists for debugging. In future these restrictions will be lifted.

For more information on Libsndfile support, see the dedicated chapter.

Handles and userdata

Synthizer objects are referred to via reference-counted handles, with an optional void * userdata pointer that can be associated to link them to application state:

SYZ_CAPI syz_ErrorCode syz_handleIncRef(syz_Handle handle);
SYZ_CAPI syz_ErrorCode syz_handleDecRef(syz_Handle handle);
SYZ_CAPI syz_ErrorCode syz_handleGetObjectType(int *out, syz_Handle handle);

SYZ_CAPI syz_ErrorCode syz_handleGetUserdata(void **out, syz_Handle handle);
typedef void syz_UserdataFreeCallback(void *);
SYZ_CAPI syz_ErrorCode syz_handleSetUserdata(syz_Handle handle, void *userdata, syz_UserdataFreeCallback *free_callback);

Basics of Handles

All Synthizer handles start with a reference count of 1. When the reference count reaches 0, the object is scheduled for deletion, but may not be deleted immediately. Uniquely among Synthizer functions, syz_handleIncRef and syz_handleDecRef can be called after library shutdown, which allows languages like Rust to implement infallible cloning and freeing. Because Synthizer objects may stay around for a while after their last reference is dropped, the resulting lifetime issues can be dealt with through userdata support, as described below.

Provided that the application manages reference counts correctly, no interface except syz_handleDecRef decrements reference counts in a way visible to the application.

Synthizer objects are like classes. They have "methods" and "bases". For example all generators support a common set of operations named with a syz_generatorXXX prefix.

The reserved config argument

Many constructors take a void *config argument. This must be set to NULL, and is reserved for future use.

Userdata

Synthizer makes it possible to associate application data via a void * pointer which will share the object's actual lifetime rather than the lifetime of the handle to the object. This is useful for allowing applications to store state, but also helps to deal with the lifetime issues introduced by the mismatch between the last reference count of the object dying and the object actually dying. For example, the Rust and Python bindings use userdata to attach buffers to objects when streaming from memory, so that the actual underlying resource stays around until Synthizer is guaranteed to no longer use it.

Getting and setting userdata pointers is done in one of two ways. All Synthizer constructors take two additional parameters to set the userdata and the free callback. Alternatively, an application can go through syz_handleGetUserdata and syz_handleSetUserdata. This is a threadsafe interface which will associate a void * argument with the object. The interface acts as if the operations were wrapped in a mutex internally, though it completes without syscalls in all reasonable cases of library usage.

The free_callback parameter to syz_handleSetUserdata is optional. If present, it will be called on the userdata pointer when the object is destroyed or when a new userdata pointer is set. Due to the constraints of realtime audio programming, this free happens in a background thread and may occur up to hundreds of milliseconds after the object no longer exists.
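
As a sketch, an application might attach a heap-allocated struct to an existing handle like this (my_state and handle are hypothetical application code):

#include <stdlib.h>

struct my_state {
    int entity_id;
};

static void my_state_free(void *p) {
    free(p);
}

/* Synthizer calls my_state_free once the object is actually gone. */
struct my_state *state = malloc(sizeof(*state));
state->entity_id = 42;
syz_handleSetUserdata(handle, state, my_state_free);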

Most bindings will bind userdata support in a more friendly way. For example, Python provides a get_userdata and set_userdata pair which work on normal Python objects.

Basics of Audio in Synthizer

This section explains how to get audio into and out of Synthizer. The following objects must be used by every application:

  • Generators produce audio, for example by reading a buffer of audio data.
  • Sources play audio from one or more generators.
  • Contexts represent audio devices and group objects for the same device together.

The most basic flow of Synthizer is to create a context, source, and generator, then connect the generator to the source. For example, you might combine BufferGenerator and DirectSource to play a stereo audio file to the speakers, or swap DirectSource for Source3D to place the sound in the 3D environment.
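
Concretely, that flow looks something like the following sketch (excluding error checking; the constructors follow the pattern described above under handles and userdata of a reserved void *config plus the userdata pair, but check synthizer.h for the exact prototypes in your version):

syz_Handle context, generator, source, buffer;

syz_initialize();

syz_createContext(&context, NULL, NULL);
syz_createBufferGenerator(&generator, context, NULL, NULL, NULL);
syz_createDirectSource(&source, context, NULL, NULL, NULL);

/* Load a file and attach it to the generator via its buffer property. */
syz_createBufferFromFile(&buffer, "sound.wav");
syz_setO(generator, SYZ_P_BUFFER, buffer);

/* Nothing is audible until the generator is connected to a source. */
syz_sourceAddGenerator(source, generator);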

The Context

Contexts represent audio devices and the listener in 3D space. They:

  • Figure out the output channel format necessary for the current audio device and convert audio to it.
  • Offer the ability to globally set gain.
  • Let users set the position of the listener in 3D space.
  • Let users set defaults for other objects, primarily the distance model and panning strategies.
    • If your application wants HRTF on by default, enable it on the context by setting SYZ_P_DEFAULT_PANNER_STRATEGY.

For more information on 3D audio, see the dedicated section.

Almost all objects in Synthizer require a context to be created and must be used only with the context they're associated with.

A common question is whether an app should ever have more than one context. Though this is possible, contexts are very expensive objects that directly correspond to audio devices. Having 2 or 3 is the upper limit of what is reasonable, but it is by far easiest to only have one as this prevents running into issues where you mix objects from different contexts together.

Introduction to Generators

Generators are how audio first enters Synthizer. They can do things like play a buffer, generate noise, or stream audio data. By themselves they're silent and don't do anything, so they must be connected to sources via syz_sourceAddGenerator.

Generators are like a stereo without speakers: you have to plug them into something else before they're audible. In this case the "something else" is a source. Synthizer only supports using a generator with one source at a time, but every source can have multiple generators. That is, given generators g1 and g2 and sources s1 and s2, then g1 and g2 could be connected to s1, g1 to s1 and g2 to s2, but not g1 to both s1 and s2 at the same time.

Introduction to Sources

Sources are how generators are made audible. Synthizer offers 3 main kinds of source:

  • The DirectSource plays audio directly, and can be used for things like background music. This is the only source type which won't convert audio to mono before using it.
  • The ScalarPannedSource and AngularPannedSource allow manual control over pan: the former via a scalar from -1 (all left) to 1 (all right), the latter by azimuth/elevation.
  • The Source3D allows for positioning audio in 3D space.

Every source offers the following functions:

SYZ_CAPI syz_ErrorCode syz_sourceAddGenerator(syz_Handle source, syz_Handle generator);
SYZ_CAPI syz_ErrorCode syz_sourceRemoveGenerator(syz_Handle source, syz_Handle generator);

Every source will mix audio from as many generators as are connected to it and then feed the audio through to the output of the source and to effects. See the section on channel mixing for how audio is converted to various different output formats, and effects and filters for information on how to do more with this API than simply playing audio.

Controlling Object Properties

Basics

Most interesting audio control happens through properties, which are like knobs on hardware controllers or dials in your DAW. Synthizer picks up property values on the next audio tick and automatically handles crossfading and graceful changes as your app drives the values. Every property is identified by a SYZ_P constant in synthizer_constants.h. In bindings, SYZ_P_MY_PROPERTY will generally become my_property, MyProperty, etc. depending on the dominant style of the language, and then either become an actual settable property or a get_property/set_property pair, depending on whether the language supports customized properties that aren't just member variables (e.g. @property in Python, properties in C#).

All properties are of one of the following types:

  • int or double, identified by i and d suffixes in the property API; the standard C primitive types.
  • double3 and double6, identified by d3 and d6 suffixes; vectors of 3 and 6 doubles respectively, primarily used to set position and orientation.
  • object, identified by an o suffix; used to set object properties such as the buffer to use for a buffer generator.
  • biquad, configuration for a biquad filter; used on effects and sources to allow filtering audio.

No property constant represents a property of two different types. For example, SYZ_P_POSITION exists on both Context and Source3D but is a d3 in both cases; generators use the separate SYZ_P_PLAYBACK_POSITION, which is always a double. Synthizer will always maintain this constraint.

The Property API is as follows:

SYZ_CAPI syz_ErrorCode syz_getI(int *out, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setI(syz_Handle target, int property, int value);

SYZ_CAPI syz_ErrorCode syz_getD(double *out, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setD(syz_Handle target, int property, double value);

SYZ_CAPI syz_ErrorCode syz_setO(syz_Handle target, int property, syz_Handle value);

SYZ_CAPI syz_ErrorCode syz_getD3(double *x, double *y, double *z, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setD3(syz_Handle target, int property, double x, double y, double z);

SYZ_CAPI syz_ErrorCode syz_getD6(double *x1, double *y1, double *z1, double *x2, double *y2, double *z2, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setD6(syz_Handle handle, int property, double x1, double y1, double z1, double x2, double y2, double z2);

SYZ_CAPI syz_ErrorCode syz_getBiquad(struct syz_BiquadConfig *filter, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setBiquad(syz_Handle target, int property, const struct syz_BiquadConfig *filter);

Property accesses happen without syscalls and are usually atomic operations and enqueues on a lockfree queue.
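
For example, given an existing source (a minimal sketch, excluding error checking):

double gain;

syz_setD(source, SYZ_P_GAIN, 0.5);
syz_setD3(source, SYZ_P_POSITION, 10.0, 5.0, 0.0);

/* Reads go through an internal queue and may briefly return a stale value. */
syz_getD(&gain, source, SYZ_P_GAIN);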

Object Properties Are Weak

Object properties do not increment the reference count of the handle associated with them. There isn't much to say here, but it is important enough that it's worth calling out with a section. For example, if you set the buffer on a buffer generator and then decrement the buffer's reference count to 0, the generator will stop playing audio rather than keeping the buffer alive.

A Note on Reading

Property reads need further explanation. Because audio programming requires not blocking the audio thread, Synthizer internally uses queues for property writes. This means that any read may be outdated by some amount, even if the thread making the read just set the value. Typically, reads should be reserved for properties that Synthizer also sets (e.g. SYZ_P_PLAYBACK_POSITION) or used for debugging purposes.

syz_getO is not offered by this API because implementing it would require a mutex, which the audio thread can't afford. Additionally, object lifetime concerns make it difficult for such an interface to do something sane.

Though the above limitations prevent this anyway, it is in general an antipattern to store application state in your audio library. Even if reads were always up to date, it would still be slow to get data back out. Applications should keep things like object position around and update Synthizer, rather than asking Synthizer what the last value was.

Setting Gain/volume

SYZ_P_GAIN

All objects which play audio (generators, sources, contexts) offer a SYZ_P_GAIN property, a double scalar between 0.0 and infinity which controls object volume. For example 2.0 is twice the amplitude and 0.5 is half the amplitude. This works as you'd expect: if set on a generator it only affects that generator, if on a source it affects everything connected to the source, and so on. If a generator is set to 0.5 and the source that it's on is also 0.5, the output volume of the generator is 0.25 because both gains apply in order.

This means that it is possible to control the volume of generators relative to each other when all connected to the same source, then control the overall volume of the source.

A Note on Human Perception

Humans don't perceive amplitude changes as you'd expect. For example, moving from 1.0 to 2.0 will generally sound like a large jump in volume, but from 2.0 to 3.0 much less so, and so on. Most audio applications that expose volume sliders to humans express them in decibels and convert to an amplitude factor internally. If you're just writing a game, you can mostly ignore this, but if you're doing something more complicated a proper understanding of decibels is important. In decibels, a gain of 1.0 is 0 dB, and every change of 1 dB sounds like roughly the same change in loudness as any other. The specific formulas to convert to and from a gain are as follows:

decibels = 20 * log10(gain)
gain = 10**(db/20)

Where ** is exponentiation.
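
In C, the conversions might look like this (a minimal sketch):

#include <math.h>

static double gain_to_db(double gain) {
    return 20.0 * log10(gain);
}

static double db_to_gain(double db) {
    return pow(10.0, db / 20.0);
}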

The obvious question is of course "why not expose this as decibels?" The problem is that gains over 1.0 will clip in most applications, and decibels don't add the way gains do. A gain of 1.0 is 0 dB. If two incredibly loud sounds, each with a gain of 1.0, play at the same time, the overall gain is effectively 2.0, which can clip just the same; but 0 dB + 0 dB is still 0 dB, even though the correct gain is 2.0. This gets worse for gains below 1.0. Consider 0.5, which is roughly -6 dB: 0.5 + 0.5 is 1.0, but -6 + -6 is -12 dB, which isn't only wrong, it even moved in the wrong direction altogether.

As a consequence Synthizer always uses multiplicative factors on the amplitude, not decibels. Unless you know what you're doing, you should convert to gain as soon as possible and reason about how this works as a multiplier.

Pausing and Resuming Playback

All objects which play audio offer the following two functions:

SYZ_CAPI syz_ErrorCode syz_pause(syz_Handle object);
SYZ_CAPI syz_ErrorCode syz_play(syz_Handle object);

These do exactly what they appear to do.

In bindings, these are usually bound as instance methods, e.g. myobj.pause().

Configuring Objects to Continue Playing Until Silent

By default, Synthizer objects become silent when their reference counts go to 0, but this isn't always what you want. Sometimes, it is desirable to be able to continue playing audio until the object is "finished", for example for gunshots or other one-off effects. Synthizer calls this lingering, and offers the following API to configure it:

struct syz_DeleteBehaviorConfig {
    int linger;
    double linger_timeout;
};

SYZ_CAPI void syz_initDeleteBehaviorConfig(struct syz_DeleteBehaviorConfig *cfg);
SYZ_CAPI syz_ErrorCode syz_configDeleteBehavior(syz_Handle object, struct syz_DeleteBehaviorConfig *cfg);

To use it, call syz_initDeleteBehaviorConfig on an empty syz_DeleteBehaviorConfig struct, fill out the struct, and call syz_configDeleteBehavior. The fields have the following meaning:

  • linger: if 0, die immediately, which is the default. If 1, keep the object around until it "finishes". What this means depends on the object and is documented in the object reference, but it generally "does what you'd expect". For some examples:
    • BufferGenerator will stop any looping and play until the end of the buffer, or die immediately if paused.
    • All sources will keep going until all their generators are no longer around.
  • linger_timeout: if nonzero, set an upper bound on the amount of time an object may linger for. This is useful as a sanity check in your application.
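
For example, to let a one-shot source keep playing after the app drops its reference (a minimal sketch, excluding error checking; source is an existing source handle):

struct syz_DeleteBehaviorConfig cfg;

syz_initDeleteBehaviorConfig(&cfg);
cfg.linger = 1;
cfg.linger_timeout = 10.0; /* sanity check: never linger longer than 10 seconds */
syz_configDeleteBehavior(source, &cfg);

/* The source now plays until its generators finish, then dies. */
syz_handleDecRef(source);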

These functions only configure what happens when the last reference to an object goes away and do not destroy the object or manipulate the reference count in any other way. It is valid to call them immediately after object creation if desired. No Synthizer interface besides syz_handleDecRef will destroy an object unless otherwise explicitly documented.

Lingering doesn't keep related objects alive. For example a BufferGenerator that is lingering still goes silent if the buffer attached to it is destroyed.

As with pausing, bindings usually make this an instance method.

Decoding Audio Data

The Quick Overview

Synthizer supports mp3, wav, and flac. If you need more formats, then you can load Libsndfile or decode the data yourself.

If you need to read from a file, use e.g. syz_createBufferFromFile. If you need to read from memory, use e.g. syz_createBufferFromEncodedData. If you need to just shove floats at Synthizer, use syz_createBufferFromFloatArray.

StreamingGenerator has a similar set of methods. In general you can find out what methods are available in the object reference. Everything supports some function that's equivalent to syz_createBufferFromFile.

These functions are the most stable interface because they can be easily supported across incompatible library versions. If your app can use them, it should do so.

Streams

Almost all of these methods wrap and hide something called a stream handle, which can be created with e.g. syz_createStreamHandleFromFile, then used with e.g. syz_createBufferFromStreamHandle. Bindings expose this to you, usually with classes or your language's equivalent (e.g. in Python this is StreamHandle). This is used to get data from custom sources, for example the network or encrypted asset stores. For info on writing your own streams, see the dedicated section.

In addition to getting streams via specific methods, Synthizer also exposes a generic interface:

SYZ_CAPI syz_ErrorCode syz_createStreamHandleFromStreamParams(syz_Handle *out, const char *protocol, const char *path, void *param);

Using the generic interface, streams are referred to with:

  • A protocol, for example "file", which specifies the kind of stream it is. Users can register their own protocols.
  • A path, for example to a file on disk. This is protocol-specific.
  • And a void * param, which is passed through to the underlying stream implementation, and currently ignored by Synthizer.

So, for example, you might get a file by:

syz_createStreamHandleFromStreamParams("file", path, NULL);

Streams don't support raw data. They're always an encoded asset. So for example mp3 streams are a thing, but floats at 44100 streams aren't. Synthizer will offer a better interface for raw audio data pending there being enough demand and a reason to go beyond syz_createBufferFromFloatArray.

Loading Libsndfile

Synthizer supports 3 built-in audio formats: wav, mp3, and flac. For apps which need more, Synthizer supports loading Libsndfile. To do so, use syz_initializeWithConfig and set libsndfile_path to the absolute path of a Libsndfile shared object (.dll, .so, etc.). Libsndfile will then automatically be used where possible, replacing the built-in decoders.

Unfortunately, due to Libsndfile limitations, Libsndfile can only be used on seekable streams of known length. All Synthizer-provided methods of decoding currently support this, but custom streams may opt not to do so, for example if they're reading from the network. In this case, Libsndfile will be skipped. To see if this is happening, enable debug logging at library initialization and Synthizer will log what decoders it's trying to use.

Because of licensing incompatibilities, Libsndfile cannot be statically linked with Synthizer without effectively changing Synthizer's license to LGPL. Consequently dynamic linking with explicit configuration is the only way to use it. Your app will need to arrange to distribute a Libsndfile binary as well and use the procedure described above to load it.

Implementing Custom Streams and Custom Stream Protocols

Synthizer supports implementing custom streams in order to read from places that aren't files or memory: encrypted asset stores, the network, and so on. This section explains how to implement them.

Before continuing, carefully consider whether you need this. Implementing a stream in a higher-level language and forcing Synthizer to go through it has a small but likely noticeable performance hit. It'll work fine, but the built-in functionality will certainly be faster and more scalable, and implementing a stream in C is a complex process. If your app can use the already-existing functionality, it is encouraged to do so.

A Complete Python Example

The rest of this section will explain in detail how streams work from the C API, but this is a very complex topic and most of the infrastructure which exists for it exists to make it possible to write convenient bindings. Consequently, here is a complete and simple custom stream which wraps a Python file object, registered as a custom protocol:

import sys

import synthizer


class CustomStream:
    def __init__(self, path):
        self.file = open(path, "rb")

    def read(self, size):
        return self.file.read(size)

    def seek(self, position):
        self.file.seek(position)

    def close(self):
        self.file.close()

    def get_length(self):
        # Remember where we were, seek to the end to find the length,
        # then seek back.
        pos = self.file.tell()
        length = self.file.seek(0, 2)
        self.file.seek(pos)
        return length


def factory(protocol, path, param):
    return CustomStream(path)


synthizer.register_stream_protocol("custom", factory)
# ctx is an existing synthizer.Context created elsewhere.
gen = synthizer.StreamingGenerator.from_stream_params(ctx, "custom", sys.argv[1])

Your bindings will document how to do this; for example, in Python see help(synthizer.register_stream_protocol). It's usually going to be this level of complexity when doing it from a binding. The rest of this section explains what's going on from the C perspective, but non-C users are still encouraged to read it because it explains the general idea and offers best practices for efficient and stable stream usage.

It's important to note that though this example demonstrates using StreamingGenerator, buffers have similar methods to decode themselves from streams. Since StreamingGenerator has a large latency for anything but the initial start-up, the primary use case is actually likely to be buffers.

The C Interface

To define a custom stream, the following types are used:

typedef int syz_StreamReadCallback(unsigned long long *read, unsigned long long requested, char *destination, void *userdata, const char ** err_msg);
typedef int syz_StreamSeekCallback(unsigned long long pos, void *userdata, const char **err_msg);
typedef int syz_StreamCloseCallback(void *userdata, const char **err_msg);
typedef void syz_StreamDestroyCallback(void *userdata);

struct syz_CustomStreamDef {
    syz_StreamReadCallback *read_cb;
    syz_StreamSeekCallback *seek_cb;
    syz_StreamCloseCallback *close_cb;
    syz_StreamDestroyCallback *destroy_cb;
    long long length;
    void *userdata;
};

SYZ_CAPI syz_ErrorCode syz_createStreamHandleFromCustomStream(syz_Handle *out, struct syz_CustomStreamDef *callbacks);

typedef int syz_StreamOpenCallback(struct syz_CustomStreamDef *callbacks, const char *protocol, const char *path, void *param, void *userdata, const char **err_msg);
SYZ_CAPI syz_ErrorCode syz_registerStreamProtocol(const char *protocol, syz_StreamOpenCallback *callback, void *userdata);

The following sections explain how these functions work.

Ways To Get A Custom Stream

There are two ways to get a custom stream. You can:

  • Fill out the callbacks in syz_CustomStreamDef and use syz_createStreamHandleFromCustomStream.
  • Write a function which will fill out syz_CustomStreamDef from the standard stream parameters, and register a protocol with syz_registerStreamProtocol.

The difference between these is scope: if you don't register a protocol, only your app can access the custom stream, presumably via a module that produces them. This is good because it keeps things modular. If you register a protocol, however, it can be used from anywhere in the process, including other libraries and modules. For example, an encrypted_sqlite3 protocol implemented once in a C library could be used to add the "encrypted_sqlite3" protocol to any language.

Protocol names must be unique. The behavior is undefined if they aren't. A good way of ensuring this is to namespace them. For example, "ahicks92.my_super_special_protocol".

The void *param parameter is reserved for your implementation, and is passed to the factory callback if using the stream parameters approach. It's assumed that implementations going through syz_createStreamHandleFromCustomStream already have a way to move this information around.

Non-callback syz_CustomStreamDef parameters

These are:

  • length, which must be set and known for seekable streams. If the length of the stream is unknown, set it to -1.
  • userdata, which is passed as the userdata parameter to all stream callbacks.

The Stream Callbacks

Streams have the following callbacks, with mostly self-explanatory parameters:

  • If going through the protocol interface, the open callback is called when the stream is first opened. If going through syz_createStreamHandleFromCustomStream, it is assumed that the app already opened the stream and has put whatever it is going to need into the userdata field.
  • After that, the read and (if present) seek callbacks are called until the stream is no longer needed. The seek callback is optional.
  • The close callback is called when Synthizer will no longer use the underlying asset.
  • The destroy callback, optional, is called when it is safe to free all resources the stream is using.

For more information on why we offer both the close and destroy callback, see below on error handling.

All callbacks should return 0 on success, and (if necessary) write to their out parameters.

The read callback must always read exactly as many bytes as are requested, never more. If it reads fewer bytes than were requested, Synthizer treats this as an end-of-stream condition. If the end of the stream has already been reached, the read callback should claim it read no bytes.

The seek callback is optional. Streams don't need to support seeking, but not doing so disables seeking in StreamingGenerator. It also disables support for Libsndfile if Libsndfile was loaded, and additionally support for decoding wav files. In order to be seekable, a stream must:

  • Have a seek callback; and
  • Fill out the length field with a positive value, the length of the stream in bytes.
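
To make this concrete, here is a minimal sketch of a custom stream which reads from an in-memory buffer. The memory_stream type and the mem_* callbacks are hypothetical application code; only the callback typedefs and syz_CustomStreamDef come from Synthizer:

#include <string.h>

struct memory_stream {
    const char *data;
    unsigned long long length;
    unsigned long long position;
};

static int mem_read(unsigned long long *read, unsigned long long requested,
                    char *destination, void *userdata, const char **err_msg) {
    struct memory_stream *s = userdata;
    unsigned long long available = s->length - s->position;
    unsigned long long got = requested < available ? requested : available;
    memcpy(destination, s->data + s->position, got);
    s->position += got;
    *read = got; /* a short read signals end of stream */
    return 0;
}

static int mem_seek(unsigned long long pos, void *userdata, const char **err_msg) {
    struct memory_stream *s = userdata;
    if (pos > s->length) {
        *err_msg = "seek past end of stream";
        return 1;
    }
    s->position = pos;
    return 0;
}

static int mem_close(void *userdata, const char **err_msg) {
    return 0; /* nothing to release until the destroy callback, if any */
}

Filling out the definition and creating the stream handle then looks like (my_stream is an already-initialized struct memory_stream *):

struct syz_CustomStreamDef def = {0};
syz_Handle stream;

def.read_cb = mem_read;
def.seek_cb = mem_seek;
def.close_cb = mem_close;
def.length = (long long)my_stream->length; /* known length, so the stream is seekable */
def.userdata = my_stream;
syz_createStreamHandleFromCustomStream(&stream, &def);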

Error Handling

To indicate an error, callbacks should return a nonzero value and (optionally) set their err_msg parameter to a string representation of the error. Synthizer will log these errors if logging is enabled. For more complex error handling, apps are encouraged to ferry the information from streams to their main threads themselves. If a stream callback fails, Synthizer will generally stop the stream altogether; consequently, apps should do their best to recover and never fail the stream. Synthizer takes the approach of assuming that any error is likely unrecoverable and expects that implementations already did their best to succeed.

If the read callback fails, the position of the stream isn't updated. If the seek callback fails, Synthizer assumes that the position didn't move.

The reason that Synthizer offers a destroy callback in addition to one for closing is so that streams may use non-static strings as error messages. Synthizer may not be done logging these when the stream is closed, so apps doing this should make sure the strings live at least until the destroy callback runs, after which Synthizer promises to never use anything related to the stream again.

The simplest way to handle error messages for C users is to just use string constants, but for other languages such as Python it is useful to be able to convert errors to strings and attach them to the binding's object so that these can be logged. The destroy callback primarily exists for this use case.

Synthizer makes one more guarantee on the lifetime required of err_msg strings: they need only live as long as the next time a stream callback is called. This means that, for example, the Python binding only keeps the most recent error string around and replaces it as necessary.

Thread Safety

Streams will only ever be used by one thread at a time, but may be moved between threads.

Channel Upmixing and Downmixing

Synthizer has built-in understanding of mono (1 channel) and stereo (2 channels) audio formats. It will mix other formats to these as necessary. Specifically, we:

  • If converting from mono to any other format, broadcast the mono channel to all of those in the other format.
  • If going to mono, sum and normalize the channels in the other format.
  • Otherwise, either drop extra channels or fill extra channels with silence.

Synthizer will be extended to support surround sound in future, which will give it a proper understanding of 4, 6, and 8 channels. Since Synthizer is aimed at non-experimental home media applications, we assume that the channel count is sufficient to know what the format is going to be. For example, there is no real alternative to 5.1 audio in the home environment if the audio has 6 channels. If you need more complex multichannel handling, you can pre-convert your audio to something Synthizer understands. Otherwise, other libraries may be a better option.

3D Audio, Panning, and HRTF

Introduction

Synthizer supports panning audio through two interfaces.

First are AngularPannedSource and ScalarPannedSource, which provide simple azimuth/elevation controls and the ability to pan based off a scalar, a value between -1 (all left) and 1 (all right). In this case the user application must compute these values itself.

The second way is to use Source3D, which simulates a 3D environment when fed positional data. This section concerns itself with proper use of Source3D, which is less straightforward for those who haven't had prior exposure to these concepts.

Introduction

There are two mandatory steps to using Source3D as well as a few optional ones. The two mandatory steps are these:

  • On the context, update SYZ_P_POSITION and SYZ_P_ORIENTATION with the listener's position and orientation.
  • On the source, update SYZ_P_POSITION with the source's position.

And optionally:

  • Configure the default distance model to control how rapidly sources become quiet.
  • Emphasize that sources have become close to the listener with the focus boost.
  • Add effects (covered in a dedicated section).

Don't Move Sources Through the Head

People frequently pick up Synthizer, then try to move the source through the center of the listener's head, then ask why it's weird and didn't work. It is important to realize that this is a physical simulation of reality, and that the reason you can move the source through the listener's head in the first place is that this isn't an easily detectable case. If you aren't driving Synthizer in a way connected to physical reality--for example if you are attempting to use x as a way to pan sources from left to right and not linked to a position--then you probably want one of the raw panned sources instead.

Setting Global Defaults

In addition to controlling the listener, the context offers the ability to set defaults for all values discussed below. This is done through a set of SYZ_P_DEFAULT_* properties which match the names of those on sources. This is of particular interest to those wishing to use HRTF, which is off by default.

Synthizer's Coordinate System and Orientations

The short version for those who are familiar with libraries that need position and orientation data: Synthizer uses a right-handed coordinate system and configures orientation through double6 properties so that it is possible to atomically set all 6 values at once. This is represented as a packed at and up vector pair in the format (at_x, at_y, at_z, up_x, up_y, up_z) on the context under the SYZ_P_ORIENTATION property. As with every other library doing a similar thing, these are unit vectors.

The short version for those who want a 2D coordinate system where positive y is north, positive x is east, and player orientations are represented as degrees clockwise of north: use only x and y of SYZ_P_POSITION, and set SYZ_P_ORIENTATION as follows:

(sin(angle * PI / 180), cos(angle * PI / 180), 0, 0, 0, 1)
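
In C, updating the listener for such a 2D game might look like this (a minimal sketch; player_x, player_y, and angle are hypothetical application variables, with angle in degrees clockwise of north):

#include <math.h>

#define DEG_TO_RAD (3.141592653589793 / 180.0)

syz_setD3(context, SYZ_P_POSITION, player_x, player_y, 0.0);
syz_setD6(context, SYZ_P_ORIENTATION,
          sin(angle * DEG_TO_RAD), cos(angle * DEG_TO_RAD), 0.0,
          0.0, 0.0, 1.0);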

The longer version is as follows.

Listener positions are represented through 3 values: the position, the at vector, and the up vector. The position is self-explanatory. The at vector points in the direction the listener is looking at all times, and the up vector points out the top of the listener's head, as if there were a pole up the spine. By driving these 3 values, it is possible to represent any position and orientation a listener might assume.

Synthizer uses a right-handed coordinate system, which means that if the at vector points along positive x and the up vector along positive y, positive z moves sources to the right. This is called a right-handed coordinate system because of the right-hand rule: if you point your fingers along positive x and curl them toward positive y, your thumb points along positive z. This isn't a standard; every library that does something with object positions tends to choose a different convention. If combining Synthizer with other non-2D components, it may be necessary to convert between coordinate systems. Resources on how to do this are easily found through Google.

The at and up vectors must always be orthogonal, that is, at a right angle to each other. To facilitate this, Synthizer uses double6 properties so that both vectors can and must be set at the same time. If it didn't, there would be a brief period where one was set and the other wasn't, during which the pair would be temporarily invalid. Synthizer doesn't try to validate that these vectors are orthogonal and generally does its best when they aren't, but behavior in that case is nonetheless undefined.

Finally, the at and up vectors must be unit vectors: vectors of length 1.

Panning Strategies and HRTF

The panning strategy specifies how sources are to be panned. Synthizer supports the following panning strategies:

Strategy | Channels | Description
SYZ_PANNER_STRATEGY_HRTF | 2 | An HRTF implementation, intended for use via headphones.
SYZ_PANNER_STRATEGY_STEREO | 2 | A simple stereo panning strategy assuming speakers are at -90 and 90.

When a source is created, the panning strategy it is to use is passed via the constructor function and cannot be changed. A special value, SYZ_PANNER_STRATEGY_DELEGATE, allows the source to delegate this choice to the context, and can be used in cases where the context's configuration should be preferred. The vast majority of applications will do this configuration via the context and SYZ_PANNER_STRATEGY_DELEGATE; other values should be saved for cases in which you wish to mix panning types.

By default Synthizer is configured to use a stereo panning strategy, which simply pans between two speakers. This is because stereo panning strategies work on all devices from headphones to 5.1 surround sound systems, and it is not possible for Synthizer to reliably determine if the user is using headphones or not. HRTF provides a much better experience for headphone users but must be enabled by your application through setting the default panner strategy or doing so on individual sources.

Since panning strategies are per source, it is possible to have sources using different panning strategies. This is useful for two reasons: HRTF is expensive enough that you may wish to disable it if dealing with hundreds or thousands of sources, and it is sometimes useful to let UI elements use a different panning strategy. An example of this latter case is an audio gauge which pans from left to right.

Distance Models

The distance model controls how quickly sources become quiet as they move away from the listener. This is controlled through the following properties:

  • SYZ_P_DISTANCE_MODEL: which of the distance model formulas to use.
  • SYZ_P_DISTANCE_MAX: the maximum distance at which the source will be audible.
  • SYZ_P_DISTANCE_REF: if you assume your source is a sphere, what's the radius of it?
  • SYZ_P_ROLLOFF: with some formulas, how rapidly does the sound get quieter? Generally, configuring this to a higher value makes the sound drop off more immediately near the head, then have more subtle changes at further distances.

It is not possible to provide generally applicable advice for what you should set the distance model to. A game using meters needs very different settings than one using feet or light years. Furthermore, these values don't have concrete physical correspondences: of the things Synthizer offers, this is possibly the least physically motivated and the most artistic from a game design perspective. In other words: play with different values and see what you like.

The concrete formulas for the distance models are as follows. Let d be the distance to the source, d_ref the reference distance, d_max the max distance, r the roll-off factor. Then the gain of the source is computed as a linear scalar using one of the following formulas:

Model | Formula
SYZ_DISTANCE_MODEL_NONE | 1.0
SYZ_DISTANCE_MODEL_LINEAR | 1 - r * (clamp(d, d_ref, d_max) - d_ref) / (d_max - d_ref)
SYZ_DISTANCE_MODEL_EXPONENTIAL when d_ref == 0.0 | 0.0
SYZ_DISTANCE_MODEL_EXPONENTIAL when d_ref > 0.0 | (max(d_ref, d) / d_ref) ** -r
SYZ_DISTANCE_MODEL_INVERSE when d_ref == 0.0 | 0.0
SYZ_DISTANCE_MODEL_INVERSE when d_ref > 0.0 | d_ref / (d_ref + r * (max(d, d_ref) - d_ref))

The Closeness Boost

Sometimes, it is desirable to make sources "pop out" of the background environment. For example, if the player approaches an object with which they can interact, making it noticeably louder as the boundary is crossed can be useful. This is of primary interest to designers of audiogames, a genre of games for blind players, as it can be used to emphasize features of the environment in non-realistic but informative ways.

This is controlled through two properties:

  • SYZ_P_CLOSENESS_BOOST: a value in dB controlling how much louder to make the sound. Negative values are allowed.
  • SYZ_P_CLOSENESS_BOOST_DISTANCE: when the source is closer than this distance, begin applying the closeness boost.

When the source is closer than the configured distance, the normal gain computation still applies, but an additional factor, the number of dB in the closeness boost, is added. This means that it is still possible for players to tell whether they are getting closer to the source.

The reason that the closeness boost is specified in dB is that otherwise it would require values greater than 1.0, and it is primarily going to be set by artists and map developers. If we discover that this is a problem in the future, it will be patched in a major Synthizer version bump.

Note that closeness boost has not gotten a lot of use yet. Though we are unlikely to remove the interface, the internal algorithms backing it might change.

Filters and Effects

Synthizer supports filters and effects in order to add environmental audio and do more than just playing sources in a vacuum. These sections explain how this works.

Filters

Synthizer supports a filter property type, as well as filters on effect sends. The API for this is as follows:

struct syz_BiquadConfig {
...
};

SYZ_CAPI syz_ErrorCode syz_getBiquad(struct syz_BiquadConfig *filter, syz_Handle target, int property);
SYZ_CAPI syz_ErrorCode syz_setBiquad(syz_Handle target, int property, const struct syz_BiquadConfig *filter);

SYZ_CAPI syz_ErrorCode syz_biquadDesignIdentity(struct syz_BiquadConfig *filter);
SYZ_CAPI syz_ErrorCode syz_biquadDesignLowpass(struct syz_BiquadConfig *filter, double frequency, double q);
SYZ_CAPI syz_ErrorCode syz_biquadDesignHighpass(struct syz_BiquadConfig *filter, double frequency, double q);
SYZ_CAPI syz_ErrorCode syz_biquadDesignBandpass(struct syz_BiquadConfig *filter, double frequency, double bandwidth);

See properties for how to set filter properties and effects for how to apply filters to effect sends.

The struct syz_BiquadConfig is an opaque struct whose fields are only exposed to allow allocating them on the stack. It represents configuration for a biquad filter, designed using the Audio EQ Cookbook. It's initialized with one of the above design functions.

A suggested default for q is 0.7071135624381276, which gives Butterworth lowpass and highpass filters. For those not already familiar with biquad filters, q controls resonance: higher values of q will cause the filter to ring for some period of time.

All sources support filters, which may be installed in 3 places:

  • SYZ_P_FILTER: applies to all audio traveling through the source.
  • SYZ_P_FILTER_DIRECT: applied after SYZ_P_FILTER to audio going directly to the speakers/through panners.
  • SYZ_P_FILTER_EFFECTS: Applied after SYZ_P_FILTER to audio going to effects.

This allows filtering the audio to effects separately, for example to cut high frequencies out of reverb on a source-by-source basis.
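
For example, to cut frequencies above 1 kHz out of only the audio a source sends to effects (a minimal sketch, excluding error checking):

struct syz_BiquadConfig filter;

/* A 1 kHz Butterworth lowpass, applied to the effects path only. */
syz_biquadDesignLowpass(&filter, 1000.0, 0.7071135624381276);
syz_setBiquad(source, SYZ_P_FILTER_EFFECTS, &filter);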

Additionally, all effects support a SYZ_P_FILTER_INPUT, which applies to all input audio to the effect. So, you can either have:

source filter -> direct path filter -> speakers

Or:

source filter -> effects filter outgoing from source -> filter on effect send -> input filter to effect -> effect

In future, Synthizer will stabilize the syz_BiquadConfig struct and use it to expose more options, e.g. automated filter modulation.

Effects and Effect Routing

Users of the Synthizer API can route any number of sources to any number of global effects, for example echo. This is done through the following C API:

struct syz_RouteConfig {
    double gain;
    double fade_time;
    struct syz_BiquadConfig filter;
};

SYZ_CAPI syz_ErrorCode syz_initRouteConfig(struct syz_RouteConfig *cfg);
SYZ_CAPI syz_ErrorCode syz_routingConfigRoute(syz_Handle context, syz_Handle output, syz_Handle input, struct syz_RouteConfig *config);
SYZ_CAPI syz_ErrorCode syz_routingRemoveRoute(syz_Handle context, syz_Handle output, syz_Handle input, double fade_out);
SYZ_CAPI syz_ErrorCode syz_routingRemoveAllRoutes(syz_Handle context, syz_Handle output, double fade_out);

Routes are uniquely identified by the output object (Source3D, etc) and input object (Echo, etc). There is no route handle type, nor is it possible to form duplicate routes.

In order to establish or update the parameters of a route, use syz_routingConfigRoute. This will form a route if there wasn't already one, and update the parameters as necessary.

It is necessary to initialize syz_RouteConfig with syz_initRouteConfig before using it, but this need only be done once. After that, reusing the same syz_RouteConfig for a route without reinitializing it is encouraged.

Gains are per route and apply after the gain of the source. For example, you might feed 70% of a source's output to something (gain = 0.7).

Filters are also per route and apply after any filters on sources. For example, this can be used to change the filter on a per-reverb basis for a reverb zone algorithm that feeds sources to more than one reverb at a time.

In order to remove a route, use syz_routingRemoveRoute. Alternatively, syz_routingRemoveAllRoutes can remove all routes from a source.
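
As a sketch, routing a source to a reverb and later removing the route might look like this (excluding error checking; reverb is assumed to be a global effect created elsewhere):

struct syz_RouteConfig route;

syz_initRouteConfig(&route);
route.gain = 0.7;       /* feed 70% of the source's output to the effect */
route.fade_time = 0.05; /* crossfade changes to this route over 50 ms */
syz_routingConfigRoute(context, source, reverb, &route);

/* Later: fade the route out over 50 ms and remove it. */
syz_routingRemoveRoute(context, source, reverb, 0.05);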

Many effects involve feedback and/or other long-running audio as part of their intended function. But while in development, it is often useful to reset an effect. Synthizer exposes a function for this purpose:

SYZ_CAPI syz_ErrorCode syz_effectReset(syz_Handle effect);

This works on any effect (at worst, it does nothing). As with things like property access it is slow, and it's also not going to sound good, but it can do things like clear out the feedback paths of a reverb at the Python shell for interactive experimentation purposes.

Events

Synthizer supports receiving events. Currently, this is limited to knowing when buffer/streaming generators have looped and/or finished. Note that the use case of destroying objects only after they have stopped playing is better handled with lingering.

The API for this is as follows:

struct syz_Event {
    int type;
    syz_Handle source;
    syz_Handle context;
};

SYZ_CAPI syz_ErrorCode syz_contextEnableEvents(syz_Handle context);
SYZ_CAPI syz_ErrorCode syz_contextGetNextEvent(struct syz_Event *out, syz_Handle context, unsigned long long flags);

SYZ_CAPI void syz_eventDeinit(struct syz_Event *event);

To begin receiving events, an application should call syz_contextEnableEvents. This cannot be undone. After a call to syz_contextEnableEvents, events will begin to fill the event queue and must be retrieved with syz_contextGetNextEvent. Failure to call syz_contextGetNextEvent will slowly fill the event queue, so applications should be sure to incorporate this into their main UI/game update loops. After the application is done with an event struct, it should then call syz_eventDeinit on the event structure; failure to do so leaks handles.

The flags argument of syz_contextGetNextEvent is reserved and must be 0.

Events have a type, context, and source. The type is the kind of event. The context is the context from which the event was extracted. The source is the object the event is about; these are not sources as in the Synthizer object, and are most commonly generators.

Event type constants are declared in synthizer_constants.h with all other constants. Currently Synthizer only offers SYZ_EVENT_TYPE_FINISHED and SYZ_EVENT_TYPE_LOOPED, which do exactly what they sound like: finished fires when a generator which isn't configured to loop finishes, and looped fires every time a looping generator resets.

A special event type constant, SYZ_EVENT_TYPE_INVALID, is returned by syz_contextGetNextEvent when there are no events in the queue. To write a proper event loop (excluding error handling):

struct syz_Event evt;

while(1) {
    syz_contextGetNextEvent(&evt, context, 0);
    if (evt.type == SYZ_EVENT_TYPE_INVALID) {
        break;
    }
    // handle it, then release the handles the event holds
    syz_eventDeinit(&evt);
}

Synthizer will never return an event if any handle to which the event refers is invalid at the time the event was extracted from the queue. This allows applications to delete handles without having to concern themselves with whether or not an event refers to a deleted handle.

In order to also offer thread safety, Synthizer event handling temporarily increments the reference counts of any handles to which an event refers, then decrements them when syz_eventDeinit is called. This allows applications to delete objects on threads other than the thread handling the event, at the cost of extending the lifetimes of these handles slightly. An application may call syz_handleIncRef if it wishes to keep one of these handles around.

The Automation API

NOTE: While the intent is that the following is stable, this is provisional and subject to change until things have had a chance to settle.

Introduction

Synthizer implements a WebAudio style automation API which can allow for the automation of double, d3, and d6 properties. This functions through a command queue of automation events, each with a time associated.

In order for applications to know when automation is getting low or for other synchronization events, it is also possible to enqueue a command which sends an event to your application.

The C API for this functionality is very large. Readers should refer to synthizer.h for function and struct prototypes.

A Note on Accuracy

This is the first cut of the automation API. Consequently there's a big limitation: it's not possible to perfectly synchronize automation with the beginning of a generator/source's playback. That is, as things stand, attempts to build instruments and keyboards are bound to be disappointing. Future versions of Synthizer will likely improve this case, but the primary use for the automation API is fadeouts and crossfades, both of which can be done as things stand today. Rather than fix this now, this API is being made available to gain experience in order to determine how we want to address this deficiency in the future.

Improvement for accuracy and the ability to do instrument-level automation is being tracked here.

Overview of usage

The flow of the automation API looks like this:

  • Get the current time for an object by reading one of the following properties, either from the object itself or the context:
    • SYZ_P_CURRENT_TIME plus a small amount of latency, in order to let the application determine how long it needs to get commands enqueued; or
    • SYZ_P_SUGGESTED_AUTOMATION_TIME for Synthizer's best suggestion of when to enqueue new commands.
  • Build an array of struct syz_AutomationCommand:
    • Set the type of command to a SYZ_AUTOMATION_COMMAND_XXX constant.
    • Set the time to the time of the command.
    • Set the target to the object to be automated.
    • Set the appropriate command payload in the params union.
    • Leave flags at 0, for now.
  • Get an automation batch with syz_createAutomationBatch.
  • Add commands to it with syz_automationBatchAddCommands.
  • Execute the batch with syz_automationBatchExecute.
  • Destroy the batch with syz_handleDecRef.

The above is involved primarily because this is C: bindings can and should offer easier, builder-like interfaces that aren't nearly so difficult.
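
As a sketch, fading an existing generator's gain to 0 over 2 seconds might look like the following (excluding error checking; the member names of the params union are assumptions here, so consult synthizer.h):

double now;
struct syz_AutomationPoint point = {0};
struct syz_AutomationCommand cmd = {0};
syz_Handle batch;

/* Schedule relative to Synthizer's suggested automation time. */
syz_getD(&now, generator, SYZ_P_SUGGESTED_AUTOMATION_TIME);

point.interpolation_type = SYZ_INTERPOLATION_TYPE_LINEAR;
point.values[0] = 0.0; /* the target gain */

cmd.type = SYZ_AUTOMATION_COMMAND_APPEND_PROPERTY;
cmd.target = generator;
cmd.time = now + 2.0;
cmd.params.append_to_property.property = SYZ_P_GAIN;
cmd.params.append_to_property.point = point;

syz_createAutomationBatch(&batch, context, NULL, NULL);
syz_automationBatchAddCommands(batch, 1, &cmd);
syz_automationBatchExecute(batch);
syz_handleDecRef(batch);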

What's going on?

Automation allows users to build a timeline of events and property values, relative to the context's current time. These events can then be enqueued for execution as time advances, and will happen within one "block" of the current time, accurate to about 5 ms. Events can be scheduled in the past, and will contribute to any fading that's going on.

The best way to view this is as an approximation of a function. That is, if an event is enqueued at time 0 to set property x to 5, and then at time 10 to set it to 10, then at time 5 the property might be at 5 if the interpolation type is set to linear. The key insight is that it doesn't matter when the time 0 event was scheduled: if it was added at time 4, the value is still 5.

If events are scheduled at a time which has already passed, they fire immediately. This is in line with their intended use: knowing when automation is nearing its end or has passed specific times.

Time

Time in Synthizer is always advancing unless the context is paused. When the context is created, time starts at 0 and proceeds forward. It is possible to read the time using SYZ_P_CURRENT_TIME, which is available on all the objects you'd expect: generators, sources, effects, etc. All of these currently just defer to the context, but by building against these object-local times, applications will be prepared for a future in which that may no longer be the case.

In most cases, though, automating relative to the current time will cause commands to arrive late, and execute in the past. To deal with this, Synthizer offers SYZ_P_SUGGESTED_AUTOMATION_TIME, the time that Synthizer suggests you might want to enqueue commands relative to for them to arrive on time. This also advances forward, but is in the future relative to SYZ_P_CURRENT_TIME. Currently, this is a simple addition, but the goal is to make the algorithm smarter in the future.

SYZ_P_SUGGESTED_AUTOMATION_TIME is also quite latent. It is perfectly acceptable for applications to query the current time, then add their own smaller offset.

There is one typical misuse of APIs such as this: re-reading the current time while building a continuing timeline. Since the current time is always advancing, this will cause your timeline to be "jagged". That is:

current_time = # get current time
enqueue(current_time + 5)
current_time = # get current time
enqueue(current_time + 10)

Will not produce events 5 seconds apart, but instead 5 seconds and a bit. Correct usage is:

current_time = #get current time
enqueue_command(current_time + 5)
enqueue_command(current_time + 10)

A good rule of thumb is that time should be read when building an automation batch, then everything for that batch done relative to the time that was read at the beginning.

Finally, as implied by the above, automation doesn't respect pausing unless it's on the context: pausing a generator with automation attached will still have automation advance even while it is paused. This is currently unavoidable due to internal limitations which require significant work to lift.

Automation Batches

In order to enqueue commands, it is necessary to put them somewhere. We also want an API which allows for enqueueing more than one command at a time for efficiency. In order to do this, Synthizer introduces the concept of the automation batch, created with syz_createAutomationBatch.

Automation batches are one-time-use command queues, to which commands can be appended with syz_automationBatchAddCommands. Uniquely in Synthizer, they are not threadsafe: you must provide synchronization yourself. Once all the commands you want to add are added, syz_automationBatchExecute executes the batch. Afterwards, due to internal limitations, the batch may not be reused.

Note that the intent is that batches can automate multiple objects at once. Using a different object with every command is perfectly fine.

Commands

Automation offers the following command types:

  • SYZ_AUTOMATION_COMMAND_APPEND_PROPERTY: Add a value to the timeline for a property, e.g. "use linear interpolation to get to 20.0 by 10 seconds".
  • SYZ_AUTOMATION_COMMAND_SEND_USER_EVENT: Add an event to the timeline, to be sent approximately when the timeline crosses the specified time.
  • SYZ_AUTOMATION_COMMAND_CLEAR_PROPERTY: Clear all events related to one specific property on one specific object.
  • SYZ_AUTOMATION_COMMAND_CLEAR_EVENTS: Clear all scheduled events for an object.
  • SYZ_AUTOMATION_COMMAND_CLEAR_ALL_PROPERTIES: Clear automation for all properties for a specific object.

Commands which clear data don't respect time, and take effect immediately. Typically, they're put at the beginning of the automation batch in order to clean up before adding new automation.

Parameters to commands are specified as the enum variants of syz_AutomationCommand's params union and are mostly self-explanatory. Property automation is discussed below.

Automating Property values

Using SYZ_AUTOMATION_COMMAND_APPEND_PROPERTY, it is possible to append values to the timeline for a property on a specific object. This is done via the syz_AutomationPoint struct:

struct syz_AutomationPoint {
  unsigned int interpolation_type;
  double values[6];
  unsigned long long flags;
};

The fields are as follows:

  • The interpolation type specifies how to get from the last value (if any) to the current value in the point:
    • SYZ_INTERPOLATION_TYPE_NONE: do nothing until the point is crossed, then jump immediately.
    • SYZ_INTERPOLATION_TYPE_LINEAR: use linear interpolation from the last point.
  • The values array specifies the values. The number of indices used depends on the property: double properties use only the first index, d3 uses the first 3, and d6 uses all 6.
  • Flags is reserved and must be 0.

The timeline for a specific property represents the approximation of a continuous function, not a list of commands. As alluded to above, points whose time has already passed still take effect by changing the shape of that function. This is done so that audio "animations" can execute accurately even if the application experiences a delay.

Property automation interacts with normal property sets by desugaring a normal set to a SYZ_INTERPOLATION_TYPE_NONE point scheduled "as soon as possible". Users are encouraged to use either automation or property setting rather than both at once, but mixing them still has consistent behavior.
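As a concrete sketch, the following fills in two points which fade a generator's SYZ_P_GAIN to silence over 2 seconds. The union member and field names here follow the shapes described above; consult synthizer.h for the exact layout:

struct syz_AutomationCommand cmds[2] = {0};
double t;
syz_getD(&t, ctx, SYZ_P_SUGGESTED_AUTOMATION_TIME);
/* Point 1: jump to full gain at the base time. */
cmds[0].type = SYZ_AUTOMATION_COMMAND_APPEND_PROPERTY;
cmds[0].target = generator;
cmds[0].time = t;
cmds[0].params.append_to_property.property = SYZ_P_GAIN;
cmds[0].params.append_to_property.point.interpolation_type = SYZ_INTERPOLATION_TYPE_NONE;
cmds[0].params.append_to_property.point.values[0] = 1.0;
/* Point 2: linearly interpolate to silence 2 seconds later. */
cmds[1] = cmds[0];
cmds[1].time = t + 2.0;
cmds[1].params.append_to_property.point.interpolation_type = SYZ_INTERPOLATION_TYPE_LINEAR;
cmds[1].params.append_to_property.point.values[0] = 0.0;
syz_automationBatchAddCommands(batch, 2, cmds);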

Events

Events are much simpler: you specify an unsigned long long parameter, and get it back when the event triggers. unsigned long long is used because it can losslessly store a pointer on all platforms we support, without having to worry about the size changing on 32-bit platforms.
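For example, to round-trip an object pointer (a sketch; cmd is a syz_AutomationCommand, and the names my_object, game_object, and event_param are this example's own):

#include <stdint.h>

cmd.type = SYZ_AUTOMATION_COMMAND_SEND_USER_EVENT;
cmd.params.send_user_event.param = (unsigned long long)(uintptr_t)my_object;
/* When the event arrives, cast the parameter back: */
struct game_object *obj = (struct game_object *)(uintptr_t)event_param;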

Examples

Synthizer's repository contains some examples demonstrating common automation uses in the examples directory:

  • simple_automation.c: build a simple envelope around a waveform.
  • automation_circle.c: automate the position of a source to move it in circles.
  • play_notes.c: play some notes every time the user presses enter (this works because the waveform is constant, but isn't ideal as explained above).

Stability and Versioning

Synthizer uses pre-1.0 semantic versioning. This means:

  • Major is always 0.
  • Minor is incremented for incompatible API changes.
  • Patch is incremented for new features and/or bug fixes.

Synthizer is intended to be production ready software, but has not seen wide usage. It's somewhere between beta and 1.0: not as many features as you might want, but also not crashing at the drop of a hat. If you find bugs, please report them against the official repository.

API breakage is still expected. This manual attempts to document where API breakage may occur. These are referred to as provisional features.

Context

Constructors

syz_createContext

SYZ_CAPI syz_ErrorCode syz_createContext(syz_Handle *out, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);

Creates a context configured to play through the default output device.

Properties

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_GAIN | double | 1.0 | value >= 0.0 | The gain of the context.
SYZ_P_POSITION | double3 | (0, 0, 0) | any | The position of the listener.
SYZ_P_ORIENTATION | double6 | (0, 1, 0, 0, 0, 1) | two packed unit vectors | The orientation of the listener as (atx, aty, atz, upx, upy, upz).
SYZ_P_DEFAULT_DISTANCE_MODEL | int | SYZ_DISTANCE_MODEL_LINEAR | any SYZ_DISTANCE_MODEL | The default distance model for new sources.
SYZ_P_DEFAULT_DISTANCE_REF | double | 1.0 | value >= 0.0 | The default reference distance for new sources.
SYZ_P_DEFAULT_DISTANCE_MAX | double | 50.0 | value >= 0.0 | The default max distance for new sources.
SYZ_P_DEFAULT_ROLLOFF | double | 1.0 | value >= 0.0 | The default rolloff for new sources.
SYZ_P_DEFAULT_CLOSENESS_BOOST | double | 0.0 | any finite double | The default closeness boost for new sources, in dB.
SYZ_P_DEFAULT_CLOSENESS_BOOST_DISTANCE | double | 0.0 | value >= 0.0 | The default closeness boost distance for new sources.
SYZ_P_DEFAULT_PANNER_STRATEGY | int | SYZ_PANNER_STRATEGY_STEREO | any SYZ_PANNER_STRATEGY | The default panner strategy for new sources.

Functions

syz_contextGetNextEvent

See events.

Linger Behavior

None.

Remarks

The context is the main entrypoint to Synthizer, responsible for the following:

  • Control and manipulation of the audio device.
  • Driving the audio threads.
  • Owning all objects that play together.
  • Representing the listener in 3D space.

All objects which are associated with a context take that context as a parameter in their constructors. Objects associated with different contexts must never interact; for efficiency, this is not validated, and the behavior of mixing them is undefined.

All objects associated with a context become useless once the context is destroyed. Calls to them will still work, but they can't be reassociated with a different context and no audible output will result.

Most programs create one context and destroy it at shutdown.

For the time being, all contexts output stereo audio, and it is not possible to specify the output device. These restrictions will be lifted in future.

For information on the meaning of the distance model properties, see 3D Audio.

Buffer

Constructors

syz_createBufferFromFile

SYZ_CAPI syz_ErrorCode syz_createBufferFromFile(syz_Handle *out, const char *path, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);

Create a buffer from a file, using a UTF-8 encoded path.

syz_createBufferFromStreamParams

SYZ_CAPI syz_ErrorCode syz_createBufferFromStreamParams(syz_Handle *out, const char *protocol, const char *path, void *param, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);

Create a buffer from stream parameters. See decoding for information on streams.

This call will decode the stream in the calling thread, returning errors as necessary. Synthizer will eventually offer a BufferCache which supports background decoding and caching, but for the moment the responsibility of background decoding is placed on the calling program.

syz_createBufferFromEncodedData

SYZ_CAPI syz_ErrorCode syz_createBufferFromEncodedData(syz_Handle *out, unsigned long long data_len, const char *data, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);

Create a buffer from encoded audio data in RAM, for example an Ogg file read from disk. This also works with mmapped pointers. As with all other decoding, Synthizer autodetects the format from the data. The pointer need only live for the duration of the call.

syz_createBufferFromFloatArray

SYZ_CAPI syz_ErrorCode syz_createBufferFromFloatArray(syz_Handle *out, unsigned int sr, unsigned int channels, unsigned long long frames, const float *data, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);

Create a buffer from an array of float data generated by the application. The array must contain channels * frames elements.
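For example, a sketch which synthesizes one second of a mono 440 Hz sine wave and wraps it in a buffer (error checking elided):

#include <math.h>

#define SR 44100
float samples[SR]; /* 1 channel * 44100 frames */
for (unsigned int i = 0; i < SR; i++) {
  samples[i] = (float)sin(2.0 * 3.14159265358979 * 440.0 * (double)i / SR);
}
syz_Handle buffer;
syz_createBufferFromFloatArray(&buffer, SR, 1, SR, samples, NULL, NULL);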

syz_createBufferFromStreamHandle

SYZ_CAPI syz_ErrorCode syz_createBufferFromStreamHandle(syz_Handle *out, syz_Handle stream, void *userdata, syz_UserdataFreeCallback *userdata_free_callback);

Create a buffer from a stream handle. Usually used with custom streams. Decodes in the calling thread. The lifetime of the stream's underlying asset need only be as long as this call.

syz_bufferGetSizeInBytes

SYZ_CAPI syz_ErrorCode syz_bufferGetSizeInBytes(unsigned long long *size, syz_Handle buffer);

Get the approximate size of this buffer's in-memory representation in bytes.

Properties

None.

Functions

Getters

SYZ_CAPI syz_ErrorCode syz_bufferGetChannels(unsigned int *out, syz_Handle buffer);
SYZ_CAPI syz_ErrorCode syz_bufferGetLengthInSamples(unsigned int *out, syz_Handle buffer);
SYZ_CAPI syz_ErrorCode syz_bufferGetLengthInSeconds(double *out, syz_Handle buffer);

The self-explanatory getters. These aren't properties because they can't be written and they shouldn't participate in the property infrastructure.

Remarks

Buffers hold audio data, as a collection of contiguous chunks. Data is resampled to the Synthizer samplerate and converted to 16-bit PCM.

Buffers are one of the few Synthizer objects that don't require a context. They may be used freely with any object requiring a buffer, from any thread. In order to facilitate this, buffers are immutable after creation.

The approximate memory usage of a buffer in bytes is 2 * channels * duration_in_seconds * 44100. Loading large assets into buffers is not recommended; for things such as music tracks, use StreamingGenerator. Note that on 32-bit architectures, some operating systems only allow a 2 gigabyte address space. Synthizer avoids allocating buffers as contiguous arrays in part to allow efficient use of 32-bit address spaces, but this only goes so far: on a 32-bit architecture, expect to run out of memory from Synthizer's perspective well before 2 gigabytes of buffers have been decoded simultaneously, due to the inability to find consecutive free pages.
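For example, a 3-minute stereo music track costs roughly 2 * 2 * 180 * 44100 bytes, or about 31.8 MB; that is exactly the sort of asset which should be a StreamingGenerator instead.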

Operations Common to All Sources

Constructors

None.

Properties

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_GAIN | double | 1.0 | any double > 0 | An additional gain factor applied to this source.
SYZ_P_FILTER | biquad | identity | any | A filter which applies to all audio leaving the source, before SYZ_P_FILTER_DIRECT and SYZ_P_FILTER_EFFECTS.
SYZ_P_FILTER_DIRECT | biquad | identity | any | A filter which applies after SYZ_P_FILTER but not to audio traveling to effect sends.
SYZ_P_FILTER_EFFECTS | biquad | identity | any | A filter which runs after SYZ_P_FILTER but only applies to audio traveling through effect sends.

Functions

syz_sourceAddGenerator, syz_sourceRemoveGenerator

SYZ_CAPI syz_ErrorCode syz_sourceAddGenerator(syz_Handle source, syz_Handle generator);
SYZ_CAPI syz_ErrorCode syz_sourceRemoveGenerator(syz_Handle source, syz_Handle generator);

Add/remove a generator from a source. Each generator may be added once and duplicate add calls will have no effect. Each generator should only be used with one source at a time.

Remarks

Sources represent audio output. They combine all generators connected to them, apply any effects if necessary, and feed the context. Subclasses of Source add panning and other features.

All sources offer filters via SYZ_P_FILTER, SYZ_P_FILTER_DIRECT and SYZ_P_FILTER_EFFECTS. First, SYZ_P_FILTER is applied, then the audio is split into two paths: the portion heading directly to the speakers gets SYZ_P_FILTER_DIRECT, and the portion heading to the effect sends gets SYZ_P_FILTER_EFFECTS. This can be used to simulate occlusion and perform other per-source effect customization.
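For example, to simulate a source muffled behind a door (a sketch assuming the filter-design helpers syz_biquadDesignLowpass and syz_setBiquad from the filters chapter; check synthizer.h for the exact signatures):

struct syz_BiquadConfig filter;
/* A 1000 Hz lowpass with a Butterworth Q; all audio leaving the source is muffled. */
syz_biquadDesignLowpass(&filter, 1000.0, 0.7071);
syz_setBiquad(source, SYZ_P_FILTER, &filter);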

DirectSource

Constructors

syz_createDirectSource

SYZ_CAPI syz_ErrorCode syz_createDirectSource(syz_Handle *out, syz_Handle context, void *config, void *userdata,
                                              syz_UserdataFreeCallback *userdata_free_callback);

Creates a direct source.

Properties

Inherited from Source only.

Linger Behavior

Lingers until the timeout or until all generators have been destroyed.

Remarks

A direct source is for music and other audio assets that shouldn't participate in panning, and should instead be routed directly to the speakers.

Audio is converted to the Context's channel count and passed directly through.

AngularPannedSource and ScalarPannedSource

Constructors

syz_createAngularPannedSource

SYZ_CAPI syz_ErrorCode syz_createAngularPannedSource(syz_Handle *out, syz_Handle context, int panner_strategy,
                                                     double azimuth, double elevation, void *config, void *userdata,
                                                     syz_UserdataFreeCallback *userdata_free_callback);

Creates an angular panned source, which is controlled through azimuth and elevation.

syz_createScalarPannedSource

SYZ_CAPI syz_ErrorCode syz_createScalarPannedSource(syz_Handle *out, syz_Handle context, int panner_strategy,
                                                    double panning_scalar, void *config, void *userdata,
                                                    syz_UserdataFreeCallback *userdata_free_callback);

Creates a scalar panned source, controlled via the panning scalar.

Properties

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_AZIMUTH | double | from constructor | 0.0 to 360.0 | The azimuth of the panner. See remarks.
SYZ_P_ELEVATION | double | from constructor | -90.0 to 90.0 | The elevation of the panner. See remarks.
SYZ_P_PANNING_SCALAR | double | from constructor | -1.0 to 1.0 | The panning scalar. See remarks.

Linger Behavior

Lingers until all generators have been destroyed.

Remarks

The panned sources give direct control over a panner, which is controlled either via azimuth/elevation in degrees or via a panning scalar. Which properties you use depends on which type of source you create (angular for azimuth/elevation, scalar for the panning scalar).

If using azimuth/elevation, 0.0 azimuth is forward and positive angles are clockwise. Elevation ranges from -90 (down) to 90 (up).

Some applications want to control panners through a panning scalar instead, e.g. for UI purposes. If using panning scalars, -1.0 is full left and 1.0 is full right.
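For example, a UI sound panned halfway to the right (a sketch; error checking elided):

syz_Handle source;
syz_createScalarPannedSource(&source, ctx, SYZ_PANNER_STRATEGY_STEREO, 0.5, NULL, NULL, NULL);
/* Later, as the UI element moves: */
syz_setD(source, SYZ_P_PANNING_SCALAR, -0.25);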

For information on panning, see 3D Audio.

Source3D

Constructors

syz_createSource3D

SYZ_CAPI syz_ErrorCode syz_createSource3D(syz_Handle *out, syz_Handle context, int panner_strategy, double x, double y,
                                          double z, void *config, void *userdata,
                                          syz_UserdataFreeCallback *userdata_free_callback);

Creates a Source3D positioned at the specified coordinates and with no associated generators.

Properties

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_POSITION | double3 | from constructor | any | The position of the source.
SYZ_P_ORIENTATION | double6 | (0, 1, 0, 0, 0, 1) | two packed unit vectors | The orientation of the source as (atx, aty, atz, upx, upy, upz). Currently unused.
SYZ_P_DISTANCE_MODEL | int | from Context | any SYZ_DISTANCE_MODEL | The distance model for this source.
SYZ_P_DISTANCE_REF | double | from Context | value >= 0.0 | The reference distance.
SYZ_P_DISTANCE_MAX | double | from Context | value >= 0.0 | The max distance for this source.
SYZ_P_ROLLOFF | double | from Context | value >= 0.0 | The rolloff for this source.
SYZ_P_CLOSENESS_BOOST | double | from Context | any finite double | The closeness boost for this source, in dB.
SYZ_P_CLOSENESS_BOOST_DISTANCE | double | from Context | value >= 0.0 | The closeness boost distance for this source.

Linger Behavior

Lingers until all generators are destroyed.

Remarks

A Source3D represents an entity in 3D space. For explanations of the above properties, see 3D Audio.

When created, Source3D reads all of its defaults from the Context's corresponding properties. Changes to the Context versions don't affect already created sources. A typical use case is to configure the Context to the defaults of the game, and then create sources.
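For example (a sketch; error checking elided):

/* Configure defaults once, at startup. */
syz_setI(ctx, SYZ_P_DEFAULT_DISTANCE_MODEL, SYZ_DISTANCE_MODEL_LINEAR);
syz_setD(ctx, SYZ_P_DEFAULT_DISTANCE_MAX, 100.0);
/* Sources created afterwards inherit the above. */
syz_Handle source;
syz_createSource3D(&source, ctx, SYZ_PANNER_STRATEGY_STEREO, 5.0, 0.0, 0.0, NULL, NULL, NULL);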

Operations Common to All Generators

Generators generate audio, and are how Synthizer knows what to play through sources.

Properties

All generators support the following properties:

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_GAIN | double | 1.0 | value >= 0.0 | The gain of the generator.
SYZ_P_PITCH_BEND | double | 1.0 | 0.0 <= value <= 2.0 | Pitch bend of the generator as a multiplier (2.0 is +1 octave, 0.5 is -1 octave, etc.).

Remarks

Not all generators support SYZ_P_PITCH_BEND because it doesn't necessarily make sense for them to do so, but it can always be set.

BufferGenerator

Constructors

syz_createBufferGenerator

SYZ_CAPI syz_ErrorCode syz_createBufferGenerator(syz_Handle *out, syz_Handle context, void *config, void *userdata,
                                                 syz_UserdataFreeCallback *userdata_free_callback);

Creates a BufferGenerator. The buffer is set to NULL and the resulting generator will play silence until one is associated.

Properties

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_BUFFER | object | 0 | any Buffer handle | The buffer to play.
SYZ_P_PLAYBACK_POSITION | double | 0.0 | value >= 0.0 | The position in the buffer.
SYZ_P_LOOPING | int | 0 | 0 or 1 | Whether playback loops at the end of the buffer.

Linger behavior

Disables looping and plays until the buffer ends.

Remarks

BufferGenerators play Buffers. This is the most efficient way to play audio.

SYZ_P_PLAYBACK_POSITION is reset if SYZ_P_BUFFER is modified.

SYZ_P_PLAYBACK_POSITION can be set past the end of the buffer. If SYZ_P_LOOPING = 0, the generator will play silence. Otherwise, the position will immediately loop to the beginning.

More than one BufferGenerator can use the same underlying Buffer.

If the buffer being used by this generator is destroyed, this generator immediately begins playing silence until another buffer is associated.
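A typical flow looks like this sketch (error checking elided; "step.wav" is a placeholder path and source is any previously created source):

syz_Handle buffer, generator;
syz_createBufferFromFile(&buffer, "step.wav", NULL, NULL);
syz_createBufferGenerator(&generator, ctx, NULL, NULL, NULL);
/* Associate the buffer and loop it. */
syz_setO(generator, SYZ_P_BUFFER, buffer);
syz_setI(generator, SYZ_P_LOOPING, 1);
syz_sourceAddGenerator(source, generator);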

FastSineBankGenerator

Generate basic waveforms which can be expressed as the sum of sine waves (e.g. square, triangle).

Constructors

syz_createFastSineBankGenerator

struct syz_SineBankWave {
  double frequency_mul;
  double phase;
  double gain;
};

struct syz_SineBankConfig {
  const struct syz_SineBankWave *waves;
  unsigned long long wave_count;
  double initial_frequency;
};

SYZ_CAPI void syz_initSineBankConfig(struct syz_SineBankConfig *cfg);

SYZ_CAPI syz_ErrorCode syz_createFastSineBankGenerator(syz_Handle *out, syz_Handle context,
                                                       struct syz_SineBankConfig *bank_config, void *config,
                                                       void *userdata,
                                                       syz_UserdataFreeCallback *userdata_free_callback);

Create a sine bank which evaluates itself by summing sine waves at specific multiples of a fundamental frequency. See remarks for specifics on what this means and what the values in the configuration structs should be.

Most applications will want to use the helpers which configure the bank with specific well-known waveforms.

You own the memory pointed to by syz_SineBankConfig, and it may be freed immediately after the constructor call. Pointing it at values on the stack is fine.
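For example, a sketch which builds a two-wave bank (the fundamental plus a quieter first harmonic) entirely on the stack:

struct syz_SineBankWave waves[2] = {
    {1.0, 0.0, 1.0},  /* fundamental: frequency_mul, phase, gain */
    {2.0, 0.0, 0.25}, /* first harmonic at a quarter of the gain */
};
struct syz_SineBankConfig cfg;
syz_initSineBankConfig(&cfg);
cfg.waves = waves;
cfg.wave_count = 2;
cfg.initial_frequency = 440.0;
syz_Handle generator;
syz_createFastSineBankGenerator(&generator, ctx, &cfg, NULL, NULL, NULL);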

Specific waveform helpers

SYZ_CAPI syz_ErrorCode syz_createFastSineBankGeneratorSine(syz_Handle *out, syz_Handle context,
                                                           double initial_frequency, void *config, void *userdata,
                                                           syz_UserdataFreeCallback *userdata_free_callback);
SYZ_CAPI syz_ErrorCode syz_createFastSineBankGeneratorTriangle(syz_Handle *out, syz_Handle context,
                                                               double initial_frequency, unsigned int partials,
                                                               void *config, void *userdata,
                                                               syz_UserdataFreeCallback *userdata_free_callback);
SYZ_CAPI syz_ErrorCode syz_createFastSineBankGeneratorSquare(syz_Handle *out, syz_Handle context,
                                                             double initial_frequency, unsigned int partials,
                                                             void *config, void *userdata,
                                                             syz_UserdataFreeCallback *userdata_free_callback);
SYZ_CAPI syz_ErrorCode syz_createFastSineBankGeneratorSaw(syz_Handle *out, syz_Handle context, double initial_frequency,
                                                          unsigned int partials, void *config, void *userdata,
                                                          syz_UserdataFreeCallback *userdata_free_callback);

Create waveforms of specific types, e.g. what you'd get from a digital synthesizer. Most applications will wish to use these functions. See remarks for additional notes on quality.
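For instance, a 220 Hz square wave with 10 partials (a sketch; error checking elided):

syz_Handle square;
syz_createFastSineBankGeneratorSquare(&square, ctx, 220.0, 10, NULL, NULL, NULL);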

Properties

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_FREQUENCY | double | set by constructor | any positive value | The frequency of the waveform.

Linger behavior

Fades out over a few ms.

Remarks

This implements a fast sine wave generation algorithm which costs on the order of single-digit clock cycles per sample, at the cost of slight inaccuracy. The intended use is to generate chiptune-quality waveforms. For those not familiar, most waveforms can be constructed as sums of sine waves at specific frequencies, so this functions as a general-purpose wave generator. Note that attempts to use it to evaluate large Fourier series will not end well: the slight inaccuracy combined with the O(samples * waves) cost means that wave counts over a few hundred rapidly become impractically slow and low quality. In the best case (the right processor, a compiler that likes Synthizer, etc.), the theoretical execution time per sample for 32 waves is around 5-10 clock cycles, so it can be pushed pretty far.

Each wave in the bank is specified by 3 parameters:

  • frequency_mul: the multiplier on the generator's base frequency for this wave. For example, frequency_mul = 1.0 is a sine wave which plays at whatever frequency the generator is set to, 2.0 is the first harmonic, and so on. Fractional values are permitted.
  • phase: the phase of the sinusoid in the range 0 to 1. This unit is slightly odd because languages differ in their approximations of pi; to get what Synthizer wants, take your phase in radians and divide by the approximation of pi you're using.
  • gain: self-explanatory. Negative gains are valid, to allow converting from mathematical formulas which use them.

For a more convenient interface, use the helper functions for the various waveform types. These take the number of partials to generate, which is not the same as the number of harmonics, because not all waveforms contain every harmonic. Simply playing with this value until it sounds good is the easiest way to deal with it; for most applications, no more than 30 should be required. The square wave makes a good concrete example of how partials differ from harmonics, because it contains only odd harmonics. So:

partials | harmonics included
-------- | ------------------
1 | 1.0 (fundamental), 3.0
2 | 1.0 (fundamental), 3.0, 5.0
3 | 1.0, 3.0, 5.0, 7.0

The reason you might wish to use fewer partials is aliasing: extremely high partial counts will alias if they go above Nyquist, currently 22050 Hz. If you are playing high frequencies, lowering the partial count may be called for. By contrast, intentionally forcing aliasing can produce a more "chiptune" sound. The CPU usage of additional partials should be unnoticeable for all practical values; if this turns out not to be the case, you are encouraged to open an issue.

This generator does not allow introducing a DC term. If you need one for some reason, open an issue instead of trying to hack it in with a sine wave at 0 Hz and the appropriate phase and gain.

NoiseGenerator

Inherits from Generator.

Constructors

syz_createNoiseGenerator

SYZ_CAPI syz_ErrorCode syz_createNoiseGenerator(syz_Handle *out, syz_Handle context, unsigned int channels,
                                                void *config, void *userdata,
                                                syz_UserdataFreeCallback *userdata_free_callback);

Creates a NoiseGenerator configured for uniform noise with the specified number of output channels. The number of output channels cannot be configured at runtime. Each channel produces decorrelated noise.

Properties

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_NOISE_TYPE | int | SYZ_NOISE_TYPE_UNIFORM | any SYZ_NOISE_TYPE | The type of noise to generate. See remarks.

Linger Behavior

Fades out over a few milliseconds.

Remarks

NoiseGenerators generate noise. This is most useful when filtered via the source, and can produce plausible, if low-quality, wind and whistling effects.

Synthizer allows setting the algorithm used to generate noise to one of the following options. Note that these are more precisely named than white/pink/brown; the sections below document the equivalent in the more standard nomenclature.

SYZ_NOISE_TYPE_UNIFORM

A uniform noise source. From an audio perspective this is white noise, but is sampled from a uniform rather than Gaussian distribution for efficiency.

SYZ_NOISE_TYPE_VM

This is pink noise generated with the Voss-McCartney algorithm, which consists of a number of summed uniform random number generators which are run at different rates. Synthizer adds an additional random number generator at the top of the hierarchy in order to improve the color of the noise in the high frequencies.

SYZ_NOISE_TYPE_FILTERED_BROWN

This is brown noise generated with a -6 dB per octave filter.

StreamingGenerator

Constructors

syz_createStreamingGeneratorFromFile

SYZ_CAPI syz_ErrorCode syz_createStreamingGeneratorFromFile(syz_Handle *out, syz_Handle context, const char *path,
                                                            void *config, void *userdata,
                                                            syz_UserdataFreeCallback *userdata_free_callback);

Create a StreamingGenerator from a UTF-8 encoded path.

syz_createStreamingGeneratorFromStreamParams

SYZ_CAPI syz_ErrorCode syz_createStreamingGeneratorFromStreamParams(syz_Handle *out, syz_Handle context,
                                                                    const char *protocol, const char *path, void *param,
                                                                    void *config, void *userdata,
                                                                    syz_UserdataFreeCallback *userdata_free_callback);

Create a StreamingGenerator from the standard stream parameters.

syz_createStreamingGeneratorFromStreamHandle

SYZ_CAPI syz_ErrorCode syz_createStreamingGeneratorFromStreamHandle(syz_Handle *out, syz_Handle context,
                                                                    syz_Handle stream, void *config, void *userdata,
                                                                    syz_UserdataFreeCallback *userdata_free_callback);

Create a StreamingGenerator from a stream handle.

Properties

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_PLAYBACK_POSITION | double | 0.0 | value >= 0.0 | The position of the stream.
SYZ_P_LOOPING | int | 0 | 0 or 1 | Whether playback loops.

Linger Behavior

Disables looping and continues until the stream ends.

Remarks

StreamingGenerator plays streams, decoding and reading on demand. The typical use case is for music playback.

Due to the expense of streaming from disk and other I/O sources, having more than a few StreamingGenerators going will cause a decrease in audio quality on many systems, typically manifesting as drop-outs and crackling. StreamingGenerator creates one background thread per instance and does all decoding and I/O in that thread.

At startup, StreamingGenerator's background thread eagerly decodes a relatively large amount of data in order to build up a buffer which prevents underruns. Thereafter, it picks up property changes each time the background thread wakes up to add more data to the buffer. This means that most operations are high latency, currently on the order of 100 to 200 ms. The least latent operation is the initial start-up, which begins playing as soon as enough data is decoded; how long that takes depends on the format and I/O characteristics of the stream, as well as the user's machine and the current load of the system.

Operations Common to All Effects

Properties

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_GAIN | double | usually 1.0 | value >= 0.0 | The overall gain of the effect.
SYZ_P_FILTER_INPUT | biquad | usually identity; if not, documented with the effect | any | A filter which applies to the input of this effect. Runs after filters on effect sends.

Functions

syz_effectReset

SYZ_CAPI syz_ErrorCode syz_effectReset(syz_Handle effect);

Clears the internal state of the effect. Intended for design/development purposes. This function may produce clicks and other artifacts and is slow.

Remarks

For more information on how effects work, see the dedicated section.

GlobalEcho

Constructors

syz_createGlobalEcho

SYZ_CAPI syz_ErrorCode syz_createGlobalEcho(syz_Handle *out, syz_Handle context, void *config, void *userdata,
                                            syz_UserdataFreeCallback *userdata_free_callback);

Creates an echo effect.

Functions

syz_globalEchoSetTaps

struct syz_EchoTapConfig {
    double delay;
    double gain_l;
    double gain_r;
};

SYZ_CAPI syz_ErrorCode syz_globalEchoSetTaps(syz_Handle handle, unsigned int n_taps, struct syz_EchoTapConfig *taps);

Configure the taps for this echo. Currently, delay must be no greater than 5 seconds. To clear the taps, call this function with an array of 0 elements.
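For example, three progressively quieter taps which bounce between the left and right channels (a sketch; error checking elided):

struct syz_EchoTapConfig taps[3] = {
    /* delay (seconds), gain_l, gain_r */
    {0.10, 0.7, 0.0},
    {0.25, 0.0, 0.5},
    {0.50, 0.3, 0.3},
};
syz_globalEchoSetTaps(echo, 3, taps);
/* To clear later: */
syz_globalEchoSetTaps(echo, 0, NULL);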

Properties

None

Linger Behavior

Lingers until the delay line is empty, that is until no more echoes can possibly be heard.

Remarks

This is a stereo tapped delay line, with a one-block crossfade when taps are reconfigured. The max delay is currently fixed at 5 seconds, but this will be made user configurable in future.

This implementation offers precise control over the placement of taps, at the cost of not being able to have indefinitely long echo effects. It's most useful for modeling discrete, panned echo taps. Some ways this is useful are:

  • Emphasize footsteps off walls in large spaces, by computing the parameters for the taps off level geometry.
  • Emphasize openings or corridors.
  • Pair it with a reverb implementation to offer additional, highly controlled early reflection emphasis.

This is effectively discrete convolution for 2 channels, implemented using an algorithm designed for sparse taps. In other words, the cost of any echo effect is O(taps) per sample. Anything up to a few thousand discrete taps is probably fine, but beyond that the cost will become prohibitive.

GlobalFdnReverb

A reverb based off a feedback delay network.

Inherits from GlobalEffect.

Constructors

syz_createGlobalFdnReverb

SYZ_CAPI syz_ErrorCode syz_createGlobalFdnReverb(syz_Handle *out, syz_Handle context, void *config, void *userdata,
                                                 syz_UserdataFreeCallback *userdata_free_callback);

Creates a global FDN reverb with default settings.

Properties

See remarks for a description of what these do and how to use them effectively.

In addition to the below, FdnReverb defaults its gain to 0.7. Gains of 1.0 are almost never what you want, since that makes the reverb as loud as the non-reverb audio paths.

Enum | Type | Default | Range | Description
---- | ---- | ------- | ----- | -----------
SYZ_P_FILTER_INPUT | biquad | lowpass Butterworth at 2000 Hz | any biquad | A filter that applies to the audio at the input of the reverb.
SYZ_P_MEAN_FREE_PATH | double | 0.1 | 0.0 to 0.5 | The mean free path of the simulated environment.
SYZ_P_T60 | double | 0.3 | 0.0 to 100.0 | The T60 of the reverb.
SYZ_P_LATE_REFLECTIONS_LF_ROLLOFF | double | 1.0 | 0.0 to 2.0 | A multiplicative factor on T60 for the low frequency band.
SYZ_P_LATE_REFLECTIONS_LF_REFERENCE | double | 200.0 | 0.0 to 22050.0 | Where the low band of the feedback equalizer ends.
SYZ_P_LATE_REFLECTIONS_HF_ROLLOFF | double | 0.5 | 0.0 to 2.0 | A multiplicative factor on T60 for the high frequency band.
SYZ_P_LATE_REFLECTIONS_HF_REFERENCE | double | 500.0 | 0.0 to 22050.0 | Where the high band of the equalizer starts.
SYZ_P_LATE_REFLECTIONS_DIFFUSION | double | 1.0 | 0.0 to 1.0 | Controls the diffusion of the late reflections as a percent.
SYZ_P_LATE_REFLECTIONS_MODULATION_DEPTH | double | 0.01 | 0.0 to 0.3 | The depth of the modulation of the delay lines on the feedback path, in seconds.
SYZ_P_LATE_REFLECTIONS_MODULATION_FREQUENCY | double | 0.5 | 0.01 to 100.0 | The frequency of the modulation of the delay lines in the feedback paths.
SYZ_P_LATE_REFLECTIONS_DELAY | double | 0.03 | 0.0 to 0.5 | The delay of the late reflections relative to the input, in seconds.

Linger behavior

Lingers for slightly longer than t60.

Remarks

This is a reverb composed of a feedback delay network with 8 internal delay lines. The algorithm proceeds as follows:

  • Audio is fed through the input filter, a lowpass. Use this to eliminate high frequencies, which can be quite harsh when fed to reverb algorithms.
  • Then, audio is fed into a series of 8 delay lines, connected with a feedback matrix. It's essentially a set of parallel allpass filters with some additional feedbacks, but inspired by physics.
    • Each of these delay lines is modulated, to reduce periodicity.
    • On each feedback path, the audio is fed through an equalizer to precisely control the decay rate in 3 frequency bands.
  • Two decorrelated channels are extracted. This will be increased to 4 when surround sound support is added.
  • Finally, the output is delayed by the late reflections delay.

The current reverb model is missing spatialized early reflections. Practically speaking this makes very little difference when using an FDN, because the FDN effectively simulates them on its own, but the SYZ_P_EARLY_REFLECTIONS_* namespace is reserved for that purpose. The plan is to feed the early reflections through HRTF in order to attempt to capture the shape of the room, possibly with a per-source model.

The reverb is also missing the ability to pan late reflections; this is on the roadmap.

The default configuration is something to the effect of a medium-sized room. Presets will be added in future. The following sections explain considerations for reverb design with this algorithm:

A Note On Property Changes

The FdnReverb effect involves a large amount of feedback and is therefore impossible to crossfade efficiently. To that end, we don't try. Expect most property changes, save for T60 and the hf/lf frequency controls, to cause clicking and other artifacts.

To change properties smoothly, it's best to create a new reverb, set all its parameters, connect all the sources to the new one, and then disconnect all the sources from the old one, in that order. Synthizer may eventually do this internally, but that would require taking a permanent and large allocation cost without a lot of implementation work being done first, so for the moment we don't.
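A sketch of that flow, assuming the routing functions documented in the effects section (syz_routingConfigRoute and syz_routingRemoveRoute; consult that chapter for the exact signatures):

syz_Handle new_reverb;
syz_createGlobalFdnReverb(&new_reverb, ctx, NULL, NULL, NULL);
syz_setD(new_reverb, SYZ_P_T60, 1.5); /* ...plus any other parameters. */
struct syz_RouteConfig route;
syz_initRouteConfig(&route);
/* Connect sources to the new reverb, then disconnect them from the old one. */
syz_routingConfigRoute(ctx, source, new_reverb, &route);
syz_routingRemoveRoute(ctx, source, old_reverb, 0.05);
syz_handleDecRef(old_reverb);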

In practice, this doesn't matter. Most environments don't change reverb characteristics. A good flow is as follows:

  • Design the reverb in your level editor/other environment.
  • When necessary, use syz_effectReset for interactive experimentation.
  • When distributing/launching for real, use the above crossfading instructions.

It is of course possible to use more than one reverb at a time as well, and to fade sources between them at different levels. Note, however, that reverbs are relatively expensive.

The Input Filter

Most reverb algorithms have a problem: high frequencies are emphasized, and Synthizer's is no different. To solve this, we introduce an input lowpass filter, which can cut out the higher frequencies. This is SYZ_P_FILTER_INPUT, available on all effects, but defaulted by the reverb to a lowpass Butterworth at 2000 Hz (see the properties table above), because most of the negative characteristics of reverbs occur when high frequencies are overemphasized.

Changing this cutoff filter is the strongest tool available for coloring the reverb. Low cutoffs are great for rooms with sound dampening, high cutoffs for concrete walls. It can be disabled, but doing so will typically cause metallic and periodic artifacts to be noticeable.

It's also possible to swap it with other filter types. Lowpass filters are effectively the only filter type that aligns with the real world in the context of a reverb, but other filter types can produce interesting effects.

Choosing the mean free path and late reflections delay

These two values are most directly responsible for controlling how big a space feels. Intuitively, the mean free path is the average distance from wall to wall, and the late reflections delay is the time it takes for audio to hit something for the first time. In general, get the mean free path by dividing the average distance between the walls by the speed of sound, and set the late reflections delay to something in the same order of magnitude.

A good approximation of the mean free distance is 4 * volume / surface_area; divide by the speed of sound to convert to seconds. Mathematically, the mean free path is the average time sound travels before reflecting off an obstacle. Very large mean free paths produce many discrete echoes; for unrealistically large values, the late reflections won't be able to converge at all.
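For example, a roughly cubic room 10 meters on a side has a volume of 1000 and a surface area of 600, for a mean free distance of 4 * 1000 / 600 ≈ 6.7 meters; dividing by the speed of sound (roughly 343 m/s) gives a mean free path of about 0.02 seconds.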

Choosing T60 and controlling per-band decay

The t60 and related properties control the gains and configuration of a filter on the feedback path.

The t60 of a reverb is defined as the time it takes for the reverb to decay by -60db. Effectively this can be thought of as how long until the reverb is completely silent. 0.2 to 0.5 is a particularly reverberant and large living room, 1.0 to 2.0 is a concert hall, 5.0 is an amazingly large cavern, and values larger than that quickly become unrealistic and metallic.

Most environments don't have the same decay time for all frequency bands, so the FdnReverb actually uses a 3-band equalizer instead of raw gains on the feedback paths. The bands are as follows:

  • 0.0 to SYZ_P_LATE_REFLECTIONS_LF_REFERENCE
  • SYZ_P_LATE_REFLECTIONS_LF_REFERENCE to SYZ_P_LATE_REFLECTIONS_HF_REFERENCE
  • SYZ_P_LATE_REFLECTIONS_HF_REFERENCE to nyquist

SYZ_P_T60 controls the decay time of the middle frequency band. The lower band is t60 * lf_rolloff, and the upper t60 * hf_rolloff. This allows you to simply change T60, and let the rolloff ratios control coloration.
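For example, with SYZ_P_T60 = 1.0, an lf rolloff of 1.2, and an hf rolloff of 0.5, the low band decays over 1.2 seconds, the middle band over 1.0 second, and the high band over 0.5 seconds.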

Intuitively, rooms with carpet on all the walls have a rather low hf reference and rolloff, while giant stone caverns are close to equal in all frequency bands. The lf reference/rolloff pairing is primarily useful for non-natural bass boosting. When the reverb starts, all frequencies are relatively equal; as the audio is repeatedly fed back through the feedback paths, the equalizer emphasizes or deemphasizes the 3 frequency bands at different rates. To use this effectively, treat the hf/lf controls as defining the materials of the walls, then adjust T60.

Note that the amount of coloration you can get from the equalizer is limited especially for short reverbs. To control the perception of the environment more bluntly and independently of t60, use the input filter.

Diffusion

The diffusion of the reverb is how fast the reverb tail transitions from discrete echoes to a continuous reverberant response. Synthizer exposes this to you as a percent-based control, since it's not conveniently possible to tie anything to a real physical quantity in this case. Typically, diffusion at 1.0 (the default) is what you want.

Another way to think of diffusion is how rough the walls are, how many obstacles there are for sound to bounce off of, etc.

Delay Line modulation

A problem with feedback delay networks and/or other allpass/comb filter reverb designs is that they tend to be obviously periodic. To deal with this, modulation of the delay lines on the feedback path is often introduced. The final stage of designing an FdnReverb is to decide on the values of the modulation depth and frequency.

The trade-off here is this:

  • At low modulation depth/frequency, the reverb likes to sound metallic.
  • At high modulation depth/frequency, the reverb gains very obvious nonlinear effects.
  • At very high modulation depth/frequency, the reverb doesn't sound like a reverb at all.

FdnReverb tries to default to universally applicable settings, but it may still be worth adjusting these. To disable modulation altogether, set the depth to 0.0; due to internal details, setting the frequency to 0.0 is not possible.

The artifacts introduced by large modulation depth/frequency values are least noticeable with percussive sounds and most noticeable with constant tones such as pianos and vocals. Inversely, the periodic artifacts of no or little modulation are most noticeable with percussive sounds and least noticeable with constant tones.

In general, the best way to not need to touch these settings is to use realistic t60, as the beginning of the reverb isn't generally periodic.

Audio EQ Cookbook

The following is the Audio EQ Cookbook, containing the most widely used formulas for biquad filters. Synthizer's internal implementation of most filters either follows these exactly or is composed of cascaded/parallel sections.

There are several versions of this document on the web. This version is from http://music.columbia.edu/pipermail/music-dsp/2001-March/041752.html.
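By way of illustration, here is a minimal C sketch of the cookbook's LPF case (Q-specified alpha, coefficients normalized by a0 as in Eq 4; the function name is this example's own):

#include <math.h>

/* Compute normalized lowpass biquad coefficients per the cookbook below. */
static void design_lowpass(double frequency, double q, double sample_rate,
                           double *b0, double *b1, double *b2,
                           double *a1, double *a2) {
  const double pi = 3.14159265358979323846;
  double omega = 2.0 * pi * frequency / sample_rate;
  double s = sin(omega), c = cos(omega);
  double alpha = s / (2.0 * q);
  double a0 = 1.0 + alpha;
  *b0 = ((1.0 - c) / 2.0) / a0;
  *b1 = (1.0 - c) / a0;
  *b2 = ((1.0 - c) / 2.0) / a0;
  *a1 = (-2.0 * c) / a0;
  *a2 = (1.0 - alpha) / a0;
}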

         Cookbook formulae for audio EQ biquad filter coefficients
---------------------------------------------------------------------------
by Robert Bristow-Johnson <rbj at gisco.net>  a.k.a. <robert at audioheads.com>


All filter transfer functions were derived from analog prototypes (that 
are shown below for each EQ filter type) and had been digitized using the 
Bilinear Transform.  BLT frequency warping has been taken into account 
for both significant frequency relocation and for bandwidth readjustment.

First, given a biquad transfer function defined as:

            b0 + b1*z^-1 + b2*z^-2
    H(z) = ------------------------                                (Eq 1)
            a0 + a1*z^-1 + a2*z^-2

This shows 6 coefficients instead of 5 so, depending on your architechture,
you will likely normalize a0 to be 1 and perhaps also b0 to 1 (and collect
that into an overall gain coefficient).  Then your transfer function would
look like:

            (b0/a0) + (b1/a0)*z^-1 + (b2/a0)*z^-2
    H(z) = ---------------------------------------                 (Eq 2)
               1 + (a1/a0)*z^-1 + (a2/a0)*z^-2

or

                      1 + (b1/b0)*z^-1 + (b2/b0)*z^-2
    H(z) = (b0/a0) * ---------------------------------             (Eq 3)
                      1 + (a1/a0)*z^-1 + (a2/a0)*z^-2


The most straight forward implementation would be the Direct I form (using Eq 2):

y[n] = (b0/a0)*x[n] + (b1/a0)*x[n-1] + (b2/a0)*x[n-2]
                    - (a1/a0)*y[n-1] - (a2/a0)*y[n-2]              (Eq 4)

This is probably both the best and the easiest method to implement in the 56K.



Now, given:

    sampleRate (the sampling frequency)

    frequency ("wherever it's happenin', man."  "center" frequency 
        or "corner" (-3 dB) frequency, or shelf midpoint frequency, 
        depending on which filter type)
    
    dBgain (used only for peaking and shelving filters)

    bandwidth in octaves (between -3 dB frequencies for BPF and notch
        or between midpoint (dBgain/2) gain frequencies for peaking EQ)

     _or_ Q (the EE kind of definition)

     _or_ S, a "shelf slope" parameter (for shelving EQ only).  when S = 1, 
        the shelf slope is as steep as it can be and remain monotonically 
        increasing or decreasing gain with frequency.  the shelf slope, in 
        dB/octave, remains proportional to S for all other values.



First compute a few intermediate variables:

    A     = sqrt[ 10^(dBgain/20) ]
          = 10^(dBgain/40)                    (for peaking and shelving EQ filters only)

    omega = 2*PI*frequency/sampleRate

    sin   = sin(omega)
    cos   = cos(omega)

    alpha = sin/(2*Q)                                     (if Q is specified)
          = sin*sinh[ ln(2)/2 * bandwidth * omega/sin ]   (if bandwidth is specified)

    beta  = sqrt(A)/Q                                     (for shelving EQ filters only)
          = sqrt(A)*sqrt[ (A + 1/A)*(1/S - 1) + 2 ]       (if shelf slope is specified)
          = sqrt[ (A^2 + 1)/S - (A-1)^2 ]


Then compute the coefficients for whichever filter type you want:

  The analog prototypes are shown for normalized frequency.
  The bilinear transform substitutes:
  
                1          1 - z^-1
  s  <-  -------------- * ----------
          tan(omega/2)     1 + z^-1

and makes use of these trig identities:

                    sin(w)
   tan(w/2)    = ------------
                  1 + cos(w)


                  1 - cos(w)
  (tan(w/2))^2 = ------------
                  1 + cos(w)



LPF:            H(s) = 1 / (s^2 + s/Q + 1)

                b0 =  (1 - cos)/2
                b1 =   1 - cos
                b2 =  (1 - cos)/2
                a0 =   1 + alpha
                a1 =  -2*cos
                a2 =   1 - alpha



HPF:            H(s) = s^2 / (s^2 + s/Q + 1)

                b0 =  (1 + cos)/2
                b1 = -(1 + cos)
                b2 =  (1 + cos)/2
                a0 =   1 + alpha
                a1 =  -2*cos
                a2 =   1 - alpha



BPF (constant skirt gain):    H(s) = s / (s^2 + s/Q + 1)

                b0 =   Q*alpha
                b1 =   0
                b2 =  -Q*alpha
                a0 =   1 + alpha
                a1 =  -2*cos
                a2 =   1 - alpha


BPF (constant peak gain):     H(s) = (s/Q) / (s^2 + s/Q + 1)

                b0 =   alpha
                b1 =   0
                b2 =  -alpha
                a0 =   1 + alpha
                a1 =  -2*cos
                a2 =   1 - alpha



notch:          H(s) = (s^2 + 1) / (s^2 + s/Q + 1)

                b0 =   1
                b1 =  -2*cos
                b2 =   1
                a0 =   1 + alpha
                a1 =  -2*cos
                a2 =   1 - alpha



APF:          H(s) = (s^2 - s/Q + 1) / (s^2 + s/Q + 1)

                b0 =   1 - alpha
                b1 =  -2*cos
                b2 =   1 + alpha
                a0 =   1 + alpha
                a1 =  -2*cos
                a2 =   1 - alpha



peakingEQ:      H(s) = (s^2 + s*(A/Q) + 1) / (s^2 + s/(A*Q) + 1)

                b0 =   1 + alpha*A
                b1 =  -2*cos
                b2 =   1 - alpha*A
                a0 =   1 + alpha/A
                a1 =  -2*cos
                a2 =   1 - alpha/A



lowShelf:       H(s) = A * (s^2 + beta*s + A) / (A*s^2 + beta*s + 1)

                b0 =    A*[ (A+1) - (A-1)*cos + beta*sin ]
                b1 =  2*A*[ (A-1) - (A+1)*cos            ]
                b2 =    A*[ (A+1) - (A-1)*cos - beta*sin ]
                a0 =        (A+1) + (A-1)*cos + beta*sin
                a1 =   -2*[ (A-1) + (A+1)*cos            ]
                a2 =        (A+1) + (A-1)*cos - beta*sin



highShelf:      H(s) = A * (A*s^2 + beta*s + 1) / (s^2 + beta*s + A)

                b0 =    A*[ (A+1) + (A-1)*cos + beta*sin ]
                b1 = -2*A*[ (A-1) + (A+1)*cos            ]
                b2 =    A*[ (A+1) + (A-1)*cos - beta*sin ]
                a0 =        (A+1) - (A-1)*cos + beta*sin
                a1 =    2*[ (A-1) - (A+1)*cos            ]
                a2 =        (A+1) - (A-1)*cos - beta*sin