Connect to Android’s audio device from C++ code.

Android Wavetable Synthesizer Tutorial Series

  1. App Architecture
  2. UI with Jetpack Compose
  3. ViewModel
  4. Calling C++ Code From Kotlin with JNI
  5. Playing Back Audio on Android with C++ (this one)
  6. Wavetable Synthesis Algorithm in C++

Introduction

Welcome to the 5th part of the Android wavetable synthesizer app tutorial!

In this tutorial series, we are building a synthesizer music app on Android using all the modern (as of 2022) tools and best practices.

In the last episode, we learned how to call the C++ code from Kotlin. We said that it would be necessary to play back sound in our app. Today play back sound we will!

Table of Contents

  1. How To Play a Sound on Android From Code?
  2. Android Audio APIs
  3. How Does an Audio Driver Work?
  4. The Audio Thread
  5. How to Connect to Android Audio Using Oboe?
    1. AudioPlayer Interface
    2. AudioSource Interface
  6. AudioPlayer and oboe::AudioStreamDataCallback implementer
  7. AudioSource implementer
  8. Initialization in the WavetableSynthesizer Class
  9. Adjustment to CMakeLists.txt
  10. Specifying Oboe as a Dependency
  11. Running the Synthesizer
  12. Summary

How To Play a Sound on Android From Code?

To play back sound on Android, we have 2 options:

  1. use a high-level Java API,
  2. write a callback for the Android audio driver using one of the available native (C or C++) audio APIs.

Option 1 is sufficient when we want to perform very simple audio tasks, for example, play back a single sound.

Our synthesizer won’t be very complicated, but it will need fine-grained control over the interaction with the audio driver. That is why we go with option 2.

Android Audio APIs

On Android, we have 3 audio APIs that we can use to control the sound playback:

  1. OpenSL ES,
  2. AAudio, and
  3. Oboe.

The first one is a “nasty, low-level” (IMO) C-style API.

The second one is a more modern approach containing useful features for professional audio apps. However, it is only supported from Android 8.0, which means it will not run on older Android devices.

The third option is a C++-style wrapper for the previous two. It tries to use AAudio underneath whenever it can and falls back on OpenSL ES when AAudio is not available.

Google recommends using Oboe.

So that’s exactly what we’ll do! (Additionally, I think it’s the most developer-friendly option of the three).

How Does an Audio Driver Work?

In all operating systems (OSs), we have at least one audio driver. Their operating principle is quite simple.

A typical audio driver API contains the following 3 concepts:

  1. a command to start the output playback stream, play(),
  2. a command to stop the output playback stream, stop(), and
  3. a user-defined audio callback.

The audio callback is a function defined by a software developer and supplied to the audio driver. While the output playback stream is running, the driver calls the audio callback at regular time intervals, for example, every 10 milliseconds. The length of the interval determines how many samples the callback must generate per call; for example, 10 milliseconds at a sampling rate of 48 000 Hz corresponds to 480 samples per channel. The callback places the generated samples in a dedicated buffer supplied by the audio driver.

The audio callback is called until the output stream is closed, for example, by calling stop().
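To make the driver-callback relationship more tangible, here is a minimal sketch of a "toy driver" that owns a buffer and repeatedly asks a callback to fill it. Everything here (framesPerCallback, runToyDriver, the AudioCallback alias) is a made-up name for illustration; no real audio driver exposes exactly this API, and a real driver would call us from its own thread at its own pace rather than in a loop.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical callback type: the driver passes its buffer and the number
// of frames it wants us to write.
using AudioCallback = std::function<void(float* buffer, std::int32_t frames)>;

// How many frames cover one callback interval at the given sampling rate.
constexpr std::int32_t framesPerCallback(std::int32_t sampleRateHz,
                                         std::int32_t intervalMs) {
  return sampleRateHz * intervalMs / 1000;
}

// Toy stand-in for a driver: calls the callback callbackCount times and
// collects everything that would have been sent to the speaker.
std::vector<float> runToyDriver(const AudioCallback& callback,
                                std::int32_t sampleRateHz,
                                std::int32_t intervalMs,
                                int callbackCount) {
  const auto frames = framesPerCallback(sampleRateHz, intervalMs);
  std::vector<float> driverBuffer(frames);  // buffer owned by the "driver"
  std::vector<float> played;
  for (int i = 0; i < callbackCount; ++i) {
    callback(driverBuffer.data(), frames);
    played.insert(played.end(), driverBuffer.begin(), driverBuffer.end());
  }
  return played;
}
```

With a 48 000 Hz sampling rate and a 10-millisecond interval, each call requests 480 frames, so three calls produce 1440 samples. Note that std::function and std::vector are fine in this offline sketch but are not what we would use on a real audio thread.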

The Audio Thread

The audio callback is called by the audio driver from a dedicated thread, often called the audio thread.

The callback must produce the requested number of samples in much less time than it takes to play them back.

That means that we can never block the audio thread. In particular, we mustn’t do the following on the audio thread:

  1. allocate memory,
  2. take ownership of a mutex,
  3. wait for an asynchronous task completion.

All the above examples require calls to OS APIs and as such have no time constraint guarantees.

If we take too much time to generate the samples, an underrun will occur, which manifests itself through glitches in the audio output.
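One common way to respect these rules when another thread needs to change a parameter is to share it through an atomic variable instead of a mutex. The sketch below assumes a hypothetical SharedGain class and applyGain function (neither is part of our app or of Oboe). Keep in mind that std::atomic<float> is lock-free on typical platforms, but the C++ standard does not guarantee it, hence the isLockFree() check.

```cpp
#include <atomic>

// Hypothetical holder of a gain value shared between the UI thread
// and the audio thread.
class SharedGain {
 public:
  // Called from the UI thread
  void set(float gain) { _gain.store(gain, std::memory_order_relaxed); }

  // Called from the audio thread; does not block if the atomic is lock-free
  float get() const { return _gain.load(std::memory_order_relaxed); }

  // Lets us verify the lock-free assumption on the target platform
  bool isLockFree() const { return _gain.is_lock_free(); }

 private:
  std::atomic<float> _gain{1.f};
};

// Audio-thread-style processing: no allocation, no locks, no waiting
inline void applyGain(float* buffer, int frames, const SharedGain& gain) {
  const float currentGain = gain.get();  // read once per block
  for (int i = 0; i < frames; ++i) {
    buffer[i] *= currentGain;
  }
}
```

The UI thread would call set() whenever the user moves a slider, and the audio callback would call applyGain() on the buffer it has just filled.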

Unfortunately, any app that has audible glitches in the audio output cannot be used by professional musicians. They would risk too much of their reputation by accepting the possibility of a glitch 🙂

How to Connect to Android Audio Using Oboe?

To connect to Android’s audio driver using the Oboe library, we need to implement the audio stream setup, starting, stopping, and the audio callback. We can do it by subclassing oboe::AudioStreamDataCallback.

But then we will make our app dependent on the Oboe library, right?

Right. Therefore, we must use a 2-way abstraction so that Oboe doesn’t know concrete classes that use it and our app doesn’t know which audio driver we are using.

We can summarize this design in the following diagram:

Class diagram depicting classes involved in the audio playback.

Figure 1. Classes involved in the audio playback in our synthesizer app.

AudioPlayer Interface

Let’s define what we need from the audio driver in our synthesizer app as a pure abstract class, that is, an interface:

Listing 2. AudioPlayer.h.

#pragma once

#include <cstdint>  // int32_t

namespace wavetablesynthesizer {
class AudioPlayer {
 public:
  virtual ~AudioPlayer() = default;

  // Start the audio device
  virtual int32_t play() = 0;

  // Stop the audio device
  virtual void stop() = 0;
};
}  // namespace wavetablesynthesizer

What is great about this interface is that it’s not dependent on anything 🙂

This interface can now be implemented by a class that underneath uses one of the 3 previously described audio APIs. Since we only interact with the audio device through the interface, we can completely abstract out which particular audio API is used.
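To illustrate how little the rest of the app needs to know, here is a sketch of a silent stand-in player that could be handy in unit tests. FakeAudioPlayer and its isRunning() method are hypothetical names; the AudioPlayer interface is repeated only to keep the snippet self-contained.

```cpp
#include <cstdint>

namespace wavetablesynthesizer {
// Same interface as in AudioPlayer.h, repeated for self-containment
class AudioPlayer {
 public:
  virtual ~AudioPlayer() = default;
  virtual std::int32_t play() = 0;
  virtual void stop() = 0;
};

// Hypothetical player that produces no sound and only tracks its state
class FakeAudioPlayer : public AudioPlayer {
 public:
  std::int32_t play() override {
    _running = true;
    return 0;  // 0 signals success, as in the rest of this tutorial
  }

  void stop() override { _running = false; }

  bool isRunning() const { return _running; }

 private:
  bool _running = false;
};
}  // namespace wavetablesynthesizer
```

Code that holds only an AudioPlayer reference or pointer cannot tell whether it talks to this fake or to a real audio device, which is exactly the point of the abstraction.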

Any implementer of AudioPlayer has to provide a function for the audio callback but this does not have to be part of the interface. However, any audio callback will need to call our sound-generating code. It will do so through another interface, AudioSource.

AudioSource Interface

Let’s also define what will be needed from our synthesizer by the implemented AudioPlayer:

Listing 3. AudioSource.h.

#pragma once

namespace wavetablesynthesizer {
class AudioSource {
 public:
  virtual ~AudioSource() = default;

  // Return 1 sample of audio to be played back
  virtual float getSample() = 0;

  // A callback invoked when the audio stream is stopped
  virtual void onPlaybackStopped() = 0;
};
}  // namespace wavetablesynthesizer

As you can see, we generate the samples one by one with the getSample() method. This is not the best approach; the best would be to generate a block of samples upon a single call. However, in this app, this will simplify matters significantly.

Additionally, we provide a way for the AudioPlayer implementer to inform the AudioSource that the playback has stopped through the onPlaybackStopped() callback.
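For comparison, a block-based approach could look roughly like the sketch below. The renderBlock helper and the RampSource class are hypothetical and not part of our app; the AudioSource interface is repeated to keep the snippet self-contained. Note that this adapter still pays one virtual call per sample; a truly block-based design would make the block-filling function itself part of the interface.

```cpp
#include <cstddef>

namespace wavetablesynthesizer {
// Same interface as in AudioSource.h, repeated for self-containment
class AudioSource {
 public:
  virtual ~AudioSource() = default;
  virtual float getSample() = 0;
  virtual void onPlaybackStopped() = 0;
};

// Hypothetical helper: fills a whole block of samples in one call
inline void renderBlock(AudioSource& source, float* output,
                        std::size_t sampleCount) {
  for (std::size_t i = 0; i < sampleCount; ++i) {
    output[i] = source.getSample();
  }
}

// Hypothetical test source producing the ramp 0, 1, 2, ...
class RampSource : public AudioSource {
 public:
  float getSample() override { return _next++; }

  // Restart the ramp from 0 when the playback stops
  void onPlaybackStopped() override { _next = 0.f; }

 private:
  float _next{0.f};
};
}  // namespace wavetablesynthesizer
```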

AudioPlayer and oboe::AudioStreamDataCallback implementer

Here comes the “meat” of the Android audio driver connection: the OboeAudioPlayer class.

In Listing 4, you can see its declaration.

Listing 4. OboeAudioPlayer.h.

#pragma once

#include <oboe/Oboe.h>
#include <memory>  // std::shared_ptr
#include "AudioPlayer.h"

namespace wavetablesynthesizer {
class AudioSource;

class OboeAudioPlayer : public oboe::AudioStreamDataCallback,
                        public AudioPlayer {
 public:
  static constexpr auto channelCount = oboe::ChannelCount::Mono;

  OboeAudioPlayer(std::shared_ptr<AudioSource> source, int samplingRate);
  ~OboeAudioPlayer();

  int32_t play() override;

  void stop() override;

  oboe::DataCallbackResult onAudioReady(oboe::AudioStream* audioStream,
                                        void* audioData,
                                        int32_t framesCount) override;

 private:
  std::shared_ptr<AudioSource> _source;
  std::shared_ptr<oboe::AudioStream> _stream;
  int _samplingRate;
};
}  // namespace wavetablesynthesizer

We need to include the Oboe.h header here because we cannot inherit from a class whose full definition we don’t have (an incomplete class).

As you can see, OboeAudioPlayer implements both interfaces: AudioPlayer (needed by our synthesizer app) and oboe::AudioStreamDataCallback (needed by the Oboe library).

Furthermore, it receives an AudioSource as a shared_ptr in the constructor. That allows us to play back anything that we want.

The methods of this class are

  • play() and stop() from the AudioPlayer interface, and
  • onAudioReady() from the oboe::AudioStreamDataCallback interface.

onAudioReady() is exactly our audio callback; it is a way for the audio driver to request audio data from our application.

OboeAudioPlayer holds pointers to an AudioSource (to retrieve samples from), to an oboe::AudioStream (to control the playback stream) and the value of the sampling rate.

We additionally define a helper constant channelCount which indicates that we play back only mono (single-channel) audio. For the usage of this constant check out the play() method implementation.

Here’s the implementation of the OboeAudioPlayer’s methods.

Listing 5. OboeAudioPlayer.cpp.

#include "OboeAudioPlayer.h"

#include <utility>
#include "AudioSource.h"

using namespace oboe;

namespace wavetablesynthesizer {
OboeAudioPlayer::OboeAudioPlayer(std::shared_ptr<AudioSource> source,
                                 int samplingRate)
    : _source(std::move(source)), _samplingRate(samplingRate) {}

OboeAudioPlayer::~OboeAudioPlayer() {
  // Ensure that the playback is stopped when the AudioPlayer is destroyed
  OboeAudioPlayer::stop();
}

int32_t OboeAudioPlayer::play() {
  // Create an AudioStream using the Oboe's builder
  AudioStreamBuilder builder;
  const auto result =
      builder.setPerformanceMode(PerformanceMode::LowLatency)
          // we don't want to record the sound, just play back
          ->setDirection(Direction::Output)
          ->setSampleRate(_samplingRate)
          // pass this instance as the audio callback
          // this ensures that onAudioReady is called at regular intervals
          // to generate audio
          ->setDataCallback(this)
          // no other app should play back sound simultaneously
          ->setSharingMode(SharingMode::Exclusive)
          ->setFormat(AudioFormat::Float)
          ->setChannelCount(channelCount)
          // if the audio device does not support the requested sampling
          // rate natively, it will have to resample the output;
          // the better the resampling quality the larger the workload
          ->setSampleRateConversionQuality(
            SampleRateConversionQuality::Best)
          // open the stream for playback
          ->openStream(_stream);

  if (result != Result::OK) {
    // indicate that stream creation has failed
    return static_cast<int32_t>(result);
  }

  // request a playback start but don't wait for it to actually start
  const auto playResult = _stream->requestStart();

  return static_cast<int32_t>(playResult);
}

void OboeAudioPlayer::stop() {
  // if there is an active stream, stop, close, and destroy it
  if (_stream) {
    _stream->stop();
    _stream->close();
    _stream.reset();
  }
  // notify the AudioSource that the playback stopped
  _source->onPlaybackStopped();
}

DataCallbackResult
OboeAudioPlayer::onAudioReady(oboe::AudioStream* audioStream,
                                                 void* audioData,
                                                 int32_t framesCount) {
  // we requested floating-point processing, thus, we treat the given
  // memory block as an array of floats
  // WARNING: the sample format may differ from the requested one.
  // Please, refer to Oboe's documentation for details.
  auto* floatData = reinterpret_cast<float*>(audioData);

  // Let's fill the array with samples.
  // This code works for any number of interleaved channels 
  // and any number of frames.
  for (auto frame = 0; frame < framesCount; ++frame) {
    // retrieve a sample from the AudioSource
    // (in our case, it's a WavetableOscillator)
    const auto sample = _source->getSample();
    // copy the samples to all channels of this frame
    for (auto channel = 0; channel < channelCount; ++channel) {
      floatData[frame * channelCount + channel] = sample;
    }
  }
  // indicate to the Oboe library that the playback should continue
  return oboe::DataCallbackResult::Continue;
}
}  // namespace wavetablesynthesizer
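The indexing in onAudioReady() may be easier to follow outside of Oboe. Below is a stand-alone sketch of the same fill loop; fillInterleaved and its parameters are hypothetical names, and the sample generator is any callable returning a float. In an interleaved buffer, the samples of all channels of one frame sit next to each other, so the sample of a given frame and channel lives at index frame * channelCount + channel.

```cpp
// Hypothetical stand-alone version of the fill loop from onAudioReady().
// nextSample is any callable returning the next mono sample as a float.
template <typename SampleGenerator>
void fillInterleaved(float* data, int frameCount, int channelCount,
                     SampleGenerator&& nextSample) {
  for (int frame = 0; frame < frameCount; ++frame) {
    // one generated sample per frame...
    const float sample = nextSample();
    // ...copied to every channel of that frame
    for (int channel = 0; channel < channelCount; ++channel) {
      data[frame * channelCount + channel] = sample;
    }
  }
}
```

For 3 frames of stereo audio and a generator returning 1, 2, 3, the buffer ends up as 1, 1, 2, 2, 3, 3: left and right samples of each frame are adjacent.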

AudioSource implementer

Ok, we know how to play back sound. Now let’s generate some!

For this, we will create a simple A4Oscillator class that plays back a 440 Hz sine wave.

We’ll put the declaration into the include/WavetableOscillator.h file and the definition into the WavetableOscillator.cpp file because that’s where we will implement our wavetable oscillator later on.

Listing 6. WavetableOscillator.h.

#pragma once

#include "AudioSource.h"

namespace wavetablesynthesizer {

class A4Oscillator : public AudioSource {
 public:
  explicit A4Oscillator(float sampleRate);

  float getSample() override;

  void onPlaybackStopped() override;

 private:
  float _phase{0.f};
  float _phaseIncrement{0.f};
};
}  // namespace wavetablesynthesizer

Listing 7. WavetableOscillator.cpp.

#include "WavetableOscillator.h"
#include <cmath>
#include "MathConstants.h"

namespace wavetablesynthesizer {

A4Oscillator::A4Oscillator(float sampleRate)
    : _phaseIncrement{2.f * PI * 440.f / sampleRate} {}

float A4Oscillator::getSample() {
  const auto sample = 0.5f * std::sin(_phase);
  _phase = std::fmod(_phase + _phaseIncrement, 2.f * PI);
  return sample;
}

void A4Oscillator::onPlaybackStopped() {
  _phase = 0.f;
}
}  // namespace wavetablesynthesizer

This oscillator simply calculates the next value of the 440 Hz sine wave based on the given sampling rate. We scale the output by 0.5f to halve the amplitude so that we do not damage our phone’s speakers 🙂

We use the std::fmod operation to keep the phase in the [0, 2π) range, so that it neither grows without bound nor loses floating-point precision.

When the oscillator is stopped, we simply reset the phase to 0.
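To get a feeling for the numbers involved, here is the same phase arithmetic written as free functions. The names kTwoPi, phaseIncrement, samplesPerPeriod, and advancePhase are hypothetical helpers for this sketch only. At a 48 000 Hz sampling rate, a 440 Hz tone advances the phase by about 0.0576 radians per sample, and one period spans about 109 samples.

```cpp
#include <cmath>

// 2*pi computed without relying on any project header
const float kTwoPi = 2.f * std::acos(-1.f);

// Phase advance (in radians) per generated sample
inline float phaseIncrement(float frequencyHz, float sampleRateHz) {
  return kTwoPi * frequencyHz / sampleRateHz;
}

// Number of samples that make up one period of the tone
inline float samplesPerPeriod(float frequencyHz, float sampleRateHz) {
  return sampleRateHz / frequencyHz;
}

// Advance the phase by one sample and wrap it back into [0, 2*pi)
inline float advancePhase(float phase, float increment) {
  return std::fmod(phase + increment, kTwoPi);
}
```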

The include/MathConstants.h file contains only the value of π, which was not part of the C++ standard library before C++20 added std::numbers::pi.

Listing 8. MathConstants.h.

#pragma once

#include <cmath>  // std::atan

namespace wavetablesynthesizer {
static const auto PI = std::atan(1.f) * 4;
}

Initialization in the WavetableSynthesizer Class

Finally, we need to connect the player, the oscillator, and the WavetableSynthesizer class that we created in the previous part of the tutorial.

Listing 9. WavetableSynthesizer.h.

#pragma once

#include <memory>

namespace wavetablesynthesizer {
class AudioSource;
class AudioPlayer;

constexpr auto samplingRate = 48000;

class WavetableSynthesizer {
 public:
  WavetableSynthesizer();

  ~WavetableSynthesizer();

  void play();

  void stop();

  bool isPlaying() const;

  void setFrequency(float frequencyInHz);

  void setVolume(float volumeInDb);

  void setWavetable(Wavetable wavetable);

 private:
  bool _isPlaying = false;
  std::shared_ptr<AudioSource> _oscillator;
  std::unique_ptr<AudioPlayer> _audioPlayer;
};
}  // namespace wavetablesynthesizer

In the include/WavetableSynthesizer.h file, we need to forward-declare 2 classes: AudioSource and AudioPlayer. Forward-declaring them spares us the necessity to include the headers with these classes’ declarations but forces us to use pointers to the instances of these classes rather than simple instance variables.

We also define the desired sampling rate we want to use as a compile-time constant. 48 000 Hz is a typical value in this scenario.

Finally, we declare 2 members: a shared_ptr to an AudioSource and a unique_ptr to an AudioPlayer. Although the AudioSource is now only passed to the AudioPlayer, we want to control it from the WavetableSynthesizer starting from the next tutorial episode, hence the shared (not unique) pointer.
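The forward-declaration technique can be shown in isolation. In the sketch below, the Engine and Car classes are made up purely for illustration, and the "header" and ".cpp" parts are placed in one snippet so that it compiles on its own. The key point is that a std::unique_ptr member of an incomplete type is fine in the class definition, as long as the destructor is defined where the type is complete.

```cpp
#include <memory>

// --- what would live in the header file ---
class Engine;  // forward declaration: Engine is an incomplete type here

class Car {
 public:
  Car();
  ~Car();  // only declared; defined below, where Engine is complete
  int horsepower() const;

 private:
  // a smart pointer to an incomplete type is allowed as a member
  std::unique_ptr<Engine> _engine;
};

// --- what would live in the .cpp file ---
class Engine {
 public:
  int horsepower() const { return 100; }
};

Car::Car() : _engine{std::make_unique<Engine>()} {}

// Engine is a complete type at this point, so unique_ptr can delete it
Car::~Car() = default;

int Car::horsepower() const { return _engine->horsepower(); }
```

Code that includes only the "header" part can create and use a Car without ever seeing the definition of Engine.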

As for the implementation (WavetableSynthesizer.cpp), we need 4 changes:

  1. We include 2 new headers.

    Listing 10. WavetableSynthesizer.cpp.

    #include "OboeAudioPlayer.h"
    #include "WavetableOscillator.h"
  2. We initialize the member variables in the constructor. Note that the oscillator is passed to the player as an AudioSource. We also define the destructor as defaulted. We need to do it in the .cpp file rather than in the header file: destroying the std::unique_ptr<AudioPlayer> member requires AudioPlayer to be a complete type, and in the header file it is only forward-declared (the destructors of forward-declared classes are not known to the compiler when parsing the header file).

    Listing 11. WavetableSynthesizer.cpp.

    WavetableSynthesizer::WavetableSynthesizer()
        : _oscillator{std::make_shared<A4Oscillator>(samplingRate)},
          _audioPlayer{
              std::make_unique<OboeAudioPlayer>(
               _oscillator, samplingRate)} {}
    
    WavetableSynthesizer::~WavetableSynthesizer() = default;
  3. We modify the play() member function of the WavetableSynthesizer. If 0 is returned (=success), we update the play state. If not, we log an error.

    Listing 12. WavetableSynthesizer.cpp.

    void WavetableSynthesizer::play() {
      LOGD("play() called");
      const auto result = _audioPlayer->play();
      if (result == 0) {
        _isPlaying = true;
      } else {
        LOGD("Could not start playback.");
      }
    }
  4. We modify the stop() member function of the WavetableSynthesizer; we stop the player and update the play state.

    Listing 13. WavetableSynthesizer.cpp.

    void WavetableSynthesizer::stop() {
      LOGD("stop() called");
      _audioPlayer->stop();
      _isPlaying = false;
    }

Adjustment to CMakeLists.txt

To build the new source files, we need to list WavetableOscillator.cpp and OboeAudioPlayer.cpp in the source file list of the add_library command in our CMakeLists.txt file:

Listing 14. CMakeLists.txt.

add_library( # Sets the name of the library.
             wavetablesynthesizer

             # Sets the library as a shared library.
             SHARED

             # Provides a relative path to your source file(s).
             wavetablesynthesizer-native-lib.cpp
             WavetableSynthesizer.cpp
             WavetableOscillator.cpp
             OboeAudioPlayer.cpp
        )

To link against the Oboe library we need two commands:

  1. find_package to discover the library, and
  2. target_link_libraries to link against it.

Listing 15. CMakeLists.txt.

find_package(oboe REQUIRED CONFIG)

target_link_libraries(wavetablesynthesizer
                      ${log-lib}
                      oboe::oboe)

The ${log-lib} part was explained in the previous part of the tutorial.

But that’s not over yet…

Specifying Oboe as a Dependency

To make Gradle download the Oboe library and make it discoverable for CMake, we need to declare it as a dependency in our app module’s build.gradle file:

Listing 16. app/build.gradle.

dependencies {
  // ...
  implementation "com.google.oboe:oboe:1.6.1"
}

…and enable the “prefab” build feature:

Listing 17. app/build.gradle.

android {
  //...
  buildFeatures {
    //...
    prefab true
  }
}

…and enable the usage of the shared STL implementation (don’t ask me why):

Listing 18. app/build.gradle.

android {
  //...
  defaultConfig {
    //...
    externalNativeBuild {
      cmake {
        cppFlags '-std=c++2a'
        arguments '-DANDROID_STL=c++_shared' // this bit is important
      }
    }
  }
  //...
}

If something goes wrong in this process, StackOverflow is our friend…

Running the Synthesizer

If everything went as planned, you should be able to run the synthesizer app in the emulator. Upon clicking “Play” your emulator should play back the 440 Hz sine tone. Upon clicking “Stop” it should stop playing it.

Congratulations! You have successfully connected to Android’s audio device using the Oboe library!

Now, how do we play back something more interesting than just a sine? That will be the topic of the next part of the tutorial, where we will complete our application by implementing various wavetables and thread-safe amplitude and frequency control.

Summary

In this tutorial episode, we implemented a client of the Oboe library that allows us to connect to Android’s audio device. We explained a bit how audio APIs typically work and we discussed various options available on Android.

Finally, we modified the structure of our project so as to avoid direct dependencies between the Oboe library and our main synthesizer class.
