Objective and use case
What you’ll build: An advanced acoustic direction-finding system using the ESP32-S3-DevKitC-1 and two INMP441 I2S microphones to capture stereo audio and estimate the direction of sound sources.
Why it matters / Use cases
- Enhancing audio capture in smart home devices to accurately detect and respond to user commands based on their location.
- Improving surveillance systems by identifying the direction of sounds, aiding in situational awareness.
- Facilitating advanced robotics applications, allowing robots to navigate and interact based on auditory cues from their environment.
- Supporting research in acoustics and audio signal processing by providing a practical platform for experimentation.
Expected outcome
- Accurate estimation of sound source direction with a reported accuracy of ±5 degrees.
- Real-time audio processing with latencies under 50 ms for immediate feedback in applications.
- Successful detection of sound sources at distances up to 10 meters with consistent performance.
- Data throughput of 48 kHz stereo audio captured over I2S, ensuring high fidelity in sound reproduction.
Audience: Developers and engineers interested in audio processing; Level: Intermediate to advanced.
Architecture/flow: The system utilizes I2S for audio capture, with synchronized sampling and processing to compute TDOA for direction estimation.
ESP32-S3-DevKitC-1 + 2x INMP441 I2S Mics: Beamforming Direction Finding (Advanced)
This hands-on case guides you through building a two-microphone acoustic direction-finding (DOA) system using the ESP32-S3-DevKitC-1 and two INMP441 I2S microphones. We’ll acquire synchronized stereo audio over I2S, estimate time-difference-of-arrival (TDOA) with normalized cross-correlation, and compute the azimuth angle of the sound source relative to the two-mic baseline. The workflow emphasizes reproducibility with PlatformIO and the ESP-IDF framework.
The central idea:
– Strap one INMP441 as “Left” and the other as “Right,” feed both with the same I2S BCLK and WS, and share their SD line (they tri-state when not in their assigned slot). This produces true stereo capture on a single I2S peripheral.
– Compute the relative delay between left and right signals. For a microphone spacing d, sampling rate fs, and speed of sound c, the DOA angle is theta = asin(c·tau/d).
– Validate with simple clap/noise tests at known angles and distances.
Prerequisites
- Comfortable with ESP-IDF concepts (tasks, logging, peripherals) and C programming.
- Familiarity with I2S concepts: bit clock (BCLK), word select (WS/LRCLK), data (SD), slot width, and Philips I2S standard.
- Basic DSP: filtering, windowing, correlation, and time-delay estimation.
Software on your development PC:
– Python 3.8+ and pip
– PlatformIO Core 6.1.13 (pinned to a concrete version for reproducibility)
– A serial terminal (PlatformIO monitor is fine)
Operating system notes:
– Windows 10/11, macOS 12+, or Linux (Ubuntu 22.04 tested)
– USB drivers:
– Many ESP32-S3-DevKitC-1 boards use native USB-Serial/JTAG (no external bridge driver needed).
– If your DevKitC variant has a CP210x or CH34x USB-to-UART bridge, install the appropriate driver:
– CP210x: https://www.silabs.com/developers/usb-to-uart-bridge-vcp-drivers
– CH34x: https://www.wch-ic.com/downloads/CH341SER_EXE.html
Materials (with exact model)
- 1x ESP32-S3-DevKitC-1 (ESP32-S3-WROOM-1). Example variants: ESP32-S3-DevKitC-1-N8 or N8R2. Any flash/PSRAM size is fine for this project.
- 2x INMP441 omnidirectional digital I2S microphone modules (pins typically: VDD, GND, SCK/BCLK, WS/LRCLK, SD/DOUT, L/R).
- Breadboard and male-male Dupont wires (20+ pieces recommended).
- 1x USB-C cable (data-capable).
Setup/Connection
We capture two channels (Left/Right) using a single I2S peripheral in Philips standard mode. Both microphones share BCLK and WS. Their SD outputs are tied together and wired to a single ESP32-S3 I2S DIN pin. You must strap one mic as Left and the other as Right using the L/R pin on the INMP441 board.
Important physical guidance:
– Mount the two mics along a straight line with known center-to-center spacing d (e.g., 6.5–8.0 cm). The line defines the system’s local x-axis for azimuth. The “Left” mic should be the negative x end, “Right” mic the positive x end, relative to the board’s orientation for easier interpretation.
Recommended GPIO mapping (ESP32-S3-DevKitC-1):
- I2S BCLK: GPIO 7
- I2S WS (LRCLK): GPIO 15
- I2S DIN (from mic SD): GPIO 16
- MCLK not used (INMP441 does not need MCLK)
Electrical constraints:
– INMP441 VDD: 3.3 V only (do not use 5 V)
– GND common between all units
Mic strapping:
– Mic A (Left channel): L/R pin → GND
– Mic B (Right channel): L/R pin → 3.3 V
Both mics connect to the same I2S bus:
– Both BCLK pins → ESP32-S3 GPIO 7
– Both WS pins → ESP32-S3 GPIO 15
– Both SD pins tied together → ESP32-S3 GPIO 16
– VDD → 3.3 V on DevKitC-1; GND → GND
The table below enumerates the wiring.
| Function | ESP32-S3-DevKitC-1 Pin | Mic A (Left) | Mic B (Right) | Notes |
|---|---|---|---|---|
| 3.3 V | 3V3 | VDD | VDD | Do not use 5 V on INMP441 |
| GND | GND | GND | GND | Common ground |
| I2S BCLK | GPIO 7 | SCK/BCLK | SCK/BCLK | Shared |
| I2S WS/LRCLK | GPIO 15 | WS/LRCLK | WS/LRCLK | Shared |
| I2S DIN (to ESP) | GPIO 16 | SD/DOUT | SD/DOUT | Tie both SD lines together |
| Channel strap | — | L/R → GND | L/R → 3.3 V | Sets Left/Right slot |
Practical notes:
– Keep the SD wire short and tidy to avoid contention or coupling noise.
– If you hear clipping or see saturating values, reduce environmental noise or add soft padding in software.
– Orient the mics away from the ESP32’s switching noise sources (DC regulators).
Full Code
This ESP-IDF-based code configures I2S for stereo capture at 48 kHz (32-bit slots, 24-bit data left-justified), reads frames, deinterleaves Left/Right data, applies simple DC removal, estimates TDOA via normalized cross-correlation over a small lag window (bounded by geometry and sampling rate), and computes the DOA angle. It prints the angle (degrees) and correlation peak.
Notes:
– Beamforming method: TDOA with normalized cross-correlation (NCC). With two microphones and small array spacing, this is a practical and robust estimator. You can swap in a GCC-PHAT FFT method later for more reverberant environments.
– Performance: With a max lag ±12 and frame length 1024, computation is modest and real-time on S3.
Place this file at src/main.c.
// src/main.c
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <inttypes.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "esp_log.h"
#include "driver/i2s_std.h"
#include "esp_check.h"
static const char *TAG = "i2s_doa";
// ====== User configuration ======
#define I2S_SAMPLE_RATE (48000)
#define I2S_BITS_PER_SAMPLE (32) // INMP441 outputs 24-bit MSB aligned in 32-bit slots
#define I2S_SLOT_MODE I2S_SLOT_MODE_STEREO
#define I2S_DMA_DESC (8)
#define I2S_DMA_FRAME_NUM (256)
#define GPIO_I2S_BCLK (7)
#define GPIO_I2S_WS (15)
#define GPIO_I2S_SD (16)
#define FRAME_LEN (1024) // Stereo frames per processing block
#define MIC_BASELINE_M (0.075f) // 7.5 cm mic spacing; adjust to your build
#define SPEED_OF_SOUND (343.0f) // m/s at ~20°C
#define MAX_ABS_ANGLE_DEG (90.0f)
#define ENABLE_ANGLE_SMOOTHING 1
#define ANGLE_ALPHA (0.85f) // 0..1; higher = more smoothing
// If channels are reversed (e.g., wiring or LR strap mismatch), set this to 1 to swap in software.
#define SWAP_CHANNELS 0
// ====== Derived parameters ======
static inline int calc_max_lag_samples(void) {
// Maximum possible sample delay from geometry: ceil(d * fs / c)
float max_delay = MIC_BASELINE_M * ((float)I2S_SAMPLE_RATE) / SPEED_OF_SOUND;
int ml = (int)ceilf(max_delay);
// Add a safety margin of one sample
return ml + 1;
}
// ====== Simple DC removal (1st-order high-pass) ======
typedef struct {
float x1;
float y1;
float a;
} hp_state_t;
static void hp_init(hp_state_t *st, float fc) {
// a = exp(-2*pi*fc/fs)
float a = expf(-2.0f * (float)M_PI * fc / (float)I2S_SAMPLE_RATE);
st->x1 = 0.0f;
st->y1 = 0.0f;
st->a = a;
}
static inline float hp_process(hp_state_t *st, float x) {
float y = st->a * (st->y1 + x - st->x1);
st->x1 = x;
st->y1 = y;
return y;
}
// ====== NCC-based TDOA estimation over limited lags ======
static float ncc_tdoa(const float *x, const float *y, int N, int maxLag, int *bestLag, float *peakCorr) {
// Compute per-lag normalized cross-correlation r(l) for l in [-maxLag..+maxLag]
float bestR = -2.0f;
int bestL = 0;
for (int lag = -maxLag; lag <= maxLag; ++lag) {
int startX = 0;
int startY = 0;
int count = N;
if (lag > 0) {
startY = lag;
count = N - lag;
} else if (lag < 0) {
startX = -lag;
count = N + lag; // since lag is negative
}
if (count <= 16) continue; // ignore tiny overlaps
double num = 0.0;
double ex = 0.0, ey = 0.0;
int end = startX + count;
int ix = startX;
int iy = startY;
for (; ix < end; ++ix, ++iy) {
float xx = x[ix];
float yy = y[iy];
num += (double)xx * (double)yy;
ex += (double)xx * (double)xx;
ey += (double)yy * (double)yy;
}
double denom = sqrt(ex * ey) + 1e-12;
float r = (float)(num / denom);
if (r > bestR) {
bestR = r;
bestL = lag;
}
}
*bestLag = bestL;
*peakCorr = bestR;
// Return TDOA in seconds
return ((float)bestL) / (float)I2S_SAMPLE_RATE;
}
// ====== Angle computation ======
static float tdoa_to_angle_deg(float tdoa_s) {
float s = (SPEED_OF_SOUND * tdoa_s) / MIC_BASELINE_M; // argument to asin
if (s > 1.0f) s = 1.0f;
if (s < -1.0f) s = -1.0f;
float theta = asinf(s) * (180.0f / (float)M_PI);
// Clamp to [-90, 90] just in case
if (theta > MAX_ABS_ANGLE_DEG) theta = MAX_ABS_ANGLE_DEG;
if (theta < -MAX_ABS_ANGLE_DEG) theta = -MAX_ABS_ANGLE_DEG;
return theta;
}
static void print_banner(void) {
printf("\n=== ESP32-S3 I2S 2-Mic DOA ===\n");
printf("Sample rate: %d Hz, Frame: %d, Mic spacing: %.3f m\n", I2S_SAMPLE_RATE, FRAME_LEN, MIC_BASELINE_M);
printf("Max lag (samples): %d\n\n", calc_max_lag_samples());
}
void app_main(void) {
esp_err_t err;
print_banner();
// 1) Create I2S RX channel
i2s_chan_handle_t rx_chan;
i2s_chan_config_t chan_cfg = {
.id = I2S_NUM_0,
.role = I2S_ROLE_MASTER,
.dma_desc_num = I2S_DMA_DESC,
.dma_frame_num = I2S_DMA_FRAME_NUM,
.auto_clear = true,
.intr_priority = 0
};
ESP_ERROR_CHECK(i2s_new_channel(&chan_cfg, NULL, &rx_chan));
// 2) Configure standard Philips I2S mode, stereo, 32-bit slots
i2s_std_config_t std_cfg = {
.clk_cfg = I2S_STD_CLK_DEFAULT_CONFIG(I2S_SAMPLE_RATE),
.slot_cfg = I2S_STD_PHILIPS_SLOT_DEFAULT_CONFIG(I2S_BITS_PER_SAMPLE, I2S_SLOT_MODE),
.gpio_cfg = {
.mclk = I2S_GPIO_UNUSED,
.bclk = GPIO_I2S_BCLK,
.ws = GPIO_I2S_WS,
.dout = I2S_GPIO_UNUSED,
.din = GPIO_I2S_SD,
.invert_flags = {
.mclk_inv = false,
.bclk_inv = false,
.ws_inv = false
}
}
};
// Ensure both slots captured
std_cfg.slot_cfg.slot_mask = I2S_STD_SLOT_LEFT | I2S_STD_SLOT_RIGHT;
ESP_ERROR_CHECK(i2s_channel_init_std_mode(rx_chan, &std_cfg));
ESP_ERROR_CHECK(i2s_channel_enable(rx_chan));
ESP_LOGI(TAG, "I2S RX channel enabled");
// 3) Allocate buffers
const size_t bytes_per_frame = (I2S_BITS_PER_SAMPLE / 8) * 2; // stereo
const size_t buf_bytes = FRAME_LEN * bytes_per_frame;
uint8_t *i2s_raw = (uint8_t *)heap_caps_malloc(buf_bytes, MALLOC_CAP_DEFAULT);
float *left = (float *)heap_caps_malloc(FRAME_LEN * sizeof(float), MALLOC_CAP_DEFAULT);
float *right = (float *)heap_caps_malloc(FRAME_LEN * sizeof(float), MALLOC_CAP_DEFAULT);
if (!i2s_raw || !left || !right) {
ESP_LOGE(TAG, "Buffer allocation failed");
vTaskDelay(pdMS_TO_TICKS(1000));
esp_restart();
return;
}
// 4) Initialize pre-processing
hp_state_t hpL, hpR;
hp_init(&hpL, 20.0f); // 20 Hz, gentle DC removal
hp_init(&hpR, 20.0f);
const int maxLag = calc_max_lag_samples();
float angle_deg_smoothed = 0.0f;
bool first_angle = true;
// 5) Main loop
while (1) {
size_t got = 0;
err = i2s_channel_read(rx_chan, i2s_raw, buf_bytes, &got, portMAX_DELAY);
if (err != ESP_OK || got != buf_bytes) {
ESP_LOGW(TAG, "i2s read: err=%d, got=%u/%u bytes", err, (unsigned)got, (unsigned)buf_bytes);
continue;
}
// Deinterleave: data is interleaved 32-bit L, 32-bit R in Philips I2S
const int32_t *p = (const int32_t *)i2s_raw;
for (int n = 0; n < FRAME_LEN; ++n) {
int32_t sampleL = p[2*n + 0];
int32_t sampleR = p[2*n + 1];
#if SWAP_CHANNELS
int32_t tmp = sampleL;
sampleL = sampleR;
sampleR = tmp;
#endif
// INMP441 is 24-bit in 32-bit container; right-shift to normalize
sampleL >>= 8;
sampleR >>= 8;
// Scale to [-1, 1] with 24-bit full-scale = 8388608
float fl = (float)sampleL / 8388608.0f;
float fr = (float)sampleR / 8388608.0f;
// Simple DC removal
left[n] = hp_process(&hpL, fl);
right[n] = hp_process(&hpR, fr);
}
// Optionally, could add band-pass filtering here (1–4 kHz) to improve robustness
// Compute TDOA via NCC across small lag window
int bestLag = 0;
float peakR = 0.0f;
float tdoa_s = ncc_tdoa(left, right, FRAME_LEN, maxLag, &bestLag, &peakR);
float angle_deg = tdoa_to_angle_deg(tdoa_s);
#if ENABLE_ANGLE_SMOOTHING
if (first_angle) {
angle_deg_smoothed = angle_deg;
first_angle = false;
} else {
angle_deg_smoothed = ANGLE_ALPHA * angle_deg_smoothed + (1.0f - ANGLE_ALPHA) * angle_deg;
}
float out_angle = angle_deg_smoothed;
#else
float out_angle = angle_deg;
#endif
// Diagnostics: RMS levels and correlation peak
double rmsL = 0.0, rmsR = 0.0;
for (int i = 0; i < FRAME_LEN; ++i) {
rmsL += (double)left[i] * (double)left[i];
rmsR += (double)right[i] * (double)right[i];
}
rmsL = sqrt(rmsL / FRAME_LEN);
rmsR = sqrt(rmsR / FRAME_LEN);
// Print one line per frame
printf("lag=%+3d samp tdoa=%.5f ms angle=%.1f deg peak=%.3f rmsL=%.3f rmsR=%.3f\n",
bestLag,
1e3f * tdoa_s,
out_angle,
peakR,
(float)rmsL,
(float)rmsR);
// ~20–30 lines/s depending on loop. Add a tiny delay if needed
// vTaskDelay(pdMS_TO_TICKS(1));
}
}
And the PlatformIO project configuration at the project root:
; platformio.ini
[env:esp32-s3-devkitc-1]
platform = espressif32@6.5.0
board = esp32-s3-devkitc-1
framework = espidf
; Serial monitor and build options
monitor_speed = 115200
monitor_filters = direct, time
; Ensure CDC over USB is recognized reliably
board_build.flash_mode = dio
; Optional: reduce log chatter
build_flags =
-D CONFIG_LOG_DEFAULT_LEVEL=3
; If you need to pin exact ESP-IDF version (optional, advanced)
; platform_packages =
; framework-espidf@~5.1.2
This configuration targets PlatformIO’s espressif32 6.5.0 (consistent, widely available) and uses ESP-IDF (not Arduino). The default USB-CDC interface on the ESP32-S3-DevKitC-1 will present a serial port to your OS for flashing and monitoring.
Build/Flash/Run Commands
Install PlatformIO Core 6.1.13 and verify versions:
python3 -m pip install --upgrade pip
python3 -m pip install "platformio==6.1.13"
pio --version
Create and initialize the project:
mkdir -p ~/esp/esp32s3-i2s-beamform
cd ~/esp/esp32s3-i2s-beamform
pio project init --board esp32-s3-devkitc-1 --project-option "framework=espidf"
Place the files as follows:
– platformio.ini at ~/esp/esp32s3-i2s-beamform/platformio.ini
– src/main.c at ~/esp/esp32s3-i2s-beamform/src/main.c
Build, flash, and monitor:
pio run
# Flash (auto-detects the ESP32-S3 USB-CDC port)
pio run -t upload
# If needed, explicitly specify port, e.g., Linux /dev/ttyACM0, Windows COM6, macOS /dev/cu.usbmodemXYZ
# pio run -t upload --upload-port /dev/ttyACM0
# Serial monitor
pio device monitor
If you’re on Linux and don’t see the device, add your user to the dialout group and re-login:
sudo usermod -a -G dialout $USER
# Log out/in or reboot
Step-by-step Validation
1) Power-on and serial check:
– Plug the board via USB-C. The serial device should appear (Windows: COMx; Linux: /dev/ttyACM0; macOS: /dev/cu.usbmodem…).
– Run pio device monitor and watch the banner:
– “Sample rate: 48000 Hz, Frame: 1024, Mic spacing: 0.075 m”
– “Max lag (samples): ~12”
2) Basic I2S capture sanity:
– With a quiet room, the printed rmsL and rmsR should be small (e.g., 0.005–0.030).
– Clap once near the Left mic, then near the Right mic:
– When clapping near Left, expect negative bestLag (Left leads Right) and angle around -60 to -90 deg depending on position.
– When clapping near Right, expect positive bestLag and angle +60 to +90 deg.
3) Confirm Left/Right mapping:
– Cover the Left mic gently with a finger and clap near it. rmsL should drop significantly while rmsR remains.
– If the channel attribution seems inverted (positive angle when clapping left), set SWAP_CHANNELS = 1 in src/main.c, rebuild, and retry.
4) Static angle measurements:
– Arrange the two mics along the x-axis and mark:
– -90°: directly at Left side perpendicular to array
– 0°: on the broadside (perpendicular to the baseline, centered)
– +90°: directly at Right side perpendicular to array
– Place a tone/noise source ~50 cm away:
– At 0°, average out_angle should hover near 0°.
– At -60°, expect a negative angle around that value.
– At +60°, expect a positive angle around that value.
– Record 10–20 readings and compute the mean and standard deviation to assess stability.
5) Range and lag sanity check:
– With mic spacing d = 0.075 m and fs = 48 kHz, the maximum physical lag is:
– maxLag ≈ ceil(d·fs/c) ≈ ceil(0.075·48000/343) ≈ 11 samples
– If peak lags consistently saturate at ±maxLag, either the source is near endfire beyond the array’s resolution, your spacing is mismatched, or there’s significant reverberation.
6) Sensitivity and gain:
– INMP441 output amplitude varies with distance and sound pressure. If rms values are always tiny (< 0.002) or clipping (> 0.5), adjust the test conditions:
– Move the source closer/farther.
– Use noises with flatter spectra (pink/white noise) for easier DOA detection.
7) Stability and smoothing:
– The angle can jitter frame-to-frame due to noise. With ANGLE_ALPHA = 0.85 you should see a stable running estimate.
– If jitter persists, increase the frame length (e.g., 2048) and enable a small delay between reads, or add median filtering across 3–5 frames.
8) Quantitative error estimation:
– Place the source at known azimuths (e.g., -60°, -30°, 0°, +30°, +60°) and compute error = estimate – ground truth.
– Typical near-field reverberant room error with two mics is 3–8 degrees if the SNR is reasonable and reflections are moderate.
Troubleshooting
- No serial output:
- Ensure the correct serial port in pio device monitor.
- On Linux, add user to dialout and reconnect USB.
-
For some DevKitC-1 variants with CP210x/CH34x bridges, install drivers.
-
I2S read timeouts or zero samples:
- Confirm wiring: BCLK→GPIO 7, WS→GPIO 15, SD (both mics)→GPIO 16.
- Verify that both mics share BCLK and WS and that SD lines are tied together.
- Check that the L/R strap is correct: one mic L/R to GND (Left), the other to 3.3 V (Right).
-
Ensure the mic modules are powered from 3.3 V (not 5 V).
-
One channel always near zero:
- The L/R strap might be wrong; both mics could be transmitting on the same slot or neither on one slot.
- Try swapping the mic modules’ L/R strap configuration and re-test.
-
Set SWAP_CHANNELS = 1 if the mapping is inverted.
-
Distorted or saturating readings:
- The source might be too loud or too close. Increase distance or reduce source volume.
-
Check power supply noise; use short wires and avoid ground loops.
-
Angle stuck at ±90 degrees:
- If bestLag saturates at ±maxLag, the source may be almost endfire or reflections dominate.
- Increase spacing slightly (e.g., to 8 cm), and use a more directional test signal (hand clap works well).
-
Move away from walls/reflectors or hang absorption.
-
Build errors referencing i2s_std:
- Ensure ESP-IDF 5.x is in use. PlatformIO espressif32@6.5.0 uses IDF 5.1 by default.
-
If you pinned framework-espidf, ensure it’s ~5.1.2 or newer.
-
Inconsistent timing:
- USB-CDC prints can perturb timing if you print too much. Keep output modest or buffer results.
Improvements
- GCC-PHAT in frequency domain:
- Replace NCC with GCC-PHAT for better robustness in reverberant environments. You’ll need a small FFT (e.g., KISS FFT or esp-dsp) and PHAT weighting: R_xy = F^-1{X(f)Y(f)/|X(f)Y(f)|}.
-
PlatformIO + ESP-IDF can pull esp-dsp via a “components” folder or CMake fetch; integrate it for efficient FFT and filters.
-
Band-pass filtering:
-
Add a 2nd-order or 4th-order IIR band-pass around 600–4000 Hz to emphasize speech and reduce low-frequency room modes.
-
Windowing and overlap:
-
Use Hann windows and 50% overlap for smoother updates and reduced spectral leakage.
-
Multi-frame peak tracking:
-
Implement a Kalman filter or an alpha-beta tracker for angle to suppress jitter and spurious peaks.
-
Geometry calibration:
-
Calibrate exact mic spacing and offset by measuring TDOA for a sound at known positions, then fitting d and left/right latency.
-
Web or BLE UI:
-
Serve a tiny web page over Wi-Fi to display a polar needle in real time, or stream data over BLE UART.
-
More microphones and beam patterns:
-
With 3–4 mics, use conventional beamformers (Delay-and-Sum, Bartlett/Capon/MVDR) for improved spatial selectivity.
-
Power and performance:
- Lower sample rate (e.g., 32 kHz) to reduce CPU usage if your application is speech-only.
- Use PSRAM for larger buffers; pin memory types as needed.
Final Checklist
- Hardware:
- ESP32-S3-DevKitC-1 powered via USB-C.
- Two INMP441 mics powered at 3.3 V, common GND.
- BCLK wired to both mics and GPIO 7.
- WS wired to both mics and GPIO 15.
- SD from both mics tied together and wired to GPIO 16.
- L/R strap: one mic to GND (Left), the other to 3.3 V (Right).
-
Mic spacing measured and set in code (MIC_BASELINE_M).
-
Software:
- PlatformIO Core 6.1.13 installed.
- platformio.ini uses espressif32@6.5.0 with ESP-IDF.
-
src/main.c compiled without errors; i2s_std used and channel enabled.
-
Run/Validation:
- Serial output shows lag, tdoa, angle, peak correlation, RMS per channel.
- Near-Left sound → negative angle; near-Right → positive angle.
- 0° broadside yields angle near zero.
- Angle stable with smoothing; adjust ANGLE_ALPHA as needed.
If all boxes are checked, you have a working two-mic I2S beamforming direction-finding demonstrator on the ESP32-S3-DevKitC-1, ready to evolve into a more sophisticated GCC-PHAT or multi-microphone beamformer.
Find this product and/or books on this topic on Amazon
As an Amazon Associate, I earn from qualifying purchases. If you buy through this link, you help keep this project running.



