Objective and use case
What you’ll build: A real-time audio spectrogram on an ESP32 that captures audio from an INMP441 I2S microphone, computes an FFT-based spectrogram, streams it over WebSocket, and drives a WS2812B LED ring for visualization.
Why it matters / Use cases
- Real-time audio analysis for music visualizations at live events.
- Educational tool for teaching audio signal processing and visualization techniques.
- Integration with IoT devices for smart home audio applications.
- Prototype for audio-based interactive art installations.
Expected outcome
- Real-time audio streaming with less than 50 ms latency.
- FFT computation with a frequency resolution of 100 Hz.
- WebSocket data transmission rate of 30 packets/s.
- LED ring visualization updates in sync with audio input at 60 FPS.
Audience: Advanced DIY enthusiasts; Level: Advanced
Architecture/flow: ESP32 captures audio -> INMP441 I2S microphone -> FFT processing -> WebSocket streaming -> WS2812B LED visualization.
Advanced Hands‑On Practical: I2S Spectrogram + WebSocket + NeoPixel Visualizer on ESP32
Objective: i2s-spectrogram-websocket-neopixel
This tutorial walks you through building a real‑time audio spectrogram pipeline on an ESP32 that:
- Captures audio from an INMP441 I2S microphone
- Computes an FFT-based spectrogram (log‑scaled frequency bins)
- Streams spectra over a WebSocket to a browser (canvas display)
- Drives a WS2812B LED ring with a live spectrum visualizer
Target device family and exact device model: ESP32-DevKitC V4 + INMP441 I2S mic + WS2812B LED ring
This is an advanced project designed to be precise, reproducible, and portable via PlatformIO.
Prerequisites
- A PC with PlatformIO Core 6.x (via the VS Code extension or CLI) and Git installed.
- Serial driver for the ESP32-DevKitC V4 (CP210x family). Most DevKitC V4 boards use the Silicon Labs CP2102/CP210x USB‑UART bridge.
- Windows: Install “CP210x Universal Windows Driver”
- macOS: Generally works out of the box; if needed, install Silicon Labs CP210x VCP driver.
-
Linux: Usually no driver needed; confirm with dmesg after plugging in.
-
A stable Wi‑Fi AP (2.4 GHz), with SSID and password you can hardcode for development.
- Power:
- ESP32 via USB (5 V).
- WS2812B ring powered by a dedicated 5 V supply sized for the number of LEDs (e.g., 24 LEDs × 60 mA worst case ≈ 1.44 A). Start with 5 V 2 A.
- Basic soldering/jumper wires and a breadboard or headers.
- This guide assumes PlatformIO CLI and Arduino framework for ESP32 (espressif32 platform) with AsyncWebServer.
Materials (exact models)
- 1x ESP32-DevKitC V4 (with CP2102/CP210x USB‑UART)
- 1x INMP441 I2S digital microphone module (pins: VDD, GND, SCK/BCLK, WS/LRCL, SD, L/R)
- 1x WS2812B LED ring (e.g., 24 LEDs, 5 V)
- Optional but strongly recommended: 74AHCT125 or SN74HCT245 for 3.3 V → 5 V level shifting on the LED data line
- Wires, breadboard, 1000 µF electrolytic capacitor across the LED ring 5 V and GND, and a 330–470 Ω series resistor in the LED data line
Setup / Connection
The INMP441 is an I2S microphone that sends 24‑bit PCM in a 32‑bit I2S slot. The ESP32 I2S peripheral will be configured as master receiver. You do not need MCLK for the INMP441.
The WS2812B expects 5 V power and ~800 kHz 1‑wire data. It often works with 3.3 V logic from ESP32 if the ring is powered ~4.5–5 V and wiring is short, but the recommended approach is a proper 3.3 V→5 V level shifter.
Important: Common ground is mandatory for ESP32, mic, and LED ring.
Pin mapping and wiring
- ESP32-DevKitC V4 default I2S pins chosen:
- BCLK (SCK): GPIO26
- LRCL (WS): GPIO25
- DOUT (mic SD): GPIO33
- INMP441 L/R: tie to GND to output “left” channel (we’ll configure ESP32 to read only left)
- WS2812B LED ring data: GPIO18 (via 330–470 Ω resistor and level shifter if available)
- LED power: external 5 V supply; add a 1000 µF cap across +5 V and GND near the ring; connect grounds
Table: exact connections
| Module/Signal | ESP32-DevKitC V4 Pin | INMP441 Pin | WS2812B Ring | Notes |
|---|---|---|---|---|
| 3.3 V | 3V3 | VDD | — | Power mic at 3.3 V only |
| GND | GND | GND | GND | Common ground for all devices |
| I2S BCLK (SCK) | GPIO26 | SCK | — | I2S bit clock |
| I2S LRCL (WS) | GPIO25 | WS | — | I2S word select (left/right) |
| I2S SD (DOUT from mic) | GPIO33 | SD | — | I2S data input on ESP32 |
| INMP441 L/R select | — | L/R to GND | — | Select “left” channel output |
| LED Data | GPIO18 → 330 Ω → Level shifter → DIN | — | DIN | If no shifter, try direct with short wire |
| LED +5 V | — | — | +5 V | External 5 V supply |
| LED GND | GND | — | GND | Same ground as ESP32 |
Driver notes:
– If you don’t see a new serial port when plugging in the DevKitC V4, install the Silicon Labs CP210x driver and reconnect. Then check:
– Windows: Device Manager → Ports (COM & LPT) → “Silicon Labs CP210x USB to UART Bridge” (note the COM port)
– macOS: ls /dev/cu. (look for /dev/cu.SLAB_USBtoUART)
– Linux: dmesg | tail and ls /dev/ttyUSB
Full Code
We will use PlatformIO with Arduino framework and the following libraries:
– ESPAsyncWebServer (Async WebSocket)
– AsyncTCP
– arduinoFFT
– Adafruit NeoPixel
We’ll serve a minimal HTML/JS page from PROGMEM so no filesystem is required. The browser will show a scrolling spectrogram canvas and live dB readout.
platformio.ini
Use the generic “esp32dev” PlatformIO board profile, which matches ESP32-DevKitC V4.
; platformio.ini (tested with PlatformIO Core 6.x)
[env:esp32dev]
platform = espressif32 @ 6.5.0
board = esp32dev
framework = arduino
; Serial and upload
monitor_speed = 115200
upload_speed = 921600
; If you want to force the serial port, uncomment and set accordingly:
; upload_port = /dev/ttyUSB0
; monitor_port = /dev/ttyUSB0
build_flags =
-DASYNCWEBSERVER_REGEX=1
-DCORE_DEBUG_LEVEL=0
lib_deps =
; Async WebServer fork maintained for PlatformIO compatibility
ottowinter/ESPAsyncWebServer-esphome @ ^3.1.0
me-no-dev/AsyncTCP @ ^1.1.1
kosme/arduinoFFT @ ^1.6.2
adafruit/Adafruit NeoPixel @ ^1.12.3
src/main.cpp
#include <Arduino.h>
#include <WiFi.h>
#include "driver/i2s.h"
#include <AsyncTCP.h>
#include <ESPAsyncWebServer.h>
#include <Adafruit_NeoPixel.h>
#include <arduinoFFT.h>
// ================== User configuration ==================
static const char* WIFI_SSID = "YOUR_WIFI_SSID";
static const char* WIFI_PASSWORD = "YOUR_WIFI_PASSWORD";
// I2S pins for INMP441
#define I2S_PORT I2S_NUM_0
#define I2S_PIN_BCLK 26 // SCK
#define I2S_PIN_WS 25 // LRCL/WS
#define I2S_PIN_SD 33 // DOUT from mic to ESP32 data_in
// LED ring
#define LED_PIN 18
#define LED_COUNT 24
// Signal processing
#define SAMPLE_RATE 16000
#define FFT_SIZE 1024
#define SPEC_BINS 64
#define MIN_FREQ_HZ 60
#define MAX_FREQ_HZ 8000
#define FRAME_HOP FFT_SIZE // 100% overlap=FFT_SIZE, lower for overlap-add
// WebSocket broadcasting
#define WS_PATH "/ws"
#define HTTP_PORT 80
#define FRAME_INTERVAL_MS 0 // 0 = stream each frame as computed
// ========================================================
Adafruit_NeoPixel strip(LED_COUNT, LED_PIN, NEO_GRB + NEO_KHZ800);
AsyncWebServer server(HTTP_PORT);
AsyncWebSocket ws(WS_PATH);
arduinoFFT FFT = arduinoFFT(); // double-precision routines on floats
static float vReal[FFT_SIZE];
static float vImag[FFT_SIZE];
static float windowHann[FFT_SIZE];
static int binEdges[SPEC_BINS + 1]; // FFT bin indices per log bin (inclusive start, exclusive end)
static float binPower[SPEC_BINS]; // linear power per bin
static float binDb[SPEC_BINS]; // dBFS per bin (negative values)
static float noiseFloorDb = -70.0f; // adaptive floor for LED mapping
static uint32_t lastFrameMs = 0;
// Serve a tiny HTML/JS page from PROGMEM
static const char INDEX_HTML[] PROGMEM = R"HTML(
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>ESP32 I2S Spectrogram</title>
<style>
body { font-family: sans-serif; margin: 0; padding: 0; background: #111; color: #ddd; }
#top { padding: 12px; background: #222; position: sticky; top: 0; }
#stats { margin-left: 10px; }
canvas { display: block; width: 100vw; height: 60vh; image-rendering: pixelated; }
#legend { padding: 8px; }
</style>
</head>
<body>
<div id="top">
<span>ESP32 I2S Spectrogram</span>
<span id="stats"></span>
</div>
<canvas id="spec"></canvas>
<div id="legend">WebSocket stream of log-spaced frequency bins (JSON). Resize window to adjust canvas.</div>
<script>
const canvas = document.getElementById('spec');
const ctx = canvas.getContext('2d');
let width = 512, height = 256;
function resize() {
width = Math.floor(window.innerWidth);
height = Math.floor(window.innerHeight * 0.6);
canvas.width = width; canvas.height = height;
}
window.addEventListener('resize', resize); resize();
const nBins = 64;
const specBuffer = [];
const maxCols = 1024;
const statsEl = document.getElementById('stats');
let fpsCnt = 0, lastT = performance.now();
function updateStats() {
const now = performance.now();
if (now - lastT > 1000) {
statsEl.textContent = ` | FPS: ${fpsCnt}`;
fpsCnt = 0; lastT = now;
}
}
function palette(v) {
// v in [0,1]; return [r,g,b]
const c = Math.max(0, Math.min(1, v));
const r = Math.floor(255 * Math.pow(c, 1.8));
const g = Math.floor(255 * Math.sqrt(c));
const b = Math.floor(255 * (1 - c));
return [r, g, b];
}
function draw() {
const cols = specBuffer.length;
if (cols === 0) return;
const img = ctx.createImageData(width, height);
const yBins = nBins;
// Map columns to width by sampling
for (let x = 0; x < width; x++) {
const colIdx = Math.floor((cols - 1) * x / (width - 1));
const col = specBuffer[colIdx];
for (let y = 0; y < height; y++) {
const bin = yBins - 1 - Math.floor(y * yBins / height);
const v = (col[bin] + 90) / 90; // -90..0 dB -> 0..1
const [r,g,b] = palette(v);
const p = (y * width + x) * 4;
img.data[p+0] = r; img.data[p+1] = g; img.data[p+2] = b; img.data[p+3] = 255;
}
}
ctx.putImageData(img, 0, 0);
}
let ws;
function connectWS() {
const loc = window.location;
const proto = loc.protocol === 'https:' ? 'wss://' : 'ws://';
ws = new WebSocket(proto + loc.host + '/ws');
ws.onopen = () => console.log('WS open');
ws.onmessage = (ev) => {
try {
const obj = JSON.parse(ev.data);
if (!obj || !obj.db) return;
specBuffer.push(obj.db);
if (specBuffer.length > maxCols) specBuffer.shift();
fpsCnt++; updateStats();
draw();
} catch(e) {}
};
ws.onclose = () => { console.log('WS closed, retrying...'); setTimeout(connectWS, 1000); };
}
connectWS();
</script>
</body>
</html>
)HTML";
// Helper: safe log10f
static inline float log10f_safe(float x) {
return log10f((x <= 1e-20f) ? 1e-20f : x);
}
// Design Hann window
void makeHann() {
for (int n = 0; n < FFT_SIZE; n++) {
windowHann[n] = 0.5f * (1.0f - cosf(2.0f * PI * n / (FFT_SIZE - 1)));
}
}
// Compute log-spaced bin edges over FFT bins
void makeLogBins() {
const float fMin = MIN_FREQ_HZ;
const float fMax = min((float)MAX_FREQ_HZ, SAMPLE_RATE * 0.5f);
const float logMin = logf(fMin);
const float logMax = logf(fMax);
for (int i = 0; i <= SPEC_BINS; i++) {
const float alpha = (float)i / SPEC_BINS;
const float f = expf(logMin + alpha * (logMax - logMin));
int k = (int)roundf(f * FFT_SIZE / SAMPLE_RATE);
if (k < 1) k = 1; // ignore DC bin
if (k > FFT_SIZE / 2) k = FFT_SIZE / 2; // Nyquist
binEdges[i] = k;
}
// Ensure strictly increasing
for (int i = 1; i <= SPEC_BINS; i++) {
if (binEdges[i] <= binEdges[i - 1]) binEdges[i] = binEdges[i - 1] + 1;
if (binEdges[i] > FFT_SIZE / 2) binEdges[i] = FFT_SIZE / 2;
}
}
// Simple DC removal filter
float dcMean = 0.0f;
const float dcAlpha = 0.995f;
// LED color mapping
uint32_t colorForDb(float dB) {
// Normalize from [-90, 0] to [0,1]
float v = (dB + 90.0f) / 90.0f;
if (v < 0) v = 0; if (v > 1) v = 1;
// HSV-like mapping: blue (low) -> green -> yellow -> red (high)
uint8_t r = (uint8_t)(255.0f * powf(v, 1.8f));
uint8_t g = (uint8_t)(255.0f * sqrtf(v));
uint8_t b = (uint8_t)(255.0f * (1.0f - v));
return strip.Color(r, g, b);
}
// Acquire FFT_SIZE samples from I2S
bool readSamples() {
size_t bytesRead = 0;
const size_t wantBytes = FFT_SIZE * sizeof(int32_t); // reading 32-bit frames
static int32_t i2sBuf[FFT_SIZE];
esp_err_t err = i2s_read(I2S_PORT, (void*)i2sBuf, wantBytes, &bytesRead, portMAX_DELAY);
if (err != ESP_OK || bytesRead != wantBytes) return false;
for (int i = 0; i < FFT_SIZE; i++) {
// INMP441 provides 24-bit in 32-bit frame, MSB aligned. Shift down to 24-bit signed.
int32_t s = (int32_t)(i2sBuf[i] >> 8);
float x = (float)s / 8388608.0f; // 2^23
// DC high-pass
dcMean = dcAlpha * dcMean + (1.0f - dcAlpha) * x;
x -= dcMean;
// Window
vReal[i] = x * windowHann[i];
vImag[i] = 0.0f;
}
return true;
}
// Compute FFT and log-binned spectrum in dBFS
void computeSpectrum() {
FFT.Windowing(vReal, FFT_SIZE, FFT_WIN_TYP_RECTANGLE, FFT_FORWARD); // already windowed
FFT.Compute(vReal, vImag, FFT_SIZE, FFT_FORWARD);
FFT.ComplexToMagnitude(vReal, vImag, FFT_SIZE);
// vReal now holds magnitudes: index k => bin frequency k*Fs/N
// Accumulate power into log bins
for (int b = 0; b < SPEC_BINS; b++) binPower[b] = 0.0f;
for (int b = 0; b < SPEC_BINS; b++) {
int k0 = binEdges[b];
int k1 = binEdges[b + 1];
if (k1 <= k0) k1 = k0 + 1;
float p = 0.0f;
for (int k = k0; k < k1 && k < FFT_SIZE / 2; k++) {
const float m = vReal[k];
p += m * m; // power
}
binPower[b] = p / (float)(k1 - k0);
}
// Convert to dBFS; estimate maxRef using a running peak or fixed reference
static float ref = 1e-3f; // avoid too large dB
for (int b = 0; b < SPEC_BINS; b++) {
if (binPower[b] > ref) ref = 0.999f * ref + 0.001f * binPower[b];
}
for (int b = 0; b < SPEC_BINS; b++) {
float dB = 10.0f * log10f_safe(binPower[b] / (ref + 1e-20f));
if (dB < -120.0f) dB = -120.0f;
if (dB > 0.0f) dB = 0.0f;
// Smooth a little
binDb[b] = 0.6f * binDb[b] + 0.4f * dB;
}
// Update adaptive noise floor for LED mapping
float avg = 0.0f;
for (int b = 0; b < SPEC_BINS; b++) avg += binDb[b];
avg /= SPEC_BINS;
noiseFloorDb = 0.99f * noiseFloorDb + 0.01f * avg;
}
// Map spectrum to LED ring
void updateLeds() {
// Reduce bins to LED_COUNT by grouping
for (int i = 0; i < LED_COUNT; i++) {
int b0 = (i * SPEC_BINS) / LED_COUNT;
int b1 = ((i + 1) * SPEC_BINS) / LED_COUNT;
if (b1 <= b0) b1 = b0 + 1;
float acc = 0.0f;
for (int b = b0; b < b1; b++) acc += binDb[b];
float dB = acc / (float)(b1 - b0);
// Boost relative to noise floor
float rel = dB - noiseFloorDb; // e.g., if floor -70 dB and dB -40 dB => +30 dB
float norm = (rel + 60.0f) / 60.0f; // map [-60..+?] dB above floor to [0..1]
if (norm < 0) norm = 0;
if (norm > 1) norm = 1;
// Brightness scaled by norm; color derived from dB itself
uint32_t c = colorForDb(-90.0f + 90.0f * norm);
strip.setPixelColor(i, c);
}
strip.show();
}
// Broadcast spectrum over WebSocket as compact JSON
void wsBroadcast() {
// JSON: {"n":64,"db":[-XX,...]}
// Keep it small
static char buf[2048];
char* p = buf;
p += sprintf(p, "{\"n\":%d,\"db\":[", SPEC_BINS);
for (int b = 0; b < SPEC_BINS; b++) {
float d = binDb[b];
if (d < -90.0f) d = -90.0f;
if (d > 0.0f) d = 0.0f;
p += sprintf(p, "%.1f%s", d, (b == SPEC_BINS - 1) ? "" : ",");
}
p += sprintf(p, "]}");
ws.textAll(buf);
}
// I2S configuration
void setupI2S() {
i2s_config_t i2s_config = {
.mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_RX),
.sample_rate = SAMPLE_RATE,
.bits_per_sample = I2S_BITS_PER_SAMPLE_32BIT,
.channel_format = I2S_CHANNEL_FMT_ONLY_LEFT,
#if SOC_I2S_SUPPORTS_PDM
.communication_format = I2S_COMM_FORMAT_STAND_MSB,
#else
.communication_format = I2S_COMM_FORMAT_STAND_MSB,
#endif
.intr_alloc_flags = ESP_INTR_FLAG_LEVEL1,
.dma_buf_count = 8,
.dma_buf_len = 256,
.use_apll = false,
.tx_desc_auto_clear = false,
.fixed_mclk = 0
};
i2s_pin_config_t pin_config = {
.bck_io_num = I2S_PIN_BCLK,
.ws_io_num = I2S_PIN_WS,
.data_out_num = I2S_PIN_NO_CHANGE, // RX only
.data_in_num = I2S_PIN_SD
};
ESP_ERROR_CHECK(i2s_driver_install(I2S_PORT, &i2s_config, 0, NULL));
ESP_ERROR_CHECK(i2s_set_pin(I2S_PORT, &pin_config));
ESP_ERROR_CHECK(i2s_set_clk(I2S_PORT, SAMPLE_RATE, I2S_BITS_PER_SAMPLE_32BIT, I2S_CHANNEL_MONO));
}
void setupWiFi() {
WiFi.mode(WIFI_STA);
WiFi.begin(WIFI_SSID, WIFI_PASSWORD);
Serial.printf("Connecting to WiFi SSID '%s'...\n", WIFI_SSID);
int tries = 0;
while (WiFi.status() != WL_CONNECTED) {
delay(300);
Serial.print(".");
if (++tries > 100) {
Serial.println("\nWiFi connect timeout, restarting...");
ESP.restart();
}
}
Serial.printf("\nWiFi connected. IP: %s\n", WiFi.localIP().toString().c_str());
}
void setupWeb() {
server.on("/", HTTP_GET, [](AsyncWebServerRequest* req) {
req->send_P(200, "text/html", INDEX_HTML);
});
ws.onEvent([](AsyncWebSocket* s, AsyncWebSocketClient* c, AwsEventType t, void* arg, uint8_t* data, size_t len) {
if (t == WS_EVT_CONNECT) {
Serial.printf("WS client %u connected\n", c->id());
} else if (t == WS_EVT_DISCONNECT) {
Serial.printf("WS client %u disconnected\n", c->id());
}
});
server.addHandler(&ws);
server.begin();
}
void setup() {
Serial.begin(115200);
delay(100);
Serial.println("\nESP32 I2S Spectrogram + WebSocket + NeoPixel starting...");
makeHann();
makeLogBins();
setupI2S();
strip.begin();
strip.clear();
strip.show();
setupWiFi();
setupWeb();
Serial.printf("I2S: SR=%d, FFT=%d, Bins=%d; WS at ws://%s%s\n",
SAMPLE_RATE, FFT_SIZE, SPEC_BINS, WiFi.localIP().toString().c_str(), WS_PATH);
}
void loop() {
if (!readSamples()) return;
computeSpectrum();
updateLeds();
uint32_t now = millis();
if (FRAME_INTERVAL_MS == 0 || (now - lastFrameMs) >= FRAME_INTERVAL_MS) {
wsBroadcast();
lastFrameMs = now;
}
// Give AsyncTCP time
ws.cleanupClients();
}
Notes on the code:
– FFT size 1024 at 16 kHz → ~15.6 Hz bin spacing in raw FFT; after log binning we get 64 bins covering 60–8000 Hz.
– We read INMP441 as 32‑bit samples and right‑shift by 8 to get the 24‑bit signed value, normalize to float, apply a Hann window, run FFT, and aggregate power into log-spaced bins.
– A simple adaptive reference and noise floor are used for stable LEDs and spectrogram scaling.
– The HTML page connects to ws://
– NeoPixel map groups the 64 bins down to 24 LEDs for a compact visual display.
Build / Flash / Run commands
You can use either VS Code + PlatformIO extension or pure CLI.
1) Initialize (if starting from an empty folder):
pio project init --board esp32dev
2) Put the above platformio.ini and src/main.cpp in your project.
3) Install/resolve libraries and check the environment:
pio pkg update
pio run
4) Connect the ESP32-DevKitC V4 via USB. Identify serial port:
– Windows: check Device Manager for the COM port (CP210x).
– macOS: ls /dev/cu. (likely /dev/cu.SLAB_USBtoUART).
– Linux: ls /dev/ttyUSB or dmesg | tail.
Optionally set upload_port in platformio.ini.
5) Build and upload:
pio run -t upload
6) Open the serial monitor to watch logs:
pio device monitor --baud 115200
7) Find the printed IP address. In your browser on the same LAN, open:
– http://
– The WebSocket endpoint is at ws://
Step‑by‑step Validation
Follow this sequence to validate each subsystem before evaluating the full integrated pipeline.
1) USB/UART enumeration (driver)
- Plug in your ESP32-DevKitC V4.
- Verify the CP210x serial device appears.
- If not, install the Silicon Labs CP210x driver and try a different USB cable/port.
2) Firmware boots and Wi‑Fi connects
- Open serial monitor:
- pio device monitor –baud 115200
- Expected output:
- “ESP32 I2S Spectrogram + WebSocket + NeoPixel starting…”
- “Connecting to WiFi SSID ‘…’ ….”
- “WiFi connected. IP: x.x.x.x”
- “I2S: SR=16000, FFT=1024, Bins=64; WS at ws://x.x.x.x/ws”
- If it reboots or brownouts, see “Troubleshooting” (power/USB issues).
3) INMP441 hardware sanity
- With the device running and serial monitor open, speak or clap near the mic.
- If you added a temporary debug print inside computeSpectrum (optional) for avg dB, you should see values rising on sound.
- If it is all zeros:
- Check the I2S pins 26/25/33 match your wiring.
- Ensure INMP441 L/R is tied to GND (left channel).
- Confirm 3.3 V power (not 5 V!) to the mic and solid ground.
4) WebSocket and spectrogram UI
- In a desktop browser on the same network, go to http://
. - The page should load and connect to the WebSocket (check browser console).
- You should see a colored spectrogram that scrolls as frames arrive.
- Speak or play a test tone and watch a narrow band appear:
- 1 kHz test tone should appear in the lower quarter of the vertical axis (log‑scaled).
- You can verify frame rate via the FPS indicator (top bar).
Optional: generate test tones from your computer speakers:
– macOS:
– brew install sox
– play -n synth 30 sin 1000
– Linux with sox:
– play -n synth 30 sin 1000
– Windows (PowerShell, if sox installed and on PATH):
– play -n synth 30 sin 1000
5) LED ring spectrum
- Observe the ring while producing sounds:
- Low frequencies will excite certain LEDs (depending on the grouping), color shifting from blue/green to yellow/red with higher energy.
- If no LEDs light:
- Verify LED 5 V power, common ground, data on GPIO18, series resistor present.
- Try adding a level shifter (74AHCT125).
- Check strip.begin() and strip.show() are called; try a quick self-test: set all to red in setup() to validate hardware.
6) Frequency sanity check
- With SAMPLE_RATE=16000 and FFT_SIZE=1024, the raw FFT bin spacing is Fs/N ≈ 15.625 Hz.
- Our log bins are defined from 60–8000 Hz (clamped to Nyquist 8 kHz).
- Playing 1 kHz should show a bright band near 1 kHz; try 2 kHz, 4 kHz and observe vertical position changes correspondingly.
7) Performance/latency check
- Expected end‑to‑end latency (mic → FFT → WebSocket → draw) is roughly one frame duration (≈64 ms) plus Wi‑Fi/JS overhead.
- CPU usage on ESP32 is moderate with FFT_SIZE=1024 and AsyncWebSocket; if frames drop, consider reducing SPEC_BINS or switching to binary frames (see Improvements).
Troubleshooting
- No serial port / COM port missing:
- Use a data‑capable USB cable.
- Install CP210x driver, reconnect, try another USB port.
- Reboots or “Brownout detector was triggered”:
- Provide stable 5 V supply, avoid powering a large LED ring from the ESP32 USB.
- Add a 1000 µF capacitor on the LED ring 5 V/GND near the strip.
- INMP441 reads all zeros:
- Verify pin mapping (26=BCLK, 25=WS, 33=SD).
- Confirm I2S mode: master RX, 32‑bit slots, only left channel.
- L/R must be tied to GND on INMP441 to output left; or change I2S_CHANNEL_FMT_ONLY_RIGHT if tied to VDD.
- No MCLK is required for INMP441; do not connect it.
- Distorted or very low amplitude:
- Check the right shift by 8 for 24‑bit data from 32‑bit frame.
- Try reducing ambient noise and verify mic orientation (sound port facing the source).
- Wi‑Fi connects but page does not load:
- Ensure the PC is on the same subnet.
- Check firewall rules.
- Try http://
/ in an incognito window. - WebSocket connects but no frames:
- Ensure ws.cleanupClients() is called in loop().
- Check that wsBroadcast() is called regularly.
- Watch serial logs for memory errors.
- LED ring flickers or random colors:
- Common ground between ESP32 and LED supply is required.
- Add 330–470 Ω series resistor in DIN.
- Prefer a proper 3.3→5 V level shifter (74AHCT125).
- Keep data wire short; avoid running alongside power lines.
- High CPU or dropped frames:
- Use smaller FFT (e.g., 512) or fewer SPEC_BINS (e.g., 48).
- Set FRAME_INTERVAL_MS to throttle WebSocket sending.
- Consider switching to binary WebSocket frames to cut JSON overhead.
- Spectrogram skew or odd banding:
- Verify SAMPLE_RATE and FFT_SIZE constants on both acquisition and binning stages.
- For musical content, mel-scaling can improve perceptual distribution (see Improvements).
Improvements
- Binary WebSocket frames:
- Replace JSON with a compact binary blob (e.g., int16 dB × 64) to dramatically reduce bandwidth and GC on the ESP.
- On the browser side, use WebSocket binaryType=’arraybuffer’ and DataView for parsing.
- Accurate clocking:
- Enable APLL in I2S to minimize sampling rate drift for better frequency accuracy.
- Overlap and windowing:
- Implement 50% overlap (hop size FFT_SIZE/2) and use an additional window to stabilize time resolution.
- Mel filter bank:
- Convert FFT magnitude to mel bins for a perceptually meaningful spectrogram; fewer bins (e.g., 40) are sufficient.
- LED mapping:
- Gamma correction and a perceptual color palette (e.g., viridis) improve visibility.
- Map low/mid/high bands to distinct LED groups with different colors.
- Dynamic range control:
- Implement AGC on the magnitude spectrum for consistent visuals across environments.
- OTA updates:
- Add ArduinoOTA or ESP32 HTTP OTA for faster iteration without USB.
- Credentials and provisioning:
- Move Wi‑Fi SSID/password to a separate header or use WiFiManager for runtime provisioning.
- Multi‑client streaming:
- Throttle sending (e.g., 10–20 FPS) and add a ring buffer for late joiners.
- Power safety:
- If you scale up LEDs, add a fuse on the 5 V line and ensure adequate PSU headroom.
Final Checklist
- Hardware
- ESP32-DevKitC V4 connected via CP210x USB‑UART
- INMP441 wired: 3.3 V, GND, SCK→GPIO26, WS→GPIO25, SD→GPIO33, L/R→GND
- WS2812B ring: +5 V, GND common, data from GPIO18 via series resistor and ideally a 74AHCT125 level shifter
-
1000 µF cap across LED 5 V and GND
-
Software
- PlatformIO Core 6.x installed
- platform = espressif32 @ 6.5.0, framework = arduino
- lib_deps: ESPAsyncWebServer-esphome, AsyncTCP, arduinoFFT, Adafruit NeoPixel
-
monitor_speed = 115200, upload_speed = 921600
-
Build/Flash
- pio run -t upload succeeds without errors
- Serial monitor shows Wi‑Fi IP and I2S params
-
Browser loads http://
and spectrogram updates in real time -
Validation
- Claps/voice produce visible bands on the canvas
- 1 kHz test tone yields a stable horizontal band around 1 kHz
-
LED ring responds to audio with color/brightness changes
-
Stability
- No brownouts or random resets under typical use
- LED supply sized appropriately; grounds are common
- WebSocket stable with at least one browser client
With the ESP32-DevKitC V4 + INMP441 I2S mic + WS2812B LED ring configured as above, you now have a robust i2s-spectrogram-websocket-neopixel pipeline suitable for further DSP experimentation, UI enhancements, and embedded audiovisual projects.
Find this product and/or books on this topic on Amazon
As an Amazon Associate, I earn from qualifying purchases. If you buy through this link, you help keep this project running.



