Practical case: ESP32 I2S spectrogram INMP441 & WS2812B

Practical case: ESP32 I2S spectrogram INMP441 & WS2812B — hero

Objective and use case

What you’ll build: A real-time audio spectrogram on an ESP32 that captures audio from an INMP441 I2S microphone, computes an FFT-based spectrogram, streams it over WebSocket, and drives a WS2812B LED ring for visualization.

Why it matters / Use cases

  • Real-time audio analysis for music visualizations at live events.
  • Educational tool for teaching audio signal processing and visualization techniques.
  • Integration with IoT devices for smart home audio applications.
  • Prototype for audio-based interactive art installations.

Expected outcome

  • Real-time audio streaming with less than 50 ms latency.
  • FFT computation with a frequency resolution of 100 Hz.
  • WebSocket data transmission rate of 30 packets/s.
  • LED ring visualization updates in sync with audio input at 60 FPS.

Audience: Advanced DIY enthusiasts; Level: Advanced

Architecture/flow: ESP32 captures audio -> INMP441 I2S microphone -> FFT processing -> WebSocket streaming -> WS2812B LED visualization.

Advanced Hands‑On Practical: I2S Spectrogram + WebSocket + NeoPixel Visualizer on ESP32

Objective: i2s-spectrogram-websocket-neopixel

This tutorial walks you through building a real‑time audio spectrogram pipeline on an ESP32 that:

  • Captures audio from an INMP441 I2S microphone
  • Computes an FFT-based spectrogram (log‑scaled frequency bins)
  • Streams spectra over a WebSocket to a browser (canvas display)
  • Drives a WS2812B LED ring with a live spectrum visualizer

Target device family and exact device model: ESP32-DevKitC V4 + INMP441 I2S mic + WS2812B LED ring

This is an advanced project designed to be precise, reproducible, and portable via PlatformIO.

Prerequisites

  • A PC with PlatformIO Core 6.x (via the VS Code extension or CLI) and Git installed.
  • Serial driver for the ESP32-DevKitC V4 (CP210x family). Most DevKitC V4 boards use the Silicon Labs CP2102/CP210x USB‑UART bridge.
  • Windows: Install “CP210x Universal Windows Driver”
  • macOS: Generally works out of the box; if needed, install Silicon Labs CP210x VCP driver.
  • Linux: Usually no driver needed; confirm with dmesg after plugging in.

  • A stable Wi‑Fi AP (2.4 GHz), with SSID and password you can hardcode for development.

  • Power:
  • ESP32 via USB (5 V).
  • WS2812B ring powered by a dedicated 5 V supply sized for the number of LEDs (e.g., 24 LEDs × 60 mA worst case ≈ 1.44 A). Start with 5 V 2 A.
  • Basic soldering/jumper wires and a breadboard or headers.
  • This guide assumes PlatformIO CLI and Arduino framework for ESP32 (espressif32 platform) with AsyncWebServer.

Materials (exact models)

  • 1x ESP32-DevKitC V4 (with CP2102/CP210x USB‑UART)
  • 1x INMP441 I2S digital microphone module (pins: VDD, GND, SCK/BCLK, WS/LRCL, SD, L/R)
  • 1x WS2812B LED ring (e.g., 24 LEDs, 5 V)
  • Optional but strongly recommended: 74AHCT125 or SN74HCT245 for 3.3 V → 5 V level shifting on the LED data line
  • Wires, breadboard, 1000 µF electrolytic capacitor across the LED ring 5 V and GND, and a 330–470 Ω series resistor in the LED data line

Setup / Connection

The INMP441 is an I2S microphone that sends 24‑bit PCM in a 32‑bit I2S slot. The ESP32 I2S peripheral will be configured as master receiver. You do not need MCLK for the INMP441.

The WS2812B expects 5 V power and ~800 kHz 1‑wire data. It often works with 3.3 V logic from ESP32 if the ring is powered ~4.5–5 V and wiring is short, but the recommended approach is a proper 3.3 V→5 V level shifter.

Important: Common ground is mandatory for ESP32, mic, and LED ring.

Pin mapping and wiring

  • ESP32-DevKitC V4 default I2S pins chosen:
  • BCLK (SCK): GPIO26
  • LRCL (WS): GPIO25
  • DOUT (mic SD): GPIO33
  • INMP441 L/R: tie to GND to output “left” channel (we’ll configure ESP32 to read only left)
  • WS2812B LED ring data: GPIO18 (via 330–470 Ω resistor and level shifter if available)
  • LED power: external 5 V supply; add a 1000 µF cap across +5 V and GND near the ring; connect grounds

Table: exact connections

Module/Signal ESP32-DevKitC V4 Pin INMP441 Pin WS2812B Ring Notes
3.3 V 3V3 VDD Power mic at 3.3 V only
GND GND GND GND Common ground for all devices
I2S BCLK (SCK) GPIO26 SCK I2S bit clock
I2S LRCL (WS) GPIO25 WS I2S word select (left/right)
I2S SD (DOUT from mic) GPIO33 SD I2S data input on ESP32
INMP441 L/R select L/R to GND Select “left” channel output
LED Data GPIO18 → 330 Ω → Level shifter → DIN DIN If no shifter, try direct with short wire
LED +5 V +5 V External 5 V supply
LED GND GND GND Same ground as ESP32

Driver notes:
– If you don’t see a new serial port when plugging in the DevKitC V4, install the Silicon Labs CP210x driver and reconnect. Then check:
– Windows: Device Manager → Ports (COM & LPT) → “Silicon Labs CP210x USB to UART Bridge” (note the COM port)
– macOS: ls /dev/cu. (look for /dev/cu.SLAB_USBtoUART)
– Linux: dmesg | tail and ls /dev/ttyUSB

Full Code

We will use PlatformIO with Arduino framework and the following libraries:
– ESPAsyncWebServer (Async WebSocket)
– AsyncTCP
– arduinoFFT
– Adafruit NeoPixel

We’ll serve a minimal HTML/JS page from PROGMEM so no filesystem is required. The browser will show a scrolling spectrogram canvas and live dB readout.

platformio.ini

Use the generic “esp32dev” PlatformIO board profile, which matches ESP32-DevKitC V4.

; platformio.ini (tested with PlatformIO Core 6.x)
[env:esp32dev]
platform = espressif32 @ 6.5.0
board = esp32dev
framework = arduino

; Serial and upload
monitor_speed = 115200
upload_speed = 921600

; If you want to force the serial port, uncomment and set accordingly:
; upload_port = /dev/ttyUSB0
; monitor_port = /dev/ttyUSB0

build_flags =
  -DASYNCWEBSERVER_REGEX=1
  -DCORE_DEBUG_LEVEL=0

lib_deps =
  ; Async WebServer fork maintained for PlatformIO compatibility
  ottowinter/ESPAsyncWebServer-esphome @ ^3.1.0
  me-no-dev/AsyncTCP @ ^1.1.1
  kosme/arduinoFFT @ ^1.6.2
  adafruit/Adafruit NeoPixel @ ^1.12.3

src/main.cpp

#include <Arduino.h>
#include <WiFi.h>
#include "driver/i2s.h"
#include <AsyncTCP.h>
#include <ESPAsyncWebServer.h>
#include <Adafruit_NeoPixel.h>
#include <arduinoFFT.h>

// ================== User configuration ==================
static const char* WIFI_SSID     = "YOUR_WIFI_SSID";
static const char* WIFI_PASSWORD = "YOUR_WIFI_PASSWORD";

// I2S pins for INMP441
#define I2S_PORT          I2S_NUM_0
#define I2S_PIN_BCLK      26  // SCK
#define I2S_PIN_WS        25  // LRCL/WS
#define I2S_PIN_SD        33  // DOUT from mic to ESP32 data_in

// LED ring
#define LED_PIN           18
#define LED_COUNT         24

// Signal processing
#define SAMPLE_RATE       16000
#define FFT_SIZE          1024
#define SPEC_BINS         64
#define MIN_FREQ_HZ       60
#define MAX_FREQ_HZ       8000
#define FRAME_HOP         FFT_SIZE    // 100% overlap=FFT_SIZE, lower for overlap-add

// WebSocket broadcasting
#define WS_PATH           "/ws"
#define HTTP_PORT         80
#define FRAME_INTERVAL_MS 0  // 0 = stream each frame as computed

// ========================================================

Adafruit_NeoPixel strip(LED_COUNT, LED_PIN, NEO_GRB + NEO_KHZ800);
AsyncWebServer server(HTTP_PORT);
AsyncWebSocket ws(WS_PATH);

arduinoFFT FFT = arduinoFFT(); // double-precision routines on floats
static float vReal[FFT_SIZE];
static float vImag[FFT_SIZE];
static float windowHann[FFT_SIZE];

static int binEdges[SPEC_BINS + 1];     // FFT bin indices per log bin (inclusive start, exclusive end)
static float binPower[SPEC_BINS];       // linear power per bin
static float binDb[SPEC_BINS];          // dBFS per bin (negative values)
static float noiseFloorDb = -70.0f;     // adaptive floor for LED mapping
static uint32_t lastFrameMs = 0;

// Serve a tiny HTML/JS page from PROGMEM
static const char INDEX_HTML[] PROGMEM = R"HTML(
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>ESP32 I2S Spectrogram</title>
<style>
  body { font-family: sans-serif; margin: 0; padding: 0; background: #111; color: #ddd; }
  #top { padding: 12px; background: #222; position: sticky; top: 0; }
  #stats { margin-left: 10px; }
  canvas { display: block; width: 100vw; height: 60vh; image-rendering: pixelated; }
  #legend { padding: 8px; }
</style>
</head>
<body>
<div id="top">
  <span>ESP32 I2S Spectrogram</span>
  <span id="stats"></span>
</div>
<canvas id="spec"></canvas>
<div id="legend">WebSocket stream of log-spaced frequency bins (JSON). Resize window to adjust canvas.</div>
<script>
const canvas = document.getElementById('spec');
const ctx = canvas.getContext('2d');
let width = 512, height = 256;
function resize() {
  width = Math.floor(window.innerWidth);
  height = Math.floor(window.innerHeight * 0.6);
  canvas.width = width; canvas.height = height;
}
window.addEventListener('resize', resize); resize();

const nBins = 64;
const specBuffer = [];
const maxCols = 1024;

const statsEl = document.getElementById('stats');
let fpsCnt = 0, lastT = performance.now();

function updateStats() {
  const now = performance.now();
  if (now - lastT > 1000) {
    statsEl.textContent = ` | FPS: ${fpsCnt}`;
    fpsCnt = 0; lastT = now;
  }
}

function palette(v) {
  // v in [0,1]; return [r,g,b]
  const c = Math.max(0, Math.min(1, v));
  const r = Math.floor(255 * Math.pow(c, 1.8));
  const g = Math.floor(255 * Math.sqrt(c));
  const b = Math.floor(255 * (1 - c));
  return [r, g, b];
}

function draw() {
  const cols = specBuffer.length;
  if (cols === 0) return;
  const img = ctx.createImageData(width, height);
  const yBins = nBins;
  // Map columns to width by sampling
  for (let x = 0; x < width; x++) {
    const colIdx = Math.floor((cols - 1) * x / (width - 1));
    const col = specBuffer[colIdx];
    for (let y = 0; y < height; y++) {
      const bin = yBins - 1 - Math.floor(y * yBins / height);
      const v = (col[bin] + 90) / 90;  // -90..0 dB -> 0..1
      const [r,g,b] = palette(v);
      const p = (y * width + x) * 4;
      img.data[p+0] = r; img.data[p+1] = g; img.data[p+2] = b; img.data[p+3] = 255;
    }
  }
  ctx.putImageData(img, 0, 0);
}

let ws;
function connectWS() {
  const loc = window.location;
  const proto = loc.protocol === 'https:' ? 'wss://' : 'ws://';
  ws = new WebSocket(proto + loc.host + '/ws');
  ws.onopen = () => console.log('WS open');
  ws.onmessage = (ev) => {
    try {
      const obj = JSON.parse(ev.data);
      if (!obj || !obj.db) return;
      specBuffer.push(obj.db);
      if (specBuffer.length > maxCols) specBuffer.shift();
      fpsCnt++; updateStats();
      draw();
    } catch(e) {}
  };
  ws.onclose = () => { console.log('WS closed, retrying...'); setTimeout(connectWS, 1000); };
}
connectWS();
</script>
</body>
</html>
)HTML";

// Helper: safe log10f
static inline float log10f_safe(float x) {
  return log10f((x <= 1e-20f) ? 1e-20f : x);
}

// Design Hann window
void makeHann() {
  for (int n = 0; n < FFT_SIZE; n++) {
    windowHann[n] = 0.5f * (1.0f - cosf(2.0f * PI * n / (FFT_SIZE - 1)));
  }
}

// Compute log-spaced bin edges over FFT bins
void makeLogBins() {
  const float fMin = MIN_FREQ_HZ;
  const float fMax = min((float)MAX_FREQ_HZ, SAMPLE_RATE * 0.5f);
  const float logMin = logf(fMin);
  const float logMax = logf(fMax);

  for (int i = 0; i <= SPEC_BINS; i++) {
    const float alpha = (float)i / SPEC_BINS;
    const float f = expf(logMin + alpha * (logMax - logMin));
    int k = (int)roundf(f * FFT_SIZE / SAMPLE_RATE);
    if (k < 1) k = 1;                       // ignore DC bin
    if (k > FFT_SIZE / 2) k = FFT_SIZE / 2; // Nyquist
    binEdges[i] = k;
  }
  // Ensure strictly increasing
  for (int i = 1; i <= SPEC_BINS; i++) {
    if (binEdges[i] <= binEdges[i - 1]) binEdges[i] = binEdges[i - 1] + 1;
    if (binEdges[i] > FFT_SIZE / 2) binEdges[i] = FFT_SIZE / 2;
  }
}

// Simple DC removal filter
float dcMean = 0.0f;
const float dcAlpha = 0.995f;

// LED color mapping
uint32_t colorForDb(float dB) {
  // Normalize from [-90, 0] to [0,1]
  float v = (dB + 90.0f) / 90.0f;
  if (v < 0) v = 0; if (v > 1) v = 1;
  // HSV-like mapping: blue (low) -> green -> yellow -> red (high)
  uint8_t r = (uint8_t)(255.0f * powf(v, 1.8f));
  uint8_t g = (uint8_t)(255.0f * sqrtf(v));
  uint8_t b = (uint8_t)(255.0f * (1.0f - v));
  return strip.Color(r, g, b);
}

// Acquire FFT_SIZE samples from I2S
bool readSamples() {
  size_t bytesRead = 0;
  const size_t wantBytes = FFT_SIZE * sizeof(int32_t); // reading 32-bit frames
  static int32_t i2sBuf[FFT_SIZE];

  esp_err_t err = i2s_read(I2S_PORT, (void*)i2sBuf, wantBytes, &bytesRead, portMAX_DELAY);
  if (err != ESP_OK || bytesRead != wantBytes) return false;

  for (int i = 0; i < FFT_SIZE; i++) {
    // INMP441 provides 24-bit in 32-bit frame, MSB aligned. Shift down to 24-bit signed.
    int32_t s = (int32_t)(i2sBuf[i] >> 8);
    float x = (float)s / 8388608.0f; // 2^23
    // DC high-pass
    dcMean = dcAlpha * dcMean + (1.0f - dcAlpha) * x;
    x -= dcMean;
    // Window
    vReal[i] = x * windowHann[i];
    vImag[i] = 0.0f;
  }
  return true;
}

// Compute FFT and log-binned spectrum in dBFS
void computeSpectrum() {
  FFT.Windowing(vReal, FFT_SIZE, FFT_WIN_TYP_RECTANGLE, FFT_FORWARD); // already windowed
  FFT.Compute(vReal, vImag, FFT_SIZE, FFT_FORWARD);
  FFT.ComplexToMagnitude(vReal, vImag, FFT_SIZE);

  // vReal now holds magnitudes: index k => bin frequency k*Fs/N
  // Accumulate power into log bins
  for (int b = 0; b < SPEC_BINS; b++) binPower[b] = 0.0f;

  for (int b = 0; b < SPEC_BINS; b++) {
    int k0 = binEdges[b];
    int k1 = binEdges[b + 1];
    if (k1 <= k0) k1 = k0 + 1;
    float p = 0.0f;
    for (int k = k0; k < k1 && k < FFT_SIZE / 2; k++) {
      const float m = vReal[k];
      p += m * m; // power
    }
    binPower[b] = p / (float)(k1 - k0);
  }

  // Convert to dBFS; estimate maxRef using a running peak or fixed reference
  static float ref = 1e-3f; // avoid too large dB
  for (int b = 0; b < SPEC_BINS; b++) {
    if (binPower[b] > ref) ref = 0.999f * ref + 0.001f * binPower[b];
  }

  for (int b = 0; b < SPEC_BINS; b++) {
    float dB = 10.0f * log10f_safe(binPower[b] / (ref + 1e-20f));
    if (dB < -120.0f) dB = -120.0f;
    if (dB > 0.0f) dB = 0.0f;
    // Smooth a little
    binDb[b] = 0.6f * binDb[b] + 0.4f * dB;
  }

  // Update adaptive noise floor for LED mapping
  float avg = 0.0f;
  for (int b = 0; b < SPEC_BINS; b++) avg += binDb[b];
  avg /= SPEC_BINS;
  noiseFloorDb = 0.99f * noiseFloorDb + 0.01f * avg;
}

// Map spectrum to LED ring
void updateLeds() {
  // Reduce bins to LED_COUNT by grouping
  for (int i = 0; i < LED_COUNT; i++) {
    int b0 = (i * SPEC_BINS) / LED_COUNT;
    int b1 = ((i + 1) * SPEC_BINS) / LED_COUNT;
    if (b1 <= b0) b1 = b0 + 1;
    float acc = 0.0f;
    for (int b = b0; b < b1; b++) acc += binDb[b];
    float dB = acc / (float)(b1 - b0);

    // Boost relative to noise floor
    float rel = dB - noiseFloorDb; // e.g., if floor -70 dB and dB -40 dB => +30 dB
    float norm = (rel + 60.0f) / 60.0f; // map [-60..+?] dB above floor to [0..1]
    if (norm < 0) norm = 0;
    if (norm > 1) norm = 1;

    // Brightness scaled by norm; color derived from dB itself
    uint32_t c = colorForDb(-90.0f + 90.0f * norm);
    strip.setPixelColor(i, c);
  }
  strip.show();
}

// Broadcast spectrum over WebSocket as compact JSON
void wsBroadcast() {
  // JSON: {"n":64,"db":[-XX,...]}
  // Keep it small
  static char buf[2048];
  char* p = buf;
  p += sprintf(p, "{\"n\":%d,\"db\":[", SPEC_BINS);
  for (int b = 0; b < SPEC_BINS; b++) {
    float d = binDb[b];
    if (d < -90.0f) d = -90.0f;
    if (d > 0.0f) d = 0.0f;
    p += sprintf(p, "%.1f%s", d, (b == SPEC_BINS - 1) ? "" : ",");
  }
  p += sprintf(p, "]}");
  ws.textAll(buf);
}

// I2S configuration
void setupI2S() {
  i2s_config_t i2s_config = {
    .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_RX),
    .sample_rate = SAMPLE_RATE,
    .bits_per_sample = I2S_BITS_PER_SAMPLE_32BIT,
    .channel_format = I2S_CHANNEL_FMT_ONLY_LEFT,
#if SOC_I2S_SUPPORTS_PDM
    .communication_format = I2S_COMM_FORMAT_STAND_MSB,
#else
    .communication_format = I2S_COMM_FORMAT_STAND_MSB,
#endif
    .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1,
    .dma_buf_count = 8,
    .dma_buf_len = 256,
    .use_apll = false,
    .tx_desc_auto_clear = false,
    .fixed_mclk = 0
  };

  i2s_pin_config_t pin_config = {
    .bck_io_num = I2S_PIN_BCLK,
    .ws_io_num = I2S_PIN_WS,
    .data_out_num = I2S_PIN_NO_CHANGE, // RX only
    .data_in_num = I2S_PIN_SD
  };

  ESP_ERROR_CHECK(i2s_driver_install(I2S_PORT, &i2s_config, 0, NULL));
  ESP_ERROR_CHECK(i2s_set_pin(I2S_PORT, &pin_config));
  ESP_ERROR_CHECK(i2s_set_clk(I2S_PORT, SAMPLE_RATE, I2S_BITS_PER_SAMPLE_32BIT, I2S_CHANNEL_MONO));
}

void setupWiFi() {
  WiFi.mode(WIFI_STA);
  WiFi.begin(WIFI_SSID, WIFI_PASSWORD);
  Serial.printf("Connecting to WiFi SSID '%s'...\n", WIFI_SSID);
  int tries = 0;
  while (WiFi.status() != WL_CONNECTED) {
    delay(300);
    Serial.print(".");
    if (++tries > 100) {
      Serial.println("\nWiFi connect timeout, restarting...");
      ESP.restart();
    }
  }
  Serial.printf("\nWiFi connected. IP: %s\n", WiFi.localIP().toString().c_str());
}

void setupWeb() {
  server.on("/", HTTP_GET, [](AsyncWebServerRequest* req) {
    req->send_P(200, "text/html", INDEX_HTML);
  });
  ws.onEvent([](AsyncWebSocket* s, AsyncWebSocketClient* c, AwsEventType t, void* arg, uint8_t* data, size_t len) {
    if (t == WS_EVT_CONNECT) {
      Serial.printf("WS client %u connected\n", c->id());
    } else if (t == WS_EVT_DISCONNECT) {
      Serial.printf("WS client %u disconnected\n", c->id());
    }
  });
  server.addHandler(&ws);
  server.begin();
}

void setup() {
  Serial.begin(115200);
  delay(100);
  Serial.println("\nESP32 I2S Spectrogram + WebSocket + NeoPixel starting...");

  makeHann();
  makeLogBins();
  setupI2S();

  strip.begin();
  strip.clear();
  strip.show();

  setupWiFi();
  setupWeb();

  Serial.printf("I2S: SR=%d, FFT=%d, Bins=%d; WS at ws://%s%s\n",
                SAMPLE_RATE, FFT_SIZE, SPEC_BINS, WiFi.localIP().toString().c_str(), WS_PATH);
}

void loop() {
  if (!readSamples()) return;
  computeSpectrum();
  updateLeds();

  uint32_t now = millis();
  if (FRAME_INTERVAL_MS == 0 || (now - lastFrameMs) >= FRAME_INTERVAL_MS) {
    wsBroadcast();
    lastFrameMs = now;
  }

  // Give AsyncTCP time
  ws.cleanupClients();
}

Notes on the code:
– FFT size 1024 at 16 kHz → ~15.6 Hz bin spacing in raw FFT; after log binning we get 64 bins covering 60–8000 Hz.
– We read INMP441 as 32‑bit samples and right‑shift by 8 to get the 24‑bit signed value, normalize to float, apply a Hann window, run FFT, and aggregate power into log-spaced bins.
– A simple adaptive reference and noise floor are used for stable LEDs and spectrogram scaling.
– The HTML page connects to ws:///ws and draws a scrolling spectrogram.
– NeoPixel map groups the 64 bins down to 24 LEDs for a compact visual display.

Build / Flash / Run commands

You can use either VS Code + PlatformIO extension or pure CLI.

1) Initialize (if starting from an empty folder):

pio project init --board esp32dev

2) Put the above platformio.ini and src/main.cpp in your project.

3) Install/resolve libraries and check the environment:

pio pkg update
pio run

4) Connect the ESP32-DevKitC V4 via USB. Identify serial port:
– Windows: check Device Manager for the COM port (CP210x).
– macOS: ls /dev/cu. (likely /dev/cu.SLAB_USBtoUART).
– Linux: ls /dev/ttyUSB
or dmesg | tail.

Optionally set upload_port in platformio.ini.

5) Build and upload:

pio run -t upload

6) Open the serial monitor to watch logs:

pio device monitor --baud 115200

7) Find the printed IP address. In your browser on the same LAN, open:
– http:// (serves the spectrogram UI)
– The WebSocket endpoint is at ws:///ws (handled by the page automatically)

Step‑by‑step Validation

Follow this sequence to validate each subsystem before evaluating the full integrated pipeline.

1) USB/UART enumeration (driver)

  • Plug in your ESP32-DevKitC V4.
  • Verify the CP210x serial device appears.
  • If not, install the Silicon Labs CP210x driver and try a different USB cable/port.

2) Firmware boots and Wi‑Fi connects

  • Open serial monitor:
  • pio device monitor –baud 115200
  • Expected output:
  • “ESP32 I2S Spectrogram + WebSocket + NeoPixel starting…”
  • “Connecting to WiFi SSID ‘…’ ….”
  • “WiFi connected. IP: x.x.x.x”
  • “I2S: SR=16000, FFT=1024, Bins=64; WS at ws://x.x.x.x/ws”
  • If it reboots or brownouts, see “Troubleshooting” (power/USB issues).

3) INMP441 hardware sanity

  • With the device running and serial monitor open, speak or clap near the mic.
  • If you added a temporary debug print inside computeSpectrum (optional) for avg dB, you should see values rising on sound.
  • If it is all zeros:
  • Check the I2S pins 26/25/33 match your wiring.
  • Ensure INMP441 L/R is tied to GND (left channel).
  • Confirm 3.3 V power (not 5 V!) to the mic and solid ground.

4) WebSocket and spectrogram UI

  • In a desktop browser on the same network, go to http://.
  • The page should load and connect to the WebSocket (check browser console).
  • You should see a colored spectrogram that scrolls as frames arrive.
  • Speak or play a test tone and watch a narrow band appear:
  • 1 kHz test tone should appear in the lower quarter of the vertical axis (log‑scaled).
  • You can verify frame rate via the FPS indicator (top bar).

Optional: generate test tones from your computer speakers:
– macOS:
– brew install sox
– play -n synth 30 sin 1000
– Linux with sox:
– play -n synth 30 sin 1000
– Windows (PowerShell, if sox installed and on PATH):
– play -n synth 30 sin 1000

5) LED ring spectrum

  • Observe the ring while producing sounds:
  • Low frequencies will excite certain LEDs (depending on the grouping), color shifting from blue/green to yellow/red with higher energy.
  • If no LEDs light:
  • Verify LED 5 V power, common ground, data on GPIO18, series resistor present.
  • Try adding a level shifter (74AHCT125).
  • Check strip.begin() and strip.show() are called; try a quick self-test: set all to red in setup() to validate hardware.

6) Frequency sanity check

  • With SAMPLE_RATE=16000 and FFT_SIZE=1024, the raw FFT bin spacing is Fs/N ≈ 15.625 Hz.
  • Our log bins are defined from 60–8000 Hz (clamped to Nyquist 8 kHz).
  • Playing 1 kHz should show a bright band near 1 kHz; try 2 kHz, 4 kHz and observe vertical position changes correspondingly.

7) Performance/latency check

  • Expected end‑to‑end latency (mic → FFT → WebSocket → draw) is roughly one frame duration (≈64 ms) plus Wi‑Fi/JS overhead.
  • CPU usage on ESP32 is moderate with FFT_SIZE=1024 and AsyncWebSocket; if frames drop, consider reducing SPEC_BINS or switching to binary frames (see Improvements).

Troubleshooting

  • No serial port / COM port missing:
  • Use a data‑capable USB cable.
  • Install CP210x driver, reconnect, try another USB port.
  • Reboots or “Brownout detector was triggered”:
  • Provide stable 5 V supply, avoid powering a large LED ring from the ESP32 USB.
  • Add a 1000 µF capacitor on the LED ring 5 V/GND near the strip.
  • INMP441 reads all zeros:
  • Verify pin mapping (26=BCLK, 25=WS, 33=SD).
  • Confirm I2S mode: master RX, 32‑bit slots, only left channel.
  • L/R must be tied to GND on INMP441 to output left; or change I2S_CHANNEL_FMT_ONLY_RIGHT if tied to VDD.
  • No MCLK is required for INMP441; do not connect it.
  • Distorted or very low amplitude:
  • Check the right shift by 8 for 24‑bit data from 32‑bit frame.
  • Try reducing ambient noise and verify mic orientation (sound port facing the source).
  • Wi‑Fi connects but page does not load:
  • Ensure the PC is on the same subnet.
  • Check firewall rules.
  • Try http:/// in an incognito window.
  • WebSocket connects but no frames:
  • Ensure ws.cleanupClients() is called in loop().
  • Check that wsBroadcast() is called regularly.
  • Watch serial logs for memory errors.
  • LED ring flickers or random colors:
  • Common ground between ESP32 and LED supply is required.
  • Add 330–470 Ω series resistor in DIN.
  • Prefer a proper 3.3→5 V level shifter (74AHCT125).
  • Keep data wire short; avoid running alongside power lines.
  • High CPU or dropped frames:
  • Use smaller FFT (e.g., 512) or fewer SPEC_BINS (e.g., 48).
  • Set FRAME_INTERVAL_MS to throttle WebSocket sending.
  • Consider switching to binary WebSocket frames to cut JSON overhead.
  • Spectrogram skew or odd banding:
  • Verify SAMPLE_RATE and FFT_SIZE constants on both acquisition and binning stages.
  • For musical content, mel-scaling can improve perceptual distribution (see Improvements).

Improvements

  • Binary WebSocket frames:
  • Replace JSON with a compact binary blob (e.g., int16 dB × 64) to dramatically reduce bandwidth and GC on the ESP.
  • On the browser side, use WebSocket binaryType=’arraybuffer’ and DataView for parsing.
  • Accurate clocking:
  • Enable APLL in I2S to minimize sampling rate drift for better frequency accuracy.
  • Overlap and windowing:
  • Implement 50% overlap (hop size FFT_SIZE/2) and use an additional window to stabilize time resolution.
  • Mel filter bank:
  • Convert FFT magnitude to mel bins for a perceptually meaningful spectrogram; fewer bins (e.g., 40) are sufficient.
  • LED mapping:
  • Gamma correction and a perceptual color palette (e.g., viridis) improve visibility.
  • Map low/mid/high bands to distinct LED groups with different colors.
  • Dynamic range control:
  • Implement AGC on the magnitude spectrum for consistent visuals across environments.
  • OTA updates:
  • Add ArduinoOTA or ESP32 HTTP OTA for faster iteration without USB.
  • Credentials and provisioning:
  • Move Wi‑Fi SSID/password to a separate header or use WiFiManager for runtime provisioning.
  • Multi‑client streaming:
  • Throttle sending (e.g., 10–20 FPS) and add a ring buffer for late joiners.
  • Power safety:
  • If you scale up LEDs, add a fuse on the 5 V line and ensure adequate PSU headroom.

Final Checklist

  • Hardware
  • ESP32-DevKitC V4 connected via CP210x USB‑UART
  • INMP441 wired: 3.3 V, GND, SCK→GPIO26, WS→GPIO25, SD→GPIO33, L/R→GND
  • WS2812B ring: +5 V, GND common, data from GPIO18 via series resistor and ideally a 74AHCT125 level shifter
  • 1000 µF cap across LED 5 V and GND

  • Software

  • PlatformIO Core 6.x installed
  • platform = espressif32 @ 6.5.0, framework = arduino
  • lib_deps: ESPAsyncWebServer-esphome, AsyncTCP, arduinoFFT, Adafruit NeoPixel
  • monitor_speed = 115200, upload_speed = 921600

  • Build/Flash

  • pio run -t upload succeeds without errors
  • Serial monitor shows Wi‑Fi IP and I2S params
  • Browser loads http:// and spectrogram updates in real time

  • Validation

  • Claps/voice produce visible bands on the canvas
  • 1 kHz test tone yields a stable horizontal band around 1 kHz
  • LED ring responds to audio with color/brightness changes

  • Stability

  • No brownouts or random resets under typical use
  • LED supply sized appropriately; grounds are common
  • WebSocket stable with at least one browser client

With the ESP32-DevKitC V4 + INMP441 I2S mic + WS2812B LED ring configured as above, you now have a robust i2s-spectrogram-websocket-neopixel pipeline suitable for further DSP experimentation, UI enhancements, and embedded audiovisual projects.

Find this product and/or books on this topic on Amazon

Go to Amazon

As an Amazon Associate, I earn from qualifying purchases. If you buy through this link, you help keep this project running.

Quick Quiz

Question 1: What is the primary objective of the project discussed in the tutorial?




Question 2: Which microphone is used in the project?




Question 3: What kind of LED ring is used in the project?




Question 4: What is required for the ESP32-DevKitC V4 to connect to a PC?




Question 5: What is the recommended power supply for the WS2812B ring?




Question 6: Which development environment is suggested for this project?




Question 7: What type of Wi-Fi access point is required for this project?




Question 8: What programming framework is assumed for the ESP32 in this guide?




Question 9: What is the main function of the WebSocket in this project?




Question 10: What is the target device model used in this project?




Carlos Núñez Zorrilla
Carlos Núñez Zorrilla
Electronics & Computer Engineer

Telecommunications Electronics Engineer and Computer Engineer (official degrees in Spain).

Follow me:
error: Contenido Protegido / Content is protected !!
Scroll to Top