Objective and use case
What you’ll build: A real-time edge people counter using the Arduino Portenta H7 and the Himax HM01B0 Vision Shield, implementing efficient on-device processing for counting distinct individuals.
Why it matters / Use cases
- Smart building management systems that track occupancy in real-time to optimize energy usage.
- Retail analytics to gather foot traffic data for improving store layouts and marketing strategies.
- Event management solutions that monitor crowd sizes for safety and compliance with regulations.
- Public transportation systems that analyze passenger flow to enhance service efficiency.
- Healthcare facilities that manage patient flow in waiting areas to improve service delivery.
Expected outcome
- Live streaming of people counts with less than 100 ms latency.
- Detection accuracy of over 90% in varied lighting conditions.
- Ability to process and count up to 10 individuals simultaneously.
- Diagnostics output including frame processing time and count validation messages.
- Low power consumption, operating under 500 mW during peak processing.
Audience: Developers and engineers interested in edge computing; Level: Advanced
Architecture/flow: On-device processing using C++ with background subtraction and connected-components analysis.
Camera-Edge People Counting on Arduino Portenta H7 + Portenta Vision Shield (Himax HM01B0)
This advanced, hands-on guide builds a real-time edge people counter on the Arduino Portenta H7 paired with the Portenta Vision Shield (Himax HM01B0). You will implement a lightweight on-device background-subtraction and connected-components pipeline in C++ to detect and count distinct moving people in the camera’s field of view, without sending frames to a host PC. We will use PlatformIO (CLI) to build and flash the firmware for the M7 core.
The end result will:
– Stream live counts and diagnostics over the USB serial port.
– Run entirely on the device at low resolution for efficiency.
– Provide a reproducible validation procedure to confirm the counter’s correctness.
Prerequisites
- You are comfortable with:
- PlatformIO CLI basics (project init, build, upload, serial monitor).
- C++ on embedded targets and basic memory constraints.
- Basic image processing concepts (thresholding, morphological operations, connected components).
- Host OS: Windows 10/11, macOS 12+, or Ubuntu 20.04+.
- PlatformIO Core installed:
- Recommended: PlatformIO Core 6.1.x or newer.
- Install via Python’s pip:
pip install -U platformio
or follow the PlatformIO docs.
- A data-capable USB-C cable (charge-only cables will not work).
Driver notes:
– Windows: The Portenta H7 appears as a USB serial device (Mbed serial). Windows 10/11 typically installs WinUSB/USB-CDC automatically. If you see “Unknown device,” update the driver via Windows Update or use Zadig to bind WinUSB to the Mbed serial interface. No CP210x/CH34x drivers are needed.
– macOS/Linux: No extra drivers typically required. Ensure you have permission to access serial ports (on Linux: add user to dialout group or use udev rules).
Materials
| Item | Exact model | Notes |
|---|---|---|
| Microcontroller board | Arduino Portenta H7 | Use M7 core for application code |
| Camera shield | Portenta Vision Shield (Himax HM01B0) | Either Ethernet or LoRa variant; both use HM01B0 grayscale camera |
| USB cable | USB-C data cable | Must support data |
| Host computer | Windows/macOS/Linux | With PlatformIO Core |
| Optional fixtures | Tripod/stand | To stabilize the camera during validation |
Setup/Connection
- Stack the Portenta Vision Shield firmly onto the Portenta H7:
- Align the high-density connectors; the camera lens faces outward from the board.
- Ensure there is no gap and both connectors are fully seated—misalignment can cause I/O failures.
- Connect the Portenta H7 USB-C data port to your computer.
- Power is supplied via USB-C; no external power required for this project.
- Lighting:
- Use a well-lit environment to improve silhouette separation.
- Avoid strong backlighting that causes low contrast on grayscale frames.
Notes:
– The Himax HM01B0 is a low-power grayscale sensor. To keep processing light, we will use a 160×160 resolution capture.
– This demo does not require Ethernet/LoRa functionality; the shield variant doesn’t matter as long as it includes HM01B0.
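To see why 160×160 is a comfortable working size, here is a quick back-of-envelope (in Python, for convenience) for the static buffers the firmware in the next section allocates: grayscale frame, foreground mask, morphology scratch, float background model, and a uint16 label map.

```python
# RAM budget for the 160x160 pipeline buffers (sizes in bytes).
W, H = 160, 160
pixels = W * H                 # 25,600 pixels per frame

frame      = pixels * 1        # uint8 grayscale capture
fg_mask    = pixels * 1        # uint8 foreground mask
tmp_mask   = pixels * 1        # uint8 morphology scratch
background = pixels * 4        # float running-average model
labels     = pixels * 2        # uint16 connected-component labels

total = frame + fg_mask + tmp_mask + background + labels
print(total, "bytes =", total // 1024, "KiB")  # -> 230400 bytes = 225 KiB
```

Roughly 225 KiB total, which leaves headroom in the SRAM available to an M7 sketch on the STM32H747; doubling the linear resolution would quadruple this, which is why we stay small.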
Full Code
This firmware:
– Initializes the HM01B0 camera at 160×160 (grayscale).
– Maintains a running-average background model in RAM.
– Computes frame differencing, thresholds to a binary foreground mask.
– Applies a tiny morphological cleanup (dilate then erode) to remove noise.
– Runs a two-pass connected components labeling (CCL) to count blobs.
– Ignores blobs smaller than a configurable area threshold to reduce false positives.
– Streams counts and performance metrics over serial at 115200 bps.
Place this file at src/main.cpp in your PlatformIO project.
#include <Arduino.h>
// Attempt to support the standard Arduino library for the Portenta Vision Shield HM01B0.
// Make sure PlatformIO installs arduino-libraries/Arduino_HM01B0 (see platformio.ini).
#include <Arduino_HM01B0.h>
// Configuration parameters
static const uint16_t IMG_W = 160;
static const uint16_t IMG_H = 160;
static const uint32_t SERIAL_BAUD = 115200;
// Background model parameters
static const float BG_LEARN_RATE = 0.02f; // 0..1, higher learns background faster
static const uint8_t DIFF_THRESHOLD = 25; // grayscale difference threshold
static const uint16_t MIN_BLOB_AREA = 80; // adjust after validation; depends on scene/scale
static const uint8_t MORPH_ITER = 1; // one iteration of 3x3 dilate followed by erode
// Frame buffers
static uint8_t frame[IMG_W * IMG_H];
static uint8_t fgMask[IMG_W * IMG_H]; // 0 or 255
static uint8_t tmpMask[IMG_W * IMG_H]; // scratch for morphology
static float background[IMG_W * IMG_H]; // running-average background
// Camera object
HM01B0 himax;
// Utilities
static inline uint32_t idx(uint16_t x, uint16_t y) { return y * IMG_W + x; }
// Simple 3x3 dilation
static void dilate3x3(const uint8_t* in, uint8_t* out)
{
for (uint16_t y = 0; y < IMG_H; ++y) {
for (uint16_t x = 0; x < IMG_W; ++x) {
uint8_t m = 0;
for (int dy = -1; dy <= 1; ++dy) {
int yy = (int)y + dy;
if (yy < 0 || yy >= (int)IMG_H) continue;
for (int dx = -1; dx <= 1; ++dx) {
int xx = (int)x + dx;
if (xx < 0 || xx >= (int)IMG_W) continue;
m = (in[idx(xx, yy)] > m) ? in[idx(xx, yy)] : m;
if (m == 255) break; // early out
}
}
out[idx(x, y)] = m;
}
}
}
// Simple 3x3 erosion
static void erode3x3(const uint8_t* in, uint8_t* out)
{
for (uint16_t y = 0; y < IMG_H; ++y) {
for (uint16_t x = 0; x < IMG_W; ++x) {
uint8_t m = 255;
for (int dy = -1; dy <= 1; ++dy) {
int yy = (int)y + dy;
if (yy < 0 || yy >= (int)IMG_H) { m = 0; break; }
for (int dx = -1; dx <= 1; ++dx) {
int xx = (int)x + dx;
if (xx < 0 || xx >= (int)IMG_W) { m = 0; break; }
uint8_t v = in[idx(xx, yy)];
if (v < m) m = v;
}
if (m == 0) break; // early out
}
out[idx(x, y)] = m;
}
}
}
// Two-pass connected components labeling (4-connectivity)
static uint16_t connectedComponents(const uint8_t* binary, uint16_t* labels, uint16_t maxLabels)
{
// Very small union-find for labeling; memory-constrained but OK for 160x160
static uint16_t parent[IMG_W * IMG_H / 2]; // upper bound on labels; conservative
uint16_t nextLabel = 1;
// Initialize labels to 0
for (uint32_t i = 0; i < (uint32_t)IMG_W * IMG_H; ++i) labels[i] = 0;
auto uf_find = [&](uint16_t a) {
while (parent[a] != a) {
parent[a] = parent[parent[a]];
a = parent[a];
}
return a;
};
auto uf_union = [&](uint16_t a, uint16_t b) {
a = uf_find(a);
b = uf_find(b);
if (a < b) parent[b] = a;
else if (b < a) parent[a] = b;
};
// Pass 1: provisional labels and equivalences
for (uint16_t y = 0; y < IMG_H; ++y) {
for (uint16_t x = 0; x < IMG_W; ++x) {
if (binary[idx(x, y)] == 0) continue; // background
uint16_t up = (y > 0) ? labels[idx(x, y-1)] : 0;
uint16_t left = (x > 0) ? labels[idx(x-1, y)] : 0;
uint16_t label = 0;
if (up == 0 && left == 0) {
// New label
if (nextLabel >= maxLabels) continue; // out of labels, silently ignore
label = nextLabel;
parent[label] = label;
nextLabel++;
} else if (up != 0 && left == 0) {
label = up;
} else if (up == 0 && left != 0) {
label = left;
} else { // both non-zero
label = (up < left) ? up : left;
if (up != left) uf_union(up, left);
}
labels[idx(x, y)] = label;
}
}
// Pass 2: resolve equivalences
for (uint16_t y = 0; y < IMG_H; ++y) {
for (uint16_t x = 0; x < IMG_W; ++x) {
uint16_t l = labels[idx(x, y)];
if (l) labels[idx(x, y)] = uf_find(l);
}
}
// Compaction: relabel roots to a dense 1..N range so callers can use
// labels directly as indices into per-blob arrays.
static const uint16_t MAX_ROOT = 4096; // must match maxLabels passed by the caller
static uint16_t mapRoot[MAX_ROOT + 1]; // root label -> compact label
for (uint16_t i = 0; i <= MAX_ROOT; ++i) mapRoot[i] = 0;
uint16_t nLabels = 0;
for (uint32_t i = 0; i < (uint32_t)IMG_W * IMG_H; ++i) {
uint16_t root = labels[i];
if (root == 0 || root > MAX_ROOT) continue;
if (mapRoot[root] == 0) {
nLabels++;
mapRoot[root] = nLabels;
}
labels[i] = mapRoot[root];
}
return nLabels;
}
static void initBackground(const uint8_t* img)
{
for (uint32_t i = 0; i < (uint32_t)IMG_W * IMG_H; ++i) {
background[i] = (float)img[i];
}
}
static void updateForegroundAndBackground(const uint8_t* img, uint8_t* mask)
{
for (uint32_t i = 0; i < (uint32_t)IMG_W * IMG_H; ++i) {
float bg = background[i];
float v = (float)img[i];
float diff = fabsf(v - bg);
mask[i] = (diff >= DIFF_THRESHOLD) ? 255 : 0;
// Update running average background
background[i] = (1.0f - BG_LEARN_RATE) * bg + BG_LEARN_RATE * v;
}
}
static uint16_t countBlobs(uint8_t* mask, uint16_t minArea)
{
// Morphology: dilate then erode (a closing) to fill small gaps.
// Ping-pong between mask and tmpMask so a pass never reads and writes
// the same buffer: in-place 3x3 morphology would corrupt its own input
// whenever MORPH_ITER > 1.
uint8_t* src = mask;
uint8_t* dst = tmpMask;
for (uint8_t i = 0; i < MORPH_ITER; ++i) {
dilate3x3(src, dst);
uint8_t* t = src; src = dst; dst = t;
}
for (uint8_t i = 0; i < MORPH_ITER; ++i) {
erode3x3(src, dst);
uint8_t* t = src; src = dst; dst = t;
}
// Ensure the final result ends up back in mask
if (src != mask) memcpy(mask, src, (size_t)IMG_W * IMG_H);
// Connected components
static uint16_t labels[IMG_W * IMG_H];
uint16_t nLabels = connectedComponents(mask, labels, 4096);
// Compute areas
static uint16_t areas[4097]; // label -> area
for (uint16_t i = 0; i <= 4096; ++i) areas[i] = 0;
for (uint32_t i = 0; i < (uint32_t)IMG_W * IMG_H; ++i) {
uint16_t l = labels[i];
if (l) areas[l]++;
}
// Count blobs above area threshold
uint16_t count = 0;
for (uint16_t l = 1; l <= nLabels; ++l) {
if (areas[l] >= minArea) count++;
}
return count;
}
void setup()
{
Serial.begin(SERIAL_BAUD);
while (!Serial && millis() < 3000) { /* wait for host */ }
// Initialize camera
if (!himax.begin()) {
Serial.println("ERROR: HM01B0 begin() failed");
while (1) { delay(1000); }
}
// Try to set resolution to 160x160 if available
// Many HM01B0 drivers offer discrete modes; if your library names differ, adjust here.
if (!himax.setResolution(HM01B0::RESOLUTION_160X160)) {
Serial.println("WARN: 160x160 resolution not supported by driver; trying default.");
}
// Optional: set frame rate if your library supports it
// himax.setFrameRate(HM01B0::FPS_30);
// Prime background with initial frames
Serial.println("Priming background model...");
for (int i = 0; i < 5; ++i) {
int bytes = himax.readFrame(frame);
if (bytes <= 0) {
Serial.println("ERROR: Failed to read frame during priming");
while (1) { delay(1000); }
}
delay(50);
}
// One more frame to initialize background array
if (himax.readFrame(frame) <= 0) {
Serial.println("ERROR: Failed to read frame for background init");
while (1) { delay(1000); }
}
initBackground(frame);
Serial.println("Ready. Streaming people counts...");
Serial.println("CSV header: ms,count,fg_pixels,proc_ms");
}
void loop()
{
uint32_t t0 = millis();
int bytes = himax.readFrame(frame);
if (bytes <= 0) {
Serial.println("ERROR: readFrame failed");
delay(50);
return;
}
// Foreground mask and background update
updateForegroundAndBackground(frame, fgMask);
// Count number of foreground pixels (for diagnostics)
uint32_t fgPixels = 0;
for (uint32_t i = 0; i < (uint32_t)IMG_W * IMG_H; ++i) {
if (fgMask[i]) fgPixels++;
}
// Blob counting
uint16_t peopleCount = countBlobs(fgMask, MIN_BLOB_AREA);
uint32_t t1 = millis();
uint32_t procMs = (t1 >= t0) ? (t1 - t0) : 0;
// Stream as CSV: timestamp, people_count, fg_pixels, processing_ms
Serial.print(millis());
Serial.print(",");
Serial.print(peopleCount);
Serial.print(",");
Serial.print(fgPixels);
Serial.print(",");
Serial.println(procMs);
// Target ~10–15 FPS depending on processing time and scene
// Adjust delay as needed. If procMs is large, you can set delay(0).
delay(20);
}
Notes:
– The class/method names assume Arduino’s Arduino_HM01B0 library for the Portenta Vision Shield. If your installed library uses slightly different names (e.g., grab() instead of readFrame() or different resolution enum), adjust those calls. The rest of the pipeline (background subtraction, morphology, CCL) is portable.
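To cross-check the on-device count, you can run the same kind of binary mask through a host-side reference counter. Below is a minimal Python sketch using BFS flood fill with the same 4-connectivity and minimum-area rule as the firmware (the function name and test mask are illustrative, not part of any library):

```python
from collections import deque

def count_blobs(mask, w, h, min_area=1):
    """Count 4-connected foreground blobs with area >= min_area.
    mask is a flat list of 0/255 values, row-major, w*h long."""
    seen = [False] * (w * h)
    count = 0
    for start in range(w * h):
        if mask[start] == 0 or seen[start]:
            continue
        # BFS flood fill from this seed pixel
        area, q = 0, deque([start])
        seen[start] = True
        while q:
            i = q.popleft()
            area += 1
            x, y = i % w, i // w
            for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
                if 0 <= nx < w and 0 <= ny < h:
                    j = ny * w + nx
                    if mask[j] and not seen[j]:
                        seen[j] = True
                        q.append(j)
        if area >= min_area:
            count += 1
    return count

# Two separated 2x2 blobs in a 6x4 mask -> 2 blobs
m = [0] * 24
for i in (0, 1, 6, 7):      # blob A (top-left corner)
    m[i] = 255
for i in (10, 11, 16, 17):  # blob B (right side)
    m[i] = 255
print(count_blobs(m, 6, 4))  # -> 2
```

Feed a captured or synthesized mask to both implementations and the counts should match for any MIN_BLOB_AREA.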
Build/Flash/Run Commands (PlatformIO)
We will target the M7 core of the STM32H747 on Portenta H7.
1) Initialize a new PlatformIO project for Portenta H7, M7 core:
pio project init --board portenta_h7_m7
2) Edit platformio.ini to match the following (replace the entire file):
[env:portenta_h7_m7]
platform = ststm32
board = portenta_h7_m7
framework = arduino
; Serial monitor
monitor_speed = 115200
; Portenta H7 uploads over DFU (dfu-util). If an upload fails, double-tap
; the reset button to enter the bootloader (the green LED pulses slowly).
upload_protocol = dfu
; Library dependencies
lib_deps =
arduino-libraries/Arduino_HM01B0 @ ^1.0.3
; Optional: faster build output
build_flags =
-O2
3) Place the firmware in src/main.cpp (from the Full Code section).
4) Build:
pio run
5) Put the board in bootloader mode (only if upload fails automatically):
– Double-press the reset button quickly. The green LED will pulse slowly, indicating the DFU bootloader is active.
– Then run upload:
pio run -t upload
6) Open the serial monitor:
pio device monitor -b 115200
You should see CSV lines like:
ms,count,fg_pixels,proc_ms
1532,0,412,9
1554,1,2289,10
1576,1,2305,10
...
If you do not see output, press the reset button once while the serial monitor is open.
Step-by-step Validation
Follow these steps to validate the people counting logic in increasingly complex scenarios. Keep the device camera steady (use a stand) and ensure consistent lighting.
1) Baseline (empty scene)
– Aim the camera at a static background (e.g., a wall or an empty corridor).
– Observe serial output for 5–10 seconds.
– Expected:
– count should settle at 0.
– fg_pixels near 0 except for minor noise (hundreds at most).
– proc_ms typically 7–20 ms depending on host noise, power, and scene.
2) Single person enters
– Have one person walk into the field of view and stop.
– Expected:
– fg_pixels spikes as the person enters.
– After morphology and blob analysis, count should go to 1 once they are stationary.
– Slight lag is normal as the background model and morphology stabilize.
3) Two people, well separated
– Add a second person a clear distance away from the first (avoid overlap).
– Expected:
– count should increase to 2.
– If not, increase separation or adjust MIN_BLOB_AREA down slightly if your people appear small in frame.
4) Partial occlusion
– Have two people stand closer so their silhouettes overlap slightly.
– Expected:
– count may drop to 1 due to merged blobs; this is a known limitation without advanced segmentation.
– To mitigate, angle the camera to minimize overlaps or move farther away to reduce merging.
5) Motion robustness
– Have one person walk across the frame.
– Expected:
– During motion, count may fluctuate 0↔1 transiently; when they stop, it should stabilize at 1.
– If flicker is heavy, slightly increase MORPH_ITER or lower DIFF_THRESHOLD to keep the moving silhouette more coherent.
6) Lighting changes
– Turn a light on/off or open a window.
– Expected:
– Brief fg_pixels spike but count should return to baseline.
– If slow drift causes false positives, increase BG_LEARN_RATE so background adapts faster.
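The running-average update gives the background an exponential memory, so a sudden global brightness step of D gray levels fades out of the difference image geometrically: after n frames the residual is D*(1-a)^n, where a is BG_LEARN_RATE. A quick sketch of how long a lighting change keeps producing false foreground (the 60-level step is an illustrative number, not a measurement):

```python
import math

def frames_to_absorb(step, threshold, learn_rate):
    """Frames until a global step of `step` gray levels decays below
    `threshold` under background = (1 - a)*background + a*value."""
    if step <= threshold:
        return 0
    return math.ceil(math.log(threshold / step) / math.log(1.0 - learn_rate))

print(frames_to_absorb(60, 25, 0.02))  # default rate: ~44 frames of false foreground
print(frames_to_absorb(60, 25, 0.10))  # faster learning absorbs it in ~9 frames
```

At roughly 15 FPS, 44 frames is about three seconds of spurious fg_pixels, which is why raising BG_LEARN_RATE helps when lighting changes often; the trade-off is that a person standing still is absorbed into the background faster too.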
7) Log capture for offline inspection (optional)
– You can pipe serial to a CSV file for later plotting/analysis:
– Linux/macOS:
pio device monitor -b 115200 --raw > people_count_log.csv
– Windows PowerShell:
pio device monitor -b 115200 --raw | Tee-Object -FilePath people_count_log.csv
Optional host script to parse and print rolling counts (pass your serial port as the first argument, e.g. COM5 on Windows or /dev/ttyACM0 on Linux; requires pyserial):
import sys, serial
from collections import deque

port = sys.argv[1] if len(sys.argv) > 1 else "/dev/ttyACM0"
ser = serial.Serial(port, 115200, timeout=1)
buf = deque(maxlen=10)  # rolling window of the last 10 counts
print("Connected to", port)
while True:
    line = ser.readline().decode(errors='ignore').strip()
    if not line or line.startswith("ms,"):
        continue  # skip empty lines and the CSV header
    try:
        ms, count, fg, proc = line.split(",")
        c = int(count)
        buf.append(c)
        avg = sum(buf) / len(buf)
        print(f"t={ms} ms count={c} avg10={avg:.2f}")
    except Exception:
        pass  # ignore malformed lines
Troubleshooting
- No serial output
  - Ensure you have the right port:
    - Windows: check Device Manager under “Ports (COM & LPT)” for “USB Serial Device (COMx)” or “Arduino Portenta H7”.
    - macOS: try /dev/tty.usbmodem*
    - Linux: try /dev/ttyACM0 or /dev/ttyACM1
  - Use pio device list to enumerate serial ports.
  - Press reset once while the monitor is open.
- Upload fails
  - Double-press reset to enter the DFU bootloader (the green LED pulses slowly), then run pio run -t upload again.
  - Try a different USB-C cable and port. Avoid USB hubs when possible.
- Camera readFrame() errors
  - Reseat the Vision Shield and ensure the connectors are fully aligned and pressed.
  - Power-cycle the board.
  - Confirm the Arduino_HM01B0 library is installed.
  - Reduce resolution if supported; otherwise keep the default and update buffer dimensions accordingly.
- Counts always 0
  - Reduce DIFF_THRESHOLD from 25 to 15–20.
  - Increase ambient light or reduce backlight.
  - Verify that fg_pixels changes when you move: if fg_pixels stays near zero, the threshold is too high.
- Counts unstable (flicker)
  - Increase MORPH_ITER from 1 to 2 (costs CPU).
  - Raise MIN_BLOB_AREA to ignore small noise.
  - Decrease BG_LEARN_RATE if the background adapts too quickly and erodes moving silhouettes.
- Multiple people merge into one blob
  - Camera angle: elevate slightly to separate people’s silhouettes.
  - Reduce MIN_BLOB_AREA.
  - Increase resolution (if RAM allows) and adjust buffer sizes and morphology accordingly.
- Performance issues (proc_ms too high)
  - Reduce resolution to 128×128 (if the driver supports it) to cut compute and memory.
  - Decrease MORPH_ITER.
  - Optimize thresholds for your lighting to avoid heavy noise.
Improvements
- Bidirectional line counting
  - Define a virtual line. Track blob centroids across frames and increment “in” or “out” counts as centroids cross the line. You can keep a small history of blob centroids and match by nearest neighbor.
- Smarter segmentation
  - Replace background subtraction with a tiny neural model (e.g., a 96×96 person detector) using TensorFlow Lite for Microcontrollers. Run bounding-box detection and count boxes above a confidence threshold. This is more robust to lighting changes and merged silhouettes but adds flash/RAM overhead.
- Adaptive thresholds
  - Compute an Otsu-like threshold per frame, or maintain a running variance of the background to adapt DIFF_THRESHOLD dynamically.
- Region of interest (ROI)
  - Process only the central area or a corridor zone to reduce both compute and false positives.
- Dual-core partitioning
  - Offload lower-rate background-model maintenance to the M4 core while the M7 handles the image pipeline at a fixed rate. Requires inter-core communication primitives in the Portenta environment.
- Telemetry/export
  - Publish counts over Ethernet/LoRa (depending on shield variant) to a backend (MQTT/HTTP). Use batching to limit bandwidth.
- On-device logging
  - Maintain a small ring buffer of counts and timestamps in RAM or external storage to support offline audits.
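The bidirectional line-counting idea can be sketched on the host before porting it to the firmware. Below is a minimal Python example (function names and parameters are illustrative): blob centroids are matched frame-to-frame by greedy nearest-neighbor search, and a crossing of a virtual horizontal line increments an “in” or “out” counter depending on direction.

```python
def update_tracks(tracks, centroids, line_y, max_dist=20.0):
    """Greedy nearest-neighbor matching of new centroids to previous ones.
    tracks: list of (x, y) centroids from the previous frame.
    Returns (new_tracks, ins, outs) where ins/outs count line crossings."""
    ins = outs = 0
    new_tracks = []
    prev = list(tracks)
    for cx, cy in centroids:
        # Find the closest unmatched previous centroid within max_dist
        best, best_d2 = None, max_dist * max_dist
        for i, (px, py) in enumerate(prev):
            d2 = (cx - px) ** 2 + (cy - py) ** 2
            if d2 < best_d2:
                best, best_d2 = i, d2
        if best is not None:
            px, py = prev.pop(best)   # consume the match
            if py < line_y <= cy:     # moved downward across the line
                ins += 1
            elif cy < line_y <= py:   # moved upward across the line
                outs += 1
        new_tracks.append((cx, cy))   # unmatched centroids start new tracks
    return new_tracks, ins, outs

# One blob walking downward across a virtual line at y = 80:
tracks = []
total_in = total_out = 0
for frame_centroids in ([(50, 70)], [(52, 78)], [(53, 85)], [(55, 95)]):
    tracks, i, o = update_tracks(tracks, frame_centroids, line_y=80)
    total_in += i
    total_out += o
print(total_in, total_out)  # -> 1 0
```

On the firmware side, centroids can be accumulated during the area pass of countBlobs (sum of x and y per label divided by area), so the tracker adds little extra cost.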
Final Checklist
- Hardware
  - Arduino Portenta H7 stacked with Portenta Vision Shield (Himax HM01B0)
  - Stable USB-C data connection to host
  - Adequate lighting in the test environment
- Software
  - PlatformIO Core installed and accessible in your shell
  - Project initialized with board = portenta_h7_m7
  - platformio.ini includes the upload protocol and the Arduino_HM01B0 library dependency
  - src/main.cpp added with the full firmware code
- Build/Flash
  - pio run completes without errors
  - Upload via pio run -t upload (use double-tap reset if needed)
- Run/Validate
  - Serial monitor at 115200 bps shows CSV: ms,count,fg_pixels,proc_ms
  - Empty scene => count ≈ 0
  - One person => stable count ≈ 1
  - Two separated people => stable count ≈ 2
  - Adjust DIFF_THRESHOLD, MIN_BLOB_AREA, MORPH_ITER, and BG_LEARN_RATE as needed
By following this guide, you achieve a functional, real-time camera-edge people counter running entirely on the Arduino Portenta H7 + Portenta Vision Shield (Himax HM01B0), built and deployed with PlatformIO, and validated step by step for reliability in your specific environment.