A Hands-On Coding Tutorial on Qualcomm AI Hub Models for Classification, Object Detection, and Hardware-Aware Deployment

For makers and artists working with edge hardware, the ecosystem has matured enough to move beyond theoretical benchmarks into practical deployment. The latest iteration of Qualcomm AI Hub Models offers a streamlined path from model selection to hardware-aware execution, removing the friction often associated with porting PyTorch networks to mobile silicon. This guide walks through a complete workflow: setting up the environment, resolving input tensor format mismatches, running local inference, and finally compiling models for on-device execution on real Qualcomm hardware.

Environment Setup and Tensor Normalisation

The workflow starts from a Python environment in Google Colab. We install the qai_hub_models package, create a dedicated output directory, and disable gradient tracking, since we only run inference. A key helper is the to_nchw() function. PyTorch vision models generally expect inputs in channel-first (NCHW) layout, whereas image libraries such as PIL and NumPy typically produce channel-last (HWC or NHWC) arrays. The helper converts its input to a float32 tensor, adds a batch dimension to 3-D inputs, and permutes the axes to NCHW when the last axis has three channels and the second does not. Note that it only reorders the layout; it does not rescale or normalise pixel values.

import subprocess, sys, os, glob, textwrap, traceback
import numpy as np, torch
from PIL import Image
import matplotlib.pyplot as plt
def pip_install(*pkgs):
   subprocess.run([sys.executable, "-m", "pip", "install", "-q", *pkgs], check=True)
pip_install("qai_hub_models")
OUT_DIR = "/content/qaihm_out"; os.makedirs(OUT_DIR, exist_ok=True)
torch.set_grad_enabled(False)
def to_nchw(value):
   arr = value[0] if isinstance(value, (list, tuple)) else value
   t = torch.from_numpy(np.asarray(arr, dtype=np.float32))
   if t.ndim == 3:
       t = t.unsqueeze(0)
   if t.ndim == 4 and t.shape[1] != 3 and t.shape[-1] == 3:
       t = t.permute(0, 3, 1, 2).contiguous()
   return t
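
The layout check inside to_nchw() can be easier to follow when it is reduced to shapes alone. The following is a minimal sketch in plain Python, not part of the original notebook; the name nchw_shape is hypothetical, and it only predicts the output shape rather than moving any data.

```python
def nchw_shape(shape):
    """Return the shape that the to_nchw() logic would produce for `shape`."""
    shape = tuple(shape)
    if len(shape) == 3:
        # A single image: add a batch axis of size 1 at the front.
        shape = (1,) + shape
    # Treat the input as channel-last only when the last axis has 3 entries
    # and the second axis does not, mirroring the condition in to_nchw().
    if len(shape) == 4 and shape[1] != 3 and shape[-1] == 3:
        n, h, w, c = shape
        shape = (n, c, h, w)  # same reordering as permute(0, 3, 1, 2)
    return shape


print(nchw_shape((224, 224, 3)))      # HWC image
print(nchw_shape((1, 3, 224, 224)))   # already NCHW
print(nchw_shape((1, 224, 224, 1)))   # single-channel NHWC
```

The first two calls both yield (1, 3, 224, 224). The third is returned unchanged, which shows the limit of the heuristic: it recognises channel-last data only when there are exactly three channels, so greyscale or RGBA inputs would need explicit handling.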

Model Discovery and Classification

Once the environment is ready, we explore the repository to identify available pretrained models. The script lists the first forty packages to demonstrate the breadth of options, from lightweight classifiers to complex detection networks. We focus on MobileNet-V2, a compact architecture ideal for edge devices. By loading the model and inspecting its input specification, we confirm the required tensor dimensions and data types. To make raw logits human-readable, we map them against ImageNet class labels and create a function to extract the top five predictions with their confidence scores.

import pkgutil, qai_hub_models.models as _m
model_ids = sorted(n for _, n, p in pkgutil.iter_modules(_m.__path__)
                  if p and not n.startswith("_"))
print(f">>> {len(model_ids)} models available. First 40:\n")
print(textwrap.fill(", ".join(model_ids[:40]), 100), "\n")
from qai_hub_models.models.mobilenet_v2 import Model as MobileNetV2
model = MobileNetV2.from_pretrained().eval()
spec = model.get_input_spec()
input_name = list(spec.keys())[0]
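# Assumption: depending on the package version, a spec entry is either a
# (shape, dtype) tuple or an object exposing .shape and .dtype; handle both.
entry = spec[input_name]
in_shape, in_dtype = (entry.shape, entry.dtype) if hasattr(entry, "shape") else entry
print(">>> Input:", input_name, in_shape, in_dtype)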
from torchvision.models import MobileNet_V2_Weights
IMAGENET_CLASSES = MobileNet_V2_Weights.IMAGENET1K_V1.meta["categories"]
def top5(logits):
   if logits.ndim == 1: logits = logits.unsqueeze(0)
   probs = torch.softmax(logits, dim=1)[0]
   conf, idx = probs.topk(5)
   return [(IMAGENET_CLASSES[i], float(c)) for c, i in zip(conf, idx)]
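
To make the logits-to-labels step concrete, here is a small sketch of the same softmax and top-k logic written with the standard library only. It is an illustration under assumed toy values (three made-up labels and logits), not the notebook's own code; the function names softmax and top_k are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(labels[i], probs[i]) for i in order]


toy_labels = ["cat", "dog", "fox"]  # toy labels, not ImageNet classes
for label, p in top_k([1.0, 3.0, 0.5], toy_labels, k=2):
    print(f"    {p:6.2%}  {label}")
```

Because softmax is monotonic, the ranking is the same as ranking the raw logits; the conversion matters only for reporting the scores as percentages.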

Running Inference on Sample and Real Data

We validate the setup by running inference on two inputs. First, we use the model’s built-in sample input, passing it through to_nchw() so that the tensor is in NCHW layout. Next, we fetch a real photograph from the PyTorch Hub repository. The image goes through standard preprocessing (resizing the shorter side to 256 pixels, centre cropping to 224×224, and converting to a tensor) before being fed into the network. The photo is then displayed with the top predicted label and its confidence as the title, which gives a quick sanity check of the predictions.

sample = model.sample_inputs()
x = to_nchw(sample[input_name])
print(">>> fed tensor shape:", tuple(x.shape))
print("\n>>> Top-5 for the built-in sample input:")
for label, conf in top5(model(x)):
   print(f"    {conf:6.2%}  {label}")
from torchvision import transforms
preprocess = transforms.Compose([
   transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
])
img = None
try:
   import urllib.request
   p = os.path.join(OUT_DIR, "input.jpg")
   urllib.request.urlretrieve(
       "https://raw.githubusercontent.com/pytorch/hub/master/images/dog.jpg", p)
   img = Image.open(p).convert("RGB")
except Exception as e:
   print(">>> photo download skipped:", e)
if img is not None:
   preds = top5(model(preprocess(img).unsqueeze(0)))
   print("\n>>> Top-5 for the downloaded photo:")
   for label, conf in preds: print(f"    {conf:6.2%}  {label}")
   plt.figure(figsize=(5,5)); plt.imshow(img); plt.axis("off")
   plt.title(f"{preds[0][0]}  ({preds[0][1]:.1%})"); plt.show()
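
The geometry of the Resize(256) and CenterCrop(224) steps can be worked through by hand. The sketch below is intended to mirror torchvision's documented behaviour (shorter side matched to the target, aspect ratio preserved, crop taken from the centre); the helper names and the 640×480 example size are assumptions for illustration, and exact rounding may differ between library versions.

```python
def resized_size(width, height, short_side=256):
    """Size after scaling so that the shorter side equals `short_side`."""
    if width <= height:
        return short_side, int(short_side * height / width)
    return int(short_side * width / height), short_side


def center_crop_box(width, height, crop=224):
    """(left, top, right, bottom) of a centred square crop."""
    left = round((width - crop) / 2)
    top = round((height - crop) / 2)
    return left, top, left + crop, top + crop


w, h = resized_size(640, 480)    # assumed landscape photo size
print((w, h))                    # (341, 256)
print(center_crop_box(w, h))     # (58, 16, 282, 240)
```

For a landscape image the crop discards more from the left and right than from the top and bottom, so objects near the horizontal edges of the frame may not reach the classifier.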

Object Detection with YOLOv7

To show that the workflow extends beyond classification, we move on to object detection with YOLOv7. A helper function runs the official Qualcomm AI Hub Models demos from the command line and prints the last 25 lines of their output. After running the MobileNet-V2 demo, we install the YOLOv7 extras and trigger its detection pipeline. The script then scans the output directory for generated images and displays the most recently modified one, which should show the detected bounding boxes.

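def run_demo(module, extra=None, timeout=900):
   # Run an official demo as a subprocess and print the tail of its output.
   cmd = [sys.executable, "-m", module, "--eval-mode", "fp",
          "--output-dir", OUT_DIR] + (extra or [])
   print(f"\n>>> {' '.join(cmd)}")
   try:
       r = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
       print("\n".join((r.stdout + r.stderr).strip().splitlines()[-25:]))
   except Exception as e:
       print(">>> demo skipped:", e)
run_demo("qai_hub_models.models.mobilenet_v2.demo")
try:
   pip_install("qai_hub_models[yolov7]")
   run_demo("qai_hub_models.models.yolov7.demo")
   # Pick the most recently written image in the output directory.
   imgs = sorted(glob.glob(OUT_DIR + "/*.png") + glob.glob(OUT_DIR + "/*.jpg"),
                 key=os.path.getmtime)
   if imgs:
       plt.figure(figsize=(9,9)); plt.imshow(Image.open(imgs[-1]).convert("RGB"))
       # The source listing is cut off after "plt" at this point. The lines below
       # are an assumed minimal completion so that the block parses and runs.
       plt.axis("off"); plt.show()
except Exception as e:
   print(">>> YOLOv7 demo skipped:", e)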
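
The subprocess pattern used by run_demo() can be tried in isolation, without installing any model packages. The following is a minimal sketch using only the standard library; run_and_tail is a hypothetical name, and the child command is a throwaway Python one-liner chosen for illustration. Unlike run_demo(), this variant also reports whether the command exited successfully.

```python
import subprocess
import sys


def run_and_tail(cmd, tail=25, timeout=900):
    """Run `cmd`; return (ok, last `tail` lines of stdout followed by stderr)."""
    try:
        r = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, [f"timed out after {timeout}s"]
    except OSError as e:  # for example, the executable was not found
        return False, [str(e)]
    lines = (r.stdout + r.stderr).strip().splitlines()
    return r.returncode == 0, lines[-tail:]


ok, lines = run_and_tail(
    [sys.executable, "-c", "for i in range(30): print(i)"], tail=3)
print(ok, lines)  # True ['27', '28', '29']
```

Returning the exit status separately is a deliberate choice: a demo that crashes still produces output, so printing the tail alone can make a failed run look like a successful one.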
AI Maestro is an independent British AI publication. We test what we recommend, and we write it the way we would say it.
