Follow-up: adding Ollama support to my open-source cursor-aware AI app – looking for beta testers with vision-capable local models
Disclosure: Some links in this article are affiliate links. AI Maestro may earn a commission if you make a purchase, at no…

Follow-up: adding Ollama support to my open-source cursor-aware AI app – looking for beta testers with vision-capable local models
I’ve added support for Ollama as a first-class built-in provider in the upcoming v1.2.0 release of AIPointer. This implementation now supports:
- Auto-detection on localhost:11434
- Model dropdown populated from /api/tags
- Vision + text input pipeline (region screenshot routes to vision model)
- Tool calling for AIPointer’s 10 built-in tools (fetch_url, open_url, search_web, play_music, set_volume, copy_to_clipboard, read_clipboard, launch_app, save_document, reveal_in_finder)
- Per-model timeout (uncapped option for large models on slower hardware)
- Same config UX as the cloud providers — just point it at Ollama, pick model, done
I’ve received helpful feedback from this community regarding fast vision-capable local models. I’m now implementing support for Ollama and will need beta testers to help with testing.
What We Need From Beta Testers
- M-series Mac (M1/M2/M3/M4, Pro/Max/Ultra) – measuring TTFT against Gemini 2+3 Flash cloud baseline
- RTX 3090, 4090, or 5090 on Windows or Linux – same baseline
- AMD GPU on Linux (ROCm) – would love to know if this works at all
- 16GB-class VRAM cards – checking what’s the realistic model ceiling
- Mac mini M4 or M4 Pro – fastest consumer Apple Silicon, want to see TTFT
To participate in the beta testing, please:
- Install AIPointer (signed + notarized on Mac, NSIS on Windows, AppImage on Linux)
- Point it at your local Ollama, pick a vision model (Qwen2.5-VL, MiniCPM-V, Llama 3.2 Vision, Pixtral, whatever you already have running)
- Use it for 30-60 minutes of normal daily tasks – screenshots, region queries, tool calls
- Send back: TTFT numbers, model + quant + hardware, what worked, what didn’t, any tool-call failures
I’ll fold the feedback into the v1.2.0 release notes and credit testers/contributors if you want. If we find that one model + one inference setup consistently delivers sub-2s TTFT with reliable tool calls on consumer hardware, that becomes the recommended default in onboarding.
This is not meant to compete with any other systems; I’m building this to provide a local-inference option for people in this community. If you’re interested in participating or need more information, please let me know via DM.
Originally published at reddit.com. Curated by AI Maestro.
Stay ahead of AI. Get the most important stories delivered to your inbox — no spam, no noise.


![Would a new result in pre-print be considered by reviewers? [D]](https://ai-maestro.online/wp-content/uploads/2026/05/would-a-new-result-in-pre-print-be-considered-by-reviewers-d-768x768.jpg)
