Beta v1.0.16

Smartloop Studio

Run AI agents on your own machine — connect tools, build custom skills, and get things done without sending your data anywhere.

Features

Everything you need to run AI privately

Small Language Model

Run quantized open models directly on your hardware — no cloud required, no data leaves your machine.

Connect to MCP Servers

Extend capabilities by connecting to Model Context Protocol servers for tools, search, and external services.

Create Custom Skills

Define reusable prompt-based skills to automate repetitive tasks and shape how your AI responds.

Open Model, Local Agents

Spin up autonomous agents powered by open models that plan and execute tasks entirely on your device.

Your own Agent Harness

Personalize your own AI orchestration to fit your context, terminology, and style.

OpenAI Compatible API

Drop-in replacement for the OpenAI API — point any existing integration at your local instance.

AI Agent Harness

Run AI agents on your own machine, extract information and write content for you

$curl -fsSL https://smartloop.ai/install.sh | bash
Smartloop terminal demo

Documentation

Guides to get you up and running

FAQ

01What is Smartloop?

AI orchestration framework to extract information and do tasks. It is your co-worker in your device, always on and ready to serve.

02How does it work?

We have pre-trained small language model (SLM) that can run on your device that to perform the orchestration tasks locally.

03How much does it cost?

Its free to use, however we charge a small $10/month subscription for select capability-specific models, to support our framework and infrastructure. We are constantly updating our base model, therefore at any given point in time it will be efficient enough to handle most regular tasks.

04What model do you use?

We primarily use small language models (SLMs) that can be run and tuned in the scope of your device. We recommend devices like macOS (M-series) and NVIDIA CUDA devices (at least 4GB of GPU memory). Based on available memory, documents, attachments, skills, conversations, MCP, etc. inputs are vectorized and tuned by local agents. Our approach is to train to free up context, as context is not infinite when it comes to running locally.

Accelerated by

NVIDIAMicrosoftGoogle for Startups