ai.Local
An open-source local AI assistant that learns from your documents, runs on your machine, and stays private.
Copy and paste the following into your terminal to get started:

$ curl -fsSL https://smartloop.ai/install | sh

Smartloop Studio
A native desktop experience for macOS, Windows, and Linux. Chat with your documents and run local models — all from one app.

Features
Attach Personal Documents
Upload PDFs, notes, and files. Your assistant learns from them and answers questions grounded in your own data.
Local Small Language Model (GGML)
Run quantized open models directly on your hardware — no cloud required, no data leaves your machine.
Connect to MCP Servers
Extend capabilities by connecting to Model Context Protocol servers for tools, search, and external services.
Create Custom Skills
Define reusable prompt-based skills to automate repetitive tasks and shape how your AI responds.
Open Model, Local Agents
Spin up autonomous agents powered by open models that plan and execute tasks entirely on your device.
Your Own AI Model
Fine-tune and personalise a model on your own data so it knows your context, terminology, and style.
OpenAI Compatible API
Drop-in replacement for the OpenAI API — point any existing integration at your local instance.
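Because the API is OpenAI-compatible, any existing client can be redirected by swapping the base URL. The sketch below builds a standard chat-completions request aimed at a local instance using only the Python standard library; the endpoint `http://localhost:8000/v1` and the model name `local-model` are assumptions, so check your instance's settings for the actual values.

```python
import json
from urllib import request

# Assumed local endpoint; confirm the port and path in your Smartloop settings.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(prompt: str, model: str = "local-model") -> request.Request:
    """Build an OpenAI-style chat completion request aimed at the local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarize my attached notes.")
print(req.full_url)  # http://localhost:8000/v1/chat/completions
# Sending it is one call once the local server is running:
#   with request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```

The same redirection works with the official OpenAI SDKs, which accept a base-URL override, so existing integrations need only a configuration change rather than new code.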
Documentation
Introduction
Overview of Smartloop and what it can do for you.
Getting Started
Install Smartloop and run your first local AI session.
Credentials
Configure API keys and authentication for models.
MCP Servers
Connect external tools and services via MCP.
Developers
OpenAI-compatible API reference and integration guides.
FAQ
01. What is Smartloop?
A free, open-source AI assistant that extracts information from your documents and generates content 100% locally using open models.
02. How does it work?
We use open models that are quantized (compressed) to run on your local devices.
03. How much does it cost?
It is free to use and open-source; however, there is an optional paid plan that helps us maintain the models and improve the framework.
04. What model do you use?
We primarily use small language models (SLMs) that can run, and be tuned, within the limits of your device. We recommend Apple Silicon (M-series) Macs or NVIDIA CUDA GPUs with at least 4 GB of GPU memory. Based on available memory, inputs such as documents, attachments, skills, conversations, and MCP results are vectorized and tuned by local agents. Our approach is to train on this data to free up context, since context is not infinite when running locally.