ai.Local

Your own private AI assistant on your local machine. Attach personal documents, connect to tools, and create custom skills to automate tasks and generate content.

Copy and paste the command below into your terminal to get started:

$ curl -fsSL https://smartloop.ai/install | bash
[gpt.local terminal demo]

Features

Attach Personal Documents

Upload PDFs, notes, and files. Your assistant learns from them and answers questions grounded in your own data.
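"Grounded in your own data" generally means the assistant retrieves the most relevant passages from your files before answering. A toy, stdlib-only sketch of that retrieval step is below — it scores documents by word-overlap cosine similarity, whereas a real assistant would use learned embeddings; nothing here reflects Smartloop's actual pipeline.

```python
import math
from collections import Counter

# Toy retrieval sketch: score documents against a question by cosine
# similarity over raw word counts. Illustrative only.
def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "notes.txt": "quarterly revenue grew ten percent",
    "todo.txt": "buy groceries and call the plumber",
}
question = Counter("how much did revenue grow".split())

# Pick the document most similar to the question, then answer from it.
best = max(docs, key=lambda name: cosine(Counter(docs[name].split()), question))
print(best)  # notes.txt
```

The retrieved passage is then placed into the model's context so the answer can cite your data rather than the model's general knowledge.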

Local Small Language Model (GGML)

Run quantized open models directly on your hardware — no cloud required, no data leaves your machine.

Connect to MCP Servers

Extend capabilities by connecting to Model Context Protocol servers for tools, search, and external services.

Create Custom Skills

Define reusable prompt-based skills to automate repetitive tasks and shape how your AI responds.

Open Model, Local Agents

Spin up autonomous agents powered by open models that plan and execute tasks entirely on your device.

Your own AI assistant

Personalize the AI orchestration to fit your context, terminology, and style.

OpenAI Compatible API

Drop-in replacement for the OpenAI API — point any existing integration at your local instance.
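"Drop-in" means existing OpenAI integrations only need their base URL changed. The sketch below builds a standard chat-completions request aimed at a local instance using only the standard library; the host, port, and model name are assumptions — check the docs for the address your install actually exposes.

```python
import json
from urllib.request import Request

# Hypothetical local endpoint; the real host/port depends on your install.
BASE_URL = "http://localhost:8000/v1"

# Standard OpenAI chat-completions payload shape.
payload = {
    "model": "local-slm",  # model name is an assumption
    "messages": [{"role": "user", "content": "Summarize my notes"}],
}

req = Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer unused",  # local servers often ignore the key
    },
    method="POST",
)
# urlopen(req) would send it; any OpenAI-compatible SDK works the same
# way once its base URL points at the local instance.
print(req.full_url)  # http://localhost:8000/v1/chat/completions
```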

Documentation

Coming Soon

Smartloop Studio

An AI co-worker for your desktop. Extract information from your documents, orchestrate your workflows, and more — all from a user-friendly interface.

Smartloop Studio

FAQ

01. What is Smartloop?

A free, open-source AI assistant that extracts information from documents and generates content 100% locally using open models.

02. How does it work?

We use open models that are quantized (compressed) to run on your local devices.
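Quantization shrinks a model's weights from 16-bit floats down to, say, 4 bits each, which is what makes local inference feasible on consumer hardware. The back-of-the-envelope math can be sketched as follows — the figures are illustrative, not measurements of Smartloop's models:

```python
# Approximate memory needed just for the weights of a quantized model.
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal gigabytes

# A 3B-parameter model in 16-bit floats vs. 4-bit quantization:
print(weight_memory_gb(3, 16))  # 6.0 GB
print(weight_memory_gb(3, 4))   # 1.5 GB -> fits a 4 GB GPU with headroom
```

The trade-off is a small loss in output quality in exchange for a roughly 4x reduction in memory and correspondingly faster loading.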

03. How much does it cost?

It is free to use and open-source. There is, however, a paid plan, which we use to maintain the models and improve the framework.

04. What model do you use?

We primarily use small language models (SLMs) that can run, and be tuned, within the limits of your device. We recommend Apple Silicon Macs (M-series) or NVIDIA CUDA GPUs with at least 4 GB of GPU memory. Based on available memory, inputs (documents, attachments, skills, conversations, MCP tools, and so on) are vectorized and tuned by local agents. Our approach is to train so as to free up context, since context is not infinite when running locally.
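Because the context window is finite on local hardware, inputs have to be budgeted: only as many passages are kept as fit. A minimal sketch of that budgeting step is below — the 4-characters-per-token heuristic and the budget figure are assumptions, not Smartloop's actual accounting:

```python
# Greedily keep chunks (e.g. retrieved document passages) until the
# estimated token budget is spent. Sketch only.
def fit_to_budget(chunks: list[str], budget_tokens: int) -> list[str]:
    kept, used = [], 0
    for chunk in chunks:
        est = len(chunk) // 4 + 1  # rough 4-chars-per-token estimate
        if used + est > budget_tokens:
            break
        kept.append(chunk)
        used += est
    return kept

chunks = ["a" * 400, "b" * 400, "c" * 400]  # ~101 estimated tokens each
print(len(fit_to_budget(chunks, 250)))      # 2 -> third chunk dropped
```

A real system would rank chunks by relevance before budgeting, so the most useful passages survive the cut.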

Accelerated by

NVIDIA · Microsoft · Google for Startups