Documentation
Everything you need to know about using Bert CLI: commands, models, tips, and best practices.
📋 Contents
🚀 Getting Started
About Bert
The idea of Bert was conceived in 2022, when AI assistants were just rising to what they are today. It began with the notion of an AI model for you and only you, one that can be whatever you want it to be. When the idea first arose, Amphydia had not yet been created, and I, Matias Nisperuza, was younger and had no chance of making Bert. But times change, for the better.
Bert Identity
We frame Bert as a service fundamentally oriented around human satisfaction, not raw capability benchmarks or abstract performance metrics.
At its core, Bert is built to feel like a reliable presence: a system that users can return to consistently, trust implicitly, and engage with naturally. In practical terms, this means prioritizing clarity over cleverness, helpfulness over verbosity, and dependability over spectacle.
Bert is not positioned as an omniscient authority or a cold computational engine. Instead, it operates as a dependable companion: a system that understands context, respects user intent, and responds in a way that feels grounded, supportive, and predictable.
Bert Palette
Bert's main colors:
- Dark Sage (#598556)
- Light Beige (#F5F5DC)
- Sage (#9C9C9C)
- Olive (#808000)
Installation
# Strongly recommended: install from PyPI
pip install bert-cli
# Run Bert after installing
bert
# Alternative: clone the GitHub repo (not recommended)
git clone https://github.com/mnisperuza/bert-cli.git
cd bert-cli
# Install dependencies
pip install -e .
# Run Bert
bert
By installing Bert, you agree to the Terms of Use and Privacy Policy.
First Run
When you first run Bert, you'll see an animated banner followed by a quantization picker. Choose based on your GPU:
- ✅ INT4-FAST: Best for most users. Works with 4GB+ VRAM.
- 🔷 INT8-BALANCED: Higher quality. Needs 6GB+ VRAM.
- 🚀 FP16-HIGH-END: Best quality. Needs 8GB+ VRAM.
- 🖥️ FP32-CPU: CPU mode. Slower, but no GPU needed.
💡 Pro Tip
After the banner, Bert automatically loads the Nano model. You don't need to do anything; just start chatting!
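For example, a first session on a mid-range GPU might look like this (the INT8 choice is only an illustration; pick whatever matches your VRAM):
# Launch Bert; the quantization picker appears after the banner
bert
# On a 6GB card, INT8-BALANCED is a sensible pick from the menu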
⌨️ Commands Reference
Model Commands
| Command | Description |
|---|---|
| bert nano | Switch to Bert Nano (fastest, LiquidAI-LFM2-700M) |
| bert mini | Switch to Bert Mini (balanced, LiquidAI-LFM2-1.2B) |
| bert main | Switch to Bert Main (thinking mode, Qwen3-1.7B) |
| bert max | Switch to Bert Max (reasoning, LiquidAI-LFM2-2.6B) |
| bert coder | Switch to Bert Coder (code, Qwen2.5-Coder-1.5B) |
| bert maxcoder | Switch to Bert Max-Coder (heavy code, Qwen2.5-Coder-3B-Instruct) |
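As a sketch of a typical session, you can hop between models as your task changes (the prompts below are placeholders, not special syntax):
# Quick question? Use the fastest model
bert nano
What's the difference between a list and a tuple in Python?
# Switching to a coding task
bert coder
Write a Python function that reverses a string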
Quantization Commands
| Command | Description |
|---|---|
| bert int4 | Switch to INT4 quantization (4GB VRAM) |
| bert int8 | Switch to INT8 quantization (6GB VRAM) |
| bert fp16 | Switch to FP16 (8GB+ VRAM) |
| bert fp32 | Switch to FP32 (CPU mode) |
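For instance, if generation crashes with out-of-memory errors, stepping down one quantization level at a time is a reasonable pattern:
# FP16 too heavy for your card? Drop to INT8
bert int8
# Still running out of VRAM? INT4 is the lightest GPU option
bert int4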
System Commands
| Command | Description |
|---|---|
| /*token XXXX | Set your weekly token |
| /*tokens | Show token status and remaining count |
| /*think - question | Enable thinking mode for this query (Bert Main only, BETA) |
| /*help | Show help and all commands |
| /*status | Show current model, quant, and device info |
| /*memory | Clear conversation memory |
| /*clear | Clear screen and show banner |
| /*exit | Exit Bert CLI |
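A typical housekeeping sequence might look like this (the token value is a made-up placeholder):
# Register your weekly token
/*token BERT-XXXX-XXXX-XXXX-XXXX
# Confirm the current model, quantization, and device
/*status
# See how many tokens remain this week
/*tokens
# Wipe the conversation and start fresh
/*memory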
During Generation
| Key | Action |
|---|---|
| ESC | Stop generation immediately |
| Ctrl+C | Stop generation (interrupt) |
🤖 Model Guide
Bert comes with 6 specialized models, each optimized for different tasks.
Bert Nano Fastest
Ultra-fast responses for quick questions, brainstorming, and casual chat. Perfect for low-end GPUs.
Bert Mini
Balanced performance for everyday tasks. Good quality with reasonable speed.
Bert Main 🧠 Thinking
The flagship model with thinking capabilities. Shows its reasoning process when using /*think.
Bert Max Powerful
Advanced reasoning and complex analysis. Best for nuanced discussions and detailed explanations.
Bert Coder Code
Specialized for programming. Write, debug, and explain code across multiple languages.
Bert Max-Coder Heavy Code
For complex, multi-file projects and production-quality code. Best for professional development.
Which Model Should I Use?
- 💬 Casual chat, quick questions: Bert Nano
- 📝 Writing, summarizing, general tasks: Bert Mini or Main
- 🧠 Complex reasoning, math, analysis: Bert Main (with /*think) or Max
- 💻 Simple code, scripts, debugging: Bert Coder
- 🏗️ Large codebases, architecture: Bert Max-Coder
🧠 Thinking Mode (BETA)
Bert Main (Qwen3-1.7B) supports a special thinking mode that shows the model's reasoning process after the response is generated. We are improving this feature, so expect a major update in the coming months.
How to Use
/*think - What is the derivative of x³ + 2x²?
The response will show:
- The model's answer (streamed normally)
- A thinking box showing the reasoning process
- Token count (only counts the response, not thinking)
⚠️ Important
Thinking mode only works with Bert Main. Other models will show a warning if you try to use /*think.
When to Use Thinking
- ✅ Math problems and calculations
- ✅ Logic puzzles and reasoning tasks
- ✅ Complex questions with multiple steps
- ✅ When you want to understand HOW Bert reached an answer
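For instance, a multi-step word problem is a natural fit (the question itself is just an illustration):
# The answer streams first; the reasoning appears in a thinking box after it
/*think - A train covers 180 km in 2 hours, then 120 km in 1.5 hours. What is its average speed for the whole trip?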
📁 File References (BETA)
You can reference files directly in your queries using the @ symbol.
Bert can find, check, and read files in the current directory, although features like reviewing files or using relative paths are still in early development.
If you encounter an issue, don't hesitate to email us at mnisperuza1102@gmail.com.
Usage
# Reference a file
Check @main.py for bugs
# Multiple files
Compare @old_version.py and @new_version.py
# Relative paths
Review the code in @src/utils/helpers.js
Supported File Types
| Category | Extensions |
|---|---|
| Code | .py, .js, .ts, .java, .c, .cpp, .go, .rs, .rb, .php |
| Web | .html, .css, .jsx, .tsx, .vue, .svelte |
| Data | .json, .yaml, .yml, .xml, .csv, .toml |
| Docs | .md, .txt, .rst, .log |
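Data and docs files work the same way as code files. For example (@config.yaml here is a hypothetical file in your current directory):
# Ask about a data file just like a source file
Summarize the settings in @config.yaml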
💡 Pro Tip
File paths are often resolved relative to your current directory. Bert will show "📁 Found: filename" when it successfully reads a file.
🎟️ Token System
Bert uses a weekly token system to manage usage. Every week, you get 20,000 free tokens.
Getting a Token
- Visit the Bert CLI homepage
- Enter your email
- Receive your token (format:
BERT-XXXX-XXXX-XXXX-XXXX) - In Bert, type:
/*token YOUR-TOKEN-HERE
Token Commands
# Set your token
/*token BERT-A1B2-C3D4-E5F6-0123
# Check remaining tokens
/*tokens
How Tokens Are Counted
- 📊 Each response uses tokens based on its length
- 🧠 Thinking content does NOT count against your tokens
- 📅 Tokens reset every week (Sunday at midnight)
- 📧 One token per email per week
⚙️ Quantization Guide
Quantization reduces model size and memory usage, allowing larger models to run on smaller GPUs.
| Level | VRAM | Quality | Speed |
|---|---|---|---|
| INT4 | ~4GB | Good | Fast |
| INT8 | ~6GB | Very Good | Medium |
| FP16 | ~8GB | Excellent | Medium |
| FP32 | CPU | Best | Slow |
💡 Recommendation
Start with INT4. It offers the best balance of quality, speed, and memory usage for most users.
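As a rough, weights-only rule of thumb (a simplification that ignores activations and runtime overhead), memory scales with bytes per parameter; for a 2.6B-parameter model like Bert Max:
# Approximate weights-only footprint of a 2.6B-parameter model:
#   FP32: 4 bytes/param   -> ~10.4 GB (hence CPU mode)
#   FP16: 2 bytes/param   -> ~5.2 GB
#   INT8: 1 byte/param    -> ~2.6 GB
#   INT4: 0.5 bytes/param -> ~1.3 GB
bert int4   # step down a level whenever a model doesn't fit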
✨ Tips & Best Practices
Getting Better Responses
1. Be specific: "Write a Python function that sorts a list" is better than "Write code"
2. Provide context: Share relevant background information
3. Use the right model: Bert Coder for code, Bert Main for reasoning
4. Reference files: Use @filename instead of pasting code (see the example below)
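Putting tips 3 and 4 together (a sketch; @main.py stands in for any file of yours):
# Pick the model suited to the task...
bert coder
# ...then reference the file instead of pasting its contents
Review @main.py and point out any bugs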
Keyboard Shortcuts
- ESC: Stop generation instantly
- Up Arrow: Previous command (terminal dependent)
- Ctrl+C: Interrupt / stop
Troubleshooting
Model won't load?
Try a smaller model or lower quantization. If you're out of VRAM, use bert fp32 for CPU mode.
Slow responses?
Switch to Bert Nano for faster responses, or use INT4 quantization.
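When you're fighting memory or speed, a step-by-step fallback ladder usually works:
# 1. Lighter quantization first
bert int4
# 2. Then a smaller model
bert nano
# 3. Last resort: CPU mode (slow, but needs no GPU)
bert fp32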
Token expired?
Tokens are valid for one week. Get a new one at the homepage.
Model returning strange responses?
Let us know which model by emailing us at mnisperuza1102@gmail.com, and we will review the issue!