AI Integration Features
This document outlines the AI capabilities integrated into Virga, powered by a locally run LLM model.
Core Architecture
Virga's AI features are built around an embedded LLM engine that runs directly within the beacon. This allows for autonomous operations and intelligent analysis on the target system without constant operator intervention. The key components (sketched in code below) are:
- LLM Engine: Utilizes `go-llama.cpp` bindings to run a GGUF-formatted language model.
- In-Memory Database (MemDB): A `go-memdb` instance that stores all operational data during the beacon's lifecycle, including command results and AI interactions. This is not a persistent SQLite database.
- Task Executor: A loop within the llama engine that interprets the model's output, executes system commands, and feeds the results back into the model for further analysis.
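The sketch below is illustrative only, assuming the go-skynet `go-llama.cpp` bindings and `hashicorp/go-memdb`; the type and field names (`Engine`, `Model`, `DB`, `RunTask`) are hypothetical and do not necessarily match Virga's source.

```go
// Illustrative sketch of how the three components could be wired together.
// Names and import paths are assumptions, not Virga's actual code.
package beacon

import (
	llama "github.com/go-skynet/go-llama.cpp" // GGUF inference bindings (assumed import path)
	memdb "github.com/hashicorp/go-memdb"     // in-memory operational store
)

// Engine bundles the embedded model, the in-memory database, and the
// task-executor loop that feeds command output back into the model.
type Engine struct {
	Model *llama.LLama // loaded GGUF model
	DB    *memdb.MemDB // volatile store for command results and AI interactions
}

// RunTask would drive the autonomous loop for one operator objective:
// inference -> marker parsing -> command execution -> feedback.
func (e *Engine) RunTask(objective string, maxIterations int) error {
	// See the autonomous-loop sketch later in this document.
	return nil
}
```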
How It Works: The Autonomous Loop
The core of the AI's capability is a loop that allows the model to "think" and act:
- Prompt: The operator provides an initial prompt (e.g., "Analyze this system for security weaknesses").
- Inference: The LLM model generates a response based on the prompt.
- Action: The beacon's code scans the model's response for a special `[EXECUTE: command]` marker.
- Execution: If a marker is found, the specified `command` is executed on the target system.
- Feedback: The output of the command is fed back into the LLM model as new context.
- Iteration: The model analyzes the command output and decides on the next step, generating a new response. This loop continues until the task is complete or a set number of iterations is reached.
This process allows the beacon to perform tasks like detecting the operating system (`echo %OS%`), then running OS-specific commands (`systeminfo` or `uname -a`) to gather information autonomously.
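As a rough sketch of how this loop could be written, assuming a hypothetical `generate` callback for inference and a simplified `runCommand` helper (neither taken from Virga's code):

```go
// Minimal sketch of the autonomous loop; not Virga's actual implementation.
package beacon

import (
	"fmt"
	"os/exec"
	"regexp"
	"strings"
)

// executeMarker matches the [EXECUTE: command] marker described above.
var executeMarker = regexp.MustCompile(`\[EXECUTE:\s*([^\]]+)\]`)

// runCommand executes a shell command and returns its combined output.
// A real implant would pick cmd.exe or sh based on the detected OS.
func runCommand(command string) string {
	out, _ := exec.Command("sh", "-c", command).CombinedOutput()
	return string(out)
}

// autonomousLoop drives prompt -> inference -> execution -> feedback until
// the model stops requesting commands or maxIterations is reached.
func autonomousLoop(generate func(prompt string) string, objective string, maxIterations int) {
	context := objective
	for i := 0; i < maxIterations; i++ {
		response := generate(context)                   // Inference
		m := executeMarker.FindStringSubmatch(response) // Action: scan for the marker
		if m == nil {
			break // no command requested; task is considered complete
		}
		output := runCommand(strings.TrimSpace(m[1])) // Execution
		// Feedback: append the command and its output as new context
		context = fmt.Sprintf("%s\n[EXECUTE: %s]\nOutput:\n%s", context, m[1], output)
	}
}
```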
Key Capabilities
- Autonomous Reconnaissance: The AI can independently perform system enumeration, user analysis, and network discovery by chaining commands based on previous results.
- Adaptive Command Execution: The AI attempts to run commands appropriate for the detected operating system (Windows or Linux).
- Structured Data Collection: The model is prompted to return key information using a `[FINDING: key: value]` marker, which can be parsed for structured logging (see the parsing sketch below).
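As an illustration, a `[FINDING: key: value]` marker could be parsed along these lines; the regular expression and `Finding` type are assumptions rather than Virga's actual parser:

```go
// Illustrative parser for [FINDING: key: value] markers in a model response.
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Finding is a single key/value pair reported by the model.
type Finding struct {
	Key   string
	Value string
}

// findingMarker captures the key and value from a [FINDING: key: value] marker.
var findingMarker = regexp.MustCompile(`\[FINDING:\s*([^:\]]+):\s*([^\]]+)\]`)

// parseFindings extracts all findings from a model response.
func parseFindings(response string) []Finding {
	var findings []Finding
	for _, m := range findingMarker.FindAllStringSubmatch(response, -1) {
		findings = append(findings, Finding{
			Key:   strings.TrimSpace(m[1]),
			Value: strings.TrimSpace(m[2]),
		})
	}
	return findings
}

func main() {
	resp := "[FINDING: os: Windows 10] [FINDING: hostname: WORKSTATION-01]"
	for _, f := range parseFindings(resp) {
		fmt.Printf("%s = %s\n", f.Key, f.Value)
	}
}
```

Running this prints one `key = value` pair per finding, which can then be written to structured logs.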
Configuration
AI features are configured in your beacon configuration file (e.g., configs/beacon.yaml). This provides fine-grained control over the model's behavior.
```yaml
llama:
  enabled: true
  log_enabled: true
  model:
    context: 8192
    gpu_layers: 0
    threads: 4
    temperature: 0.7
    top_k: 40
    top_p: 0.95
    max_tokens: 2048
  prompt:
    preset: "enhanced" # Options: default, enhanced, stealth, aggressive
  autonomous:
    enabled: true
    initial_tasks:
      - type: "system_reconnaissance"
        description: "Complete system analysis and environment mapping"
      - type: "user_activity"
        description: "Monitor and analyze user behavior patterns"
      - type: "network_discovery"
        description: "Map network topology and discover connected systems"
    max_iterations: 50
    timeout_minutes: 15
    report_interval: 300
```
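For reference, a configuration like the one above could be unmarshalled into Go structs along the following lines; the struct layout is inferred from the example YAML and is not taken from Virga's source:

```go
// Hypothetical mapping of the llama configuration section onto Go structs.
package config

type LlamaConfig struct {
	Enabled    bool             `yaml:"enabled"`
	LogEnabled bool             `yaml:"log_enabled"`
	Model      ModelConfig      `yaml:"model"`
	Prompt     PromptConfig     `yaml:"prompt"`
	Autonomous AutonomousConfig `yaml:"autonomous"`
}

type ModelConfig struct {
	Context     int     `yaml:"context"`
	GPULayers   int     `yaml:"gpu_layers"`
	Threads     int     `yaml:"threads"`
	Temperature float64 `yaml:"temperature"`
	TopK        int     `yaml:"top_k"`
	TopP        float64 `yaml:"top_p"`
	MaxTokens   int     `yaml:"max_tokens"`
}

type PromptConfig struct {
	Preset string `yaml:"preset"` // default, enhanced, stealth, aggressive
}

type AutonomousConfig struct {
	Enabled        bool   `yaml:"enabled"`
	InitialTasks   []Task `yaml:"initial_tasks"`
	MaxIterations  int    `yaml:"max_iterations"`
	TimeoutMinutes int    `yaml:"timeout_minutes"`
	ReportInterval int    `yaml:"report_interval"` // seconds
}

type Task struct {
	Type        string `yaml:"type"`
	Description string `yaml:"description"`
}
```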
Using AI Features via the CLI
Once you are interacting with an AI-enabled beacon, you can use the `llama` and `memdb` commands.
`llama` command
- `llama prompt "<your objective>"`: Kicks off an autonomous task with your specified goal.
- `llama auto`: Starts the pre-defined autonomous tasks configured in `beacon.yaml`.
- `llama stop`: Stops the current AI task.
- `llama status`: Shows the current status of the AI engine.
Example:

```
virga (session)> llama prompt "Find all running processes not signed by Microsoft"
[*] Sending prompt to Llama AI...
```

`memdb` command
- `memdb query "<query>"`: Query the in-memory database. (Note: query functionality is limited.)
- `memdb dump`: Dumps the contents of the `llama_interactions` and `command_results` tables from the beacon's memory.

Example:

```
virga (session)> memdb dump
[*] Dumping MemDB contents...
--- llama_interactions ---
ID: ..., TaskID: ..., Prompt: "Find all running processes...", ...
---
```
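The table names `llama_interactions` and `command_results` match the dump above; the sketch below shows how such tables could be declared with `hashicorp/go-memdb`, with struct fields and index choices as assumptions:

```go
// Sketch of the two tables seen in `memdb dump`, declared with hashicorp/go-memdb.
package beacon

import memdb "github.com/hashicorp/go-memdb"

// LlamaInteraction and CommandResult are illustrative record types;
// the real field set in Virga may differ.
type LlamaInteraction struct {
	ID     string
	TaskID string
	Prompt string
}

type CommandResult struct {
	ID      string
	Command string
	Output  string
}

// newMemDB builds the volatile store used for the beacon's lifecycle.
func newMemDB() (*memdb.MemDB, error) {
	schema := &memdb.DBSchema{
		Tables: map[string]*memdb.TableSchema{
			"llama_interactions": {
				Name: "llama_interactions",
				Indexes: map[string]*memdb.IndexSchema{
					"id": {Name: "id", Unique: true, Indexer: &memdb.StringFieldIndex{Field: "ID"}},
				},
			},
			"command_results": {
				Name: "command_results",
				Indexes: map[string]*memdb.IndexSchema{
					"id": {Name: "id", Unique: true, Indexer: &memdb.StringFieldIndex{Field: "ID"}},
				},
			},
		},
	}
	return memdb.NewMemDB(schema)
}
```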
Enabling AI in Beacons
To create a beacon with AI capabilities, you must:
- Download Dependencies: Run `make download-llama-deps` on the server to fetch the required model and libraries.
- Enable in Config: Set `llama.enabled: true` in your `beacon.yaml` file.
- Generate: Build the beacon using `generate beacon --config beacon.yaml`.
Alternatively, you can use the `--enable-llama` flag as a shortcut:

```
virga> generate beacon --os windows --arch amd64 --enable-llama
```

Default LLM Model
The default LLM model is TinyLlama-1.1B-Chat-v1.0-GGUF (Q4_K_M quantization, ~669MB). This model provides a good balance between size and performance for embedded AI operations.
Downloading the Model
To download the default model along with required libraries:
```
# Download model and libraries for current platform only
make download-llama-deps

# Or download libraries for all platforms and the model
make download-llama-all
```

Changing the Default Model
You can use a different GGUF-format model by modifying `scripts/download-llama-model.go`:

```go
const (
    modelURL  = "https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf" // <- Change this URL
    modelPath = "internal/implant/llama/models/model.gguf" // DO NOT change this path
)
```

Compatible Models
Based on the embedded llama.cpp version (commit ac43576), the following model families are supported:
| Model Family | Examples | Notes |
|---|---|---|
| LLaMA / LLaMA 2 | TinyLlama, Llama-2-7B, Llama-2-13B | Most widely supported |
| Falcon | Falcon-7B, Falcon-40B | Alternative architecture |
| Alpaca | Alpaca-7B, Alpaca-13B | Fine-tuned LLaMA |
| Vicuna | Vicuna-7B, Vicuna-13B | Chat-optimized |
| GPT4All | GPT4All-J, GPT4All-Snoozy | Optimized for CPU |
| WizardLM | WizardLM-7B, WizardLM-13B | Instruction-following |
| Baichuan | Baichuan-7B | Chinese language model |
| OpenBuddy | OpenBuddy-7B | Multilingual support |
Recommended GGUF Models
| Model | Size | Use Case | Download |
|---|---|---|---|
| TinyLlama-1.1B Q4_K_M | ~669MB | Default, balanced performance | HuggingFace |
| TinyLlama-1.1B Q8_0 | ~1.2GB | Higher quality responses | HuggingFace |
| Vicuna-7B Q4_K_M | ~3.8GB | Better chat capabilities | HuggingFace |
| WizardLM-7B Q4_K_M | ~3.8GB | Better instruction following | HuggingFace |
Important: Only models in GGUF format from the supported families above will work. Models like Phi-2, Mistral, or others not listed in the supported families will not function with this llama.cpp version.
Steps to Change the Model
- Choose a model from the supported families listed above
- Find the GGUF version on Hugging Face (search for "ModelName GGUF")
- Copy the direct download URL (must end with `.gguf`)
- Edit `scripts/download-llama-model.go` and update the `modelURL`
- Run `make download-llama-deps` to download the new model
Note: After changing the model, you must rebuild the beacon with `--enable-llama` for the changes to take effect.
Important Considerations
- Model Size: Larger models require more memory in the implant
- Quantization: Lower quantization (Q2, Q3) reduces quality but saves space
- Compatibility: Only GGUF format models are supported
- Performance: Model inference speed depends on target system resources
- Security: Only download models from trusted sources to avoid malicious code
- Testing: Always test new models thoroughly before production deployment