Tool Executor Arm: Sandboxed Command Execution
Version: 1.0
Technology: Rust / actix-web
Cost Tier: 3 (Medium-High)
Average Latency: 0.5-5 seconds
Status: Phase 1 Complete
Table of Contents
- Overview
- Architecture
- Security Model
- Core Functionality
- Implementation
- API Specification
- Data Models
- Configuration
- Performance Characteristics
- Testing
- Deployment
- Security Considerations
- See Also
Overview
The Tool Executor Arm is a security-first component that executes external commands, API calls, and scripts in isolated sandboxes under strict capability controls. It lets the system interact with external tools while maintaining strong security boundaries.
Key Features
- Capability-Based Access Control: Fine-grained permissions for command execution
- Command Allowlist: Only pre-approved commands can be executed
- Sandbox Isolation: All executions run in isolated Docker containers
- Resource Limits: Timeouts, memory limits, and CPU restrictions
- Provenance Tracking: Complete audit trail of all executions
- Network Control: Host allowlisting for HTTP requests
- Non-Root Execution: All commands run as unprivileged users
Design Principles
- Security by Default: Deny all, permit explicitly
- Defense in Depth: Multiple layers of security controls
- Least Privilege: Minimal capabilities granted for each operation
- Auditability: Complete logging and provenance metadata
- Fail-Safe: Errors default to blocking execution
Architecture
graph TB
subgraph "Executor Arm"
API[API Endpoint]
VAL[Validator]
EXEC[Executor]
SAND[Sandbox Manager]
PROV[Provenance Tracker]
end
subgraph "Security Layer"
CAP[Capability Checker]
ALLOW[Allowlist]
HOST[Host Validator]
end
subgraph "Execution Environment"
DOCKER[Docker Container]
FS[Restricted Filesystem]
NET[Network Namespace]
end
ORCH[Orchestrator] -->|Execute Request + Token| API
API --> VAL
VAL --> CAP
VAL --> ALLOW
VAL --> HOST
CAP -->|Authorized| EXEC
ALLOW -->|Permitted| EXEC
HOST -->|Valid| EXEC
EXEC --> SAND
SAND --> DOCKER
DOCKER --> FS
DOCKER --> NET
EXEC --> PROV
PROV -->|Provenance Metadata| API
API -->|Execution Result| ORCH
CAP -->|Denied| API
ALLOW -->|Blocked| API
HOST -->|Invalid| API
style DOCKER fill:#f9f,stroke:#333
style CAP fill:#ff9,stroke:#333
style PROV fill:#9ff,stroke:#333
Execution Flow
sequenceDiagram
participant O as Orchestrator
participant E as Executor API
participant V as Validator
participant S as Sandbox
participant D as Docker
O->>E: POST /execute (command + token)
E->>V: Validate request
alt Token Valid
V->>V: Check capabilities
alt Capability Granted
V->>V: Check allowlist
alt Command Allowed
V->>S: Prepare sandbox
S->>D: Create container
D-->>S: Container ready
S->>D: Execute command
D-->>S: Output + exit code
S->>E: Execution result
E->>E: Generate provenance
E-->>O: Success response
else Command Blocked
V-->>E: Allowlist violation
E-->>O: Error: Command not allowed
end
else No Capability
V-->>E: Capability violation
E-->>O: Error: Insufficient privileges
end
else Token Invalid
V-->>E: Auth failure
E-->>O: Error: Invalid token
end
Security Model
Capability-Based Access Control
The Executor Arm uses a capability-based security model where each operation requires specific permissions granted through time-limited tokens.
#[derive(Debug, Clone, Serialize, Deserialize)]
struct CapabilityToken {
token_id: String,
granted_capabilities: HashSet<Capability>,
expires_at: DateTime<Utc>,
issued_to: String,
}
#[derive(Debug, Clone, Hash, Eq, PartialEq, Serialize, Deserialize)]
enum Capability {
// Shell command execution
ShellRead, // Read-only commands (ls, cat, grep)
ShellWrite, // Write commands (echo >, mkdir)
ShellExecute, // Execute scripts
// Network access
HttpGet, // HTTP GET requests
HttpPost, // HTTP POST requests
HttpAllHosts, // Access any host (vs allowlist)
// File system
FilesystemRead, // Read files
FilesystemWrite, // Write files
FilesystemDelete, // Delete files
// Special
PythonExec, // Run Python scripts
DockerAccess, // Access Docker API
}
impl CapabilityToken {
fn can_execute(&self, required: &Capability) -> bool {
!self.is_expired() && self.granted_capabilities.contains(required)
}
fn is_expired(&self) -> bool {
Utc::now() > self.expires_at
}
}
Capability Types
| Capability | Description | Risk Level |
|---|---|---|
| ShellRead | Read-only shell commands (ls, cat, grep) | Low |
| ShellWrite | Write operations (echo >, mkdir) | Medium |
| ShellExecute | Execute scripts | High |
| HttpGet | HTTP GET requests to allowlisted hosts | Low |
| HttpPost | HTTP POST requests to allowlisted hosts | Medium |
| HttpAllHosts | HTTP requests to any host | High |
| FilesystemRead | Read files from sandbox | Low |
| FilesystemWrite | Write files to sandbox | Medium |
| FilesystemDelete | Delete files in sandbox | Medium |
| PythonExec | Execute Python scripts | High |
| DockerAccess | Access Docker API (privileged) | Critical |
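The risk levels in the table can also be expressed in code, which makes it possible to reason about the overall risk of a token. The following is a minimal sketch, not part of the documented implementation: `RiskLevel`, `risk_level`, and `max_risk` are hypothetical names introduced here for illustration, and the `Capability` enum is repeated only so the snippet stands alone.

```rust
/// Capabilities from the table above (repeated so the sketch is self-contained).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Capability {
    ShellRead,
    ShellWrite,
    ShellExecute,
    HttpGet,
    HttpPost,
    HttpAllHosts,
    FilesystemRead,
    FilesystemWrite,
    FilesystemDelete,
    PythonExec,
    DockerAccess,
}

/// Hypothetical helper type. Variant order gives Low < Medium < High < Critical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum RiskLevel {
    Low,
    Medium,
    High,
    Critical,
}

/// Maps each capability to the risk level listed in the table.
fn risk_level(cap: Capability) -> RiskLevel {
    use Capability::*;
    match cap {
        ShellRead | HttpGet | FilesystemRead => RiskLevel::Low,
        ShellWrite | HttpPost | FilesystemWrite | FilesystemDelete => RiskLevel::Medium,
        ShellExecute | HttpAllHosts | PythonExec => RiskLevel::High,
        DockerAccess => RiskLevel::Critical,
    }
}

/// Highest risk level among a set of capabilities, or `None` for an empty set.
fn max_risk(caps: &[Capability]) -> Option<RiskLevel> {
    caps.iter().map(|c| risk_level(*c)).max()
}
```

A token issuer could, for example, use `max_risk` to require extra approval before granting anything rated High or above. That policy is an assumption of this sketch rather than documented behavior.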
Core Functionality
Command Allowlist
Only pre-approved commands can be executed, with required capabilities mapped to each command.
struct Executor {
allowed_commands: HashMap<String, Vec<Capability>>,
allowed_hosts: Vec<String>,
timeout: Duration,
}
impl Executor {
fn default_safe() -> Self {
let mut allowed_commands = HashMap::new();
// Read-only commands
allowed_commands.insert("echo".to_string(), vec![Capability::ShellRead]);
allowed_commands.insert("cat".to_string(), vec![Capability::ShellRead, Capability::FilesystemRead]);
allowed_commands.insert("ls".to_string(), vec![Capability::ShellRead, Capability::FilesystemRead]);
allowed_commands.insert("grep".to_string(), vec![Capability::ShellRead]);
allowed_commands.insert("find".to_string(), vec![Capability::ShellRead, Capability::FilesystemRead]);
allowed_commands.insert("head".to_string(), vec![Capability::ShellRead, Capability::FilesystemRead]);
allowed_commands.insert("tail".to_string(), vec![Capability::ShellRead, Capability::FilesystemRead]);
// Network commands
allowed_commands.insert("curl".to_string(), vec![Capability::HttpGet]);
allowed_commands.insert("wget".to_string(), vec![Capability::HttpGet]);
// Version control (read-only)
allowed_commands.insert("git".to_string(), vec![Capability::ShellRead, Capability::FilesystemRead]);
Self {
allowed_commands,
allowed_hosts: vec![
"api.github.com".to_string(),
"registry.npmjs.org".to_string(),
"pypi.org".to_string(),
],
timeout: Duration::from_secs(30),
}
}
}
Sandboxed Execution
All commands execute in isolated environments with resource limits.
impl Executor {
async fn execute(&self, req: ExecutionRequest, token: &CapabilityToken) -> Result<ExecutionResult> {
// 1. Validate command is allowed
self.validate_command(&req.command, token)?;
// 2. For HTTP requests, validate host
if req.action_type == "http" {
self.validate_host(&req.command, token)?;
}
// 3. Execute with timeout and resource limits. The request is borrowed
//    so that it is still available for provenance generation below.
let result = self.execute_sandboxed(&req).await?;
// 4. Generate provenance metadata
let provenance = self.generate_provenance(&req, &result);
Ok(ExecutionResult {
success: result.status.success(),
stdout: String::from_utf8_lossy(&result.stdout).to_string(),
stderr: String::from_utf8_lossy(&result.stderr).to_string(),
exit_code: result.status.code(),
duration_ms: result.duration.as_millis() as u64,
provenance,
})
}
async fn execute_sandboxed(&self, req: &ExecutionRequest) -> Result<CommandOutput> {
use tokio::process::Command;
use tokio::time::timeout;
let start = Instant::now();
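// Build the command. This snippet enforces only the timeout; the memory,
// CPU, and process limits listed under Resource Limits are not applied here.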
let mut cmd = Command::new(&req.command);
cmd.args(&req.args)
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.kill_on_drop(true);
// Execute with timeout
let output = timeout(self.timeout, cmd.output())
.await
.map_err(|_| Error::Timeout)?
.map_err(|e| Error::Execution(e.to_string()))?;
Ok(CommandOutput {
status: output.status,
stdout: output.stdout,
stderr: output.stderr,
duration: start.elapsed(),
})
}
}
Resource Limits
| Resource | Limit | Rationale |
|---|---|---|
| Execution Timeout | 30 seconds (default) | Prevent infinite loops |
| Memory | 512 MB | Limit resource consumption |
| CPU | 1 core | Fair sharing |
| Disk I/O | Read-only root, writable /tmp | Prevent system modification |
| Network | Allowlisted hosts only | Prevent data exfiltration |
| Process Count | 10 max | Prevent fork bombs |
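One way the limits in this table could be passed to a Docker sandbox is as `docker run` flags. The sketch below only builds the argument list and does not start a container. `SandboxLimits` and `docker_run_args` are hypothetical names, and mounting a writable `/tmp` as a tmpfs is an assumption; the flags themselves (`--memory`, `--cpus`, `--pids-limit`, `--read-only`, `--tmpfs`, `--cap-drop`, `--security-opt no-new-privileges`) are standard `docker run` options.

```rust
/// Hypothetical container limits, defaulting to the values in the table above.
struct SandboxLimits {
    memory: String,     // e.g. "512m"
    cpus: f32,          // e.g. 1.0
    max_processes: u32, // e.g. 10
}

impl Default for SandboxLimits {
    fn default() -> Self {
        Self {
            memory: "512m".to_string(),
            cpus: 1.0,
            max_processes: 10,
        }
    }
}

/// Builds the argument vector for `docker run`. The command and its arguments
/// are appended as separate argv entries, so no shell parses them.
fn docker_run_args(
    limits: &SandboxLimits,
    image: &str,
    command: &str,
    args: &[String],
) -> Vec<String> {
    let mut argv: Vec<String> = vec![
        "run".into(),
        "--rm".into(),
        "--memory".into(),
        limits.memory.clone(),
        "--cpus".into(),
        limits.cpus.to_string(),
        "--pids-limit".into(),
        limits.max_processes.to_string(),
        "--read-only".into(), // read-only root filesystem
        "--tmpfs".into(),
        "/tmp".into(), // writable scratch space (assumed to be a tmpfs)
        "--cap-drop".into(),
        "ALL".into(),
        "--security-opt".into(),
        "no-new-privileges".into(),
        image.into(),
        command.into(),
    ];
    argv.extend(args.iter().cloned());
    argv
}
```

The execution timeout and the host allowlist are not covered by these flags. In this sketch they would still be enforced by the executor itself, as in the `timeout(...)` call and `validate_host` shown elsewhere in this document.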
Implementation
Executor Structure
use actix_web::{web, App, HttpResponse, HttpServer};
use serde::{Deserialize, Serialize};
use std::collections::{HashMap, HashSet};
use std::time::{Duration, Instant};
use std::process::Stdio; // Stdio lives in std::process, not tokio::process
use tokio::process::Command;
use chrono::{DateTime, Utc};
#[derive(Debug, Deserialize)]
struct ExecutionRequest {
action_type: String, // "shell", "http", "python"
command: String,
args: Vec<String>,
timeout_seconds: Option<u64>,
capability_token: String,
metadata: HashMap<String, String>,
}
#[derive(Debug, Serialize)]
struct ExecutionResult {
success: bool,
stdout: String,
stderr: String,
exit_code: Option<i32>,
duration_ms: u64,
provenance: ProvenanceMetadata,
}
#[derive(Debug, Serialize)]
struct ProvenanceMetadata {
arm_id: String,
timestamp: DateTime<Utc>,
action_type: String,
command_hash: String,
capabilities_used: Vec<String>,
}
struct CommandOutput {
status: std::process::ExitStatus,
stdout: Vec<u8>,
stderr: Vec<u8>,
duration: Duration,
}
Command Validation
impl Executor {
fn validate_command(&self, command: &str, token: &CapabilityToken) -> Result<()> {
// Check if command is in allowlist
let required_caps = self.allowed_commands
.get(command)
.ok_or_else(|| Error::CommandNotAllowed(command.to_string()))?;
// Check if token has all required capabilities
for cap in required_caps {
if !token.can_execute(cap) {
return Err(Error::InsufficientCapability {
required: cap.clone(),
command: command.to_string(),
});
}
}
Ok(())
}
fn validate_host(&self, url: &str, token: &CapabilityToken) -> Result<()> {
// If token has HttpAllHosts, allow any host
if token.can_execute(&Capability::HttpAllHosts) {
return Ok(());
}
// Otherwise, check allowlist
let host = extract_host(url)?;
if !self.allowed_hosts.contains(&host) {
return Err(Error::HostNotAllowed(host));
}
Ok(())
}
fn generate_provenance(&self, req: &ExecutionRequest, result: &CommandOutput) -> ProvenanceMetadata {
use sha2::{Sha256, Digest};
let command_str = format!("{} {}", req.command, req.args.join(" "));
let mut hasher = Sha256::new();
hasher.update(command_str.as_bytes());
let command_hash = format!("{:x}", hasher.finalize());
ProvenanceMetadata {
arm_id: "executor".to_string(),
timestamp: Utc::now(),
action_type: req.action_type.clone(),
command_hash,
capabilities_used: self.get_used_capabilities(&req.command),
}
}
}
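`validate_host` relies on an `extract_host` helper that is not shown. The following is a minimal, standard-library-only sketch of what such a helper might look like; its behavior (http/https only, lowercase result, IPv6 literals rejected) and the `String` error type are assumptions made for illustration. A production implementation would more likely use a dedicated URL parser.

```rust
/// Hypothetical sketch of `extract_host`: returns the lowercase host of an
/// http(s) URL, or an error message if the URL cannot be handled safely.
fn extract_host(url: &str) -> Result<String, String> {
    // Require an explicit http or https scheme; anything else is rejected.
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))
        .ok_or_else(|| format!("unsupported or missing scheme in '{url}'"))?;

    // The authority component ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");

    // Drop any userinfo: the real host follows the last '@'.
    let host_port = authority.rsplit('@').next().unwrap_or("");

    // IPv6 literals are out of scope for this sketch, so fail closed.
    if host_port.starts_with('[') {
        return Err(format!("IPv6 literals are not supported: '{url}'"));
    }

    // Drop an optional ":port" suffix.
    let host = host_port.split(':').next().unwrap_or("");
    if host.is_empty() {
        return Err(format!("no host found in '{url}'"));
    }
    Ok(host.to_ascii_lowercase())
}

/// Exact-match check against the allowlist (no substring or suffix matching).
fn host_allowed(host: &str, allowed_hosts: &[String]) -> bool {
    allowed_hosts.iter().any(|h| h == host)
}
```

Parsing the authority before matching matters: a URL such as `https://api.github.com@evil.example/x` contains an allowlisted name, but the host actually contacted is `evil.example`, so a substring check would wrongly accept it.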
Execution Pipeline
graph LR
A[Request] --> B{Token Valid?}
B -->|No| Z[Error: Auth]
B -->|Yes| C{Capability?}
C -->|No| Z
C -->|Yes| D{Allowlist?}
D -->|No| Z
D -->|Yes| E{HTTP?}
E -->|Yes| F{Host OK?}
F -->|No| Z
E -->|No| G[Execute]
F -->|Yes| G
G --> H[Result]
H --> I[Provenance]
I --> J[Response]
style Z fill:#f99,stroke:#333
style J fill:#9f9,stroke:#333
API Specification
Execute Command
Endpoint: POST /execute
Headers:
Content-Type: application/json
X-Request-ID: uuid (optional)
Request Body:
{
"action_type": "shell",
"command": "ls",
"args": ["-la", "/tmp"],
"timeout_seconds": 10,
"capability_token": "tok_abc123xyz",
"metadata": {
"task_id": "task-123",
"requested_by": "orchestrator"
}
}
Field Descriptions:
| Field | Type | Required | Description |
|---|---|---|---|
| action_type | string | Yes | Type of action: "shell", "http", "python" |
| command | string | Yes | Command to execute |
| args | array[string] | No | Command arguments |
| timeout_seconds | integer | No | Execution timeout (default: 30, max: 300) |
| capability_token | string | Yes | Authorization token with capabilities |
| metadata | object | No | Additional context for logging |
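The `timeout_seconds` rule (default 30, maximum 300) can be captured in a small helper. This is a sketch under the assumption that values above the maximum are clamped rather than rejected; the table does not say which. `effective_timeout` and the two constants are hypothetical names.

```rust
use std::time::Duration;

/// Default and maximum taken from the field description above.
const DEFAULT_TIMEOUT_SECS: u64 = 30;
const MAX_TIMEOUT_SECS: u64 = 300;

/// Resolves the timeout for a request: falls back to the default when the
/// field is absent and clamps oversized values to the maximum (an assumption;
/// rejecting them with an error would be an equally valid policy).
fn effective_timeout(requested: Option<u64>) -> Duration {
    let secs = requested
        .unwrap_or(DEFAULT_TIMEOUT_SECS)
        .min(MAX_TIMEOUT_SECS);
    Duration::from_secs(secs)
}
```

For instance, `effective_timeout(req.timeout_seconds)` could replace a fixed executor-wide timeout when the per-request value should be honored.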
Response Formats
Success Response (200 OK):
{
"success": true,
"stdout": "total 32\ndrwxrwxrwt 10 root root 4096 Nov 10 10:30 .\ndrwxr-xr-x 20 root root 4096 Oct 15 08:12 ..",
"stderr": "",
"exit_code": 0,
"duration_ms": 45,
"provenance": {
"arm_id": "executor",
"timestamp": "2025-11-10T10:30:00Z",
"action_type": "shell",
"command_hash": "<64-character hex SHA-256 digest of the command line>",
"capabilities_used": ["ShellRead", "FilesystemRead"]
}
}
Blocked Command (403 Forbidden):
{
"success": false,
"error": "Command 'rm' not in allowlist",
"error_type": "CapabilityViolation",
"allowed_commands": ["echo", "cat", "ls", "grep", "curl"]
}
Invalid Token (401 Unauthorized):
{
"success": false,
"error": "Capability token expired or invalid",
"error_type": "AuthenticationFailure"
}
Execution Timeout (408 Request Timeout):
{
"success": false,
"error": "Command execution exceeded timeout of 30 seconds",
"error_type": "ExecutionTimeout",
"partial_output": "...",
"duration_ms": 30000
}
Data Models
Capability Token
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CapabilityToken {
pub token_id: String,
pub granted_capabilities: HashSet<Capability>,
pub expires_at: DateTime<Utc>,
pub issued_to: String,
}
Error Types
#[derive(Debug, thiserror::Error)]
pub enum Error {
#[error("Command '{0}' not in allowlist")]
CommandNotAllowed(String),
#[error("Host '{0}' not in allowlist")]
HostNotAllowed(String),
#[error("Insufficient capability: {command} requires {required:?}")]
InsufficientCapability {
required: Capability,
command: String,
},
#[error("Token expired or invalid")]
InvalidToken,
#[error("Execution timeout")]
Timeout,
#[error("Execution failed: {0}")]
Execution(String),
}
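The error variants correspond to the HTTP status codes shown under Response Formats. The sketch below illustrates one possible mapping. Only 403 for a blocked command, 401 for an invalid token, and 408 for a timeout come from this document; treating `HostNotAllowed` and `InsufficientCapability` as 403 and `Execution` as 500 are assumptions. `ExecError` is a simplified stand-in for the `Error` enum so the snippet needs no external crates.

```rust
/// Simplified stand-in for the `Error` enum above (no `thiserror` dependency).
#[derive(Debug)]
enum ExecError {
    CommandNotAllowed(String),
    HostNotAllowed(String),
    InsufficientCapability { command: String },
    InvalidToken,
    Timeout,
    Execution(String),
}

/// Maps an error to an HTTP status code.
fn http_status(err: &ExecError) -> u16 {
    match err {
        // Documented: blocked command -> 403 Forbidden.
        ExecError::CommandNotAllowed(_) => 403,
        // Assumed: other policy violations are also 403.
        ExecError::HostNotAllowed(_) | ExecError::InsufficientCapability { .. } => 403,
        // Documented: expired or invalid token -> 401 Unauthorized.
        ExecError::InvalidToken => 401,
        // Documented: execution timeout -> 408 Request Timeout.
        ExecError::Timeout => 408,
        // Assumed: failures while spawning or running the command -> 500.
        ExecError::Execution(_) => 500,
    }
}
```

Keeping the mapping in a single exhaustive `match` means the compiler flags any newly added variant that has no status code assigned.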
Configuration
Environment Variables
# Executor Configuration
EXECUTOR_PORT=8003
EXECUTOR_TIMEOUT_SECONDS=30
EXECUTOR_MAX_CONCURRENT=10
# Security
EXECUTOR_ALLOWLIST_PATH=/etc/executor/allowlist.yaml
EXECUTOR_HOST_ALLOWLIST_PATH=/etc/executor/hosts.yaml
CAPABILITY_TOKEN_VERIFIER_URL=http://orchestrator:8000/verify-token
# Sandbox
# Supported values: docker, kubernetes, firecracker
SANDBOX_TYPE=docker
SANDBOX_IMAGE=executor-sandbox:latest
SANDBOX_MEMORY_LIMIT=512m
SANDBOX_CPU_LIMIT=1.0
# Logging
LOG_LEVEL=info
LOG_FORMAT=json
PROVENANCE_LOG_PATH=/var/log/executor/provenance.jsonl
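A service might load a subset of these variables as follows. This is a sketch, not the documented loader: `ExecutorConfig`, `from_lookup`, and `from_env` are hypothetical names, and treating the values shown above as defaults is an assumption. Invalid values produce an error rather than silently falling back, in keeping with the Fail-Safe principle.

```rust
use std::str::FromStr;
use std::time::Duration;

/// Hypothetical configuration struct covering a few of the variables above.
#[derive(Debug, PartialEq)]
struct ExecutorConfig {
    port: u16,
    timeout: Duration,
    max_concurrent: usize,
    sandbox_type: String,
}

/// Parses an optional raw value, using `default` only when the key is unset.
fn parse_or<T: FromStr>(key: &str, raw: Option<String>, default: T) -> Result<T, String> {
    match raw {
        None => Ok(default),
        Some(v) => v
            .trim()
            .parse::<T>()
            .map_err(|_| format!("invalid value for {key}: '{v}'")),
    }
}

impl ExecutorConfig {
    /// Builds the config from any key lookup, which keeps it testable
    /// without touching the process environment.
    fn from_lookup<F>(lookup: F) -> Result<Self, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        let timeout_secs: u64 = parse_or(
            "EXECUTOR_TIMEOUT_SECONDS",
            lookup("EXECUTOR_TIMEOUT_SECONDS"),
            30,
        )?;
        Ok(Self {
            port: parse_or("EXECUTOR_PORT", lookup("EXECUTOR_PORT"), 8003)?,
            timeout: Duration::from_secs(timeout_secs),
            max_concurrent: parse_or(
                "EXECUTOR_MAX_CONCURRENT",
                lookup("EXECUTOR_MAX_CONCURRENT"),
                10,
            )?,
            sandbox_type: lookup("SANDBOX_TYPE").unwrap_or_else(|| "docker".to_string()),
        })
    }

    /// Reads the same keys from the real process environment.
    fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

At startup the service would call `ExecutorConfig::from_env()` and refuse to start if it returns an error.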
Allowlist Configuration
allowlist.yaml:
commands:
# Read-only commands
- name: echo
capabilities:
- ShellRead
description: "Print text"
- name: cat
capabilities:
- ShellRead
- FilesystemRead
description: "Display file contents"
- name: ls
capabilities:
- ShellRead
- FilesystemRead
description: "List directory contents"
# Network commands
- name: curl
capabilities:
- HttpGet
description: "HTTP GET requests"
- name: wget
capabilities:
- HttpGet
description: "Download files"
# Host allowlist
hosts:
- api.github.com
- registry.npmjs.org
- pypi.org
- api.openai.com
# Sandbox configuration
sandbox:
memory_limit: "512m"
cpu_limit: 1.0
timeout_seconds: 30
max_processes: 10
readonly_root: true
writable_paths:
- /tmp
- /workspace
Performance Characteristics
Latency
| Operation | P50 | P95 | P99 |
|---|---|---|---|
| Command validation | 5ms | 10ms | 15ms |
| Sandbox creation | 200ms | 500ms | 1s |
| Command execution | 50ms | 2s | 5s |
| Total latency | 255ms | 2.5s | 6s |
Throughput
- Concurrent Executions: 10 (configurable)
- Queue Depth: 100 requests
- Requests/Second: ~40 (with 10 workers)
Resource Usage
- Memory: 50 MB base + 512 MB per sandbox
- CPU: Minimal (execution in sandbox)
- Disk: 10 MB logs per hour
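The throughput and memory figures above can be sanity-checked with simple arithmetic. The sketch below assumes each worker handles one request at a time and uses the P50 total latency (255 ms) as a stand-in for the mean, which is only an approximation. Both function names are hypothetical.

```rust
/// Rough steady-state throughput: `workers` requests in flight, each taking
/// about `latency_ms` milliseconds.
fn estimated_rps(workers: u32, latency_ms: f64) -> f64 {
    f64::from(workers) * 1000.0 / latency_ms
}

/// Peak memory estimate in MB: fixed base plus one sandbox per concurrent
/// execution.
fn estimated_memory_mb(base_mb: u64, per_sandbox_mb: u64, concurrent: u64) -> u64 {
    base_mb + per_sandbox_mb * concurrent
}
```

With 10 workers and 255 ms per request, `estimated_rps` gives roughly 39 requests per second, in line with the "~40" figure. With 10 concurrent sandboxes at 512 MB each, `estimated_memory_mb(50, 512, 10)` gives 5170 MB, which is worth keeping in mind when sizing hosts.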
Testing
Unit Tests
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_capability_validation() {
let mut caps = HashSet::new();
caps.insert(Capability::ShellRead);
let token = CapabilityToken {
token_id: "test".to_string(),
granted_capabilities: caps,
expires_at: Utc::now() + Duration::from_secs(3600),
issued_to: "test".to_string(),
};
assert!(token.can_execute(&Capability::ShellRead));
assert!(!token.can_execute(&Capability::ShellWrite));
}
#[test]
fn test_token_expiration() {
let token = CapabilityToken {
token_id: "test".to_string(),
granted_capabilities: HashSet::new(),
expires_at: Utc::now() - Duration::from_secs(1),
issued_to: "test".to_string(),
};
assert!(token.is_expired());
}
#[tokio::test]
async fn test_command_allowlist() {
let executor = Executor::default_safe();
let mut caps = HashSet::new();
caps.insert(Capability::ShellRead);
caps.insert(Capability::FilesystemRead);
let token = CapabilityToken {
token_id: "test".to_string(),
granted_capabilities: caps,
expires_at: Utc::now() + Duration::from_secs(3600),
issued_to: "test".to_string(),
};
// Should succeed
assert!(executor.validate_command("ls", &token).is_ok());
// Should fail (not in allowlist)
assert!(executor.validate_command("rm", &token).is_err());
}
}
Integration Tests
#[tokio::test]
async fn test_execute_safe_command() {
let executor = Executor::default_safe();
let mut caps = HashSet::new();
caps.insert(Capability::ShellRead);
let token = CapabilityToken {
token_id: "test".to_string(),
granted_capabilities: caps,
expires_at: Utc::now() + Duration::from_secs(3600),
issued_to: "test".to_string(),
};
let req = ExecutionRequest {
action_type: "shell".to_string(),
command: "echo".to_string(),
args: vec!["Hello, World!".to_string()],
timeout_seconds: Some(5),
capability_token: token.token_id.clone(),
metadata: HashMap::new(),
};
let result = executor.execute(req, &token).await.unwrap();
assert!(result.success);
assert_eq!(result.stdout.trim(), "Hello, World!");
assert_eq!(result.exit_code, Some(0));
}
#[tokio::test]
async fn test_blocked_command() {
let executor = Executor::default_safe();
let mut caps = HashSet::new();
caps.insert(Capability::ShellRead);
let token = CapabilityToken {
token_id: "test".to_string(),
granted_capabilities: caps,
expires_at: Utc::now() + Duration::from_secs(3600),
issued_to: "test".to_string(),
};
let req = ExecutionRequest {
action_type: "shell".to_string(),
command: "rm".to_string(), // Not in allowlist
args: vec!["-rf".to_string(), "/".to_string()],
timeout_seconds: Some(5),
capability_token: token.token_id.clone(),
metadata: HashMap::new(),
};
let result = executor.execute(req, &token).await;
assert!(result.is_err());
}
Deployment
Docker Sandbox
Dockerfile:
FROM debian:bookworm-slim
# Install minimal toolset
RUN apt-get update && apt-get install -y \
curl \
git \
&& rm -rf /var/lib/apt/lists/*
# Create non-root user
RUN useradd -m -s /bin/bash executor
USER executor
# Set restrictive umask
RUN echo "umask 077" >> /home/executor/.bashrc
WORKDIR /workspace
# No CMD - controlled by executor service
Kubernetes Configuration
deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: executor-arm
namespace: octollm
spec:
replicas: 2
selector:
matchLabels:
app: executor-arm
template:
metadata:
labels:
app: executor-arm
spec:
serviceAccountName: executor-arm
# Security Context
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault
containers:
- name: executor
image: octollm/executor-arm:1.0
# Container Security
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
# Resource Limits
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "1000m"
# Port
ports:
- containerPort: 8003
name: http
# Configuration
env:
- name: EXECUTOR_PORT
value: "8003"
- name: EXECUTOR_TIMEOUT_SECONDS
value: "30"
- name: SANDBOX_TYPE
value: "docker"
# Config Volume
volumeMounts:
- name: config
mountPath: /etc/executor
readOnly: true
- name: tmp
mountPath: /tmp
volumes:
- name: config
configMap:
name: executor-config
- name: tmp
emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
name: executor-arm
namespace: octollm
spec:
selector:
app: executor-arm
ports:
- port: 8003
targetPort: 8003
name: http
type: ClusterIP
ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
name: executor-config
namespace: octollm
data:
allowlist.yaml: |
commands:
- name: echo
capabilities: [ShellRead]
- name: cat
capabilities: [ShellRead, FilesystemRead]
- name: ls
capabilities: [ShellRead, FilesystemRead]
- name: curl
capabilities: [HttpGet]
hosts:
- api.github.com
- pypi.org
sandbox:
memory_limit: "512m"
timeout_seconds: 30
Security Considerations
Threat Model
| Threat | Mitigation |
|---|---|
| Command Injection | Strict allowlist, no shell interpolation |
| Privilege Escalation | Non-root execution, capability restrictions |
| Resource Exhaustion | Timeouts, memory limits, process limits |
| Data Exfiltration | Host allowlist, network namespace isolation |
| Sandbox Escape | Defense in depth: seccomp, AppArmor, read-only root |
| Token Theft | Short-lived tokens, secure storage, HTTPS only |
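The "no shell interpolation" mitigation follows from how the command is built: the program and each argument are passed as separate argv entries, so metacharacters such as `;` or `$(...)` are never interpreted by a shell. The sketch below uses only `std::process::Command` and does not run anything. `build_command` and `argv_of` are hypothetical helper names.

```rust
use std::process::Command;

/// Builds a command without going through `sh -c`. Each element of `args`
/// becomes exactly one argv entry, whatever characters it contains.
fn build_command(program: &str, args: &[&str]) -> Command {
    let mut cmd = Command::new(program);
    cmd.args(args);
    cmd
}

/// Returns the arguments the child process would receive, for inspection.
fn argv_of(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}
```

For example, `build_command("echo", &["hello; rm -rf /"])` yields a single argument containing the literal text `hello; rm -rf /`. If the same string were passed to `sh -c`, the shell would treat `rm -rf /` as a second command.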
Security Best Practices
- Never Run as Root: All executions use unprivileged users
- Minimal Capabilities: Grant only required capabilities
- Short-Lived Tokens: Tokens expire after 1 hour by default
- Audit Logging: Log all executions with provenance metadata
- Network Isolation: Use network policies in Kubernetes
- Regular Updates: Keep sandbox images and tools updated
- Penetration Testing: Regular security assessments
See Also
- Orchestrator Component - Token issuance and coordination
- Planner Arm - Task decomposition that generates execution plans
- Safety Guardian Arm - Pre-execution validation
- Security Architecture - System-wide security model
- Capability Isolation - Detailed capability design
- API Reference - Complete API documentation
Document Status: Phase 1 Complete
Last Updated: 2025-11-10
Maintainer: OctoLLM Core Team
Next Review: 2025-12-10