Service Detection

Learn how ProRT-IP identifies services, versions, and operating systems through intelligent protocol analysis.

Overview

ProRT-IP's service detection combines two complementary approaches:

  1. Regex-Based Detection (service_db.rs): Fast pattern matching using the nmap-service-probes database (187 probes, 5,572 match patterns)
  2. Protocol-Specific Detection: Deep protocol parsing for accurate version and OS information (Sprint 5.2)

Detection Coverage

Protocol    Coverage  Improvement  Confidence
HTTP        25-30%    +3-5pp       0.5-1.0
SSH         10-15%    +2-3pp       0.6-1.0
SMB         5-10%     +2-3pp       0.7-0.95
MySQL       3-5%      +1-2pp       0.7-0.95
PostgreSQL  3-5%      +1-2pp       0.7-0.95
Total       46-65%    +10-15pp     Variable

Overall Detection Rate: 85-90% (improved from 70-80% baseline)

Key Features

  • Protocol-Aware Parsing: Understands protocol structure beyond regex patterns
  • OS Detection: Extracts OS hints from banners and version strings
  • Version Mapping: Maps package versions to OS releases (e.g., "4ubuntu0.3" → Ubuntu 20.04)
  • Priority System: Highest-priority detectors run first (HTTP=1, PostgreSQL=5)
  • Fallback Chain: Protocol-specific → Regex → Generic detection
  • Performance: <1% overhead compared to regex-only detection

Architecture

ProtocolDetector Trait

All protocol modules implement the ProtocolDetector trait for consistent detection:

pub trait ProtocolDetector {
    /// Detect service from response bytes
    fn detect(&self, response: &[u8]) -> Result<Option<ServiceInfo>, Error>;

    /// Base confidence level for this detector
    fn confidence(&self) -> f32;

    /// Priority (1=highest, 5=lowest)
    fn priority(&self) -> u8;
}

ServiceInfo Structure

Unified data structure for all detection results:

#[derive(Debug, Clone, PartialEq)]
pub struct ServiceInfo {
    pub service: String,           // Service name (e.g., "http", "ssh")
    pub product: Option<String>,   // Product name (e.g., "nginx", "OpenSSH")
    pub version: Option<String>,   // Version string (e.g., "1.21.6", "8.2p1")
    pub info: Option<String>,      // Additional info (protocol, OS hints)
    pub os_type: Option<String>,   // Detected OS (e.g., "Ubuntu 20.04 LTS")
    pub confidence: f32,           // Confidence score (0.0-1.0)
}
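To show how the trait and the structure fit together, here is a minimal sketch of a detector. `EchoDetector`, its "ECHO/" banner, and the empty `Error` placeholder are hypothetical and not part of ProRT-IP; the trait and struct are repeated only so the block compiles on its own.

```rust
// Placeholder error type; the crate's real Error is not shown in this document.
#[derive(Debug)]
pub struct Error;

#[derive(Debug, Clone, PartialEq)]
pub struct ServiceInfo {
    pub service: String,
    pub product: Option<String>,
    pub version: Option<String>,
    pub info: Option<String>,
    pub os_type: Option<String>,
    pub confidence: f32,
}

pub trait ProtocolDetector {
    fn detect(&self, response: &[u8]) -> Result<Option<ServiceInfo>, Error>;
    fn confidence(&self) -> f32;
    fn priority(&self) -> u8;
}

/// Hypothetical detector for an imaginary "ECHO/<version>" banner.
pub struct EchoDetector;

impl ProtocolDetector for EchoDetector {
    fn detect(&self, response: &[u8]) -> Result<Option<ServiceInfo>, Error> {
        // Early exit: not our protocol, so the next detector gets a turn.
        let Some(rest) = response.strip_prefix(b"ECHO/") else {
            return Ok(None);
        };
        let version = String::from_utf8_lossy(rest).trim().to_string();
        Ok(Some(ServiceInfo {
            service: "echo".to_string(),
            product: None,
            version: (!version.is_empty()).then_some(version),
            info: None,
            os_type: None,
            confidence: self.confidence(),
        }))
    }

    fn confidence(&self) -> f32 {
        0.7
    }

    fn priority(&self) -> u8 {
        5
    }
}
```

Returning `Ok(None)` rather than an error for a non-matching response is what lets a caller move on to the next detector in the chain.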

Detection Flow

Raw Response
    ↓
Protocol Detection (Priority Order)
    ↓
HTTP (Priority 1) → SSH (2) → SMB (3) → MySQL (4) → PostgreSQL (5)
    ↓
Match Found? → YES → Return ServiceInfo
    ↓ NO
Regex Detection (service_db.rs)
    ↓
Match Found? → YES → Return Basic ServiceInfo
    ↓ NO
Generic Detection (Port-based)

Protocol Modules

1. HTTP Fingerprinting

Priority: 1 (Highest) Confidence: 0.5-1.0 Coverage: 25-30% of services

Detection Method

Parses HTTP response headers to extract:

  • Server: Web server name and version (e.g., "nginx/1.21.6")
  • X-Powered-By: Technology stack (e.g., "PHP/7.4.3")
  • X-AspNet-Version: ASP.NET version

Version Extraction

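// Example: "nginx/1.21.6 (Ubuntu)" → product="nginx", version="1.21.6", os="Ubuntu"
if let Some(server) = headers.get("Server") {
    if let Some((name, rest)) = server.split_once('/') {
        product = Some(name.to_string());
        // The version ends at the first space; an OS hint such as "(Ubuntu)" may follow
        let (ver, comment) = rest.split_once(' ').unwrap_or((rest, ""));
        version = Some(ver.to_string());
        os = comment
            .strip_prefix('(')
            .and_then(|c| c.strip_suffix(')'))
            .map(str::to_string);
    }
}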

OS Detection

  • Apache: (Ubuntu), (Debian), (Red Hat) in Server header
  • nginx: OS info after version string
  • IIS: Infers Windows from Server: Microsoft-IIS/10.0
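The header lookup and the OS rules above can be sketched as two small functions. This is an illustration under stated assumptions, not ProRT-IP's implementation: `ServerHeader`, `header_value`, and `parse_server` are hypothetical names, and the response is assumed to be valid UTF-8.

```rust
/// Fields recovered from a `Server` header value (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct ServerHeader {
    pub product: String,
    pub version: Option<String>,
    pub os: Option<String>,
}

/// Finds a header value by case-insensitive name in a raw HTTP response.
pub fn header_value<'a>(response: &'a str, name: &str) -> Option<&'a str> {
    response
        .lines()
        .skip(1) // status line
        .take_while(|line| !line.is_empty()) // stop at the blank line before the body
        .filter_map(|line| line.split_once(':'))
        .find(|(key, _)| key.trim().eq_ignore_ascii_case(name))
        .map(|(_, value)| value.trim())
}

/// Splits e.g. "nginx/1.21.6 (Ubuntu)" into product, version, and OS hint.
pub fn parse_server(value: &str) -> ServerHeader {
    let (token, comment) = match value.split_once(' ') {
        Some((token, rest)) => (token, Some(rest)),
        None => (value, None),
    };
    let (product, version) = match token.split_once('/') {
        Some((p, v)) => (p, Some(v.to_string())),
        None => (token, None),
    };
    // A parenthesised comment such as "(Ubuntu)" carries the OS hint.
    let mut os = comment
        .and_then(|c| c.split_once('('))
        .and_then(|(_, c)| c.split_once(')'))
        .map(|(inner, _)| inner.to_string());
    // IIS runs only on Windows, so the product name alone implies the OS.
    if os.is_none() && product.eq_ignore_ascii_case("Microsoft-IIS") {
        os = Some("Windows".to_string());
    }
    ServerHeader { product: product.to_string(), version, os }
}
```

Header names are compared case-insensitively because HTTP treats `Server` and `server` as the same field.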

Confidence Calculation

Base: 0.5
+ 0.2 if Server header present
+ 0.15 if version extracted
+ 0.1 if OS detected
+ 0.05 if X-Powered-By present
Maximum: 1.0
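The additive rule above can be written out as a short function. The `HttpEvidence` struct and the function name are hypothetical; only the weights come from the list above.

```rust
/// Which pieces of information the HTTP detector managed to extract.
#[derive(Default, Clone, Copy)]
pub struct HttpEvidence {
    pub server_header: bool,
    pub version: bool,
    pub os: bool,
    pub powered_by: bool,
}

/// Adds the per-field bonuses to the base score and caps the result at 1.0.
pub fn http_confidence(e: HttpEvidence) -> f32 {
    let mut score: f32 = 0.5; // base
    if e.server_header {
        score += 0.2;
    }
    if e.version {
        score += 0.15;
    }
    if e.os {
        score += 0.1;
    }
    if e.powered_by {
        score += 0.05;
    }
    score.min(1.0)
}
```

With every field present the bonuses sum to exactly the 1.0 cap, so the cap only guards against floating-point rounding.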

Example Output

Service: http
Product: nginx
Version: 1.21.6
OS: Ubuntu
Info: Ubuntu + PHP/7.4.3
Confidence: 1.0

2. SSH Banner Parsing

Priority: 2 Confidence: 0.6-1.0 Coverage: 10-15% of services

Detection Method

Parses SSH protocol banners (RFC 4253 format):

SSH-protoversion-softwareversion [comments]

Examples:

  • SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.3
  • SSH-2.0-Dropbear_2020.81
  • SSH-1.99-Cisco-1.25

Version Extraction

Split on the first underscore or hyphen:

"OpenSSH_8.2p1" → product="OpenSSH", version="8.2p1"
"Dropbear_2020.81" → product="Dropbear", version="2020.81"
"libssh-0.9.0" → product="libssh", version="0.9.0"
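A minimal sketch of the banner split described above, assuming the banner is valid UTF-8. `SshBannerParts` and `parse_ssh_banner` are hypothetical names used for illustration only.

```rust
/// Pieces of an "SSH-protoversion-softwareversion [comments]" banner.
#[derive(Debug, PartialEq)]
pub struct SshBannerParts<'a> {
    pub proto_version: &'a str,
    pub product: &'a str,
    pub version: Option<&'a str>,
    pub comments: Option<&'a str>,
}

pub fn parse_ssh_banner(banner: &str) -> Option<SshBannerParts<'_>> {
    // Only the first line is the identification string; lines() drops the CR LF.
    let line = banner.lines().next()?;
    let rest = line.strip_prefix("SSH-")?;
    // The protocol version ends at the next '-'.
    let (proto_version, tail) = rest.split_once('-')?;
    // The software version runs up to the first space; comments follow.
    let (software, comments) = match tail.split_once(' ') {
        Some((s, c)) => (s, Some(c)),
        None => (tail, None),
    };
    // Product and version are separated by the first '_' or '-'.
    let (product, version) = match software.split_once(|c: char| c == '_' || c == '-') {
        Some((p, v)) => (p, Some(v)),
        None => (software, None),
    };
    Some(SshBannerParts { proto_version, product, version, comments })
}
```

Borrowing from the input keeps the parse allocation-free; a caller can copy the fields into owned strings once a match is confirmed.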

OS Detection

Ubuntu Mapping (digit before "ubuntu" keyword):

"4ubuntu0.3" → digit='4' → Ubuntu 20.04 LTS (Focal)
"5ubuntu0.1" → digit='5' → Ubuntu 22.04 LTS (Jammy)
"6ubuntu0.0" → digit='6' → Ubuntu 24.04 LTS (Noble)

Debian Mapping:

"deb9" → Debian 9 (Stretch)
"deb10" → Debian 10 (Buster)
"deb11" → Debian 11 (Bullseye)
"deb12" → Debian 12 (Bookworm)

Red Hat Mapping:

"el6" → Red Hat Enterprise Linux 6
"el7" → Red Hat Enterprise Linux 7
"el8" → Red Hat Enterprise Linux 8

Confidence Calculation

Base: 0.6
+ 0.1 if protocol version present
+ 0.2 if software version extracted
+ 0.1 if OS hint found
Maximum: 1.0
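The Ubuntu, Debian, and Red Hat mappings listed under OS Detection for SSH can be sketched as a lookup over the banner's comment text. The function name is hypothetical, and plain substring matching is a simplifying assumption; a production matcher would likely anchor the tags more strictly.

```rust
/// Maps distribution tags in an SSH banner comment to an OS name.
pub fn os_from_ssh_hint(hint: &str) -> Option<&'static str> {
    let hint = hint.to_ascii_lowercase();

    // Ubuntu: the character immediately before an "ubuntu" keyword selects the release.
    for (pos, _) in hint.match_indices("ubuntu") {
        match hint[..pos].chars().next_back() {
            Some('4') => return Some("Ubuntu 20.04 LTS (Focal)"),
            Some('5') => return Some("Ubuntu 22.04 LTS (Jammy)"),
            Some('6') => return Some("Ubuntu 24.04 LTS (Noble)"),
            _ => {}
        }
    }

    // Debian and Red Hat: fixed tags from the tables above.
    const TAGS: [(&str, &str); 7] = [
        ("deb9", "Debian 9 (Stretch)"),
        ("deb10", "Debian 10 (Buster)"),
        ("deb11", "Debian 11 (Bullseye)"),
        ("deb12", "Debian 12 (Bookworm)"),
        ("el6", "Red Hat Enterprise Linux 6"),
        ("el7", "Red Hat Enterprise Linux 7"),
        ("el8", "Red Hat Enterprise Linux 8"),
    ];
    TAGS.iter()
        .find(|(tag, _)| hint.contains(*tag))
        .map(|&(_, os)| os)
}
```

Every occurrence of "ubuntu" is checked because a comment such as "Ubuntu-4ubuntu0.3" contains the keyword twice, and only the second is preceded by the digit.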

3. SMB Dialect Negotiation

Priority: 3 Confidence: 0.7-0.95 Coverage: 5-10% of services

Detection Method

Analyzes SMB protocol responses to determine dialect and infer Windows version.

SMB2/3 Magic Bytes: 0xFE 'S' 'M' 'B' (4 bytes)
SMB1 Magic Bytes: 0xFF 'S' 'M' 'B' (4 bytes)

Dialect Extraction

Dialect code at offset 0x44 (68 bytes) in SMB2 Negotiate Response:

const DIALECT_OFFSET: usize = 0x44;
// Bounds-checked read (inside a function returning Option):
// a truncated response yields None instead of panicking
let bytes = response.get(DIALECT_OFFSET..DIALECT_OFFSET + 2)?;
let dialect = u16::from_le_bytes([bytes[0], bytes[1]]);
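Putting the magic-byte check and the dialect read together gives a sketch like the following. It assumes the buffer starts at the SMB header, with any transport framing already stripped; `SmbFamily` and `smb_dialect` are hypothetical names.

```rust
const SMB2_MAGIC: [u8; 4] = [0xFE, b'S', b'M', b'B'];
const SMB1_MAGIC: [u8; 4] = [0xFF, b'S', b'M', b'B'];
const DIALECT_OFFSET: usize = 0x44;

#[derive(Debug, PartialEq)]
pub enum SmbFamily {
    Smb1,
    Smb2Plus { dialect: u16 },
}

/// Classifies a response by its magic bytes and, for SMB2/3, reads the dialect code.
pub fn smb_dialect(response: &[u8]) -> Option<SmbFamily> {
    if response.starts_with(&SMB1_MAGIC) {
        return Some(SmbFamily::Smb1);
    }
    if !response.starts_with(&SMB2_MAGIC) {
        return None; // early exit: not SMB at all
    }
    // Bounds-checked little-endian read of the 2-byte dialect field.
    let bytes = response.get(DIALECT_OFFSET..DIALECT_OFFSET + 2)?;
    Some(SmbFamily::Smb2Plus {
        dialect: u16::from_le_bytes([bytes[0], bytes[1]]),
    })
}
```

Using `get` instead of direct indexing means a short or truncated response produces `None` rather than a panic.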

Windows Version Mapping

Dialect Code  SMB Version  Windows Version      Confidence
0x0202        SMB 1.0      Windows XP/2003      0.75
0x02FF        SMB 2.002    Windows Vista/2008   0.80
0x0210        SMB 2.1      Windows 7/2008 R2    0.85
0x0300        SMB 3.0      Windows 8/2012       0.90
0x0302        SMB 3.02     Windows 8.1/2012 R2  0.90
0x0311        SMB 3.11     Windows 10/2016+     0.95

Example Output

Service: microsoft-ds
Product: Samba/Windows SMB
Version: SMB 3.11
OS: Windows 10/2016+
Info: SMB 3.11 (Windows 10/2016+)
Confidence: 0.95

4. MySQL Handshake Parsing

Priority: 4 Confidence: 0.7-0.95 Coverage: 3-5% of services

Detection Method

Parses MySQL protocol handshake packets:

Structure:

Bytes 0-3: Packet length (3 bytes, little-endian) + sequence ID (1 byte)
Byte 4: Protocol version (always 10 for MySQL 5.x+)
Bytes 5+: Server version string (null-terminated)

Version Extraction

// Protocol version must be 10; get() also rejects responses shorter than 5 bytes
if response.get(4) != Some(&10) { return None; }

// Extract null-terminated version string
let version_str = extract_until_null(&response[5..]);
// Example: "8.0.27-0ubuntu0.20.04.1"
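A self-contained sketch of the handshake parse, following the byte layout under Structure. The function name is hypothetical; tolerating a packet shorter than its declared length is an assumption made here for partial reads, not documented behavior.

```rust
/// Extracts the server version string from a MySQL initial handshake packet.
pub fn mysql_server_version(response: &[u8]) -> Option<String> {
    // Bytes 0-2: payload length (little-endian); byte 3: sequence ID.
    let header = response.get(..4)?;
    let payload_len = u32::from_le_bytes([header[0], header[1], header[2], 0]) as usize;
    let payload = response.get(4..)?;
    // Use the declared length when the full packet arrived, else what we have.
    let payload = payload.get(..payload_len).unwrap_or(payload);

    // First payload byte (byte 4 overall): protocol version, must be 10.
    if *payload.first()? != 10 {
        return None;
    }
    // The version string runs up to the first NUL byte.
    let rest = &payload[1..];
    let end = rest.iter().position(|&b| b == 0)?;
    let version = std::str::from_utf8(&rest[..end]).ok()?;
    (!version.is_empty()).then(|| version.to_string())
}
```

A missing NUL terminator or a non-UTF-8 version string is treated as "not MySQL" so that later stages of the fallback chain can still inspect the response.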

OS Detection

Ubuntu Version Extraction (handles "0.X.Y" pattern):

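// "8.0.27-0ubuntu0.20.04.1" → text after "ubuntu" is "0.20.04.1" → "Ubuntu 20.04"
let os = version_str.split_once("ubuntu").and_then(|(_, version_part)| {
    let parts: Vec<&str> = version_part.split('.').collect();
    // Skip the leading "0" component; the next two are the release year and month
    (parts.len() >= 3 && parts[0] == "0")
        .then(|| format!("Ubuntu {}.{}", parts[1], parts[2]))
});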

MySQL vs MariaDB:

Contains "MariaDB" → product="MariaDB"
Otherwise → product="MySQL"
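The product rule and the Ubuntu release extraction can be sketched as two standalone helpers operating on the server version string. Both function names are hypothetical.

```rust
/// MariaDB identifies itself inside the version string; anything else is treated as MySQL.
pub fn mysql_product(version: &str) -> &'static str {
    if version.contains("MariaDB") {
        "MariaDB"
    } else {
        "MySQL"
    }
}

/// Derives "Ubuntu YY.MM" from a package suffix such as "0ubuntu0.20.04.1".
pub fn ubuntu_release(version: &str) -> Option<String> {
    // Everything after the "ubuntu" keyword, e.g. "0.20.04.1".
    let (_, after) = version.split_once("ubuntu")?;
    let parts: Vec<&str> = after.split('.').collect();
    // Expect a leading "0" followed by the release year and month.
    (parts.len() >= 3 && parts[0] == "0")
        .then(|| format!("Ubuntu {}.{}", parts[1], parts[2]))
}
```

For example, `ubuntu_release("8.0.27-0ubuntu0.20.04.1")` yields `Some("Ubuntu 20.04")`, while a version string without the keyword yields `None`.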

Confidence Calculation

Base: 0.7
+ 0.15 if version extracted
+ 0.1 if OS/distribution hint found
Maximum: 0.95

5. PostgreSQL ParameterStatus Parsing

Priority: 5 (Lowest) Confidence: 0.7-0.95 Coverage: 3-5% of services

Detection Method

Parses PostgreSQL startup response messages:

Message Types:

  • 'R' (0x52): Authentication
  • 'S' (0x53): ParameterStatus (contains server_version)
  • 'K' (0x4B): BackendKeyData
  • 'Z' (0x5A): ReadyForQuery
  • 'E' (0x45): ErrorResponse

ParameterStatus Format

Byte 0: 'S' (0x53)
Bytes 1-4: Message length (4 bytes, big-endian, includes length field)
Bytes 5+: Parameter name (null-terminated) + Value (null-terminated)

Version Extraction

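// Scan for ParameterStatus messages with parameter "server_version"
if msg_type == b'S' {
    // Big-endian length; it counts its own 4 bytes but not the type byte
    let length = u32::from_be_bytes([
        response[pos + 1], response[pos + 2], response[pos + 3], response[pos + 4],
    ]) as usize;
    let content = &response[pos + 5..pos + 1 + length];

    // Content is "name\0value\0"
    let mut fields = content.split(|&b| b == 0);
    if fields.next() == Some(&b"server_version"[..]) {
        // Extract value: "14.2 (Ubuntu 14.2-1ubuntu1)"
        version = fields.next().map(|v| String::from_utf8_lossy(v).into_owned());
    }
}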
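The scan over the whole startup response can be sketched as a loop that walks message by message until it finds the `server_version` parameter. The function name is hypothetical, and the bounds checks are an addition for illustration.

```rust
/// Walks PostgreSQL backend messages and returns the `server_version` value, if any.
pub fn pg_server_version(response: &[u8]) -> Option<String> {
    let mut pos = 0;
    // Each message: 1 type byte, then a 4-byte big-endian length that counts itself.
    while let Some(header) = response.get(pos..pos + 5) {
        let msg_type = header[0];
        let length = u32::from_be_bytes([header[1], header[2], header[3], header[4]]) as usize;
        if length < 4 {
            return None; // malformed: the length must cover its own 4 bytes
        }
        let end = pos + 1 + length;
        let content = response.get(pos + 5..end)?; // None if the message is truncated

        if msg_type == b'S' {
            // ParameterStatus content is "name\0value\0".
            let mut fields = content.split(|&b| b == 0);
            if fields.next() == Some(&b"server_version"[..]) {
                return fields
                    .next()
                    .map(|v| String::from_utf8_lossy(v).into_owned());
            }
        }
        pos = end; // skip to the next message
    }
    None
}
```

Messages of other types, such as Authentication ('R') or ErrorResponse ('E'), are skipped by their declared length rather than parsed.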

OS Detection

"14.2 (Ubuntu 14.2-1ubuntu1)" → version="14.2", os="Ubuntu"
"13.7 (Debian 13.7-1.pgdg110+1)" → version="13.7", os="Debian"
"12.9 (Red Hat 12.9-1RHEL8)" → version="12.9", os="Red Hat Enterprise Linux"
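A minimal sketch of how a `server_version` value could be split into version and OS, covering only the three distributions shown above. The function name is hypothetical.

```rust
/// Splits e.g. "14.2 (Ubuntu 14.2-1ubuntu1)" into the version and an OS name.
pub fn split_pg_version(value: &str) -> (String, Option<&'static str>) {
    // The version is the first whitespace-delimited token.
    let version = value.split_whitespace().next().unwrap_or("").to_string();
    // The parenthesised packaging note, when present, starts with the distribution name.
    let os = value.split_once('(').and_then(|(_, note)| {
        if note.starts_with("Ubuntu") {
            Some("Ubuntu")
        } else if note.starts_with("Debian") {
            Some("Debian")
        } else if note.starts_with("Red Hat") {
            Some("Red Hat Enterprise Linux")
        } else {
            None
        }
    });
    (version, os)
}
```

A value without a parenthesised note, such as a build from source, still yields the version but leaves the OS unset.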

Confidence Calculation

Base: 0.7
+ 0.15 if version extracted
+ 0.1 if OS hint found
Maximum: 0.95

Confidence Scoring

Scoring Philosophy

Confidence reflects information richness rather than detection certainty:

  • 0.5-0.6: Basic detection (service identified, no version)
  • 0.7-0.8: Good detection (service + version)
  • 0.9-1.0: Excellent detection (service + version + OS + additional info)
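The tiers above can be turned into a small classifier, for example to filter scan results. The enum and function names are hypothetical, and assigning scores that fall between the listed ranges (such as 0.65) to the lower tier is an assumption of this sketch.

```rust
#[derive(Debug, PartialEq)]
pub enum ConfidenceTier {
    Basic,     // service identified, no version
    Good,      // service + version
    Excellent, // service + version + OS + additional info
}

/// Buckets a confidence score into the three tiers described above.
pub fn confidence_tier(confidence: f32) -> ConfidenceTier {
    // Work in whole hundredths so that sums such as 0.5 + 0.2 + 0.15 + 0.05
    // are not pushed across a boundary by floating-point rounding.
    let hundredths = (confidence * 100.0).round() as u32;
    match hundredths {
        0..=69 => ConfidenceTier::Basic,
        70..=89 => ConfidenceTier::Good,
        _ => ConfidenceTier::Excellent,
    }
}
```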

Per-Protocol Ranges

Protocol    Min  Max   Typical  Notes
HTTP        0.5  1.0   0.75     Depends on header richness
SSH         0.6  1.0   0.85     Usually has version + OS
SMB         0.7  0.95  0.90     Dialect → Windows version
MySQL       0.7  0.95  0.85     Version usually present
PostgreSQL  0.7  0.95  0.85     ParameterStatus reliable

Usage Examples

Basic Service Scan

# Scan with service detection enabled (default)
prtip -sS -sV -p 80,22,445,3306,5432 192.168.1.0/24

# Output format
PORT     STATE  SERVICE      VERSION
22/tcp   open   ssh          OpenSSH 8.2p1 (Ubuntu 20.04 LTS)
80/tcp   open   http         nginx/1.21.6 (Ubuntu)
445/tcp  open   microsoft-ds SMB 3.11 (Windows 10/2016+)
3306/tcp open   mysql        MySQL 8.0.27 (Ubuntu 20.04)
5432/tcp open   postgresql   PostgreSQL 14.2 (Ubuntu)

Advanced Service Detection

# Aggressive scan with all detection methods
prtip -A -p 1-1000 target.com

# Fast scan (disable protocol-specific detection)
prtip -sS -p- --no-service-detect target.com

# Service detection on selected ports (no ping)
prtip -sV -p 80,443,8080 --no-ping target.com

Programmatic Usage

use prtip_core::detection::{
    HttpFingerprint, SshBanner, ProtocolDetector
};

// Sample inputs; in a real scan these are the bytes read from the target
let response: &[u8] = b"HTTP/1.1 200 OK\r\nServer: nginx/1.21.6 (Ubuntu)\r\n\r\n";
let banner: &[u8] = b"SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.3\r\n";

// HTTP detection
let detector = HttpFingerprint::new();
if let Ok(Some(info)) = detector.detect(response) {
    println!("Service: {}", info.service);
    println!("Product: {:?}", info.product);
    println!("Version: {:?}", info.version);
    println!("Confidence: {:.2}", info.confidence);
}

// SSH detection
let detector = SshBanner::new();
if let Ok(Some(info)) = detector.detect(banner) {
    println!("Product: {:?}", info.product);
    println!("OS: {:?}", info.os_type);
}

Performance Characteristics

Overhead Analysis

Protocol    Parsing Time  Memory  CPU
HTTP        ~2-5μs        2-4 KB  Negligible
SSH         ~1-3μs        1-2 KB  Negligible
SMB         ~0.5-1μs      512 B   Negligible
MySQL       ~1-2μs        1 KB    Negligible
PostgreSQL  ~2-4μs        2 KB    Negligible

Benchmarks

Sprint 5.2 introduces <1% overhead vs regex-only detection:

Regex-Only Detection:     5.1ms per target
Protocol + Regex:         5.15ms per target
Overhead:                 0.05ms (0.98%)

Scalability Features

  • Zero allocations: Uses references and slices for maximum efficiency
  • Early exit: Returns None immediately if magic bytes don't match
  • Stateless: No shared mutable state, safe for concurrent use
  • Fallback chain: Fast rejection before expensive regex matching

Integration

With Existing service_db.rs

Protocol-specific detection complements regex-based detection:

  1. Priority Order: Protocol detectors run BEFORE regex matching
  2. Higher Confidence: Protocol parsing provides more accurate version/OS info
  3. Fallback: If protocol detection returns None, regex matching proceeds
  4. Combination: Some services may match both (protocol takes precedence)

Detection Pipeline

// Pseudo-code for detection pipeline
fn detect_service(response: &[u8], port: u16) -> ServiceInfo {
    // 1. Try protocol-specific detection (priority order)
    for detector in [http, ssh, smb, mysql, postgresql] {
        // A detector error is treated like "no match" so the chain continues
        if let Ok(Some(info)) = detector.detect(response) {
            return info; // High-confidence result
        }
    }

    // 2. Fallback to regex matching (service_db.rs)
    if let Some(info) = service_db.match_response(response, port) {
        return info; // Medium-confidence result
    }

    // 3. Generic detection (port-based)
    generic_service_for_port(port) // Low-confidence result
}
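A compilable sketch of the same chain using trait objects sorted by priority. Everything here is illustrative: `PrefixDetector`, `default_detectors`, `port_guess`, the trimmed-down `ServiceInfo`, and the 0.3 confidence for port-based guesses are hypothetical, and the regex stage is left out.

```rust
#[derive(Debug)]
pub struct Error; // placeholder; the crate's real error type is not shown here

#[derive(Debug, Clone, PartialEq)]
pub struct ServiceInfo {
    pub service: String,
    pub confidence: f32,
}

pub trait ProtocolDetector {
    fn detect(&self, response: &[u8]) -> Result<Option<ServiceInfo>, Error>;
    fn priority(&self) -> u8;
}

/// Toy detector that matches on a fixed response prefix.
struct PrefixDetector {
    prefix: &'static [u8],
    service: &'static str,
    priority: u8,
}

impl ProtocolDetector for PrefixDetector {
    fn detect(&self, response: &[u8]) -> Result<Option<ServiceInfo>, Error> {
        Ok(response.starts_with(self.prefix).then(|| ServiceInfo {
            service: self.service.to_string(),
            confidence: 0.7,
        }))
    }
    fn priority(&self) -> u8 {
        self.priority
    }
}

/// Builds the detector list and orders it so priority 1 runs first.
pub fn default_detectors() -> Vec<Box<dyn ProtocolDetector>> {
    let mut detectors: Vec<Box<dyn ProtocolDetector>> = vec![
        Box::new(PrefixDetector { prefix: b"SSH-", service: "ssh", priority: 2 }),
        Box::new(PrefixDetector { prefix: b"HTTP/", service: "http", priority: 1 }),
    ];
    detectors.sort_by_key(|d| d.priority());
    detectors
}

fn port_guess(port: u16) -> &'static str {
    match port {
        22 => "ssh",
        80 => "http",
        3306 => "mysql",
        _ => "unknown",
    }
}

pub fn detect_service(
    detectors: &[Box<dyn ProtocolDetector>],
    response: &[u8],
    port: u16,
) -> ServiceInfo {
    detectors
        .iter()
        // Errors count as "no match" so one failing detector cannot stop the chain.
        .find_map(|d| d.detect(response).ok().flatten())
        // (The regex stage would sit here.) Last resort: guess from the port.
        .unwrap_or_else(|| ServiceInfo {
            service: port_guess(port).to_string(),
            confidence: 0.3,
        })
}
```

Because `find_map` stops at the first `Some`, lower-priority detectors never run once a higher-priority one has matched.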

Troubleshooting

Issue: Low Detection Rate

Symptom: Services detected as "unknown" despite known protocols

Possible Causes:

  1. Firewall blocking probe packets
  2. Service using non-standard banner format
  3. Encrypted protocol (TLS/SSL wrapper)

Solutions:

# Try different probe types
prtip -sS -sV --probe-all target.com

# Disable TLS for HTTP services
prtip -sV --no-tls -p 443 target.com

# Verbose output shows detection attempts
prtip -sV -v target.com

Issue: Incorrect OS Detection

Symptom: Wrong OS version reported (e.g., Ubuntu 14.04 instead of 20.04)

Possible Causes:

  1. Custom banner modification by admin
  2. Container/virtualization masking host OS
  3. Load balancer presenting different banner

Solutions:

  • Cross-reference with other detection methods (TTL, TCP options)
  • Use --os-fingerprint for active OS detection
  • Verify banner format with manual connection: telnet target.com 22

Issue: Performance Degradation

Symptom: Service detection slower than expected

Possible Causes:

  1. Too many concurrent probes
  2. Network latency
  3. Service rate limiting

Solutions:

# Reduce parallelism
prtip -sV --max-parallel 50 target.com

# Faster timing template (less accurate)
prtip -sV -T4 target.com

# Disable protocol-specific detection
prtip -sS --no-service-detect target.com

References:

  1. Nmap Service Probes: nmap-service-probes database (187 probes, 5,572 patterns)
  2. RFC 4253: The Secure Shell (SSH) Transport Layer Protocol
  3. MS-SMB2: SMB 2 and 3 Protocol Specification
  4. MySQL Protocol: Client/Server Protocol Documentation
  5. PostgreSQL Protocol: Frontend/Backend Protocol Documentation

Sprint 5.2 Achievement:

  • Detection rate improvement: +10-15pp (70-80% → 85-90%)
  • Test coverage: 23 new unit tests (2,111 total passing as of Phase 6)
  • Performance overhead: <1% vs regex-only detection
  • New protocol modules: 5 (HTTP, SSH, SMB, MySQL, PostgreSQL)