
Directors: Overview

Directors are the core data processing engines within the DataStream platform, responsible for collecting, processing, transforming, and routing security telemetry data from various sources to target destinations. They serve as the central orchestration layer that maintains data sovereignty by keeping sensitive information within your environment while providing centralized cloud-based management.

Provider → Device → Preprocessing → Pipeline → Postprocessing → Target → Consumer

Director (executes the entire processing flow as a service)

[Figure: directors-flow — Director processing flow from provider to consumer]

What is a Director?

A Director is a lightweight, containerized service that acts as a secure data processing hub in your infrastructure. It connects securely to the DataStream cloud platform for configuration management while ensuring all sensitive security data remains within your controlled environment.

Key Capabilities

A Director provides comprehensive data processing capabilities, ingesting security data from multiple sources including syslog, APIs, files, and databases. It applies real-time transformation and normalization using YAML-defined pipelines, supports multiple security schemas (ASIM, OCSF, ECS, CIM, UDM), and routes processed data to various destinations such as SIEM platforms, data lakes, and security tools.
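As an illustration, a normalization pipeline of this kind might look like the following sketch. The processor names and field paths here are hypothetical stand-ins, not verbatim DataStream syntax; consult your version's pipeline reference for the exact processors available:

```yaml
# Illustrative pipeline sketch (hypothetical processor and field names):
# parse raw syslog events and normalize them to ECS field names.
pipeline:
  name: syslog_to_ecs
  processors:
    - grok:                        # parse the raw syslog line into fields
        field: message
        pattern: "%{SYSLOGLINE}"
    - rename:                      # map source fields onto ECS names
        fields:
          - from: host
            to: host.hostname
    - set:                         # tag the event with its dataset
        field: event.dataset
        value: syslog
```

The same pipeline definition can target other schemas (ASIM, OCSF, CIM, UDM) by swapping the field mappings.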

From a security and compliance perspective, Directors maintain data sovereignty by processing all data locally. They establish outbound-only HTTPS connections to cloud management services, provide comprehensive audit logging and activity tracking, and support enterprise security requirements and compliance frameworks.

Directors are designed for scalability and reliability, offering horizontal scaling through clustering capabilities and high availability configurations for mission-critical environments. They provide resource-efficient processing with minimal infrastructure requirements and support automatic failover and load balancing in clustered deployments.

Platform Management Options

DataStream provides two distinct management approaches for Directors, each designed for different organizational needs and security requirements:

Self-Managed Directors

Self-Managed Directors provide complete control over the deployment and management of your data processing infrastructure. This option is ideal for organizations with specific security requirements or existing infrastructure management processes.

Self-Managed Directors give you full control over your deployment environment and configuration. You handle updates, patches, and maintenance directly, and can implement custom security controls and compliance configurations. This approach integrates with existing infrastructure monitoring and management tools, and supports air-gapped or restricted network environments.

Suitable For:

  • Organizations with strict data governance requirements
  • Environments with existing container orchestration systems
  • Companies requiring custom security configurations
  • Regulated industries with specific compliance needs

Managed Directors (Enterprise Feature)

Managed Directors offer a fully managed service in which VirtualMetric handles the infrastructure management, monitoring, and maintenance of your Directors while still maintaining data sovereignty.

With Managed Directors, VirtualMetric handles automated deployment and configuration management, along with proactive monitoring and maintenance. You receive automatic updates and security patches, 24/7 support and incident response, and performance optimization with capacity planning.

Enterprise Feature

Managed Directors are available as part of the Enterprise subscription.

Suitable For:

  • Organizations seeking reduced operational overhead
  • Teams without dedicated infrastructure management resources
  • Companies prioritizing time-to-value over operational control
  • Environments requiring guaranteed SLA and support coverage

Installation Types

Directors support different installation architectures to accommodate various operational requirements and scale needs:

Standalone Installation

Standalone is the default installation type, designed for straightforward deployments where a single Director instance handles all data processing needs. This configuration offers simplified management, resource-efficient operation, and quick deployment and setup.

Standalone Limitations

Standalone installations do not include built-in high availability or load balancing. A single Director instance represents a single point of failure, with limited horizontal scaling capabilities. Backup and disaster recovery procedures must be managed manually.

Recommended For:

  • Small to medium-scale deployments
  • Development and testing environments
  • Organizations with basic availability requirements
  • Initial proof-of-concept implementations

Clustered Installation (Enterprise Feature)

Clusters provide high availability by grouping multiple Directors together for automatic failover. When one Director in a cluster fails, the remaining Directors take over its workload, ensuring continuous telemetry processing. Agents and Devices connect to the cluster as a whole rather than to individual Directors, so operations continue uninterrupted as long as at least one Director remains healthy.

To create a cluster, you first deploy Directors as standalone instances, then group them through the Clusters tab in the Directors management interface. A cluster requires a minimum of 3 Directors, and the count must be an odd number (3, 5, 7...) to maintain quorum and prevent split-brain scenarios: with 3 Directors a majority of 2 survives one failure, and with 5 a majority of 3 survives two. Routes configured for a cluster are automatically replicated across all member Directors.

Enterprise Feature

Clustered installations are available as part of the Enterprise subscription.

Recommended For:

  • Mission-critical security data processing
  • High-volume environments requiring guaranteed availability
  • Organizations with strict SLA requirements
  • Production deployments requiring enterprise-grade reliability

Director Architecture and Data Flow

Directors operate as secure intermediaries between your security data sources and target destinations, implementing a data sovereignty model that keeps sensitive information within your controlled environment.

Data Processing Architecture

The Input Layer handles multiple simultaneous data source connections through protocol-agnostic ingestion supporting Syslog, REST APIs, and file monitoring. It provides both real-time streaming and batch processing capabilities, with built-in buffering and queuing for reliability.

The Processing Layer executes YAML-defined transformation pipelines that perform multi-schema normalization and enrichment. This layer applies real-time data validation and quality checks, and enables custom logic implementation through processors.

The Output Layer manages multi-destination routing and delivery, adapting formats for different target systems. It handles delivery confirmation and retry mechanisms, with performance optimization tailored for various endpoint types.
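A route definition in the same spirit, again with hypothetical keys rather than verbatim product syntax, might fan the output of one pipeline out to two destinations with per-target formatting:

```yaml
# Illustrative route sketch (hypothetical keys and target types):
# deliver processed events to a SIEM and a data lake in parallel.
route:
  name: firewall_events
  pipeline: syslog_to_ecs       # pipeline whose output this route delivers
  targets:
    - type: sentinel            # SIEM destination, schema-formatted
      format: asim
    - type: s3                  # data-lake destination, columnar files
      format: parquet
      retry:
        max_attempts: 5         # delivery confirmation with retries
```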

Security and Connectivity Model

Directors use an outbound-only communication model where they initiate all cloud platform connections. This design eliminates the need for inbound firewall rules. All cloud interactions use encrypted HTTPS communication with certificate-based authentication and authorization.

The architecture maintains strict data sovereignty: all security data processing occurs locally, and no sensitive data is transmitted to cloud services. Cloud synchronization is limited to configuration and metadata only. A complete audit trail supports compliance and governance requirements.

Agent Pre-Processing Architecture

VirtualMetric Agents support optional pipeline-based pre-processing before sending data to Directors. This distributed processing model reduces Director workload and enables edge-based data transformation.

Processing Models

In the Traditional Model, the Agent collects logs locally at the endpoint and sends raw data to the Director. The Director then processes data through pipelines and forwards the processed data to targets.

In the Pre-Processing Model, the Agent collects logs locally at the endpoint and processes data through configured pipelines before sending pre-processed data to the Director. The Director forwards data to targets, with optional additional processing if needed.

Pre-Processing Benefits

Pre-processing reduces Director processing load through distributed computation and lowers network bandwidth consumption via edge-based filtering and transformation. This approach improves scalability for large-scale deployments with multiple Agents and enables faster data delivery through parallel processing at collection points.

From an architectural perspective, edge-based filtering reduces unnecessary data transmission while local transformation enables compliance requirements at the data source. The distributed processing model supports horizontal scaling and reduces central processing bottlenecks in high-volume environments.

Pre-Processing Configuration

Agent pre-processing is configured through the Director's device configuration for that Agent. Pipelines assigned to Agent devices execute locally on the Agent, using the same pipeline syntax and processors as Director pipelines. Configuration is managed centrally through the Director for consistency.
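A device definition along these lines (keys and the pipeline name are hypothetical, for illustration only) might assign a pipeline to an Agent so that it runs on the endpoint before data reaches the Director:

```yaml
# Illustrative device definition (hypothetical keys): a pipeline
# attached to an Agent device executes locally on that Agent.
device:
  name: branch-office-winlog
  type: agent
  pipelines:
    - edge_filter        # runs on the Agent, not on the Director
```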

Tip

Agent pipelines support hot configuration reload. Changes made in the Director interface are synchronized to Agents automatically without requiring an Agent restart.

Use Cases for Agent Pre-Processing

In high-volume environments, you can filter non-essential logs at the collection point before transmission, reduce network bandwidth for high-volume log sources, and distribute processing load across multiple Agent endpoints.
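As a sketch of such edge filtering (hypothetical processor names and condition syntax, not verbatim DataStream syntax), a pipeline might drop noisy events before they leave the endpoint:

```yaml
# Illustrative edge-filtering sketch (hypothetical processors):
# discard non-essential events at the collection point.
pipeline:
  name: drop_noise
  processors:
    - drop:
        if: "log.level == 'debug'"          # discard debug-level chatter
    - drop:
        if: "event.code in [4656, 4658]"    # high-volume Windows handle events
```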

For compliance and privacy, mask sensitive data (PII, credentials) at the source before transmission. Apply regulatory transformations at the data collection point so that data is compliant before it leaves the endpoint network.
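A masking pipeline in this style (again with hypothetical processor names and options) could redact PII on the Agent so that raw values never cross the network:

```yaml
# Illustrative masking sketch (hypothetical processors): redact
# sensitive fields before the event leaves the endpoint.
pipeline:
  name: mask_pii
  processors:
    - mask:
        field: user.email
        replacement: "[REDACTED]"
    - mask:
        field: client.ip
        method: hash       # preserve correlation without the raw IP
```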

In edge computing scenarios, process data locally in remote or branch offices to minimize data transmission to the central Director. This approach supports disconnected or intermittent connectivity scenarios.

For cost optimization, reduce Director infrastructure requirements through distributed processing. Lower network bandwidth costs via edge-based filtering and optimize central processing capacity allocation.

Configuration Considerations

When implementing Agent pre-processing, balance processing load between Agents and Directors based on infrastructure capacity. Consider network latency and bandwidth when deciding what to process at the edge. Use Agent pre-processing for filtering and basic transformations, reserving complex processing (enrichment, external lookups) for the Director when possible. Monitor Agent resource utilization to prevent endpoint performance impact.