Self Hosted Virtual Machine

Guidelines for Setting Up and Maintaining a Virtual Machine for ETL Workloads

This document outlines the requirements and best practices for setting up and maintaining a virtual machine to run ETL pipelines (e.g., dbt, Airbyte, Stitch, custom Docker workloads), tailored to support TargetBoard. Adhering to these guidelines will ensure optimal performance, security, and compatibility with our services.

As you will be hosting your own VM, you are responsible for its security and maintenance.

VM Type and Specifications

Minimal Compute Resources
  • CPU:
    • Minimum of 2 vCPUs.
  • Memory:
    • At least 8 GB RAM.
  • Disk:
    • Allocate a minimum of 256 GB disk storage.
    • Enable storage auto-scaling (if supported by the cloud provider).
  • Supported Platforms:
    • AWS, Google Cloud, Azure, or equivalent.
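The minimums above can be verified from inside the VM. The following is a minimal sketch, assuming a Linux guest (it reads /proc/meminfo). The function names and the 5% tolerance are illustrative assumptions, not part of any TargetBoard tooling; the tolerance exists because the kernel and the file system reserve part of the nominal capacity.

```python
import os
import shutil

# Minimums taken from the list above.
MIN_VCPUS = 2
MIN_RAM_GB = 8
MIN_DISK_GB = 256


def read_mem_total_gb(meminfo_path="/proc/meminfo"):
    """Return total RAM in GiB as reported by the Linux kernel."""
    with open(meminfo_path) as f:
        for line in f:
            if line.startswith("MemTotal:"):
                kib = int(line.split()[1])  # the kernel reports this value in KiB
                return kib / (1024 ** 2)
    raise RuntimeError("MemTotal not found in " + meminfo_path)


def measure(path="/"):
    """Collect (vcpus, ram_gb, disk_gb) for the current machine."""
    disk_gb = shutil.disk_usage(path).total / (1024 ** 3)
    return (os.cpu_count() or 0, read_mem_total_gb(), disk_gb)


def check_minimums(vcpus, ram_gb, disk_gb, tolerance=0.05):
    """Return a list of shortfalls; an empty list means all minimums are met.

    `tolerance` is an assumed allowance for capacity reserved by the
    kernel and the file system.
    """
    problems = []
    if vcpus < MIN_VCPUS:
        problems.append(f"vCPUs: {vcpus} < {MIN_VCPUS}")
    if ram_gb < MIN_RAM_GB * (1 - tolerance):
        problems.append(f"RAM: {ram_gb:.1f} GiB < {MIN_RAM_GB} GiB")
    if disk_gb < MIN_DISK_GB * (1 - tolerance):
        problems.append(f"Disk: {disk_gb:.0f} GiB < {MIN_DISK_GB} GiB")
    return problems
```

On the VM itself, `check_minimums(*measure())` returns the list of shortfalls for the running machine.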
Operating System
  • Recommended OS: Linux (Debian).
  • Keep the OS minimal (remove unnecessary packages and disable unneeded services).
Disk and Storage
  • Use SSD-backed storage for ETL workloads.
  • Configure file system with sufficient I/O throughput for Docker containers.
  • Enable disk monitoring and alerts for low free space.
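A low-free-space alert like the one described above could be sketched as follows. This is only an illustration: the 15% threshold and the function names are assumptions, and in practice the alert would usually be configured in the cloud provider's monitoring service rather than in a script.

```python
import shutil


def free_fraction(path="/"):
    """Return the fraction of free space (0.0 to 1.0) on the file system holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def low_space_alert(free_bytes, total_bytes, min_free_fraction=0.15):
    """Return an alert message when free space is below the threshold, else None.

    The 15% default is an assumed threshold; choose one that fits your workload.
    """
    fraction = free_bytes / total_bytes
    if fraction < min_free_fraction:
        return f"Low disk space: {fraction:.0%} free (threshold {min_free_fraction:.0%})"
    return None
```

A scheduled job could call `low_space_alert` with the values from `shutil.disk_usage("/")` and forward any message to your alerting channel.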

Network and Security Configuration

Firewall and IP Whitelisting
  • Deploy the VM behind a firewall.
  • Restrict inbound traffic only to whitelisted IPs used by TargetBoard.
    • TargetBoard VPN:
      • 69.55.59.137
  • If the database is hosted on-premises, ensure one of the following:
    • It resides on the same VPC as the VM.
    • Alternatively, its IP address is explicitly added to the whitelist above.
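The whitelist rule above can be modeled in a few lines, which is useful for testing a rule set before applying it. This is a minimal sketch: the actual enforcement belongs in your firewall or cloud security group, and the database address shown in the comment is a hypothetical placeholder from a documentation-only range.

```python
import ipaddress

# TargetBoard VPN address from the list above, expressed as a single-host network.
ALLOWED_SOURCES = [
    ipaddress.ip_network("69.55.59.137/32"),
    # Hypothetical example: an on-premises database outside the VPC.
    # ipaddress.ip_network("203.0.113.10/32"),
]


def is_allowed(source_ip, allowed=ALLOWED_SOURCES):
    """Return True if `source_ip` falls inside any whitelisted network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in allowed)
```

For example, `is_allowed("69.55.59.137")` is true, while any other source address is rejected.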
Outbound Ports
  • The VM requires outbound access on the following ports/protocols:
    • HTTP (80)
    • HTTPS (443)
    • UDP (for DNS resolution on port 53 and other services that require lightweight data transfer)
    • TCP (for the PostgreSQL connection; allow the PostgreSQL port, 5432 by default)
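Outbound TCP access can be spot-checked from the VM with a short script. The sketch below is illustrative only: the host names in the usage note are hypothetical placeholders, and the check covers TCP ports only, not UDP.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_outbound(targets, timeout=3.0):
    """Map each (host, port) pair in `targets` to its reachability."""
    return {(host, port): can_connect(host, port, timeout) for host, port in targets}
```

A call such as `check_outbound([("example.com", 80), ("example.com", 443), ("db.example.internal", 5432)])` would report which of the required ports are reachable; replace the hosts with the endpoints your pipelines actually use.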
Encryption and Security
  • Encryption at Rest:
    • Enable disk encryption with provider-managed keys (AWS KMS, GCP KMS, etc.).
  • Encryption in Transit:
    • Enforce TLS/SSL for all connections.
    • Avoid plain-text passwords in environment files.
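One way to combine both points for a PostgreSQL connection is to read the password from the process environment (populated by a secrets manager, for instance) and to request TLS in the connection string. This is a minimal sketch under stated assumptions: the variable name `ETL_DB_PASSWORD` and the function name are hypothetical. `sslmode=verify-full` is the libpq setting that encrypts the connection and verifies the server certificate; it requires the CA certificate to be available on the VM.

```python
import os
from urllib.parse import quote


def build_postgres_dsn(host, dbname, user, port=5432, password_var="ETL_DB_PASSWORD"):
    """Build a PostgreSQL connection URI that requires verified TLS.

    The password is read from the environment variable named by
    `password_var` (a hypothetical name) so it never appears in a file.
    """
    password = os.environ.get(password_var)
    if not password:
        raise RuntimeError(f"Environment variable {password_var} is not set")
    # Percent-encode user, password, and database name so special characters are safe.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}?sslmode=verify-full"
    )
```

The resulting string can be passed to any libpq-based client. Avoid logging it, since it contains the password.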
Access and Permissions
  • Provide TargetBoard support team with a restricted SSH user account if access is required.
  • Enforce key-based authentication (no password login).
  • Apply role-based access controls for multi-user environments.
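Whether password login is actually disabled can be checked against the SSH server configuration. The sketch below is a simplified illustration with hypothetical function names: it handles only whitespace-separated `keyword value` lines and ignores `Include` and `Match` directives, so the output of `sshd -T` remains the authoritative view of the effective configuration.

```python
def effective_sshd_options(config_text):
    """Map lowercase keyword -> value from sshd_config text.

    For each keyword the first value found is kept, mirroring how sshd
    reads its configuration file.
    """
    options = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # keyword without a value
        key, value = parts
        options.setdefault(key.lower(), value.strip().lower())
    return options


def password_login_disabled(config_text):
    """Return True only if PasswordAuthentication is explicitly set to 'no'."""
    opts = effective_sshd_options(config_text)
    # OpenSSH defaults to 'yes' when the keyword is absent.
    return opts.get("passwordauthentication", "yes") == "no"
```

Typical usage would be to pass the contents of /etc/ssh/sshd_config to `password_login_disabled` as part of a periodic audit.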

Maintenance and Updates

Maintenance Windows
  • Schedule maintenance windows during off-peak hours (Friday or Sunday).
  • Communicate downtime to stakeholders.
Updates and Patching
  • Apply automatic security updates to the OS.
  • Regularly update Docker and Docker Compose to the latest stable versions.
Backups and Recovery
  • VM Snapshots:
    • Take weekly full VM snapshots as a minimum.
    • Retain backups for at least 1 month.
  • Recovery:
    • Test VM restoration quarterly to ensure business continuity.
  • Monitoring:
    • Use provider tools such as Amazon CloudWatch or Google Cloud Monitoring (formerly Stackdriver), or an equivalent.
    • Track CPU, memory, disk usage, and network throughput.
  • Logging:
    • Centralized logs are recommended (Cloud Logging, ELK, or Datadog).
    • Enable auditing of SSH access.
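The snapshot schedule and retention rule above (weekly snapshots, kept for at least one month) can be expressed as a small policy check. This is a minimal sketch with hypothetical function names; it interprets "1 month" as 31 days, which is an assumption, and it only decides which snapshots are old enough to delete. Creating and deleting snapshots is done through your cloud provider's tooling.

```python
from datetime import date, timedelta

RETENTION = timedelta(days=31)  # "at least 1 month", taken here as 31 days (assumption)
MAX_GAP = timedelta(days=7)     # weekly snapshots as a minimum


def expired_snapshots(snapshot_dates, today):
    """Return the snapshot dates that are past the retention period, oldest first."""
    return sorted(d for d in snapshot_dates if today - d > RETENTION)


def snapshot_overdue(snapshot_dates, today):
    """Return True if no snapshot exists or the newest one is more than a week old."""
    return not snapshot_dates or today - max(snapshot_dates) > MAX_GAP
```

For example, with snapshots taken on 2024-02-01, 2024-03-01, and 2024-03-28 and a current date of 2024-03-31, only the 2024-02-01 snapshot is past retention and no new snapshot is overdue.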
Containerization
  • Install Docker and Docker Compose for ETL workloads.
  • Keep container images updated with security patches.
