Cloud Infrastructure Strategy

Module Purpose: This module defines the underlying cloud architecture required to support Cazo's "Private AI" strategy. It focuses on isolating tenant data and providing GPU-accelerated compute for self-hosted inference.

Strategic Goals

  1. Data Sovereignty: Ensure customer PII (chat logs, images) never leaves our VPC.
  2. Cost Predictability: Use reserved instances for predictable 24/7 inference workloads.
  3. Security: Strict network segmentation between "Public Web Tier" and "Private AI Tier".

Use Case Matrix

ID        | Capability        | Description
[CLD-001] | VPC Provisioning  | Set up an isolated network with public and private subnets.
[CLD-002] | Security Groups   | Whitelist traffic for internal AI APIs.
[CLD-003] | Inference Cluster | Provision GPU nodes for Ollama/vLLM.
[CLD-004] | Auto-Scaling      | Scale dynamically based on inference queue depth.
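
As a rough illustration of [CLD-004], the scaling decision can be sketched as a function from queue depth to a target node count. This is a minimal sketch, not the module's actual policy: the function name, the per-node capacity figure, and the minimum and maximum bounds are all hypothetical values chosen for illustration.

```python
import math


def desired_gpu_nodes(
    queue_depth: int,
    per_node_capacity: int,
    min_nodes: int = 1,   # hypothetical floor, keeps one warm node
    max_nodes: int = 8,   # hypothetical ceiling, caps GPU spend
) -> int:
    """Return the node count needed to drain the queue, clamped to the bounds."""
    if per_node_capacity <= 0:
        raise ValueError("per_node_capacity must be positive")
    if queue_depth < 0:
        raise ValueError("queue_depth must not be negative")
    # Round up so a partially filled node still gets provisioned.
    needed = math.ceil(queue_depth / per_node_capacity)
    return max(min_nodes, min(max_nodes, needed))
```

With an assumed capacity of 50 queued requests per node, a depth of 120 maps to 3 nodes, an empty queue stays at the floor of 1, and a very deep queue is capped at 8. The floor and ceiling are where reserved-instance sizing would plug in.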

WhatsApp Bot Infrastructure Requirements

The WhatsApp Bot is a key workload that spans both tiers: the Public Web Tier receives traffic from the Meta API, and the Private AI Tier runs NLP inference.

Infrastructure Components

Component           | Tier       | Purpose                              | Scaling
WhatsApp Webhook    | Public     | Receive messages from Meta Cloud API | Horizontal (Lambda/ECS)
Message Queue       | Private    | Buffer incoming messages             | SQS/Redis
NLP Inference       | Private AI | Intent classification, sentiment     | GPU nodes
Session State Store | Private    | Persist booking state (TTL-based)    | Redis Cluster
Audit Logs          | Private    | Compliance trail, immutable          | S3 + Athena
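
The TTL-based behavior of the Session State Store can be illustrated with a small in-memory stand-in. This is only a sketch of the expiry semantics, assuming a 15-minute TTL in the usage below. The class name, method names, and injectable clock are hypothetical, and a real deployment would rely on the Redis Cluster's own key expiry rather than this code.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLSessionStore:
    """In-memory stand-in for a TTL-based session store (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data: Dict[str, Tuple[float, dict]] = {}

    def put(self, session_id: str, state: dict) -> None:
        # Each write resets the expiry, so an active booking flow stays alive.
        self._data[session_id] = (self._clock() + self._ttl, dict(state))

    def get(self, session_id: str) -> Optional[dict]:
        entry = self._data.get(session_id)
        if entry is None:
            return None
        expires_at, state = entry
        if self._clock() >= expires_at:
            # Expired: drop the booking state so stale PII is not retained.
            del self._data[session_id]
            return None
        return state
```

A store created with `TTLSessionStore(ttl_seconds=900)` returns the saved booking state until 900 seconds pass without a write, after which `get` returns `None` and the conversation restarts from a clean state.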

Security Considerations

  • Webhook Authentication: Verify the Meta signature on every incoming request and reject any request that fails verification
  • Encryption: TLS 1.3 for all API calls (in transit), AES-256 for stored messages (at rest)
  • Data Residency: Customer PII stays within VPC (no external logging)
  • Audit Trail: All conversations logged for GDPR/DPDP compliance
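
The webhook authentication item can be sketched as follows. Meta signs webhook payloads with HMAC-SHA256 using the app secret and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The function name and parameter names below are hypothetical, and the sketch assumes the caller passes the raw request body bytes exactly as received, before any JSON parsing.

```python
import hashlib
import hmac
from typing import Optional


def verify_meta_signature(payload: bytes,
                          signature_header: Optional[str],
                          app_secret: str) -> bool:
    """Return True only if the header matches the HMAC-SHA256 of the raw body."""
    prefix = "sha256="
    if not signature_header or not signature_header.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode("utf-8"), payload,
                        hashlib.sha256).hexdigest()
    received = signature_header[len(prefix):]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"),
                               received.encode("utf-8"))
```

In the webhook handler, a request for which this returns `False` would be rejected (for example with a 401 or 403 response) before anything is placed on the message queue, so unauthenticated traffic never reaches the Private AI Tier.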