Section 3.4: Environment Strategy (Dev, Staging, Prod) (Organizing Our Kitchen Stations)

We adopt a standard three-environment strategy to ensure stability and quality.

  • 3.4.1 Purpose and Configuration Goals for Each Environment

    • Development (Dev):

      • Purpose: For developers to write and test code locally or in personal cloud workspaces. Focus on iteration speed and individual productivity.

      • Configuration: Local machines (with Docker for consistency) or cloud-based IDEs (GitHub Codespaces, AWS Cloud9, SageMaker Studio). Access to sampled, anonymized, or synthetic data. Minimal resources. Uses feature branches.

    • Staging (Pre-Production):

      • Purpose: To test code changes in an environment that mirrors production before deploying to live users. Focus on integration, end-to-end testing, and performance validation.

      • Configuration: Dedicated AWS account. Infrastructure managed by Terraform as an identical or scaled-down replica of Prod. Deploys from the main branch after PR merge. Uses staging-specific data sources (e.g., a separate S3 bucket with a larger, more realistic dataset than Dev, but never live Prod data). Runs full integration tests and load tests.

    • Production (Prod):

      • Purpose: To serve live user traffic. Focus on stability, reliability, performance, and security.

      • Configuration: Dedicated AWS account. Infrastructure managed by Terraform. Deploys from the main branch only after successful Staging validation and manual approval. Uses live production data sources. Comprehensive monitoring and alerting.
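The three-environment split above can be captured in a small configuration map that pipelines and deploy scripts consult at startup. This is a minimal sketch: the bucket names, branch patterns, and field names are hypothetical, and the real values would live in Terraform and CI configuration rather than in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvConfig:
    """Per-environment settings (illustrative fields, not a fixed schema)."""
    name: str
    data_bucket: str              # hypothetical bucket names
    deploy_branch: str
    requires_manual_approval: bool
    uses_live_data: bool


# Mirrors the Dev / Staging / Prod goals described above.
ENVIRONMENTS = {
    "dev": EnvConfig("dev", "s3://myproject-dev-sample-data", "feature/*", False, False),
    "staging": EnvConfig("staging", "s3://myproject-staging-data", "main", False, False),
    "prod": EnvConfig("prod", "s3://myproject-prod-data", "main", True, True),
}


def config_for(env: str) -> EnvConfig:
    """Look up settings for a named environment, failing fast on typos."""
    try:
        return ENVIRONMENTS[env]
    except KeyError:
        raise ValueError(f"Unknown environment: {env!r}") from None
```

A lookup like `config_for("prod")` then tells a deploy script that manual approval is required and live data is in play, while `config_for("dev")` points at sampled data only.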

  • 3.4.2 Data Access Strategy and Permissions Across Environments

    • Dev: Read-only access to specific, small, and potentially anonymized/synthetic datasets (e.g., sample of S3 data). No access to production databases or sensitive user data.

    • Staging: Read-only access to dedicated staging data sources that mimic production data structure and volume but are not live production data. This might be a regularly refreshed, sanitized snapshot of production data or a large, curated test dataset.

    • Prod:

      • Data Ingestion Pipeline: Read access to raw data sources (scraping targets, APIs). Write access to its S3 processed data bucket.

      • Training Pipeline: Read access to processed data in S3 (Prod). Write access to Model Registry (W&B) and artifact stores.

      • Inference Pipeline (LLM path): Read access to processed data in S3 (Prod). Write access to the enriched data store for the FastAPI backend.

      • FastAPI Backend: Read access to its enriched data store. No write access to the core data pipelines; it writes only to its own logs.

    • IAM Roles: Define specific IAM roles for each pipeline/service within each environment to enforce least privilege.
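To make the least-privilege idea concrete, here is a sketch of a read-only S3 policy of the kind such a role might attach, e.g. for the Training Pipeline's read access to processed data. The bucket name and `Sid` are illustrative; in practice these policies would be declared in Terraform, not generated in application code.

```python
import json


def read_only_s3_policy(bucket: str) -> dict:
    """Build a least-privilege policy granting read-only access to one bucket.

    Grants only GetObject/ListBucket on the named bucket, nothing else.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadProcessedData",  # illustrative statement id
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",       # ListBucket targets the bucket
                    f"arn:aws:s3:::{bucket}/*",     # GetObject targets its objects
                ],
            }
        ],
    }


# Hypothetical usage: the training pipeline's role in Staging.
policy_json = json.dumps(read_only_s3_policy("myproject-staging-processed"), indent=2)
```

Each pipeline's role would carry only the statements it needs (e.g. an additional write statement scoped to the Data Ingestion Pipeline's processed-data bucket), so a compromised or misbehaving service cannot touch the other environments' data.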