In Part 1 of this series, we introduced our Stock Market Analytics Platform built on Google Cloud Platform (GCP). We covered the architecture, infrastructure deployment with Terraform, batch processing with Dataproc/Spark, and real-time data processing with Kafka.
In this second part, we'll explore the remaining components: data transformation with dbt, workflow orchestration with Airflow, data visualization, performance optimization, and monitoring.
After collecting both historical and real-time stock data in BigQuery, we use dbt (data build tool) to transform this raw data into analytics-ready models.
Our dbt project follows a three-layer architecture:
Staging layer:
- stg_daily_prices: Cleans historical daily price data
- stg_realtime_prices: Standardizes streaming data
- stg_company_metadata: Company information and sector classification

Intermediate layer:
- int_daily_metrics: Combines batch and streaming data
- int_technical_indicators: Calculates technical analysis metrics
- int_volatility_measures: Computes volatility and risk metrics

Marts layer:
- mart_stock_performance: Stock-level performance metrics
- mart_sector_performance: Sector-level aggregations
- mart_market_indicators: Market-wide indicators

Our transformations include:
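To make the intermediate-layer models concrete, here is a small Python sketch of the kind of calculations that int_technical_indicators and int_volatility_measures would express in SQL. The function names, window sizes, and sample prices are illustrative assumptions, not the project's actual dbt code or schema:

```python
# Hedged illustration of intermediate-layer calculations:
# daily returns and a simple moving average (as in int_technical_indicators)
# and rolling volatility of returns (as in int_volatility_measures).
# All names and parameters here are assumptions for illustration only.
from statistics import pstdev

def daily_returns(prices):
    """Percentage change between consecutive closing prices."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]

def simple_moving_average(prices, window):
    """Rolling mean over `window` periods; the shorter head is skipped."""
    return [sum(prices[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(prices))]

def rolling_volatility(prices, window):
    """Population std-dev of daily returns over a rolling window."""
    returns = daily_returns(prices)
    return [pstdev(returns[i - window + 1 : i + 1])
            for i in range(window - 1, len(returns))]

# Toy example: six closing prices, 3-day moving average.
closes = [100.0, 102.0, 101.0, 103.0, 104.0, 102.5]
print(simple_moving_average(closes, 3))  # first value: (100+102+101)/3 = 101.0
```

In the actual dbt models these would be window functions (e.g. AVG and STDDEV over a frame of preceding rows) running in BigQuery, so the logic executes where the data lives instead of in application code.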
Each dbt model includes: