AWS Observability with Splunk

Contact us to book this course
Delivery methods

On-Site, Virtual

Duration

1 day

This comprehensive, vertical-specific training demonstrates Splunk's strategic value for enterprise observability and security monitoring on AWS. Participants will deploy distributed Splunk architectures with indexer clustering, search head clustering, and SmartStore for S3-based data tiering.

The curriculum covers the complete data pipeline from Universal Forwarders through indexing to search optimization, with deep integration of AWS-native data sources including CloudTrail, VPC Flow Logs, GuardDuty, and Security Hub. By the end of the day, students will be able to architect production Splunk deployments with multi-AZ high availability, implement site-aware replication for fault tolerance, and optimize both indexing throughput and search performance using parallelIngestionPipelines, tstats, and data model acceleration.

Learning Objectives

  • Deploy and configure distributed Splunk Enterprise on AWS with indexer clusters and site-aware replication across multiple Availability Zones.

  • Implement SmartStore with Amazon S3 for cost-effective data tiering, configuring cache sizing formulas and monitoring cache hit rates for optimal search performance.

  • Master Advanced SPL including subsearches, lookups, macros, and the eval command to build correlation searches that link IAM changes in CloudTrail to anomalous traffic in VPC Flow Logs.

  • Configure search head clustering with Raft consensus for automatic captain election, deployer workflows for app distribution, and ALB integration with sticky sessions.

  • Integrate AWS-native security data sources using Splunk Add-on for AWS, including GuardDuty findings, Security Hub aggregation, and real-time CloudWatch metrics streaming.

  • Optimize indexing throughput with parallelIngestionPipelines and accelerate searches with tstats aggregations over indexed metadata.

  • Execute Business-Aligned Observability by completing a vertical-specific lab (Fintech, Healthcare, or Media) implementing production monitoring patterns for AWS workloads.
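
The correlation objective above can be sketched in SPL. The index names (aws_cloudtrail, vpc_flowlogs) and the byte threshold are placeholders for illustration; the Splunk Add-on for AWS typically assigns the aws:cloudtrail and aws:cloudwatchlogs:vpcflow sourcetypes, but field and index names may differ in your environment. The subsearch returns the source IPs behind recent IAM policy changes, and the outer search totals VPC flow traffic for those IPs:

```spl
index=vpc_flowlogs sourcetype="aws:cloudwatchlogs:vpcflow"
    [ search index=aws_cloudtrail sourcetype="aws:cloudtrail"
        eventName=AttachUserPolicy OR eventName=PutUserPolicy
      | rename sourceIPAddress AS srcaddr
      | fields srcaddr ]
| stats sum(bytes) AS total_bytes, count AS flows BY srcaddr, dstaddr
| where total_bytes > 100000000
```

In production this pattern is usually saved as a scheduled correlation search; the subsearch time range should be kept narrow, since subsearch results are capped.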

Who Should Attend

Splunk Administrators, Cloud Architects, Security Engineers, and DevOps/SREs responsible for deploying enterprise observability platforms on AWS. Previous experience with Linux CLI and basic AWS services (EC2, S3, IAM, VPC) is assumed. Familiarity with log management concepts is helpful but not required.

Course Outline

  • Distributed Architecture: Search Heads, Indexers, Forwarders, and Cluster Manager roles
  • The Bucket Lifecycle: Hot → Warm → Cold → Frozen and retention policies
  • EC2 Instance Selection: c5/c6i (compute-optimized) for indexers, r5/r6i (memory-optimized) for search heads
  • Lab: Installing Splunk Enterprise on Amazon EC2 with proper security groups and IAM roles 
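
The bucket lifecycle and SmartStore topics in this module come together in indexes.conf. A minimal sketch, assuming an S3 bucket named my-smartstore-bucket and an index named aws_cloudtrail (both placeholders):

```conf
# indexes.conf -- SmartStore volume backed by S3
[volume:remote_store]
storageType = remote
path = s3://my-smartstore-bucket/indexes

[aws_cloudtrail]
homePath   = $SPLUNK_DB/aws_cloudtrail/db
coldPath   = $SPLUNK_DB/aws_cloudtrail/colddb
thawedPath = $SPLUNK_DB/aws_cloudtrail/thaweddb
remotePath = volume:remote_store/$_index_name
# Retention: buckets roll to frozen (deleted by default) after ~90 days
frozenTimePeriodInSecs = 7776000
maxDataSize = auto_high_volume
```

Note that with SmartStore, warm buckets are uploaded to the remote store and fetched back through the local cache manager on demand, so the classic cold tier is effectively bypassed even though coldPath must still be defined.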
  • Universal Forwarders vs. Heavy Forwarders: When to use each architecture pattern
  • The Indexing Pipeline: Input Queue → Parsing Queue → Indexing Queue → Bucket Write
  • HTTP Event Collector (HEC): Token-based ingestion for applications and containers
  • Pipeline Health Monitoring: Queue fill percentages and bottleneck identification
  • Demo: Configuring data inputs with props.conf and transforms.conf for field extraction 
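
The demo above pairs a props.conf REPORT with a transforms.conf stanza for search-time field extraction. A minimal sketch; the stanza and field names are illustrative:

```conf
# props.conf -- bind the extraction to the sourcetype
[aws:cloudtrail]
REPORT-ct_user = extract_ct_user

# transforms.conf -- named capture group becomes the field name
[extract_ct_user]
REGEX = "userName"\s*:\s*"(?<ct_user_name>[^"]+)"
```

Because REPORT runs at search time, the extraction can be changed without re-indexing data, unlike index-time EXTRACT/TRANSFORMS pipelines.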
  • Splunk Add-on for AWS: CloudTrail, VPC Flow Logs, Config, and CloudWatch integration
  • S3-based Ingestion: SQS-based S3 input for scalable log collection
  • AWS Security Data: GuardDuty findings correlation and Security Hub aggregation
  • Real-time Streaming: CloudWatch Metrics via Kinesis Firehose to HEC
  • Lab: Onboarding CloudTrail and VPC Flow Logs with proper index routing  
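
Index routing in the lab above is typically done at the parsing tier (heavy forwarder or indexer) with an index-time transform. A sketch, assuming a destination index named aws_security (a placeholder):

```conf
# props.conf
[aws:cloudtrail]
TRANSFORMS-route = route_to_aws_security

# transforms.conf -- overwrite the destination index key
[route_to_aws_security]
REGEX = .
DEST_KEY = _MetaData:Index
FORMAT = aws_security
```

Here `REGEX = .` matches every event of the sourcetype; narrow the regex to route only a subset of events.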
  • Search Performance Principles: Filter early, specify time, use indexed fields
  • Advanced SPL: Subsearches, lookups, macros, and the eval command
  • tstats: 10-100x faster searches using indexed metadata (tsidx)
  • Data Model Acceleration: Pre-calculated summaries for instant dashboard queries
  • Demo: Converting slow searches to tstats and building accelerated data models  
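
The conversion shown in the demo often looks like the following pair of searches; the index name is a placeholder. The raw-event version must read events off disk:

```spl
index=aws_cloudtrail
| stats count BY sourcetype
```

The tstats equivalent answers the same question from tsidx metadata alone, which is where the large speedup comes from:

```spl
| tstats count where index=aws_cloudtrail by sourcetype
```

tstats works only with indexed fields (and accelerated data model fields), so searches that depend on search-time extractions need a data model acceleration before they can be converted.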
  • Fintech Lab: High-Frequency Fraud Detection
  • Healthcare Lab: Redacting PII with Sensitive Data Scanner
  • Media Lab: Monitoring Global Streaming Performance
  • Indexing Optimization: parallelIngestionPipelines = 2-4 for 30-50% throughput increase per pipeline
  • Compression Tuning: zstd for best compression-to-CPU ratio on new deployments
  • Disaster Recovery: Cross-region S3 replication and RTO/RPO planning
  • Monitoring with Internal Indexes: _internal metrics, _audit for search tracking, queue health dashboards
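
The two tuning bullets above map to small configuration changes. A sketch; parallelIngestionPipelines only pays off when spare CPU cores and I/O headroom exist, and journalCompression = zstd applies to newly created buckets on recent Splunk versions:

```conf
# server.conf -- add a second ingestion pipeline set
[general]
parallelIngestionPipelines = 2

# indexes.conf -- zstd journal compression for new buckets
[default]
journalCompression = zstd
```

Monitor pipeline queue fill percentages in the _internal index after the change to confirm the extra pipeline is actually relieving a bottleneck rather than just consuming cores.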
