Anthropic Models on Amazon Bedrock
Contact us to book this courseOn-Site, Virtual
1 day
This course teaches developers and cloud practitioners to build production AI applications using Anthropic’s Claude models on Amazon Bedrock. Participants learn to invoke Claude programmatically via boto3, implement RAG with Knowledge Bases, scaffold Bedrock integration code using Claude Code, configure production guardrails, and deploy serverless APIs on AWS Lambda. The course covers billing relationships, prompt caching, cost optimization, governance and compliance considerations, and architectural patterns for multi-region deployment. All content is delivered at the 200 level, assuming working knowledge of AWS and Python.
Learning Objectives
Through three hands-on labs, participants build a complete e-commerce customer support system integrating S3, OpenSearch Serverless, Lambda, API Gateway, and CloudWatch with Claude via Bedrock APIs. The course provides practical guidance on model selection across Opus, Sonnet, and Haiku, programmatic token consumption and cost measurement, Claude Code for agentic development workflows, guardrail design as a policy decision, and multi-region inference profile configuration. By course end, participants have deployed a working production-patterned API and understand the full stack from model invocation through serverless deployment.
Who Should Attend
Software Developers, Solutions Architects, DevOps Engineers, Machine Learning Engineers, and Technical Managers building AI-powered applications on AWS.
Prerequisites
- Basic familiarity with AWS console and core services (S3, Lambda, IAM)
- Understanding of REST APIs and JSON
- Programming experience in Python or JavaScript
- Familiarity with command line and terminal operations
- Basic understanding of cloud costs and IAM policies
Course outline
- Anthropic’s Claude model family: capabilities, context windows, cross-region inference profiles
- Model selection framework: task complexity, latency requirements, and cost trade-offs
- Model deprecation and version pinning production best practices
- Three billing relationships: Bedrock on-demand, Bedrock provisioned throughput, and Anthropic API direct
- What gets charged: input tokens, output tokens, tool overhead, and the 5x output multiplier
- Monitoring spend: Bedrock Console usage tab, CloudWatch token metrics, AWS Budgets
- Regional availability, latency trade-offs, and cross-region inference profile routing
- Messages API: requests, responses, system prompts, and conversation structure
- Claude-specific prompt engineering: XML tags for structure, chain-of-thought reasoning, and few-shot examples
- Prompt caching: cache write vs. cache read pricing, break-even analysis, and implementation pattern with cache_control
- Streaming implementation and error handling with exponential backoff
- Knowledge bases in Bedrock: data-to-embedding conversion and retrieval-augmented generation
- The retrieve_and_generate API vs. the two-step retrieve then invoke_model pattern
- Chunking strategies, embedding model selection, and vector store options
- Retrieval accuracy, citation generation, and relevance score interpretation
- Query optimization: vague vs. specific phrasing and its effect on retrieval scores
- When to use tools vs. RAG vs. prompts: a decision framework for production system design
- The agentic loop: how stop_reason drives iteration, why max_iterations matters, and what end_turn signals
- Tool design best practices: separate tools vs. generic lookup, error-as-data vs. exception, optional parameter defaults
- Bedrock Agents: managed orchestration, action groups, and the relationship to a hand-built agentic loop
- Constitutional AI: how Claude's trained values interact with configured guardrails as defense in depth
- IAM least privilege for Bedrock: inference profile ARNs, wildcard resource risk, and student permission scoping
- Bedrock Guardrails: content filtering, denied topics, PII anonymization vs. blocking, and prompt attack protection
- Governance: Bedrock vs. direct Anthropic API -- CloudTrail integration, data residency, and what “data stays on AWS” actually means
- Governance configuration checklist: VPC endpoints, CloudTrail, KMS, IAM scoping, Budgets, guardrail publishing
- Responsible AI: hallucination, bias, scope creep, over-reliance, and the escalate-to-human pattern
- CloudWatch monitoring: guardrail block metrics, token counts, latency, and cost attribution
- Five-stage pipeline: ingestion, retrieval, invocation, response, and monitoring
- Synchronous vs. asynchronous patterns and when each applies
- Prompt caching as an architecture pattern: structuring requests to maximize cache hit rates
- Cost optimization strategies: model selection, prompt caching, batching, and inference profile selection
- CLAUDE.md as team memory: code conventions, writing standards, and persistence across sessions
- Claude Code Skills (SKILL.md): reusable playbooks, slash commands, and project vs. user scope
- Plan mode and model switching: Opus for architecture planning, Sonnet for iterative implementation
- Git safety patterns with Claude Code: commit-before-change, revert-not-fix, and why AI-generated code changes the habit
- Session management: /compact, /clear, and context cost awareness
- Devcontainer security: network isolation, sandboxed credentials, firewall rules, and security audit patterns
- Claude Code vs. Amazon Kiro: developer-focused terminal workflows vs. spec-driven enterprise platform
- Bedrock AgentCore: managed agent runtime, built-in session memory, observability, and when to use it vs. Lambda
- Three-tool decision framework: Claude Code for exploration, Kiro for structured delivery, AgentCore for production agent deployment
- Invoke Claude Sonnet, Opus, and Haiku programmatically via boto3 using inference profile IDs
- Measure and compare latency, output quality, and JSON schema compliance across models
- Review production-scale cost projections generated from actual token counts at 10,000 and 100,000 requests per day
- Run RAG queries, evaluate citation accuracy against source documents, and compare relevance scores across query phrasing
- Install and configure Claude Code with Bedrock as the backend using the setup wizard
- Create CLAUDE.md with code conventions and writing standards before planning begins
- Use Plan mode with Opus to analyze the full project before writing code; switch to Sonnet for implementation
- Practice interrupt and steer, verification criteria, and delegation collaboration techniques
- Create a /bedrock-debug Claude Code skill for use in Lab 3
- Inspect, audit, and fix Critical and High security findings in a production devcontainer configuration
- Generate a production-ready Dockerfile to complete the devcontainer configuration
- Manage session context with /compact and /clear
- Complete a tool execution function handling order lookup, ticket creation, and account status tools
- Classify customer requests by appropriate approach: tool, RAG, or prompt
- Create a Bedrock Guardrail with content filters, denied topics, and PII anonymization
- Configure guardrail version publishing and test guardrail protection against prompt injection and denied topics
- Deploy a serverless API using AWS SAM: Lambda, API Gateway, and CloudWatch integration
- Debug tool execution issues using CloudWatch log streams