Anthropic Models on Amazon Bedrock

Delivery methods

On-Site, Virtual

Duration

1 day

This course teaches developers and cloud practitioners to build production AI applications using Anthropic’s Claude models on Amazon Bedrock. Participants learn to invoke Claude programmatically via boto3, implement RAG with Knowledge Bases, scaffold Bedrock integration code using Claude Code, configure production guardrails, and deploy serverless APIs on AWS Lambda. The course covers billing relationships, prompt caching, cost optimization, governance and compliance considerations, and architectural patterns for multi-region deployment. All content is delivered at the 200 level, assuming working knowledge of AWS and Python.

Learning Objectives

Through three hands-on labs, participants build a complete e-commerce customer support system integrating S3, OpenSearch Serverless, Lambda, API Gateway, and CloudWatch with Claude via Bedrock APIs. The course provides practical guidance on model selection across Opus, Sonnet, and Haiku, programmatic token consumption and cost measurement, Claude Code for agentic development workflows, guardrail design as a policy decision, and multi-region inference profile configuration. By course end, participants have deployed a working production-patterned API and understand the full stack from model invocation through serverless deployment.

Who Should Attend

Software Developers, Solutions Architects, DevOps Engineers, Machine Learning Engineers, and Technical Managers building AI-powered applications on AWS.

Prerequisites

Basic familiarity with AWS console and core services (S3, Lambda, IAM)
Understanding of REST APIs and JSON
Programming experience in Python or JavaScript
Familiarity with command line and terminal operations
Basic understanding of cloud costs and IAM policies

1Foundations: Generative AI and Claude Models on AWS

Anthropic’s Claude model family: capabilities, context windows, cross-region inference profiles
Model selection framework: task complexity, latency requirements, and cost trade-offs
Model deprecation and version pinning production best practices

2AWS Infrastructure and Deployment Strategy

Three billing relationships: Bedrock on-demand, Bedrock provisioned throughput, and Anthropic API direct
What gets charged: input tokens, output tokens, tool overhead, and the 5x output multiplier
Monitoring spend: Bedrock Console usage tab, CloudWatch token metrics, AWS Budgets
Regional availability, latency trade-offs, and cross-region inference profile routing

3Claude API Fundamentals and Prompt Engineering

Messages API: requests, responses, system prompts, and conversation structure
Claude-specific prompt engineering: XML tags for structure, chain-of-thought reasoning, and few-shot examples
Prompt caching: cache write vs. cache read pricing, break-even analysis, and implementation pattern with cache_control
Streaming implementation and error handling with exponential backoff

4Data Preparation and Knowledge Bases for RAG

Knowledge bases in Bedrock: data-to-embedding conversion and retrieval-augmented generation
The retrieve_and_generate API vs. the two-step retrieve then invoke_model pattern
Chunking strategies, embedding model selection, and vector store options
Retrieval accuracy, citation generation, and relevance score interpretation
Query optimization: vague vs. specific phrasing and its effect on retrieval scores

5Tool Use and Agentic Workflows with Claude

When to use tools vs. RAG vs. prompts: a decision framework for production system design
The agentic loop: how stop_reason drives iteration, why max_iterations matters, and what end_turn signals
Tool design best practices: separate tools vs. generic lookup, error-as-data vs. exception, optional parameter defaults
Bedrock Agents: managed orchestration, action groups, and the relationship to a hand-built agentic loop

6Security, Governance, and Guardrails

Constitutional AI: how Claude's trained values interact with configured guardrails as defense in depth
IAM least privilege for Bedrock: inference profile ARNs, wildcard resource risk, and student permission scoping
Bedrock Guardrails: content filtering, denied topics, PII anonymization vs. blocking, and prompt attack protection
Governance: Bedrock vs. direct Anthropic API -- CloudTrail integration, data residency, and what “data stays on AWS” actually means
Governance configuration checklist: VPC endpoints, CloudTrail, KMS, IAM scoping, Budgets, guardrail publishing
Responsible AI: hallucination, bias, scope creep, over-reliance, and the escalate-to-human pattern
CloudWatch monitoring: guardrail block metrics, token counts, latency, and cost attribution

7Practical Architecture and Deployment Patterns

Five-stage pipeline: ingestion, retrieval, invocation, response, and monitoring
Synchronous vs. asynchronous patterns and when each applies
Prompt caching as an architecture pattern: structuring requests to maximize cache hit rates
Cost optimization strategies: model selection, prompt caching, batching, and inference profile selection

8Claude Code: Building Bedrock Applications from the Terminal

CLAUDE.md as team memory: code conventions, writing standards, and persistence across sessions
Claude Code Skills (SKILL.md): reusable playbooks, slash commands, and project vs. user scope
Plan mode and model switching: Opus for architecture planning, Sonnet for iterative implementation
Git safety patterns with Claude Code: commit-before-change, revert-not-fix, and why AI-generated code changes the habit
Session management: /compact, /clear, and context cost awareness
Devcontainer security: network isolation, sandboxed credentials, firewall rules, and security audit patterns

9Amazon Kiro and Enterprise Code Generation

Claude Code vs. Amazon Kiro: developer-focused terminal workflows vs. spec-driven enterprise platform
Bedrock AgentCore: managed agent runtime, built-in session memory, observability, and when to use it vs. Lambda
Three-tool decision framework: Claude Code for exploration, Kiro for structured delivery, AgentCore for production agent deployment

10Lab 1: Claude on Bedrock with RAG

Invoke Claude Sonnet, Opus, and Haiku programmatically via boto3 using inference profile IDs
Measure and compare latency, output quality, and JSON schema compliance across models
Review production-scale cost projections generated from actual token counts at 10,000 and 100,000 requests per day
Run RAG queries, evaluate citation accuracy against source documents, and compare relevance scores across query phrasing

11Lab 2: Claude Code on Bedrock

Install and configure Claude Code with Bedrock as the backend using the setup wizard
Create CLAUDE.md with code conventions and writing standards before planning begins
Use Plan mode with Opus to analyze the full project before writing code; switch to Sonnet for implementation
Practice interrupt and steer, verification criteria, and delegation collaboration techniques
Create a /bedrock-debug Claude Code skill for use in Lab 3
Inspect, audit, and fix Critical and High security findings in a production devcontainer configuration
Generate a production-ready Dockerfile to complete the devcontainer configuration
Manage session context with /compact and /clear

12Lab 3: Production Tool Use, Guardrails, and Serverless Deployment

Complete a tool execution function handling order lookup, ticket creation, and account status tools
Classify customer requests by appropriate approach: tool, RAG, or prompt
Create a Bedrock Guardrail with content filters, denied topics, and PII anonymization
Configure guardrail version publishing and test guardrail protection against prompt injection and denied topics
Deploy a serverless API using AWS SAM: Lambda, API Gateway, and CloudWatch integration
Debug tool execution issues using CloudWatch log streams

Ready to accelerate your team's innovation?

Schedule a meeting

Unlock your team’s potential and get the most from your tech stack

Schedule a meeting