advanced
16 min read
15 January 2025

Anthropic Claude API Guide: Building Production AI Applications

Master the Claude API for sophisticated AI applications. Extended context windows, tool use, vision capabilities, and production patterns for Australian businesses building with Anthropic's models.

Clever Ops Team

Anthropic's Claude API offers unique capabilities that differentiate it from alternatives: a massive 200K token context window, thoughtful safety-first responses, and exceptional performance on complex reasoning tasks. For Australian businesses building AI applications, understanding Claude's strengths—and how to leverage them—opens opportunities for solutions that other models struggle with.

This guide covers production Claude API development: from basic completions through tool use, vision capabilities, and patterns for building robust applications. You'll learn techniques for handling long documents, implementing AI agents, and optimising costs—all with practical code examples ready for your Australian business applications.

Key Takeaways

  • Claude's 200K token context window enables whole-document analysis without chunking—ideal for contracts, reports, and codebases
  • Claude 3.5 Sonnet is the best default: strong performance at $3/$15 per million tokens (input/output)
  • Tool use enables building AI agents that take real actions: API calls, database queries, multi-step workflows
  • Vision capabilities process images alongside text for document extraction, analysis, and multimodal applications
  • Prompt caching reduces costs by 90% for repeated context—essential for chatbots and document Q&A
  • Implement retry logic with exponential backoff—Claude API can be temporarily overloaded during peak times
  • Monitor usage carefully and use the right model for each task—Haiku costs 60x less than Opus

Claude API Overview

Anthropic offers the Claude model family through a straightforward API that's similar to but distinct from OpenAI's approach.

  • 200K token context window
  • $3 USD per 1M input tokens (Sonnet)
  • 8K max output tokens

Claude Model Comparison

Model               Best For            Input/1M    Output/1M   Speed
claude-3-5-sonnet   Best all-rounder    $3 USD      $15 USD     Fast
claude-3-opus       Complex reasoning   $15 USD     $75 USD     Slower
claude-3-haiku      Speed & cost        $0.25 USD   $1.25 USD   Fastest

Model Selection: Claude 3.5 Sonnet is the recommended default for most Australian business applications. It matches or exceeds GPT-4o on most benchmarks at lower cost. Use Opus only for your most demanding analytical tasks, Haiku for high-volume simple operations.
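To make the trade-off concrete, here is a small cost estimator using the per-million-token rates from the comparison table above. The dated model IDs are illustrative snapshots; check Anthropic's pricing page for current figures.

```python
# Input/output rates in USD per million tokens, per the comparison table above.
PRICING_USD_PER_MTOK = {
    "claude-3-5-sonnet-20241022": (3.00, 15.00),
    "claude-3-opus-20240229": (15.00, 75.00),
    "claude-3-haiku-20240307": (0.25, 1.25),
}

def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single call or a monthly token volume."""
    input_rate, output_rate = PRICING_USD_PER_MTOK[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
```

A month of 1M input and 1M output tokens costs $1.50 on Haiku versus $90 on Opus—the 60x difference noted in the takeaways.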

Getting Started

Python Setup

pip install anthropic

from anthropic import Anthropic

client = Anthropic()  # Uses ANTHROPIC_API_KEY env var

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude!"}
    ]
)

print(message.content[0].text)

TypeScript/JavaScript Setup

npm install @anthropic-ai/sdk

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

const message = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Hello, Claude!" }
  ]
});

console.log(message.content[0].text);

The 200K Context Advantage

Claude's 200,000 token context window is its most distinctive feature—approximately 150,000 words or 500+ pages. This enables applications impossible with smaller context models.

What Fits in 200K Tokens?

  • ~500 pages of typical business documents
  • Full codebase of a medium-sized application
  • Complete contracts with all schedules and annexures
  • Multiple research papers for synthesis
  • Entire book or lengthy report
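Before sending a large document, it is worth a rough size check. Token counts vary by content, but roughly four characters per token is a common rule of thumb for English text (Anthropic also offers a token counting endpoint; this offline heuristic is just a sketch):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for typical English prose.
    return len(text) // 4

def fits_in_context(text: str, context_limit: int = 200_000,
                    reserved_output: int = 4_096) -> bool:
    """Check a document plausibly fits, leaving headroom for the prompt's
    instructions and the model's response."""
    return estimate_tokens(text) + reserved_output <= context_limit
```

Treat the result as a guide only—legal boilerplate, code, and tables all tokenise differently.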

Long Document Processing Pattern

Analysing Long Documents

from anthropic import Anthropic

client = Anthropic()

# Load your long document
with open("contract.txt", "r") as f:
    document = f.read()

# Analyse in single call - no chunking needed
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": f"""Analyse this contract and provide:
1. Key terms and conditions
2. Unusual or concerning clauses
3. Missing standard provisions
4. Risk summary with severity ratings

Contract:
{document}"""
        }
    ]
)

print(message.content[0].text)

Multi-Document Comparison

Compare Multiple Documents

# Compare multiple documents in one call.
# load_document() stands in for your own PDF text-extraction helper.
documents = {
    "proposal_a": load_document("proposal_a.pdf"),
    "proposal_b": load_document("proposal_b.pdf"),
    "proposal_c": load_document("proposal_c.pdf"),
}

prompt = f"""Compare these three vendor proposals for our CRM implementation:

PROPOSAL A:
{documents['proposal_a']}

PROPOSAL B:
{documents['proposal_b']}

PROPOSAL C:
{documents['proposal_c']}

Provide:
1. Comparison table of key features, pricing, timeline
2. Strengths and weaknesses of each
3. Risk analysis
4. Recommendation with justification"""

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=4096,
    messages=[{"role": "user", "content": prompt}]
)

Case Study: Sydney Law Firm

Commercial law firm using Claude API for due diligence.

  • Use case: M&A contract analysis
  • Documents: 50-200 pages per deal
  • Previous approach: Manual review, 40+ hours
  • With Claude: Initial analysis in minutes, review in hours
  • Key benefit: Entire contract fits in context—no chunking artifacts

Context Strategy: Even with 200K tokens, put the most important content first. Claude maintains coherence throughout but attention is naturally stronger at the beginning. Structure your prompts: question → most relevant content → supporting context.
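Following that ordering, a small helper can assemble long-document prompts consistently. A sketch—the section labels are arbitrary:

```python
def build_long_doc_prompt(question: str, key_content: str,
                          supporting_context: str = "") -> str:
    """Assemble a prompt as: question first, then the most relevant
    content, then any supporting context."""
    parts = [question, "MOST RELEVANT CONTENT:", key_content]
    if supporting_context:
        parts += ["SUPPORTING CONTEXT:", supporting_context]
    return "\n\n".join(parts)
```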


Tool Use: Building AI Agents

Claude's tool use (function calling) enables building AI agents that can take actions: query databases, call APIs, execute workflows. The model decides when and how to use tools based on the conversation.

Defining Tools

Tool Definition Structure

tools = [
    {
        "name": "get_customer_data",
        "description": "Retrieve customer information from the CRM system. Use this when the user asks about a specific customer.",
        "input_schema": {
            "type": "object",
            "properties": {
                "customer_id": {
                    "type": "string",
                    "description": "The unique customer identifier (e.g., 'CUS-12345')"
                },
                "include_orders": {
                    "type": "boolean",
                    "description": "Whether to include order history"
                }
            },
            "required": ["customer_id"]
        }
    },
    {
        "name": "search_products",
        "description": "Search the product catalog. Use when the user wants to find or compare products.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search terms"
                },
                "category": {
                    "type": "string",
                    "description": "Product category filter"
                },
                "max_results": {
                    "type": "integer",
                    "description": "Maximum results to return"
                }
            },
            "required": ["query"]
        }
    }
]

Complete Tool Use Pattern

Full Tool Use Implementation

from anthropic import Anthropic
import json

client = Anthropic()

def process_tool_call(tool_name, tool_input):
    """Execute the actual tool and return results."""
    if tool_name == "get_customer_data":
        # Your actual implementation
        return get_customer_from_crm(tool_input["customer_id"])
    elif tool_name == "search_products":
        return search_product_catalog(**tool_input)
    else:
        return {"error": f"Unknown tool: {tool_name}"}

def chat_with_tools(user_message):
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )

        # Check if Claude wants to use a tool
        if response.stop_reason == "tool_use":
            # Process each tool call
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = process_tool_call(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result)
                    })

            # Add assistant response and tool results to messages
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})

        else:
            # No more tool calls, return final response
            return response.content[0].text

# Usage
result = chat_with_tools("What's the order history for customer CUS-12345?")
print(result)

Tool Use Best Practices

  • Clear Descriptions: Tool descriptions guide when Claude uses them. Be specific about when each tool is appropriate.
  • Validation: Always validate tool inputs before execution. Claude usually gets it right but edge cases exist.
  • Error Handling: Return meaningful errors as tool results so Claude can adapt or explain to the user.
  • Idempotency: For tools that modify data, consider requiring confirmation before destructive actions.
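The validation point deserves code: a minimal check of tool inputs against the tool's own input_schema, sketched by hand here (production code might prefer the jsonschema package):

```python
def validate_tool_input(tool_input: dict, schema: dict) -> list[str]:
    """Return a list of validation errors (empty list means valid)."""
    errors = []
    for field in schema.get("required", []):
        if field not in tool_input:
            errors.append(f"missing required field: {field}")
    # Map JSON Schema type names to Python types.
    type_map = {"string": str, "integer": int, "boolean": bool,
                "number": (int, float), "object": dict, "array": list}
    for field, value in tool_input.items():
        spec = schema.get("properties", {}).get(field)
        if spec is None:
            errors.append(f"unexpected field: {field}")
        elif not isinstance(value, type_map.get(spec["type"], object)):
            errors.append(f"{field}: expected {spec['type']}")
    return errors
```

Run this inside process_tool_call and return the errors as the tool result, so Claude can correct its call rather than crash your handler.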

Agent Pattern: For complex agents, implement a tool that can "think" or "plan" without executing actions. Claude can call this to reason through multi-step problems before committing to actions.
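A hypothetical "think" tool of that kind might look like the following—the name and wording are illustrative, not an official Anthropic feature:

```python
# A no-op "scratchpad" tool: the agent can call it to reason between
# steps without causing side effects.
think_tool = {
    "name": "think",
    "description": ("Use this tool to reason through a complex, multi-step "
                    "problem before taking any action. It performs no action "
                    "and has no side effects."),
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {"type": "string",
                        "description": "Your step-by-step reasoning"}
        },
        "required": ["thought"],
    },
}

def process_think(tool_input: dict) -> dict:
    # Optionally log tool_input["thought"] for observability, then acknowledge.
    return {"acknowledged": True}
```

In the tool loop shown earlier, process_tool_call would simply dispatch "think" to process_think.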

Vision: Multimodal Applications

Claude 3 models can process images alongside text, enabling powerful multimodal applications for Australian businesses.

Vision Capabilities

  • Document Analysis: Read and analyse scanned documents, invoices, receipts
  • Image Understanding: Describe, analyse, and answer questions about images
  • Chart Interpretation: Extract data and insights from charts and graphs
  • Code from Screenshots: Generate code from UI mockups or wireframes
  • Comparison: Compare multiple images and identify differences

Processing Images with Claude

import base64
from anthropic import Anthropic

client = Anthropic()

# Load image as base64
with open("invoice.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Extract the following from this invoice: vendor name, invoice number, date, line items with amounts, total, and GST amount. Return as JSON."
                }
            ]
        }
    ]
)

print(message.content[0].text)

Multi-Image Analysis

Compare Multiple Images

# Compare before/after images
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=2048,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Compare these two property photos and describe the changes:"},
                {
                    "type": "image",
                    "source": {"type": "base64", "media_type": "image/jpeg", "data": before_image}
                },
                {"type": "text", "text": "BEFORE"},
                {
                    "type": "image",
                    "source": {"type": "base64", "media_type": "image/jpeg", "data": after_image}
                },
                {"type": "text", "text": "AFTER"},
                {"type": "text", "text": "List all visible differences and assess the renovation quality."}
            ]
        }
    ]
)

Australian Business Use Cases

Receipt Processing

Extract vendor, amount, date, GST from receipts for expense management. Handle Australian formats and ABN extraction.

Property Assessment

Real estate: analyse property photos for listing descriptions, condition assessment, or renovation estimates.

Insurance Claims

Analyse damage photos for insurance assessments. Describe damage, estimate severity, identify repair needs.

Compliance Verification

Check photos against requirements: safety signage in place, proper equipment usage, site condition verification.


Streaming for Real-Time UX

Streaming returns responses token-by-token as they're generated, creating responsive user experiences essential for chat interfaces.

Python Streaming

from anthropic import Anthropic

client = Anthropic()

# Stream response
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain AI in 200 words"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

print()  # Newline at end

TypeScript Streaming (Next.js API Route)

// app/api/chat/route.ts
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

export async function POST(req: Request) {
  const { messages } = await req.json();

  const stream = anthropic.messages.stream({
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 1024,
    messages
  });

  // Return as Server-Sent Events
  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const event of stream) {
        if (event.type === 'content_block_delta' &&
            event.delta.type === 'text_delta') {
          controller.enqueue(
            encoder.encode(`data: ${JSON.stringify({text: event.delta.text})}\n\n`)
          );
        }
      }
      controller.enqueue(encoder.encode('data: [DONE]\n\n'));
      controller.close();
    }
  });

  return new Response(readable, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    }
  });
}

Streaming with Tools

When streaming with tools, you receive events for both text generation and tool use decisions:

with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=tools,
    messages=messages
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            if event.content_block.type == "tool_use":
                print(f"Calling tool: {event.content_block.name}")
        elif event.type == "content_block_delta":
            if hasattr(event.delta, "text"):
                print(event.delta.text, end="", flush=True)
            elif hasattr(event.delta, "partial_json"):
                # Tool arguments being streamed
                pass

Cost Optimisation Strategies

Claude API costs can be managed effectively with the right strategies.

Model Selection Impact

Cost Comparison: 1M input + 1M output tokens per month

  • Claude 3 Haiku: $0.25 + $1.25 = $1.50 USD
  • Claude 3.5 Sonnet: $3 + $15 = $18 USD
  • Claude 3 Opus: $15 + $75 = $90 USD

Strategy: Use Haiku for simple tasks (60x cheaper than Opus), Sonnet for most applications, Opus only when quality demands it.

Prompt Caching

Claude supports prompt caching for repeated context—critical for applications that use the same system prompt or documents repeatedly.

Using Prompt Caching

# Long system prompt or document that's reused
SYSTEM_CONTEXT = "..." # Your long context

# Mark content for caching
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": SYSTEM_CONTEXT,
            "cache_control": {"type": "ephemeral"}  # Enable caching
        }
    ],
    messages=[{"role": "user", "content": user_query}]
)

# Cache hits reduce input token costs by 90%
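That 90% figure translates directly into arithmetic. A sketch, assuming cache reads are billed at 10% of the base input rate (verify against Anthropic's current pricing, which also charges a premium on cache writes):

```python
def input_cost_with_cache(base_rate_per_mtok: float,
                          cached_tokens: int,
                          uncached_tokens: int) -> float:
    """Estimate input cost in USD when part of the prompt hits the cache."""
    CACHE_READ_MULTIPLIER = 0.10  # cache hits billed at 10% of the base rate
    effective_tokens = cached_tokens * CACHE_READ_MULTIPLIER + uncached_tokens
    return effective_tokens * base_rate_per_mtok / 1_000_000
```

On Sonnet ($3/MTok input), a 50K-token cached document plus a 500-token question costs roughly $0.0165 per request instead of $0.1515 uncached.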

Additional Optimisation Techniques

  • Limit Max Tokens: Set appropriate max_tokens to prevent excessive output
  • Efficient Prompts: Concise prompts reduce input costs—every token matters
  • Response Caching: Cache responses for repeated identical queries
  • Batch Processing: Process multiple items per API call where logical
  • Pre-filtering: Use cheaper methods (regex, rules) before sending to Claude

Monitor Usage: Use Anthropic's usage dashboard to track spending. Set up alerts for unexpected spikes. Consider implementing per-user or per-feature usage limits in your application.

Error Handling & Production Patterns

Production Claude applications need robust error handling and resilience patterns.

Common Error Types

RateLimitError

Exceeded rate limits. Implement exponential backoff and retry.

APIStatusError (overloaded_error)

Anthropic servers overloaded. Retry with backoff.

BadRequestError

Invalid request (too many tokens, invalid params). Fix request, don't retry.

AuthenticationError

Invalid API key. Check configuration.

Robust Error Handling

from anthropic import Anthropic, APIStatusError, InternalServerError, RateLimitError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

client = Anthropic()

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=60),
    # Retry only transient failures: rate limits and 5xx responses
    # (including Anthropic's overloaded_error)
    retry=retry_if_exception_type((RateLimitError, InternalServerError)),
)
def call_claude(messages, model="claude-3-5-sonnet-20241022"):
    try:
        response = client.messages.create(
            model=model,
            max_tokens=1024,
            messages=messages
        )
        return response.content[0].text

    except RateLimitError:
        print("Rate limited, will retry...")
        raise  # Let tenacity retry

    except InternalServerError:
        print("API overloaded, will retry...")
        raise  # Let tenacity retry

    except APIStatusError as e:
        # 4xx errors (bad request, auth) won't succeed on retry
        print(f"Non-retryable API error: {e.status_code}")
        raise

Production Checklist

Before Going Live

  • ☐ Retry logic with exponential backoff implemented
  • ☐ Timeouts configured for all API calls
  • ☐ Error logging captures full context
  • ☐ Usage monitoring and alerts in place
  • ☐ Fallback responses for when API unavailable
  • ☐ Rate limiting on your application endpoints
  • ☐ Input validation before sending to API
  • ☐ Output validation for downstream processing

Conclusion

The Claude API offers distinctive advantages for Australian businesses: a massive context window enabling whole-document analysis, thoughtful responses suited to professional contexts, and strong reasoning capabilities for complex tasks. Whether you're building customer-facing chatbots, internal document processing systems, or sophisticated AI agents, Claude provides the foundation for applications that require nuanced understanding and careful outputs.

Start with Claude 3.5 Sonnet as your default—it handles most tasks excellently at reasonable cost. Leverage the 200K context window for document-heavy applications where other models force awkward chunking. Implement tool use for agents that need to take actions. And always build with production reliability in mind: proper error handling, cost monitoring, and fallback strategies.

The Australian businesses seeing the best results treat Claude not as a magic box but as a capable team member that needs clear instructions, appropriate tools, and good error handling. Get these foundations right, and you'll build AI applications that genuinely transform how your business operates.

Ready to Implement?

This guide provides the knowledge, but implementation requires expertise. Our team has done this 500+ times and can get you production-ready in weeks.

✓ FT Fast 500 APAC Winner  ✓ 500+ Implementations  ✓ Results in Weeks