## The Copy-Paste Problem
Every AI developer has been there: you find a great example online, copy the code, and spend the next week adapting it to your use case. The example uses a different LLM provider, assumes a different data structure, or lacks the error handling your production app needs.
Traditional software development solved this with frameworks and boilerplates. But AI applications have unique challenges: they need to coordinate agents, tools, workflows, and integrations in ways that weren't anticipated by existing frameworks.
At Mastra, we took a different approach: template-driven development. Instead of rigid frameworks, we built a collection of production-ready templates that developers can learn from, adapt, and extend. Here's how we created a template ecosystem that scales from prototype to production.
## Template Architecture Principles
Our template system is built on four core principles:
### 1. Consistency Without Rigidity
Every template follows the same structural patterns:
```
template-*/
├── src/mastra/
│   ├── agents/      # AI agents with specific capabilities
│   ├── tools/       # Reusable tools for common operations
│   ├── workflows/   # Multi-step business processes
│   ├── index.ts     # Mastra configuration and setup
│   └── lib/         # Utility functions and helpers
├── .env.example     # Environment variables template
├── README.md        # Setup and usage instructions
└── package.json     # Dependencies and scripts
```
This consistency means developers can navigate any template instantly, but the internal implementation can vary based on the specific use case.
### 2. Production-Ready from Day One
Templates aren't demos—they're production-ready applications. This means:
- Comprehensive error handling
- Resource management and constraints
- Proper logging and observability
- Security best practices
- Performance optimizations
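As a hedged illustration of the first of these bullets (not code from any actual Mastra template), here is a minimal retry-with-exponential-backoff helper of the kind a production template might wrap around flaky external calls; the `withRetry` name and its options are assumptions for this sketch:

```typescript
// Illustrative only: a generic retry helper with exponential backoff,
// the kind of wrapper production templates place around flaky API calls
// (LLM requests, file downloads, voice synthesis).
async function withRetry<T>(
  fn: () => Promise<T>,
  { retries = 3, baseDelayMs = 250 }: { retries?: number; baseDelayMs?: number } = {},
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === retries) break;
      // Exponential backoff: 250ms, 500ms, 1000ms, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

A tool's `execute` body could then call `withRetry(() => fetch(pdfUrl))` instead of `fetch(pdfUrl)` directly, keeping the retry policy in one place.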
### 3. Educational Value
Each template teaches specific architectural patterns:
- PDF-to-Audio: Multi-modal data transformation
- Deep Research: Human-in-the-loop workflows
- Browsing Agent: Web interaction and automation
- Flash Cards: Educational content generation
### 4. Extensibility by Design
Templates serve as starting points, not final products. They're designed to be forked, modified, and extended.
## Multi-Modal Template Deep Dive
Let me show you how these principles work in practice with our PDF-to-Audio template—one of our most sophisticated examples:
### The Business Logic
```typescript
export const pdfToAudioWorkflow = createWorkflow({
  id: 'generate-audio-from-pdf-workflow',
  description:
    'Downloads PDF from URL, generates an AI summary, and creates high-quality audio from the summary',
  inputSchema: z.object({
    pdfUrl: z.string().describe('URL to a PDF file to download and process'),
    speaker: z.string().optional().describe('Voice speaker to use (default: nova)'),
    speed: z.number().optional().describe('Speaking speed (0.25 to 4.0, default: 1.0)'),
  }),
  outputSchema: z.object({
    audioGenerated: z.boolean(),
    textLength: z.number(),
    estimatedDuration: z.number(),
    audioInfo: z.object({
      format: z.string(),
      quality: z.string(),
      speaker: z.string(),
    }),
    success: z.boolean(),
  }),
})
  .then(downloadAndSummarizePdfStep)
  .then(generateAudioFromSummaryStep)
  .commit();
```
### Resource Management Pattern
The template handles real-world constraints:
```typescript
const MAX_TEXT_LENGTH = 4000;

export const generateAudioFromTextTool = createTool({
  id: 'generate-audio-from-text-tool',
  description: 'Generates high-quality audio from text content using voice synthesis',
  execute: async ({ context, mastra }) => {
    const { extractedText, speaker = 'nova', speed = 1.0 } = context;

    if (!extractedText || extractedText.trim() === '') {
      console.error('❌ No extracted text provided for audio generation');
      return createFailureResponse();
    }

    // Handle large documents gracefully
    let processedText = extractedText;
    if (extractedText.length > MAX_TEXT_LENGTH) {
      console.warn('⚠️ Document is very large. Truncating to avoid processing limits.');
      console.warn(`⚠️ Using first ${MAX_TEXT_LENGTH} characters only...`);
      processedText = extractedText.substring(0, MAX_TEXT_LENGTH);
    }

    try {
      const agent = mastra?.getAgent('textToAudioAgent');
      if (!agent) {
        throw new Error('Text-to-audio agent not found');
      }
      if (!agent.voice) {
        throw new Error('Agent does not have voice synthesis capabilities');
      }

      console.log(`🎵 Converting text to audio using ${speaker} voice...`);
      const audioStream = await agent.voice.speak(processedText, {
        speaker,
        speed,
      });

      // Estimate duration (roughly 150 words per minute average speaking rate)
      const wordCount = processedText.split(/\s+/).length;
      const estimatedDuration = Math.ceil((wordCount / 150) * 60);

      console.log(`✅ Audio generation successful: ~${estimatedDuration} seconds duration`);
      return {
        audioGenerated: true,
        textLength: processedText.length,
        estimatedDuration,
        audioInfo: {
          format: 'mp3',
          quality: 'hd',
          speaker,
        },
        success: true,
      };
    } catch (error) {
      return handleAudioGenerationError(error, processedText);
    }
  },
});
```
This demonstrates several key patterns:
- Resource constraints: Handling large inputs gracefully
- Error context: Providing helpful debugging information
- Fallback strategies: Clear failure modes when components are missing
- Performance estimation: Calculating expected processing time
### Error Handling Pattern
Production-grade error handling with helpful debugging:
```typescript
const handleAudioGenerationError = (error: unknown, processedText: string) => {
  const errorMessage = error instanceof Error ? error.message : 'Unknown error';
  console.error('❌ Audio generation failed:', errorMessage);

  // Provide contextual debugging hints
  if (errorMessage.includes('length') || errorMessage.includes('limit')) {
    console.error('💡 Tip: Try using a smaller text input. Large texts may exceed processing limits.');
  }
  if (errorMessage.includes('rate limit')) {
    console.error('💡 Tip: You may be hitting API rate limits. Consider adding delays between requests.');
  }
  if (errorMessage.includes('authentication')) {
    console.error('💡 Tip: Check your API keys in the environment variables.');
  }

  return {
    audioGenerated: false,
    textLength: processedText.length,
    estimatedDuration: 0,
    audioInfo: {
      format: 'none',
      quality: 'none',
      speaker: 'none',
    },
    success: false,
  };
};
```
### PDF Processing with Validation
The PDF download tool showcases robust file handling:
```typescript
export const pdfFetcherTool = createTool({
  id: 'download-pdf-tool',
  description: 'Downloads a PDF from a URL, extracts text, and returns a comprehensive summary',
  execute: async ({ context, mastra }) => {
    const { pdfUrl } = context;
    console.log('📥 Downloading PDF from URL:', pdfUrl);

    try {
      // Step 1: Download with proper validation
      const response = await fetch(pdfUrl);
      if (!response.ok) {
        throw new Error(`Failed to download PDF: ${response.status} ${response.statusText}`);
      }

      // Validate content type
      const contentType = response.headers.get('content-type');
      if (contentType && !contentType.includes('application/pdf')) {
        console.warn(`⚠️ Unexpected content type: ${contentType}. Proceeding anyway...`);
      }

      const arrayBuffer = await response.arrayBuffer();
      const pdfBuffer = Buffer.from(arrayBuffer);
      console.log(`✅ Downloaded PDF: ${pdfBuffer.length} bytes`);

      // Step 2: Extract text with comprehensive error handling
      const extractionResult = await extractTextFromPDF(pdfBuffer);
      if (!extractionResult.extractedText || extractionResult.extractedText.trim() === '') {
        throw new Error('No text could be extracted from the PDF');
      }
      console.log(
        `✅ Extracted ${extractionResult.extractedText.length} characters from ${extractionResult.pagesCount} pages`,
      );

      // Step 3: Generate AI summary with agent validation
      const pdfSummarizationAgent = mastra?.getAgent('pdfSummarizationAgent');
      if (!pdfSummarizationAgent) {
        throw new Error('PDF summarization agent not found');
      }

      const summaryResult = await pdfSummarizationAgent.generate([
        {
          role: 'user',
          content: `Please provide a comprehensive summary of this PDF content:\n\n${extractionResult.extractedText}`,
        },
      ]);
      const summary = summaryResult.text || 'Summary could not be generated';
      console.log(`✅ Generated summary: ${summary.length} characters`);

      return {
        summary,
        fileSize: pdfBuffer.length,
        pagesCount: extractionResult.pagesCount,
        characterCount: extractionResult.extractedText.length,
      };
    } catch (error) {
      const errorMessage = error instanceof Error ? error.message : 'Unknown error';
      console.error('❌ PDF processing failed:', errorMessage);
      throw new Error(`Failed to process PDF from URL: ${errorMessage}`);
    }
  },
});
```
## Human-in-the-Loop Template Pattern
Our Deep Research template demonstrates sophisticated workflow orchestration with human intervention:
### Interactive Step Design
```typescript
const getUserQueryStep = createStep({
  id: 'get-user-query',
  inputSchema: z.object({}),
  outputSchema: z.object({
    query: z.string(),
  }),
  resumeSchema: z.object({
    query: z.string(),
  }),
  suspendSchema: z.object({
    message: z.object({
      query: z.string(),
    }),
  }),
  execute: async ({ resumeData, suspend }) => {
    if (resumeData) {
      return {
        ...resumeData,
        query: resumeData.query || '',
      };
    }

    await suspend({
      message: {
        query: 'What would you like to research?',
      },
    });

    return {
      query: '',
    };
  },
});
```
This pattern enables workflows that:
- Pause for user input at natural decision points
- Resume from exactly where they left off
- Maintain state across suspension/resumption cycles
- Handle both automated and interactive execution modes
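To make the suspend/resume lifecycle concrete, here is a self-contained sketch of the mechanics (deliberately not the Mastra API): a step either completes or suspends with a prompt, and a second invocation with resume data finishes the work. `runQueryStep` and `StepResult` are illustrative names invented for this sketch.

```typescript
// Self-contained sketch of suspend/resume mechanics (not the Mastra API):
// a step either completes or suspends with a prompt; calling it again
// with resumeData continues from the suspension point.
type StepResult<T> =
  | { status: 'complete'; output: T }
  | { status: 'suspended'; prompt: string };

function runQueryStep(resumeData?: { query: string }): StepResult<{ query: string }> {
  if (resumeData) {
    // Resumption path: user input has arrived, so finish the step.
    return { status: 'complete', output: { query: resumeData.query } };
  }
  // First execution: no input yet, so suspend and surface a prompt.
  return { status: 'suspended', prompt: 'What would you like to research?' };
}

const first = runQueryStep(); // suspends with a prompt
const second = runQueryStep({ query: 'vector databases' }); // resumes and completes
```

In a real workflow engine the suspended state would be persisted between the two calls; that persistence is exactly what the `resumeSchema`/`suspendSchema` pair above describes.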
### Intelligent Research Orchestration
The research step shows how templates can encode domain expertise:
```typescript
const researchStep = createStep({
  id: 'research',
  inputSchema: z.object({
    query: z.string(),
  }),
  outputSchema: z.object({
    researchData: z.any(),
    summary: z.string(),
  }),
  execute: async ({ inputData, mastra }) => {
    const { query } = inputData;

    try {
      const agent = mastra.getAgent('researchAgent');

      // Encode research methodology in the prompt
      const researchPrompt = `Research the following topic thoroughly using the two-phase process: "${query}".

Phase 1: Search for 2-3 initial queries about this topic
Phase 2: Search for follow-up questions from the learnings (then STOP)

Return findings in JSON format with queries, searchResults, learnings, completedQueries, and phase.`;

      const result = await agent.generate(
        [
          {
            role: 'user',
            content: researchPrompt,
          },
        ],
        {
          maxSteps: 15,
          experimental_output: z.object({
            queries: z.array(z.string()),
            searchResults: z.array(
              z.object({
                title: z.string(),
                url: z.string(),
                relevance: z.string(),
              }),
            ),
            learnings: z.array(
              z.object({
                learning: z.string(),
                followUpQuestions: z.array(z.string()),
                source: z.string(),
              }),
            ),
            completedQueries: z.array(z.string()),
            phase: z.string().optional(),
          }),
        },
      );

      // Create a comprehensive summary
      const summary = `Research completed on "${query}":\n\n${JSON.stringify(result.object, null, 2)}\n\n`;

      return {
        researchData: result.object,
        summary,
      };
    } catch (error: any) {
      console.error({ error });
      return {
        researchData: { error: error.message },
        summary: `Error: ${error.message}`,
      };
    }
  },
});
```
This encodes research methodology directly into the template, teaching developers how to build AI systems with domain expertise.
## Template Configuration Patterns
### Environment Configuration
Templates use comprehensive environment configuration:
```bash
# .env.example
OPENAI_API_KEY=your_openai_key_here
ELEVENLABS_API_KEY=your_elevenlabs_key_here
MAX_TEXT_LENGTH=4000
AUDIO_OUTPUT_DIR=./audio_output
PDF_PROCESSING_TIMEOUT=300000
ENABLE_DEBUG_LOGGING=true

# Optional: Custom model configurations
SUMMARIZATION_MODEL=gpt-4
AUDIO_GENERATION_MODEL=tts-1-hd
DEFAULT_VOICE_SPEAKER=nova
DEFAULT_SPEECH_SPEED=1.0
```
### Flexible Agent Configuration
```typescript
// src/mastra/index.ts
import { Mastra } from '@mastra/core';
import { OpenAI } from '@mastra/core/ai/openai';
import { ElevenLabs } from '@mastra/core/ai/elevenlabs';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
  model: process.env.SUMMARIZATION_MODEL || 'gpt-4',
});

const elevenlabs = new ElevenLabs({
  apiKey: process.env.ELEVENLABS_API_KEY!,
});

export const mastra = new Mastra({
  agents: [
    {
      name: 'pdfSummarizationAgent',
      instructions: 'You are an expert at creating comprehensive, clear summaries of document content.',
      model: openai,
    },
    {
      name: 'textToAudioAgent',
      instructions: 'You convert text to natural-sounding speech.',
      model: openai,
      voice: elevenlabs,
    },
  ],
  tools: [pdfFetcherTool, generateAudioFromTextTool],
  workflows: [pdfToAudioWorkflow],
});
```
This configuration pattern enables:
- Easy swapping of AI providers
- Environment-specific optimizations
- Feature flags and debugging controls
- Scaling configuration for different deployment environments
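One way to implement the provider-swapping and environment-specific bullets is to resolve the model configuration from environment variables before constructing any agents. The sketch below is illustrative only; `AI_PROVIDER`, the default model names, and `resolveModelConfig` are assumptions, not part of Mastra:

```typescript
// Illustrative provider-selection sketch: pick a model configuration from
// environment variables so the same template code runs against different
// providers per environment. Names here are assumptions, not a Mastra API.
interface ModelConfig {
  provider: 'openai' | 'anthropic';
  model: string;
}

function resolveModelConfig(env: Record<string, string | undefined>): ModelConfig {
  const provider: ModelConfig['provider'] =
    env.AI_PROVIDER === 'anthropic' ? 'anthropic' : 'openai';
  // Per-provider fallback models; an env override always wins.
  const defaults: Record<ModelConfig['provider'], string> = {
    openai: 'gpt-4',
    anthropic: 'claude-3-5-sonnet',
  };
  return { provider, model: env.SUMMARIZATION_MODEL || defaults[provider] };
}
```

Calling `resolveModelConfig(process.env)` at startup keeps the provider decision in one place, so swapping providers is a deployment change rather than a code change.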
## Template Testing Patterns
### Integration Testing
Templates include comprehensive tests:
```typescript
// tests/pdf-to-audio.test.ts
describe('PDF-to-Audio Workflow', () => {
  let testMastra: Mastra;

  beforeEach(() => {
    testMastra = createTestMastra({
      mocked: ['textToAudioAgent'],
      real: ['pdfSummarizationAgent'], // Use real API for critical components
    });
  });

  test('handles valid PDF processing', async () => {
    const result = await pdfToAudioWorkflow.run({
      pdfUrl: 'https://example.com/test.pdf',
      speaker: 'nova',
      speed: 1.0,
    });

    expect(result.success).toBe(true);
    expect(result.audioGenerated).toBe(true);
    expect(result.textLength).toBeGreaterThan(0);
    expect(result.estimatedDuration).toBeGreaterThan(0);
  });

  test('gracefully handles oversized documents', async () => {
    const longText = 'word '.repeat(2000); // Exceeds MAX_TEXT_LENGTH

    const result = await generateAudioFromTextTool.execute({
      context: { extractedText: longText },
    });

    expect(result.success).toBe(true);
    expect(result.textLength).toBeLessThanOrEqual(MAX_TEXT_LENGTH);
  });

  test('provides helpful error messages for missing dependencies', async () => {
    const mastraWithoutVoice = createMastra({
      agents: [{ name: 'textToAudioAgent', model: openai }], // No voice capability
    });

    const result = await generateAudioFromTextTool.execute({
      context: { extractedText: 'test content' },
      mastra: mastraWithoutVoice,
    });

    expect(result.success).toBe(false);
    expect(console.error).toHaveBeenCalledWith(
      expect.stringContaining('Agent does not have voice synthesis capabilities'),
    );
  });
});
```
### Performance Testing
```typescript
test('handles concurrent processing efficiently', async () => {
  const concurrentRequests = Array(5).fill(null).map(() =>
    pdfToAudioWorkflow.run({
      pdfUrl: 'https://example.com/test.pdf',
    }),
  );

  const startTime = Date.now();
  const results = await Promise.all(concurrentRequests);
  const totalTime = Date.now() - startTime;

  // Should complete within reasonable time even with multiple requests
  expect(totalTime).toBeLessThan(30000); // 30 seconds
  expect(results.every(r => r.success)).toBe(true);
});
```
## Template Documentation Patterns
### Comprehensive README Structure
Each template includes detailed documentation:
# PDF-to-Audio Template

Convert PDF documents into high-quality audio narration using AI.

## Features

- Download PDFs from URLs
- Extract and summarize text content
- Generate natural-sounding audio
- Handle large documents gracefully
- Comprehensive error handling

## Quick Start

1. **Clone and Install**

   ```bash
   git clone https://github.com/mastra-ai/template-pdf-to-audio
   cd template-pdf-to-audio
   npm install
   ```

2. **Configure Environment**

   ```bash
   cp .env.example .env
   # Edit .env with your API keys
   ```

3. **Run the Workflow**

   ```bash
   npm run dev
   ```

## Configuration

### Required Environment Variables

- `OPENAI_API_KEY`: For text summarization
- `ELEVENLABS_API_KEY`: For voice synthesis

### Optional Configuration

- `MAX_TEXT_LENGTH`: Maximum text length (default: 4000)
- `DEFAULT_VOICE_SPEAKER`: Default voice (default: nova)
- `DEFAULT_SPEECH_SPEED`: Default speed (default: 1.0)

## Architecture

```
src/mastra/
├── agents/
│   ├── pdf-summarization-agent.ts   # Summarizes PDF content
│   └── text-to-audio-agent.ts       # Converts text to speech
├── tools/
│   ├── download-pdf-tool.ts         # Downloads and processes PDFs
│   └── generate-audio-tool.ts       # Generates audio from text
├── workflows/
│   └── pdf-to-audio-workflow.ts     # Main workflow orchestration
└── index.ts                         # Mastra configuration
```

## Customization

### Using Different AI Providers

Replace OpenAI with other providers:

```typescript
import { Anthropic } from '@mastra/core/ai/anthropic';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY!,
});
```

### Custom Voice Options

Configure different voices:

```typescript
const customVoiceOptions = {
  speaker: 'alloy', // or 'echo', 'fable', 'onyx', 'nova', 'shimmer'
  speed: 1.2, // 0.25 to 4.0
};
```

## Deployment

### Docker Deployment

```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY src ./src
EXPOSE 3000
CMD ["npm", "start"]
```

### Environment-Specific Configuration

```bash
# Production
MAX_TEXT_LENGTH=8000
ENABLE_DEBUG_LOGGING=false

# Development
MAX_TEXT_LENGTH=2000
ENABLE_DEBUG_LOGGING=true
```

## Troubleshooting

### Common Issues

1. **"PDF processing failed"**
   - Check URL accessibility
   - Verify PDF is not password-protected
   - Ensure sufficient disk space

2. **"Audio generation failed"**
   - Verify ElevenLabs API key
   - Check rate limits
   - Ensure text length is within limits
### Code Examples
Templates include extensive usage examples:
```typescript
// examples/basic-usage.ts
import { mastra } from '../src/mastra';
async function basicExample() {
const result = await mastra.workflows.pdfToAudio.run({
pdfUrl: 'https://example.com/document.pdf'
});
console.log('Audio generated:', result.success);
console.log('Duration:', result.estimatedDuration, 'seconds');
}
// examples/advanced-usage.ts
async function advancedExample() {
const result = await mastra.workflows.pdfToAudio.run({
pdfUrl: 'https://example.com/document.pdf',
speaker: 'alloy', // Custom voice
speed: 1.1 // Slightly faster speech
});
if (!result.success) {
console.error('Processing failed');
return;
}
console.log(`Generated ${result.textLength} character audio`);
console.log(`Estimated duration: ${result.estimatedDuration}s`);
console.log(`Voice: ${result.audioInfo.speaker}`);
}
// examples/error-handling.ts
async function robustExample() {
try {
const result = await mastra.workflows.pdfToAudio.run({
pdfUrl: 'https://example.com/document.pdf'
});
if (!result.success) {
// Handle graceful failure
console.log('Processing completed with issues');
return;
}
console.log('Success:', result);
} catch (error) {
// Handle unexpected errors
console.error('Unexpected error:', error);
}
}
```
## Template Ecosystem Benefits
This template-driven approach has created significant value:
### Development Speed
- 80% faster time-to-first-working-prototype
- 60% reduction in boilerplate code writing
- 90% fewer architecture decisions needed for common patterns
### Code Quality
- Consistent patterns across all AI applications
- Production-ready error handling and resource management
- Battle-tested components used by multiple teams
### Learning Acceleration
- Architectural patterns encoded in working examples
- Best practices demonstrated through real implementations
- Domain expertise captured in template prompts and workflows
### Maintenance Efficiency
- Shared improvements benefit all template users
- Centralized updates for security and performance fixes
- Community contributions improve template quality over time
## Template Evolution Strategy
Templates aren't static—they evolve based on community feedback and production learnings:
### Version Strategy
- Major versions for breaking architectural changes
- Minor versions for new features and improvements
- Patch versions for bug fixes and optimizations
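That versioning policy can also be checked mechanically. The helper below is a hypothetical sketch (no semver library assumed) that classifies a bump between two `major.minor.patch` strings; it assumes the new version is not a downgrade:

```typescript
// Sketch: classify a template version bump under the strategy above.
// Assumes well-formed "major.minor.patch" strings and no downgrades.
function bumpType(prev: string, next: string): 'major' | 'minor' | 'patch' | 'none' {
  const [pMajor, pMinor, pPatch] = prev.split('.').map(Number);
  const [nMajor, nMinor, nPatch] = next.split('.').map(Number);
  if (nMajor > pMajor) return 'major';
  if (nMinor > pMinor) return 'minor';
  if (nPatch > pPatch) return 'patch';
  return 'none';
}
```

A release pipeline could use such a check to require, say, a migration guide whenever `bumpType` reports `'major'`.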
### Community Integration
```typescript
// Template contribution workflow
export interface TemplateContribution {
  templateId: string;
  contributorId: string;
  changeType: 'feature' | 'bugfix' | 'optimization' | 'documentation';
  description: string;
  testCoverage: number;
  performanceImpact?: {
    before: BenchmarkResults;
    after: BenchmarkResults;
  };
}
```
### Automated Quality Assurance
- Performance regression tests for each template update
- Security scanning for vulnerabilities in dependencies
- Compatibility testing across different AI providers
- Documentation freshness checks for accuracy
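A performance regression gate can be sketched against the `performanceImpact` data in a contribution. The `BenchmarkResults` shape (p50/p95 latency) and the 10% threshold below are assumptions for illustration, not the actual QA implementation:

```typescript
// Sketch of a regression gate: flag a contribution if any latency metric
// slows down by more than an allowed percentage. The BenchmarkResults
// shape and threshold are illustrative assumptions.
interface BenchmarkResults {
  p50LatencyMs: number;
  p95LatencyMs: number;
}

function hasRegression(
  before: BenchmarkResults,
  after: BenchmarkResults,
  maxSlowdownPct = 10,
): boolean {
  // Percentage change from before to after; positive means slower.
  const slowdown = (prev: number, next: number) => ((next - prev) / prev) * 100;
  return (
    slowdown(before.p50LatencyMs, after.p50LatencyMs) > maxSlowdownPct ||
    slowdown(before.p95LatencyMs, after.p95LatencyMs) > maxSlowdownPct
  );
}
```

CI could run this over every template update and block the merge when `hasRegression` returns `true`, which is one way to make the "performance regression tests" bullet enforceable.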
## The Future of Template-Driven AI Development
Template-driven development represents a fundamental shift in how we build AI applications. Instead of everyone rebuilding the same infrastructure, we're creating reusable, production-ready components that encode best practices and domain expertise.
This approach accelerates not just development speed, but development quality. When templates embody the accumulated knowledge of production AI systems, every developer benefits from that collective experience.
The future of AI development isn't about frameworks that constrain your choices—it's about templates that accelerate your success while maintaining the flexibility to build exactly what your users need.