From Sequential to Parallel: How Apigene's Parallel Tool Execution Accelerates AI Agents by 10x

AI agents are transforming how we interact with software, but there's a hidden bottleneck slowing them down: sequential tool execution. When agents execute tools one after another, they create artificial delays that compound with each operation, turning simple workflows into slow, frustrating experiences.
At Apigene, we've solved this challenge with parallel tool execution—enabling agents to run multiple actions simultaneously across different applications. This article explores the sequential execution problem, our parallel execution solution, and how it accelerates agent workflows by up to 10x.
The Problem: Sequential Execution Creates Bottlenecks
The Sequential Execution Trap
Most AI agents execute tools sequentially, waiting for each operation to complete before starting the next one. Consider a common workflow:
User Request: "Get my latest email from Gmail, check my open Jira issues, and send a summary to Slack"
Sequential Execution:
1. Execute Gmail action → Wait 500ms → Complete
2. Execute Jira action → Wait 800ms → Complete
3. Execute Slack action → Wait 300ms → Complete
Total Time: 1,600ms (1.6 seconds)
Each operation blocks the next one, creating a waterfall effect where total execution time equals the sum of all individual operations.
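The waterfall effect is easy to reproduce. The sketch below simulates three API calls with fixed latencies (the helper and timings are invented for illustration) and contrasts awaiting them one by one with awaiting them together via Promise.all():

```javascript
// Simulated API call: resolves with `name` after `ms` milliseconds.
const callApi = (name, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(name), ms));

// Sequential: each await blocks the next call, so total ≈ 50 + 80 + 30 ms.
async function sequentialDemo() {
  const start = Date.now();
  await callApi("gmail", 50);
  await callApi("jira", 80);
  await callApi("slack", 30);
  return Date.now() - start;
}

// Parallel: all calls start at once, so total ≈ max(50, 80, 30) ms.
async function parallelDemo() {
  const start = Date.now();
  await Promise.all([
    callApi("gmail", 50),
    callApi("jira", 80),
    callApi("slack", 30),
  ]);
  return Date.now() - start;
}
```

Scaled up to real 300-800ms API latencies, the same shape produces the gap described above: the sum of all latencies versus the slowest single latency.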
Real-World Impact
The sequential execution problem becomes more severe as workflows grow in complexity:
Example: Reading 10 Emails Sequentially
Email 1: 500ms
Email 2: 500ms
Email 3: 500ms
...
Email 10: 500ms
Total: 5,000ms (5 seconds)
Example: Multi-App Dashboard Update
Fetch Gmail unread count: 400ms
Fetch Jira open issues: 600ms
Fetch Slack unread messages: 300ms
Fetch Salesforce opportunities: 800ms
Fetch GitHub pull requests: 500ms
Total: 2,600ms (2.6 seconds)
In both cases, the agent spends most of its time waiting rather than executing, creating poor user experiences and inefficient resource utilization.
The Hidden Costs
Sequential execution creates several hidden problems:
- Increased Latency: Total time equals sum of all operations
- Poor User Experience: Users wait unnecessarily long
- Higher Costs: More LLM reasoning cycles between operations
- Reduced Throughput: Agents can't handle multiple requests efficiently
- Timeout Risks: Long sequential chains may exceed timeout limits
The Solution: Parallel Tool Execution
Apigene's MCP Gateway provides three powerful tools for parallel execution, each optimized for different use cases:
1. run_action - Single Action Execution
The foundation tool for executing individual actions. While not parallel itself, it's the building block for parallel operations.
Use Case: Execute a single action when you only need one operation.
Example:
{
"tool": "run_action",
"arguments": {
"app_name": "Slack",
"user_input": "Send a message to #general",
"context": {
"operationId": "chat.postMessage",
"channel": "#general",
"text": "Hello, world!"
}
}
}
Execution Time: ~300ms
2. run_action_batch - Parallel Batch Execution
Execute the same action multiple times in parallel with different parameters. Perfect for processing multiple items of the same type.
Use Case: When you need to execute the same action multiple times with different inputs (e.g., reading multiple emails, updating multiple records, fetching multiple users).
How It Works:
- Takes a `base_context` with shared parameters
- Takes a `batch_context` array with varying parameters
- Executes all batch items in parallel using `Promise.all()`
- Returns results in the same order as input
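The merge of shared and per-item parameters can be sketched with a plain object spread; the field names below mirror the Gmail example that follows:

```javascript
// base_context holds the shared parameters; each batch item overrides or
// extends them. Later spread entries win on key collisions.
const base_context = { operationId: "readEmail", format: "full" };
const batchItem = { email_id: "msg_001" };

const merged_context = { ...base_context, ...batchItem };
// → { operationId: "readEmail", format: "full", email_id: "msg_001" }
```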
Example: Reading 10 Emails in Parallel
{
"tool": "run_action_batch",
"arguments": {
"app_name": "Gmail",
"user_input": "Read multiple emails",
"base_context": {
"operationId": "readEmail",
"format": "full"
},
"batch_context": [
{ "email_id": "msg_001" },
{ "email_id": "msg_002" },
{ "email_id": "msg_003" },
{ "email_id": "msg_004" },
{ "email_id": "msg_005" },
{ "email_id": "msg_006" },
{ "email_id": "msg_007" },
{ "email_id": "msg_008" },
{ "email_id": "msg_009" },
{ "email_id": "msg_010" }
]
}
}
Execution Time: ~500ms (vs 5,000ms sequential)
Speed Improvement: 10x faster
Response:
{
"batch_results": [
{
"success": true,
"index": 0,
"merged_context": {
"operationId": "readEmail",
"format": "full",
"email_id": "msg_001"
},
"result": {
"status_code": 200,
"response_content": {
"id": "msg_001",
"subject": "Meeting Tomorrow",
"from": "colleague@example.com"
}
}
},
// ... 9 more results
],
"summary": {
"total": 10,
"successful": 10,
"failed": 0
}
}
3. run_multi_actions - Parallel Multi-Action Execution
Execute multiple different actions simultaneously across different applications. Perfect for workflows that need data from multiple sources.
Use Case: When you need to execute different actions from different apps in parallel (e.g., fetching data from multiple sources, updating multiple systems simultaneously).
How It Works:
- Takes an array of action requests
- Each action can target a different app
- Executes all actions in parallel using `Promise.all()`
- Returns results with success/failure status for each
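One property worth noting: `Promise.all()` resolves in input order regardless of completion order, so a slow first action never shuffles the results. A minimal demonstration (mocked delays, invented values):

```javascript
// Resolves with `value` after `ms` milliseconds.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Three "actions" finish out of order, but the results array
// matches the order of the input array.
async function orderedDemo() {
  return Promise.all([
    delay(80, { app_name: "Gmail", index: 0 }), // finishes last
    delay(10, { app_name: "Jira", index: 1 }),  // finishes first
    delay(40, { app_name: "Slack", index: 2 }),
  ]);
}
```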
Example: Multi-App Dashboard Update
{
"tool": "run_multi_actions",
"arguments": {
"actions": [
{
"app_name": "Gmail",
"user_input": "Get unread email count",
"context": {
"operationId": "getUnreadCount"
}
},
{
"app_name": "Jira",
"user_input": "List my open issues",
"context": {
"operationId": "listIssues",
"assignee": "me",
"status": "open"
}
},
{
"app_name": "Slack",
"user_input": "Get unread message count",
"context": {
"operationId": "getUnreadCount"
}
},
{
"app_name": "Salesforce",
"user_input": "Get high-value opportunities",
"context": {
"operationId": "listOpportunities",
"minAmount": 10000
}
},
{
"app_name": "GitHub",
"user_input": "List open pull requests",
"context": {
"operationId": "listPullRequests",
"state": "open"
}
}
]
}
}
Execution Time: ~800ms (vs 2,600ms sequential)
Speed Improvement: 3.25x faster
Response:
{
"results": [
{
"success": true,
"index": 0,
"request": {
"app_name": "Gmail",
"context": { "operationId": "getUnreadCount" }
},
"result": {
"status_code": 200,
"response_content": { "unread_count": 5 }
}
},
{
"success": true,
"index": 1,
"request": {
"app_name": "Jira",
"context": { "operationId": "listIssues" }
},
"result": {
"status_code": 200,
"response_content": {
"issues": [
{"id": "PROJ-123", "summary": "Fix bug"},
{"id": "PROJ-124", "summary": "Add feature"}
]
}
}
},
// ... 3 more results
],
"summary": {
"total": 5,
"successful": 5,
"failed": 0
}
}
Performance Comparison: Sequential vs Parallel
| Scenario | Sequential Time | Parallel Time | Improvement |
|---|---|---|---|
| Read 10 emails | 5,000ms | 500ms | 10x faster |
| Multi-app dashboard (5 apps) | 2,600ms | 800ms | 3.25x faster |
| Update 50 records | 25,000ms | 2,500ms | 10x faster |
| Fetch 3 different data sources | 1,600ms | 600ms | 2.67x faster |
Why Parallel Execution is Faster
Parallel execution leverages the asynchronous nature of API calls:
- Network I/O Overlap: While one API call waits for network response, others can start
- Independent Operations: Most tool calls don't depend on each other
- Concurrent Processing: Server can handle multiple requests simultaneously
- Reduced Round-Trips: Fewer LLM reasoning cycles between operations
Real-World Use Cases
Use Case 1: Email Processing Pipeline
Scenario: Process 20 emails, extract key information, and update a database.
Sequential Approach:
1. Read email 1 → 500ms
2. Read email 2 → 500ms
...
20. Read email 20 → 500ms
21. Process all emails → 2,000ms
22. Update database → 800ms
Total: 12,800ms (12.8 seconds)
Parallel Approach:
1. Read all 20 emails in parallel → 500ms (run_action_batch)
2. Process all emails → 2,000ms
3. Update database → 800ms
Total: 3,300ms (3.3 seconds)
Improvement: 3.9x faster
Use Case 2: Multi-Source Data Aggregation
Scenario: Build a dashboard with data from 5 different applications.
Sequential Approach:
1. Fetch Gmail data → 400ms
2. Fetch Jira data → 600ms
3. Fetch Slack data → 300ms
4. Fetch Salesforce data → 800ms
5. Fetch GitHub data → 500ms
Total: 2,600ms (2.6 seconds)
Parallel Approach:
1. Fetch all data in parallel → 800ms (run_multi_actions)
Total: 800ms (0.8 seconds)
Improvement: 3.25x faster
Use Case 3: Bulk Record Updates
Scenario: Update 100 customer records in Salesforce.
Sequential Approach:
Update record 1 → 300ms
Update record 2 → 300ms
...
Update record 100 → 300ms
Total: 30,000ms (30 seconds)
Parallel Approach:
Update all 100 records in parallel → 3,000ms (run_action_batch)
Total: 3,000ms (3 seconds)
Improvement: 10x faster
Advanced Features
Error Handling in Parallel Execution
Both run_action_batch and run_multi_actions provide robust error handling:
Partial Success Handling:
{
"batch_results": [
{ "success": true, "index": 0, "result": {...} },
{ "success": false, "index": 1, "error": "Record not found" },
{ "success": true, "index": 2, "result": {...} }
],
"summary": {
"total": 3,
"successful": 2,
"failed": 1
}
}
Failed operations don't block successful ones; each execution is independent.
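One way to get this behavior (a sketch, not Apigene's actual implementation) is to catch each item's error inside the mapped promise, so `Promise.all()` itself never rejects:

```javascript
// Each item resolves to a success or failure record; one bad item
// cannot reject the whole Promise.all.
async function runBatchIsolated(items, execute) {
  const batch_results = await Promise.all(
    items.map(async (item, index) => {
      try {
        return { success: true, index, result: await execute(item) };
      } catch (err) {
        return { success: false, index, error: String(err.message ?? err) };
      }
    })
  );
  const successful = batch_results.filter((r) => r.success).length;
  return {
    batch_results,
    summary: {
      total: items.length,
      successful,
      failed: items.length - successful,
    },
  };
}
```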
Response Projection for Large Batches
Use response_projection to reduce response size when processing large batches:
{
"tool": "run_action_batch",
"arguments": {
"app_name": "Salesforce",
"user_input": "Get opportunity summaries",
"base_context": {
"operationId": "listOpportunities"
},
"batch_context": [
{ "accountId": "001" },
{ "accountId": "002" },
{ "accountId": "003" }
],
"response_projection": "opportunities[*].{name: name, amount: amount, stage: stageName}"
}
}
This extracts only the needed fields, reducing token usage and improving performance.
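In plain JavaScript terms, that JMESPath expression selects and renames fields roughly like the map below (the sample record is invented):

```javascript
// Sample API response (fields invented for illustration).
const response_content = {
  opportunities: [
    { name: "Acme renewal", amount: 50000, stageName: "Negotiation", ownerId: "u_01" },
    { name: "Globex upsell", amount: 12000, stageName: "Prospecting", ownerId: "u_02" },
  ],
};

// Equivalent of "opportunities[*].{name: name, amount: amount, stage: stageName}":
// keep three fields, rename stageName to stage, drop everything else.
const projected = response_content.opportunities.map(
  ({ name, amount, stageName }) => ({ name, amount, stage: stageName })
);
```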
Large Batch Handling
For batches exceeding 128KB response size, Apigene automatically handles truncation:
{
"summary": {
"total": 1000,
"successful": 1000,
"failed": 0,
"truncated": true,
"returned_count": 500,
"omitted_count": 500,
"returned_indices": [0, 1, 2, ..., 499],
"omitted_indices": [500, 501, 502, ..., 999],
"next_action": "Use this batch_context to fetch remaining items..."
}
}
The system provides exact instructions for fetching remaining items in subsequent calls.
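A follow-up call only needs the items that were cut off. Assuming the summary fields shown above, the remaining batch_context can be rebuilt like this (a sketch, not an official helper):

```javascript
// Rebuild the batch_context for a follow-up run_action_batch call from the
// omitted_indices reported in a truncated summary.
function remainingBatchContext(batch_context, summary) {
  if (!summary.truncated) return [];
  return summary.omitted_indices.map((i) => batch_context[i]);
}
```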
Best Practices for Parallel Execution
1. Use run_action_batch for Same-Action Scenarios
When executing the same action multiple times with different parameters:
{
"tool": "run_action_batch",
"arguments": {
"app_name": "Gmail",
"base_context": { "operationId": "readEmail" },
"batch_context": [
{ "email_id": "123" },
{ "email_id": "456" }
]
}
}
2. Use run_multi_actions for Different Actions
When executing different actions from different apps:
{
"tool": "run_multi_actions",
"arguments": {
"actions": [
{
"app_name": "Gmail",
"context": { "operationId": "getUnreadCount" }
},
{
"app_name": "Jira",
"context": { "operationId": "listIssues" }
}
]
}
}
3. Combine Parallel Execution with Dynamic Tool Loading
Use parallel execution together with dynamic tool loading for maximum efficiency:
- Discover tools with `list_actions` (summary mode)
- Execute multiple actions in parallel with `run_multi_actions`
- Process results efficiently
4. Handle Errors Gracefully
Always check the summary object for success/failure counts:
{
"summary": {
"total": 10,
"successful": 8,
"failed": 2
}
}
5. Use Response Projection for Large Datasets
Reduce response size with JMESPath projections:
{
"response_projection": "items[*].{id: id, name: name}"
}
Performance Metrics
Latency Reduction
| Operation Count | Sequential Time | Parallel Time | Reduction |
|---|---|---|---|
| 5 operations | 2,500ms | 600ms | 76% |
| 10 operations | 5,000ms | 800ms | 84% |
| 20 operations | 10,000ms | 1,200ms | 88% |
| 50 operations | 25,000ms | 2,500ms | 90% |
Cost Savings
Parallel execution reduces LLM reasoning cycles:
- Sequential: 1 reasoning cycle per operation = 10 cycles for 10 operations
- Parallel: 1 reasoning cycle for all operations = 1 cycle total
- Savings: 90% reduction in LLM calls
Implementation Details
How Apigene Implements Parallel Execution
Apigene's parallel execution uses JavaScript's Promise.all() to execute multiple operations concurrently:
// run_action_batch implementation (simplified)
const batchPromises = batch_context.map(async (batchItem) => {
  // Merge shared parameters with this item's overrides, then execute
  const mergedContext = { ...base_context, ...batchItem }
  return await executeAction(app_name, mergedContext)
})
const results = await Promise.all(batchPromises)
This ensures:
- All operations start simultaneously
- Independent error handling per operation
- Results returned in input order
- Maximum concurrency without blocking
Concurrency Limits
Apigene handles large batches efficiently:
- No hard limits on batch size
- Automatic response truncation for >128KB results
- Efficient memory management
- Graceful degradation for very large batches
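Apigene's internal scheduling isn't shown here, but a client that wants its own ceiling on in-flight requests can use a generic worker-pool pattern (a sketch under that assumption, not the gateway's implementation):

```javascript
// Run fn over items with at most `limit` promises in flight,
// preserving input order in the results array.
async function mapWithLimit(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0;
  const worker = async () => {
    while (next < items.length) {
      const i = next++; // claimed synchronously, so no two workers share an index
      results[i] = await fn(items[i], i);
    }
  };
  const workers = Array.from({ length: Math.min(limit, items.length) }, worker);
  await Promise.all(workers);
  return results;
}
```

With `limit` equal to the batch size this degenerates to plain `Promise.all()`; smaller limits trade latency for gentler load on rate-limited APIs.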
Comparison: Sequential vs Parallel Execution
| Aspect | Sequential Execution | Parallel Execution |
|---|---|---|
| Total Time | Sum of all operations | Max of all operations |
| Latency | High (compounds) | Low (overlaps) |
| User Experience | Slow, frustrating | Fast, responsive |
| LLM Cycles | One per operation | One for all operations |
| Cost | High (more cycles) | Low (fewer cycles) |
| Throughput | Low | High |
| Error Handling | Stops on first error | Continues on errors |
| Scalability | Poor (time grows linearly) | Excellent (time bounded by slowest operation) |
Getting Started with Parallel Execution
Step 1: Identify Parallelizable Operations
Look for operations that:
- Don't depend on each other's results
- Can execute independently
- Target the same or different applications
Step 2: Choose the Right Tool
- Same action, different params: Use
run_action_batch - Different actions: Use
run_multi_actions - Single action: Use
run_action
Step 3: Structure Your Request
For Batch:
{
"tool": "run_action_batch",
"arguments": {
"app_name": "YourApp",
"base_context": { "operationId": "yourAction" },
"batch_context": [
{ "param1": "value1" },
{ "param1": "value2" }
]
}
}
For Multi-Actions:
{
"tool": "run_multi_actions",
"arguments": {
"actions": [
{
"app_name": "App1",
"context": { "operationId": "action1" }
},
{
"app_name": "App2",
"context": { "operationId": "action2" }
}
]
}
}
Step 4: Handle Results
Check the summary for success/failure counts and process results accordingly:
{
"summary": {
"total": 10,
"successful": 10,
"failed": 0
}
}
Real-World Example: Complete Workflow
Let's build a complete workflow that demonstrates parallel execution:
User Request: "Get my unread emails, check my Jira issues, and send a summary to Slack"
Step 1: Fetch Data in Parallel
{
"tool": "run_multi_actions",
"arguments": {
"actions": [
{
"app_name": "Gmail",
"user_input": "Get unread emails",
"context": { "operationId": "listUnreadEmails" }
},
{
"app_name": "Jira",
"user_input": "List my open issues",
"context": { "operationId": "listIssues", "assignee": "me" }
}
]
}
}
Step 2: Process Results and Send Summary
{
"tool": "run_action",
"arguments": {
"app_name": "Slack",
"user_input": "Send summary",
"context": {
"operationId": "chat.postMessage",
"channel": "#updates",
"text": "You have 5 unread emails and 3 open Jira issues"
}
}
}
Total Time: ~1,100ms (vs 2,400ms sequential)
Improvement: 2.2x faster
Conclusion
Sequential tool execution is a major bottleneck preventing AI agents from reaching their full potential. Apigene's parallel execution capabilities—run_action_batch and run_multi_actions—solve this challenge by enabling agents to execute multiple operations simultaneously.
The benefits are clear:
- 10x faster execution for batch operations
- 3-4x faster for multi-app workflows
- 90% reduction in LLM reasoning cycles
- Better user experience with responsive agents
- Lower costs through reduced token usage
By combining parallel execution with dynamic tool loading, Apigene enables agents to scale efficiently—whether processing 10 operations or 10,000, the performance gains remain consistent.
Ready to accelerate your AI agents? Get started with Apigene's MCP Gateway and experience the power of parallel tool execution.