
From Sequential to Parallel: How Apigene's Parallel Tool Execution Accelerates AI Agents by 10x

Apigene Team
10 min read

AI agents are transforming how we interact with software, but there's a hidden bottleneck slowing them down: sequential tool execution. When agents execute tools one after another, they create artificial delays that compound with each operation, turning simple workflows into slow, frustrating experiences.

At Apigene, we've solved this challenge with parallel tool execution—enabling agents to run multiple actions simultaneously across different applications. This article explores the sequential execution problem, our parallel execution solution, and how it accelerates agent workflows by up to 10x.

The Problem: Sequential Execution Creates Bottlenecks

The Sequential Execution Trap

Most AI agents execute tools sequentially, waiting for each operation to complete before starting the next one. Consider a common workflow:

User Request: "Get my latest email from Gmail, check my open Jira issues, and send a summary to Slack"

Sequential Execution:

1. Execute Gmail action → Wait 500ms → Complete
2. Execute Jira action → Wait 800ms → Complete  
3. Execute Slack action → Wait 300ms → Complete
Total Time: 1,600ms (1.6 seconds)

Each operation blocks the next one, creating a waterfall effect where total execution time equals the sum of all individual operations.
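This arithmetic is worth making explicit: ignoring dispatch overhead, sequential latency is the sum of the per-call latencies, while parallel latency is bounded by the slowest call. A two-function sketch (helper names are illustrative):

```javascript
// Sequential latency: every call blocks the next, so times add up
function sequentialTime(latenciesMs) {
  return latenciesMs.reduce((total, t) => total + t, 0)
}

// Parallel latency: all calls overlap, so the slowest call dominates
function parallelTime(latenciesMs) {
  return Math.max(...latenciesMs)
}

// The Gmail/Jira/Slack workflow above:
const latencies = [500, 800, 300]
console.log(sequentialTime(latencies)) // 1600
console.log(parallelTime(latencies))   // 800
```

This is why parallel speedups grow with batch size: the sum grows linearly with the number of calls, while the max stays roughly constant.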

Real-World Impact

The sequential execution problem becomes more severe as workflows grow in complexity:

Example: Reading 10 Emails Sequentially

Email 1: 500ms
Email 2: 500ms
Email 3: 500ms
...
Email 10: 500ms
Total: 5,000ms (5 seconds)

Example: Multi-App Dashboard Update

Fetch Gmail unread count: 400ms
Fetch Jira open issues: 600ms
Fetch Slack unread messages: 300ms
Fetch Salesforce opportunities: 800ms
Fetch GitHub pull requests: 500ms
Total: 2,600ms (2.6 seconds)

In both cases, the agent spends most of its time waiting rather than executing, creating poor user experiences and inefficient resource utilization.

The Hidden Costs

Sequential execution creates several hidden problems:

  1. Increased Latency: Total time equals sum of all operations
  2. Poor User Experience: Users wait unnecessarily long
  3. Higher Costs: More LLM reasoning cycles between operations
  4. Reduced Throughput: Agents can't handle multiple requests efficiently
  5. Timeout Risks: Long sequential chains may exceed timeout limits

The Solution: Parallel Tool Execution

Apigene's MCP Gateway provides three powerful tools for parallel execution, each optimized for different use cases:

1. run_action - Single Action Execution

The foundation tool for executing individual actions. While not parallel itself, it's the building block for parallel operations.

Use Case: Execute a single action when you only need one operation.

Example:

{
  "tool": "run_action",
  "arguments": {
    "app_name": "Slack",
    "user_input": "Send a message to #general",
    "context": {
      "operationId": "chat.postMessage",
      "channel": "#general",
      "text": "Hello, world!"
    }
  }
}

Execution Time: ~300ms

2. run_action_batch - Parallel Batch Execution

Execute the same action multiple times in parallel with different parameters. Perfect for processing multiple items of the same type.

Use Case: When you need to execute the same action multiple times with different inputs (e.g., reading multiple emails, updating multiple records, fetching multiple users).

How It Works:

  • Takes a base_context with shared parameters
  • Takes a batch_context array with varying parameters
  • Executes all batch items in parallel using Promise.all()
  • Returns results in the same order as input
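The merge semantics matter in practice: each batch item is overlaid on base_context, so a batch key with the same name overrides the shared value. A sketch of that overlay (field names taken from the examples in this article, not Apigene's internals):

```javascript
// Overlay one batch item on the shared base context;
// keys in the batch item win on conflict (JS spread order)
function mergeContext(baseContext, batchItem) {
  return { ...baseContext, ...batchItem }
}

const base = { operationId: 'readEmail', format: 'full' }

console.log(mergeContext(base, { email_id: 'msg_001' }))
// { operationId: 'readEmail', format: 'full', email_id: 'msg_001' }

console.log(mergeContext(base, { email_id: 'msg_002', format: 'metadata' }).format)
// 'metadata' -- the batch item overrides the base value
```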

Example: Reading 10 Emails in Parallel

{
  "tool": "run_action_batch",
  "arguments": {
    "app_name": "Gmail",
    "user_input": "Read multiple emails",
    "base_context": {
      "operationId": "readEmail",
      "format": "full"
    },
    "batch_context": [
      { "email_id": "msg_001" },
      { "email_id": "msg_002" },
      { "email_id": "msg_003" },
      { "email_id": "msg_004" },
      { "email_id": "msg_005" },
      { "email_id": "msg_006" },
      { "email_id": "msg_007" },
      { "email_id": "msg_008" },
      { "email_id": "msg_009" },
      { "email_id": "msg_010" }
    ]
  }
}

Execution Time: ~500ms (vs 5,000ms sequential)
Speed Improvement: 10x faster

Response:

{
  "batch_results": [
    {
      "success": true,
      "index": 0,
      "merged_context": {
        "operationId": "readEmail",
        "format": "full",
        "email_id": "msg_001"
      },
      "result": {
        "status_code": 200,
        "response_content": {
          "id": "msg_001",
          "subject": "Meeting Tomorrow",
          "from": "colleague@example.com"
        }
      }
    },
    // ... 9 more results
  ],
  "summary": {
    "total": 10,
    "successful": 10,
    "failed": 0
  }
}

3. run_multi_actions - Parallel Multi-Action Execution

Execute multiple different actions simultaneously across different applications. Perfect for workflows that need data from multiple sources.

Use Case: When you need to execute different actions from different apps in parallel (e.g., fetching data from multiple sources, updating multiple systems simultaneously).

How It Works:

  • Takes an array of action requests
  • Each action can target a different app
  • Executes all actions in parallel using Promise.all()
  • Returns results with success/failure status for each

Example: Multi-App Dashboard Update

{
  "tool": "run_multi_actions",
  "arguments": {
    "actions": [
      {
        "app_name": "Gmail",
        "user_input": "Get unread email count",
        "context": {
          "operationId": "getUnreadCount"
        }
      },
      {
        "app_name": "Jira",
        "user_input": "List my open issues",
        "context": {
          "operationId": "listIssues",
          "assignee": "me",
          "status": "open"
        }
      },
      {
        "app_name": "Slack",
        "user_input": "Get unread message count",
        "context": {
          "operationId": "getUnreadCount"
        }
      },
      {
        "app_name": "Salesforce",
        "user_input": "Get high-value opportunities",
        "context": {
          "operationId": "listOpportunities",
          "minAmount": 10000
        }
      },
      {
        "app_name": "GitHub",
        "user_input": "List open pull requests",
        "context": {
          "operationId": "listPullRequests",
          "state": "open"
        }
      }
    ]
  }
}

Execution Time: ~800ms (vs 2,600ms sequential)
Speed Improvement: 3.25x faster

Response:

{
  "results": [
    {
      "success": true,
      "index": 0,
      "request": {
        "app_name": "Gmail",
        "context": { "operationId": "getUnreadCount" }
      },
      "result": {
        "status_code": 200,
        "response_content": { "unread_count": 5 }
      }
    },
    {
      "success": true,
      "index": 1,
      "request": {
        "app_name": "Jira",
        "context": { "operationId": "listIssues" }
      },
      "result": {
        "status_code": 200,
        "response_content": {
          "issues": [
            {"id": "PROJ-123", "summary": "Fix bug"},
            {"id": "PROJ-124", "summary": "Add feature"}
          ]
        }
      }
    },
    // ... 3 more results
  ],
  "summary": {
    "total": 5,
    "successful": 5,
    "failed": 0
  }
}

Performance Comparison: Sequential vs Parallel

Scenario                          Sequential Time   Parallel Time   Improvement
Read 10 emails                    5,000ms           500ms           10x faster
Multi-app dashboard (5 apps)      2,600ms           800ms           3.25x faster
Update 50 records                 25,000ms          2,500ms         10x faster
Fetch 3 different data sources    1,600ms           600ms           2.67x faster

Why Parallel Execution is Faster

Parallel execution leverages the asynchronous nature of API calls:

  1. Network I/O Overlap: While one API call waits for network response, others can start
  2. Independent Operations: Most tool calls don't depend on each other
  3. Concurrent Processing: Server can handle multiple requests simultaneously
  4. Reduced Round-Trips: Fewer LLM reasoning cycles between operations

Real-World Use Cases

Use Case 1: Email Processing Pipeline

Scenario: Process 20 emails, extract key information, and update a database.

Sequential Approach:

1. Read email 1 → 500ms
2. Read email 2 → 500ms
...
20. Read email 20 → 500ms
21. Process all emails → 2,000ms
22. Update database → 800ms
Total: 12,800ms (12.8 seconds)

Parallel Approach:

1. Read all 20 emails in parallel → 500ms (run_action_batch)
2. Process all emails → 2,000ms
3. Update database → 800ms
Total: 3,300ms (3.3 seconds)

Improvement: 3.9x faster

Use Case 2: Multi-Source Data Aggregation

Scenario: Build a dashboard with data from 5 different applications.

Sequential Approach:

1. Fetch Gmail data → 400ms
2. Fetch Jira data → 600ms
3. Fetch Slack data → 300ms
4. Fetch Salesforce data → 800ms
5. Fetch GitHub data → 500ms
Total: 2,600ms (2.6 seconds)

Parallel Approach:

1. Fetch all data in parallel → 800ms (run_multi_actions)
Total: 800ms (0.8 seconds)

Improvement: 3.25x faster

Use Case 3: Bulk Record Updates

Scenario: Update 100 customer records in Salesforce.

Sequential Approach:

Update record 1 → 300ms
Update record 2 → 300ms
...
Update record 100 → 300ms
Total: 30,000ms (30 seconds)

Parallel Approach:

Update all 100 records in parallel → 3,000ms (run_action_batch)
Total: 3,000ms (3 seconds)

Improvement: 10x faster

Advanced Features

Error Handling in Parallel Execution

Both run_action_batch and run_multi_actions provide robust error handling:

Partial Success Handling:

{
  "batch_results": [
    { "success": true, "index": 0, "result": {...} },
    { "success": false, "index": 1, "error": "Record not found" },
    { "success": true, "index": 2, "result": {...} }
  ],
  "summary": {
    "total": 3,
    "successful": 2,
    "failed": 1
  }
}

Failed operations don't block successful ones—each execution is independent.
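Client code can recompute the summary counts directly from the per-item success flags. A small helper, assuming the result shape shown above:

```javascript
// Tally a batch_results array into the summary shape shown above
function summarize(batchResults) {
  const successful = batchResults.filter(r => r.success).length
  return {
    total: batchResults.length,
    successful,
    failed: batchResults.length - successful,
  }
}

const counts = summarize([
  { success: true, index: 0 },
  { success: false, index: 1, error: 'Record not found' },
  { success: true, index: 2 },
])
console.log(counts) // { total: 3, successful: 2, failed: 1 }
```

Agents can branch on `failed > 0` to retry only the failed indices rather than re-running the whole batch.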

Response Projection for Large Batches

Use response_projection to reduce response size when processing large batches:

{
  "tool": "run_action_batch",
  "arguments": {
    "app_name": "Salesforce",
    "user_input": "Get opportunity summaries",
    "base_context": {
      "operationId": "listOpportunities"
    },
    "batch_context": [
      { "accountId": "001" },
      { "accountId": "002" },
      { "accountId": "003" }
    ],
    "response_projection": "opportunities[*].{name: name, amount: amount, stage: stageName}"
  }
}

This extracts only needed fields, reducing token usage and improving performance.
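The projection string above is JMESPath syntax. For readers unfamiliar with it, the equivalent field extraction in plain JavaScript looks like this (the input shape is assumed from the Salesforce example):

```javascript
// Plain-JS equivalent of the JMESPath projection
// "opportunities[*].{name: name, amount: amount, stage: stageName}"
function projectOpportunities(response) {
  return response.opportunities.map(o => ({
    name: o.name,
    amount: o.amount,
    stage: o.stageName,
  }))
}

const fullResponse = {
  opportunities: [
    { name: 'Acme renewal', amount: 50000, stageName: 'Negotiation', ownerId: '005xx', description: 'long text...' },
  ],
}
console.log(projectOpportunities(fullResponse))
// [ { name: 'Acme renewal', amount: 50000, stage: 'Negotiation' } ]
```

Everything not named in the projection (owner, description, and so on) is dropped before the response reaches the LLM.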

Large Batch Handling

For batches exceeding 128KB response size, Apigene automatically handles truncation:

{
  "summary": {
    "total": 1000,
    "successful": 1000,
    "failed": 0,
    "truncated": true,
    "returned_count": 500,
    "omitted_count": 500,
    "returned_indices": [0, 1, 2, ..., 499],
    "omitted_indices": [500, 501, 502, ..., 999],
    "next_action": "Use this batch_context to fetch remaining items..."
  }
}

The system provides exact instructions for fetching remaining items in subsequent calls.
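On the client side, the follow-up call is just a re-slice of the original batch_context by the omitted indices. A sketch, assuming the summary shape above:

```javascript
// Build the follow-up batch from a truncation summary:
// keep only the batch items whose indices were omitted
function remainingBatch(batchContext, summary) {
  if (!summary.truncated) return []
  return summary.omitted_indices.map(i => batchContext[i])
}

const batch = [{ id: 'a' }, { id: 'b' }, { id: 'c' }, { id: 'd' }]
const truncation = { truncated: true, omitted_indices: [2, 3] }
console.log(remainingBatch(batch, truncation)) // [ { id: 'c' }, { id: 'd' } ]
```

Looping until `truncated` is false drains an arbitrarily large batch in a handful of calls.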

Best Practices for Parallel Execution

1. Use run_action_batch for Same-Action Scenarios

When executing the same action multiple times with different parameters:

{
  "tool": "run_action_batch",
  "arguments": {
    "app_name": "Gmail",
    "base_context": { "operationId": "readEmail" },
    "batch_context": [
      { "email_id": "123" },
      { "email_id": "456" }
    ]
  }
}

2. Use run_multi_actions for Different Actions

When executing different actions from different apps:

{
  "tool": "run_multi_actions",
  "arguments": {
    "actions": [
      {
        "app_name": "Gmail",
        "context": { "operationId": "getUnreadCount" }
      },
      {
        "app_name": "Jira",
        "context": { "operationId": "listIssues" }
      }
    ]
  }
}

3. Combine Parallel Execution with Dynamic Tool Loading

Use parallel execution together with dynamic tool loading for maximum efficiency:

  1. Discover tools with list_actions (summary mode)
  2. Execute multiple actions in parallel with run_multi_actions
  3. Process results efficiently

4. Handle Errors Gracefully

Always check the summary object for success/failure counts:

{
  "summary": {
    "total": 10,
    "successful": 8,
    "failed": 2
  }
}

5. Use Response Projection for Large Datasets

Reduce response size with JMESPath projections:

{
  "response_projection": "items[*].{id: id, name: name}"
}

Performance Metrics

Latency Reduction

Operation Count    Sequential Time   Parallel Time   Reduction
5 operations       2,500ms           600ms           76%
10 operations      5,000ms           800ms           84%
20 operations      10,000ms          1,200ms         88%
50 operations      25,000ms          2,500ms         90%

Cost Savings

Parallel execution reduces LLM reasoning cycles:

  • Sequential: 1 reasoning cycle per operation = 10 cycles for 10 operations
  • Parallel: 1 reasoning cycle for all operations = 1 cycle total
  • Savings: 90% reduction in LLM calls

Implementation Details

How Apigene Implements Parallel Execution

Apigene's parallel execution uses JavaScript's Promise.all() to execute multiple operations concurrently:

// run_action_batch implementation (simplified)
const batchPromises = batch_context.map(async (batchItem, index) => {
  const mergedContext = { ...base_context, ...batchItem }
  try {
    const result = await executeAction(app_name, mergedContext)
    return { success: true, index, merged_context: mergedContext, result }
  } catch (error) {
    return { success: false, index, error: error.message }
  }
})
 
// Every promise resolves with a success flag, so a single failure
// cannot reject the whole batch
const results = await Promise.all(batchPromises)

This ensures:

  • All operations start simultaneously
  • Independent error handling per operation
  • Results returned in input order
  • Maximum concurrency without blocking

Concurrency Limits

Apigene handles large batches efficiently:

  • No hard limits on batch size
  • Automatic response truncation for >128KB results
  • Efficient memory management
  • Graceful degradation for very large batches
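When a client wants to cap concurrency itself, for instance against a rate-limited API, a common pattern is to chunk the batch and await each chunk in turn. This is a generic sketch of that pattern, not Apigene's internal scheduler:

```javascript
// Split an array into chunks of at most `size` items
function chunk(items, size) {
  const chunks = []
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size))
  }
  return chunks
}

// Keep at most `size` operations in flight:
// parallel within a chunk, sequential across chunks
async function runChunked(items, size, executeOne) {
  const results = []
  for (const group of chunk(items, size)) {
    results.push(...await Promise.all(group.map(executeOne)))
  }
  return results
}
```

With a chunk size of 10, a 100-item batch costs roughly 10 round-trips instead of 100, while never exceeding 10 concurrent requests.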

Comparison: Sequential vs Parallel Execution

Aspect            Sequential Execution      Parallel Execution
Total Time        Sum of all operations     Max of all operations
Latency           High (compounds)          Low (overlaps)
User Experience   Slow, frustrating         Fast, responsive
LLM Cycles        One per operation         One for all operations
Cost              High (more cycles)        Low (fewer cycles)
Throughput        Low                       High
Error Handling    Stops on first error      Continues on errors
Scalability       Poor (linear growth)      Excellent (constant)

Getting Started with Parallel Execution

Step 1: Identify Parallelizable Operations

Look for operations that:

  • Don't depend on each other's results
  • Can execute independently
  • Target the same or different applications

Step 2: Choose the Right Tool

  • Same action, different params: Use run_action_batch
  • Different actions: Use run_multi_actions
  • Single action: Use run_action

Step 3: Structure Your Request

For Batch:

{
  "tool": "run_action_batch",
  "arguments": {
    "app_name": "YourApp",
    "base_context": { "operationId": "yourAction" },
    "batch_context": [
      { "param1": "value1" },
      { "param1": "value2" }
    ]
  }
}

For Multi-Actions:

{
  "tool": "run_multi_actions",
  "arguments": {
    "actions": [
      {
        "app_name": "App1",
        "context": { "operationId": "action1" }
      },
      {
        "app_name": "App2",
        "context": { "operationId": "action2" }
      }
    ]
  }
}

Step 4: Handle Results

Check the summary for success/failure counts and process results accordingly:

{
  "summary": {
    "total": 10,
    "successful": 10,
    "failed": 0
  }
}

Real-World Example: Complete Workflow

Let's build a complete workflow that demonstrates parallel execution:

User Request: "Get my unread emails, check my Jira issues, and send a summary to Slack"

Step 1: Fetch Data in Parallel

{
  "tool": "run_multi_actions",
  "arguments": {
    "actions": [
      {
        "app_name": "Gmail",
        "user_input": "Get unread emails",
        "context": { "operationId": "listUnreadEmails" }
      },
      {
        "app_name": "Jira",
        "user_input": "List my open issues",
        "context": { "operationId": "listIssues", "assignee": "me" }
      }
    ]
  }
}

Step 2: Process Results and Send Summary

{
  "tool": "run_action",
  "arguments": {
    "app_name": "Slack",
    "user_input": "Send summary",
    "context": {
      "operationId": "chat.postMessage",
      "channel": "#updates",
      "text": "You have 5 unread emails and 3 open Jira issues"
    }
  }
}

Total Time: ~1,100ms (vs 2,400ms sequential)
Improvement: 2.2x faster

Conclusion

Sequential tool execution is a major bottleneck preventing AI agents from reaching their full potential. Apigene's parallel execution capabilities—run_action_batch and run_multi_actions—solve this challenge by enabling agents to execute multiple operations simultaneously.

The benefits are clear:

  • 10x faster execution for batch operations
  • 3-4x faster for multi-app workflows
  • 90% reduction in LLM reasoning cycles
  • Better user experience with responsive agents
  • Lower costs through reduced token usage

By combining parallel execution with dynamic tool loading, Apigene enables agents to scale efficiently—whether processing 10 operations or 10,000, the performance gains remain consistent.

Ready to accelerate your AI agents? Get started with Apigene's MCP Gateway and experience the power of parallel tool execution.



#parallel-execution #batch-processing #concurrent-tools #mcp-gateway #ai-agents #tool-execution #performance-optimization #latency-reduction #agent-efficiency #mcp #model-context-protocol