Best Practice | API Concurrency and Throttling Limits


Introduction

CMiC’s Cloud environment includes safeguards to ensure consistent performance and fair usage across users.

These safeguards manage how API requests are processed to prevent traffic spikes and maintain stable system performance under varying load conditions.

API Throughput Behavior

Our APIs enforce a limit of up to 5 concurrent API requests per IP address.

This limit applies at the IP address level, regardless of the number of users or systems making requests from that address. Integrators operating from shared or cloud-hosted environments should account for this when designing their request concurrency approach.

A concurrent request is defined as a request that has been sent to an endpoint and has been received by the server but has not yet been completed with a response.

A request is considered open until the server responds (e.g. 200, 400, 500), at which point the connection is considered closed.
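
As an illustration only, the sketch below (Python, standard library) caps an integration at the 5-request concurrency limit by sizing a worker pool to that limit; the endpoint URL and record IDs are placeholders rather than part of the CMiC API.

  # Sketch: cap in-flight requests at the concurrent-request limit by
  # sizing the worker pool to that limit.
  from concurrent.futures import ThreadPoolExecutor
  from urllib.request import urlopen

  MAX_CONCURRENT = 5  # concurrent-request limit per IP address
  BASE_URL = "https://example.invalid/api/records"  # placeholder endpoint

  def fetch_record(record_id):
      # Each in-flight call occupies one concurrent slot until the server responds.
      with urlopen(f"{BASE_URL}/{record_id}") as resp:
          return resp.read()

  record_ids = range(1, 51)  # placeholder workload
  with ThreadPoolExecutor(max_workers=MAX_CONCURRENT) as pool:
      results = list(pool.map(fetch_record, record_ids))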

Requests that do not receive a server response within 60 seconds will be terminated with a timeout. Integrators should treat timeout responses as failures and apply standard retry logic accordingly.
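
For example, the client-side timeout can be aligned with the 60-second limit and surfaced as an ordinary failure for the retry logic to handle; a minimal sketch:

  # Sketch: match the client timeout to the 60-second server limit and treat
  # a timeout as a failed attempt that feeds into normal retry handling.
  import socket
  from urllib.error import URLError
  from urllib.request import urlopen

  def fetch_with_timeout(url):
      try:
          with urlopen(url, timeout=60) as resp:
              return resp.read()
      except (socket.timeout, URLError) as exc:
          # Surface the timeout or failure so the caller's retry logic can take over.
          raise RuntimeError(f"Request failed or timed out: {exc}") from exc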

When this limit is exceeded, additional requests are rejected immediately with a 429 Too Many Requests response; they are not queued. The client is responsible for retry handling and should implement its own backoff logic based on the guidelines below.

CMiC does not currently return a Retry-After header, so backoff timing must be managed entirely on the client side.
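
As a defensive measure, a client can still check for a Retry-After header and fall back to its own schedule when the header is absent. A minimal sketch, assuming the response headers are available as a dict-like mapping and using the 5-second baseline suggested in the recommendations below:

  # Sketch: prefer a Retry-After header if one is ever introduced; otherwise
  # fall back to a client-side schedule (5 s baseline, doubling per attempt).
  def backoff_seconds(response_headers, attempt, base=5.0):
      retry_after = response_headers.get("Retry-After")
      if retry_after is not None:
          return float(retry_after)
      return base * (2 ** attempt)  # 5, 10, 20, ... seconds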

Request Throttling Behavior

The system supports up to 15 requests per second per IP address; however, in practice, the concurrent request limit is typically reached before this threshold and should be considered the primary control for integration design.

The 15 requests per second limit is most relevant in scenarios where requests are short-lived and resolve quickly, for example, lightweight status checks or single-record lookups that complete in milliseconds. In these cases, requests may open and close fast enough that the concurrent slot limit is never saturated, and the per-second rate becomes the binding constraint instead. For most data retrieval integrations involving larger payloads or complex queries, the concurrent limit will be reached first.
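
Where short-lived calls make the per-second rate the binding constraint, a simple client-side pacer can keep request starts at or below 15 per second; a minimal single-threaded sketch (a multi-threaded client would also need a lock around the shared timestamp):

  # Sketch: space out request starts so no more than 15 begin per second.
  import time

  MAX_PER_SECOND = 15
  MIN_INTERVAL = 1.0 / MAX_PER_SECOND  # roughly 0.067 s between request starts

  _last_start = 0.0

  def pace():
      # Sleep just long enough to keep request starts at or below the rate limit.
      global _last_start
      wait = MIN_INTERVAL - (time.monotonic() - _last_start)
      if wait > 0:
          time.sleep(wait)
      _last_start = time.monotonic()

Calling pace() immediately before each request start is sufficient for a single-threaded client.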

Error Handling and Retry Behavior

To ensure reliable integration behavior, clients should implement retry logic for transient errors and throttling responses.

429 – Too Many Requests

A 429 response indicates that request limits have been reached. It is a signal to slow down request execution to remain within supported limits.

When this occurs, clients should (a brief sketch follows the list):

  • Reduce request frequency immediately
  • Apply a backoff strategy before retrying
  • Avoid immediate or repeated retries that can increase contention
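
A minimal sketch of this pattern, assuming a placeholder send_request() helper that returns an object with a status_code attribute (not part of the CMiC API):

  # Sketch: on 429, back off before retrying instead of resending immediately.
  import time

  def call_with_429_backoff(send_request, max_attempts=3, base_delay=5.0):
      for attempt in range(max_attempts):
          response = send_request()
          if response.status_code != 429:
              return response
          # Throttled: reduce pressure and wait with an increasing delay
          # (5 s, 10 s, 20 s, ...) rather than retrying immediately.
          time.sleep(base_delay * (2 ** attempt))
      return response  # still throttled after max_attempts; escalate or defer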

5xx – Server Errors

For 5xx responses, clients should implement retry logic, as these errors may be temporary.

Recommended approach (illustrated in the sketch after the list):

  • Retry up to 3 attempts per request
  • Use an exponential backoff strategy between retries
  • Include a small amount of jitter to avoid synchronized retry spikes
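
A sketch of that approach, using the same kind of placeholder send_request() helper as above:

  # Sketch: retry 5xx responses up to 3 times with exponential backoff
  # plus a small random jitter to avoid synchronized retry spikes.
  import random
  import time

  def call_with_5xx_retry(send_request, max_attempts=3, base_delay=5.0):
      for attempt in range(max_attempts):
          response = send_request()
          if response.status_code < 500:
              return response
          time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
      return response  # persistent 5xx; log and escalate per the guidance below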

Recommendations

  • Use pagination to break large datasets into smaller requests
  • Pace pagination requests to avoid simultaneous parallel execution against the same dataset (a paced pagination sketch follows this list)
  • Design integrations to handle 429 responses with retry and exponential backoff (e.g. a 5-second baseline with increasing backoff)
  • Avoid unfiltered queries or sustained bursts of concurrent requests against the same dataset when only a subset of data is required
  • Monitor response times and adjust request pacing dynamically where possible
  • Combine filtering and pagination to improve both performance and response consistency
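
For example, a large data pull can be retrieved page by page, sequentially and with a short pause between pages, rather than fanning all page requests out in parallel. A minimal sketch, in which the fetch_page() helper, page size, and pause are placeholder assumptions:

  # Sketch: retrieve a large dataset page by page, sequentially and paced,
  # instead of issuing all page requests in parallel against one dataset.
  import time

  def fetch_all_pages(fetch_page, page_size=500, pause_seconds=0.5):
      offset = 0
      records = []
      while True:
          page = fetch_page(limit=page_size, offset=offset)  # placeholder helper
          records.extend(page)
          if len(page) < page_size:
              break  # final (partial or empty) page reached
          offset += page_size
          time.sleep(pause_seconds)  # pace successive page requests
      return records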

For more detailed guidance and examples of data filtering best practices, refer to Best Practice | Data Filtering requirements.

If a request continues to fail after the above retries, the failure should be recorded. If failures persist across multiple requests or over a sustained period, indicating a potential systemic issue rather than a transient error, the issue should be logged with CMiC Support and flagged for further investigation.


