API Rate Limits: Plans, Headers, and Retry Strategies

The Google API enforces rate limits to ensure consistent performance and fair resource usage across all customers. Limits are applied per API key and vary based on your plan. Understanding how limits work — and how to handle them gracefully in your code — is essential for building reliable integrations that behave predictably under load.

Rate Limit Tiers

Your rate limit is determined by the Google plan associated with your workspace and is measured in two windows: requests per minute (burst) and requests per day (total volume).

Plan	Requests per minute	Requests per day
Free	60	1,000
Pro	600	50,000
Enterprise	6,000	Unlimited
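
To stay inside both windows without waiting for the server to push back, a client can count its own requests. The following is a minimal sketch, not part of the official API: `PLAN_LIMITS` and `ClientSideLimiter` are hypothetical names, and the sketch assumes sliding 60-second and 24-hour windows on the client, since the server's exact windowing is not described here.

```python
import time
from collections import deque

# Limits copied from the table above; None means no daily cap (Enterprise).
PLAN_LIMITS = {
    "free": (60, 1_000),
    "pro": (600, 50_000),
    "enterprise": (6_000, None),
}


class ClientSideLimiter:
    """Hypothetical helper: tracks request timestamps in a minute and a day window."""

    def __init__(self, plan, clock=time.monotonic):
        self.per_minute, self.per_day = PLAN_LIMITS[plan]
        self.clock = clock
        self.minute = deque()  # timestamps of requests in the last 60 seconds
        self.day = deque()     # timestamps of requests in the last 24 hours

    def _prune(self, now):
        # Drop timestamps that have aged out of each window.
        while self.minute and now - self.minute[0] >= 60:
            self.minute.popleft()
        while self.day and now - self.day[0] >= 86_400:
            self.day.popleft()

    def try_acquire(self):
        """Return True and record the request if both windows have room."""
        now = self.clock()
        self._prune(now)
        if len(self.minute) >= self.per_minute:
            return False
        if self.per_day is not None and len(self.day) >= self.per_day:
            return False
        self.minute.append(now)
        if self.per_day is not None:
            self.day.append(now)
        return True
```

Call `try_acquire()` before each request and delay the request when it returns `False`. The injectable `clock` argument exists only to make the helper easy to test.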

If you need higher limits than your current plan provides, you can upgrade your plan from Settings > Billing in the Google dashboard. Enterprise customers can also contact support to discuss custom quota arrangements.

Rate Limit Headers

Every API response — including successful ones — includes headers that tell you exactly where you stand against your current rate limit window. Monitor these headers in your integration to proactively throttle requests before hitting a 429 error.

Header	Description
`X-RateLimit-Limit`	Your request limit per minute
`X-RateLimit-Remaining`	Requests remaining in current window
`X-RateLimit-Reset`	Unix timestamp when window resets

When `X-RateLimit-Remaining` reaches zero, further requests are rejected with a 429 Too Many Requests response until the window resets at the time indicated by `X-RateLimit-Reset`.
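
One way to throttle proactively is to compute a pause from these headers after every response. The sketch below is an illustration under stated assumptions: `seconds_to_wait` is a hypothetical helper, it accepts any mapping of header names to string values, and it treats missing or malformed headers as "no need to wait".

```python
import time


def seconds_to_wait(headers, now=None):
    """Return how long to pause before the next request, in seconds.

    Returns 0.0 when requests remain in the current window, or when the
    rate limit headers are missing or cannot be parsed as integers.
    """
    now = time.time() if now is None else now
    try:
        remaining = int(headers["X-RateLimit-Remaining"])
        reset_at = int(headers["X-RateLimit-Reset"])  # Unix timestamp in seconds
    except (KeyError, ValueError, TypeError):
        return 0.0
    if remaining > 0:
        return 0.0
    # The window is exhausted: wait until the reset time, never a negative duration.
    return max(reset_at - now, 0.0)
```

With the `requests` library, `response.headers` can be passed directly because its lookups are case-insensitive; a plain `dict` must use the exact header spelling shown in the table.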

Handling 429 Errors

When your integration receives a 429 response, pause and retry the request once the rate limit window has reset. If the response does not tell you when that is, fall back to exponential backoff: each failed attempt waits progressively longer before retrying, which spaces out requests and avoids hammering the API.

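JavaScript

async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) return response;

    // Prefer the server-provided reset time (Unix seconds); otherwise back off exponentially.
    // A missing header yields Number(null) === 0, which selects the backoff branch.
    const resetTime = Number(response.headers.get('X-RateLimit-Reset'));
    const waitMs = Number.isFinite(resetTime) && resetTime > 0
      ? resetTime * 1000 - Date.now()
      : Math.pow(2, attempt) * 1000;

    // Wait at least 1 second in case the reset time has already passed.
    await new Promise(resolve => setTimeout(resolve, Math.max(waitMs, 1000)));
  }
  throw new Error('Max retries exceeded');
}

Python

import time
import requests

def fetch_with_retry(url, headers, max_retries=3):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response

        # Prefer the server-provided reset time (Unix seconds); otherwise back off exponentially.
        reset_time = response.headers.get('X-RateLimit-Reset')
        wait = int(reset_time) - int(time.time()) if reset_time else 2 ** attempt
        # Wait at least 1 second in case the reset time has already passed.
        time.sleep(max(wait, 1))
    raise RuntimeError('Max retries exceeded')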

Both examples check the X-RateLimit-Reset header first and wait until the window actually resets, falling back to exponential backoff when the header is unavailable. Always enforce a maximum retry count to prevent infinite loops if the API is unexpectedly unavailable.

Best Practices

Cache responses whenever the underlying data doesn’t change frequently — caching even for 30–60 seconds can dramatically reduce your request volume. Batch requests where the API supports it to accomplish more per call. Most importantly, use webhooks instead of polling for any use case that involves waiting for a state change; a webhook delivers the event to you the moment it happens, costing a single request instead of dozens or hundreds of repeated polls. See the Webhooks reference for details on registering event listeners.
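
As an illustration of the caching advice, here is a minimal in-memory cache with a time-to-live. It is a sketch rather than a recommended library: `TTLCache` and `get_or_fetch` are hypothetical names, the 30-second default mirrors the lower end of the range mentioned above, and `fetch` stands for whatever function performs the real API call.

```python
import time


class TTLCache:
    """Hypothetical in-memory cache that keeps each value for `ttl` seconds."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for `key`, calling `fetch()` only on a miss or expiry."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()
        self._store[key] = (now + self.ttl, value)
        return value
```

Keying the cache by request URL means repeated reads of the same resource within the TTL cost one API request instead of many. Expired entries are only overwritten, never evicted, so a long-running process with many distinct keys would need an eviction step as well.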
