Issue
I've created code to call the PageSpeed Insights API.
build_cwv_data is an async coroutine that calls the API, then retrieves and processes the JSON data for a particular URL.
According to the documentation, the API has a limit of 400 requests per 100 seconds. Interestingly, it is at around the 100-second mark that the API starts returning a 409 error status code (quota exceeded).
My code is making approximately 775 calls in 100 seconds.
I don't understand how it is making so many calls in that time period, as I have added sleep delays to try to slow it down.
Why is it still so fast, and what can I do to slow it down?
```python
async def retrieve_cwv_data(urls_list):
    site_id = 10234
    tasks = []
    rate_limit = 2  # maximum number of API calls per second
    interval = 1 / rate_limit  # interval between API calls in seconds
    count = 0
    start_time = time.monotonic()  # initial start time
    for url in urls_list:
        task1 = asyncio.ensure_future(build_cwv_data(site_id, url, 'mobile', psi_key))
        task2 = asyncio.ensure_future(build_cwv_data(site_id, url, 'desktop', psi_key))
        tasks.append(task1)
        tasks.append(task2)
        count += 2
        if count >= rate_limit * 2:
            elapsed_time = time.monotonic() - start_time
            if elapsed_time < interval:
                # introduce delay to stay within the rate limit
                await asyncio.sleep(interval - elapsed_time)
            # reset count and start time for the next second
            count = 0
            start_time = time.monotonic()

    results = await asyncio.gather(*tasks)
    tmp_list = []
    for result in results:
        tmp_list.append(result)
    return tmp_list
```
Solution
You made the pause while *creating* the tasks, but they only start executing when you pass control from your code to the asyncio loop, in the call to `asyncio.gather`: at that point all your tasks are executed as fast as possible (each next task starts as soon as the previous one, internally, sends a request and awaits its response). In other words, you are making the calls as fast as your computer can go; the reason the failure only shows up around the 100th second is the delays you added before the tasks actually start making requests.
Always keep in mind that asyncio code is just regular, serialized code, running in a single thread with explicit pause points: no code outside of what you are looking at ever runs unless you reach one of those pause points (or, of course, delegate something to another thread or process). The pause points are the `await` keyword, `async for`, and `async with`.
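A minimal, self-contained sketch of this behavior (the `worker` coroutine here is purely illustrative): creating tasks with `asyncio.ensure_future` schedules them but runs nothing; they only start once the surrounding coroutine hits a pause point and yields control to the loop.

```python
import asyncio

async def worker(name):
    # this line runs only after the event loop gets control
    print(f"{name} started")
    await asyncio.sleep(0)
    return name

async def main():
    # creating the tasks does NOT run them yet
    tasks = [asyncio.ensure_future(worker("a")),
             asyncio.ensure_future(worker("b"))]
    print("tasks created, workers have not printed anything yet")
    # control passes to the loop here; only now do the workers run
    return await asyncio.gather(*tasks)

print(asyncio.run(main()))
```

Running this prints the "tasks created" line *before* either worker's "started" message, showing that any sleep performed while building the task list does not pace the requests themselves.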
You have to change your code so that the pacing happens where the requests are actually made. One way to do that is a semaphore that caps the number of concurrent in-flight requests at 400, combined with a pause that keeps each semaphore slot occupied for the full quota window.
```python
import asyncio
from asyncio import Semaphore
from time import time

call_semaphore = None
timeout = 100  # the quota window, in seconds

...

async def makecall(*args):
    async with call_semaphore:
        start = time()
        result = await build_cwv_data(*args)
        elapsed = time() - start
        # keep this task's semaphore slot occupied until the quota
        # window elapses, so at most 400 calls start per 100 seconds:
        await asyncio.sleep(max(0, timeout - elapsed))
        return result

async def retrieve_cwv_data(urls_list):
    global call_semaphore
    call_semaphore = Semaphore(400)
    site_id = 10234
    tasks = []
    for url in urls_list:
        task1 = asyncio.ensure_future(makecall(site_id, url, 'mobile', psi_key))
        task2 = asyncio.ensure_future(makecall(site_id, url, 'desktop', psi_key))
        tasks.append(task1)
        tasks.append(task2)
    results = await asyncio.gather(*tasks)
    return list(results)
```
Answered By - jsbueno