Issue
Let's say I have this Lambda function, which needs to do three things: 1/ import some "heavy modules" (e.g. Pandas, NumPy), 2/ request some nontrivial volume of data, and 3/ perform some analysis on that data.
What I think might be a plausible solution is to define three async functions: heavy_import_handler, query, and analyze. Importing modules in global scope via functions is a high-interest Q/A topic.
So, I should be able to initiate query, free up the CPU while waiting for the response, begin heavy_import_handler, and block analyze until the two previous functions complete.
Is this an anti-pattern? Is there a simpler approach?
Or perhaps this is a standard solution for Lambda where the heavy imports would be released from memory at end of execution?
(Bonus points: Would provisioned concurrency keep these imports "hot" in memory, or would the execution environments simply be cached so the latency is lower?)
Solution
So, I should be able to initiate query, free up the CPU while waiting for the response, begin heavy_import_handler, and block analyze until the two previous functions complete.
I'm not deeply familiar with async IO in Python, but I'll still post this answer because it most probably applies to Python as well.
Here you can find a diagram of the Lambda life cycle (I'm not sure if I'm allowed to copy it here).
You are paying for the duration of the INIT, INVOKE and SHUTDOWN blocks of this diagram.
These blocks are invoked by hooks of the Lambda runtime. They can spawn asynchronous virtual threads / tasks / callbacks / promises / whatever else your runtime offers, but the invocation of each block ends as soon as the hook returns.
You can store data and references to resources in the global memory of your container, that is, in variables defined outside the handler's scope. This is a supported and even recommended way to store references to shared resources that are expensive to create on every invocation, such as database connections, file handles and, in your case, heavy modules.
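As a minimal sketch of that pattern (the resource and the counter here are made up for illustration; in a real function the resource might be a database client or a heavy module):

```javascript
// Module scope: runs during INIT, once per container.
// Anything defined here survives across invocations in the same container.
let initCount = 0;

// Hypothetical expensive setup; stands in for opening a DB connection
// or importing a heavy module.
function createExpensiveResource() {
  initCount += 1;
  return { ready: true };
}

const sharedResource = createExpensiveResource(); // created once per container

// The handler runs on every INVOKE and reuses the shared resource.
const handler = async (event) => {
  return { resourceReady: sharedResource.ready };
};
```

Every invocation served by the same container reuses `sharedResource`; only a fresh container pays the setup cost again.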
There's a tricky part to it. Lambda containers run under a virtualization system called Firecracker. This system, among other things, makes sure that no AWS resources are wasted while your Lambdas are not actively running. To this end, Firecracker freezes Lambda containers when they are not in an active stage of their lifetime (i.e. when none of the blocks is actively executing). During the periods marked with dashed lines, nothing runs in your Lambda: no async code, no callbacks, nothing. It's frozen.
Asynchronous calls in all Lambda languages I'm familiar with (that is JavaScript, C# and Java) are implemented under the hood as callbacks that the runtime calls in response to some external events (epoll, libuv, io_uring, synchronization primitives, and so on).
When your Lambda is frozen, the runtime is frozen too. It will not emit those callbacks even if conditions are right for them. Your asynchronous code will not run outside the billing periods.
Let's look at this piece of JavaScript code:
// INIT block starts
const modulePromise = (async () => import("module"))();
// INIT block completes.
// During INIT, the handler is defined but not executed.
// import returns a promise and initiates reading the module code from disk
// but does not wait for the promise to be fulfilled.
export const handler = async (event) => {
  // INVOKE block starts
  const [module, data] = await Promise.all([modulePromise, queryData(event)]);
  await module.analyze(data);
  // INVOKE block completes
};
In a usual environment (like a normal Node instance on a normal machine), calling import would initiate a choreographed sequence of Linux library calls and JS callbacks running in the background that would eventually fulfill the promise returned by import. If you call handler several seconds after initializing your container, by the time it's called, modulePromise would likely already be resolved, and your handler would only have to wait for query.
In a Firecracker environment, the execution is frozen the moment the INIT hook completes (that is, almost instantly). Your import will initiate the read of the first block of data from the filesystem by (eventually) calling libuv (or whatever it is that Node uses under the hood to read from the filesystem) and registering a callback to be called when a response comes. But when the data arrives from the disk, there will be no active Node process to read it and call the callback. The callback dance will only continue when your execution environment is thawed, so by the time handler is first invoked, modulePromise in all likelihood will not have been resolved yet.
The good news is that the module import will still be parallelized with the data fetch on the first invocation of the handler. The import will only happen once, and the module code will be retained in the global memory of the container.
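Outside Lambda, the concurrency itself is easy to observe in a plain Node process (the 100 ms delays below are made-up stand-ins for the module load and the data fetch):

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Stand-ins for the module import and the data query.
async function loadModule() { await sleep(100); return "module"; }
async function queryData() { await sleep(100); return "data"; }

async function main() {
  const start = Date.now();
  // Both promises are in flight at the same time, so the total wait
  // is roughly max(100, 100) ms, not 100 + 100 ms.
  const [mod, data] = await Promise.all([loadModule(), queryData()]);
  return { mod, data, elapsed: Date.now() - start };
}
```

Running `main()` finishes in roughly one task's worth of time, which is exactly what the handler above gets on its first invocation.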
Some (potentially) bad news is that a lot of time, sometimes minutes, can pass between the initialization and first invocation. It probably doesn't matter for filesystem reads, but it might be enough for some resources (TCP connections, pre-signed AWS requests etc.) to go stale. You don't really have any control over it.
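If staleness matters, one defensive sketch (the TTL and the refresh logic are illustrative, not anything Lambda provides) is to timestamp the cached resource and re-create it inside the handler when it is too old:

```javascript
const TTL_MS = 60_000; // hypothetical freshness window

let cached = null; // { resource, createdAt }

// Stand-in for opening a TCP connection, pre-signing a request, etc.
function createResource() {
  return { id: Symbol("resource") };
}

function getResource(now = Date.now()) {
  if (cached === null || now - cached.createdAt > TTL_MS) {
    cached = { resource: createResource(), createdAt: now };
  }
  return cached.resource;
}

const handler = async (event) => {
  // An arbitrarily long freeze may have passed since INIT,
  // so freshness is re-checked on every invocation.
  return getResource();
};
```

The check costs nothing on the happy path and protects you from a connection that silently died while the container sat frozen.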
Or perhaps this is a standard solution for Lambda where the heavy imports would be released from memory at end of execution?
Imports (as well as anything that lives in the global container data) will only be released from memory at the end of the container's lifetime (up to several hours and tens of thousands of handler invocations).
Under heavy load, Lambda actually preallocates containers (creating them and running the INIT code in advance). You'll get billed for that, of course. If that's the case, and if it's the user experience you're most worried about, you will actually be better off just doing regular synchronous imports. They will be done by the time the first invocation (the "cold start") of your handler happens, so it will actually be faster.
(Bonus points: Would provisioned concurrency keep these imports "hot" in memory, or would the execution environments simply be cached so the latency is lower?)
Concurrently executing Lambdas execute in multiple containers. Within a single container, the execution is strictly sequential. Containers do not share any memory. Two different containers will import the module two times (and bill you for both).
Answered By - Quassnoi