Issue
I am trying to create a Lambda function that will automatically clean CSV files from an S3 bucket. The S3 bucket receives files every 5 minutes, so I have created a trigger for the Lambda function. To clean the CSV files I use the pandas library to create a dataframe. I have already installed a pandas layer. When creating the dataframe, I get an error message. This is my code:
import json
import boto3
import pandas as pd
from io import StringIO
# create an S3 client
client = boto3.client('s3')
def lambda_handler(event, context):
    # get bucket_name and object_name from the event
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    object_name = event['Records'][0]['s3']['object']['key']
    # create a df from the object
    df = pd.read_csv(object_name)
This is the error message:
[ERROR] FileNotFoundError: [Errno 2] No such file or directory: 'object_name'
On Cloudwatch it additionally says:
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
Has anyone experienced the same issues? Thanks in advance for all your help!
Solution
You have to use the S3 client to download the file from S3 before passing it to pandas. Something like:
response = client.get_object(Bucket=bucket_name, Key=object_name)
df = pd.read_csv(response["Body"])
You'll also have to make sure the Lambda execution role has the right permissions (e.g. s3:GetObject) on the S3 bucket.
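For reference, a minimal end-to-end handler could look like the sketch below. The actual cleaning step and the output location are assumptions: clean rows however you need, and the "cleaned/" prefix is just an illustrative destination. Note that object keys in S3 event notifications arrive URL-encoded, so they are decoded with urllib.parse.unquote_plus before calling get_object.

import json
import urllib.parse
import boto3
import pandas as pd
from io import StringIO

client = boto3.client('s3')

def lambda_handler(event, context):
    # bucket and key from the S3 event; keys are URL-encoded in the notification
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    object_name = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])

    # download the object and let pandas read the streaming body
    response = client.get_object(Bucket=bucket_name, Key=object_name)
    df = pd.read_csv(response['Body'])

    # placeholder cleaning step (assumption: replace with your own rules)
    df = df.dropna(how='all')

    # write the cleaned CSV back under a "cleaned/" prefix (assumption)
    buffer = StringIO()
    df.to_csv(buffer, index=False)
    client.put_object(Bucket=bucket_name, Key='cleaned/' + object_name, Body=buffer.getvalue())

    return {'statusCode': 200, 'body': json.dumps('cleaned ' + object_name)}

If you write the output back into the same bucket that triggers the function, scope the trigger to a prefix or suffix so the cleaned files don't re-invoke the Lambda in a loop.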
Answered By - Mimi