Wednesday, July 27, 2022

[FIXED] DataFrame update sqlite3 table


Issue

I receive an .xlsx file and need to update an SQLite3 table. The code below works, but it is rather slow and I have a feeling that I am doing something wrong. Please advise how to speed up the update process. Thanks in advance.

Step 1) Use regex to split the data into 3 DataFrames
Step 2) Clean the data and create a dictionary
Step 3) Update the SQLite3 table while iterating through the dictionary in a double for loop

import pandas as pd
import sqlite3


def clean(data):
    df = data[['loc', 'date']].reset_index(drop = True)#Filtering columns that i need
    df['date'] = df['date'].dt.isocalendar().week #Change column values to weeks
    return df

def update_cycle_counting(df):
    #Regex to filter data
    m = df[df['loc'].str.contains(r'A-[a-zA-Z]\d{2}-\d{3}-\d{2}.\d{2}|E[a-zA-Z]\d{3}-\d{4}|M[a-zA-Z]\d{3}-\d{4}|SAFE\d*')]
    j1 = df[df['loc'].str.contains(r'C-[a-zA-Z]\d{2}-\d{3}-\d{2}.\d{2}')]
    j2 = df[df['loc'].str.contains(r'B-[a-zA-Z]\d{2}-\d{3}-\d{2}.\d{2}')]
    #Assign cleaned data to new variables
    m = clean(m)
    j1 = clean(j1)
    j2 = clean(j2)
    #Creating dictionary to loop thru
    wh = {'m':m, 'j1':j1, 'j2':j2}
    #Create path and connect to database
    path ='count.db'
    conn = sqlite3.connect(path)
    #Loop dict table names == dict.keys
    for k,v in wh.items():
        #Updating rows
        for i, row in v.iterrows():
            cur = conn.cursor()
            cur.execute(f'UPDATE {k} SET "{row[1]}"= 1 WHERE "loc" = "{row[0]}";')
            conn.commit()
            cur.close()
    conn.close()

Solution

Your code doesn't work because you use the f-string in the wrong way.

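If you want to create a query like

UPDATE test SET "1" = "1" + 1 WHERE bin = 'c'

then you have to add the { } placeholders and the quotes. Use double quotes around the column name (in SQLite, '1' on the right-hand side of SET is a string literal, not a column) and single quotes around the value:

f"""UPDATE test SET "{row[2]}" = "{row[2]}" + 1 WHERE bin = '{row[1]}';"""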
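The same increment can be sketched with Python's sqlite3 module without formatting the value into the SQL text. This is only an illustration under assumptions: the table `test`, its columns, and the helper `increment` are hypothetical stand-ins for the example query above. The value is bound with a `?` placeholder; the column name cannot be bound as a parameter, so it is double-quoted and formatted in.

```python
import sqlite3

# Hypothetical table mirroring the example query: a "bin" key and a counter column named "1".
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE test (bin TEXT PRIMARY KEY, "1" INTEGER DEFAULT 0)')
conn.executemany("INSERT INTO test (bin) VALUES (?)", [("a",), ("b",), ("c",)])


def increment(conn, column, bin_value):
    """Add 1 to `column` for the row whose bin equals `bin_value` (hypothetical helper)."""
    # Identifiers cannot be bound with "?", so the column name is double-quoted
    # (embedded double quotes doubled) and formatted into the statement.
    quoted = '"' + str(column).replace('"', '""') + '"'
    # The value itself is passed as a bound parameter.
    conn.execute(f"UPDATE test SET {quoted} = {quoted} + 1 WHERE bin = ?", (bin_value,))


increment(conn, "1", "c")
increment(conn, "1", "c")
conn.commit()

rows = conn.execute('SELECT bin, "1" FROM test ORDER BY bin').fetchall()
print(rows)
```

Binding the value avoids quoting problems when a `bin` value itself contains a quote character.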

But I don't see why you would use SQLite if you can do it directly in the DataFrame:

for index, row in upd.iterrows():
    df.loc[ df['bin'] == row['bin'], row['data']] += 1

Minimal working code:

import pandas as pd

df = pd.DataFrame({
    'bin': ['a', 'b', 'c', 'd', 'e'],
    '1': [0, 0, 0, 0, 0],
    '2': [0, 1, 0, 0, 0], 
    '3': [0, 0, 0, 0, 0],
    "type": ['x', 'x', 'x', 'x', 'x']
})

upd = pd.DataFrame({'bin': ['b', 'c'], 'data': ['2', '3']})

print('--- before ---')
print(df)

for index, row in upd.iterrows():
    df.loc[ df['bin'] == row['bin'], row['data']] += 1

print('--- after ---')
print(df)

Result:

--- before ---
  bin  1  2  3 type
0   a  0  0  0    x
1   b  0  1  0    x
2   c  0  0  0    x
3   d  0  0  0    x
4   e  0  0  0    x
--- after ---
  bin  1  2  3 type
0   a  0  0  0    x
1   b  0  2  0    x
2   c  0  0  1    x
3   d  0  0  0    x
4   e  0  0  0    x
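If the data has to stay in SQLite, the slowness in the question most likely comes from committing after every row. A minimal sketch of batching the updates, assuming the layout from the question (one table per warehouse with a `loc` column and one column per week number); the in-memory table, the sample rows, and the helper `update_weeks` are hypothetical:

```python
import sqlite3

import pandas as pd


def update_weeks(conn, table, df):
    """Set the week column to 1 for every loc in df, one executemany per week."""
    # df is assumed to have the columns "loc" and "date" (a week number),
    # like the frames returned by clean() in the question.
    for week, group in df.groupby("date"):
        # Column names cannot be bound as parameters, so quote them explicitly.
        column = '"' + str(week).replace('"', '""') + '"'
        params = [(loc,) for loc in group["loc"]]
        conn.executemany(f'UPDATE "{table}" SET {column} = 1 WHERE loc = ?', params)


# Hypothetical stand-in for count.db: table "m" with week columns "30" and "31".
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE m (loc TEXT PRIMARY KEY, "30" INTEGER DEFAULT 0, "31" INTEGER DEFAULT 0)'
)
conn.executemany("INSERT INTO m (loc) VALUES (?)", [("SAFE1",), ("SAFE2",), ("SAFE3",)])

cleaned = pd.DataFrame({"loc": ["SAFE1", "SAFE3"], "date": [30, 31]})

# Using the connection as a context manager commits once at the end
# instead of once per row.
with conn:
    update_weeks(conn, "m", cleaned)

result = conn.execute('SELECT loc, "30", "31" FROM m ORDER BY loc').fetchall()
print(result)
conn.close()
```

This sketch also assumes `loc` is indexed (here through the primary key); without an index every UPDATE scans the whole table, which a single commit alone would not fix.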

Answered By - furas

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
