Issue
I have a directory that contains multiple CSV files named in a similar pattern, e.g. '1000 x 30.csv', '1000 y 30.csv', or '1111 z 60.csv'. Each CSV file has two columns, the x-axis and y-axis values, which I want to store in separate arrays. I want to enter an input such as 1000 x 30 so that the program reads the columns of the matching file (1000 x 30.csv) and stores them in arrays. My current code works when I enter the full path of a particular file; instead, I want it to look through the directory and give me the array values when I enter just the file name. Any suggestions would really help me.
import csv
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit
from scipy import asarray as ar,exp
import lmfit
import glob

# reading the x/y/z values from the respective csv files
xData = []
yData = []
path = r'C:\Users\angel\OneDrive\Documents\CSV_FILES_NV_LAB\1111 x 30.csv'
with open(path, "r") as f_in:
    reader = csv.reader(f_in)
    next(reader)
    for line in reader:
        try:
            float_1, float_2 = float(line[0]), float(line[1])
            xData.append(float_1)
            yData.append(float_2)
        except ValueError:
            continue
Solution
I think the solution below should get you started. I've commented the code where needed, and pointed to a couple of SO questions/answers.
Note: for your next question, please provide some pruned and sanitized sample input files. I had to guess a bit as to what the exact input was. Remember, the better your question, the better your answer.
Input files, generated by keyboard mashing:
path/to/csv/files/1111 x 30.csv
x,y
156414.4189,84181.46
16989.177,61619.4698974
path/to/csv/files/11 z 205.csv
x,z
3.123123,56.1231
123.6546,645767.654
65465.4561989,97946.56169
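To reproduce the example, the two sample files can be written with a few lines of Python. This is only a convenience sketch and not part of the original answer: the `write_samples` name, the `SAMPLES` constant, and the use of a temporary directory in place of path/to/csv/files are all assumptions. The file contents are exactly the ones listed above.

```python
import os
import tempfile

# file name -> file content, copied from the sample listing above
SAMPLES = {
    '1111 x 30.csv': 'x,y\n156414.4189,84181.46\n16989.177,61619.4698974\n',
    '11 z 205.csv': 'x,z\n3.123123,56.1231\n123.6546,645767.654\n'
                    '65465.4561989,97946.56169\n',
}


def write_samples(directory: str) -> list:
    """Write the sample CSV files into directory and return their paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for name, content in SAMPLES.items():
        filepath = os.path.join(directory, name)
        # newline='' keeps the line endings exactly as written above
        with open(filepath, 'w', newline='') as outfile:
            outfile.write(content)
        paths.append(filepath)
    return paths


# a temporary directory stands in for path/to/csv/files (assumption)
sample_dir = tempfile.mkdtemp()
created = write_samples(sample_dir)
print(created)
```

The printed paths can then be passed to any CSV-reading code, or `sample_dir` can be used wherever a directory of CSV files is expected.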
Actual code:
main.py
import os
import csv

def get_files_from_path(path: str) -> list:
    """return list of files from path"""
    # see the answer on the link below for a ridiculously
    # complete answer for this. I tend to use this one.
    # note that it also goes into subdirs of the path
    # https://stackoverflow.com/a/41447012/9267296
    result = []
    for subdir, dirs, files in os.walk(path):
        for filename in files:
            filepath = subdir + os.sep + filename
            # only return .csv files
            if filename.lower().endswith('.csv'):
                result.append(filepath)
    return result

def load_csv(filename: str) -> list:
    """load a CSV file and return it as a list of dict items"""
    result = []
    # note that if you open a file for reading, you don't need
    # to use the 'r' part
    with open(filename) as infile:
        reader = csv.reader(infile)
        # get the column names
        # https://stackoverflow.com/a/28837325/9267296
        # doing this as you state that you're dealing with
        # x/y and x/z values
        column0, column1 = next(reader)
        for line in reader:
            try:
                result.append({column0: float(line[0]),
                               column1: float(line[1])})
            except Exception as e:
                # I always print out error messages
                # in case of random weird things
                print(e)
                continue
    return result

def load_all(path: str) -> dict:
    """loads all CSV files into a dict"""
    result = {}
    csvfiles = get_files_from_path(path)
    for filename in csvfiles:
        # extract the filename without extension
        # and use it as key name
        # since we only load .csv files we can just
        # remove the last 4 characters from filename
        # https://stackoverflow.com/a/57770000/9267296
        keyname = os.path.basename(filename)[:-4]
        result[keyname] = load_csv(filename)
    return result

from pprint import pprint

# named all_data rather than all, so the built-in all() is not shadowed
all_data = load_all('path/to/csv/files')
pprint(all_data)
print('\n--------------------\n')
pprint(all_data['11 z 205'])
Output:
{'11 z 205': [{'x': 3.123123, 'z': 56.1231},
{'x': 123.6546, 'z': 645767.654},
{'x': 65465.4561989, 'z': 97946.56169}],
'1111 x 30': [{'x': 156414.4189, 'y': 84181.46},
{'x': 16989.177, 'y': 61619.4698974}]}
--------------------
[{'x': 3.123123, 'z': 56.1231},
{'x': 123.6546, 'z': 645767.654},
{'x': 65465.4561989, 'z': 97946.56169}]
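To get from the loaded dictionary to the two separate arrays the question asks for, one option is to look up the entry by the typed name and split its rows by column. The following is a minimal sketch and not part of the original answer: `split_columns` and `lookup` are hypothetical helper names, and the sketch assumes the data has the same list-of-dicts shape as the output shown above.

```python
def split_columns(rows: list) -> dict:
    """Turn a list of row dicts into a dict of column name -> list of values."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def lookup(all_data: dict, name: str) -> dict:
    """Return the columns for the file whose name (without .csv) was typed."""
    key = name.strip()
    # tolerate a trailing ".csv" in what the user typed
    if key.lower().endswith('.csv'):
        key = key[:-4]
    if key not in all_data:
        raise KeyError(f'no CSV file named {key!r}; available: {sorted(all_data)}')
    return split_columns(all_data[key])


# sample data in the same shape as the output shown above
sample = {
    '11 z 205': [{'x': 3.123123, 'z': 56.1231},
                 {'x': 123.6546, 'z': 645767.654},
                 {'x': 65465.4561989, 'z': 97946.56169}],
    '1111 x 30': [{'x': 156414.4189, 'y': 84181.46},
                  {'x': 16989.177, 'y': 61619.4698974}],
}

columns = lookup(sample, '1111 x 30')
xData = columns['x']
yData = columns['y']
print(xData)
print(yData)
```

In a real script the name would come from something like `input('File name: ')`, and `sample` would be replaced by the dictionary returned by `load_all`. If NumPy arrays are needed rather than lists, `np.asarray(xData)` converts each one.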
Answered By - Edo Akse