Issue
I want to create a DataFrame of all possible combinations from a full factorial design of experiments (DOE).
I'm using doepy, but it slows down as the number of combinations in my DOE grows. I switched to taking the Cartesian product with product_dict(), which seems to be faster but gives the combinations in a different order.
I need the DataFrame of DOE combinations to be in the same order as the one given by doepy. I think that's possible, but I'm unsure how.
My question is: how do I compute the Cartesian product of gas_dict and get the results in the same order as doepy?
import pandas as pd
from doepy import build
from tqdm.contrib.itertools import product

gas_dict = {
    'Velocity (m/s)': [0.00000000E+00, 0.10000000E+00, 0.20000000E+00, 0.30000000E+00,
                       0.40000000E+00, 0.60000000E+00, 0.10000000E+01],
    'Pressure (Pa)': [0.10000000E+06, 0.50000000E+06, 0.10000000E+07, 0.20000000E+07,
                      0.40000000E+07],
    'Temperature': [0.30000000E+03, 0.40000000E+03, 0.50000000E+03, 0.60000000E+03],
    'Equivalence Ratio': [0.10000000E+00, 0.50000000E+00, 0.60000000E+00, 0.70000000E+00,
                          0.80000000E+00, 0.90000000E+00, 0.10000000E+01, 0.11000000E+01,
                          0.12000000E+01, 0.13000000E+01],
}

def product_dict(**kwargs):
    # Yield one dict per combination, keyed by factor name
    keys = kwargs.keys()
    vals = kwargs.values()
    for instance in product(*vals):
        yield dict(zip(keys, instance))

gas = build.full_fact(gas_dict)  # Correct order
gas_product = list(product_dict(**gas_dict))  # Incorrect order
gas_product = pd.DataFrame(gas_product)
gas.equals(gas_product)  # False
compare = gas == gas_product  # element-wise comparison
Solution
I think this comes down to the order of the arguments to the product function, which determines how the arrays are cycled through when the Cartesian product is built. product varies its last argument fastest, whereas the doepy output appears to vary the first factor fastest. If you reverse the order of the arrays in the input dictionary, the two outputs become the same. (I have only verified this for the two implementations used here, so more detailed insights would be welcome.)
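To see the ordering effect on a case small enough to read by eye, here is a minimal sketch that uses only itertools. The two-factor dictionary and the helper names rows_last_fastest and rows_first_fastest are made up for illustration and are not part of doepy or of the question's code.

```python
from itertools import product

# Hypothetical two-factor design, small enough to inspect by eye.
factors = {"A": [1, 2], "B": [10, 20, 30]}

def rows_last_fastest(factors):
    """Plain itertools.product: the last factor changes fastest."""
    keys = list(factors)
    return [dict(zip(keys, combo)) for combo in product(*factors.values())]

def rows_first_fastest(factors):
    """Reverse the factors before calling product, then restore the key order."""
    keys = list(factors)
    rev_keys = keys[::-1]
    rows = []
    for combo in product(*(factors[k] for k in rev_keys)):
        row = dict(zip(rev_keys, combo))
        rows.append({k: row[k] for k in keys})
    return rows

last_fastest = rows_last_fastest(factors)
first_fastest = rows_first_fastest(factors)

for row in first_fastest:
    print(row)
```

Both lists contain the same six combinations; only the row order differs. In first_fastest the column A alternates on every row, which is the pattern the reversed-dictionary approach reproduces.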
# reverse the dictionary keys
reversed_cols = list(gas_dict.keys())[::-1]
# create the reversed input dictionary (values sorted in ascending order)
gas_dict_rev = {c: sorted(gas_dict[c]) for c in reversed_cols}
# create the Cartesian product with reversed column order
gas_product_rev = list(product_dict(**gas_dict_rev))
gas_product_rev = pd.DataFrame(gas_product_rev)
# restore the column order of the original dict
gas_product_rev = gas_product_rev[list(gas_dict.keys())]
The output then looks the same visually, but gas.equals(gas_product_rev) still reports False for me. I'm not familiar with this function, but DataFrame.equals requires the values and dtypes to match exactly, so I'd guess it does not allow for float precision. Checking with a NumPy function that does allow for float precision gives the expected result:
import numpy as np

for column in gas:
    print(f'{column} {np.allclose(gas[column], gas_product[column])}')
# Velocity (m/s) False
# Pressure (Pa) False
# Temperature False
# Equivalence Ratio False
for column in gas:
    print(f'{column} {np.allclose(gas[column], gas_product_rev[column])}')
# Velocity (m/s) True
# Pressure (Pa) True
# Temperature True
# Equivalence Ratio True
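The gap between an exact comparison and a tolerance-based one can be shown with plain Python floats. This is only an illustration of the general effect, with made-up values; it does not establish why equals returned False for the DataFrames above, since a dtype difference between the two frames would have the same result.

```python
import math

# Made-up values: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
a = [0.1 + 0.2, 1.0, 300.0]
b = [0.3, 1.0, 300.0]

# Exact comparison, similar in spirit to an element-by-element equality test.
exact_match = a == b

# Tolerance-based comparison, similar in spirit to np.allclose.
close_match = all(math.isclose(x, y, rel_tol=1e-9) for x, y in zip(a, b))

print(exact_match, close_match)
```

If the frames are expected to agree only up to rounding, a tolerance-based check like the np.allclose loop above is the more appropriate test.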
Answered By - L_W