Issue
I have three ranges and want to build a 3-column array containing every possible combination of their values, in a specific order. I know how to do this with a loop. However, the real data will have many more than 3 columns and the ranges are very large, so I expect a loop to be slow and would like a faster approach. The real dataset will be approximately 5 GB, so efficiency is key for me. As an example:
import numpy as np
inc = 1
a = np.arange(1001,1002+inc,inc)
b = np.arange(1,3+inc,inc)
c = np.arange(1,5+inc,inc)
I want to create an output that looks like:
array([[1001, 1, 1],
[1001, 1, 2],
[1001, 1, 3],
[1001, 1, 4],
[1001, 1, 5],
[1001, 2, 1],
[1001, 2, 2],
[1001, 2, 3],
[1001, 2, 4],
[1001, 2, 5],
[1001, 3, 1],
[1001, 3, 2],
[1001, 3, 3],
[1001, 3, 4],
[1001, 3, 5],
[1002, 1, 1],
[1002, 1, 2],
[1002, 1, 3],
This output is incomplete, but it shows the pattern I want. I should add that I am doing this because I have an input table in the same format but with missing rows, and I want to identify those rows by comparing the input dataset to this 'ideal' table. As mentioned above, I can do this with a for loop, but I would like a more Pythonic way of doing it if possible.
Solution
You can do it easily with the built-in itertools.product:
import itertools as it
import numpy as np
# Every (a, b, c) combination; the last argument varies fastest
perms = np.array(list(it.product(a, b, c)))
Output:
>>> perms
array([[1001, 1, 1],
[1001, 1, 2],
[1001, 1, 3],
[1001, 1, 4],
[1001, 1, 5],
[1001, 2, 1],
[1001, 2, 2],
[1001, 2, 3],
[1001, 2, 4],
[1001, 2, 5],
[1001, 3, 1],
[1001, 3, 2],
[1001, 3, 3],
[1001, 3, 4],
[1001, 3, 5],
[1002, 1, 1],
[1002, 1, 2],
[1002, 1, 3],
[1002, 1, 4],
[1002, 1, 5],
[1002, 2, 1],
[1002, 2, 2],
[1002, 2, 3],
[1002, 2, 4],
[1002, 2, 5],
[1002, 3, 1],
[1002, 3, 2],
[1002, 3, 3],
[1002, 3, 4],
[1002, 3, 5]])
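Since the question mentions a table of around 5 GB, it may be worth sketching a NumPy-only variant that avoids building an intermediate Python list of tuples. This is not the method from the answer above: it swaps in np.meshgrid with indexing="ij", which should give the same row order as itertools.product. The helper names cartesian_product and missing_rows are hypothetical, chosen only for illustration, and missing_rows assumes the 'ideal' table has no duplicate rows. Note that the full table is still materialized in memory either way.

```python
import numpy as np


def cartesian_product(*arrays):
    """Return all combinations of the 1-D inputs as rows (hypothetical helper).

    The last input varies fastest, matching itertools.product ordering.
    """
    # indexing="ij" makes the first array the slowest-varying axis.
    grids = np.meshgrid(*arrays, indexing="ij")
    # Stack the grids on a new last axis, then flatten to (n_rows, n_columns).
    return np.stack(grids, axis=-1).reshape(-1, len(arrays))


def missing_rows(ideal, observed):
    """Return rows of `ideal` that never occur in `observed` (hypothetical helper).

    Assumes `ideal` contains no duplicate rows.
    """
    combined = np.concatenate([ideal, observed], axis=0)
    # Label every row with the index of its unique value and count occurrences.
    _, inverse, counts = np.unique(
        combined, axis=0, return_inverse=True, return_counts=True
    )
    inverse = np.ravel(inverse)  # guard against shape differences across NumPy versions
    # An ideal row seen exactly once overall is absent from `observed`.
    only_in_ideal = counts[inverse[: len(ideal)]] == 1
    return ideal[only_in_ideal]


# Usage with the ranges from the question
inc = 1
a = np.arange(1001, 1002 + inc, inc)
b = np.arange(1, 3 + inc, inc)
c = np.arange(1, 5 + inc, inc)

ideal = cartesian_product(a, b, c)          # 2 * 3 * 5 = 30 rows, 3 columns
observed = np.delete(ideal, [3, 17], axis=0)  # simulate an input table with gaps
print(missing_rows(ideal, observed))
```

The second helper covers the comparison step described in the question: it concatenates both tables and relies on np.unique with axis=0, so it sorts the combined rows once instead of looping over them in Python.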
Answered By - richardec