Issue
I have a dataset with parallel time series. The column 'A' depends on columns 'B' and 'C'. The order (and the number) of dependent columns can change. For example:
A B C
2022-07-23 1 10 100
2022-07-24 2 20 200
2022-07-25 3 30 300
How should I transform this data, or how should I build the model so the order of columns 'B' and 'C' ('A', 'B', 'C' vs 'A', C', 'B'`) doesn't change the result? I know about GCN, but I don't know how to implement it. Maybe there are other ways to achieve it.
UPDATE:
I want to generalize my question and make one more example. Let's say we have a matrix as a singe observation (no time series data):
col1 col2 target
0 1 a 20
1 2 a 30
2 3 b 30
3 4 b 40
I would like to predict one value 'target' per each row/instance. Each instance depends on other instances. The order of rows is irrelevant, and the number of rows in each observation can change.
Solution
You are looking for a permutation invariant operation on the columns.
One way of achieving this would be to apply column-wise operation, followed by a global pooling operation.
How that achieves your goal:
- column-wise operations are permutation equivariant; that is, applying the operation on the columns and permuting the output, is the same as permuting the columns and then applying the operation.
- A global pooling operation (e.g., max-pool, avg-pool) across the columns is permutation invariant: the result of an average pool does not depend on the order of the columns.
- Applying a permutation invariant operation on top of a permutation equivariant once results in an overall permutation invariant function.
Additionally, you should look at self-attention layers, which are also permutation equivariant.
What I would try is:
- Learn a representation (RNN/Transformer) for a single time series. Apply this representation to
A
,B
andC
. - Learn a transformer between the representation of
A
to those ofB
andC
: that is, use the representation ofA
as "query" and those ofB
andC
as "keys" and "values".
This will give you a representation of A
that is permutation invariant in B
and C
.
Update (Aug 3rd, 2022):
For the case of "observations" with varying number of rows, and fixed number of columns:
I think you can treat each row as a "token" (with a fixed dimension = number of columns), and apply a Transformer encoder to predict the target for each "token", from the encoded tokens.
Answered By - Shai
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.