Issue
I have this Table (Counters):
cell_id | tftralacc | tfnscan | thtralacc | thnscan | date_time |
---|---|---|---|---|---|
13997 | 10 | 360 | 94 | 360 | 2022-02-22 00:00:00+01 |
13997 | 0 | 360 | 0 | 360 | 2022-02-22 01:00:00+01 |
13997 | 0 | 360 | 0 | 360 | 2022-02-22 02:00:00+01 |
13997 | 0 | 360 | 0 | 360 | 2022-02-22 03:00:00+01 |
13997 | 36 | 360 | 83 | 360 | 2022-02-22 04:00:00+01 |
13997 | 0 | 360 | 2 | 360 | 2022-02-22 05:00:00+01 |
13997 | 1 | 360 | 15 | 360 | 2022-02-22 06:00:00+01 |
13997 | 11 | 360 | 159 | 360 | 2022-02-22 07:00:00+01 |
13997 | 21 | 360 | 409 | 360 | 2022-02-22 08:00:00+01 |
13997 | 25 | 360 | 1282 | 360 | 2022-02-22 09:00:00+01 |
13997 | 20 | 360 | 1201 | 360 | 2022-02-22 10:00:00+01 |
13997 | 30 | 360 | 1381 | 360 | 2022-02-22 11:00:00+01 |
13997 | 42 | 360 | 924 | 360 | 2022-02-22 12:00:00+01 |
14000 | 1 | 360 | 36 | 360 | 2022-02-22 00:00:00+01 |
14000 | 0 | 360 | 0 | 360 | 2022-02-22 01:00:00+01 |
14000 | 1 | 360 | 0 | 360 | 2022-02-22 02:00:00+01 |
14000 | 0 | 360 | 2 | 360 | 2022-02-22 03:00:00+01 |
14000 | 0 | 360 | 0 | 360 | 2022-02-22 04:00:00+01 |
14000 | 0 | 360 | 12 | 360 | 2022-02-22 05:00:00+01 |
14000 | 3 | 360 | 4 | 360 | 2022-02-22 06:00:00+01 |
14000 | 24 | 360 | 123 | 360 | 2022-02-22 07:00:00+01 |
14000 | 31 | 360 | 374 | 360 | 2022-02-22 08:00:00+01 |
14000 | 18 | 360 | 620 | 360 | 2022-02-22 09:00:00+01 |
14000 | 38 | 360 | 1616 | 360 | 2022-02-22 10:00:00+01 |
14000 | 36 | 360 | 1410 | 360 | 2022-02-22 11:00:00+01 |
14000 | 24 | 360 | 957 | 360 | 2022-02-22 12:00:00+01 |
I want to get the specific date_time value of the maximum traffic (which is calculated from the fields tftralacc, tfnscan, thtralacc and thnscan) for every cell_id.
I've managed to get this maximum value for every cell_id by using the `annotate()` and `values()` methods of Django's QuerySet API (calling `values()` before `annotate()` acts as a GROUP BY):

    from django.db.models import Case, F, FloatField, Max, Q, When
    from django.db.models.functions import Round

    result = Counters.objects.filter(
        date_time__gte=date_start,
        date_time__lte=date_end
    ).annotate(
        # Calculate the traffic for each row.
        # Q objects must be combined with `|`; Python's `or` would drop the second one.
        traffic=Case(
            When(Q(tfnscan=0) | Q(thnscan=0), then=0),
            default=Round((F('tftralacc') * 1.0 / F('tfnscan')) +
                          (F('thtralacc') * 1.0 / F('thnscan')), 2),
            output_field=FloatField()
        )
    ).order_by('cell_id').values(
        # Group by cell_id.
        'cell_id'
    ).order_by().annotate(
        # Calculate the max traffic for the grouped cells.
        max_traffic=Max('traffic')
    )

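One detail in the `When(...)` condition is worth spelling out: two conditions have to be combined with `|`, not with Python's `or`. The following is a minimal sketch of why, using a hypothetical `Cond` class as a stand-in for Django's `Q` (it is not Django code and only mimics the operator behaviour). Python's `or` simply returns its first truthy operand, so the second condition never reaches the query, whereas `|` calls `__or__` and can keep both.

```python
class Cond:
    """Hypothetical stand-in for django.db.models.Q, for illustration only."""

    def __init__(self, *names):
        self.names = names

    def __or__(self, other):
        # `|` builds a new condition that covers both operands.
        return Cond(*self.names, *other.names)


a = Cond("tfnscan=0")
b = Cond("thnscan=0")

with_or = a or b    # plain Python: evaluates to `a`, `b` is silently dropped
with_pipe = a | b   # calls __or__, so both conditions are kept

print(with_or.names)
print(with_pipe.names)
```

With `or`, only the `tfnscan=0` check would guard the division, so a row with `thnscan=0` would still reach the default branch.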
The calculated traffic for every date_time is demonstrated here: (image not included)
My code successfully returns the maximum traffic for every cell_id:
cell_id | max_traffic |
---|---|
13997 | 3.92 |
14000 | 4.59 |
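These maxima can be cross-checked outside the database. The sketch below is plain Python, not part of the Django query: it recomputes the traffic formula for the busiest hours of the sample table (the remaining hours have lower traffic and are left out for brevity) and keeps the row with the highest value per cell. The names `rows`, `traffic` and `best` are chosen here for illustration.

```python
# (cell_id, tftralacc, tfnscan, thtralacc, thnscan, date_time),
# copied from the sample table for the 09:00 to 12:00 rows.
rows = [
    (13997, 25, 360, 1282, 360, "2022-02-22 09:00:00+01"),
    (13997, 20, 360, 1201, 360, "2022-02-22 10:00:00+01"),
    (13997, 30, 360, 1381, 360, "2022-02-22 11:00:00+01"),
    (13997, 42, 360, 924, 360, "2022-02-22 12:00:00+01"),
    (14000, 18, 360, 620, 360, "2022-02-22 09:00:00+01"),
    (14000, 38, 360, 1616, 360, "2022-02-22 10:00:00+01"),
    (14000, 36, 360, 1410, 360, "2022-02-22 11:00:00+01"),
    (14000, 24, 360, 957, 360, "2022-02-22 12:00:00+01"),
]


def traffic(tftralacc, tfnscan, thtralacc, thnscan):
    """Same formula as the Case/When annotation, guarded against division by zero."""
    if tfnscan == 0 or thnscan == 0:
        return 0.0
    return round(tftralacc / tfnscan + thtralacc / thnscan, 2)


# Track the (traffic, date_time) pair with the highest traffic per cell_id.
best = {}
for cell_id, tf, tfn, th, thn, date_time in rows:
    value = traffic(tf, tfn, th, thn)
    if cell_id not in best or value > best[cell_id][0]:
        best[cell_id] = (value, date_time)

for cell_id, (value, date_time) in sorted(best.items()):
    print(cell_id, value, date_time)
```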
But my goal is to get the corresponding date_time value for every max value, like this:
cell_id | max_traffic | date_time |
---|---|---|
13997 | 3.92 | 2022-02-22 11:00:00+01 |
14000 | 4.59 | 2022-02-22 10:00:00+01 |
or
cell_id | date_time |
---|---|
13997 | 2022-02-22 11:00:00+01 |
14000 | 2022-02-22 10:00:00+01 |
Because that max value is just a means to get the date_time, not the goal itself.
Note: There is this question that describes my problem, but its answer refers to a workaround, which is not possible with my problem. SO Question
Solution
Use `models.Subquery` with `models.OuterRef` to correlate on the `cell_id` field, then use `queryset.annotate()` to attach the subquery's result as `max_traffic`. Finally, use `queryset.filter()` to keep the rows whose `traffic` is equal to `max_traffic`, and use `.distinct()` to remove duplicate rows (passing field names to `.distinct()` works only on PostgreSQL).

    from django.db import models
    from django.db.models import Case, F, Max, Q, When
    from django.db.models.functions import Round

    counters_with_traffic = Counters.objects.filter(
        date_time__gte=date_start,
        date_time__lte=date_end
    ).annotate(
        # Calculate the traffic for each row.
        traffic=Case(
            When(Q(tfnscan=0) | Q(thnscan=0), then=0),
            default=Round((F('tftralacc') * 1.0 / F('tfnscan')) +
                          (F('thtralacc') * 1.0 / F('thnscan')), 2),
            output_field=models.FloatField()
        )
    )

    counters_with_max_traffic = counters_with_traffic.order_by('cell_id').values(
        # Group by cell_id.
        'cell_id'
    ).order_by().annotate(
        # Calculate the max traffic for the grouped cells.
        max_traffic=Max('traffic'),
    ).filter(cell_id=models.OuterRef("cell_id")).values("max_traffic")

    # Keep only the rows whose traffic equals the max traffic of their cell.
    result = counters_with_traffic.annotate(
        max_traffic=models.Subquery(counters_with_max_traffic),
    ).filter(
        traffic=models.F("max_traffic")
    ).values(
        "cell_id", "max_traffic", "date_time"
    ).distinct("cell_id", "max_traffic")

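For intuition, the queryset above corresponds roughly to a correlated subquery in SQL. The following is a sketch only, run against an in-memory SQLite database with a few of the sample rows; the table name `counters` and the SQL text are assumptions made for illustration and are not the SQL that Django actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE counters (cell_id INTEGER, tftralacc INTEGER, tfnscan INTEGER,"
    " thtralacc INTEGER, thnscan INTEGER, date_time TEXT)"
)
conn.executemany(
    "INSERT INTO counters VALUES (?, ?, ?, ?, ?, ?)",
    [
        (13997, 20, 360, 1201, 360, "2022-02-22 10:00:00+01"),
        (13997, 30, 360, 1381, 360, "2022-02-22 11:00:00+01"),
        (13997, 42, 360, 924, 360, "2022-02-22 12:00:00+01"),
        (14000, 38, 360, 1616, 360, "2022-02-22 10:00:00+01"),
        (14000, 36, 360, 1410, 360, "2022-02-22 11:00:00+01"),
        (14000, 24, 360, 957, 360, "2022-02-22 12:00:00+01"),
    ],
)

QUERY = """
WITH with_traffic AS (
    -- Per-row traffic, mirroring the Case/When annotation.
    SELECT cell_id, date_time,
           CASE WHEN tfnscan = 0 OR thnscan = 0 THEN 0.0
                ELSE ROUND(tftralacc * 1.0 / tfnscan + thtralacc * 1.0 / thnscan, 2)
           END AS traffic
    FROM counters
)
SELECT w.cell_id, w.traffic AS max_traffic, w.date_time
FROM with_traffic AS w
WHERE w.traffic = (
    -- Correlated subquery: max traffic of the same cell (the OuterRef part).
    SELECT MAX(i.traffic) FROM with_traffic AS i WHERE i.cell_id = w.cell_id
)
ORDER BY w.cell_id
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

If two rows of the same cell share the maximum traffic, this query returns both of them; that is the case the `.distinct("cell_id", "max_traffic")` call in the answer takes care of.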
Answered By - annonymous