Extend group methods to optionally allow a group that doesn't need to broadcast
The larry group methods (group_mean, etc.) currently take one input: a 1d larry called group, which assigns each row to a group. When calculating the group mean of a 2d larry, for example, the group membership of each row is therefore the same in every column. Add the option to pass a 2d group larry so that group membership can change from column to column. If group membership information is missing for a column (or for a single element), fill with NaN. (Can this be generalized to arbitrary dimension?)
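One possible reading of the NaN rule, sketched with plain NumPy arrays rather than larrys: an element whose group label is NaN gets NaN in the output. The function name column_group_mean is hypothetical, and the sketch assumes numeric group labels so that NaN can mark missing membership. It does not try to reproduce how the existing group_mean treats NaNs in the data itself.

```python
import numpy as np

def column_group_mean(values, labels):
    """Group mean of one column; a NaN label gives a NaN result.

    Hypothetical helper, not part of la. Assumes numeric labels.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels, dtype=float)
    out = np.full(values.shape, np.nan)   # start with "missing" everywhere
    known = ~np.isnan(labels)             # elements that have a group
    for label in np.unique(labels[known]):
        members = labels == label         # NaN never compares equal
        out[members] = values[members].mean()
    return out

col = column_group_mean([1.0, 5.0, 9.0, 13.0], [1, 1, np.nan, 2])
```

Here the first two elements share group 1, the third has no group and stays NaN, and the last is alone in group 2.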
Let's say we have a 2d larry, y, and a 2d larry of groups, g. The extended group method would basically do:
for i in range(y.shape[1]):
    y[:,i] = y[:,i].
where, for simplicity, I have assumed that the columns are already aligned. If g is 1d we would just do the normal broadcasting that group_mean already does.
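The column loop can be sketched with plain NumPy arrays standing in for larrys. This is only an illustration under stated assumptions: group_mean_2d is a hypothetical name, the labels are assumed to be aligned already, and the output is always float. A 1d g is broadcast across the columns, which mirrors the current behaviour described above.

```python
import numpy as np

def group_mean_2d(y, g):
    """Per-column group mean; g is 1d (broadcast) or the same shape as y.

    Hypothetical NumPy-only stand-in for the proposed larry method.
    """
    y = np.asarray(y, dtype=float)
    g = np.asarray(g)
    if y.ndim != 2:
        raise ValueError('y must be 2d.')
    if g.ndim == 1:
        # Same membership in every column (the current behaviour)
        g = np.broadcast_to(g[:, None], y.shape)
    if g.shape != y.shape:
        raise ValueError('g must be 1d or have the same shape as y.')
    out = np.empty_like(y)
    for i in range(y.shape[1]):            # one column at a time
        for label in np.unique(g[:, i]):
            rows = g[:, i] == label        # rows in this group, this column
            out[rows, i] = y[rows, i].mean()
    return out

y = np.arange(20).reshape(5, 4)
g = np.array([[1, 2, 1, 1],
              [1, 2, 2, 2],
              [3, 1, 3, 1],
              [2, 3, 4, 2],
              [2, 3, 5, 1]])
z = group_mean_2d(y, g)
```

With the y and group data used on this page, z holds the same values as the groups_mean output, as floats instead of ints.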
Should we add axis as input?
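If axis were added, one way it could work (and one possible answer to the arbitrary-dimension question) is to move the chosen axis to the front, flatten everything else into columns, run the column loop, and reshape back. The sketch below uses plain NumPy arrays; group_mean_along_axis and its axis parameter are hypothetical, and g is assumed to have the same shape as y.

```python
import numpy as np

def group_mean_along_axis(y, g, axis=0):
    """Group mean along `axis` for arrays of any dimension.

    Hypothetical sketch, not part of la; assumes g.shape == y.shape.
    """
    y = np.asarray(y, dtype=float)
    g = np.asarray(g)
    if g.shape != y.shape:
        raise ValueError('this sketch assumes g has the same shape as y.')
    # Put the grouped axis first, then flatten the rest into columns
    moved_shape = np.moveaxis(y, axis, 0).shape
    y2 = np.moveaxis(y, axis, 0).reshape(moved_shape[0], -1)
    g2 = np.moveaxis(g, axis, 0).reshape(moved_shape[0], -1)
    out = np.empty_like(y2)
    for i in range(y2.shape[1]):
        for label in np.unique(g2[:, i]):
            rows = g2[:, i] == label
            out[rows, i] = y2[rows, i].mean()
    # Undo the flattening and move the axis back where it was
    return np.moveaxis(out.reshape(moved_shape), 0, axis)

y = np.arange(20).reshape(5, 4)
g = np.array([[1, 2, 1, 1],
              [1, 2, 2, 2],
              [3, 1, 3, 1],
              [2, 3, 4, 2],
              [2, 3, 5, 1]])
by_rows = group_mean_along_axis(y, g, axis=0)
by_cols = group_mean_along_axis(y.T, g.T, axis=1)   # same groups, transposed
```

In this sketch by_cols is just the transpose of by_rows, which is a quick sanity check that the axis handling is consistent.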
Proof of concept:
import numpy as np
from la import larry
def getdata():
    y = larry(np.
    group = larry([[1, 1, 3, 2, 2],
    return y, group
def groups_mean(y, group):
    if group.ndim == 1:
        return y.group_mean(group)
    elif group.ndim == 2:
        if y.ndim == 2:
            z = y.copy() # NOTE: because of this line, int input will become int output
            for i, label in enumerate(
                idx = group.labelinde
            return z
        else:
            raise ValueError('If group is 2d, then y must be 2d.')
    else:
        raise ValueError('group must be 1d or 2d.')
    raise RuntimeError('Dropped off the end of the function')
Output:
>> from group2d import getdata, groups_mean
>> y, group = getdata()
>> y
label_0
    0
    1
    2
    3
    4
label_1
    0
    1
    2
    3
x
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])
>>
>> group
label_0
    0
    1
    2
    3
    4
label_1
    0
    1
    2
    3
x
array([[1, 2, 1, 1],
       [1, 2, 2, 2],
       [3, 1, 3, 1],
       [2, 3, 4, 2],
       [2, 3, 5, 1]])
>>
>> groups_mean(y, group)
label_0
    0
    1
    2
    3
    4
label_1
    0
    1
    2
    3
x
array([[ 2,  3,  2, 11],
       [ 2,  3,  6, 11],
       [ 8,  9, 10, 11],
       [14, 15, 14, 11],
       [14, 15, 18, 11]])
Blueprint information
- Status: Not started
- Approver: None
- Priority: Undefined
- Drafter: None
- Direction: Needs approval
- Assignee: None
- Definition: New
- Series goal: None
- Implementation: Unknown
- Milestone target: 0.2
- Started by:
- Completed by: