A new proposal for indexing with labels
In a blueprint titled "index-by-label" I proposed a way to index larrys by lists of label elements. Here is a simpler, but less versatile, proposal. On the whole I think its simplicity makes it the more powerful of the two.
You can index into larrys just like you index into numpy arrays. To index into numpy arrays you can use integers, slices, etc., but you can't use strings: strings have no meaning in the context of indexing. Therefore we are free to assign a special meaning to strings when they are used for indexing into a larry.
My proposal is to interpret strings as label elements. So for example:
>> import la
>> y = la.larry([1,2,3], [['a', 'b', 4]])
>> y['a']
1
>> y['b':] # <---- slick!
label_0
    b
    4
x
array([2, 3])
>> y['4']
3
Note the last example above: we indexed with the string '4', but there is no string '4' in the label, only the integer 4. The algorithm first looks for the string '4' in the label; if it is not found, it converts each label element to a string and looks again.
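The two-pass lookup described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the la implementation; the function name find_label_index is hypothetical, and raising KeyError for a missing element is an assumption.

```python
def find_label_index(label, key):
    """Return the position of the string `key` in one axis's label list.

    Hypothetical helper: the name and the KeyError are assumptions,
    not part of la.
    """
    # First pass: look for an element that is exactly the string `key`.
    try:
        return label.index(key)
    except ValueError:
        pass
    # Second pass: convert every label element to a string and look again.
    as_strings = [str(element) for element in label]
    try:
        return as_strings.index(key)
    except ValueError:
        raise KeyError(key)


label = ['a', 'b', 4]
print(find_label_index(label, 'a'))  # exact string match
print(find_label_index(label, '4'))  # found only after converting 4 to '4'
```

Because the exact match is tried first, a label that contains both the string '4' and the integer 4 would resolve '4' to the string element.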
I think it is quite powerful. It does add some overhead to non-string indexing, but not much. The biggest overhead is checking if slice objects have strings in them. For indexing with one integer (y[5]), for example, there is no overhead.
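To make the overhead concrete, here is a rough sketch of how one axis's index could be translated before it is handed to numpy. The names position and resolve_key are hypothetical, and treating a string stop as exclusive (like an integer stop) is an assumption; the proposal does not say how the end of a label slice should behave.

```python
def position(label, key):
    """Hypothetical helper: exact match first, then match on str(element)."""
    try:
        return label.index(key)
    except ValueError:
        return [str(element) for element in label].index(key)


def resolve_key(label, key):
    """Translate one axis's index into an integer or an integer slice."""
    if isinstance(key, str):
        return position(label, key)
    if isinstance(key, slice):
        # The only extra work for slices: check both ends for strings.
        start, stop = key.start, key.stop
        if isinstance(start, str):
            start = position(label, start)
        if isinstance(stop, str):
            # Assumption: the stop is exclusive, as with integer slices.
            stop = position(label, stop)
        return slice(start, stop, key.step)
    # Integers and anything else pass through unchanged.
    return key


label = ['a', 'b', 4]
values = [1, 2, 3]
print(values[resolve_key(label, 'a')])               # same as values[0]
print(values[resolve_key(label, slice('b', None))])  # same as values[1:]
print(values[resolve_key(label, 2)])                 # plain integer, untouched
```

In this sketch a plain integer costs two type checks and nothing more, which is in line with the claim that non-string indexing stays cheap.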
Here are some more examples:
>> from la import larry
>> import numpy as np
>> import datetime
>> d = datetime.date
>>
>> x = np.arange(24).reshape(2, 3, 4)
>> label = [['price', 'volume'], ['aapl', 'ibm', 'dell'], [d(2009,1,1), d(2009,1,2), d(2009,1,3), d(2009,1,4)]]
>> y = larry(x, label)
>> y['price']
label_0
    aapl
    ibm
    dell
label_1
    2009-01-01
    2009-01-02
    2009-01-03
    2009-01-04
x
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>> y['price', 'aapl']
label_0
    2009-01-01
    2009-01-02
    2009-01-03
    2009-01-04
x
array([0, 1, 2, 3])
>> y['price', 'aapl':]
label_0
    aapl
    ibm
    dell
label_1
    2009-01-01
    2009-01-02
    2009-01-03
    2009-01-04
x
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>> y['price', 'aapl', '2009-01-02']
1
>> y['price', 'dell', '2009-01-02']
9
>> y[:, 'dell', :]
label_0
    price
    volume
label_1
    2009-01-01
    2009-01-02
    2009-01-03
    2009-01-04
x
array([[ 8,  9, 10, 11],
       [20, 21, 22, 23]])
>> y[0, 'ibm', 2]
6
>> y[0, 'ibm', :]
label_0
    2009-01-01
    2009-01-02
    2009-01-03
    2009-01-04
x
array([4, 5, 6, 7])
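The multi-axis examples above mix strings, integers and slices in one index. A possible way to handle that, sketched here in plain Python, is to resolve each element of the index tuple against the label of its own axis and pass everything that is not a string through unchanged. The name resolve is hypothetical, and string endpoints inside slices are left out to keep the sketch short.

```python
import datetime


def position(label, key):
    """Hypothetical helper: exact match first, then match on str(element)."""
    try:
        return label.index(key)
    except ValueError:
        return [str(element) for element in label].index(key)


def resolve(labels, key):
    """Translate a (possibly mixed) index into one numpy could use directly."""
    if not isinstance(key, tuple):
        key = (key,)
    resolved = []
    for axis_label, k in zip(labels, key):
        # Only strings are looked up; integers and slices pass through.
        resolved.append(position(axis_label, k) if isinstance(k, str) else k)
    return tuple(resolved)


d = datetime.date
labels = [['price', 'volume'],
          ['aapl', 'ibm', 'dell'],
          [d(2009, 1, 1), d(2009, 1, 2), d(2009, 1, 3), d(2009, 1, 4)]]

print(resolve(labels, ('price', 'dell', '2009-01-02')))
print(resolve(labels, (0, 'ibm', 2)))
print(resolve(labels, (slice(None), 'dell', slice(None))))
```

The date lookup works because str(datetime.date(2009, 1, 2)) is '2009-01-02', so the second pass of the lookup finds it. The first call resolves to (0, 2, 1); assuming the data is a 2x3x4 array holding 0 through 23 in order, as the outputs above suggest, that position holds 9.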
Blueprint information
- Status: Not started
- Approver: None
- Priority: Undefined
- Drafter: None
- Direction: Needs approval
- Assignee: None
- Definition: New
- Series goal: None
- Implementation: Unknown
- Milestone target: None