larry methods

The larry methods can be divided into broad categories, including Creation, Unary, Binary methods, Reduce, Comparison, and Get and set.

Below you’ll find the methods in each category along with examples. All of the examples assume that you have already imported larry:

>>> from la import larry

The reference guide for the larry functions, as opposed to methods, can be found in larry functions.

Creation

The creation methods allow you to create larrys.


larry.__init__(x, label=None, dtype=None, validate=True)

Meet larry, he’s a labeled array.

Parameters :

x : numpy array_like

Data, NaN are treated as missing data.

label : {list of lists, None}, optional

A list with labels for each dimension of x. If x is 2d, for example, then label should be a list that contains two lists, one for the row labels and one for the column labels. If x is 1d, label should be a list that contains one list of names. If label is None (default), integers will be used to label the rows, columns, etc.

dtype : data-type, optional

The desired data type of the larry.

validate : bool, optional

Check the integrity of the larry by checking that the dimensions of the data match the dimension of the label, that the labels are unique along each axis, and so on. This check adds time to the creation of a larry. The default is to check the integrity.

Raises :

ValueError :

If x cannot be converted to a numpy array, or if the number of elements in label does not match the dimensions of x, or if the elements in label are not unique along each dimension, or if the elements of label are not lists, or if the number of dimensions is zero.

Notes

larry does not copy the data array if it is a Numpy array or if np.asarray() does not make a copy such as when the data array is a Numpy matrix. However, if you change the dtype of the data array, a copy is made. Similarly the label is not copied.

Examples

The labels default to range(n):

>>> larry([6, 7, 8])
label_0
    0
    1
    2
x
array([6, 7, 8])

A more formal way to make a larry:

>>> import numpy as np
>>> x = np.array([1, 2, 3])
>>> label = [['one', 'two', 'three']]
>>> larry(x, label)
label_0
    one
    two
    three
x
array([1, 2, 3])

The dtype can be specified:

>>> larry([0, 1, 2], dtype=bool)
label_0
    0
    1
    2
x
array([False,  True,  True], dtype=bool)            
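
The integrity checks that validate performs can be pictured with a short sketch. This is not larry's own code: check_label is a hypothetical helper, written in plain Python only to illustrate the conditions listed under Raises, and it takes the shape of the data as a tuple rather than an array.

```python
def check_label(shape, label):
    """Raise ValueError if label does not fit data of the given shape.

    Hypothetical helper, not part of la.
    """
    if len(shape) == 0:
        raise ValueError("0d data is not allowed")
    if len(label) != len(shape):
        raise ValueError("need exactly one label list per dimension")
    for axis, (n, names) in enumerate(zip(shape, label)):
        if not isinstance(names, list):
            raise ValueError("label for axis %d is not a list" % axis)
        if len(names) != n:
            raise ValueError("axis %d has %d elements but %d labels"
                             % (axis, n, len(names)))
        if len(set(names)) != len(names):
            raise ValueError("labels along axis %d are not unique" % axis)


check_label((2, 2), [['r0', 'r1'], ['c0', 'c1']])   # passes silently
```

Skipping such checks (validate=False) saves the loop over every axis, which is presumably why the option exists.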

static larry.fromtuples(data)

Convert a list of tuples to a larry.

The input data, if there are N dimensions and M data points, should have this form:

[(label0_1, label1_1, ..., labelN_1, value_1),
(label0_2, label1_2, ..., labelN_2, value_2),
...
(label0_M, label1_M, ..., labelN_M, value_M)]
Parameters :

data : list of tuples

The input must be a list of tuples where each tuple represents one data point in the larry: (label0, label1, ..., labelN, value)

Returns :

y : larry

A larry constructed from data is returned.

See also

la.larry.totuples
Convert to a flattened list of tuples.
la.larry.fromlist
Convert a flattened list to a larry.
la.larry.fromdict
Convert a dictionary to a larry.
la.larry.fromcsv
Load a larry from a csv file.

Examples

Convert a list of label, value pairs to a larry:

>>> data = [('r0', 'c0', 1), ('r0', 'c1', 2), ('r1', 'c0', 3), ('r1', 'c1', 4)]
>>> larry.fromtuples(data)
label_0
    r0
    r1
label_1
    c0
    c1
x
array([[ 1.,  2.],
       [ 3.,  4.]])

What happens if we throw out the last data point? The missing value becomes NaN:

>>> data = data[:-1]
>>> larry.fromtuples(data)
label_0
    r0
    r1
label_1
    c0
    c1
x
array([[  1.,   2.],
       [  3.,  NaN]])
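
How a list of tuples turns into labels plus a dense array with NaN in the gaps can be sketched in plain Python. The helper tuples_to_dense below is hypothetical and handles only the 2d case; it assumes labels are sorted along each axis, which is consistent with the output shown above but is not a statement about how la implements fromtuples.

```python
def tuples_to_dense(data):
    """Return (label, rows) built from (label0, label1, value) tuples.

    Hypothetical helper, not part of la. Labels are sorted along each
    axis and cells with no data point are filled with NaN.
    """
    nan = float('nan')
    row_names = sorted(set(t[0] for t in data))
    col_names = sorted(set(t[1] for t in data))
    rows = [[nan] * len(col_names) for _ in row_names]
    for r, c, value in data:
        rows[row_names.index(r)][col_names.index(c)] = float(value)
    return [row_names, col_names], rows


data = [('r0', 'c0', 1), ('r0', 'c1', 2), ('r1', 'c0', 3)]
label, x = tuples_to_dense(data)
print(label)   # [['r0', 'r1'], ['c0', 'c1']]
print(x)       # [[1.0, 2.0], [3.0, nan]]
```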

static larry.fromlist(data)

Convert a flattened list to a larry.

The input data, if there are N dimensions and M data points, should have this form:

[[value_1,  value_2,  ..., value_M],
[(label0_1, label1_1, ..., labelN_1),
(label0_2, label1_2, ..., labelN_2),
...
(label0_M, label1_M, ..., labelN_M)]]    
Parameters :

data : list

The input must be a list such as that returned by larry.tolist. See the example below.

Returns :

y : larry

A larry constructed from data is returned.

See also

la.larry.tolist
Convert to a flattened list.
la.larry.fromtuples
Convert a list of tuples to a larry.
la.larry.fromdict
Convert a dictionary to a larry.
la.larry.fromcsv
Load a larry from a csv file.

Examples

>>> data = [[1, 2, 3, 4], [('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')]]
>>> larry.fromlist(data)
label_0
    a
    b
label_1
    c
    d
x
array([[ 1.,  2.],
       [ 3.,  4.]])
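
The flattened-list layout and the list-of-tuples layout used by fromtuples carry the same information. A minimal sketch in plain Python (no la calls, variable names chosen for illustration) of converting one into the other:

```python
flat = [[1, 2, 3, 4], [('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')]]

values, labels = flat
# Append each value to its label tuple to get the fromtuples layout.
tuples = [lab + (val,) for val, lab in zip(values, labels)]
print(tuples)
# [('a', 'c', 1), ('a', 'd', 2), ('b', 'c', 3), ('b', 'd', 4)]

# And back again to the flattened layout.
flat_again = [[t[-1] for t in tuples], [t[:-1] for t in tuples]]
print(flat_again == flat)   # True
```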

static larry.fromdict(data)

Convert a dictionary to a larry.

The input data, if there are N dimensions and M data points, should have this form:

{(label0_1, label1_1, ..., labelN_1): value_1,
(label0_2, label1_2, ..., labelN_2): value_2,
...
(label0_M, label1_M, ..., labelN_M): value_M}   
Parameters :

data : dict

The input must be a dictionary such as that returned by larry.todict. See the example below.

Returns :

y : larry

A larry constructed from data is returned.

See also

la.larry.todict
Convert to a dictionary.
la.larry.fromtuples
Convert a list of tuples to a larry.
la.larry.fromlist
Convert a flattened list to a larry.
la.larry.fromcsv
Load a larry from a csv file.

Examples

>>> data = {('a', 'c'): 1, ('a', 'd'): 2, ('b', 'c'): 3, ('b', 'd'): 4}
>>> larry.fromdict(data)
label_0
    a
    b
label_1
    c
    d
x
array([[ 1.,  2.],
       [ 3.,  4.]])

static larry.fromcsv(filename, delimiter=', ', skiprows=0)

Load a larry from a csv file.

The type information of the labels is not contained in a csv file. Therefore, a label element that was, for example, an integer, will be converted to a string after a round trip (tocsv followed by fromcsv). You can use the maplabel methods to convert it back to an integer.

Integer data values will be converted to floats.

If a data value is missing, a ValueError will be raised. One label element per axis can be missing; the missing label element will be replaced with the empty string ‘’.

As you can see from the above, the tocsv and fromcsv methods are fragile. A more robust archiving solution is given by the IO class.

The format of the csv file is:

label0, label1, ..., labelN, value
label0, label1, ..., labelN, value
label0, label1, ..., labelN, value
Parameters :

filename : str

The filename of the csv file.

delimiter : str

The delimiter used to separate the label elements from each other and from the values.

skiprows : int, optional

Skip the first skiprows lines. No rows are skipped by default.

Raises :

ValueError :

If a data value is missing in the csv file.

See also

la.larry.tocsv
Save larry to a csv file.
la.IO
Save and load larrys in HDF5 format using a dictionary-like interface.
la.larry.fromtuples
Convert a list of tuples to a larry.
la.larry.fromlist
Convert a flattened list to a larry.
la.larry.fromdict
Convert a dictionary to a larry.

Examples

>>> y = larry([1, 2, 3], [['a', 'b', 'c']])
>>> y.tocsv('/tmp/lar.csv')
>>> larry.fromcsv('/tmp/lar.csv')
label_0
    a
    b
    c
x
array([ 1.,  2.,  3.])
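
The loss of label type information in a csv round trip is a property of the file format rather than of larry. The sketch below uses only Python's standard csv module (not tocsv or fromcsv, whose file handling may differ) to show an integer label coming back as a string, and an explicit conversion step restoring it.

```python
import csv
import io

# Labels of mixed type: an int and a str.
rows = [(1, 10.0), ('b', 20.0)]

buf = io.StringIO(newline='')
csv.writer(buf).writerows(rows)

buf.seek(0)
loaded = [(label, float(value)) for label, value in csv.reader(buf)]
print(loaded)   # [('1', 10.0), ('b', 20.0)]

# Converting a label back is a separate, explicit step.
restored = [(int(lab) if lab.isdigit() else lab, val) for lab, val in loaded]
print(restored == rows)   # True
```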

Unary

The unary functions (such as log, sqrt, sign) operate on a single larry and do not change its shape or ordering.


larry.log()

Element by element base e logarithm.

Returns :

out : larry

Returns a copy with log of x values.

Examples

>>> y = larry([1, 2, 3])
>>> y.log()
label_0
    0
    1
    2
x
array([ 0.        ,  0.69314718,  1.09861229])

larry.exp()

Element by element exponential.

Returns :

out : larry

Returns a copy with exp of x values.

Examples

>>> y = larry([1, 2, 3])
>>> y.exp()
label_0
    0
    1
    2
x
array([  2.71828183,   7.3890561 ,  20.08553692])            

larry.sqrt()

Element by element square root.

Returns :

out : larry

Returns a copy with square root of x values.

Examples

>>> y = larry([1, 4, 9])
>>> y.sqrt()
label_0
    0
    1
    2
x
array([ 1.,  2.,  3.])

larry.sign()

Element by element sign of the element.

Returns -1 if x < 0, 0 if x == 0, and 1 if x > 0.

Returns :

out : larry

Returns a copy with the sign of the values.

Examples

>>> y = larry([-1, 2, -3, 4])
>>> y.sign()
label_0
    0
    1
    2
    3
x
array([-1,  1, -1,  1])

larry.power(q)

Element by element x**q.

Parameters :

q : scalar

The power to raise to.

Returns :

out : larry

Returns a copy with x values to the qth power.

Examples

>>> y = larry([1, 2, 3])
>>> y.power(2)
label_0
    0
    1
    2
x
array([1, 4, 9])

larry.cumsum(axis)

Cumulative sum, ignoring NaNs.

Parameters :

axis : int

axis to cumsum along, no default. None is not allowed.

Returns :

out : larry

Returns a copy with cumsum along axis.

Raises :

ValueError :

If axis is None.

See also

la.larry.sum
Sum of values along axis, ignoring NaNs.

Notes

NaNs are ignored except for when all elements in a cumsum are NaN. In that case, a NaN is returned.

Examples

>>> y = larry([1, 2, 3])
>>> y.cumsum(axis=0)
label_0
    0
    1
    2
x
array([1, 3, 6])

If all elements are NaN along the specified axis then NaN is returned:

>>> from la import nan
>>> y = larry([[nan, 2], [nan,  4]])
>>> y.cumsum(axis=0)
label_0
    0
    1
label_1
    0
    1
x
array([[ NaN,   2.],
       [ NaN,   6.]])
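
The NaN rule for cumsum can be illustrated for a single 1d slice. The helper nan_cumsum below is hypothetical and written in plain Python; it follows the rule as stated above (NaNs count as zero unless every element along the axis is NaN) and is not taken from la's source.

```python
import math


def nan_cumsum(values):
    """Cumulative sum of a 1d sequence that skips NaNs.

    Hypothetical helper, not part of la. If every element is NaN the
    result is all NaN; otherwise NaNs contribute nothing to the sum.
    """
    if all(math.isnan(v) for v in values):
        return [math.nan] * len(values)
    out = []
    total = 0.0
    for v in values:
        if not math.isnan(v):
            total += v
        out.append(total)
    return out


nan = math.nan
print(nan_cumsum([1, 2, 3]))     # [1.0, 3.0, 6.0]
print(nan_cumsum([2, nan, 4]))   # [2.0, 2.0, 6.0]
print(nan_cumsum([nan, nan]))    # [nan, nan]
```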

larry.cumprod(axis)

Cumulative product, ignoring NaNs.

Parameters :

axis : int

axis to find the cumulative product along, no default. None is not allowed.

Returns :

out : larry

Returns a copy with cumprod along axis.

Raises :

ValueError :

If axis is None.

See also

la.larry.prod
Product of values along axis, ignoring NaNs.

Notes

NaNs are ignored except for when all elements in a cumprod are NaN. In that case, a NaN is returned.

Examples

>>> y = larry([1, 2, 3])
>>> y.cumprod(axis=0)
label_0
    0
    1
    2
x
array([1, 2, 6])

If all elements are NaN along the specified axis then NaN is returned:

>>> from la import nan
>>> y = larry([[nan, 2], [nan,  4]])
>>> y.cumprod(axis=0)
label_0
    0
    1
label_1
    0
    1
x
array([[ NaN,   2.],
       [ NaN,   8.]])                               

larry.clip(lo, hi)

Clip x values.

Parameters :

lo : scalar

All data values less than lo are set to lo.

hi : scalar

All data values greater than hi are set to hi.

Returns :

out : larry

Returns a copy with x values clipped.

Raises :

ValueError :

If lo is greater than hi.

Examples

>>> y = larry([1, 2, 3, 4])
>>> y.clip(2, 3)
label_0
    0
    1
    2
    3
x
array([2, 2, 3, 3])                    

larry.abs()

Absolute value of x.

Returns :

out : larry

Returns a copy with the absolute values of the x data.

Examples

>>> y = larry([-1, 2, -3, 4])
>>> y.abs()
label_0
    0
    1
    2
    3
x
array([1, 2, 3, 4])

larry.isnan()

Returns a bool larry with NaNs replaced by True, non-NaNs False.

Returns :

out : larry

Returns a copy with NaNs replaced by True, non-NaNs False.

Examples

>>> import la
>>> y = larry([-la.inf, 1.0, la.nan, la.inf])
>>> y.isnan()
label_0
    0
    1
    2
    3
x
array([False, False,  True, False], dtype=bool)

larry.isfinite()

Returns a bool larry with NaNs and Inf replaced by False, others True.

Returns :

out : larry

Returns a copy with NaNs and Inf replaced by False, others True.

Examples

>>> import la
>>> y = larry([-la.inf, 1.0, la.nan, la.inf])
>>> y.isfinite()
label_0
    0
    1
    2
    3
x
array([False,  True, False, False], dtype=bool)

larry.isinf()

Returns a bool larry with -Inf and Inf replaced by True, others False.

Returns :

out : larry

Returns a copy with -Inf and Inf replaced by True, others False.

Examples

>>> import la
>>> y = larry([-la.inf, 1.0, la.nan, la.inf])
>>> y.isinf()
label_0
    0
    1
    2
    3
x
array([ True, False, False,  True], dtype=bool)

larry.invert()

Element by element inverting of True to False and False to True.

Raises :

TypeError :

If larry does not have bool dtype.

Examples

>>> y = larry([True, False])
>>> y.invert()
label_0
    0
    1
x
array([False,  True], dtype=bool)

larry.__invert__()

Element by element inverting of True to False and False to True.

Raises :

TypeError :

If larry does not have bool dtype.

Examples

>>> y = larry([True, False])
>>> ~y
label_0
    0
    1
x
array([False,  True], dtype=bool)

Binary methods

The binary methods (such as +, -, / and *) combine a larry with a scalar, Numpy array, or another larry. More general binary functions, which give you control of the join method and the fill method, can be found in Binary functions.


larry.__add__(other)

Sum a larry with another larry, Numpy array, or scalar.

If two larrys are added then the larrys are joined with an inner join (i.e., the intersection of the labels).

See also

la.add
Sum of two larrys using given join and fill methods.

Examples

>>> larry([1.0, 2.0]) + larry([2.0, 3.0])
label_0
    0
    1
x
array([ 3.,  5.])        
>>> y1 = larry([1,2], [['a', 'b']])
>>> y2 = larry([1,2], [['b', 'c']])
>>> y1 + y2
label_0
    b
x
array([3])        
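
The inner join used by the binary methods can be sketched without la. The helper add_inner below is hypothetical and works on plain 1d lists; keeping the joined labels in the order of the first operand is an assumption of the sketch, not a documented property of larry.

```python
def add_inner(x1, label1, x2, label2):
    """Add two labeled 1d sequences on the intersection of their labels.

    Hypothetical helper, not part of la. Labels are kept in the order
    of the first operand, which is an assumption of this sketch.
    """
    pos2 = {name: i for i, name in enumerate(label2)}
    label = [name for name in label1 if name in pos2]
    x = [x1[label1.index(name)] + x2[pos2[name]] for name in label]
    return x, label


x, label = add_inner([1, 2], ['a', 'b'], [1, 2], ['b', 'c'])
print(label, x)   # ['b'] [3]
```

Only the label 'b' appears in both operands, so the result has a single element: 2 (from the first operand) plus 1 (from the second).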

larry.__sub__(other)

Subtract a larry from another larry, Numpy array, or scalar.

If two larrys are subtracted then the larrys are joined with an inner join (i.e., the intersection of the labels).

See also

la.subtract
Difference of two larrys using given join and fill methods.

Examples

>>> larry([1.0, 2.0]) - larry([2.0, 3.0])
label_0
    0
    1
x
array([-1., -1.])        
>>> y1 = larry([1,2], [['a', 'b']])
>>> y2 = larry([1,2], [['b', 'c']])
>>> y1 - y2
label_0
    b
x
array([1])        

larry.__div__(other)

Divide a larry by another larry, Numpy array, or scalar.

If two larrys are divided then the larrys are joined with an inner join (i.e., the intersection of the labels).

See also

la.divide
Divide two larrys element-wise using given join method.

Examples

>>> larry([1.0, 2.0]) / larry([2.0, 3.0])
label_0
    0
    1
x
array([ 0.5       ,  0.66666667])        
>>> y1 = larry([1,2], [['a', 'b']])
>>> y2 = larry([1,2], [['b', 'c']])
>>> y1 / y2
label_0
    b
x
array([2])        

larry.__mul__(other)

Multiply a larry with another larry, Numpy array, or scalar.

If two larrys are multiplied then the larrys are joined with an inner join (i.e., the intersection of the labels).

See also

la.multiply
Multiply two larrys element-wise using given join method.

Examples

>>> larry([1.0, 2.0]) * larry([2.0, 3.0])
label_0
    0
    1
x
array([ 2.,  6.])        
>>> y1 = larry([1,2], [['a', 'b']])
>>> y2 = larry([1,2], [['b', 'c']])
>>> y1 * y2
label_0
    b
x
array([2])        

larry.__and__(other)

Logical and a larry with another larry, Numpy array, or scalar.

Notes

Numpy defines & as bitwise_and; here & is defined as numpy.logical_and.

Examples

>>> (larry([1.0, 2.0]) > 1) & (larry([2.0, 3.0]) > 1)
label_0
    0
    1
x
array([False,  True], dtype=bool)        
>>> y1 = larry([1,2], [['a', 'b']])
>>> y2 = larry([1,2], [['b', 'c']])
>>> (y1 > 0) & (y2 > 0)
label_0
    b
x
array([ True], dtype=bool)                 

larry.__or__(other)

Logical or a larry with another larry, Numpy array, or scalar.

Notes

Numpy defines | as bitwise_or; here | is defined as numpy.logical_or.

Examples

>>> (larry([1.0, 2.0]) > 1) | (larry([2.0, 3.0]) > 1)
label_0
    0
    1
x
array([ True,  True], dtype=bool)        
>>> y1 = larry([1,2], [['a', 'b']])
>>> y2 = larry([1,2], [['b', 'c']])
>>> (y1 > 0) | (y2 > 0)
label_0
    b
x
array([ True], dtype=bool)
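
The notes for __and__ and __or__ distinguish logical from bitwise operations. The two only differ for non-bool data, which a plain Python sketch on integer lists makes visible (the variable names are illustrative, and no la calls are used):

```python
a = [1, 2, 0]
b = [2, 2, 3]

# Bitwise and works on the binary digits of each pair.
bitwise = [p & q for p, q in zip(a, b)]

# Logical and only asks whether both elements are nonzero.
logical = [bool(p) and bool(q) for p, q in zip(a, b)]

print(bitwise)   # [0, 2, 0]
print(logical)   # [True, True, False]
```

For the first pair (1 and 2) the bitwise result is 0, which is falsy, while the logical result is True.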

Reduce

The reduce methods (such as sum and std) aggregate along an axis or axes thereby reducing the dimension of the larry.


larry.sum(axis=None)

Sum of values along axis, ignoring NaNs.

Parameters :

axis : {None, integer}, optional

Axis to sum along or sum over all (None, default).

Returns :

d : {larry, scalar}

When axis is an integer a larry is returned. When axis is None (default) a scalar is returned (assuming larry contains scalars).

Raises :

ValueError :

If axis is not an integer or None.

Examples

>>> from la import nan 
>>> y = larry([[nan, 2], [3,  4]])
>>> y.sum()
9.0
>>> y.sum(axis=0)
label_0
    0
    1
x
array([ 3.,  6.])

larry.prod(axis=None)

Product of values along axis, ignoring NaNs.

Parameters :

axis : {None, int}, optional

Axis to find the product along or find the product over all axes (None, default).

Returns :

d : {larry, scalar}

Returns a larry or a scalar. When axis is None (default) the larry is flattened and a scalar, the product of all elements, is returned; when larry is 1d and axis=0 a scalar is returned.

See also

la.larry.cumprod
Cumulative product, ignoring NaNs.

Notes

NaNs are ignored except for when all elements in a product are NaN. In that case, a NaN is returned. Also the product of an empty larry is NaN.

Examples

>>> from la import nan        
>>> y = larry([[nan, 2], [3,  4]])
>>> y.prod()
24.0
>>> y.prod(axis=0)
label_0
    0
    1
x
array([ 3.,  8.])

If all elements are NaN along the specified axis then NaN is returned:

>>> from la import nan 
>>> y = larry([[nan, 2], [nan,  4]])
>>> y.prod(axis=0)
label_0
    0
    1
x
array([ NaN,   8.])
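
The rules in the Notes (skip NaNs, return NaN when all elements are NaN, return NaN for empty input) can be written down for a single 1d slice. The helper nan_prod is hypothetical and uses plain Python; it is a reading of the documented behavior, not la's implementation.

```python
import math


def nan_prod(values):
    """Product of a 1d sequence, skipping NaNs.

    Hypothetical helper, not part of la. Returns NaN when there is
    nothing to multiply (empty input or all NaN).
    """
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        return math.nan
    result = 1.0
    for v in kept:
        result *= v
    return result


nan = math.nan
print(nan_prod([nan, 2, 3, 4]))   # 24.0
print(nan_prod([nan, nan]))       # nan
print(nan_prod([]))               # nan
```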

larry.mean(axis=None)

Mean of values along axis, ignoring NaNs.

Parameters :

axis : {None, integer}, optional

Axis to find the mean along (integer) or the global mean (None, default).

Returns :

d : {larry, scalar}

The mean.

Examples

>>> from la import nan
>>> y = larry([[nan, 2], [3,  4]])
>>> y.mean()
3.0
>>> y.mean(axis=0)
label_0
    0
    1
x
array([ 3.,  3.])       

larry.geometric_mean(axis=None, check_for_greater_than_zero=True)

Geometric mean of values along axis, ignoring NaNs.

Parameters :

axis : {None, integer}, optional

Axis to find the geometric mean along (integer) or the global geometric mean (None, default).

check_for_greater_than_zero : bool

If True (default) an exception is raised if any element is zero or less.

Returns :

d : {larry, scalar}

The geometric mean.

Examples

>>> from la import nan
>>> y = larry([[nan, 2], [3,  4]])
>>> y.geometric_mean()
2.8844991406148162
>>> y.geometric_mean(axis=0)
label_0
    0
    1
x
array([ 3.        ,  2.82842712])      
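
A geometric mean is commonly computed as the exponential of the mean of the logarithms, which is why non-positive elements are a problem. The sketch below is a hypothetical stand-alone helper (geometric_mean_sketch), not la's code; raising ValueError for non-positive input is an assumption, since the text above only says that an exception is raised.

```python
import math


def geometric_mean_sketch(values):
    """Geometric mean of a 1d sequence, skipping NaNs.

    Hypothetical helper, not part of la. The ValueError for elements
    that are zero or less is an assumption of this sketch.
    """
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        return math.nan
    if min(kept) <= 0:
        raise ValueError("all elements must be greater than zero")
    return math.exp(sum(math.log(v) for v in kept) / len(kept))


print(geometric_mean_sketch([math.nan, 2, 3, 4]))   # approximately 2.8845
```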

larry.median(axis=None)

Median of values along axis, ignoring NaNs.

Parameters :

axis : {None, integer}, optional

Axis to find the median along (0 or 1) or the global median (None, default).

Returns :

d : {larry, scalar}

When axis is an integer a larry is returned. When axis is None (default) a scalar is returned (assuming larry contains scalars).

Raises :

ValueError :

If axis is not an integer or None.

Examples

>>> from la import nan
>>> y = larry([[nan, 2], [3,  4]])
>>> y.median()
3.0
>>> y.median(axis=0)
label_0
    0
    1
x
array([ 3.,  3.])

larry.std(axis=None, ddof=0)

Standard deviation of values along axis, ignoring NaNs.

float64 intermediate and return values are used for integer inputs.

Instead of a faster one-pass algorithm, a more stable two-pass algorithm is used.

An example of a one-pass algorithm:

>>> np.sqrt((arr*arr).mean() - arr.mean()**2)

An example of a two-pass algorithm:

>>> np.sqrt(((arr - arr.mean())**2).mean())

Note in the two-pass algorithm the mean must be found (first pass) before the squared deviation (second pass) can be found.

Parameters :

axis : {int, None}, optional

Axis along which the standard deviation is computed. The default (axis=None) is to compute the standard deviation of the flattened array.

ddof : int, optional

Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.

Returns :

d : {larry, scalar}

When axis is an integer a larry is returned. When axis is None (default) a scalar is returned (assuming larry contains scalars).

Raises :

ValueError :

If axis is not an integer or None.

Examples

>>> from la import nan 
>>> y = larry([[nan, 2], [3,  4]])
>>> y.std()
0.81649658092772603
>>> y.std(axis=0)
label_0
    0
    1
x
array([ 0.,  1.])  
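
Why the two-pass algorithm is called more stable can be seen with data that has a small spread on top of a large offset. The sketch below is plain Python with hypothetical helper names; it computes the variance (the standard deviation is its square root), uses ddof=0, and leaves out NaN handling.

```python
def var_one_pass(values):
    """Mean of squares minus square of mean (population variance)."""
    n = len(values)
    return sum(v * v for v in values) / n - (sum(values) / n) ** 2


def var_two_pass(values):
    """Mean squared deviation from a mean computed in a first pass."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


# Small spread on top of a large offset; the exact variance is 22.5.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(var_two_pass(data))   # 22.5
print(var_one_pass(data))   # not 22.5: rounding of the large squares dominates
```

In the one-pass version the squares are of order 1e18, where neighboring float64 values are more than a hundred apart, so the small variance is lost when the two large numbers are subtracted.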

larry.var(axis=None, ddof=0)

Variance of values along axis, ignoring NaNs.

float64 intermediate and return values are used for integer inputs.

Instead of a faster one-pass algorithm, a more stable two-pass algorithm is used.

An example of a one-pass algorithm:

>>> (arr*arr).mean() - arr.mean()**2

An example of a two-pass algorithm:

>>> ((arr - arr.mean())**2).mean()

Note in the two-pass algorithm the mean must be found (first pass) before the squared deviation (second pass) can be found.

Parameters :

axis : {int, None}, optional

Axis along which the variance is computed. The default (axis=None) is to compute the variance of the flattened array.

ddof : int, optional

Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.

Returns :

d : {larry, scalar}

When axis is an integer a larry is returned. When axis is None (default) a scalar is returned (assuming larry contains scalars).

Raises :

ValueError :

If axis is not an integer or None.

Examples

>>> from la import nan
>>> y = larry([[nan, 2], [3,  4]])
>>> y.var()
0.66666666666666663
>>> y.var(axis=0)
label_0
    0
    1
x
array([ 0.,  1.])

larry.max(axis=None)

Maximum values along axis, ignoring NaNs.

Parameters :

axis : {None, integer}, optional

Axis to find the max along (integer) or the global max (None, default).

Returns :

d : {larry, scalar}

When axis is an integer a larry is returned. When axis is None (default) a scalar is returned (assuming larry contains scalars).

Raises :

ValueError :

If axis is not an integer or None.

Examples

>>> from la import nan
>>> y = larry([[nan, 2], [3,  4]])
>>> y.max()
4.0
>>> y.max(axis=0)
label_0
    0
    1
x
array([ 3.,  4.])

larry.min(axis=None)

Minimum values along axis, ignoring NaNs.

Parameters :

axis : {None, integer}, optional

Axis to find the min along (integer) or the global min (None, default).

Returns :

d : {larry, scalar}

When axis is an integer a larry is returned. When axis is None (default) a scalar is returned (assuming larry contains scalars).

Raises :

ValueError :

If axis is not an integer or None.

Examples

>>> from la import nan
>>> y = larry([[nan, 2], [3,  4]])
>>> y.min()
2.0
>>> y.min(axis=0)
label_0
    0
    1
x
array([ 3.,  2.])

larry.any(axis=None)

True if any element along specified axis is True; False otherwise.

Parameters :

axis : {int, None}, optional

The axis over which to reduce the truth. By default (axis=None) the larry is flattened before the truth is found.

Returns :

out : {larry, True, False}

If axis is None then returns True if any data element of the larry (not including the label) is True; False otherwise. If axis is an integer then a bool larry is returned.

Notes

As in Numpy, NaN is True since it is not equal to 0.

Examples

>>> y = larry([[1, 2], [3,  4]]) < 2
>>> y
label_0
    0
    1
label_1
    0
    1
x
array([[ True, False],
       [False, False]], dtype=bool)
>>> y.any()
True
>>> y.any(axis=1)
label_0
    0
    1
x
array([ True, False], dtype=bool)

larry.all(axis=None)

True if all elements along specified axis are True; False otherwise.

Parameters :

axis : {int, None}, optional

The axis over which to reduce the truth. By default (axis=None) the larry is flattened before the truth is found.

Returns :

out : {larry, True, False}

If axis is None then returns True if all data elements of the larry (not including the label) are True; False otherwise. If axis is an integer then a bool larry is returned.

Notes

As in Numpy, NaN is True since it is not equal to 0.

Examples

>>> y = larry([[1, 2], [3,  4]]) > 1
>>> y
label_0
    0
    1
label_1
    0
    1
x
array([[False,  True],
       [ True,  True]], dtype=bool)
>>> y.all()
False
>>> y.all(axis=0)
label_0
    0
    1
x
array([False,  True], dtype=bool)

larry.lastrank(axis=-1, decay=0)

The ranking of the last element along the axis, ignoring NaNs.

The ranking is normalized to be between -1 and 1 instead of the more common 1 and N. The results are adjusted for ties. Suitably slicing the output of the ranking method will give the same result as lastrank. The only difference is that lastrank is faster.

Parameters :

axis : int, optional

The axis over which to rank. By default (axis=-1) the ranking (and reducing) is performed over the last axis.

decay : scalar, optional

Exponential decay strength. Cannot be negative. The default (decay=0) is no decay. In normal ranking (decay=0) all elements used to calculate the rank are equally weighted and so the ordering of all but the last element does not matter. In exponentially decayed ranking the ordering of the elements influences the ranking: elements nearer the last element get more weight.

Returns :

d : larry

In the case of, for example, a 2d larry of shape (n, m) and axis=1, the output will contain the rank (normalized to be between -1 and 1 and adjusted for ties) of the last element of each row. The output in this example will have shape (n,).

See also

la.larry.ranking
Rank elements treating NaN as missing.
la.larry.movingrank
Moving rank in a given window along axis.

Examples

Create a larry:

>>> y1 = larry([1, 2, 3])

What is the rank of the last element (the value 3 in this example)? It is the largest element so the rank is 1.0:

>>> y1.lastrank()
1.0

Now let’s try an example where the last element has the smallest value:

>>> y2 = larry([3, 2, 1])
>>> y2.lastrank()
-1.0

Here’s an example where the last element is not the minimum or maximum value:

>>> y3 = larry([1, 3, 4, 5, 2])
>>> y3.lastrank()
-0.5

Finally, let’s add a large decay. The decay means that the elements closest to the last element receive the most weight. Because the decay is large, the first element (the value 1) doesn’t get any weight and therefore the last element (2) becomes the smallest element:

>>> y3.lastrank(decay=10)
-1.0
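
The normalization to the range -1 to 1 and the adjustment for ties can be made concrete for the case without decay. The helper lastrank_sketch below is hypothetical and handles a single 1d sequence; it is one way to obtain the values shown above, not necessarily how la computes them.

```python
import math


def lastrank_sketch(values):
    """Normalized rank (-1 to 1) of the last element, skipping NaNs.

    Hypothetical helper, not part of la; no decay. Ties receive the
    average of the ranks they span.
    """
    last = values[-1]
    if math.isnan(last):
        return math.nan
    kept = [v for v in values if not math.isnan(v)]
    n = len(kept)
    if n == 1:
        return 0.0   # assumption: a lone element sits mid-scale
    less = sum(1 for v in kept if v < last)
    equal = sum(1 for v in kept if v == last)   # includes the last element
    rank = less + (equal - 1) / 2.0             # zero-based average rank
    return 2.0 * rank / (n - 1) - 1.0


print(lastrank_sketch([1, 2, 3]))         # 1.0
print(lastrank_sketch([3, 2, 1]))         # -1.0
print(lastrank_sketch([1, 3, 4, 5, 2]))   # -0.5
```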

Comparison

The comparison methods, such as ==, >, and !=, perform an element-by-element comparison and return a bool larry. For example:

>>> y1 = larry([1, 2, 3, 4])
>>> y2 = larry([1, 9, 3, 9])
>>> y1 == y2
label_0
    0
    1
    2
    3
x
array([ True, False,  True, False], dtype=bool)

and

>>> from la import larry
>>> y1 = larry([1, 2], [['a', 'b']])
>>> y2 = larry([1, 2], [['b', 'c']])
>>> y1 == y2
label_0
    b
x
array([False], dtype=bool)

A larry can be compared with a scalar, NumPy array, list, tuple, and another larry.

Warning

Do not compare a NumPy array on the left-hand side with a larry on the right-hand side. You will get unexpected results. To compare a larry to a NumPy array, put the array on the right-hand side.


larry.__eq__(other)

Element by element equality (==) comparison.


larry.__ne__(other)

Element by element inequality (!=) comparison.


larry.__lt__(other)

Element by element ‘less than’ (<) comparison.


larry.__gt__(other)

Element by element ‘greater than’ (>) comparison.


larry.__le__(other)

Element by element ‘less than or equal to’ (<=) comparison.


larry.__ge__(other)

Element by element ‘greater than or equal to’ (>=) comparison.

Get and set

The get methods return subsets of a larry through indexing and the set methods assign values to a subset of a larry.


larry.__getitem__(index)

Index into a larry.

Examples

>>> y = larry([[1, 2], [3,  4]])
>>> y[0,0]
1
>>> y[0,:]
label_0
    0
    1
x
array([1, 2])
>>> y[:,1:]
label_0
    0
    1
label_1
    1
x
array([[2],
       [4]])

larry.take(indices, axis)

A copy of the specified elements of a larry along an axis.

This method does the same thing as “fancy” indexing (indexing larrys using lists or Numpy arrays); however, it can be easier to use if you need elements along a given axis. It is also faster in many situations.

Parameters :

indices : sequence

The indices of the values to extract. You can use a list, tuple, 1d Numpy array or any other iterable object.

axis : int

The axis over which to select values.

Returns :

lar : larry

A copy of the specified elements of a larry along the specified axis.

Examples

Take elements at index 0 and 2 from a 1d larry:

>>> y = larry([1, 2, 3])
>>> y.take([0, 2], axis=0)
label_0
    0
    2
x
array([1, 3])

Take columns 0 and 2 from a 2d larry:

>>> import la
>>> y = la.rand(2, 3)
>>> y.take([0, 2], axis=1)
label_0
    0
    1
label_1
    0
    2
x
array([[ 0.07887698,  0.44490303],
       [ 0.75024392,  0.92896999]])

The above is equivalent to (but faster than):

>>> y[:, [0, 2]]
label_0
    0
    1
label_1
    0
    2
x
array([[ 0.07887698,  0.44490303],
       [ 0.75024392,  0.92896999]])

larry.lix()

Index into a larry using labels or index numbers or both.

In order to distinguish between labels and indices, label elements must be wrapped in a list while indices (integers) cannot be wrapped in a list. If you wrap indices in a list they will be interpreted as label elements.

When indexing with multi-element lists of labels along more than one axis, rectangular indexing is used instead of fancy indexing. Note that the corresponding situation with NumPy arrays would produce fancy indexing.

Slicing can be done with labels or indices or a combination of the two. A single element along an axis can be selected with a label or the index value. Several elements along an axis can be selected with a multi-element list of labels. Lists of indices are not allowed.

Examples

Let’s start by making a larry that we can use to demonstrate indexing by label:

>>> y = larry(range(6), [['a', 'b', 3, 4, 'e', 'f']])

We can select the first element of the larry using the index value, 0, or the corresponding label, ‘a’:

>>> y.lix[0]
0
>>> y.lix[['a']]
0

We can slice with index values or with labels:

>>> y.lix[0:]
label_0
    a
    b
    3
    4
    e
    f
x
array([0, 1, 2, 3, 4, 5])
>>> y.lix[['a']:]
label_0
    a
    b
    3
    4
    e
    f
x
array([0, 1, 2, 3, 4, 5])
>>> y.lix[['a']:['e']]
label_0
    a
    b
    3
    4
x
array([0, 1, 2, 3])
>>> y.lix[['a']:['e']:2]
label_0
    a
    3
x
array([0, 2])

Be careful of the difference between indexing with indices and indexing with labels. In the first example below 4 is an index; in the second example 4 is a label element:

>>> y.lix[['a']:4]
label_0
    a
    b
    3
    4
x
array([0, 1, 2, 3])
>>> y.lix[['a']:[4]]
label_0
    a
    b
    3
x
array([0, 1, 2])
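
The rule that a bare integer is an index while a list holds a label can be sketched as a small lookup step. The helper resolve_bound is hypothetical and operates on plain lists; it only illustrates the convention described above and is not how lix is implemented.

```python
def resolve_bound(label, bound):
    """Turn one slice bound into a position along an axis.

    Hypothetical helper, not part of la: an int is used as an index,
    a one-element list is looked up as a label, None is passed through.
    """
    if bound is None or isinstance(bound, int):
        return bound
    if isinstance(bound, list) and len(bound) == 1:
        return label.index(bound[0])
    raise ValueError("bound must be an int or a one-element list")


label = ['a', 'b', 3, 4, 'e', 'f']
x = list(range(6))

# 4 as an index: stop before position 4.
print(x[resolve_bound(label, ['a']):resolve_bound(label, 4)])     # [0, 1, 2, 3]
# 4 as a label: stop before the position of label 4, which is 3.
print(x[resolve_bound(label, ['a']):resolve_bound(label, [4])])   # [0, 1, 2]
```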

Here’s a demonstration of rectangular indexing:

>>> y = larry([[1, 2], [3, 4]], [['a', 'b'], ['c', 'd']])
>>> y.lix[['a', 'b'], ['c', 'd']]
label_0
    a
    b
label_1
    c
    d
x
array([[1, 2],
       [3, 4]])

The rectangular indexing above is very different from how NumPy arrays behave. The corresponding example with a NumPy array:

>>> x = np.array([[1, 2], [3, 4]])
>>> x[[0, 1], [0, 1]]
array([1, 4])       
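
The difference between the two indexing styles can be spelled out on a nested list, with no library involved. In the sketch below the names rectangular and fancy are illustrative; in NumPy itself, np.ix_ is the usual way to request the rectangular form.

```python
x = [[1, 2], [3, 4]]
rows, cols = [0, 1], [0, 1]

# Rectangular: every listed row crossed with every listed column.
rectangular = [[x[i][j] for j in cols] for i in rows]

# Fancy (NumPy-style): the two lists are paired element by element.
fancy = [x[i][j] for i, j in zip(rows, cols)]

print(rectangular)   # [[1, 2], [3, 4]]
print(fancy)         # [1, 4]
```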

larry.__setitem__(index, value)

Assign values to a subset of a larry using indexing to select subset.

Examples

Let’s set all elements of a larry with values less than 3 to zero:

>>> import numpy as np
>>> x = np.array([[1, 2], [3, 4]])
>>> label = [['a', 'b'], [8, 10]]
>>> y = larry(x, label)
>>> y[y < 3] = 0
>>> y
label_0
    a
    b
label_1
    8
    10
x
array([[0, 0],
       [3, 4]])

larry.get(label)

Get one x element given a list of label names.

Give one label name (not label index) for each dimension.

Parameters :

label : {list, tuple}

List or tuple of one label name for each dimension. For example, for row label ‘a’ and column label 7: (‘a’, 7).

Returns :

out : scalar, string, etc.

Value of the single cell specified by label.

Raises :

ValueError :

If the length of label is not equal to the number of dimensions of larry.

Examples

>>> y = larry([[1, 2], [3, 4]], [['r0', 'r1'], ['c0', 'c1']])
>>> y.get(['r0', 'c1'])
2

larry.set(label, value)

Set one x element given a list of label names.

Give one label name (not label index) for each dimension.

Parameters :

label : {list, tuple}

List or tuple of one label name for each dimension. For example, for row label ‘a’ and column label 7: (‘a’, 7).

value : Float, string, etc.

Value to place in the single cell specified by label.

Returns :

None :

Raises :

ValueError :

If the length of label is not equal to the number of dimensions of larry.

Examples

>>> y = larry([[1, 2], [3, 4]], [['r0', 'r1'], ['c0', 'c1']])
>>> y.set(['r0', 'c1'], 99)
>>> y
label_0
    r0
    r1
label_1
    c0
    c1
x
array([[ 1, 99],
       [ 3,  4]])

larry.getx(copy=True)

Return a copy of the x data or a reference to it.

Parameters :

copy : {True, False}, optional

Return a copy (True, default) of the x values or a reference (False) to it.

Returns :

out : array

Copy or reference of x array.

Examples

>>> y = larry([0, 1, 2])
>>> x = y.getx()
>>> (x == y.x).all()
True
>>> x is y.x
False
>>> x = y.getx(copy=False)
>>> x is y.x
True

larry.A()

Return a reference to the underlying Numpy array.

Examples

>>> y = larry([1, 2, 3])
>>> y.A
array([1, 2, 3])
>>> type(y.A)
<type 'numpy.ndarray'>        

larry.getlabel(axis, copy=True)

Return a copy of the label or a reference to it.

Parameters :

axis : int

The axis identifies the label you wish to get.

copy : {True, False}, optional

Return a copy (True, default) of the label or a reference (False) to it.

Returns :

out : list

Copy or reference of the label.

Examples

Get a copy of the label:

>>> y = larry([[1, 2], [3, 4]], [['r0', 'r1'], ['c0', 'c1']])
>>> y.getlabel(axis=0)
['r0', 'r1']
>>> y.getlabel(axis=1)
['c0', 'c1']

The difference between a copy and a reference to the label:

>>> label = y.getlabel(0)
>>> label == y.label[0]
True
>>> label is y.label[0]
False
>>> label = y.getlabel(0, copy=False)
>>> label is y.label[0]
True        

larry.fill(fill_value)

Inplace filling of data array with specified value.

Parameters :

fill_value : {scalar, string, etc}

Value to replace every element of the data array.

Returns :

out : None

Examples

>>> y = larry([0, 1])
>>> y.fill(9)
>>> y
label_0
    0
    1
x
array([9, 9])        

larry.pull(name, axis)

Pull out the values for a given label name along a specified axis.

A view of the data (but a copy of the label) is returned and the dimension is reduced by one.

Parameters :

name : scalar, string, etc.

Label name.

axis : integer

The axis the label name is in.

Returns :

out : {view of larry, scalar}

A view of the larry with the dimension reduced by one is returned unless the larry is already 1d, in which case a scalar is returned.

Raises :

ValueError :

If the axis is None.

Examples

>>> y = larry([[1, 2], [3, 4]], [['r0', 'r1'], ['c0', 'c1']])
>>> y.pull('r0', axis=0)
label_0
    c0
    c1
x
array([1, 2])
>>> import numpy as np
>>> x = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
>>> label = [['experiment1', 'experiment2'], ['r0', 'r1'], ['c0', 'c1']]
>>> y = larry(x, label)
>>> y.pull('experiment1', axis=0)
label_0
    r0
    r1
label_1
    c0
    c1
x
array([[1, 2],
       [3, 4]])

larry.keep_label(op, value, axis)

Keep labels (and corresponding values) that satisfy condition.

Keep labels that satisfy:

label[axis] op value,

where op can be ‘==’, ‘>’, ‘<’, ‘>=’, ‘<=’, ‘!=’, ‘in’, ‘not in’.

Parameters :

op : string

Operation to perform. op can be ‘==’, ‘>’, ‘<’, ‘>=’, ‘<=’, ‘!=’, ‘in’, ‘not in’.

value : anything that can be compared to labels

Usually the same type as the labels. So if the labels are integers then value is an integer.

axis : integer

Axis over which to test the condition.

Returns :

out : larry

Returns a copy with only the labels and corresponding values that satisfy the specified condition.

Raises :

ValueError :

If op is unknown or if axis is None.

IndexError :

If axis is out of range.

Examples

>>> y = larry([1, 2, 3, 4], [['a', 'b', 'c', 'd']])
>>> y.keep_label('<', 'c', axis=0)
label_0
    a
    b
x
array([1, 2])                    
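
One way to picture how an op string selects labels is a lookup table of comparison functions. The sketch below is plain Python for the 1d case only and is an assumption about the general approach, not la's code; OPS and keep_label_1d are hypothetical names:

```python
import operator

# Map each op string listed above to a Python comparison.
OPS = {
    '==': operator.eq,
    '!=': operator.ne,
    '>': operator.gt,
    '<': operator.lt,
    '>=': operator.ge,
    '<=': operator.le,
    'in': lambda label, value: label in value,
    'not in': lambda label, value: label not in value,
}


def keep_label_1d(values, labels, op, value):
    """Return copies of values and labels where `label op value` holds."""
    if op not in OPS:
        raise ValueError("unknown op: %r" % (op,))
    test = OPS[op]
    kept = [(x, lab) for x, lab in zip(values, labels) if test(lab, value)]
    return [x for x, _ in kept], [lab for _, lab in kept]


x, lab = keep_label_1d([1, 2, 3, 4], ['a', 'b', 'c', 'd'], '<', 'c')
print(lab, x)  # ['a', 'b'] [1, 2]
x2, lab2 = keep_label_1d([1, 2, 3, 4], ['a', 'b', 'c', 'd'], 'in', ['b', 'd'])
print(lab2, x2)  # ['b', 'd'] [2, 4]
```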

larry.keep_x(op, value, vacuum=True)

Keep labels (and corresponding values) that satisfy condition.

Set x values that do not satisfy the condition to NaN. Then, if vacuum is True, rows and columns with all NaNs will be removed. If vacuum is True, larry must be 2d.

Note that when vacuum is True, all rows and columns with all NaNs (even if they already had all NaNs in the row or column before this function was called) will be removed.

The op can be ‘==’, ‘>’, ‘<’, ‘>=’, ‘<=’, ‘!=’.

Parameters :

op : string

Operation to perform. op can be ‘==’, ‘>’, ‘<’, ‘>=’, ‘<=’, ‘!=’.

value : anything that can be compared to labels

Usually the same type as the labels. So if the labels are integers then value is an integer.

vacuum : {True, False}, optional

Vacuum larry after conditionally setting data values to NaN. True is the default.

Returns :

out : larry

Returns a copy with only the labels and corresponding values that satisfy the specified condition.

Raises :

ValueError :

If op is unknown or if axis is None.

IndexError :

If axis is out of range.

Label

The label methods allow you to get information about (and change) the labels of a larry.


larry.maxlabel(axis=None)

Maximum label value along the specified axis.

Parameters :

axis : {int, None}, optional

The axis over which to find the maximum label. By default (None) the search for the maximum label element is performed along all axes.

Returns :

out : scalar, string, etc.

The maximum label element along the specified axis.

Examples

What is the maximum label value in the following larry?

>>> y = larry([1, 2, 3], [['a', 'z', 'w']])
>>> y.maxlabel()
'z'               

larry.minlabel(axis=None)

Minimum label value along the specified axis.

Parameters :

axis : {int, None}, optional

The axis over which to find the minimum label. By default (None) the search for the minimum label element is performed along all axes.

Returns :

out : scalar, string, etc.

The minimum label element along the specified axis.

Examples

What is the minimum label value in the following larry?

>>> y = larry([1, 2, 3], [['a', 'z', 'w']])
>>> y.minlabel()
'a'               

larry.labelindex(name, axis, exact=True)

Return index of given label element along specified axis.

Parameters :

name : str, datetime.date, int, etc.

Name of label element to index.

axis : int

Axis to index along. Cannot be None.

exact : bool, optional

If an exact match is specified (default) then an IndexError is raised if an exact match cannot be found. If exact is False and a perfect match is not found, then the index of the nearest label is returned. Nearest is defined as the closest that is equal or smaller.

Returns :

idx : int

Index of given label element.

Examples

What column number (starting from 0) of the following 2d larry is labeled ‘west’?

>>> from la import larry
>>> y = larry([[1, 2], [3, 4]], [['north', 'south'], ['east', 'west']])        
>>> y.labelindex('west', axis=1)
1        
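
The "equal or smaller" rule for exact=False can be illustrated with a bisection search. This is a plain-Python sketch, not la's implementation: it assumes the labels are sorted in ascending order, the function name label_index is hypothetical, and raising IndexError when every label is larger than the name is an assumption:

```python
from bisect import bisect_right


def label_index(labels, name, exact=True):
    """Index of `name` in `labels`; `labels` must be sorted if exact=False."""
    if exact:
        try:
            return labels.index(name)
        except ValueError:
            raise IndexError("label %r not found" % (name,))
    # Position of the last label that is equal to or smaller than name.
    idx = bisect_right(labels, name) - 1
    if idx < 0:
        raise IndexError("no label is equal to or smaller than %r" % (name,))
    return idx


labels = [10, 20, 30]
print(label_index(labels, 20))               # 1
print(label_index(labels, 25, exact=False))  # 1
```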

larry.maplabel(func, axis=None, copy=True)

Apply given function to each element of label along specified axis.

Parameters :

func : function

Function to apply to each element of label.

axis : {int, None}, optional

Axis along which to apply the function.

copy : bool

Whether to return a copy (True, default) or to return a reference.

Returns :

y : larry

A copy or a reference (depending on the value of copy) of the larry with the given function applied to the specified labels.

Examples

Create a larry with dates in the label:

>>> import datetime
>>> d = datetime.date
>>> y = larry([1, 2], [[d(2010,1,1), d(2010,1,2)]])

Convert the dates in the label to integers:

>>> y.maplabel(datetime.date.toordinal)
label_0
    733773
    733774
x
array([1, 2])

Convert the dates in the label to strings:

>>> y.maplabel(str)
label_0
    2010-01-01
    2010-01-02
x
array([1, 2])

Moving window statistics

Moving window statistics along the specified axis of a larry.


larry.move_sum(window, axis=-1)

Moving window sum along the specified axis, ignoring NaNs.

Parameters :

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to perform the moving sum. By default the moving sum is taken over the last axis (-1).

Returns :

y : larry

The moving sum along the specified axis, ignoring NaNs. (A window with all NaNs returns NaN for the window sum.) The output has the same shape as the input.

Examples

>>> import la
>>> lar = larry([1, 2, la.nan, 4])
>>> lar.move_sum(window=2) 
label_0
    0
    1
    2
    3
x
array([ NaN,   3.,   2.,   4.])
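
The NaN handling described above can be reproduced with a short plain-Python sketch for the 1d case. It is an illustration only and not how la computes the result; the function name move_sum is reused here for a hypothetical stand-alone helper:

```python
import math


def move_sum(values, window):
    """Moving sum over `window` elements, ignoring NaNs."""
    nan = float('nan')
    out = []
    for i in range(len(values)):
        if i < window - 1:
            out.append(nan)  # not enough elements yet for a full window
            continue
        win = [v for v in values[i - window + 1:i + 1] if not math.isnan(v)]
        out.append(float(sum(win)) if win else nan)  # all-NaN window -> NaN
    return out


result = move_sum([1, 2, float('nan'), 4], window=2)
print(result)  # [nan, 3.0, 2.0, 4.0]
```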

larry.move_mean(window, axis=-1)

Moving window mean along the specified axis, ignoring NaNs.

Parameters :

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to perform the moving mean. By default the moving mean is taken over the last axis (-1).

Returns :

y : larry

The moving mean along the specified axis, ignoring NaNs. (A window with all NaNs returns NaN for the window mean.) The output has the same shape as the input.

Examples

>>> import la
>>> lar = larry([1, 2, la.nan, 4])
>>> lar.move_mean(window=2)  
label_0
    0
    1
    2
    3
x
array([ NaN,  1.5,  2. ,  4. ])

larry.move_std(window, axis=-1)

Moving window standard deviation along specified axis, ignoring NaNs.

Parameters :

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to perform the moving standard deviation. By default the moving standard deviation is taken over the last axis (-1).

Returns :

y : larry

The moving standard deviation along the specified axis, ignoring NaNs. (A window with all NaNs returns NaN for the window standard deviation.) The output has the same shape as the input.

Examples

>>> import la
>>> lar = larry([1, 2, la.nan, 4, 5])
>>> lar.move_std(window=3) 
label_0
    0
    1
    2
    3
    4
x
array([ NaN,  NaN,  0.5,  1. ,  0.5])

larry.move_min(window, axis=-1)

Moving window minimum along the specified axis, ignoring NaNs.

Parameters :

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to perform the moving minimum. By default the moving minimum is taken over the last axis (-1).

Returns :

y : larry

The moving minimum of the input array along the specified axis, ignoring NaNs. (A window with all NaNs returns NaN for the window minimum.) The output has the same shape as the input.

Examples

>>> import la
>>> lar = larry([1, 2, la.nan, 4])
>>> lar.move_min(window=2) 
label_0
    0
    1
    2
    3
x
array([ NaN,   1.,   2.,   4.])

larry.move_max(window, axis=-1)

Moving window maximum along the specified axis, ignoring NaNs.

Parameters :

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to perform the moving maximum. By default the moving maximum is taken over the last axis (-1).

Returns :

y : larry

The moving maximum of the input array along the specified axis, ignoring NaNs. (A window with all NaNs returns NaN for the window maximum.) The output has the same shape as the input.

Examples

>>> import la
>>> lar = larry([1, 2, la.nan, 4])
>>> lar.move_max(window=2) 
label_0
    0
    1
    2
    3
x
array([ NaN,   2.,   2.,   4.])

larry.move_ranking(window, axis=-1, method='strides')

Moving window ranking along the specified axis, ignoring NaNs.

The output is normalized to be between -1 and 1. For example, with a window width of 3 (and with no ties), the possible output values are -1, 0, 1.

Ties are broken by averaging the rankings. See the examples below.

Parameters :

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to perform the moving ranking. By default the moving ranking is taken over the last axis (-1).

method : str, optional

The following moving window methods are available:

‘strides’

strides tricks (ndim < 4) (default)

‘loop’

brute force python loop

Returns :

y : larry

The moving ranking along the specified axis, ignoring NaNs. (A window with all NaNs returns NaN for the window ranking; if all elements in a window are NaNs except the last element, a NaN is returned.) The output has the same shape as the input.

Examples

With window=3 and no ties, there are 3 possible output values, i.e. [-1., 0., 1.]:

>>> lar = larry([1, 2, 6, 4, 5, 3])
>>> lar.move_ranking(window=3) 
label_0
    0
    1
    2
    3
    4
    5
x
array([ NaN,  NaN,   1.,   0.,   0.,  -1.])

Ties are broken by averaging the rankings of the tied elements:

>>> lar = larry([1, 2, 1, 1, 1, 2])
>>> lar.move_ranking(window=3) 
label_0
    0
    1
    2
    3
    4
    5
x
array([ NaN,  NaN, -0.5, -0.5,  0. ,  1. ])

In a monotonically increasing sequence, the moving window ranking is always equal to 1:

>>> lar = larry([1, 2, 3, 4, 5])
>>> lar.move_ranking(window=3) 
label_0
    0
    1
    2
    3
    4
x
array([ NaN,  NaN,   1.,   1.,   1.])
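
The normalization and tie-breaking can be checked by hand with a plain-Python sketch for the 1d case. It ranks the last element of each window against the other elements of that window. How la scales a window that contains some NaNs is not stated above, so dividing by the number of non-NaN elements minus one is an assumption; window_rank and this stand-alone move_ranking are hypothetical helpers:

```python
import math

NAN = float('nan')


def window_rank(win):
    """Rank of the last element of `win`, scaled to lie between -1 and 1."""
    last = win[-1]
    valid = [v for v in win if not math.isnan(v)]
    if math.isnan(last) or len(valid) < 2:
        return NAN  # nothing to rank, or nothing to rank against
    smaller = sum(1 for v in valid if v < last)
    ties = sum(1 for v in valid if v == last)  # includes the last element
    rank = smaller + (ties - 1) / 2.0          # average rank of the tied group
    return 2.0 * rank / (len(valid) - 1) - 1.0


def move_ranking(values, window):
    """Apply window_rank to every full window; earlier positions are NaN."""
    out = []
    for i in range(len(values)):
        if i < window - 1:
            out.append(NAN)
        else:
            out.append(window_rank(values[i - window + 1:i + 1]))
    return out


ranked = move_ranking([1, 2, 1, 1, 1, 2], window=3)
print(ranked)  # [nan, nan, -0.5, -0.5, 0.0, 1.0]
```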

larry.move_median(window, axis=-1, method='loop')

Moving window median along the specified axis, ignoring NaNs.

Parameters :

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to perform the moving median. By default the moving median is taken over the last axis (-1).

method : str, optional

The following moving window methods are available:

‘loop’

brute force python loop (default)

‘strides’

strides tricks (ndim < 4)

Returns :

y : larry

The moving median along the specified axis, ignoring NaNs. (A window with all NaNs returns NaN for the window median.) The output has the same shape as the input.

Examples

>>> import la
>>> lar = larry([1, 2, la.nan, 4, 5])
>>> lar.move_median(window=2)
label_0
    0
    1
    2
    3
    4
x
array([ NaN,  1.5,  2. ,  4. ,  4.5])

larry.move_func(func, window, axis=-1, method='loop', **kwargs)

Generic moving window function along the specified axis.

Parameters :

func : function

A reducing function such as np.sum, np.max, or np.median that takes a Numpy array and axis and, optionally, keyword arguments as input.

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to evaluate func. By default the window moves along the last axis (-1).

method : str, optional

The following moving window methods are available:

‘loop’

brute force python loop (default)

‘strides’

strides tricks (ndim < 4)

Returns :

y : larry

A moving window evaluation of func along the specified axis. The output has the same shape as the input.

Examples

>>> import numpy as np
>>> lar = larry([1, 2, 3, 4])
>>> lar.move_func(np.sum, window=2)
label_0
    0
    1
    2
    3
x
array([ NaN,   3.,   5.,   7.])

which gives the same result as:

>>> lar.move_sum(window=2)
label_0
    0
    1
    2
    3
x
array([ NaN,   3.,   5.,   7.])

larry.movingsum_forward(window, skip=0, axis=-1, norm=False)

Moving sum in the forward direction, skipping skip dates.

Calculation

The calculation methods transform the larry.


larry.demean(axis=None)

Subtract the mean along the specified axis.

Parameters :

axis : {int, None}, optional

The axis along which to remove the mean. The default (None) is to subtract the mean of the flattened larry.

Returns :

y : larry

A copy with the mean along the specified axis removed.

Examples

>>> y = larry([1, 2, 3, 4])
>>> y.demean()
label_0
    0
    1
    2
    3
x
array([-1.5, -0.5,  0.5,  1.5])

larry.demedian(axis=None)

Subtract the median along the specified axis.

Parameters :

axis : {int, None}, optional

The axis along which to remove the median. The default (None) is to subtract the median of the flattened larry.

Returns :

y : larry

A copy with the median along the specified axis removed.

Examples

>>> y = larry([1, 2, 3, 4])
>>> y.demedian()
label_0
    0
    1
    2
    3
x
array([-1.5, -0.5,  0.5,  1.5])

larry.zscore(axis=None)

Z-score along the specified axis.

Parameters :

axis : {int, None}, optional

The axis along which to take the z-score. The default (None) is to find the z-score of the flattened larry.

Returns :

y : larry

A copy normalized with the Z-score along the specified axis.

Examples

>>> y = larry([1, 2, 3])
>>> y.zscore()
label_0
    0
    1
    2
x
array([-1.22474487,  0.        ,  1.22474487])
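
The values in the example above (about ±1.2247 rather than ±1) are what you get when the standard deviation divides by N rather than N - 1. A minimal plain-Python sketch of that calculation for a 1d input without NaNs follows; the stand-alone zscore function is a hypothetical helper, not la's code:

```python
import math


def zscore(values):
    """Subtract the mean and divide by the population standard deviation."""
    n = len(values)
    mean = sum(values) / float(n)
    # Population variance: divide by n, not n - 1.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


z = zscore([1, 2, 3])
print([round(v, 8) for v in z])  # [-1.22474487, 0.0, 1.22474487]
```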

larry.ranking(axis=0, norm='-1, 1')

Rank elements treating NaN as missing and averaging ties.

Parameters :

axis : {int, None}, optional

Axis to rank over. Default axis is 0.

norm : str, optional

A string that specifies the normalization:

‘0,N-1’

Zero to N-1 ranking

‘-1,1’

Scale zero to N-1 ranking to be between -1 and 1

‘gaussian’

Rank data then scale to a Gaussian distribution

The default ranking is ‘-1,1’.

Returns :

y : larry

The ranked data. The dtype of the output is always np.float even if the dtype of the input is int.

Notes

If there is only one non-NaN value along the given axis, then that value is set to the midpoint of the specified normalization method. For example, if the input is array([1.0, nan]), then 1.0 is set to zero for the ‘-1,1’ and ‘gaussian’ normalizations and is set to 0.5 (mean of 0 and 1) for the ‘0,N-1’ normalization.

For ‘0,N-1’ normalization, note that N is x.shape[axis] even if there are NaNs. That ensures that when ranking along the columns of a 2d array, for example, the output will have the same min and max along all columns.


larry.quantile(q, axis=0)

Assign elements along the specified axis into q bins, where the smallest elements are in bin 1, the next smallest in bin 2, ..., and the largest elements are in bin q; then normalize the output to be between -1 and 1.

Parameters :

q : int

The number of bins into which to quantize the data. Must be at least 1 but less than the number of elements along the specified axis.

axis : {int, None}, optional

The axis along which to quantize the elements. The default is axis 0.

Returns :

lar : larry

A quantized copy of the larry.

Examples

>>> lar = larry([1, 2, 3, 4, 5, 6])
>>> lar.quantile(3)
label_0
    0
    1
    2
    3
    4
    5
x
array([-1., -1.,  0.,  0.,  1.,  1.])
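
One binning rule that reproduces the output above is sketched below in plain Python. It is an assumption for illustration (1d input, no ties, no NaNs), not la's algorithm, and quantile_bins is a hypothetical name:

```python
def quantile_bins(values, q):
    """Assign sorted positions to q bins, then scale bin numbers to [-1, 1]."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices, smallest first
    out = [0.0] * n
    for position, i in enumerate(order):
        bin_number = position * q // n  # 0, 1, ..., q - 1
        if q > 1:
            out[i] = 2.0 * bin_number / (q - 1) - 1.0
    return out


binned = quantile_bins([1, 2, 3, 4, 5, 6], 3)
print(binned)  # [-1.0, -1.0, 0.0, 0.0, 1.0, 1.0]
```

The result follows the order of the input, not the sorted order, so shuffling the input shuffles the output in the same way.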

Group

The group methods allow you to calculate the group mean (or median or ranking) along axis=0 of a larry. For example, let’s calculate the group mean of y where group 1 is (‘e’, ‘a’), group 2 is (‘d’, ‘c’), and group 3 is (‘b’):

>>> from la import larry
>>> y  = larry([[1], [2], [3], [4], [5]], [['a', 'b', 'c', 'd', 'e'], [0]])
>>> group = larry([1, 1, 2, 2, 3], [['e', 'a', 'd', 'c', 'b']])

>>> y.group_mean(group)
label_0
    a
    b
    c
    d
    e
label_1
    0
x
array([[ 3. ],
       [ 2. ],
       [ 3.5],
       [ 3.5],
       [ 3. ]])
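
The arithmetic behind this output can be traced with plain dictionaries. The sketch below uses the same labels and values as the example, but it is only an illustration of the calculation and does not call la:

```python
from collections import defaultdict

y = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}      # row label -> value
group = {'e': 1, 'a': 1, 'd': 2, 'c': 2, 'b': 3}  # row label -> group id

# Collect the values that belong to each group.
members = defaultdict(list)
for label, value in y.items():
    members[group[label]].append(value)

# Replace every value with the mean of its group.
group_mean = {}
for label in y:
    vals = members[group[label]]
    group_mean[label] = sum(vals) / float(len(vals))

print(sorted(group_mean.items()))
# [('a', 3.0), ('b', 2.0), ('c', 3.5), ('d', 3.5), ('e', 3.0)]
```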

larry.group_ranking(group, axis=0)

Group (e.g. sector) ranking along columns.

The row labels of the object must be a subset of the row labels of the group.


larry.group_mean(group, axis=0)

Group (e.g. sector) mean along columns (zero axis).

The row labels of the object must be a subset of the row labels of the group.


larry.group_median(group, axis=0)

Group (e.g. sector) median along columns (zero axis).

The row labels of the object must be a subset of the row labels of the group.

Alignment

There are several alignment methods. See also the align function.


larry.morph(label, axis)

Reorder the elements along the specified axis.

If an element in label does not exist in the larry, NaNs will be used for float dtype, None will be used for object dtype, and ‘’ will be used for string (np.string_) dtype. All other dtypes, such as int and bool, will be cast to float if any element in label does not exist in the larry.

Parameters :

label : list

Desired ordering of elements along specified axis.

axis : int

Axis along which to perform the reordering.

Returns :

out : larry

A reordered copy.

See also

la.larry.morph_like
Morph along all axes to align with given larry.
la.larry.merge
Merge, or optionally update, a larry with a second larry.
la.align
Align two larrys using one of five join methods.

Examples

>>> y = larry([1, 2, 3], [['a', 'b', 'c']])
>>> y.morph(['b', 'ee', 'a', 'c'], axis=0)
label_0
    b
    ee
    a
    c
x
array([  2.,  NaN,   1.,   3.])                    
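
The reordering with a fill value can be sketched in plain Python for a 1d float larry. The helper morph_1d is hypothetical and covers only the NaN fill case described above:

```python
import math


def morph_1d(values, labels, new_labels):
    """Reorder values to follow new_labels; unknown labels become NaN."""
    lookup = dict(zip(labels, values))
    return [float(lookup[name]) if name in lookup else float('nan')
            for name in new_labels]


out = morph_1d([1, 2, 3], ['a', 'b', 'c'], ['b', 'ee', 'a', 'c'])
print(out)  # [2.0, nan, 1.0, 3.0]
```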

larry.morph_like(lar)

Morph along all axes to align with given larry.

If label elements in lar do not exist in the larry, then a fill value is used to mark the missing values. See morph for details.

Parameters :

lar : larry

The target larry to align to.

Returns :

lar : larry

A morphed larry that is aligned with the input larry, lar.

See also

la.larry.morph
Reorder the elements along the specified axis.
la.larry.merge
Merge, or optionally update, a larry with a second larry.
la.align
Align two larrys using one of five join methods.

Examples

Align y1 to y2:

>>> y1 = larry([1, 2], [['a', 'b']])
>>> y2 = larry([3, 2, 1], [['c', 'b', 'a']])
>>> y1.morph_like(y2)
label_0
    c
    b
    a
x
array([ NaN,   2.,   1.])

Align y2 to y1:

>>> y2.morph_like(y1)
label_0
    a
    b
x
array([ 1.,  2.])

larry.merge(other, update=False)

Merge, or optionally update, a larry with a second larry.

Parameters :

other : larry

The larry to merge or to use to update the values. It must have the same number of dimensions as the existing larry.

update : bool

Raise a ValueError (default) if there is any overlap in the two larrys. An overlap is defined as a common label in both larrys that contains a finite value in both larrys. If update is True then the overlapped values in the current larry will be overwritten with the values in other.

Returns :

lar1 : larry

The merged larry.

Notes

If either larry has dtype of object or np.string_ then both larrys must have the same dtype, otherwise a TypeError is raised.

Examples

>>> y1 = larry([1, 2], [['a', 'b']])
>>> y2 = larry([3, 4], [['c', 'd']])
>>> y1.merge(y2)
label_0
    a
    b
    c
    d
x
array([ 1.,  2.,  3.,  4.])
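
The overlap rule can be illustrated with two dictionaries that map labels to values. This plain-Python sketch is an assumption about the behavior described above for the 1d case; it treats NaN as the only missing marker, and merge_1d is a hypothetical name:

```python
import math

NAN = float('nan')


def merge_1d(first, second, update=False):
    """Merge two {label: value} dicts, treating NaN as missing."""
    merged = dict(first)
    for label, value in second.items():
        old = merged.get(label, NAN)
        if not math.isnan(old) and not math.isnan(value) and not update:
            raise ValueError("overlap at label %r" % (label,))
        if not math.isnan(value) or label not in merged:
            merged[label] = value  # fill a gap, or overwrite when update=True
    return merged


merged = merge_1d({'a': 1.0, 'b': 2.0}, {'c': 3.0, 'd': 4.0})
print(sorted(merged.items()))
# [('a', 1.0), ('b', 2.0), ('c', 3.0), ('d', 4.0)]
```

In this sketch, calling merge_1d({'a': 1.0}, {'a': 5.0}) raises a ValueError, while passing update=True returns {'a': 5.0}.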

larry.squeeze()

Remove all length-1 dimensions and corresponding labels.

Note that a view (reference) is returned, not a copy.

Returns :

out : larry

Returns a view with all length-1 dimensions and corresponding labels removed.

Examples

>>> y = larry([[1, 2]], [['row'], ['c1', 'c2']])
>>> y
label_0
    row
label_1
    c1
    c2
x
array([[1, 2]])
>>> y.squeeze()
label_0
    c1
    c2
x
array([1, 2])             

larry.lag(nlag, axis=-1)

Lag the values along the specified axis.

Parameters :

nlag : int

Number of periods (rows, columns, etc) to lag. The lag can be positive (delay), zero (copy of input), or negative (push forward).

axis : int

The axis to lag along. The default is -1.

Returns :

out : larry

A lagged larry is returned.

Raises :

ValueError :

If nlag < 0.

IndexError :

If the axis is None.

Examples

Create a larry:

>>> y = larry([1, 2, 3, 4], [['a', 'b', 'c', 'd']])

A positive lag:

>>> y.lag(2)
label_0
    c
    d
x
array([1, 2])

A negative lag:

>>> y.lag(-2)
label_0
    a
    b
x
array([3, 4])
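
Both examples follow from shifting the values relative to the labels and dropping the positions that no longer have a partner. A plain-Python sketch for the 1d case, with the hypothetical helper lag_1d:

```python
def lag_1d(values, labels, nlag):
    """Shift values by nlag positions relative to their labels."""
    if nlag > 0:
        # Each value moves to the label nlag positions later.
        return values[:-nlag], labels[nlag:]
    if nlag < 0:
        # Each value moves to the label -nlag positions earlier.
        return values[-nlag:], labels[:nlag]
    return list(values), list(labels)


pos_vals, pos_labs = lag_1d([1, 2, 3, 4], ['a', 'b', 'c', 'd'], 2)
print(pos_labs, pos_vals)  # ['c', 'd'] [1, 2]
neg_vals, neg_labs = lag_1d([1, 2, 3, 4], ['a', 'b', 'c', 'd'], -2)
print(neg_labs, neg_vals)  # ['a', 'b'] [3, 4]
```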

larry.sortaxis(axis=None, reverse=False)

Sort data (and label) according to label along specified axis.

Parameters :

axis : {int, None}, optional

The axis to sort along. The default (None) is to sort all axes.

reverse : {True, False}, optional

Sort in descending order (True) or ascending order (False). The default is to sort in ascending order.

Returns :

y : larry

A sorted copy of the larry.

Examples

Let’s make a larry that we can use to demonstrate the sortaxis method:

>>> y = larry([[4, 3], [2, 1]], [['b', 'a'], ['d', 'c']])
>>> y
label_0
    b
    a
label_1
    d
    c
x
array([[4, 3],
       [2, 1]])

By default all axes are sorted:

>>> y.sortaxis()
label_0
    a
    b
label_1
    c
    d
x
array([[1, 2],
       [3, 4]])

You can also sort in reverse order (although in this particular example the larry is already in reverse order):

>>> y.sortaxis(reverse=True)
label_0
    b
    a
label_1
    d
    c
x
array([[4, 3],
       [2, 1]])

And you can sort along a single axis:

>>> y.sortaxis(axis=0)
label_0
    a
    b
label_1
    d
    c
x
array([[2, 1],
       [4, 3]])

larry.flipaxis(axis=None, copy=True)

Reverse the order of the elements along the specified axis.

Parameters :

axis : {int, None}, optional

The axis to flip. The default (None) is to flip all axes.

copy : {True, False}, optional

If True (default) return a copy; if False return a view.

Returns :

y : larry

A copy or view (depending on the value of copy) of the larry that has been flipped.

Examples

Create a larry:

>>> y = larry([[1, 2], [3, 4]], [['a', 'b'], ['c', 'd']])
>>> y
label_0
    a
    b
label_1
    c
    d
x
array([[1, 2],
       [3, 4]])

Flip all axes:

>>> y.flipaxis()
label_0
    b
    a
label_1
    d
    c
x
array([[4, 3],
       [2, 1]])

Flip axis 0 only:

>>> y.flipaxis(axis=0)
label_0
    b
    a
label_1
    c
    d
x
array([[3, 4],
       [1, 2]])            

Shuffle

The data and the labels of larrys can be randomly shuffled in-place.


larry.shuffle(axis=0)

Shuffle the data inplace along the specified axis.

Unlike numpy’s shuffle, this shuffle takes an axis argument. The ordering of the labels is not changed, only the data is shuffled.

Parameters :

axis : {int, None}, optional

The axis to shuffle the data along. Default is axis 0. If None, then the data will be shuffled along all axes.

Returns :

out : None

The data are shuffled inplace.

See also

la.larry.shufflelabel
Shuffle the label inplace along the specified axis.

Examples

>>> y = larry([[1, 2], [3,  4]], [['north', 'south'], ['east', 'west']])
>>> y.shuffle()
>>> y
label_0
    north
    south
label_1
    east
    west
x
array([[3, 4],
       [1, 2]])

larry.shufflelabel(axis=0)

Shuffle the label inplace along the specified axis.

Parameters :

axis : {int, None}, optional

The axis to shuffle the labels along. Default is axis 0. If None, then the labels will be shuffled along all axes, where each label axis will still contain the same set of labels (labels from one axis will not be shuffled to another axis).

Returns :

out : None

The labels are shuffled inplace.

See also

la.larry.shuffle
Shuffle the data inplace along the specified axis.

Examples

>>> y = larry([[1, 2], [3,  4]], [['north', 'south'], ['east', 'west']])
>>> y.shufflelabel()
>>> y
label_0
    south
    north
label_1
    east
    west
x
array([[1, 2],
       [3, 4]])

Missing data

NaNs are treated as missing data in larry:

>>> import la
>>> y = larry([1.0, la.nan])
>>> y.sum()
1.0

Missing value markers for various dtypes:

dtype             missing marker
float             NaN
object            None
str               ‘’
int, bool, etc    Not supported

larry.ismissing()

A bool larry with element-wise marking of missing values.

Returns :

y : larry

Returns a bool larry that contains the value True if the corresponding element of the larry is missing; otherwise False.

Examples

Floats:

>>> larry([1.0, la.nan]).ismissing()
label_0
    0
    1
x
array([False,  True], dtype=bool)

Strings:

>>> larry(['string', '']).ismissing()
label_0
    0
    1
x
array([False,  True], dtype=bool)

Objects:

>>> larry(['', None], dtype=object).ismissing()
label_0
    0
    1
x
array([False,  True], dtype=bool)

bool and int dtype do not support missing values so always return False:

>>> larry([0, 1]).ismissing()
label_0
    0
    1
x
array([False, False], dtype=bool)
>>> larry([True, False]).ismissing()
label_0
    0
    1
x
array([False, False], dtype=bool)        

larry.cut_missing(fraction, axis=None)

Cut rows and columns that contain too many NaNs.

Parameters :

fraction : scalar

Usually a float that gives the minimum allowable fraction of missing data before the row or column is cut.

axis : {int, None}

Look for missing data along this axis. So for axis=0, the missing data along columns are checked and columns are cut. For axis=1, the missing data along rows are checked and rows are cut.

Returns :

out : larry

Returns a copy with rows or columns with lots of missing data cut.


larry.push(window, axis=-1)

Fill missing values (NaNs) with the most recent non-missing value, but only if that value is recent, where recent is defined by the window. The filling proceeds from left to right along each row.
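
A windowed forward fill of this kind can be sketched in plain Python for a single row. The sketch assumes that a gap is filled when the last non-missing value lies at most window positions back; whether la counts the window in exactly this way is an assumption, and push_1d is a hypothetical name:

```python
import math


def push_1d(values, window):
    """Forward-fill NaNs using a value at most `window` positions back."""
    out = list(values)
    last_value, age = float('nan'), 0
    for i, v in enumerate(values):
        if not math.isnan(v):
            last_value, age = v, 0  # fresh observation
            continue
        age += 1
        if age <= window and not math.isnan(last_value):
            out[i] = last_value     # recent enough, so fill the gap
    return out


nan = float('nan')
filled = push_1d([1.0, nan, nan, nan, 5.0], window=2)
print(filled)  # [1.0, 1.0, 1.0, nan, 5.0]
```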


larry.vacuum(axis=None)

Remove all rows and/or columns that contain all NaNs.

Parameters :

axis : None or int or tuple of int

Remove columns (0) or rows (1) or both (None, default) that contain no finite values. For nd arrays, see Notes.

Returns :

out : larry

Return a copy with rows and/or columns removed that contain all NaNs.

Notes

For nd arrays, axis can also be a tuple. In this case, all other axes are checked for NaNs. If the corresponding slice of the array contains only NaNs then the slice is removed.
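
For the 2d case with axis=None, the effect can be sketched in plain Python. This is an illustration under the "no finite values" rule stated above, not la's implementation; vacuum_2d is a hypothetical helper that works on nested lists:

```python
import math


def vacuum_2d(data, row_labels, col_labels):
    """Drop every row and every column that contains no finite value."""
    keep_rows = [i for i, row in enumerate(data)
                 if any(math.isfinite(v) for v in row)]
    keep_cols = [j for j in range(len(col_labels))
                 if any(math.isfinite(row[j]) for row in data)]
    new_data = [[data[i][j] for j in keep_cols] for i in keep_rows]
    return (new_data,
            [row_labels[i] for i in keep_rows],
            [col_labels[j] for j in keep_cols])


nan = float('nan')
grid = [[1.0, nan, 2.0],
        [nan, nan, nan],
        [3.0, nan, 4.0]]
out, rows, cols = vacuum_2d(grid, ['r0', 'r1', 'r2'], ['c0', 'c1', 'c2'])
print(rows, cols, out)  # ['r0', 'r2'] ['c0', 'c2'] [[1.0, 2.0], [3.0, 4.0]]
```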


larry.nan_replace(replace_with=0)

Replace NaNs.

Parameters :

replace_with : scalar

Value to replace NaNs with.

Returns :

out : larry

Returns a copy with NaNs replaced.

Size, shape, dtype

Here are the methods that tell you about the size, shape, and dtype of larry. Some of the methods (T, flatten, unflatten) change the shape of the larry.


larry.nx()

Number of finite values (not NaN, -Inf, or Inf) in the larry.

Examples

>>> from la import nan
>>> y = larry([1, 2, nan])
>>> y.nx
2

larry.size()

Number of elements in the larry.

Examples

>>> from la import nan
>>> y = larry([1, 2, nan])
>>> y.size
3

larry.shape()

Shape of the larry as a tuple.

Examples

>>> from la import nan
>>> y = larry([1, 2, nan])
>>> y.shape
(3,)       

larry.ndim()

Number of dimensions in the larry.

Examples

>>> from la import nan
>>> y = larry([1, 2, nan])
>>> y.ndim
1          

larry.dtype()

The dtype of the elements (not the labels) in the larry.

Examples

>>> from la import nan
>>> y = larry([1, 2, nan])
>>> y.dtype
dtype('float64')         

larry.astype(dtype)

Copy of larry cast to specified type.

Parameters :

dtype : string or data-type

Typecode or data-type to which the larry is cast.

Returns :

y : larry

A copy of the larry, cast to the specified type.

Examples

Create a larry with float dtype:

>>> y = larry([1, 2, 2.5])
>>> y
label_0
    0
    1
    2
x
array([ 1. ,  2. ,  2.5])

Cast the float larry to int:

>>> y.astype(int)
label_0
    0
    1
    2
x
array([1, 2, 2])

larry.T()

Returns a transposed copy of the larry.

Examples

>>> y = larry([[1, 2], [3, 4]], [['a', 'b'], ['c', 'd']])
>>> y
label_0
    a
    b
label_1
    c
    d
x
array([[1, 2],
       [3, 4]])
>>> y.T
label_0
    c
    d
label_1
    a
    b
x
array([[1, 3],
       [2, 4]])

larry.swapaxes(axis1, axis2)

Swap the two specified axes.

Parameters :

axis1 : int

First axis. This axis will become the axis2.

axis2 : int

Second axis. This axis will become the axis1.

Returns :

y : larry

A larry with the specified axes swapped.

Examples

First create a (3,2) larry:

>>> y = larry([[0, 1], [2, 3], [4, 5]])
>>> y
label_0
    0
    1
    2
label_1
    0
    1
x
array([[0, 1],
       [2, 3],
       [4, 5]])

Then swap axes 0 and 1 (i.e., take the transpose):

>>> y.swapaxes(1,0)
label_0
    0
    1
label_1
    0
    1
    2
x
array([[0, 2, 4],
       [1, 3, 5]])

larry.flatten(order='C')

Return a copy of the larry after collapsing into one dimension.

The elements of the label become tuples.

Parameters :

order : {‘C’, ‘F’}, optional

Whether to flatten in row-major order (‘C’, default) or column-major order (‘F’).

Returns :

y : larry

A copy of the input larry, collapsed to one dimension where the labels are tuples.

See also

la.larry.unflatten
Return an unflattened copy of larry.

Examples

>>> y = larry([[1, 2], [3, 4]], [['a', 'b'], ['c', 'd']])
>>> y
label_0
    a
    b
label_1
    c
    d
x
array([[1, 2],
       [3, 4]])
>>> y.flatten()
label_0
    ('a', 'c')
    ('a', 'd')
    ('b', 'c')
    ('b', 'd')
x
array([1, 2, 3, 4])
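
How the tuple labels line up with the flattened data in each order can be sketched in plain Python for the 2d case. The helper flatten_2d is hypothetical and works on nested lists rather than on a larry:

```python
from itertools import product


def flatten_2d(data, row_labels, col_labels, order='C'):
    """Collapse a 2d grid to 1d; each new label is a (row, col) tuple."""
    nrow, ncol = len(row_labels), len(col_labels)
    if order == 'C':    # row-major: the column label varies fastest
        pairs = list(product(range(nrow), range(ncol)))
    elif order == 'F':  # column-major: the row label varies fastest
        pairs = [(i, j) for j, i in product(range(ncol), range(nrow))]
    else:
        raise ValueError("order must be 'C' or 'F'")
    values = [data[i][j] for i, j in pairs]
    labels = [(row_labels[i], col_labels[j]) for i, j in pairs]
    return values, labels


c_vals, c_labs = flatten_2d([[1, 2], [3, 4]], ['a', 'b'], ['c', 'd'])
f_vals, f_labs = flatten_2d([[1, 2], [3, 4]], ['a', 'b'], ['c', 'd'], order='F')
print(c_vals, c_labs)  # [1, 2, 3, 4] [('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')]
print(f_vals, f_labs)  # [1, 3, 2, 4] [('a', 'c'), ('b', 'c'), ('a', 'd'), ('b', 'd')]
```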

larry.unflatten()

Return an unflattened copy of larry.

The larry to be unflattened must be in flattened form: 1d and label elements must be tuples containing the label elements of the corresponding data array element. Refer to the example below to see what a flattened array looks like.

Returns :

y : larry

See also

la.larry.flatten
Return a copy of the larry collapsed into one dimension.

Examples

First create a flattened larry:

>>> y = larry([[1, 2], [3, 4]], [['r0', 'r1'], ['c0', 'c1']])
>>> yf = y.flatten()
>>> yf
label_0
    ('r0', 'c0')
    ('r0', 'c1')
    ('r1', 'c0')
    ('r1', 'c1')
x
array([1, 2, 3, 4])

Then unflatten it:

>>> yf.unflatten()
label_0
    r0
    r1
label_1
    c0
    c1
x
array([[ 1.,  2.],
       [ 3.,  4.]])

larry.insertaxis(axis, label)

Insert a new axis at the specified position.

Parameters :

axis : int

The position to insert the new axis into.

label : str, scalar, object, etc

The label element of the new axis. The length of the new axis is always 1, so only one label element is needed.

Returns :

y : larry

A copy of the larry with a new axis inserted in the specified position.

Examples

Create a 1d larry and then insert a new axis in position 0:

>>> y = larry([1, 2, 3])
>>> y.insertaxis(0, 'NEW')
label_0
    NEW
label_1
    0
    1
    2
x
array([[1, 2, 3]])

Try inserting a new axis in position 1:

>>> y.insertaxis(1, 'NEW')
label_0
    0
    1
    2
label_1
    NEW
x
array([[1],
       [2],
       [3]])
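
The effect on the label and the shape can be sketched with plain lists (an illustration only; insertaxis_sketch is a hypothetical helper and does not touch a real larry):

```python
def insertaxis_sketch(label, shape, axis, new_label):
    """Hypothetical helper: labels and shape after inserting a length-1 axis."""
    new_labels = list(label)
    new_labels.insert(axis, [new_label])  # the new axis has exactly one label element
    new_shape = list(shape)
    new_shape.insert(axis, 1)             # and therefore length 1
    return new_labels, tuple(new_shape)

label = [[0, 1, 2]]
at_front = insertaxis_sketch(label, (3,), 0, 'NEW')
at_back = insertaxis_sketch(label, (3,), 1, 'NEW')
print(at_front)
print(at_back)
```

The two results correspond to the 1x3 and 3x1 arrays in the examples above.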

Conversion

Methods to convert larrys to other formats. For the corresponding ‘from’ methods, see Creation.


larry.totuples()

Convert to a flattened list of tuples.

See also

la.larry.fromtuples
Convert a list of tuples to a larry.
la.larry.tolist
Convert to a flattened list.
la.larry.todict
Convert to a dictionary.
la.larry.tocsv
Save larry to a csv file.
la.larry.tofile
Save 1d or 2d larry to text file.

Examples

>>> y = larry([[1, 2], [3, 4]], [['a', 'b'], ['c', 'd']])
>>> y.totuples()
[('a', 'c', 1), ('a', 'd', 2), ('b', 'c', 3), ('b', 'd', 4)]       

larry.tolist()

Convert to a flattened list.

See also

la.larry.fromlist
Convert a flattened list to a larry.
la.larry.totuples
Convert to a flattened list of tuples.
la.larry.todict
Convert to a dictionary.
la.larry.tocsv
Save larry to a csv file.
la.larry.tofile
Save 1d or 2d larry to text file.

Examples

>>> y = larry([[1, 2], [3, 4]], [['a', 'b'], ['c', 'd']])
>>> y.tolist()
[[1, 2, 3, 4], [('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')]]       

larry.todict()

Convert to a dictionary.

See also

la.larry.totuples
Convert to a flattened list of tuples.
la.larry.tolist
Convert to a flattened list.
la.larry.tocsv
Save larry to a csv file.
la.larry.tofile
Save 1d or 2d larry to text file.

Examples

>>> y = larry([[1.0, 2.0], [3.0, 4.0]], [['a', 'b'], ['c', 'd']])
>>> y.todict()
{('b', 'c'): 3.0, ('a', 'd'): 2.0, ('a', 'c'): 1.0, ('b', 'd'): 4.0}     
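
The totuples, tolist, and todict outputs carry the same information in different layouts. The following plain-Python sketch (no la calls; the variable names are illustrative) starts from the tuple list shown in the totuples example and rearranges it into the other two layouts:

```python
# Tuple list in the layout produced by totuples: (label0, label1, value).
tuples = [('a', 'c', 1), ('a', 'd', 2), ('b', 'c', 3), ('b', 'd', 4)]

# Dictionary layout: label tuple -> value (compare with todict).
as_dict = {t[:-1]: t[-1] for t in tuples}

# List layout: [values, label tuples] (compare with tolist).
as_list = [[t[-1] for t in tuples], [t[:-1] for t in tuples]]

print(as_dict[('b', 'c')])
print(as_list)
```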

larry.tocsv(filename, delimiter=', ')

Save larry to a csv file.

The type information of the labels will be lost. So if a label element is, for example, an integer, a round trip (tocsv followed by fromcsv) will convert it to a string. You can use the maplabel method to convert it back to an integer.

As you can see from above, the tocsv and fromcsv methods are fragile. A more robust archiving solution is given by the IO class.

The format of the csv file is:

label0, label1, ..., labelN, value
label0, label1, ..., labelN, value
label0, label1, ..., labelN, value

Parameters :

filename : str

The filename of the csv file.

delimiter : str

The delimiter used to separate the label elements from each other and from the values.

See also

la.larry.fromcsv
Load a larry from a csv file.
la.larry.tofile
Save 1d or 2d larry to text file.
la.IO
Save and load larrys in HDF5 format using a dictionary-like interface.
la.larry.totuples
Convert to a flattened list of tuples.
la.larry.tolist
Convert to a flattened list.
la.larry.todict
Convert to a dictionary.

Examples

>>> y = larry([1, 2, 3], [['a', 'b', 'c']])
>>> y.tocsv('/tmp/lar.csv')
>>> larry.fromcsv('/tmp/lar.csv')
label_0
    a
    b
    c
x
array([ 1.,  2.,  3.])
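
The loss of label type information is a property of text files in general. The sketch below uses only the standard csv module (it does not call tocsv or fromcsv, and the sample rows are made up) to write rows in the "label0, label1, value" layout and read them back:

```python
import csv
import io

# Made-up sample: an integer label, a string label, and a float value per row.
rows = [(1, 'a', 0.5), (2, 'b', 1.5)]

buf = io.StringIO()
writer = csv.writer(buf, delimiter=',')
writer.writerows(rows)  # one "label0,label1,value" line per element

buf.seek(0)
read_back = [tuple(r) for r in csv.reader(buf, delimiter=',')]
print(read_back)  # every field, including the integer label, is now a string

# Convert the fields back by hand; for larry labels this is the job of maplabel.
restored = [(int(l0), l1, float(v)) for l0, l1, v in read_back]
print(restored)
```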

larry.tofile(file, delimiter=', ')

Save 1d or 2d larry to text file (overwrites the file if it already exists).

Parameters :

file : {str, file object}

A file name (str) or file object. If a file object is given, it is not closed.

delimiter : str

The delimiter used to separate the elements in the file.

See also

la.IO
Save and load larrys in HDF5 format using a dictionary-like interface.
la.larry.tocsv
Save larry to a csv file.
la.larry.totuples
Convert to a flattened list of tuples.
la.larry.tolist
Convert to a flattened list.
la.larry.todict
Convert to a dictionary.

Examples

Create a larry:

>>> lar = larry([[1, 2], [3, 4]], [['r1', 'r2'], ['c1', 'c2']])

Pick a file name or file object or string buffer. Let’s use a string buffer:

>>> import StringIO
>>> f = StringIO.StringIO()

Write to file:

>>> lar.tofile(f)

Display file (or in this case, string buffer):

>>> print f.getvalue()
,c1,c2
r1,1,2
r2,3,4
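
The layout shown above (an empty corner cell, then the column labels, then one line per row label) can be reproduced with the standard csv module. This is only a sketch of the file format as it appears in the example, not la's implementation:

```python
import csv
import io

label = [['r1', 'r2'], ['c1', 'c2']]
x = [[1, 2], [3, 4]]

buf = io.StringIO()
writer = csv.writer(buf, delimiter=',', lineterminator='\n')
writer.writerow([''] + label[1])        # header: empty corner cell, then column labels
for row_label, row in zip(label[0], x):
    writer.writerow([row_label] + row)  # row label followed by that row's values

text = buf.getvalue()
print(text)
```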

Copy

Here are the methods that copy a larry or its components.


larry.copy()

Return a copy of a larry.

Examples

>>> y = larry([1, 2], [['a', 'b']])        
>>> z = y.copy()
>>> z
label_0
    a
    b
x
array([1, 2])

larry.copylabel()

Return a copy of a larry’s label.

Examples

>>> y = larry([1, 2], [['a', 'b']])        
>>> label = y.copylabel()
>>> label
[['a', 'b']]
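
A label is a list of lists, so binding a second name to it does not protect the original from later changes. The plain-Python sketch below shows the difference between an alias and a copy that also duplicates the inner lists; whether copylabel builds its copy in exactly this way is an assumption:

```python
label = [['a', 'b']]

alias = label                            # same outer list, same inner lists
copied = [list(axis) for axis in label]  # new outer list and new inner lists

copied[0][0] = 'z'
print(label)  # the original is unaffected by changes to the copy

alias[0][0] = 'q'
print(label)  # the original changes when modified through the alias
```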

larry.copyx()

Return a copy of a larry’s data as a Numpy array.

Examples

>>> y = larry([1, 2], [['a', 'b']])  
>>> x = y.copyx()
>>> x
array([1, 2])
