array functions

Most larry methods have an equivalent Numpy array function. For example, to find the z-score along the last axis of a larry, lar, you would do:

>>> lar.zscore(axis=-1)

Here’s the corresponding operation on a Numpy array, arr:

>>> la.farray.zscore(arr, axis=-1)

This section of the manual is a reference guide to most of the Numpy array functions available in the la package.

Moving window statistics

This section contains Numpy array functions that calculate moving window summary statistics.

Note

The Bottleneck package contains fast moving window functions.


la.farray.move_median(arr, window, axis=-1, method='loop')

Moving window median along the specified axis.

Parameters :

arr : ndarray

Input array.

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to perform the moving median. By default the moving median is taken over the last axis (-1).

method : str, optional

The following moving window methods are available:

‘loop’

brute force python loop (default)

‘strides’

strides tricks (ndim < 4)

Returns :

y : ndarray

The moving median of the input array along the specified axis. The output has the same shape as the input.

Examples

>>> arr = np.array([1, 2, 3, 4, 5])
>>> la.farray.move_median(arr, window=2)
array([ NaN,  1.5,  2.5,  3.5,  4.5])
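
The windowing rule in the example above can be sketched in plain Python. This is an illustration only, not la's implementation: the helper name move_median_1d is hypothetical, it handles a single 1d sequence, and it assumes that positions with fewer than window elements available are filled with NaN, as the example output suggests.

```python
import math
from statistics import median


def move_median_1d(values, window):
    """Moving median of a 1d sequence (hypothetical helper, not part of la)."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough elements to fill the window yet.
            out.append(math.nan)
        else:
            # Median of the `window` elements ending at position i.
            out.append(median(values[i - window + 1:i + 1]))
    return out


result = move_median_1d([1, 2, 3, 4, 5], window=2)
```

The output list has the same length as the input, matching the "same shape" rule stated under Returns.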

la.farray.move_nanmedian(arr, window, axis=-1, method='loop')

Moving window median along the specified axis, ignoring NaNs.

Parameters :

arr : ndarray

Input array.

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to perform the moving median. By default the moving median is taken over the last axis (-1).

method : str, optional

The following moving window methods are available:

‘loop’

brute force python loop (default)

‘strides’

strides tricks (ndim < 4)

Returns :

y : ndarray

The moving median of the input array along the specified axis, ignoring NaNs. (A window with all NaNs returns NaN for the window median.) The output has the same shape as the input.

Examples

>>> arr = np.array([1, 2, np.nan, 4, 5])
>>> la.farray.move_nanmedian(arr, window=2)
array([ NaN,  1.5,  2. ,  4. ,  4.5])
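
A rough sketch of the NaN handling, again in plain Python rather than la's own code. The helper name move_nanmedian_1d is hypothetical; the sketch assumes NaNs are dropped from each window before the median is taken and that an all-NaN window yields NaN, as described under Returns.

```python
import math
from statistics import median


def move_nanmedian_1d(values, window):
    """Moving median that skips NaNs (hypothetical helper, not part of la)."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(math.nan)  # window not yet full
            continue
        # Keep only the non-NaN elements of the current window.
        valid = [v for v in values[i - window + 1:i + 1] if not math.isnan(v)]
        out.append(median(valid) if valid else math.nan)
    return out


result = move_nanmedian_1d([1, 2, math.nan, 4, 5], window=2)
all_nan_window = move_nanmedian_1d([math.nan, math.nan, 1], window=2)
```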

la.farray.move_nanranking(arr, window, axis=-1, method='strides')

Moving window ranking along the specified axis, ignoring NaNs.

The output is normalized to be between -1 and 1. For example, with a window width of 3 (and with no ties), the possible output values are -1, 0, 1.

Ties are broken by averaging the rankings. See the examples below.

Parameters :

arr : ndarray

Input array.

window : int

The number of elements in the moving window.

axis : int, optional

The axis over which to perform the moving ranking. By default the moving ranking is taken over the last axis (-1).

method : str, optional

The following moving window methods are available:

‘strides’

strides tricks (ndim < 4) (default)

‘loop’

brute force python loop

Returns :

y : ndarray

The moving ranking of the input array along the specified axis, ignoring NaNs. (A window with all NaNs returns NaN for the window ranking; if every element in a window except the last is NaN, NaN is returned as well.) The output has the same shape as the input.

Examples

With window=3 and no ties, there are 3 possible output values, i.e. [-1., 0., 1.]:

>>> arr = np.array([1, 2, 6, 4, 5, 3])
>>> la.farray.move_nanranking(arr, window=3)
array([ NaN,  NaN,   1.,   0.,   0.,  -1.])

Ties are broken by averaging the rankings of the tied elements:

>>> arr = np.array([1, 2, 1, 1, 1, 2])
>>> la.farray.move_nanranking(arr, window=3)
array([ NaN,  NaN, -0.5, -0.5,  0. ,  1. ])

In a monotonically increasing sequence, the moving window ranking is always equal to 1:

>>> arr = np.array([1, 2, 3, 4, 5])
>>> la.farray.move_nanranking(arr, window=3)
array([ NaN,  NaN,   1.,   1.,   1.])
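
The three examples above are consistent with ranking the last element of each window against the other elements of that window. The sketch below shows one way to compute that in plain Python. It is not la's implementation: the helper names are hypothetical, and the handling of windows with fewer than two non-NaN values (NaN is returned) is an assumption based on the Returns description.

```python
import math


def last_rank(window_values):
    """Rank of the last element within its window, scaled to [-1, 1]."""
    last = window_values[-1]
    valid = [v for v in window_values if not math.isnan(v)]
    # Assumption: nothing to rank against, or the last element is missing.
    if math.isnan(last) or len(valid) < 2:
        return math.nan
    less = sum(v < last for v in valid)
    equal = sum(v == last for v in valid)   # includes the last element itself
    rank = less + (equal - 1) / 2.0         # 0-based rank, ties averaged
    return 2.0 * rank / (len(valid) - 1) - 1.0


def move_nanranking_1d(values, window):
    """Moving ranking of a 1d sequence (hypothetical helper, not part of la)."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(math.nan)            # window not yet full
        else:
            out.append(last_rank(values[i - window + 1:i + 1]))
    return out


no_ties = move_nanranking_1d([1, 2, 6, 4, 5, 3], window=3)
with_ties = move_nanranking_1d([1, 2, 1, 1, 1, 2], window=3)
```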

Normalization

Normalization functions that take a Numpy array as input.


la.farray.ranking(x, axis=0, norm='-1,1')

Normalized ranking treating NaN as missing and averaging ties.

Parameters :

x : ndarray

Data to be ranked.

axis : {int, None}, optional

Axis to rank over. Default axis is 0.

norm : str, optional

A string that specifies the normalization:

‘0,N-1’

Zero to N-1 ranking

‘-1,1’

Scale zero to N-1 ranking to be between -1 and 1

‘gaussian’

Rank data then scale to a Gaussian distribution

The default ranking is ‘-1,1’.

Returns :

idx : ndarray

The ranked data. The dtype of the output is always np.float even if the dtype of the input is int.

Notes

If there is only one non-NaN value along the given axis, then that value is set to the midpoint of the specified normalization method. For example, if the input is array([1.0, nan]), then 1.0 is set to zero for the ‘-1,1’ and ‘gaussian’ normalizations and is set to 0.5 (mean of 0 and 1) for the ‘0,N-1’ normalization.

For ‘0,N-1’ normalization, note that N is x.shape[axis] even if there are NaNs. That ensures that when ranking along the columns of a 2d array, for example, the output will have the same min and max along all columns.
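
As an illustration of the ‘-1,1’ normalization, here is a minimal plain-Python sketch for a single 1d sequence. The helper name ranking_1d is hypothetical and the code is not la's implementation; it assumes the scaling uses the number of non-NaN values, and it applies the single-value midpoint rule from the Notes.

```python
import math


def ranking_1d(values):
    """Rank a 1d sequence onto [-1, 1]; NaNs stay NaN, ties are averaged."""
    valid = [v for v in values if not math.isnan(v)]
    n = len(valid)
    out = []
    for v in values:
        if math.isnan(v):
            out.append(math.nan)            # missing values are not ranked
        elif n == 1:
            out.append(0.0)                 # midpoint of [-1, 1], per the Notes
        else:
            less = sum(u < v for u in valid)
            equal = sum(u == v for u in valid)
            rank = less + (equal - 1) / 2.0  # 0-based rank, ties averaged
            out.append(2.0 * rank / (n - 1) - 1.0)
    return out


ranked = ranking_1d([10, math.nan, 30, 20, 20])
single = ranking_1d([1.0, math.nan])
```

In this sketch the two tied values of 20 share the average of the two middle ranks, which maps to 0.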


la.farray.quantile(x, q, axis=0)

Convert elements in each column to integers between 1 and q then normalize.

Result is normalized to -1, 1.

Parameters :

x : ndarray

Input array.

q : int

The number of bins into which to quantize the data. Must be at least 1 but less than the number of elements along the specified axis.

axis : {int, None}, optional

The axis along which to quantize the elements. The default is axis 0.

Returns :

y : ndarray

A quantized copy of the array.

Examples

>>> arr = np.array([1, 2, 3, 4, 5, 6])
>>> la.farray.quantile(arr, 3)
array([-1., -1.,  0.,  0.,  1.,  1.])
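
One binning rule that reproduces the example above is to sort the data, split the ranks into q equal-count bins, and map the bin indices linearly onto [-1, 1]. The sketch below is a guess at the idea, not la's algorithm: the helper name quantile_1d is hypothetical, and it assumes a 1d input with no NaNs and q of at least 2.

```python
def quantile_1d(values, q):
    """Quantize a 1d sequence into q bins scaled to [-1, 1] (hypothetical)."""
    if q < 2:
        raise ValueError("this sketch needs q >= 2 to scale onto [-1, 1]")
    n = len(values)
    # Indices of the elements in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        bin_index = rank * q // n                 # 0 .. q-1, equal-count bins
        out[i] = 2.0 * bin_index / (q - 1) - 1.0  # map bins onto [-1, 1]
    return out


quantized = quantile_1d([1, 2, 3, 4, 5, 6], 3)
```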

la.farray.demean(arr, axis=None)

Subtract the mean along the specified axis.

Parameters :

arr : ndarray

Input array.

axis : {int, None}, optional

The axis along which to remove the mean. The default (None) is to subtract the mean of the flattened array.

Returns :

y : ndarray

A copy with the mean along the specified axis removed.

Examples

>>> arr = np.array([1, np.nan, 2, 3])
>>> la.farray.demean(arr)
array([ -1.,  NaN,   0.,   1.])
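
The example shows that the mean is computed over the non-NaN elements only and that NaNs pass through unchanged. A minimal plain-Python sketch of that behavior for the flattened (axis=None) case follows; the helper name demean_1d is hypothetical and this is not la's code.

```python
import math


def demean_1d(values):
    """Subtract the NaN-ignoring mean from every element (hypothetical)."""
    valid = [v for v in values if not math.isnan(v)]
    mean = sum(valid) / len(valid) if valid else math.nan
    # NaN minus anything is NaN, so missing values stay missing.
    return [v - mean for v in values]


demeaned = demean_1d([1, math.nan, 2, 3])
```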

la.farray.demedian(arr, axis=None)

Subtract the median along the specified axis.

Parameters :

arr : ndarray

Input array.

axis : {int, None}, optional

The axis along which to remove the median. The default (None) is to subtract the median of the flattened array.

Returns :

y : ndarray

A copy with the median along the specified axis removed.

Examples

>>> arr = np.array([1, np.nan, 2, 10])
>>> la.farray.demedian(arr)
array([ -1.,  NaN,   0.,   8.])

la.farray.zscore(arr, axis=None)

Z-score along the specified axis.

Parameters :

arr : ndarray

Input array.

axis : {int, None}, optional

The axis along which to take the z-score. The default (None) is to find the z-score of the flattened array.

Returns :

y : ndarray

A copy normalized with the Z-score along the specified axis.

Examples

>>> arr = np.array([1, np.nan, 2, 3])
>>> la.farray.zscore(arr)
array([-1.22474487,         NaN,  0.        ,  1.22474487])
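
The example output is reproduced by subtracting the NaN-ignoring mean and dividing by the population standard deviation (dividing by the count rather than the count minus one). The sketch below assumes that convention because it matches the numbers above; the helper name zscore_1d is hypothetical and the code is not taken from la.

```python
import math


def zscore_1d(values):
    """Z-score of a 1d sequence, ignoring NaNs (hypothetical helper)."""
    valid = [v for v in values if not math.isnan(v)]
    mean = sum(valid) / len(valid)
    # Population standard deviation: divide by the number of valid values.
    std = math.sqrt(sum((v - mean) ** 2 for v in valid) / len(valid))
    return [(v - mean) / std for v in values]


z = zscore_1d([1, math.nan, 2, 3])
```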

Misc

Miscellaneous Numpy array functions.


la.farray.correlation(arr1, arr2, axis=None)

Correlation between two Numpy arrays along the specified axis.

This is not a cross correlation function. If the two input arrays have shape (n, m), for example, then the output will have shape (m,) if axis is 0 and shape (n,) if axis is 1.

Parameters :

arr1 : Numpy ndarray

Input array.

arr2 : Numpy ndarray

Input array.

axis : {int, None}, optional

The axis along which to measure the correlation. The default, axis None, flattens the input arrays before finding the correlation and returning it as a scalar.

Returns :

corr : Numpy ndarray, scalar

The correlation between arr1 and arr2 along the specified axis.

Examples

Make two Numpy arrays:

>>> a1 = np.array([[1, 2], [3, 4]])
>>> a2 = np.array([[2, 1], [4, 3]])
>>> a1
array([[1, 2],
       [3, 4]])
>>> a2
array([[2, 1],
       [4, 3]])

Find the correlation between the two arrays along various axes:

>>> la.farray.correlation(a1, a2)
0.59999999999999998
>>> la.farray.correlation(a1, a2, axis=0)
array([ 1.,  1.])
>>> la.farray.correlation(a1, a2, axis=1)
array([-1., -1.])
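
To make the axis semantics concrete, the sketch below recomputes the three results with a plain-Python Pearson correlation. It is an illustration under the assumption that the function computes the ordinary Pearson coefficient on NaN-free input; correlation_1d is a hypothetical helper, not la's code.

```python
import math


def correlation_1d(a, b):
    """Pearson correlation of two equal-length sequences (hypothetical)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


a1 = [[1, 2], [3, 4]]
a2 = [[2, 1], [4, 3]]

# axis=None: flatten both inputs and return one scalar.
flat = correlation_1d([v for row in a1 for v in row],
                      [v for row in a2 for v in row])
# axis=0: correlate matching columns, one value per column.
by_axis0 = [correlation_1d(c1, c2) for c1, c2 in zip(zip(*a1), zip(*a2))]
# axis=1: correlate matching rows, one value per row.
by_axis1 = [correlation_1d(r1, r2) for r1, r2 in zip(a1, a2)]
```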

la.farray.shuffle(x, axis=0)

Shuffle the data in place along the specified axis.

Unlike Numpy’s shuffle, this shuffle takes an axis argument. The ordering of the labels is not changed; only the data is shuffled.

Parameters :

x : ndarray

Array to be shuffled.

axis : int, optional

The axis to shuffle the data along. Default is axis 0.

Returns :

out : None

The data is shuffled in place.
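
The sketch below illustrates an in-place shuffle with an axis argument on a list of rows. It is not la's implementation: shuffle_2d and its rng parameter are hypothetical, and the sketch assumes that whole slices move together (rows for axis 0, columns for axis 1), which the description above does not spell out.

```python
import random


def shuffle_2d(rows, axis=0, rng=random):
    """Shuffle a list of equal-length rows in place (hypothetical helper)."""
    if axis == 0:
        rng.shuffle(rows)                    # reorder whole rows
    elif axis == 1:
        order = list(range(len(rows[0])))
        rng.shuffle(order)                   # one column order for all rows
        for row in rows:
            row[:] = [row[j] for j in order]
    else:
        raise ValueError("axis must be 0 or 1 for a 2d input")


data = [[1, 2, 3], [4, 5, 6]]
result = shuffle_2d(data, axis=1, rng=random.Random(0))
```

As with the function documented above, the return value is None and the caller's data is modified directly.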


la.farray.geometric_mean(x, axis=-1, check_for_greater_than_zero=True)

Return the geometric mean of x along the specified axis, ignoring NaNs.

Raise an exception if any element of x is zero or less.
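
A minimal plain-Python sketch of a NaN-ignoring geometric mean for one 1d sequence, computed as the exponential of the mean of the logs. The helper name geometric_mean_1d is hypothetical, the exception type is an assumption, and the check_for_greater_than_zero flag is not modeled (the check is always performed).

```python
import math


def geometric_mean_1d(values):
    """Geometric mean of the non-NaN elements (hypothetical helper)."""
    valid = [v for v in values if not math.isnan(v)]
    if any(v <= 0 for v in valid):
        # Assumption: ValueError; the original only says "an exception".
        raise ValueError("all elements must be greater than zero")
    if not valid:
        return math.nan
    return math.exp(sum(math.log(v) for v in valid) / len(valid))


gm = geometric_mean_1d([1.0, math.nan, 4.0])
```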


la.farray.covMissing(R)

Covariance matrix adjusted for missing returns.

covMissing returns the covariance matrix adjusted for missing returns. R (NxT) is log stock returns; missing returns are NaN.

Note that the mean of each row of R is assumed to be zero, so returns are not demeaned and the covariance is normalized by T, not T-1.
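
One plausible reading of "adjusted for missing returns" is that each covariance entry averages the products of returns over only those periods in which both series are observed. The sketch below implements that reading in plain Python. This is an assumption, not a description of la's code, and cov_missing is a hypothetical helper.

```python
import math


def cov_missing(R):
    """Zero-mean covariance with pairwise handling of NaNs (hypothetical)."""
    n = len(R)
    C = [[math.nan] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Keep only the periods where both series have a return.
            pairs = [(a, b) for a, b in zip(R[i], R[j])
                     if not (math.isnan(a) or math.isnan(b))]
            if pairs:
                # No demeaning; normalize by the number of shared periods.
                C[i][j] = sum(a * b for a, b in pairs) / len(pairs)
    return C


R = [[0.1, -0.1, math.nan, 0.2],
     [0.2, math.nan, -0.2, 0.1]]
C = cov_missing(R)
```

With no missing returns, the number of shared periods equals T for every pair, so this reduces to the T normalization described above.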
