Flatten multidimensional array in Python

Example

Assume we have an array which looks like this:
import numpy as np
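# dtype=object is required on NumPy 1.24+; older versions inferred it for ragged input
arr = np.array([[1,2,3], [4,5]], dtype=object)
If we try to flatten it with either flatten() or ravel(), nothing changes: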
>>> arr.flatten()
array([list([1, 2, 3]), list([4, 5])], dtype=object)

>>> arr.ravel()
array([list([1, 2, 3]), list([4, 5])], dtype=object)

Solution

Flattening didn't work because arr is a 1-dimensional array of objects, so flatten() and ravel() simply return the list objects in the same order. We can use hstack() to stack the lists in sequence horizontally:
>>> np.hstack(arr)
array([1, 2, 3, 4, 5])
Note that this is equivalent to using concatenate() along axis 0, because each list is treated as a 1-D array:
>>> np.concatenate(arr, axis=0)
array([1, 2, 3, 4, 5])
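As a minimal sketch of the same idea, the snippet below checks that both calls agree and compares them with a plain-Python alternative based on itertools.chain. The name flatten_ragged is a hypothetical helper, not part of NumPy, and dtype=object is passed explicitly on the assumption that a recent NumPy is installed.

```python
import itertools
import numpy as np

# Ragged input: dtype=object is explicit because recent NumPy versions
# refuse to build an array from ragged nested lists otherwise.
arr = np.array([[1, 2, 3], [4, 5]], dtype=object)


def flatten_ragged(a):
    """Hypothetical helper: join the sequences stored in a 1-D object array."""
    return np.concatenate(a, axis=0)


flat_concat = flatten_ragged(arr)
flat_hstack = np.hstack(arr)
# Pure-Python route: chain the inner lists, then build a numeric array.
flat_chain = np.array(list(itertools.chain.from_iterable(arr)))

print(flat_concat.tolist())                      # [1, 2, 3, 4, 5]
print(np.array_equal(flat_concat, flat_hstack))  # True
print(np.array_equal(flat_concat, flat_chain))   # True
```

All three results are ordinary 1-D numeric arrays, so the object dtype is gone after the merge.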
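If you don't have this issue, because your array is a regular fixed-size numeric array, prefer flatten() or ravel(), which were faster in the timings below. Note that each timed statement also rebuilds the array, so the numbers (seconds for timeit's default of one million runs) include construction cost: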
>>> import timeit

# hstack()
>>> u = timeit.Timer('np.hstack(np.array([[1,2,3],[4,5,6]]))', setup='import numpy as np')
>>> print(u.timeit())
7.742744328978006

# concatenate()
>>> u = timeit.Timer('np.concatenate(np.array([[1,2,3],[4,5,6]]))', setup='import numpy as np')
>>> print(u.timeit())
3.360384401981719

# flatten()
>>> u = timeit.Timer('np.array([[1,2,3],[4,5,6]]).flatten()', setup='import numpy as np')
>>> print(u.timeit())
3.1729793019476347

# ravel()
>>> u = timeit.Timer('np.array([[1,2,3],[4,5,6]]).ravel()', setup='import numpy as np')
>>> print(u.timeit())
2.4354382199817337
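One likely reason ravel() comes out ahead of flatten() is that flatten() always returns a copy, while ravel() returns a view of the original data when it can. The sketch below illustrates the difference on a small contiguous array; the variable names are only for illustration.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

copied = grid.flatten()  # always a new array with its own memory
viewed = grid.ravel()    # a view here, because grid is contiguous

print(np.shares_memory(grid, copied))  # False
print(np.shares_memory(grid, viewed))  # True

# Writing through the view changes the original; the copy is unaffected.
viewed[0] = 99
print(grid[0, 0])  # 99
print(copied[0])   # 1
```

So ravel() is the cheaper choice when a read-only flat view is enough, and flatten() is the safer one when the result will be modified.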

Iluengo's answer also explains in more detail why flatten() and ravel() don't work on this kind of array. Although it looks as if the second axis has rows of different lengths, that is not actually the case. If we try:
>>> arr.shape
(2,)

>>> arr.dtype
dtype('O')

>>> arr[0]
[1, 2, 3]
This shows that arr is not a 2D array with variable-length rows (as you might think); it is just a 1D array of objects. In your case the elements are lists: the first element is a 3-element list and the second is a 2-element list.

So flatten() and ravel() have no effect, because transforming a 1D array into a 1D array yields exactly the same 1D array. An object array doesn't care what you put inside it: NumPy treats each item as an opaque object and cannot decide how to merge them.
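The dtype check described above can be folded into a small helper. This is only a sketch: flatten_any is a hypothetical name, and it assumes that every element of an object array is itself a sequence (an object array holding scalars would make concatenate() fail).

```python
import numpy as np


def flatten_any(a):
    """Hypothetical helper: flatten regular and ragged (object) arrays alike."""
    a = np.asarray(a)
    if a.dtype == object:
        # NumPy cannot see inside the elements, so join them explicitly.
        # Assumption: each element is a list or other sequence.
        return np.concatenate(a.ravel(), axis=0)
    # Regular fixed-size array: ravel() is enough.
    return a.ravel()


ragged = np.array([[1, 2, 3], [4, 5]], dtype=object)
regular = np.array([[1, 2, 3], [4, 5, 6]])

print(flatten_any(ragged).tolist())   # [1, 2, 3, 4, 5]
print(flatten_any(regular).tolist())  # [1, 2, 3, 4, 5, 6]
```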

What you should consider is whether this is the behavior you want for your application. NumPy arrays are especially efficient for fixed-size numeric matrices. If you are working with arrays of objects, I don't see why you would use NumPy instead of regular Python lists.

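The point is that arr is not 2D. To use +, -, and sum numerically, the array must be numeric and of fixed size. On an object array, + falls back to each element's own + operator, so the lists are concatenated rather than added element-wise: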
>>> arr + arr
array([list([1, 2, 3, 1, 2, 3]), list([4, 5, 4, 5])], dtype=object)
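For contrast, here is a minimal sketch of the same operations after the lists have been merged into a numeric array with concatenate(). The explicit dtype=object assumes a recent NumPy, and the variable names are only for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5]], dtype=object)

# Merge the inner lists into one fixed-size numeric array.
flat = np.concatenate(arr, axis=0)

print((flat + flat).tolist())  # [2, 4, 6, 8, 10]
print((flat - flat).tolist())  # [0, 0, 0, 0, 0]
print(int(flat.sum()))         # 15
```

Once the array is numeric, the operators behave element-wise as expected.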
