import numpy as np
# Check whether an array contains at least one NaN
the_array = np.array([np.nan, 2, 3, 4])
array_has_nan = np.isnan(the_array).any()
print(array_has_nan)
# True
the_array = np.array([1, 2, 3, 4])
array_has_nan = np.isnan(the_array).any()
print(array_has_nan)
# False
In NumPy, to replace missing values NaN (np.nan) in ndarray with other numbers, use np.nan_to_num() or np.isnan().
This article describes the following contents.
- Missing value NaN (np.nan) in NumPy
- Specify filling_values argument of np.genfromtxt()
- Replace NaN with np.nan_to_num()
- Replace NaN with np.isnan()
If you want to delete the row or column containing the missing value instead of replacing it, see the following article.
- NumPy: Remove rows/columns with missing value (NaN) in ndarray
Missing value NaN (np.nan) in NumPy
When you read a CSV file with np.genfromtxt(), missing data is treated as the missing value NaN (Not a Number) by default.
When output with print(), it is printed as nan.
- sample_nan.csv
- numpy.genfromtxt — NumPy v1.21 Manual
import numpy as np
a = np.genfromtxt('data/src/sample_nan.csv', delimiter=',')
print(a)
# [[11. 12. nan 14.]
# [21. nan nan 24.]
# [31. 32. 33. 34.]]
If you want to generate NaN explicitly, use np.nan or float('nan'). You can also import the math module of the standard library and use math.nan.
All of them give the same floating-point NaN value.
a_nan = np.array([0, 1, np.nan, float('nan')])
print(a_nan)
# [ 0. 1. nan nan]
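As a minimal sketch of the point above, the three spellings can be compared side by side. The variable name candidates is only illustrative and does not appear in the original article.

```python
import math
import numpy as np

# Three ways to write NaN; the list name `candidates` is illustrative only
candidates = [np.nan, float('nan'), math.nan]

# Each one is an ordinary Python float
print([type(v).__name__ for v in candidates])
# ['float', 'float', 'float']

# Placed in an ndarray, every one of them is detected as NaN
print(bool(np.isnan(np.array(candidates)).all()))
# True
```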
Since comparing missing values with == returns False, use np.isnan() or math.isnan() to check whether a value is NaN.
- numpy.isnan — NumPy v1.21 Manual
- math.isnan — Mathematical functions — Python 3.10.1 documentation
print(np.nan == np.nan)
# False
print(np.isnan(np.nan))
# True
np.isnan() checks whether each element of ndarray is NaN.
print(a_nan == np.nan)
# [False False False False]
print(np.isnan(a_nan))
# [False False True True]
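The boolean array returned by np.isnan() can be reduced to a single answer or a count, while math.isnan() is meant for single values. The following is a small sketch that rebuilds a_nan as in the example above. The names has_nan and nan_count are illustrative.

```python
import math
import numpy as np

a_nan = np.array([0, 1, np.nan, float('nan')])

# math.isnan() works on a single value, such as one element of the array
print(math.isnan(a_nan[2]))
# True

# Reduce the elementwise result: is there any NaN, and how many?
has_nan = bool(np.isnan(a_nan).any())
nan_count = int(np.isnan(a_nan).sum())
print(has_nan, nan_count)
# True 2
```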
Specify filling_values argument of np.genfromtxt()
If data is missing in a CSV file, you can fill the missing part with any value by specifying the filling_values argument when reading it with np.genfromtxt().
For example, if you want to fill NaN with 0:
a_fill = np.genfromtxt('data/src/sample_nan.csv', delimiter=',', filling_values=0)
print(a_fill)
# [[11. 12. 0. 14.]
# [21. 0. 0. 24.]
# [31. 32. 33. 34.]]
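If the sample file is not at hand, the same behavior can be tried with in-memory text. This sketch assumes the CSV content shown in csv_text, which is reconstructed from the printed output above and is not the original file. It passes an io.StringIO object to np.genfromtxt() in place of a path.

```python
import io
import numpy as np

# Assumed stand-in for sample_nan.csv, reconstructed from the printed arrays
csv_text = "11,12,,14\n21,,,24\n31,32,33,34\n"

# Default behavior: empty fields become NaN
a_default = np.genfromtxt(io.StringIO(csv_text), delimiter=',')
print(int(np.isnan(a_default).sum()))
# 3

# filling_values accepts any number, not only 0
a_filled = np.genfromtxt(io.StringIO(csv_text), delimiter=',', filling_values=-1)
print(a_filled.tolist())
# [[11.0, 12.0, -1.0, 14.0], [21.0, -1.0, -1.0, 24.0], [31.0, 32.0, 33.0, 34.0]]
```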
Replace NaN with np.nan_to_num()
You can use np.nan_to_num() to replace NaN.
- numpy.nan_to_num — NumPy v1.21 Manual
Note that np.nan_to_num() also replaces infinity inf. See the following article for details.
- "inf" for infinity in Python
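A short sketch of how infinities are handled: by default, positive and negative infinity become the largest and smallest finite values of the array's dtype. The posinf and neginf arguments used below were added together with nan, so this assumes NumPy 1.17 or later. The replacement values 999.0 and -999.0 are arbitrary.

```python
import numpy as np

x = np.array([np.nan, np.inf, -np.inf, 1.0])

# Default: NaN -> 0.0, +inf -> largest finite float, -inf -> most negative finite float
default = np.nan_to_num(x)
print(default[0], default[1] == np.finfo(x.dtype).max, default[2] == np.finfo(x.dtype).min)
# 0.0 True True

# Explicit replacements (arbitrary values chosen for illustration)
custom = np.nan_to_num(x, nan=0.0, posinf=999.0, neginf=-999.0)
print(custom.tolist())
# [0.0, 999.0, -999.0, 1.0]
```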
If you specify ndarray as the first argument of np.nan_to_num(), a new ndarray is created with missing values replaced with 0 by default. The original ndarray is not changed.
a = np.genfromtxt('data/src/sample_nan.csv', delimiter=',')
print(np.nan_to_num(a))
# [[11. 12. 0. 14.]
# [21. 0. 0. 24.]
# [31. 32. 33. 34.]]
print(a)
# [[11. 12. nan 14.]
# [21. nan nan 24.]
# [31. 32. 33. 34.]]
If the second argument copy is set to False, the original ndarray is changed.
print(np.nan_to_num(a, copy=False))
# [[11. 12. 0. 14.]
# [21. 0. 0. 24.]
# [31. 32. 33. 34.]]
print(a)
# [[11. 12. 0. 14.]
# [21. 0. 0. 24.]
# [31. 32. 33. 34.]]
In NumPy version 1.17 or later, the value that replaces NaN can be specified with the nan argument.
a = np.genfromtxt('data/src/sample_nan.csv', delimiter=',')
print(np.nan_to_num(a, nan=-1))
# [[11. 12. -1. 14.]
# [21. -1. -1. 24.]
# [31. 32. 33. 34.]]
You can replace NaN with the average of the non-missing elements by using np.nanmean().
- NumPy: Calculate the sum, mean, max, min of ndarray containing np.nan
print(np.nanmean(a))
# 23.555555555555557
print(np.nan_to_num(a, nan=np.nanmean(a)))
# [[11. 12. 23.55555556 14. ]
# [21. 23.55555556 23.55555556 24. ]
# [31. 32. 33. 34. ]]
In versions where the nan argument is not implemented (earlier than NumPy 1.17), you can replace NaN with a value other than 0 by using np.isnan(), as described in the next section.
Replace NaN with np.isnan()
You can use np.isnan() to check whether each element of ndarray is NaN.
a = np.genfromtxt('data/src/sample_nan.csv', delimiter=',')
print(np.isnan(a))
# [[False False True False]
# [False True True False]
# [False False False False]]
Using this result, you can assign any value to the missing value element.
If you want to replace NaN with 0:
a[np.isnan(a)] = 0
print(a)
# [[11. 12. 0. 14.]
# [21. 0. 0. 24.]
# [31. 32. 33. 34.]]
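The assignment above modifies a in place. If the original array should be kept, one possible alternative is np.where() with the same boolean mask, which returns a new array. This sketch builds the array from a literal instead of reading the CSV file, and the name replaced is illustrative.

```python
import numpy as np

# Same values as the sample data, written as a literal for self-containment
a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, 33, 34]])

# Where the mask is True take -1, otherwise keep the original element
replaced = np.where(np.isnan(a), -1, a)
print(replaced.tolist())
# [[11.0, 12.0, -1.0, 14.0], [21.0, -1.0, -1.0, 24.0], [31.0, 32.0, 33.0, 34.0]]

# The original array still contains its three NaN values
print(int(np.isnan(a).sum()))
# 3
```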
You can also use np.nanmean() to replace NaN with the average value.
a = np.genfromtxt('data/src/sample_nan.csv', delimiter=',')
a[np.isnan(a)] = np.nanmean(a)
print(a)
# [[11. 12. 23.55555556 14. ]
# [21. 23.55555556 23.55555556 24. ]
# [31. 32. 33. 34. ]]
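The example above fills every NaN with the mean of the whole array. As a variation not covered in the original text, the same mask can be combined with np.nanmean(a, axis=0) to fill each NaN with the mean of its own column. This is a sketch on literal data, with col_means and filled as illustrative names. Note that a column consisting only of NaN would still yield NaN.

```python
import numpy as np

a = np.array([[11, 12, np.nan, 14],
              [21, np.nan, np.nan, 24],
              [31, 32, 33, 34]])

# Mean of each column, ignoring NaN
col_means = np.nanmean(a, axis=0)
print(col_means.tolist())
# [21.0, 22.0, 33.0, 24.0]

# col_means is broadcast across rows, so each NaN takes its column's mean
filled = np.where(np.isnan(a), col_means, a)
print(filled.tolist())
# [[11.0, 12.0, 33.0, 14.0], [21.0, 22.0, 33.0, 24.0], [31.0, 32.0, 33.0, 34.0]]
```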
How do you check if a value is np.nan?
The NumPy module provides the function numpy.isnan() to check whether an element is NaN. It takes an array as input and returns a boolean array of the same shape, in which each value indicates whether the element at the corresponding position in the original array is NaN.
How do I check if a value is NaN in Python?
The math.isnan() function checks whether a value is NaN (Not a Number). It returns True if the specified value is NaN; otherwise it returns False.
How do I check for NaN in pandas?
Here are some ways to check for NaN in a pandas DataFrame:
(1) Check for NaN under a single DataFrame column: df['your column name'].isnull().values.any().
(2) Count the NaN under a single DataFrame column: df['your column name'].isnull().sum().
(3) Check for NaN under an entire DataFrame: df.isnull().values.any().
How do I check if a numpy array contains NaN?
The numpy.isnan() function finds NaN (Not a Number) values in a NumPy array. It returns a boolean array with the same shape as the input, with True wherever an element is NaN and False elsewhere. To test whether the array contains any NaN at all, call .any() on the result.