Handling Missing Values with pandas

Write a Python program to find and replace the missing values in a given DataFrame.

Program Logic:

  • Create a dictionary, say ‘sale’, that stores the monthly sales figures for 7 years.
  • Mark some entries as missing, using ‘?’ and ‘---’ as placeholders.
  • Create a DataFrame object, say ‘df’, from the dictionary using the DataFrame() constructor of the pandas module.
  • Display the original DataFrame ‘df’ using the print() function.
  • Replace the placeholder entries with NaN.
  • Display the DataFrame with the modified content using the print() function.
  • Exit

Source code:

The program below finds and replaces the missing values in the given DataFrame. Its output follows the code.

import pandas as pd
import numpy as np

# Monthly sales for 1996-2002; '?' and '---' mark the missing entries.
sale = {'1996': [98, 99, 103, 102, 104, '---', 102, 103, 102, 105, '?', 105],
        '1997': [78, 101, 103, 102, 84, 101, 89, 103, 90, 104, 65, 105],
        '1998': [90, 40, 50, 90, 65, '?', 101, 103, 102, 104, 65, 109],
        '1999': ['?', 80, 60, 85, '?', 101, '---', 103, 102, 104, 65, 101],
        '2000': ['?', 80, 60, 85, '?', 101, 201, 103, 100, 104, 65, 101],
        '2001': [101, 101, 103, 102, 104, 101, 101, 103, 102, 104, 123, 101],
        '2002': [101, 101, 103, 102, '---', 101, 101, 103, 102, '?', 65, 101]}

# Build the DataFrame: one column per year, one row per month.
df = pd.DataFrame(sale)
print("*******Original Dataframe************")
print(df)

# Replace both placeholder strings with NaN and show the result.
print("\nReplace the missing values with NaN:")
result = df.replace({"?": np.nan, "---": np.nan})
print(result)

Output:

*******Original Dataframe************
    1996  1997  1998  1999  2000  2001  2002
0     98    78    90     ?     ?   101   101
1     99   101    40    80    80   101   101
2    103   103    50    60    60   103   103
3    102   102    90    85    85   102   102
4    104    84    65     ?     ?   104   ---
5    ---   101     ?   101   101   101   101
6    102    89   101   ---   201   101   101
7    103   103   103   103   103   103   103
8    102    90   102   102   100   102   102
9    105   104   104   104   104   104     ?
10     ?    65    65    65    65   123    65
11   105   105   109   101   101   101   101

Replace the missing values with NaN:
     1996  1997   1998   1999   2000  2001   2002
0    98.0    78   90.0    NaN    NaN   101  101.0
1    99.0   101   40.0   80.0   80.0   101  101.0
2   103.0   103   50.0   60.0   60.0   103  103.0
3   102.0   102   90.0   85.0   85.0   102  102.0
4   104.0    84   65.0    NaN    NaN   104    NaN
5     NaN   101    NaN  101.0  101.0   101  101.0
6   102.0    89  101.0    NaN  201.0   101  101.0
7   103.0   103  103.0  103.0  103.0   103  103.0
8   102.0    90  102.0  102.0  100.0   102  102.0
9   105.0   104  104.0  104.0  104.0   104    NaN
10    NaN    65   65.0   65.0   65.0   123   65.0
11  105.0   105  109.0  101.0  101.0   101  101.0
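
Once the placeholders have been replaced with NaN, the missing entries can also be located and counted, which covers the "find" part of the task. The following is only a minimal sketch: the dictionary `sample` and its values are made up for illustration (they are not the sales data above), and it assumes the same replace() step as the program, followed by the isna() method of pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical mini data set in the same style as the 'sale' dictionary.
sample = {
    "1996": [98, "---", "?", 105],
    "1997": [78, 101, 65, 105],
    "1999": ["?", 101, 65, 101],
}
df = pd.DataFrame(sample)

# Replace both placeholder strings with NaN, as in the program above.
result = df.replace({"?": np.nan, "---": np.nan})

# Boolean mask: True wherever a value is missing.
missing_mask = result.isna()

# Number of missing values in each column.
missing_per_column = missing_mask.sum()
print(missing_per_column)

# Row labels that contain at least one missing value.
rows_with_missing = result[missing_mask.any(axis=1)].index.tolist()
print(rows_with_missing)
```

The same two calls, isna().sum() and isna().any(axis=1), should work unchanged on the full 'result' DataFrame from the program.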
