What should I do if there is no data for some features in some training examples? | Coursera Community

# What should I do if there is no data for some features in some training examples?

I'm working on a machine learning problem with 100,000 training examples and about 100 features.

The label y is a single number for every training example, either 1 or 0.

I'm going to use a basic neural network to solve it.

However, there is no data for some features in some training examples.

For example, feature_1 : 1,0,0,NA,0,0,1,1,1,0,NA,0,1,1,NA,......

What should I do in this situation?

I have three plans for it as a draft. Can someone tell me if these plans would work?

Or maybe you have a better idea?

1. Ignore the NA values, like this:

[1, 0.5, NA, 0] * [theta1, theta2, theta3, theta4] = 1*theta1 + 0.5*theta2 + 0*theta3 + 0*theta4

But this effectively learns NA = 0, so I think it would work if the feature never takes the value 0, but fail otherwise. Am I right?

2. Replace NA with a placeholder value. First, look at a feature, for example:

feature_1 : 1,0,0,NA,0,0,1,1,1,0,NA,0,1,1,NA

Then set NA = -1, 2, or 100 (a number that is not close to any value the feature can take). Would this work? And which of -1, 2, or 100 do you think is better?

3. Use a recommender system to estimate the NA values first, then feed the result to the NN.
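
To make the first two plans concrete, here is a minimal plain-Python sketch. The helper names (`fill_missing`, `dot`) and the theta values are made up for illustration, and `None` stands in for NA. It suggests that skipping NA in the weighted sum gives the same result as setting NA = 0, while a placeholder such as -1 gives the missing entry its own contribution.

```python
# Hypothetical example: None stands in for NA; theta values are made up.

def fill_missing(row, value):
    """Return a copy of row with every None replaced by value."""
    return [value if x is None else x for x in row]

def dot(row, theta, skip_missing=False):
    """Weighted sum of features; optionally skip missing (None) entries."""
    total = 0.0
    for x, t in zip(row, theta):
        if x is None:
            if skip_missing:
                continue  # plan 1: the NA term contributes nothing
            raise ValueError("missing value must be filled first")
        total += x * t
    return total

row = [1, 0.5, None, 0]
theta = [0.2, 0.4, 0.6, 0.8]

plan1 = dot(row, theta, skip_missing=True)   # plan 1: ignore NA
as_zero = dot(fill_missing(row, 0), theta)   # explicitly set NA = 0
plan2 = dot(fill_missing(row, -1), theta)    # plan 2: NA = -1 placeholder

print(plan1 == as_zero)  # True: ignoring NA is the same as NA = 0
print(plan2)             # differs from plan1 because of the -1 * theta3 term
```

With a placeholder, the network can in principle learn a separate effect for "missing", provided the placeholder never collides with a real value of the feature.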

### 2 replies

I would go with the second option, but use a negative number. Make sure you don't have outliers, which can cause problems.

Using a recommender system could be tricky and would complicate things, but it should perform better (you should measure this). Overall, try different approaches; if the performance gain, which might be only slightly higher with a recommender system, matters to you, go for it.

There are many ways to deal with missing values:

1. You can remove the entire example if there are not too many missing values.
2. Replace the missing value with the mean, median, or mode of that feature. Here the values are 0 or 1, so replacing with the mode should work better.
Do share what worked for you.
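
The two options in this reply can be sketched in plain Python as follows. This is only an illustration under assumptions: the data, the variable names, and the `impute_column` helper are hypothetical, and `None` stands in for NA.

```python
from statistics import mean, median, mode

# Hypothetical training rows: None stands in for NA.
rows = [
    [1, 0.5, 3.0],
    [0, None, 2.0],
    [1, 0.7, None],
    [0, 0.2, 1.0],
]

# Option 1: drop every example that has at least one missing value.
complete_rows = [r for r in rows if None not in r]

def impute_column(column, strategy):
    """Fill None entries with a statistic of the observed values."""
    observed = [x for x in column if x is not None]
    fill = strategy(observed)
    return [fill if x is None else x for x in column]

# Option 2: fill a binary feature with its mode.
feature_1 = [1, 0, 0, None, 0, 0, 1, None, 0]
filled_mode = impute_column(feature_1, mode)

# For a continuous feature, mean or median is the usual choice.
feature_2 = [0.5, None, 0.7, 0.2]
filled_mean = impute_column(feature_2, mean)
filled_median = impute_column(feature_2, median)

print(len(complete_rows))  # 2
print(filled_mode)         # [1, 0, 0, 0, 0, 0, 1, 0, 0]
```

If you go this route, the fill value would normally be computed from the training set only and then reused unchanged for validation and test data.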