I'm working on a machine learning problem with 100,000 training examples and about 100 features.
The label Y is a single number for every training example, either 1 or 0.
I'm going to use a basic neural network to solve it.
However, there is no data for some features in some training examples.
For example, feature_1 : 1,0,0,NA,0,0,1,1,1,0,NA,0,1,1,NA,......
What should I do in this situation?
I have drafted three plans. Can someone tell me whether they would work?
Or maybe you have a better idea?
1. Ignore the NAs, like this:
[1,0.5,NA,0]*[theta1,theta2,theta3,theta4]=1*theta1 + 0.5*theta2 + 0*theta3 + 0*theta4
But this effectively teaches the network that NA=0, so it cannot tell a missing value from a real 0.
I think this would work if the feature never takes the value 0, but fail otherwise. Am I right?
2. Replace NA with a placeholder value, like this:
First, look at a feature, for example:
feature_1 : 1,0,0,NA,0,0,1,1,1,0,NA,0,1,1,NA
Then set NA to -1, 2, or 100 (a number that is not close to any value the feature can take). Would
this work? And which of -1, 2, and 100 do you think is better?
3. Use a recommender system to estimate the NAs first, then feed the completed data to the NN.
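
To make the three plans concrete, here is a minimal NumPy sketch of what I have in mind. The toy matrix, the theta values, and the function names are made up for illustration. For plan 3 I am assuming that "recommender system" can be read as low-rank matrix completion via repeated truncated SVD, which is only one possible interpretation and not something I have tested on the real data.

```python
import numpy as np

# Toy data (made up): 6 examples, 3 binary features, NaN marks an NA.
X = np.array([
    [1.0,    0.0,    1.0],
    [0.0,    np.nan, 0.0],
    [0.0,    1.0,    np.nan],
    [np.nan, 1.0,    1.0],
    [1.0,    0.0,    0.0],
    [1.0,    1.0,    1.0],
])

def plan1_zero_fill(X):
    """Plan 1: treat NA as 0 so it adds nothing to the weighted sum."""
    return np.where(np.isnan(X), 0.0, X)

def plan2_sentinel(X, sentinel=-1.0):
    """Plan 2: replace NA with a value outside the feature's usual range."""
    return np.where(np.isnan(X), sentinel, X)

def plan3_low_rank(X, rank=1, n_iter=50):
    """Plan 3 (assumed reading): estimate NAs from a low-rank approximation.

    Start from column means, then repeatedly fit a rank-`rank` SVD
    approximation and copy its values into the missing cells only.
    """
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)          # per-feature mean over observed values
    filled = np.where(missing, col_means, X)   # initial guess
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(missing, approx, X)  # observed cells are never changed
    return filled

# The example from plan 1: the NA term drops out of the sum.
x = np.array([1.0, 0.5, np.nan, 0.0])
theta = np.array([0.1, 0.2, 0.3, 0.4])         # hypothetical weights
print(plan1_zero_fill(x) @ theta)

print(plan2_sentinel(X))
print(plan3_low_rank(X).round(2))
```

In all three cases the output has the same shape as the input and contains no NA, so it could be passed to the NN directly. Only plan 2 leaves a visible trace of which cells were originally missing.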