The error is:
    Traceback (most recent call last):
      File "NearestCentroid.py", line 53, in <module>
        clf.fit(X_train.todense(),y_train)
      File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.13.1-py2.7-linux-i686.egg/sklearn/neighbors/nearest_centroid.py", line 115, in fit
        variance = np.array(np.power(X - self.centroids_[y], 2))
    IndexError: arrays used as indices must be of integer (or boolean) type
The code is:
    distancemetric = ['euclidean', 'l2']
    for mtrc in distancemetric:
        for shrkthrshld in [None]:
            #shrkthrshld=0
            #while (shrkthrshld <=1.0):
            clf = NearestCentroid(metric=mtrc, shrink_threshold=shrkthrshld)
            clf.fit(X_train.todense(), y_train)
            y_predicted = clf.predict(X_test.todense())
I am using scikit-learn 0.13.1 (the version shown in the traceback). The training data are in LIBSVM format: X holds the feature:value pairs and y_train is the target/label. X_train is in CSR matrix format. shrink_threshold does not support CSR sparse matrices, so I added .todense() to X_train, and then I got this error. Could anyone help me fix this? Thanks a lot!
I had a similar problem using Pystruct.
It occurred because my training labels were floats instead of integers. In my case, it was because I initialized the labels with np.ones without specifying dtype=np.int8. Hope it helps.
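To illustrate the float-label problem, here is a minimal sketch. The `centroids` array is only a hypothetical stand-in for whatever per-class array the library indexes with the labels (like `self.centroids_[y]` in the traceback); it is not the library's actual code.

```python
import numpy as np

# Hypothetical stand-in for per-class centroids: 2 classes, 3 features.
centroids = np.array([[0.0, 0.0, 0.0],
                      [1.0, 1.0, 1.0]])

# np.ones defaults to float64, so these labels cannot be used as indices.
float_labels = np.ones(4)

try:
    centroids[float_labels]
    raised = False
except IndexError:
    # "arrays used as indices must be of integer (or boolean) type"
    raised = True

# Either request an integer dtype up front or cast existing labels.
int_labels = np.ones(4, dtype=np.int8)
cast_labels = float_labels.astype(int)

# Integer labels index cleanly: one centroid row per sample.
rows = centroids[cast_labels]
```

Casting with `astype(int)` truncates, so it is only safe when the float labels already hold whole numbers such as 0.0 and 1.0.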
It happens quite often that an indexing array is clearly meant to be
integer-typed from the way it is created, but when an empty list is passed it silently becomes the default
float type, a case the programmer might not have considered. For example:
    >>> np.array(xrange(1))
    array([0])                # integer type as expected
    >>> np.array(xrange(0))
    array([], dtype=float64)  # does not generalize to the empty list
Therefore, one should always explicitly specify the
dtype in the array constructor.
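As a rough sketch of that advice in Python 3 (where `range` replaces `xrange`), the helper below forces an integer dtype so that an empty input still produces a valid index array. The name `index_array` is made up for this example and is not a NumPy function.

```python
import numpy as np

def index_array(values):
    """Build an index array that stays integer-typed even when empty.

    Hypothetical helper for illustration; not part of NumPy.
    """
    return np.array(list(values), dtype=np.intp)

data = np.arange(10, 20)

# Without an explicit dtype, an empty input falls back to float64.
implicit_empty = np.array(list(range(0)))

# With an explicit dtype, the empty array keeps an integer type.
explicit_empty = index_array(range(0))

# An empty integer index selects nothing and raises no error.
selected_none = data[explicit_empty]

# A non-empty index behaves as usual.
selected_some = data[index_array([0, 2])]
```

`np.intp` is NumPy's native index type, which is why it is used here instead of a fixed-width integer.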
Sometimes your data is integer-typed and everything looks right, but the error still occurs because one of your data series is an empty array. In that case you can guard with this condition:
if len(X_train.todense())> 0:
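A possible way to write that guard as a complete snippet is sketched below. It checks `.shape[0]` rather than calling `len()` on a densified copy, assuming the input exposes `.shape` (NumPy arrays and SciPy sparse matrices both do). The function name `has_samples` and the toy arrays are made up for this example.

```python
import numpy as np

def has_samples(X):
    """Return True if X has at least one row (hypothetical helper)."""
    return X.shape[0] > 0

# Toy inputs: one empty feature matrix and one with two samples.
X_empty = np.empty((0, 3))
X_full = np.zeros((2, 3))

fitted = []
for name, X in [("empty", X_empty), ("full", X_full)]:
    if has_samples(X):
        # A real script would call clf.fit(X, y) here.
        fitted.append(name)
```

Only the non-empty matrix reaches the fitting step, so the empty series never gets used as an index.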