Arrays used as indices must be of integer (or boolean) type

The error looks like this:

Traceback (most recent call last):
  File "", line 53, in <module>
    clf.fit(X_train.todense(),y_train)
  File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.13.1-py2.7-linux-i686.egg/sklearn/neighbors/", line 115, in fit
    variance = np.array(np.power(X - self.centroids_[y], 2))
IndexError: arrays used as indices must be of integer (or boolean) type

The code looks like this:

for mtrc in distancemetric:
    for shrkthrshld in [None]:
    # while (shrkthrshld <= 1.0):
        clf = NearestCentroid(metric=mtrc, shrink_threshold=shrkthrshld)
        # shrink_threshold does not accept CSR input, hence .todense()
        clf.fit(X_train.todense(), y_train)
        y_predicted = clf.predict(X_test.todense())

I am using the scikit-learn package. X_train and y_train come from a LIBSVM-format file: X_train holds the feature:value pairs and y_train holds the targets/labels. X_train is a CSR sparse matrix, and shrink_threshold does not support CSR sparse matrices, so I added .todense() to X_train. After that I got this error. Could anyone help me fix this? Thanks a lot!

3 Answers

I had a similar problem using PyStruct's pystruct.learners.OneSlackSSVM.

It occurred because my training labels were floats instead of integers. In my case, I had initialized the labels with np.ones without specifying dtype=np.int8. Hope it helps.
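Here is a minimal sketch of that failure mode and the fix, using plain NumPy rather than PyStruct or scikit-learn. The arrays `centroids` and `labels` are made-up illustration data, not taken from the question, and the exact exception raised may differ in very old NumPy releases.

```python
import numpy as np

# Hypothetical data: two class centroids and three labels.
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = np.ones(3)  # np.ones defaults to float64

try:
    centroids[labels]  # a float array is not a valid index
    float_index_failed = False
except IndexError:
    float_index_failed = True

# Casting the labels to an integer dtype makes the indexing valid.
int_labels = labels.astype(np.int64)
picked = centroids[int_labels]
```

For the code in the question, the analogous change would presumably be to pass `y_train.astype(int)` to `fit`, assuming the labels really are whole numbers stored as floats.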

Quite often an indexing array looks like it must be of integer type from the way it is created, but when it is built from an empty list it defaults to float, a case the programmer may not have considered. For example:

>>> np.array(xrange(1))
array([0])                # integer type as expected
>>> np.array(xrange(0))
array([], dtype=float64)  # does not generalize to the empty list

Therefore, one should always explicitly define the dtype in the array constructor.
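The same point as a small Python 3 sketch (so a plain empty list stands in for `xrange(0)`). The variable names are illustrative only.

```python
import numpy as np

# An empty sequence carries no type information, so NumPy falls back to float64.
empty_default = np.array([])

# Passing dtype explicitly keeps the array usable as an index even when empty.
empty_int = np.array([], dtype=np.int64)

values = np.arange(5)
selected = values[empty_int]  # valid index: selects nothing
```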

Sometimes your data is of integer type and everything else is correct, but the error still occurs because one of your data series is an empty array. In that case you can guard with a condition like this:

if len(X_train.todense()) > 0:
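One possible way to write such a guard without converting to a dense matrix first is to look at the `.shape` attribute, which both NumPy arrays and SciPy sparse matrices provide. The helper name `has_rows` is hypothetical, and dense NumPy arrays stand in for the sparse data here.

```python
import numpy as np

def has_rows(X):
    """Return True if X has at least one row, judged by its .shape attribute."""
    return X.shape[0] > 0

# Illustrative inputs: an empty feature matrix and a non-empty one.
X_empty = np.empty((0, 4))
X_full = np.zeros((3, 4))

if has_rows(X_full):
    n_samples = X_full.shape[0]  # safe to go on to fit/predict here
```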
