AttributeError: 'list' object has no attribute 'lower' : Clustering
I'm trying to do clustering with pandas and sklearn. import pandas import pprint import pandas as pd from sklearn.cluster import KMeans from sklearn.metrics import adj
Solution 1:
The error is in this line:
dataset_list = dataset.values.tolist()
You see, dataset is a pandas DataFrame, so dataset.values returns a 2-d NumPy array of shape (n_rows, 1), which stays 2-d even when there is only one column. Calling tolist() on that array then produces a list of lists, something like this:
print(dataset_list)
[['hello wish to cancel order thank you confirmation'],
 ['hello would liketo cancel order made today store house world'],
 ['dimensions bed not compatible would liketo know how to pass cancellation refund send today cordially'],
 ...]
As you can see, there are two levels of square brackets here: each sentence is wrapped in its own inner list.
Now TfidfVectorizer expects a flat list of sentences (strings), not a list of lists, hence the error: TfidfVectorizer assumes each item is a string and calls .lower() on it, but here each item is a list.
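To see where the message comes from without involving scikit-learn at all, here is a minimal sketch. The `lowercase_documents` helper is hypothetical and only stands in for the lowercasing step; it is not scikit-learn's actual code, and the sample sentences are made up.

```python
def lowercase_documents(documents):
    """Hypothetical stand-in for the lowercasing step; not scikit-learn's code."""
    return [doc.lower() for doc in documents]


# A flat list of strings: every item has a .lower() method.
flat = ["Hello wish to cancel order", "Hello would like to cancel order"]

# A list of lists, which is the shape dataset.values.tolist() produces.
nested = [[sentence] for sentence in flat]

print(lowercase_documents(flat))

try:
    lowercase_documents(nested)
except AttributeError as exc:
    # Each item is a list, and lists have no .lower() method.
    error_message = str(exc)
    print(error_message)
```

The second call fails on the first item, because the item is a list rather than a string.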
So you just need to do this:
# Use ravel to convert the 2-d array to a 1-d array
dataset_list = dataset.values.ravel().tolist()
OR
# Replace `column_name` with your actual column header;
# selecting a single column gives a Series instead of a DataFrame
dataset_list = dataset['column_name'].values.tolist()
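Both fixes can be checked end to end on a toy DataFrame. This is only a sketch: the column name `text` and the three sentences are assumptions made for illustration, not the asker's real data.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy single-column DataFrame; the column name "text" is an assumption.
dataset = pd.DataFrame({"text": [
    "hello wish to cancel order thank you confirmation",
    "hello would like to cancel order made today",
    "dimensions bed not compatible would like refund",
]})

print(dataset.values.shape)  # 2-d even with a single column: (3, 1)

nested = dataset.values.tolist()              # list of one-element lists
flat_ravel = dataset.values.ravel().tolist()  # fix 1: flatten the 2-d array
flat_column = dataset["text"].tolist()        # fix 2: select the column as a Series

vectorizer = TfidfVectorizer()

try:
    vectorizer.fit_transform(nested)
except AttributeError as exc:
    print("nested input fails:", exc)

# Either flat list works; the result has one row per sentence.
tfidf = vectorizer.fit_transform(flat_ravel)
print(tfidf.shape[0])
```

Selecting the column by name is usually the safer choice when the DataFrame may later gain more columns, since ravel() would then interleave values from every column into one list.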