
Error with K-means

I did exactly the same as in the lesson (I even downloaded the lesson's project), but when I run this line:

k_means_dados_e.fit(x_normalizado)

I get this error:


AttributeError                            Traceback (most recent call last)
Input In [36], in <cell line: 6>()
      6 for i in range(1,11):
      7     k_means_dados_e = KMeans(n_clusters=i, random_state=42)
----> 8     k_means_dados_e.fit(x_normalizado)

File ~\Anaconda3\lib\site-packages\sklearn\cluster\_kmeans.py:1186, in KMeans.fit(self, X, y, sample_weight)
   1183     print("Initialization complete")
   1185 # run a k-means once
-> 1186 labels, inertia, centers, n_iter_ = kmeans_single(
   1187     X,
   1188     sample_weight,
   1189     centers_init,
   1190     max_iter=self.max_iter,
   1191     verbose=self.verbose,
   1192     tol=self._tol,
   1193     x_squared_norms=x_squared_norms,
   1194     n_threads=self._n_threads,
   1195 )
   1197 # determine if these results are the best so far
   1198 # we chose a new run if it has a better inertia and the clustering is
   1199 # different from the best so far (it's possible that the inertia is
   1200 # slightly better even if the clustering is the same with potentially
   1201 # permuted labels, due to rounding errors)
   1202 if best_inertia is None or (
   1203     inertia < best_inertia
   1204     and not _is_same_clustering(labels, best_labels, self.n_clusters)
   1205 ):

File ~\Anaconda3\lib\site-packages\sklearn\cluster\_kmeans.py:625, in _kmeans_single_lloyd(X, sample_weight, centers_init, max_iter, verbose, x_squared_norms, tol, n_threads)
    621 strict_convergence = False
    623 # Threadpoolctl context to limit the number of threads in second level of
    624 # nested parallelism (i.e. BLAS) to avoid oversubsciption.
--> 625 with threadpool_limits(limits=1, user_api="blas"):
    626     for i in range(max_iter):
    627         lloyd_iter(
    628             X,
    629             sample_weight,
   (...)
    636             n_threads,
    637         )

File ~\Anaconda3\lib\site-packages\sklearn\utils\fixes.py:314, in threadpool_limits(limits, user_api)
    312     return controller.limit(limits=limits, user_api=user_api)
    313 else:
--> 314     return threadpoolctl.threadpool_limits(limits=limits, user_api=user_api)

File ~\Anaconda3\lib\site-packages\threadpoolctl.py:171, in threadpool_limits.__init__(self, limits, user_api)
    167 def __init__(self, limits=None, user_api=None):
    168     self._limits, self._user_api, self._prefixes = \
    169         self._check_params(limits, user_api)
--> 171     self._original_info = self._set_threadpool_limits()

File ~\Anaconda3\lib\site-packages\threadpoolctl.py:268, in threadpool_limits._set_threadpool_limits(self)
    265 if self._limits is None:
    266     return None
--> 268 modules = _ThreadpoolInfo(prefixes=self._prefixes,
    269                           user_api=self._user_api)
    270 for module in modules:
    271     # self._limits is a dict {key: num_threads} where key is either
    272     # a prefix or a user_api. If a module matches both, the limit
    273     # corresponding to the prefix is chosed.
    274     if module.prefix in self._limits:

File ~\Anaconda3\lib\site-packages\threadpoolctl.py:340, in _ThreadpoolInfo.__init__(self, user_api, prefixes, modules)
    337     self.user_api = [] if user_api is None else user_api
    339     self.modules = []
--> 340     self._load_modules()
    341     self._warn_if_incompatible_openmp()
    342 else:

File ~\Anaconda3\lib\site-packages\threadpoolctl.py:373, in _ThreadpoolInfo._load_modules(self)
    371     self._find_modules_with_dyld()
    372 elif sys.platform == "win32":
--> 373     self._find_modules_with_enum_process_module_ex()
    374 else:
    375     self._find_modules_with_dl_iterate_phdr()

File ~\Anaconda3\lib\site-packages\threadpoolctl.py:485, in _ThreadpoolInfo._find_modules_with_enum_process_module_ex(self)
    482     filepath = buf.value
    484     # Store the module if it is supported and selected
--> 485     self._make_module_from_path(filepath)
    486 finally:
    487     kernel_32.CloseHandle(h_process)

File ~\Anaconda3\lib\site-packages\threadpoolctl.py:515, in _ThreadpoolInfo._make_module_from_path(self, filepath)
    513 if prefix in self.prefixes or user_api in self.user_api:
    514     module_class = globals()[module_class]
--> 515     module = module_class(filepath, prefix, user_api, internal_api)
    516     self.modules.append(module)

File ~\Anaconda3\lib\site-packages\threadpoolctl.py:606, in _Module.__init__(self, filepath, prefix, user_api, internal_api)
    604 self.internal_api = internal_api
    605 self._dynlib = ctypes.CDLL(filepath, mode=_RTLD_NOLOAD)
--> 606 self.version = self.get_version()
    607 self.num_threads = self.get_num_threads()

1 answer

Hi Alexandre, how are you?

The error is raised inside the threadpoolctl library, which scikit-learn (sklearn) uses. The problem may therefore be an incompatibility between the versions of the libraries you have installed.
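Before changing anything, it can help to see which versions are actually installed. The snippet below is a minimal sketch using only the standard library; the helper name installed_version is hypothetical, and the list of packages is an assumption about what is relevant here.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as published on PyPI (scikit-learn, not sklearn).
for name in ("scikit-learn", "threadpoolctl", "numpy"):
    print(f"{name}: {installed_version(name)}")
```

Run it in the same notebook where the error appears, so the versions reported are the ones that kernel is using.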

One possible fix is to make sure both libraries are up to date. Note that the package is installed as scikit-learn; sklearn is only the name used in import statements. Try running these commands:

pip install --upgrade scikit-learn
pip install --upgrade threadpoolctl
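With several Python installations on one machine (Anaconda plus another Python, for example), pip on the command line may not belong to the interpreter that the notebook runs on. As a sketch, the upgrade command can be built from sys.executable so that it targets the notebook's own interpreter. The helper name pip_upgrade_command is hypothetical, and the packages are named as they are published on PyPI (scikit-learn is the distribution behind import sklearn).

```python
import sys


def pip_upgrade_command(*packages):
    """Build a pip upgrade command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", *packages]


command = pip_upgrade_command("scikit-learn", "threadpoolctl")

# Printed for inspection only; quote the interpreter path if it contains spaces.
print(" ".join(command))

# To run it from Python instead of a terminal:
#   import subprocess
#   subprocess.check_call(command)
```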

If that does not solve the problem, try uninstalling and reinstalling the libraries:

pip uninstall scikit-learn
pip uninstall threadpoolctl
pip install scikit-learn
pip install threadpoolctl
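After upgrading or reinstalling, restart the notebook kernel so the new versions are actually loaded. A quick check like the one below can then confirm whether KMeans fits without the AttributeError. This is only a sketch: x_demo is synthetic data standing in for x_normalizado, and n_init=10 is set explicitly as an assumption to keep behavior the same across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for x_normalizado: 60 random points with 2 features.
rng = np.random.default_rng(42)
x_demo = rng.random((60, 2))

# Same loop shape as in the question: fit K-means for 1 to 10 clusters
# and collect the inertia of each fit.
inertias = []
for k in range(1, 11):
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    model.fit(x_demo)
    inertias.append(model.inertia_)

print(inertias)
```

If this loop completes, the same loop on x_normalizado should get past the threadpoolctl call that failed in the traceback.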

Also, to get the most out of the course, I strongly recommend using the same tools as the instructor, which in this case is Google Colab.

I hope this helps. If the problem persists, I'm happy to help further.

Best regards, and happy studying!

If this post helped you, please mark it as solved ✓.