Hi. I ran some tests and noticed that some sites raise an error when read with pd.read_html().
I tested the two sites below, and both failed:
**- Site given as an "equivalent example" by Alura:**
df_html = pd.read_html('https://www.federalreserve.gov/releases/h3/current/default.htm')
df_html[0]
*Error raised:*
HTTPError                                 Traceback (most recent call last)
in <cell line: 1>()
----> 1 df_html = pd.read_html('https://www.federalreserve.gov/releases/h3/current/default.htm')
      2 df_html[0]

12 frames
/usr/lib/python3.9/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    639 class HTTPDefaultErrorHandler(BaseHandler):
    640     def http_error_default(self, req, fp, code, msg, hdrs):
--> 641         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    642
    643 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 403: Forbidden
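I have not confirmed why the server refuses the request, but a 403 on this kind of call is often caused by the site rejecting the default User-Agent that urllib sends. The sketch below is one thing that might be worth trying: download the page with an explicit User-Agent and hand the HTML to pd.read_html(). The names `USER_AGENT`, `build_request`, and `read_html_with_user_agent` are made up for this example, and the User-Agent value is only illustrative.

```python
import io
import urllib.request

# Assumption: a browser-like User-Agent is enough for the server to answer.
# This exact string is illustrative, not something the site documents.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url):
    """Build a urllib request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def read_html_with_user_agent(url):
    """Download the page ourselves, then let pandas parse the tables."""
    import pandas as pd  # imported here so build_request works without pandas

    with urllib.request.urlopen(build_request(url)) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    # read_html accepts a file-like object, so wrap the text in StringIO.
    return pd.read_html(io.StringIO(html))
```

Usage would be `df_html = read_html_with_user_agent('https://www.federalreserve.gov/releases/h3/current/default.htm')` followed by `df_html[0]`. If the server blocks on something other than the User-Agent, this will still return 403.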
**- Wikipedia site:**
df_html = pd.read_html('https://pt.wikipedia.org/wiki/Lista_de_unidades_federativas_do_Brasil_por_população')
df_html[0]
*Error raised:*
UnicodeEncodeError                        Traceback (most recent call last)
in <cell line: 1>()
----> 1 df_html = pd.read_html('https://pt.wikipedia.org/wiki/Lista_de_unidades_federativas_do_Brasil_por_população')
      2 df_html[0]

20 frames
/usr/lib/python3.9/http/client.py in _encode_request(self, request)
   1212     def _encode_request(self, request):
   1213         # ASCII also helps prevent CVE-2019-9740.
-> 1214         return request.encode('ascii')
   1215
   1216     def _validate_method(self, method):

UnicodeEncodeError: 'ascii' codec can't encode characters in position 60-61: ordinal not in range(128)
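Positions 60-61 in that message line up with the "ç" and "ã" in "população", so the failure seems to come from the non-ASCII characters in the URL path. A minimal sketch of a possible workaround is to percent-encode the path before passing the URL to pd.read_html(). The helper names `to_ascii_url` and `read_html_unicode_url` are made up for this example, and it only handles the path, not the host name or query string.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Percent-encode non-ASCII characters in the path of a URL."""
    parts = urlsplit(url)
    # safe="/%" keeps path separators and leaves existing %XX escapes alone.
    encoded_path = quote(parts.path, safe="/%")
    return urlunsplit(parts._replace(path=encoded_path))


def read_html_unicode_url(url):
    """Call pandas.read_html with an ASCII-only version of the URL."""
    import pandas as pd  # imported here so to_ascii_url works without pandas

    return pd.read_html(to_ascii_url(url))
```

With this, `to_ascii_url('https://pt.wikipedia.org/wiki/Lista_de_unidades_federativas_do_Brasil_por_população')` ends in `popula%C3%A7%C3%A3o`, which is the UTF-8 percent-encoding of the same page name.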