I tried to use UnstructuredURLLoader as below:
from langchain.document_loaders import UnstructuredURLLoader
loaders = UnstructuredURLLoader(urls=urls)
data = loaders.load()
but for some pages it reports the following:
libmagic is unavailable but assists in filetype detection on file-like objects. Please consider installing libmagic for better results.
Error fetching or processing https://wellfound.com/company/chorus-one, exception: Invalid file. The FileType.UNK file type is not supported in partition.
even though libmagic appears to be installed in my conda env:
%pip list | grep libmagic
libmagic 1.0
However, I do not have python-libmagic. When I try to install it:
pip install python-libmagic
I keep getting this error:
Collecting python-libmagic
Using cached python_libmagic-0.4.0-py3-none-any.whl
Collecting cffi==1.7.0 (from python-libmagic)
Using cached cffi-1.7.0.tar.gz (400 kB)
Preparing metadata (setup.py) ... done
Requirement already satisfied: pycparser in /opt/conda/envs/cho_env/lib/python3.10/site-packages (from cffi==1.7.0->python-libmagic) (2.21)
Building wheels for collected packages: cffi
Building wheel for cffi (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [254 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-310
creating build/lib.linux-x86_64-cpython-310/cffi
copying cffi/ffiplatform.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/cffi_opcode.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/verifier.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/commontypes.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/vengine_gen.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/setuptools_ext.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/vengine_cpy.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/recompiler.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/cparser.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/lock.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/backend_ctypes.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/__init__.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/model.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/api.py -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/_cffi_include.h -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/parse_c_type.h -> build/lib.linux-x86_64-cpython-310/cffi
copying cffi/_embedding.h -> build/lib.linux-x86_64-cpython-310/cffi
running build_ext
building '_cffi_backend' extension
creating build/temp.linux-x86_64-cpython-310
creating build/temp.linux-x86_64-cpython-310/c
gcc -pthread -B /opt/conda/envs/cho_env/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /opt/conda/envs/cho_env/include -fPIC -O2 -isystem /opt/conda/envs/cho_env/include -fPIC -DUSE__THREAD -I/usr/include/ffi -I/usr/include/libffi -I/opt/conda/envs/cho_env/include/python3.10 -c c/_cffi_backend.c -o build/temp.linux-x86_64-cpython-310/c/_cffi_backend.o
In file included from c/_cffi_backend.c:274:
c/minibuffer.h: In function ‘mb_ass_slice’:
c/minibuffer.h:66:5: warning: ‘PyObject_AsReadBuffer’ is deprecated [-Wdeprecated-declarations]
66 | if (PyObject_AsReadBuffer(other, &buffer, &buffer_len) < 0)
| ^~
In file included from /opt/conda/envs/cho_env/include/python3.10/genobject.h:12,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:110,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/abstract.h:343:17: note: declared here
343 | PyAPI_FUNC(int) PyObject_AsReadBuffer(PyObject *obj,
| ^~~~~~~~~~~~~~~~~~~~~
In file included from c/_cffi_backend.c:277:
c/file_emulator.h: In function ‘PyFile_AsFile’:
c/file_emulator.h:54:14: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
54 | mode = PyText_AsUTF8(ob_mode);
| ^
In file included from c/_cffi_backend.c:281:
c/wchar_helper.h: In function ‘_my_PyUnicode_AsSingleWideChar’:
c/wchar_helper.h:83:5: warning: ‘PyUnicode_AsUnicode’ is deprecated [-Wdeprecated-declarations]
83 | Py_UNICODE *u = PyUnicode_AS_UNICODE(unicode);
| ^~~~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:580:45: note: declared here
580 | Py_DEPRECATED(3.3) PyAPI_FUNC(Py_UNICODE *) PyUnicode_AsUnicode(
| ^~~~~~~~~~~~~~~~~~~
In file included from c/_cffi_backend.c:281:
c/wchar_helper.h:84:5: warning: ‘_PyUnicode_get_wstr_length’ is deprecated [-Wdeprecated-declarations]
84 | if (PyUnicode_GET_SIZE(unicode) == 1) {
| ^~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:446:26: note: declared here
446 | static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from c/_cffi_backend.c:281:
c/wchar_helper.h:84:5: warning: ‘PyUnicode_AsUnicode’ is deprecated [-Wdeprecated-declarations]
84 | if (PyUnicode_GET_SIZE(unicode) == 1) {
| ^~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:580:45: note: declared here
580 | Py_DEPRECATED(3.3) PyAPI_FUNC(Py_UNICODE *) PyUnicode_AsUnicode(
| ^~~~~~~~~~~~~~~~~~~
In file included from c/_cffi_backend.c:281:
c/wchar_helper.h:84:5: warning: ‘_PyUnicode_get_wstr_length’ is deprecated [-Wdeprecated-declarations]
84 | if (PyUnicode_GET_SIZE(unicode) == 1) {
| ^~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:446:26: note: declared here
446 | static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from c/_cffi_backend.c:281:
c/wchar_helper.h: In function ‘_my_PyUnicode_SizeAsWideChar’:
c/wchar_helper.h:99:5: warning: ‘_PyUnicode_get_wstr_length’ is deprecated [-Wdeprecated-declarations]
99 | Py_ssize_t length = PyUnicode_GET_SIZE(unicode);
| ^~~~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:446:26: note: declared here
446 | static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from c/_cffi_backend.c:281:
c/wchar_helper.h:99:5: warning: ‘PyUnicode_AsUnicode’ is deprecated [-Wdeprecated-declarations]
99 | Py_ssize_t length = PyUnicode_GET_SIZE(unicode);
| ^~~~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:580:45: note: declared here
580 | Py_DEPRECATED(3.3) PyAPI_FUNC(Py_UNICODE *) PyUnicode_AsUnicode(
| ^~~~~~~~~~~~~~~~~~~
In file included from c/_cffi_backend.c:281:
c/wchar_helper.h:99:5: warning: ‘_PyUnicode_get_wstr_length’ is deprecated [-Wdeprecated-declarations]
99 | Py_ssize_t length = PyUnicode_GET_SIZE(unicode);
| ^~~~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:446:26: note: declared here
446 | static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from c/_cffi_backend.c:281:
c/wchar_helper.h: In function ‘_my_PyUnicode_AsWideChar’:
c/wchar_helper.h:118:5: warning: ‘PyUnicode_AsUnicode’ is deprecated [-Wdeprecated-declarations]
118 | Py_UNICODE *u = PyUnicode_AS_UNICODE(unicode);
| ^~~~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:580:45: note: declared here
580 | Py_DEPRECATED(3.3) PyAPI_FUNC(Py_UNICODE *) PyUnicode_AsUnicode(
| ^~~~~~~~~~~~~~~~~~~
c/_cffi_backend.c: In function ‘ctypedescr_dealloc’:
c/_cffi_backend.c:352:23: error: lvalue required as left operand of assignment
352 | Py_REFCNT(ct) = 43;
| ^
c/_cffi_backend.c:355:23: error: lvalue required as left operand of assignment
355 | Py_REFCNT(ct) = 0;
| ^
c/_cffi_backend.c: In function ‘cast_to_integer_or_char’:
c/_cffi_backend.c:3331:26: warning: ‘_PyUnicode_get_wstr_length’ is deprecated [-Wdeprecated-declarations]
3331 | PyUnicode_GET_SIZE(ob), ct->ct_name);
| ^~~~~~~~~~~~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:446:26: note: declared here
446 | static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
c/_cffi_backend.c:3331:26: warning: ‘PyUnicode_AsUnicode’ is deprecated [-Wdeprecated-declarations]
3331 | PyUnicode_GET_SIZE(ob), ct->ct_name);
| ^~~~~~~~~~~~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:580:45: note: declared here
580 | Py_DEPRECATED(3.3) PyAPI_FUNC(Py_UNICODE *) PyUnicode_AsUnicode(
| ^~~~~~~~~~~~~~~~~~~
c/_cffi_backend.c:3331:26: warning: ‘_PyUnicode_get_wstr_length’ is deprecated [-Wdeprecated-declarations]
3331 | PyUnicode_GET_SIZE(ob), ct->ct_name);
| ^~~~~~~~~~~~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:446:26: note: declared here
446 | static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
c/_cffi_backend.c: In function ‘b_complete_struct_or_union’:
c/_cffi_backend.c:4251:17: warning: ‘PyUnicode_GetSize’ is deprecated [-Wdeprecated-declarations]
4251 | do_align = PyText_GetSize(fname) > 0;
| ^~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:177:43: note: declared here
177 | Py_DEPRECATED(3.3) PyAPI_FUNC(Py_ssize_t) PyUnicode_GetSize(
| ^~~~~~~~~~~~~~~~~
c/_cffi_backend.c:4283:13: warning: ‘PyUnicode_GetSize’ is deprecated [-Wdeprecated-declarations]
4283 | if (PyText_GetSize(fname) == 0 &&
| ^~
In file included from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:177:43: note: declared here
177 | Py_DEPRECATED(3.3) PyAPI_FUNC(Py_ssize_t) PyUnicode_GetSize(
| ^~~~~~~~~~~~~~~~~
c/_cffi_backend.c:4353:17: warning: ‘PyUnicode_GetSize’ is deprecated [-Wdeprecated-declarations]
4353 | if (PyText_GetSize(fname) > 0) {
| ^~
In file included from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:177:43: note: declared here
177 | Py_DEPRECATED(3.3) PyAPI_FUNC(Py_ssize_t) PyUnicode_GetSize(
| ^~~~~~~~~~~~~~~~~
c/_cffi_backend.c: In function ‘prepare_callback_info_tuple’:
c/_cffi_backend.c:5214:5: warning: ‘PyEval_InitThreads’ is deprecated [-Wdeprecated-declarations]
5214 | PyEval_InitThreads();
| ^~~~~~~~~~~~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/Python.h:130,
from c/_cffi_backend.c:2:
/opt/conda/envs/cho_env/include/python3.10/ceval.h:122:37: note: declared here
122 | Py_DEPRECATED(3.9) PyAPI_FUNC(void) PyEval_InitThreads(void);
| ^~~~~~~~~~~~~~~~~~
c/_cffi_backend.c: In function ‘b_callback’:
c/_cffi_backend.c:5255:5: warning: ‘ffi_prep_closure’ is deprecated: use ffi_prep_closure_loc instead [-Wdeprecated-declarations]
5255 | if (ffi_prep_closure(closure, &cif_descr->cif,
| ^~
In file included from c/_cffi_backend.c:15:
/opt/conda/envs/cho_env/include/ffi.h:347:1: note: declared here
347 | ffi_prep_closure (ffi_closure*,
| ^~~~~~~~~~~~~~~~
In file included from /opt/conda/envs/cho_env/include/python3.10/unicodeobject.h:1046,
from /opt/conda/envs/cho_env/include/python3.10/Python.h:83,
from c/_cffi_backend.c:2:
c/ffi_obj.c: In function ‘_ffi_type’:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:744:29: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
744 | #define _PyUnicode_AsString PyUnicode_AsUTF8
| ^~~~~~~~~~~~~~~~
c/_cffi_backend.c:72:25: note: in expansion of macro ‘_PyUnicode_AsString’
72 | # define PyText_AS_UTF8 _PyUnicode_AsString
| ^~~~~~~~~~~~~~~~~~~
c/ffi_obj.c:191:32: note: in expansion of macro ‘PyText_AS_UTF8’
191 | char *input_text = PyText_AS_UTF8(arg);
| ^~~~~~~~~~~~~~
c/lib_obj.c: In function ‘lib_build_cpython_func’:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:744:29: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
744 | #define _PyUnicode_AsString PyUnicode_AsUTF8
| ^~~~~~~~~~~~~~~~
c/_cffi_backend.c:72:25: note: in expansion of macro ‘_PyUnicode_AsString’
72 | # define PyText_AS_UTF8 _PyUnicode_AsString
| ^~~~~~~~~~~~~~~~~~~
c/lib_obj.c:129:21: note: in expansion of macro ‘PyText_AS_UTF8’
129 | char *libname = PyText_AS_UTF8(lib->l_libname);
| ^~~~~~~~~~~~~~
c/lib_obj.c: In function ‘lib_build_and_cache_attr’:
/opt/conda/envs/cho_env/include/python3.10/cpython/unicodeobject.h:744:29: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
744 | #define _PyUnicode_AsString PyUnicode_AsUTF8
| ^~~~~~~~~~~~~~~~
c/_cffi_backend.c:71:24: note: in expansion of macro ‘_PyUnicode_AsString’
71 | # define PyText_AsUTF8 _PyUnicode_AsString /* PyUnicode_AsUTF8 in Py3.3 */
| ^~~~~~~~~~~~~~~~~~~
c/lib_obj.c:208:15: note: in expansion of macro ‘PyText_AsUTF8’
208 | char *s = PyText_AsUTF8(name);
| ^~~~~~~~~~~~~
In file included from c/cffi1_module.c:16,
from c/_cffi_backend.c:6636:
c/lib_obj.c: In function ‘lib_getattr’:
c/lib_obj.c:506:7: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
506 | p = PyText_AsUTF8(name);
| ^
In file included from c/cffi1_module.c:19,
from c/_cffi_backend.c:6636:
c/call_python.c: In function ‘_get_interpstate_dict’:
c/call_python.c:20:30: error: dereferencing pointer to incomplete type ‘PyInterpreterState’ {aka ‘struct _is’}
20 | builtins = tstate->interp->builtins;
| ^~
c/call_python.c: In function ‘_ffi_def_extern_decorator’:
c/call_python.c:73:11: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
73 | s = PyText_AsUTF8(name);
| ^
error: command '/usr/bin/gcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for cffi
Running setup.py clean for cffi
Failed to build cffi
ERROR: Could not build wheels for cffi, which is required to install pyproject.toml-based projects
How can I fix or bypass this?
I had the same issue. Root cause: the python-magic library does not include the required binaries for Windows, macOS, and Linux, whereas the python-magic-bin fork does include them.
Note that python-libmagic (which you tried) did not work for me either. Use python-magic-bin instead.
So, try the following solution (found on this GitHub issue page), which worked for me:
# uninstall what you initially tried, to avoid conflicts
pip uninstall python-libmagic
pip uninstall python-magic
# install the working one
pip install python-magic-bin
If you are using conda (instead of PyPI), you can use conda install -c conda-forge libmagic, as per this GitHub issue page.
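After either install route, a quick check along these lines may help confirm that Python can actually reach libmagic before you rerun the loader. This is only a sketch: the helper name libmagic_available is made up for illustration, and it assumes the installed `magic` module is python-magic or python-magic-bin (other packages that ship a `magic` module may not expose from_buffer).

```python
import importlib


def libmagic_available() -> bool:
    """Return True if `magic` imports and can sniff a MIME type from bytes."""
    try:
        # python-magic typically raises ImportError at import time when it
        # cannot locate the libmagic shared library; OSError is caught too,
        # in case loading the library itself fails.
        magic = importlib.import_module("magic")
    except (ImportError, OSError):
        return False

    # Assumption: the module is python-magic (or python-magic-bin), which
    # provides from_buffer(); treat anything else as "not usable".
    from_buffer = getattr(magic, "from_buffer", None)
    if from_buffer is None:
        return False

    try:
        # Any non-empty MIME string means libmagic answered.
        return bool(from_buffer(b"<html><body>hi</body></html>", mime=True))
    except Exception:
        return False


print("libmagic usable:", libmagic_available())
```

If this prints False in the same environment (or notebook kernel) that runs UnstructuredURLLoader, the "libmagic is unavailable" warning is likely to persist.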