python xml excel lxml pyxll

Integrating Python into Excel using PyXLL: problems with the lxml module


I am new to Python. I am trying to look up the meaning of a word on the internet. The standalone Python code works fine.

    from lxml import html
    import requests
    url = "http://dictionnaire.reverso.net/francais-definition/"
    word = raw_input("please enter the word you want to translate ")
    url = url + word
    page = requests.get(url)
    tree= html.fromstring(page.text)
    translation = tree.xpath('//*[@id="ID0EYB"]/text()')
    print translation

Please note that the XPath I am using is for testing purposes only; it works with simple words like 'manger', 'gonfler', etc. The next step is to use the PyXLL add-in to expose the same lookup as an Excel function.

    from pyxll import xl_func
    from lxml import html
    import requests

    @xl_func("string x: string")
    def traduction(x):
        url = "http://dictionnaire.reverso.net/francais-definition/"
        url = url + x
        page = requests.get(url)
        tree = html.fromstring(page.text)
        # xpath() returns a list of matches, but the signature declares a string
        translation = tree.xpath('//*[@id="ID0EYB"]/text()')
        return translation[0] if translation else ""

After this, when I start Excel I get an error. The PyXLL log file describes it as follows (the French message "Le module spécifié est introuvable" means "The specified module could not be found"):

  2014-09-09 17:02:41,845 - ERROR : Error importing 'worksheetfuncs': DLL load failed: Le module spécifié est introuvable.
  2014-09-09 17:02:41,845 - ERROR : Traceback (most recent call last):
  2014-09-09 17:02:41,845 - ERROR :   File "pyxll", line 791, in _open
  2014-09-09 17:02:41,845 - ERROR :   File "\pyxll\examples\worksheetfuncs.py", line 317, in <module>
  2014-09-09 17:02:41,845 - ERROR :     from lxml import html
  2014-09-09 17:02:41,846 - ERROR :   File "C:\Python27\lib\site-packages\lxml\html\__init__.py", line 42, in <module>
  2014-09-09 17:02:41,846 - ERROR :     from lxml import etree
  2014-09-09 17:02:41,846 - ERROR : ImportError: DLL load failed: Le module spécifié est introuvable.
  2014-09-09 17:02:41,888 - WARNING : pydevd failed to import - eclipse debugging won't work
  2014-09-09 17:02:41,888 - WARNING : Check the eclipse path in \pyxll\examples\tools\eclipse_debug.pyc
  2014-09-09 17:02:41,890 - INFO : callbacks.license_notifier: This copy of PyXLL is for evaluation or non-commercial use only

I have used translation sites with APIs for similar tasks, and that worked fine. The real problem here is the parsing, for which I used lxml; it seems that lxml and PyXLL don't work together. Any help would be appreciated.
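
One way to narrow down an import error like this is to compare the Python environment that Excel loads with the one used on the command line. The sketch below is only a diagnostic aid, and it rests on an assumption the log does not confirm: that the two environments differ (for example in 32-bit versus 64-bit builds, or in the directories on PATH). The names `describe_interpreter` and `format_report` are hypothetical helpers, not part of PyXLL or lxml.

```python
import os
import struct
import sys


def describe_interpreter():
    """Collect details that can differ between two Python environments."""
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        # The pointer size distinguishes 32-bit from 64-bit builds
        "bits": struct.calcsize("P") * 8,
        # Directories listed in the PATH environment variable
        "path_entries": os.environ.get("PATH", "").split(os.pathsep),
    }


def format_report(info):
    """Turn the collected details into plain text, one item per line."""
    lines = ["%s: %s" % (key, info[key])
             for key in ("executable", "version", "bits")]
    lines.extend("PATH entry: %s" % entry for entry in info["path_entries"])
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(describe_interpreter()))
```

Running this once as a standalone script and once from a module loaded by PyXLL (writing the report to a file or the log) gives two reports to compare side by side.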


Solution

  • I switched to urllib2 and BeautifulSoup, which work well with PyXLL. Below is the working code for an Excel array function that takes multiple words and returns two meanings for each. However, the CSS selector I am using is too restrictive, and I have yet to find a pattern in the website's markup that works for any word.

    from pyxll import xl_func
    import urllib2
    from bs4 import BeautifulSoup

    @xl_func("var[] x: var[]")
    def dictionnaire(x):
        # x is a 2D range; each row holds one word in its first column
        meanings = []
        for cells in x:
            word = cells[0]
            url = "http://dictionnaire.reverso.net/francais-definition/" + word
            page = urllib2.urlopen(url)
            soup = BeautifulSoup(page)
            results = soup.select("#ID0ENC")
            # Two meanings per word; raises IndexError if fewer than two match
            row = [results[0].get_text(), results[1].get_text()]
            # Append inside the loop so every input word gets its own output row
            meanings.append(row)
        return meanings
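
The shape an array function has to return (one output row per input row, each with the same number of cells) can be kept separate from the scraping, which makes it testable without Excel or a network connection. This is a minimal sketch, not part of the code above: `build_rows`, `lookup` and `fake_lookup` are hypothetical names, and padding missing meanings with empty strings is an assumption about how gaps should appear in the sheet.

```python
def build_rows(cells, lookup, width=2):
    """Return one row of `width` cells for every row of the input range.

    cells  -- 2D list as passed to an array function; the word is in column 0
    lookup -- callable taking a word and returning a list of meanings
    """
    rows = []
    for cell_row in cells:
        word = cell_row[0]
        # Skip the lookup for empty cells instead of requesting a bare URL
        meanings = list(lookup(word)) if word else []
        row = meanings[:width]
        # Pad so that every row has the same number of columns
        row.extend([""] * (width - len(row)))
        rows.append(row)
    return rows


def fake_lookup(word):
    # Stand-in for the scraping code; the strings are placeholders, not real data
    canned = {
        "manger": ["meaning 1 of manger", "meaning 2 of manger"],
        "gonfler": ["meaning 1 of gonfler"],
    }
    return canned.get(word, [])
```

In the Excel function, `lookup` would be a small function that fetches the page and returns the text of the selected elements; a word with fewer than two matches then yields padded cells rather than an IndexError.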