langchainpy-langchaingpt-4

SQLDatabaseChain with GPT-4 returns SQL syntax error


I wrote a program trying to query local sqlite db, and it worked fine for text-davinci-003:

llm = OpenAI(model_name="text-davinci-003", verbose=True)

However, after I changed it to GPT-4:

llm = ChatOpenAI(model_name="gpt-4-0613", verbose=True)
...
db_chain = SQLDatabaseChain.from_llm(
    llm,
    db,
    verbose=True,
    use_query_checker=True,
    return_intermediate_steps=True,
)

with get_openai_callback() as cb:
    # No intermediate steps
    # result = db_chain.run(query)

    # If intermediate steps are needed...
    result = db_chain(query)
    intermediate_steps = result["intermediate_steps"]

    print("")

    try:
        sql_result = intermediate_steps[3]
        print("SQL Query Result:")
        print(json.dumps(ast.literal_eval(sql_result), indent=4))
    except Exception as e:
        print(f"Error while parsing the SQL result:\n{e}")
        print("")
        print(intermediate_steps)
    
    print("")

    print(cb)

... everything still works, except the final SQL query contained more text in addition to SQL query, i.e.:

> Entering new SQLDatabaseChain chain...
Have the user visited some news website? If yes, list all the urls.
DO NOT specify timestamp unless query said so.
DO NOT specify limit unless query said so.
SQLQuery:The original query appears to be correct as it doesn't seem to have any of the common mistakes listed. Here is the same query:

SELECT "URL" FROM browsinghistory WHERE "Title" LIKE '%news%'Traceback (most recent call last):
  File "C:\path\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1968, in _exec_single_context
    self.dialect.do_execute(
  File "C:\path\Python311\Lib\site-packages\sqlalchemy\engine\default.py", line 920, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: near "The": syntax error

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\path\run.py", line 292, in <module>
    database_mode(llm, filepath, delimiter)
  File "D:\path\run.py", line 156, in database_mode
    llm.query_database(db_path=db_path, query=query)
  File "D:\path\modules\chatbot.py", line 220, in query_database
    result = db_chain(query)
             ^^^^^^^^^^^^^^^
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\chains\base.py", line 140, in __call__
    raise e
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\chains\base.py", line 134, in __call__
    self._call(inputs, run_manager=run_manager)
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\chains\sql_database\base.py", line 181, in _call
    raise exc
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\chains\sql_database\base.py", line 151, in _call
    result = self.database.run(checked_sql_command)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\sql_database.py", line 334, in run
    cursor = connection.execute(text(command))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1413, in execute
    return meth(
           ^^^^^
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\sql\elements.py", line 483, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1637, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1846, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1987, in _exec_single_context
    self._handle_dbapi_exception(
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 2344, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1968, in _exec_single_context
    self.dialect.do_execute(
  File "C:\path\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\engine\default.py", line 920, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) near "The": syntax error
[SQL: The original query appears to be correct as it doesn't seem to have any of the common mistakes listed. Here is the same query:

SELECT "URL" FROM browsinghistory WHERE "Title" LIKE '%news%']
(Background on this error at: https://sqlalche.me/e/20/e3q8)

I know that I can try to tell it not to return anything but the query (might be unstable. though...), but why isn't this work for GPT-4, while it works for text-davinci-003?


Update:

Tried with a different query, and the problem remains:

> Entering new SQLDatabaseChain chain...
List all websites visited by the user.
DO NOT specify timestamp unless query said so.
DO NOT specify limit unless query said so.
SQLQuery:The original query seems to be correct. It is simply selecting the "URL" column from the "browsinghistory" table. There is no misuse of any functions, no data type mismatch, no joins, etc.

Reproducing the original query:

SELECT "URL" FROM browsinghistory
...
...
...

Solution

  • I still have no idea why the query result contains contents other than SQL query, but I implemented SQLDatabaseToolkit to run the query and worked. SQLDatabaseToolkit can help to deal with some parsing errors like this.

    toolkit = SQLDatabaseToolkit(db=db, llm=self.llm)
    
    agent_executor = create_sql_agent(
        llm=self.llm,
        toolkit=toolkit,
        verbose=True
    )
    
    with get_openai_callback() as cb:
        res = agent_executor.run(query)
        print("SQL Query Result:")
        print(res)
        print("")
        print(cb)