I developed a database with the ORM system to manage financial market data.
I am employing Python with SQLAlchemy and SQLite.
I need to retrieve the latest record for each asset from the market_data table, extract its date and time, request updated data from the server, and then insert the new data back into the market_data table. I plan to perform this operation multiple times per minute, so speed is essential. However, the current query takes several minutes to execute. I've identified that the filtering process for each asset is the primary bottleneck; without it, the query completes instantly but returns incorrect results.
I think storing each asset's data in a separate table would be more efficient, but I can't create a specific class for every asset since they can be added dynamically at runtime.
Is there a way to generate multiple tables from one ORM class? Alternatively, do you have any suggestions to enhance the database structure to improve the speed of this query?
The current models for the database:
class Date(Base):
__tablename__ = "dates"
id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True, unique=True)
date: Mapped[datetime.date] = mapped_column(Date, nullable=False, unique=True)
class Time(Base):
__tablename__ = "times"
id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True, unique=True)
time: Mapped[datetime.time] = mapped_column(Time, nullable=False, unique=True)
class Region(Base):
__tablename__ = "regions"
id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True, unique=True)
name: Mapped[str] = mapped_column(String(30), nullable=False, unique=True)
markets: Mapped[List["Market"]] = relationship()
class Market(Base):
__tablename__ = "markets"
id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True, unique=True)
name: Mapped[str] = mapped_column(String(20), nullable=False)
region: Mapped[int] = mapped_column(ForeignKey("regions.id"))
assets: Mapped[List["Asset"]] = relationship()
class Asset(Base):
__tablename__ = "assets"
id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True, unique=True)
symbol: Mapped[str] = mapped_column(String(10), nullable=False, unique=True)
name: Mapped[str] = mapped_column(String(30), nullable=True)
market: Mapped[int] = mapped_column(ForeignKey("markets.id"))
class MarketData(Base):
__tablename__ = "market_data"
id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True, unique=True)
date: Mapped[int] = mapped_column(ForeignKey("dates.id"))
time: Mapped[int] = mapped_column(ForeignKey("times.id"))
asset: Mapped[int] = mapped_column(ForeignKey("assets.id"))
opening: Mapped[float] = mapped_column(Float, nullable=False)
high: Mapped[float] = mapped_column(Float, nullable=False)
low: Mapped[float] = mapped_column(Float, nullable=False)
closing: Mapped[float] = mapped_column(Float, nullable=False)
volume: Mapped[float] = mapped_column(Float, nullable=True)
The query that I tested to take the last x record of one asset:
session.query(MarketData).filter_by(asset=asset.id).order_by(desc(MarketData.id)).limit(10).all()
mapped_class = type("Asset", (Asset, Base), {"__tablename__": "assets"})
>>> "assets" in Base.metadata.tables
True
>>> from sqlalchemy.sql.schema import Table
>>> isinstance(Base.metadata.tables["assets"], Table)
True
>>> Base.metadata.tables["assets"] is mapped_class.__table__
True
If you have a different table for each asset, just by changing the table name, you can create another ORM class referencing that table.
type("Asset", (Asset, Base), {"__tablename__": "another_assets"})
>>> isinstance(Base.metadata.tables["another_assets"], Table)
True
For a given table name, only one instance of Table
in the metadata
can exist at a time:
>>> type("Asset", (Asset, Base), {"__tablename__": "another_assets"})
{InvalidRequestError}InvalidRequestError("Table 'another_assets' is already defined for this MetaData instance. Specify 'extend_existing=True' to redefine options and columns on an existing Table object.")
>>> from sqlalchemy import select
>>> session.scalars(select(mapped_class).filter_by(name="asset_name")).one_or_none()
You can also use Table
instance here if created earlier:
session.scalars(select(Base.metadata.tables["assets"]).filter_by(name="asset_name")).one_or_none()
Just by loading the instance of Table
in metadata
is not enough to send queries. Your table must exist in the database as well. By using the ORM class created earlier, you can also create a table dynamically.
from sqlalchemy import create_engine
engine = create_engine(url=...)
mapped_class = type("Asset", (Asset, Base), {"__tablename__": "new_assets"})
with engine.begin() as conn:
mapped_class.__table__.create(bind=conn, checkfirst=True)