mysqldatabasedistributed-computingrdbmsmysql-cluster

NDB cluster7.5(MySQL 5.7) is taking more time in fetching data, if number of rows in table is large(2 million)


I am trying to setup NDB cluster(MYsql 5.7) for one of my real time application(with a high volume of read and write concurrency).

My Setup -

3 Data Nodes 1 Management Node 1 MySQL node

All nodes are of amazon EC2 r3.4xlarge type. OS - centos 7

i created one table and partioned by primary key to make sure same primary key data goes in single node.

Table Schema -
CREATE TABLE ContactsAgentContacts(
       uniqueid integer not null,
       did varchar(32) not null,
       nId varchar(50),
       companyname varchar(50),
primary key (uniqueid,did)
) 

ENGINE=NDBCLUSTER PARTITION BY KEY(did);

Now i populated my table with 2 million of record in such a way that Each did contains 1K record.

Query Fired - SELECT DISTINCT ContactsAgentContacts.companyname AS 'fullname' from ContactsAgentContacts where did='xyz';

performance getting -

with single concurrency - fetching 1k record of one did

**with 1 read concurrency - 800 ms
with 25 read concurrency - 1.5 sec
with 50 read concurrency - 3 sec**

As i am trying to develop a real time system any value more then 300 ms is too much for me and this time is increaasing as number of rows are increasing in table. Please let me know how to optimize my solution.

My configiration .
config.ini

[tcp default]
SendBufferMemory=2M
ReceiveBufferMemory=2M

[ndb_mgmd default]
# Directory for MGM node log files
DataDir=/var/lib/mysql-cluster

[ndb_mgmd]
#Management Node db1
HostName=10.2.25.129
NodeId=1

[ndbd default]
NoOfReplicas=1
DataMemory=2000M
IndexMemory=300M
LockPagesInMainMemory=1
#Directory for Data Node
DataDir=/var/lib/mysql-cluster
NoOfFragmentLogFiles=300
MaxNoOfConcurrentOperations=100000
SchedulerSpinTimer=400
SchedulerExecutionTimer=100
RealTimeScheduler=1
TimeBetweenGlobalCheckpoints=1000
TimeBetweenEpochs=200
RedoBuffer=32M

[ndbd]
#Data Node db2
HostName=10.2.18.81
NodeId=2
#LockExecuteThreadToCPU=1
LockMaintThreadsToCPU=0

[ndbd]
#Data Node db3
HostName=10.2.20.15
NodeId=3
#LockExecuteThreadToCPU=1
LockMaintThreadsToCPU=0

[ndbd]
#Data Node db4
HostName=10.2.24.28
NodeId=4
#LockExecuteThreadToCPU=1
LockMaintThreadsToCPU=0

[mysqld]
#SQL Node db5
HostName=10.2.29.42
NodeId=5

Solution

  • Switch to primary key (did,uniqueid) means that the ordered index on primary key will be used for scan instead of a full table scan on one of the partitions.

    This should improve things by magnitudes.