python · mysql · grafana · test-data

How to generate fake real-time data continuously for MySQL?


I need to visualise real-time data from MySQL in Grafana. I initially used a Python seed script to insert random numbers, but now I want continuous values to keep arriving in MySQL.

Here is the Python code I used initially:

import random
import time

import mysql.connector
from mysql.connector.errors import InterfaceError

POSSIBLE_STATES = ['ACTIVE', 'INACTIVE']

class MySqlSeeder:

    def __init__(self):
        config = {
            'user': 'root',
            'password': 'something',
            'host': '192.168.99.100' if script_runs_within_container() else 'localhost',
            'port': '3306',
            'database': 'team'
        }
        while not hasattr(self, 'connection'):
            try:
                self.connection = mysql.connector.connect(**config)
                self.cursor = self.connection.cursor()
            except InterfaceError:
                print("MySQL Container has not started yet. Sleep and retry...")
                time.sleep(1)

    def seed(self):
        print("Clearing old data...")
        self.drop_user_table()
        print("Start seeding...")
        self.create_user_table()
        self.insert_users()

        self.connection.commit()
        self.cursor.close()
        self.connection.close()
        print("Done")

    def create_user_table(self):
        sql = '''
        CREATE TABLE users(
          id INT PRIMARY KEY AUTO_INCREMENT,
          number INT
        );
        '''
        self.cursor.execute(sql)

    def insert_users(self):
        for _ in range(300):
            sql = '''
            INSERT INTO users (number)
            VALUES (%(number)s);
            '''
            user_data = {
                'number': random.randint(1,100)
            }
            self.cursor.execute(sql, user_data)

    def drop_user_table(self):
        self.cursor.execute('DROP TABLE IF EXISTS users;')


def script_runs_within_container():
    with open('/proc/1/cgroup', 'r') as cgroup_file:
        return 'docker' in cgroup_file.read()


MySqlSeeder().seed()

Below is the docker-compose file

version: '3.3'
services:
  mysql:
    image: mysql:latest
    ports:
      - "3306:3306"
    environment:
      MYSQL_ROOT_PASSWORD: something
  grafana:
    image: grafana/grafana
    restart: always
    ports:
      - "3000:3000"
    depends_on:
      - mysql
  mysql_seeding:
    build: ./DockerFiles
    depends_on:
      - mysql

volumes:
  grafana-storage:
  db_data:

What could be the best way to perform continuous real-time data generation?


Solution

  • I presume you want a table containing random data generated every so often. Here's such a table. Each row has an id, a timestamp called ts, and two float values val1 and val2.

    -- MySQL has no CREATE OR REPLACE TABLE (that's MariaDB syntax), so drop first
    DROP TABLE IF EXISTS realtime;
    CREATE TABLE realtime (
        id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
        ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
        val1 FLOAT NOT NULL DEFAULT '0.0',
        val2 FLOAT NOT NULL DEFAULT '0.0',
        PRIMARY KEY (id),
        INDEX time_vals (ts, val1, val2)
    )
    COLLATE='utf8mb4_general_ci'
    ENGINE=InnoDB;
    

    Then, you need a way to insert a new row into that table every so often. That's done with a MySQL EVENT object. Here's an example that puts data into that realtime table I just defined ten times a minute (every six seconds).

    It also DELETEs rows older than a day, so your database won't fill up with stale test data if you forget to disable or drop the event. The index on ts speeds up that DELETE.

    DELIMITER //
    -- MySQL has no CREATE OR REPLACE EVENT (that's MariaDB syntax), so drop first
    DROP EVENT IF EXISTS `random_realtime`//
    CREATE EVENT `random_realtime`
        ON SCHEDULE
            EVERY 6 SECOND 
        ON COMPLETION PRESERVE
        ENABLE
        COMMENT 'Generating random timeseries test data. Please delete me in production.' 
    DO BEGIN
        INSERT INTO realtime (val1, val2) VALUES (RAND(), RAND());
        DELETE FROM realtime WHERE ts <= NOW() - INTERVAL 1 DAY;
    END//
    DELIMITER ;
    

    Finally, you need to tell MySQL to run scheduled events. This command turns on the event scheduler; it requires a privileged account such as root.

    SET GLOBAL event_scheduler=ON;
    

    You can run these statements from the initialization of your Python program, where you presently create your table.
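If you'd rather not paste these into a mysql client by hand, here is a sketch of running them from Python with mysql.connector, like the seeder in the question (the `install_realtime_event` name is made up, and the connection settings are assumptions copied from the question). One wrinkle: `DELIMITER` is a command of the `mysql` command-line client, not SQL, so through a driver the `BEGIN ... END` event body is sent as a single statement with no `DELIMITER` lines.

```python
# Sketch: install the table, the event, and the scheduler setting from
# Python. Hypothetical function name; connection settings copied from
# the question. Each entry is sent as one statement -- no DELIMITER.
STATEMENTS = [
    "DROP TABLE IF EXISTS realtime",
    """CREATE TABLE realtime (
        id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
        ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
        val1 FLOAT NOT NULL DEFAULT 0.0,
        val2 FLOAT NOT NULL DEFAULT 0.0,
        PRIMARY KEY (id),
        INDEX time_vals (ts, val1, val2)
    ) ENGINE=InnoDB""",
    "DROP EVENT IF EXISTS random_realtime",
    """CREATE EVENT random_realtime
        ON SCHEDULE EVERY 6 SECOND
        ON COMPLETION PRESERVE
        ENABLE
        COMMENT 'Test data generator. Please delete me in production.'
    DO BEGIN
        INSERT INTO realtime (val1, val2) VALUES (RAND(), RAND());
        DELETE FROM realtime WHERE ts <= NOW() - INTERVAL 1 DAY;
    END""",
    "SET GLOBAL event_scheduler = ON",
]


def install_realtime_event(connection):
    """Run the statements above on an open mysql.connector connection."""
    cursor = connection.cursor()
    for statement in STATEMENTS:
        cursor.execute(statement)
    connection.commit()
    cursor.close()


if __name__ == '__main__':
    import mysql.connector

    # Connection settings copied from the question's seeder; adjust to taste.
    connection = mysql.connector.connect(
        user='root', password='something',
        host='localhost', port=3306, database='team')
    install_realtime_event(connection)
    connection.close()
```

Creating events needs the EVENT privilege, and SET GLOBAL needs SUPER (or SYSTEM_VARIABLES_ADMIN on MySQL 8), so run this as root or grant accordingly.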