mysqlsqljdbclogstashlogstash-jdbc

Can logstash's JDBC input plugin do multiple sql tasks?


Now there is a JDBC task read from db as

input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://${MYSQL_MAIN_HOST}/${MYSQL_DATABASE}"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    jdbc_page_size => 10000
    jdbc_paging_enabled => true
    jdbc_password => "${MYSQL_PASSWORD}"
    jdbc_user => "${MYSQL_USER}"
    schedule => "0 1 * * *"
    statement_filepath => "/usr/share/logstash/pipeline/sql/select_posts.sql"
    tracking_column => "updated_at"
    tracking_column_type => "numeric"
    use_column_value => true
    last_run_metadata_path => "/usr/share/logstash/jdbc_last_run/select_posts_last_value"
  }
}

/usr/share/logstash/pipeline/sql/select_posts.sql

SELECT
  id,
  title,
  body
FROM posts

This task is heavy when the body item goes to very large data. So I want to remove it at the first search as:

SELECT
  id,
  title
FROM posts

Then get the IDs and use them to find body again.

SELECT
  body
FROM posts
WHERE id in (IDs)

To set to output target. Even update the output target is also okay.

So can logstash read statement by statement in this case?


Solution

  • If you want to execute two separate sql queries using logstash, you can do it as follows

    SELECT
      id,
      title
    FROM posts
    
    SELECT
      body
    FROM posts
    WHERE id in (IDs)
    
    input {
      jdbc {
        jdbc_connection_string => "jdbc:mysql://${MYSQL_MAIN_HOST}/${MYSQL_DATABASE}"
        jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
        jdbc_page_size => 10000
        jdbc_paging_enabled => true
        jdbc_password => "${MYSQL_PASSWORD}"
        jdbc_user => "${MYSQL_USER}"
        schedule => "0 1 * * *"
        statement_filepath => <PATH TO query1.sql>  -----query1.sql
        tracking_column => "updated_at"
        tracking_column_type => "numeric"
        use_column_value => true
        last_run_metadata_path => "/usr/share/logstash/jdbc_last_run/select_posts_last_value"
      }
    
     jdbc {
        jdbc_connection_string => "jdbc:mysql://${MYSQL_MAIN_HOST}/${MYSQL_DATABASE}"
        jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
        jdbc_page_size => 10000
        jdbc_paging_enabled => true
        jdbc_password => "${MYSQL_PASSWORD}"
        jdbc_user => "${MYSQL_USER}"
        schedule => "0 1 * * *"
        statement_filepath => "<PATH TO query2.sql>"    -----query2.sql
        tracking_column => "updated_at"
        tracking_column_type => "numeric"
        use_column_value => true
        last_run_metadata_path => "/usr/share/logstash/jdbc_last_run/select_posts_last_value"
      }
    }