
SonarQube uses incorrect (?) ElasticSearch query to get ScmAccountToUser


I am running SonarQube 5.3 on Windows with an MSSQL backend.

When creating new issues, SonarQube queries its ElasticSearch user index to find the login of the author that "git blame" reports for the line on which the issue is raised.

The following happens in /server/sonar-server/src/main/java/org/sonar/server/computation/issue/IssueAssigner.java:

=> The "git blame" information returns the author of the affected line, in my example (anonymized):

steve smith@ca5553f7-9c36-c34d-916b-b330600317e9

=> This value is looked up in ScmAccountToUser, which lazily queries the ElasticSearch index "users". I added some debug output to print the ES query, which is:

{
  "size": 3,
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": {
            "term": {
              "active": true
            }
          },
          "should": [
            {
              "term": {
                "login": "steve smith@ca5553f7-9c36-c34d-916b-b330600317e9"
              }
            },
            {
              "term": {
                "email": "steve smith@ca5553f7-9c36-c34d-916b-b330600317e9"
              }
            },
            {
              "term": {
                "scmAccounts": "steve smith@ca5553f7-9c36-c34d-916b-b330600317e9"
              }
            }
          ]
        }
      }
    }
  }
}

This query returns 0 results.
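
To replay this lookup outside SonarQube with a different author string, the same request body can be generated programmatically. This is a minimal sketch in Python, not SonarQube's own code: the function name build_scm_lookup_query is hypothetical, and it only builds the body shown above without sending it anywhere.

```python
import json


def build_scm_lookup_query(scm_account, size=3):
    """Build the filtered query shown above for one SCM account value.

    Hypothetical helper: it mirrors the debug output from the question,
    it is not taken from the SonarQube code base.
    """
    fields = ("login", "email", "scmAccounts")
    return {
        "size": size,
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "bool": {
                        # Only active users are considered.
                        "must": {"term": {"active": True}},
                        # One exact-value term clause per candidate field.
                        "should": [{"term": {f: scm_account}} for f in fields],
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_scm_lookup_query(
        "steve smith@ca5553f7-9c36-c34d-916b-b330600317e9"
    )
    print(json.dumps(body, indent=2))
```

The printed JSON can then be sent to the "users" index with whatever ElasticSearch client is at hand.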

In contrast, when I enumerate the whole index, I find an entry that looks like it should match this user:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 39,
    "max_score": 1,
    "hits": [
      {
        // snip
      },
      // snip
      {
        "_index": "users",
        "_type": "user",
        "_id": "steve.smith",
        "_score": 1,
        "_source": {
          "createdAt": 1442988141642,
          "name": "Steve Smith",
          "active": true,
          "login": "steve.smith",
          "scmAccounts": [
            "
",
            "steve smith@ca5553f7-9c36-c34d-916b-b330600317e9
",
            "steve.smith@ca5553f7-9c36-c34d-916b-b330600317e9
"
          ],
          "email": "steve.smith@globodex.ch",
          "updatedAt": 1450088380632
        }
      },
      // snip
    ]
  }
}
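
Because the "scmAccounts" values in this dump are spread across several lines, it can help to check each indexed value for invisible characters instead of comparing by eye. The following is a minimal sketch, assuming the search response has been loaded into a Python dict; the function name is hypothetical, and the sample data assumes the line breaks are plain "\n" characters.

```python
def find_padded_scm_accounts(response):
    """Map each login to its scmAccounts entries that change when stripped.

    An entry is reported if it has leading or trailing whitespace
    (including line breaks) or consists of whitespace only.
    """
    suspicious = {}
    for hit in response.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        padded = [a for a in source.get("scmAccounts", []) if a != a.strip()]
        if padded:
            suspicious[source.get("login", hit.get("_id"))] = padded
    return suspicious


if __name__ == "__main__":
    # Sample modelled on the hit above; "\n" is an assumption about the
    # exact character behind the visible line breaks.
    sample = {
        "hits": {
            "hits": [
                {
                    "_id": "steve.smith",
                    "_source": {
                        "login": "steve.smith",
                        "scmAccounts": [
                            "\n",
                            "steve smith@ca5553f7-9c36-c34d-916b-b330600317e9\n",
                        ],
                    },
                }
            ]
        }
    }
    for login, entries in find_padded_scm_accounts(sample).items():
        # repr() makes line breaks and other whitespace visible.
        print(login, [repr(e) for e in entries])
```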

This issue is currently preventing my SonarQube instance from auto-assigning a lot of issues. I am in the process of figuring out when/how this broke, as some auto-assigning has previously succeeded.

Is this an error in the query or in the data? Can I work around this issue somehow?


Solution

  • It turns out that the problem was caused by the newlines in the "scmAccounts" field entries.

    After I manually re-added the SCM accounts in the SonarQube UI, the field was updated to

    "scmAccounts": [
      "steve smith@ca5553f7-9c36-c34d-916b-b330600317e9",
      "steve.smith@ca5553f7-9c36-c34d-916b-b330600317e9"
    ],

    after which the query returned the user and issue assignment succeeded.

    The newlines got into the fields in the first place because I had manually restored the "users" table on the SQL server from a backup SQL INSERT script.
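
Why a trailing newline is enough to break the lookup can be illustrated with a small sketch. It assumes that the queried fields are indexed as whole, unanalyzed values, so that a term filter only matches on exact equality; this is consistent with the behaviour observed here but is not confirmed from the index mapping. The function names are hypothetical, and "\n" stands in for whatever line-break characters the restore script introduced.

```python
def term_matches(indexed_values, queried):
    """Exact whole-value comparison, modelling a term filter on an
    unanalyzed field (an assumption about the index mapping)."""
    return queried in indexed_values


def clean_scm_accounts(values):
    """Strip surrounding whitespace and drop entries that end up empty."""
    return [v.strip() for v in values if v.strip()]


if __name__ == "__main__":
    author = "steve smith@ca5553f7-9c36-c34d-916b-b330600317e9"
    broken = [
        "\n",
        author + "\n",
        "steve.smith@ca5553f7-9c36-c34d-916b-b330600317e9\n",
    ]
    print(term_matches(broken, author))                      # False
    print(term_matches(clean_scm_accounts(broken), author))  # True
```

The cleaned list corresponds to what the index contained after the accounts were re-entered through the UI.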