I'm getting what I think are needless CROSS JOINs when I'm doing a select IN SUBQUERY, which is hurting performance. I'm using Postgres if that makes a difference.
I'm aiming to generate the following query
select a1.first_name from author a1
where a1.last_name = ?
and (a1.id in
(select distinct b.author_id
from book b
where (b.published_on between ? and ?)
group by b.author_id
having count(b.author_id) >= 2))
But I get
select a1.first_name from author a1
where a1.last_name = ?
and (a1.id in
(select distinct b.author_id
from book b
cross join author a2 where b.author_id = a2.id -- <<< I don't want this cross join!
and (b.published_on between ? and ?)
group by b.author_id
having count(b.author_id) >= 2))
Code
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<String> cq = cb.createQuery(Author.class);
Root<Author> authorRoot = cq.from(Author.class);
Subquery<Long> countSubquery = cq.subquery(Long.class);
Root<Book> bookRoot = countSubquery.from(Book.class);
Expression<Long> count = cb.count(bookRoot.get(Book_.author));
countSubquery.select(bookRoot.get(Book_.AUTHOR))
.distinct(true)
.where(cb.between(bookRoot.get(Book_.publishedOn),
LocalDate.of(2021, MARCH, 1),
LocalDate.of(2021, MARCH, 31)))
.groupBy(bookRoot.get(Book_.author))
.having(cb.greaterThanOrEqualTo(count, 2L));
cq.where(
cb.equal(authorRoot.get(Author_.lastName), "Smith"),
cb.in(authorRoot.get(Author_.ID)).value(countSubquery));
cq.select(authorRoot.get(Author_.FIRST_NAME));
TypedQuery<String> query = entityManager.createQuery(cq);
return query.getResultList();
In reality I'm generating the queries from a user driven query builder, this code recreates the exact problem I'm having.
When using the query builder the user could end up with multiple select in subqueries so I need this to perform as well as possible.
I don't see why I should need any join/cross join for my query to work.
Entities
@Entity
public class Author {
@Id
@GeneratedValue
private Long id;
private String firstName;
private String lastName;
@OneToMany(mappedBy = "author", cascade = CascadeType.ALL, fetch = FetchType.LAZY)
private Set<Book> books;
}
@Entity
public class Book {
@Id
@GeneratedValue
private Long id;
private String name;
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "author_id")
private Author author;
private LocalDate publishedOn;
}
This expression: bookRoot.get(Book_.author)
means you're joining Author
to Book
implicitly.
To get rid of the extra join, you would have to either use a native query, or map Book.author_id
once more as a simple column:
@Column(name = "author_id", insertable = false, updatable = false)
private Long authorId;
And use Book_.authorId
instead.