I have a stream of Java objects and I would like to query them using SQL-like syntax with reasonable performance (comparable to querying a regular table without any indexes in an RDBMS, i.e. a one-time full table scan).
I like the Stream API map/filter/etc., but the query would also be an input, so I can't hard-code it in Java.
Is it possible to do this without inserting the incoming data into a "real" database (and then removing them later to save space)?
I was thinking about using an in-memory database like H2 or SQLite, but then I would still have to insert the data, and they really are not for streaming.
Are there any existing libraries/solutions for something like what I'm trying to do?
class A {
private String name;
/* ... */
}
Stream<A> myStream /* = ... */ ;
Stream<Integer> result = query(myStream, "select count(*) as number_of_x from :myStream where name = 'x'",
(rs, i) -> rs.getInt("number_of_x"));
/* result.toList() will contain one element at the end */
I have a stream of Java objects and I would like to query them
What you want doesn't make a lot of sense.
Streams are iterators, not containers of data. See the API documentation:
No storage. A stream is not a data structure that stores elements; instead, it conveys elements from a source such as a data structure, an array, a generator function, or an I/O channel, through a pipeline of computational operations.
So streams aren't a means of storing data.
And once a stream is consumed, it can't be used anymore. You can't query a stream like a database.
A stream is an internal iterator that can be executed only once.
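The single-use behaviour is easy to observe. Here's a minimal sketch (the class and method names are made up for illustration) that runs two terminal operations on the same stream:

```java
import java.util.stream.Stream;

class StreamReuseDemo {

    // Runs two terminal operations on the same stream and reports what happens.
    static String consumeTwice() {
        Stream<String> names = Stream.of("Alise", "Bob", "Carol");
        long first = names.count();      // first terminal operation: fine
        try {
            names.count();               // second terminal operation on the same stream
            return "second pass succeeded";
        } catch (IllegalStateException e) {
            // "stream has already been operated upon or closed"
            return "first pass counted " + first + ", second pass failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(consumeTwice());
    }
}
```

So every query needs a fresh stream from the source; a second query over the same stream instance is not possible.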
If you're interested in implementing a parser that translates SQL-like queries into Predicates and Functions, which would then be applied to a stream, then sure, you can try. For very simple queries, it's definitely doable.
But it's not a trivial task. Even for simple queries (similar to the one in the question), a fully-fledged parser would require a lot of effort both to implement and to test. I doubt whether it would pay off.
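To make the goal concrete: for the query from the question, the parser would have to produce the equivalent of the hand-written pipeline below. This is only a sketch; the class `HandWrittenQuery`, the constructor and the getter on `A` are assumptions added for illustration.

```java
import java.util.stream.Stream;

class HandWrittenQuery {

    // Stand-in for class A from the question (constructor and getter are assumed).
    static class A {
        private final String name;
        A(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Hard-coded equivalent of:
    // select count(*) as number_of_x from :myStream where name = 'x'
    static long countNamedX(Stream<A> myStream) {
        return myStream
                .filter(a -> "x".equals(a.getName()))  // WHERE name = 'x'
                .count();                               // count(*)
    }

    public static void main(String[] args) {
        System.out.println(countNamedX(Stream.of(new A("x"), new A("y"), new A("x"))));
    }
}
```

The WHERE clause becomes a Predicate and the aggregate becomes a terminal operation; the hard part is building those pieces from a query string at runtime.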
Here's a very, very dumb illustration which makes use of the Reflection API and regular expressions.
The demo parser below is not capable of doing much; a proper implementation would be far more complex.
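import java.lang.reflect.Field;
import java.util.function.Predicate;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public class QueryParser {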
public static <T> long getAsCount(String query, Stream<T> stream, Class<T> tClass) { // overloaded versions for primitive streams
StreamOperation<T> operation = StreamOperation.fromQuery(query, tClass);
return stream
.filter(operation.getPredicate())
.count();
}
private static class StreamOperation<T> {
public static final Pattern WHERE = Pattern.compile("(?<=(?i)where).*?(?=(?i)group)|(?<=(?i)where).*?(?=$)");
private final Predicate<T> predicate;
// more properties omitted
private StreamOperation(Predicate<T> predicate) {
this.predicate = predicate;
}
public static <T> StreamOperation<T> fromQuery(String query, Class<T> tClass) {
Predicate<T> where = WHERE.matcher(query).results()
.map(MatchResult::group)
.findFirst()
.map(conditions -> parseConditions(conditions, tClass))
.orElse(t -> true);
// working on other properties
StreamOperation<T> so = new StreamOperation<>(where);
return so;
}
public Predicate<T> getPredicate() {
return predicate;
}
public static <T> Predicate<T> parseConditions(String conditions, Class<T> tClass) {
String[] or = conditions.split("(?i)\\s+or\\s+"); // split by OR (as a separate word, so names like 'Carol' are not split)
Predicate<T> orPredicate = t -> false; // base predicate for OR
for (String jointCondition: or) {
String[] and = jointCondition.split("(?i)\\s+and\\s+"); // split by AND (as a separate word)
Predicate<T> andPredicate = t -> true; // base predicate for AND
for (String condition: and) {
Predicate<T> next = null;
// parse each condition
try {
next = Conditions.parseCondition(condition, tClass);
} catch (NoSuchFieldException e) {
// keep the original exception as the cause instead of printing it
throw new IllegalArgumentException("Invalid condition or type:\n"
+ condition + " for Type " + tClass.getCanonicalName(), e);
}
andPredicate = andPredicate.and(next); // join conditions together using Predicate.and()
}
orPredicate = orPredicate.or(andPredicate); // join conditions together using Predicate.or()
}
return orPredicate;
}
}
private static class Conditions {
public static <T> Predicate<T> parseCondition(String conditions, Class<T> tClass) throws NoSuchFieldException {
// TODO add logic for other conditions
// Only equality comparison is implemented, for demo purposes
String[] equals = conditions.split("=");
Field field = tClass.getDeclaredField(equals[0].strip());
field.setAccessible(true);
return field.getType().isPrimitive() ? // note: Field.getDouble() rejects boolean fields, so boolean columns are not supported by this demo
compareAsNumericType(field, equals) : compareAsString(field, equals);
}
public static <T> Predicate<T> compareAsNumericType(Field field, String[] equals) {
return t -> {
try {
return field.getDouble(t) == Double.parseDouble(equals[1].strip());
} catch (IllegalAccessException e) {
e.printStackTrace();
return false;
}
};
}
public static <T> Predicate<T> compareAsString(Field field, String[] equals) {
return t -> {
try {
return equals[1].strip().replace("'", "").equals(field.get(t)); // literal first, so a null field value doesn't throw
} catch (IllegalAccessException e) {
e.printStackTrace();
return false;
}
};
}
}
// TODO implement methods for retrieving other results
// public static <T, R> List<R> getAsList(String query, Stream<T> stream, Class<T> tClass) { // overloaded versions for primitive streams
//
// StreamOperation<T> operation = StreamOperation.fromQuery(query, tClass);
//
// return stream
// .filter(operation.getPredicate())
// .map(operation.getMapper()) // not implemented
// .toList();
// }
}
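The `getMapper()` call in the commented-out method is not implemented in this answer. One possible direction, sketched under the assumption that the SELECT list names a single declared field, is a reflection-based mapper. The names `ColumnMapperSketch`, `columnMapper` and `Person` are hypothetical and not part of the parser above:

```java
import java.lang.reflect.Field;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ColumnMapperSketch {

    // Hypothetical helper: builds a Function that reads one declared field by name.
    static <T> Function<T, Object> columnMapper(String fieldName, Class<T> tClass) {
        final Field field;
        try {
            field = tClass.getDeclaredField(fieldName.strip());
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("Unknown column: " + fieldName, e);
        }
        field.setAccessible(true);
        return t -> {
            try {
                return field.get(t); // primitive values come back boxed
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        };
    }

    // Stand-in for the test class used in this answer.
    static class Person {
        private final int id;
        private final String name;
        Person(int id, String name) { this.id = id; this.name = name; }
    }

    // Roughly "select name from :stream" for two sample rows.
    static List<Object> selectNames() {
        return Stream.of(new Person(100, "Alise"), new Person(90, "Bob"))
                .map(columnMapper("name", Person.class))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(selectNames());
    }
}
```

Handling several columns, aliases or aggregate functions would need a real result type instead of `Object`, which is part of why a proper implementation gets complicated quickly.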
A dummy class for testing:
public class A {
private int id;
private String name;
public A(int id, String name) {
this.id = id;
this.name = name;
}
// getters omitted
}
The main() method:
public static void main(String[] args) {
String query = "SELECT count(*) as number_of_x from :myStream WHERE name = 'Alise' AND id = 100";
Stream<A> stream = Stream.of(
new A(100, "Alise"),
new A(90, "Bob"),
new A(100, "Carol")
);
System.out.println(QueryParser.getAsCount(query, stream, A.class));
}
Output:
1
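If you want to check what the WHERE pattern actually captures before the conditions are parsed, it can be tried in isolation. This is a small sketch reusing the same regular expression; the class `WhereClauseDemo` and the method `whereClause` are names invented for the illustration:

```java
import java.util.Optional;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

class WhereClauseDemo {

    // Same pattern as in the demo parser: everything after WHERE,
    // up to GROUP (if present) or the end of the query.
    static final Pattern WHERE =
            Pattern.compile("(?<=(?i)where).*?(?=(?i)group)|(?<=(?i)where).*?(?=$)");

    // Returns the trimmed text of the WHERE clause, or empty if there is none.
    static Optional<String> whereClause(String query) {
        return WHERE.matcher(query).results()
                .map(MatchResult::group)
                .map(String::strip)
                .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(whereClause(
                "SELECT count(*) as number_of_x from :myStream WHERE name = 'Alise' AND id = 100"));
        System.out.println(whereClause("select name from :s where id = 1 group by name"));
        System.out.println(whereClause("select * from :s"));
    }
}
```

A query without a WHERE clause yields an empty Optional, which is why the parser falls back to a predicate that accepts every element.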