Query
Programming relic is mostly about defining queries.
- A query is always a vector
- Queries are made up of operations, themselves vectors of the form [operator & args].
;; example query from the tpc-h benchmark suite
[[:from :lineitem]
 [:where [<= :l_shipdate #inst "1998-09-02"]]
 [:agg
  [:l_returnflag
   :l_linestatus]
  [:sum_qty [rel/sum :l_quantity]]
  [:sum_base_price [rel/sum :l_extendedprice]]
  [:sum_disc_price [rel/sum [* :l_extendedprice [- 1 :l_discount]]]]
  [:sum_charge [rel/sum [* :l_extendedprice [- 1 :l_discount] [+ 1 :l_tax]]]]
  [:avg_qty [rel/avg :l_quantity]]
  [:avg_price [rel/avg :l_extendedprice]]
  [:avg_disc [rel/avg :l_discount]]
  [:count_order count]]]
Queries compose by adding operations to the vector. This is in contrast to other approaches, where a query optimiser takes care of the overall ordering.
In relic, data always flows top-to-bottom.
The super-power is that relic allows you to materialize a query. Materializing converts the query into a DAG that supports incremental re-computation as the data in tables changes.
With few exceptions (such as direct index lookup), you can materialize any query with rel/mat.
Rationale of form
In the tar pit paper, the language used to express its 'relvars' was a traditional expression tree (e.g. union(a, b)).
I wanted a data-first Clojure DSL that met two goals: like any good data DSL, queries should compose using regular Clojure functions, and they should be easy to write and read as literals without IDE support.
I think the vector form is close to SQL, with a nice top-to-bottom reading flow. Each operation is self-contained in its own delimited form, so you can create new queries with conj, split them with split-at, and so on.
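As a small illustration of that point, the sketch below treats queries as plain vectors and uses only clojure.core, so it never calls relic itself. The table and column names are borrowed from the tpc-h example above. The argument shape of :select is an assumption made for illustration, and the var names (base-query, narrowed-query and so on) are hypothetical.

```clojure
;; Queries are plain data, so nothing from relic is needed to build them.
(def base-query
  [[:from :lineitem]
   [:where [<= :l_quantity 10]]])

;; conj appends an operation at the end of the vector,
;; i.e. the next step in the top-to-bottom flow.
;; (The :select argument shape here is assumed for illustration.)
(def narrowed-query
  (conj base-query [:select :l_orderkey :l_quantity]))

;; split-at returns two lazy seqs, not vectors. Since a query is always
;; a vector, convert each half back with vec before using it as a query.
(def parts (mapv vec (split-at 1 narrowed-query)))
(def head (first parts))   ; the :from step on its own
(def tail (second parts))  ; the remaining operations

;; into re-joins the two halves into the original query.
(def rebuilt (into head tail))
```

Because conj on a vector always appends at the end, each new operation becomes the last step of the query, which matches the top-to-bottom data flow described earlier.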
Operators
- :from a.k.a. start here
- :where to select only certain rows
- :join sql style relational inner join
- :left-join sql style relational left join
- :extend compute new columns
- :select project a subset of columns, computations
- :without drop columns
- :expand flatten nested sequences
- :agg apply aggregates over rows in groups
- :sort sort rows
- :sort-limit sort with a limit
- :const provide a constant relation
- :difference set diff
- :intersection set intersection
- :union set union
- :rename rename columns
- :qualify qualify (namespace) columns
Constraints
- :check ensure certain predicates hold
- :req ensure cols exist
- :fk ensure a referenced row exists in some other query/table
- :unique ensure only one row exists for some set of expressions
- :constrain combine multiple constraints on a query/table