Relational algebra/SQL in Palamedes

The latest version (alpha 54) supports SQL syntax for operating on matrices (2D arrays) of data.

SELECT … FROM … WHERE …

The first step is to associate variable names with matrix columns. That's done with the create table syntax. The actual table may be defined before or after using create table.

create table Person (Name, Age, Weight)
Person ← [ …
   ["Harry", 34,  180], …
   ["Sally", 28,  164], …
   ["George",29,  170], …
   ["Helena",54,  54], …
   ["Peter", 34,  180]]

Now these column names may be used to query the table:

/* Sample queries */
select * from Person where Age ≥ 34
select Name, 0.5*Age from Person where Weight = 180
select Name, Age from Person where Weight = Age
select Name, Age from Person where Name like '^H' and Age ≥ 40

/* Randomly select people with a 50-50 chance of selection */
select * from Person where flip 50%

These queries simply translate into the following underlying Palamedes commands...

map sys.t$, sys.t$, {filter {Age ≥ 34}, sys.t$, Person} 
map [Name, 0.5 * Age], sys.t$, {filter {Weight = 180}, sys.t$, Person}
map [Name, Age], sys.t$, {filter {Weight = Age}, sys.t$, Person}
map [Name, Age], sys.t$, {filter {Name like '^H' and Age ≥ 40}, sys.t$, Person}
map sys.t$, sys.t$, {filter {flip 0.5}, sys.t$, Person}

The foregoing create table command merely creates the following variables:

Person.Name ← sys.t$∘0; Person.Age ← sys.t$∘1; Person.Weight ← sys.t$∘2
Name ← Person.Name; Age ← Person.Age; Weight ← Person.Weight

Natural joins

First note that joins may be run anywhere in a Palamedes program, not just in a FROM clause---a join is a multiplication operator. You can use the bowtie symbol or the word join. Palamedes figures out what columns the tables share in common and joins them where these columns equal each other.

create table Employee (Name, EmpId, DeptName)
Employee gets [ …
  ["Harry",   3415, "Finance"], …
  ["Sally",   2241, "Sales"], …
  ["George",  3401, "Finance"], …
  ["Harriet", 2202, "Sales"]]

create table Dept (DeptName,  Manager)
Dept gets [ …
  ["Finance",   "George"], …
  ["Sales",     "Harriet"], …
  ["Production","Charles"]]

Employee ⋈ Dept

Employee join Dept as EmployeeDept

vars

Note that the … AS id option in the join expression does two additional things:

  1. implicitly does a create table id (columns in the union of the two input tables)
  2. creates a new variable called id containing the result set

… ORDER BY …

The syntax is <table expression> order by <column name> [asc|desc]? [, <column name> [asc|desc]?]*. Note that you may order by any number of columns, and the optional direction (asc or desc) defaults to asc.

Moreover, there's no limitation on using order by only after a select statement. It works like a pipe to a matrix sorting function. So you can use it after a table name as shown in these examples:

create table Person (Name, Age, Weight)
Person ← [ …
   ["Harry", 34,  180], …
   ["Sally", 28,  164], …
   ["George",29,  180], …
   ["Helena",54,  54], …
   ["Peter", 34,  180]]

Person order by Name
Person order by Age asc
Person order by Age, Weight
Person order by Weight desc
Person order by Weight desc, Age asc

If your select clause modifies a column you need to provide a column alias via the as keyword to use it in an order by clause as shown here:

select Name, 0.5*Age as Age from Person order by Age desc

View tables

Use the view tables command to see what tables are defined:

create table Person (Name, Age, Weight)
create table Employee (Name, EmpId, DeptName)

view tables

/* view them in a list format instead of an array format */
view tables//listify

/* view tables that have a matching 'Age' */
view tables like 'Age' | show tables

… GROUP BY …

Unlike real SQL:

    create table Person (Name, Age, Weight)
    Person ← [ …
     ["Harry", 34,  180], …
     ["Sally", 28,  164], …
     ["George",29,  170], …
     ["Helena",54,  54], …
     ["Peter", 34,  180]]

    Person group by Age
    Person group by 2  NB. groups by 3rd column (Weight); 0 is 1st column
    Person group by Age after 34 after 1 after 0
    select Name, Age from Person where Name =~ 'a'i group by Age  NB. People with A's in their name

Side Effects and Warnings

  1. The select statement uses/overwrites the system variable sys.t$. The "t" stands for "tuple".
  2. select and join use a temp table called sys.Temp$ which will show up with the view tables command as well as a system variable called sys.Temp$.
  3. For additional debugging output run set logger on

TODOs

  1. UPDATE. UPDATE table SET a ← foo, b ↤ bar where baz. Note that Palamedes gets/← and set/↤ syntax will be preserved.

Necessary changes to existing code

(A) The existing select predicate (for randomly sampling elements from a non-equiprobable distribution) must be renamed pick so that it does not conflict with the new SQL syntax:

pick ["G", 20%, "A", 30%, "T", 30%, "C", 20%]
10 # pick ["G", 20%, "A", 30%, "T", 30%, "C", 20%]
pick {"G": 20%, "A": 30%, "T": 30%, "C": 20%}
10 # pick {"G": 20%, "A": 30%, "T": 30%, "C": 20%}

(B) Palamedes alpha 53 or prior has a bug in the way filter returns its results. See below for details. This is fixed in the latest dev version.

Proposed Names

BAMF QL
Basic Array Map-Filter Query Language.
FAQL
Pronounced "Fake-well." Abbreviation for either "Filtered-Array Query Language" or Fa(ke S)QL.
NymSQL
Pronounced "Numb Skull." Abbreviation for "Not your mother's SQL."
NyQL
Pronounced like Vick's Cold & Flu Relief. Abbreviation for "Not yet (S)QL."


AUTHORS BUGS DISCUSSION LICENSE NAME NEWS README SOURCE TODO TRY VERSION

Last update: Fri Sep 23 2016