|
||||
amos2.exe amos2.dll amos2.dmpAmos II is ready to run in <privdir>/bin by the command:
amos2 [<db>]where [<db>] is an optional name of an Amos II database image.
You need not connect to any particular database, but instead, if <db> is omitted, the system enters an empty database (named amos2.dmp), where only the system objects are defined. The system looks for amos2.dmp in the same directory as where the executable amos2.exe is located.
The executable has a number of command line parameters to
specify, e.g., the database or AmosQL script to load. To get a
list of the command line parameters do:
amos2 -h
The Amos toploop
When started, the system enters the Amos top loop where it reads
AmosQL statements, executes them, and prints their results.
The prompter in the Amos II top loop is:
Amos n>where n is a generation number. The generation number is increased every time an AmosQL statement that updates the database is executed in the Amos top loop.
Typically you start by defining meta-data (a schema) as types and properties of types represented as functions. For example, the following statement create a type named Person having two property functions name() and income():
Amos 1> create type Person propertiesWhen the meta-data is defined you usually populate the database by creating objects and updating functions.
(name Charstring,
income Number);
For example:
Amos 2> create
Person(name, income) instances
("Bill",100),
("John",250),
("Ulla",380),
("Eva", 280);
When the database is populated you can query
it, e.g.:
Amos 3> select income(p)
from Person p where name(p)="Ulla";
Usually you load AmosQL definitions from a script file
rather than entering them on the command line, e.g.
Amos 1> <
'mycode.amosql';
Transactions
Database changes can be undone by using the
rollback statement with a generation number as argument.
For example, the statement:
Amos 4> rollback 2;will restore the database to the state it had at generation number 2. A rollback without arguments undoes all database changes of the current transaction.
The statement commit makes changes non-undoable, i.e. all updates so far cannot be rolled back any more and the generation numbering starts over from 1.
For example:
Amos 2> commit;
Amos 1> ...
When your Amos II database is defined and populated, it can be saved on disk with the AmosQL statement:
save "filename";
In a later session you can connect to the saved database by starting Amos II with:
amos2 filename
To shut down Amos II orderly first save the database and then type:
Amos 1> quit;
This is all you need to know to get started with Amos II.
The remaining chapters in this document describe the basic Amos II commands. As an example of how to define and populate an Amos II database, cut-and-paste the commands in http://www.it.uu.se/research/group/udbl/amos/doc/intro.amosql. There is a tutorial on object-oriented database design with Amos II in http://www.it.uu.se/research/group/udbl/amos/doc/tut.pdf.
JavaAmos is a version of the Amos II kernel connected to the Java virtual machine (32 bits JVM). With it Java programs can call Amos II functions and send AmosQL statements to Amos II for evaluation (the callin interface) [ER00]. You can also define Amos II foreign functions in Java (the callout interface). To start JavaAmos use the script
javaamosinstead of amos2. It will enter a top loop reading and evaluating AmosQL statements. JavaAmos requires the Java jar file javaamos.jar.
Back-end relational
databases
Amos II includes a interface to relational databases using JDBC on
top of JavaAmos. Any relational database can be accessed and
queried in terms of AmosQL using this interface. The interface is
described in the section Relational
database wrapper.
Graphical database browser
The multi-database browser GOOVI [CR01] is a graphical browser for Amos II written as a Java application. You can start the GOOVI browser from the JavaAmos top loop by calling the Amos II function
goovi();It will start the browser in a separate thread.
PHP interface
Amos II includes an interface allowing programs in PHP to call Amos II servers. The interface is tested for Apache servers. To use Amos II with PHP or SQL under Windows you are recommended to download and install WAMP http://www.wamp.org/. WAMP packages together a version of the Apache web server, the PHP script language, and the MySQL database. Amos II is tested with WAMP 2.0 (32 bits). See further the file readme.txt in subdirectory embeddings/PHP of the Amos II download.
C interface
The system is interfaced with the programming language C (and C++). As with Java, Amos II can be called from C (callin interface) and foreign Amos II functions can be implemented in C. See [Ris12].
Lisp interface
There is a built-in interpreter for a subset of the programming language CommonLisp in Amos II, aLisp [Ris06]. The system can be accessed and extended using aLisp.
In general the user may enter different kinds of AmosQL statements to the Amos top loop in order
to instruct the system to do operations on the database:
This section is organized as follows:
The basic building blocks of the AmosQL query language are
described here.
A ::= B C: A consists of B followed by C.
A ::= B | C, alternatively (B | C): A consists of B or C.
A ::= [B]: A consists of B or nothing.
A ::= B-list: A consists of one or more Bs.
A ::= B-commalist: A consists of one or more Bs separated by
commas.
'xyz': The string (keyword) xyz.
create-type-stmt |
delete-type-stmt |
create-object-stmt |
delete-object-stmt |
create-function-stmt |
delete-function-stmt |
query |
update-stmt |
add-type-stmt |
remove-type-stmt |
for-each-stmt |
set-interface-variable-stmt |
declare-interface-variable-stmt |
commit-stmt |
rollback-stmt |
open-cursor-stmt |
fetch-cursor-stmt |
close-cursor-stmt |
quit-stmt |
exit-stmt
identifier ::=Notice that Amos II identifiers are NOT case sensitive; i.e. they are always internally capitalized. By contrast Amos II reserved keywords are always written with lower case letters.
('_' | letter) [identifier-character-list]
identifier-character ::=
alphanumeric | '_'
E.g.: MySalary
x
x1234
x1234_b
Syntax:
local-variable ::= identifier
E.g. my_variable
MyVariable2
Syntax:
interface-variable ::= ':' identifier
E.g. :my_interface_variable
:MyInterfaceVariable2
The user can declare an interface variable to be of a
particular type by the interface
variable declare statement:
interface-variable-declare-stmt ::= 'declare'
interface-variable-declaration-commalist
interface-variable-declaration ::= type-spec
interface-variable
E.g. declare Integer :i, Real :x3;
Interface variables can be assigned either by the into-clause of the select statement or by the
interface variable assignment statement set:
set-interface-variable-stmt ::=
'set' interface-variable '=' expr
E.g. set :x3 = 2.3;
set :i =
2 + sqrt(:x3);
constant ::=The constant false is equivalent to nil casted to type Boolean. The only legal boolean value that can be stored in the database is true and a boolean value is regarded as false if it is not in the database (close world assumption).
integer-constant | real-constant | boolean-constant |
string-constant | time-stamp | functional-constant | 'nil'
integer-constant ::=
['-'] digit-list
E.g. 123
-123
real-constant ::=
decimal-constant | scientific-constant
decimal-constant ::=
['-'] digit-list '.' [digit-list]
scientific-constant ::=
decimal-constant ['e' | 'E'] integer-constant
E.g. 1.2
-1.0
2.3E2
-2.4e-21
boolean-constant ::=
'true' | 'false'
expr ::= simple-value | function-call | collection-constr | casting | vector-indexing
| '(' query ')'
E.g. 1.23
1+2
1<2 and 1>3
sqrt(:a) + 3 *
:b
{1,2,3}
cast(:p as Student)
a[3]
sum(select income(p) from
Person p)+10
The value of an expression is computed if the expression is
entered to the Amos top loop, e.g.:
1+5*sqrt(6);
=> 13.2474487139159
Notice that Boolean expressions, predicates,
either return true, or
nothing if the expression is not true. For example:
1<2 or
3<2;
=>
TRUE
1<2 and
3<2;
=>
nothing
Entering simple expressions followed by a semicolon is the
simplest form of AmosQL queries,
e.g.:
1+sqrt(25);
returns the bag:
set :h = hobbies(:sam);
will assign :h to a bag of
Sam's hobbies is the function hobbies()
returns a bag of objects. The elements can be extracted with in:Vector element vi can be access with the
notation v[i], where the indexing i is from 0 and
up. For example:
set :v={1,2,3};
then
:v[2];
returns
3
Records represent dynamic
associations between keys and values. A record is a dynamic
and associative array. Other commonly used terms for associative
arrays are property lists, key-value pairs, dictionaries, or hash
links. Amos II uses JSON notation to construct records. For
example the following expression assigns :r to a record
where the key (property) 'Greeting' field has the value 'Hello, I
am Tore' and the key 'Email' has the value
'Tore.Andersson@it.uu.se':
set :r= {'Greeting':'Hello, I am Tore','Email':'Tore.Andersson@it.uu.se'}A field f of a record bound to a variable r can be access with the notation r[f], for example:
comment ::= '/*' character-list '*/'
Syntax:
create-type-stmt ::=Type names must be unique in the database.
'create type' type-name ['under' type-name-commalist]
['properties' '(' attr-function-commalist ')']
type-spec ::= type-name | 'Bag of' type-spec | 'Vector of' type-spec
type-name ::= identifier
attr-function ::= generic-function-name type-spec ['key']
create type Person properties (name Charstring, income Number, age Number, parents Bag of Person);Each property is a function having a single argument and a single result. The argument type of a property function is the type being created and the result type is specified by the type-spec. The result type must be previously defined. In the above example the function name() has argument type Person and result type Charstring, i.e. signatures name(Person)->Charstring and income(Person)->Number, respectively.
The new type will be a subtype of all the
supertypes in the under clause. For example, in the
following definition Student subtype of Person and
Person is supertype of Student:
create type Student under Person;
If no supertypes are specified the new type becomes a subtype of the system type named Userobject.
If key is specified for a property, it indicates that each
value of the attribute is unique and the system will raise an
error if this uniqueness is
violated. In the following example, two objects of type
Employee cannot have the same value of property
emp_no:
create type Employee under
Person properties
(emp_no
Number key);
create type TA under Student, Employee;
Syntax:
delete-type-stmt ::= 'delete type' type-nameIf the deleted type has subtypes they will be deleted as well. Functions using the deleted types will be deleted as well, in this case emp_no().
E.g. delete type Employee;
Syntax:
create-object-stmt ::=The new objects are assigned initial values for specified attributes (properties). For example:
'create' type-name
['(' generic-function-name-commalist ')'] 'instances' initializer-commalist
initializer ::= variable |
[variable] '(' expr-commalist ')'
create Person(name, income, age) instances ("Adam",3500,35), ("Eve",3900,40);The attributes can be any updatable AmosQL function having the created type as its only argument, here name() and age(). One object will be created for each initializer. Each initializer includes a comma-separated list of initial values for the specified attribute functions. Initial values are specified as expressions, for example:
create Person (name,income) instances ("Kalle "+"Persson" , 3345*1.5);The types of the initial values must match the declared result types of the corresponding functions.
create Person(name, income) instances :pelle ("Per",3836);
Then the query
income(:pelle);returns
3386Notice that interface variables such as :pelle are temporary and not saved in the database.
Bag valued functions are initialized using
the syntax 'bag(e1,...)' (syntax bag-expr),
for example:
create Person (name,parents,income,age) instances :adam ("Adam",nil,2300,64), :eve ("Eve",nil,3200,63), :cain ("Cain",bag(:adam,:eve),1500,44), ("Abel",bag(:adam,:eve),900,43), :seth ("Seth",bag(:adam,:eve),1700,42), :lilith ("Lilith",bag(:adam,:eve),4500,40), :noah ("Noah",bag(:seth,:lilith),5300,25), :ruth ("Ruth",:cain,500,24), ("Shem",bag(:noah,:ruth),800,15), :ham ("Ham",bag(:noah,:ruth),3800,16), ("Cush",:ham,10,2);
It is possible to specify nil for a value when no
initialization is desired for the corresponding function.
Syntax:
delete-object-stmt ::= 'delete' variableFor example:
delete :pelle;The system will automatically remove the deleted object from all stored functions where it is referenced.
Deleted objects are printed as
#[OID nnn *DELETED*]The objects may be undeleted by rollback;. The automatic garbage collector physically removes an OID from the database only if its creation has been rolled back or its deletion committed, and it is not referenced from some variable or external system.
function-call ::=The built-in functions plus(), minus(), times(), and div() have infix syntax +,-,*,/ with the usual priorities.
function-name '(' [parameter-value-commalist] ')' |
expr infix-operator expr |
tuple-expr
infix-operator ::= '+' | '-' | '*' | '/' | '<' | '>' | '<=' | '>=' | '=' | '!=' | 'in'
parameter-value ::= expr |
'(' select-stmt ')' |
tuple-expr
tuple-expr ::= '(' expr-commalist ')'
E.g. sqrt(2.1);
1+2;
1+2 < 3+4;
"a" + 1;
(income(:eve) + income(:ham)) * 0.5;is equivalent to:
times(plus(income(:eve),income(:ham)),0.5);The '+' operator is defined for both numbers and strings. For strings it implements string concatenation.
The result of a function call can be saved temporarily in an interface variable, for example:
set :inca = income(:adam);then the query :inca; returns 2300.
Also bags valued function calls can be saved in variables, for example:
set :pb = parents(:cain);In this case the value of :pb is a bag. To get the elements of the bag, use the in function. For example:
in(:pb);
Tuple expressions can be used to assign the result of functions returning tuples, for example:
set (:m,:f)=parents2(:cain);In a function call, the types of the actual parameters and results must be the same as, or subtypes of, the types of the corresponding formal parameters or results.
Syntax:
select-stmt ::=The select statement returns an unordered set of objects selected from the database. Duplicates are allowed in the result set of a query, i.e. the result is a bag. In case you need to construct an ordered sequence of objects rather than a bag you can use the vector selection statement.
'select' ['distinct']
[select-clause]
[into-clause]
[from-clause]
[where-clause]
[group-by-clause]
[order-by-clause]
[limit-clause]
select-clause ::= expr-commalist
into-clause ::=
'into' variable-commalist
from-clause ::=
'from' variable-declaration-commalist
variable-declaration ::=
type-spec local-variable
where-clause ::=
'where' predicate-expression
group-by-clause ::=
'group by' expression-commalist
order-by-clause ::=
'order by' expression ['asc' | 'desc']
limit-clause ::=
'limit' expression
select name(p), income(p)
from Person p
where income(p)>2500;
The from-clause declares data types of local variables used in the query. For example:
select name(p), income(p) from Person p where age(p)>20;Notice that in variables in AmosQL can be bound to objects of any type. This is different from SQL select statements where all variables must be bound to tuples only. AmosQL is based on domain calculus while SQL select expressions are based on tuple calculus.
If a function is applied on the result of a function returning a bag of values, the outer function is applied on each element of that bag, the bag is flattened. This is called Daplex semantics. For example: if there are more than one parents per parent generation of Cush there will be several names (e.g. Noah and Ruth) returned when querying:
select name(parents(parents(q))) from Person q where name(q)= "Cush";would return the bag:
"Noah" "Ruth"The where-clause gives selection criteria for the search. The where-clause is specified as a predicate. For example:
select name(p), income(p)
from Person p
where age(p)>20;
To retrieve the results of tuple valued functions in queries, use tuple expressions, e.g.
select name(m), name(f) from Person m, Person p where (m,f) = parents2(p);
Duplicates are removed from the result only when the keyword
'distinct' is specified, in which case a set (rather than a bag)
is returned from the selection.
For example, this query returns the set of different names in the database:
select distinct name(p) from Person p where age(p)>20;
The optional group-by-clause groups and summarizes (aggregates) the result. A select statement with a group-by-class is called a grouped selection. For example:
select name(p), sum(income(p))
from Person p
where age(p) > 20
group by name(p);
The optional order-by-clause sorts the result ascending
('asc', default) or descending ('desc'). A select statement with
an order-by-clause is called an ordered
selection. For example:
select name(p), income(p)
from Person p
where age(p) > 20
order by income(p) desc;
The optional limit-clause limits the number of returned values from the select statement. It is often used together with ordered selections to specify top-k queries. For example:
select name(p), income(p)
from Person p
where age(p) > 20
order by income(p) desc
limit 10;
The optional into-clause specifies variables to be bound to the result.
For example:
select p into :e from Person p where name(p) = 'Eve';This query retrieves into the environment variable :eve2 the Person whose name is 'Eve'.
set :r = (select p from Person p where name(p) = 'Eve');Inspect :r with one of these equivalent queries:
in(:r); select p from Person p where p in :r;
predicate-expression ::=
predicate-expression 'and' predicate-expression |
predicate-expression 'or' predicate-expression |
'(' predicate-expression ')' |
expr
Examples of predicates:
x < 5
child(x)
"a" != s
home(p) = "Uppsala" and name(p) = "Kalle"
name(x) = "Carl" and child(x)
x < 5 or x > 6 and 5 < y
1+y <= sqrt(5.2)
parents2(p) = (m,f)
count(select friends(x) from Person x where child(x)) < 5
The boolean operator and has precedence over or. Negation is handled by the quantifier function notany().
For example:
a<2 and a>3 or b<3 and b>2
is equivalent to
(a<2 and a>3) or (b<3 and b>2)
The comparison operators (=, !=, <, <=, and >=) are
treated as binary predicates. You can
compare objects of any type.
Predicates are also allowed in the result of a select expression.
For example, the query:
select age(:p1) < 20 and home(:p1)="Uppsala";
returns true if person :p1 is younger than 20 and
lives in Uppsala.
A grouped selection is a select statement with a group-by-clause specified. The
execution semantics of a grouped selection is different than for
regular queries.
When analyzing data it is often necessary to group data, for
example to get the sum of the salaries of employees per
department. Such regroupings are specified though the group-by-clause.
It specifies on which expressions in the select clause the data
should be grouped and summarized.
For example:
select name(d), sum(salary(p))
from Department d, Person
p
where dept(p)=d
group by name(d);
Here the group-by-clause specifies that the result should
be grouped on the name of the departments in the database. After
the grouping, for each department d the salaries of the
persons p working at that department should be summed
using the aggregate function sum().
An element of a select-clause of a
grouped selection must be one of:
In general aggregate functions such as sum(), avg(), stdev(), min(), max(), count() are applied on collections (bags) of values, rather than single objects.
Contrast the above query to the regular (non-grouped) query: select name(d), sum(salary(p))
from Department d, Person
p
where dept(p)=d;
Without the grouping the aggregate function sum() is
applied on the salary of each person p, rather than the
collection of salaries corresponding to the group key name(p)
in the grouped selection.
The group key need not be part of the result, for example the
following query returns the sum of the salaries for all
departments without revealing the department names:
select sum(salary(p))
from Department d, Person
p
where dept(p)=d
group by name(d);
The order-by-clause specifies
that the result should be (partially) sorted by the specified sort
key. The sort order is descending when desc is
specified and ascending otherwise.
For example, the following query sorts the result descending based on the sort key salary(p):
select name(p), salary(p)
from Person p
order by name(p) desc;
The sort key does not need to be part of the result.
For example, the following query list the salaries of persons in descending order without associating any names with the salaries:
select salary(p)
from Person p
order by name(p) desc;
A top-k query is a query returning the first few tuples
in a larger set of tuples based on some ranking. The order-by-clause and limit-clause can be combined to specify
top-k queries. For example, the following query returns the names
and salaries of the 10 highest income earners:
select name(p), salary(p)
from Person p
order by name(p) desc
limit 10;
The limit can be any numerical expression.For example, the following query retrieves the :k+3 lowest income earners, where :k is a variable bound to a numeric value:
select name(p), salary(p)
from Person p
order by name(p)
limit :k+3;
In the second case, the subqueries are specified as nested
select expressions returning bags, for
example:
select name(friends(p))
from Person p
where name(p)= "Bill";
The function friends() returns several (a bag of) persons on which the function name() is applied. The normal semantics in Amos II is that when a function (e.g. name()) is applied on a bag valued function (e.g. friends()) it will be applied on each element of the returned bag. In the example a bag of the names of the persons named Bill is returned. This is called Daplex semantics.
Aggregate
functions work differently. They are applied on the entire bag. For
example:
count(friends(:p));
In this case count() is
applied on the subquery of all friends of :p. The system uses a rule
that arguments are converted (coerced) into a subquery when an
argument of the calling function (e.g. count) is declared Bag of.
select sum(b), avg(b), stdev(b)
from Bag of Integer b
where b = (select income(p)
from Person p);
Elements in subqueries can be accessed with the in
operator. For example:
select name(p), count(b)
from Bag of Integer b, Person p
where b = (select p from Person p)
and p in b;
The query returns the names of all persons paired with the number
of persons in the database.
Variables may be assigned to bags by assigning values of
functions returning bags, for example:
set :f = friends(:p);
count(:f);
bequal(Bag x, Bag y) -> BooleanNotice that bequal() materializes its arguments before doing the equality computation, which may occupy a lot of temporary space in the database image if the bags are large.
See 7.4 Aggregate functions
for a details on aggregate functions.
The function some() implements
logical exist over a subquery:
some(Bag sq) -> Booleanfor example
select name(p)
from Person p
where some(parents(p));
notany(Bag sq) -> BooleanFor example:
select name(p)
from Person p
where notany(select parents(p) where age(p)>65);
vselect-stmt ::=Notice that the order-by-clause normally should be present when constructing vectors with the vselect statement in order to exactly specify the order of the elements in the vector. If no order-by-clause is present the order of the elements in the vector is arbitrarily chosen by the system based on the operators in the query, which is the order that is the most efficient to execute the query.
'vselect' ['distinct']
[select-clause]
[into-clause]
[from-clause]
[where-clause]
[group-by-clause]
[order-by-clause]
[limit-clause]
Syntax:
create-function-stmt ::=AmosQL functions may also have tuple
valued results by using the tuple-result-spec notation. For
example:
set (:mother,:father) = parents2(:eve);You can store records in stored functions, for example:
create function child(Person p) -> Booleanalternatively:
as select true where age(p)<18;
create function child(Person p) -> Boolean
as age(p) < 18;
Since the select statement returns a
bag of values, derived functions also often return a Bag
of results. If you know that a function returns a bag of
values you should indicate that in the signature. For example:
create function youngFriends(Person p)-> Bag of PersonIf you write:
as select f
from Person f
where age(f) < 18
and f in friends(p);
create function youngFriends(Person p)-> Person
as select f
from Person f
where age(f) < 18
and f in friends(p);
you indicate to the system that youngFriends()
returns a single value. However, this constraint is not enforced
by the system so if there are more that one youngFriends() the system will
treat the result as a bag. Variables declared in the result of a derived function need not
be declared again in the from clause, their types are inferred
from the function signature. For example, youngFriends() can also be
defined as:
create function youngFriends(Person p) -> Bag of Person f
as select f
where age(f) < 18
and f in friends(p);
Notice that the variable f is bound to the elements of
the bag, not the bag itself. This definition is equivalent:
Derived functions whose arguments are declared Bag of
are user defined aggregate functions. For
example:
create function myavg(Bag
of Number x) -> Number
as sum(x)/count(x);
Aggregate functions do not
flatten the argument bag. For example, the following query computes
the average age of Carl's grandparents:
select myavg(age(grandparents(q)))
from Person p
where name(q)="Carl";
select
age(m),
age(f)
from Person m,
Person f, Person p
where (m,f) =
parents2(p)
and name(p) = "Oscar";
For example, assume the following two Amos II function definitions having the same generic function name less():
create function less(Number i, Number j)->BooleanIts resolvents will have the signatures:
as i < j;
create function less(Charstring s,Charstring t)->Boolean
as s < t;
less(Number,Number) -> BooleanInternally the system stores the resolvents under different function names. The name of a resolvent is obtained by concatenating the type names of its arguments with the name of the overloaded function followed by the symbol '->' and the type of the result. The two resolvents above will be given the internal resolvent names NUMBER.NUMBER.LESS->BOOLEAN and CHARSTRING.CHARSTRING.LESS->BOOLEAN.
less(Charstring,Charstring) -> Boolean
The query compiler resolves the correct resolvent to apply based on the types of the arguments; the type of the result is not considered. If there is an ambiguity, i.e. several resolvents qualify in a call, or if no resolvent qualify, an error will be generated by the query compiler.
When overloaded function names are encountered in AmosQL function bodies, the system will use local variable declarations to choose the correct resolvent (early binding). For example:
create function younger(Person p,Person q)->Booleanwill choose the resolvent NUMBER.NUMBER.LESS->BOOLEAN, since age returns integers and the resolvent NUMBER.NUMBER.LESS->BOOLEAN is applicable to integers by inheritance. The other function resolvent CHARSTRING.CHARSTRING.LESS->BOOLEAN does not qualify since it is not legal to apply to arguments of type Integer.
as less(age(p),age(q));
On the other hand, this function:
create function nameordered(Person p,Person q)->Booleanwill choose the resolvent NUMBER.NUMBER.LESS->BOOLEAN since the function name() returns a string. In both cases the type resolution (selection of resolvent) will be done at compile time.
as less(name(p),name(q));
Late binding
Dynamic type resolution at run time, late binding, is
sometimes required to choose the correct resolvent. For example,
the query
less(1,2);will choose NUMBER.NUMBER.LESS->BOOLEAN based on the numeric types the the arguments.
create type Employee under Person;Now, suppose that we need a function that returns the gross incomes of all persons in the database, i.e. we use MANAGER.INCOME->INTEGER for managers and EMPLOYEE.INCOME->INTEGER for non-manager. In Amos II such a function is defined as:
create type Manager under Employee;
create function mgrbonus(Manager)->Integer as stored;
create function income(Employee)->Integer as stored;
create function income(Manager m)->Integer i
as income(cast(m as Employee)) + mgrbonus(m);
create function grossincomes()->Integer iSince income is overloaded with resolvents EMPLOYEE.INCOME->INTEGER and MANAGER.INCOME->INTEGER and both qualify to apply to employees, the resolution of income(p) will be done at run time.
as select income(p)
from Employee p;
/* income(p) late bound */
To avoid the overhead of late binding one may use casting.
Since the detection of the necessity of dynamic resolution is
often at compile time, overloading a function name may lead to a
cascading recompilation of functions defined in terms of that
function name. For a more detailed presentation of the management
of late bound functions see [FR95].
The type of an expression can be explicitly defined using the casting
statement:
casting ::= 'cast'(expr 'as' type-spec)
for example
create function income(Manager m)->Integer i
as income(cast(m as Employee)) + mgrbonus(m);
By using casting statements one can avoid late binding.
Amos II functions are internally represented as any other objects
and stored in the database. Object representing functions can be
used in functions and queries too. An object representing a
function is called a functional.
Second order functions take functionals as arguments or results.
The system function functionnamed()
retrieves the functional fno having
a given name fn:
functionnamed(Charstring fn) -> Function fnoThe name fn is not case sensitive.
functionnamed("plus");returns the object representing the generic function plus, while
=> #[OID 155 "PLUS"]
functionnamed("number.number.plus->number");
=> #[OID 156 "NUMBER.NUMBER.PLUS->NUMBER"]
returns the object representing the resolvent named NUMBER.NUMBER.PLUS->NUMBER.
Another example of a second order function is the system function
apply(Function fno, Vector argl) -> Bag of Vector
It calls the functional fno with
the vector argl as
argument list. The result tuples are returned as a bag of vectors,
for example:
apply(functionnamed("number.number.plus->number"),{1,3.4});
=> {4.4}
Notice how apply() represents
argument lists and result tuples as vectors.
When using second order functions one often needs to retrieve a
functional fno given
its name and the function functionnamed()
provides one way to achieve this. A simpler way is often to
use functional constants
with syntax:
functional-constant ::= '#' string-constant
for example
#'mod';
A functional constant is translated into the functional with the
name uniquely specified by the string constant. For example, the
following expression
apply(#'mod',{4,3});
=> {1}
Notice that an error is
raised if the function name specified in the functional constant
is not uniquely identifying the functional. This happens if it is
the generic name of an
overloaded function. For example, the functional constant #'plus' is illegal,
since plus() is overloaded. For
overloaded functions the name of a resolvent has to be used
instead, for example:
apply(#'plus',{2,3.5});generates an error, while
apply(#'number.number.plus->number', {2,3.5});and
=> {5.5}
apply(functionnamed("plus"),{2,3.5});The last call using functionnamed("plus") will be somewhat slower than using #'number.number.plus->number' since the functional for the generic function plus() is selected and then the system uses late binding to determine dynamically which resolvent of plus() to apply.
=> {5.5}
The transitive closure functions tclose() is a second order function to explore graphs where the edges are expressed by a transition function specified by argument fno:
tclose(Function fno, Object o) -> Bag of Object
tclose() applies the
transition function fno(o),
then fno(fno(o)), then fno(fno(fno(o))), etc until fno returns no new
result. Because of the Daplex
semantics, if the transition function fno returns a bag of values
for some argument o, the
successive applications of fno will
be applied on each element of the result bag. The result types of
a transition function must either be the same as the argument
types or a bag of the argument types. Such a function that has the
same arguments and (bag of) result types is called a closed function.
For example, assume the following definition of a graph defined
by the transition function arcsto():
create function arcsto(Integer node)-> Bag of Integer n as stored;
set arcsto(1) = bag(2,3);
set arcsto(2) = bag(4,5);
set arcsto(5) = bag(1);
The following query traverses the graph starting in node 1:
Amos 5> tclose(#'arcsto', 1);
1
3
2
5
4
In general the function tclose() traverses a graph
where the edges (arcs) are defined by the transition function. The
vertices (nodes) are defined by the arguments and results of calls
to the transition function fno,
i.e. a call to the transition function fno defines the neighbors of
a node in the graph. The graph may contain loops and tclose() will remember what
vertices it has visited earlier and stop further traversals for
vertices already visited.
You can also query the inverse of tclose(), i.e. from which nodes f can be reached, by the query:
Amos 6> select f from Integer f where 1 in tclose(#'arcsto',f);
1
5
2
If you know that the graph to traverse is a tree or a directed
acyclic graph (DAG) you can instead use the faster function
traverse(Function fno, Object o) -> Bag of Object
The children in the tree to traverse is defined by the transition
function fno. The tree is traversed in pre-order depth first. Leaf
nodes in the tree are nodes for which fno returns nothing. The function traverse() will not terminate
if the graph is circular. Nodes are visited more than once for
acyclic graphs having common subtrees.
A transition function may have extra arguments and
results, as long as it is closed. This allows to pass extra
parameters to a transitive closure computation. For example, to
compute not only the transitive closure, but also the distance
from the root of each visited graph node, specify the following
transition function:
create function arcstod(Integer node, Integer d) -> Bag of (Integer,Integer)
as select arcsto(node),1+d;
and call
tclose(#'arcstod',1,0);
which will return
(1,0)Notice that only the first argument and result in the transition function define graph vertices, while the remaining arguments and results are extra parameters for passing information through the traversal, as with arcstod(). Notice that there may be no more than three extra parameters in a transition function.
(3,1)
(2,1)
(5,2)
(4,2)
iterate(Function fn, Number maxdepth, Object x) -> Object rThe iteration is initialized by setting x0=x. Then xi+1= fn(xi) is repeadedly computed until one of the following conditions hold:
iterate(Function fn, Number maxdepth, Object x0, Object p) -> Object rThis enables flexible termination of the iteration since fn(x,p) can return nil based on both x and p.
Sometimes there is a need to have a function defined for subtypes
of a common supertype, but the function should never be used for
the supertype itself. For example, one may have a common supertype
Dog with two subtypes
Beagle and Poodle. One would like to
have the function bark
defined for different kinds of dogs, but not for dogs in general.
In this case one defines the bark function for type Dog as an abstract function, for
example::
create type Dog;
create function name(Dog)->Charstring as stored;
create type Beagle under Dog;
create type Poodle under Dog;
create function bark(Dog d) -> Charstring as foreign 'abstract-function';
create function bark(Beagle d) -> Charstring;
create function bark(Poodle d) -> Charstring;
create Poodle(name,bark) instances ('Fido','yip yip');
create Beagle(name,bark) instances ('Snoopy','arf arf');
Now you can use bark() as a function over dogs in
general, but only if the object is a subtype of Dog:
Amos 15> select bark(d) from dog d;
"arf arf"
"yip yip"
An abstract function is defined by:
create function foo(...)->... as foreign 'abstract-function'.
An abstract functions are implemented as a foreign function whose implementation is named 'abstract-function'.
If an abstract function is called it gives an informative error
message. For example, if one tries to call bark() for
an object of type Dog,
the following error message is printed:
Amos 16> create Dog instances :buggy;
NIL
Amos 17> bark(:buggy);
BARK(DOG)->CHARSTRING
is an abstract function requiring a more specific argument signature than
(DOG) for arguments
(#[OID 1009])
Syntax:
delete-function-stmt ::= 'delete function' function-nameSyntax:
update-stmt ::=
update-op update-item [from-clause] [where-clause]
update-op ::=
'set' | 'add' | 'remove'
update-item ::=
function-name '(' expr-commalist ')' '=' expr
create function name(Person) -> Charstring as stored;
create function hobbies(Person) -> Bag of Charstring as stored;
Furthermore, assume we have created two objects of type Person bound to the interface
variables :sam and :eve: create Person instances :sam, :eve;
The set statement sets
the value of an updatable function given the arguments. set name(:sam) = "Sam";
set name(:eve) = "Eve";
To populate a bag
valued function you can use bag construction:set hobbies(:eve) = bag("Camping","Diving");The add statement adds result elements to bag valued functions.
add hobbies(:sam) = "Sailing";
add hobbies(:sam) = "Fishing";
The remove statement
removes the specified tuple(s) from the result of an updatable
function returning a bag for given arguments, for example: remove hobbies(:sam) = "Fishing";
The statement set hobbies(:eve) = hobbies(:sam);
will update Eve's all hobbies to be the same a Sam's hobbies. set hobbies(:eve) = h
from Charstring h
where h in hobbies(:sam) and
h != "Sailing";
Here a query first
retrieves all hobbies h of :sam before the hobbies
of :eve are set. create function married(Person,Person)->Boolean as stored;
set married(:sam,:eve) = true;
Setting the value of a boolean function to false means that the truth value is removed from
the extent of the function. For example, to divorce Sam and Eve you
can do either of the following: set married(:sam,:eve)=false;
or
remove married(:sam) = :eve;
Not every function is updatable. Amos II defines a function to be
updatable if it is a stored function, or if it is derived from a
single updatable function with a join that includes all arguments.
In particular inverses to stored functions are updatable. For
example, the following function is updatable: create function marriedto(Person p) -> Person q
as select q where married(p,q);
The user can define update
procedures for derived functions making also non-updatable
functions updatable.Syntax:
add-type-stmt ::=The updated objects may be assigned initial values for all the specified property functions in the same manner as in the create object statement.
'add type' type-name ['(' [generic-function-name-commalist] ')']
'to' variable-commalist
The remove-type-stmt makes one or more objects no
longer belong to the specified type.
Syntax:
remove-type-stmt ::=Removing a type from an object may also remove properties from the object. If all user defined types have been removed from an objects, the object will still be member of type Userobject.
'remove type' type-name 'from' variable-commalist
createobject(Type t)->Object
createobject(Charstring tpe)->Object
deleteobject(Object o)->Boolean
addfunction(Function f, Vector argl, Vector resl)->Boolean
remfunction(Function f, Vector argl, Vector resl)->Boolean
setfunction(Function f, Vector argl, Vector resl)->Boolean
createobject() creates an object of the type specified by
its argument.
deleteobject() deletes an object.
The procedural system functions setfunction(),
addfunction(), and remfunction() update a function
given an argument list and a result tuple as vectors. They return
TRUE if the update succeeded.
To delete all rows in a stored function fn, use
dropfunction(Function fn, Integer permanent)->Function
If the parameter permanent
is the number one the deletion cannot be rolled back, which saves
space if the extent of the function is large.
meansub(Bag of Vector of Number b) -> Bag of Vector of Numbermeansub() transforms each dimension to a N(0, s) distribution (assuming that the dimension was N(u, s) distributed) by subtracting the mean u of each dimension. zscore() transforms each dimension to a N(0, 1) distribution by also dividing by the standard deviation of each dimension. maxmin() transforms each dimension to be on the [0, 1] interval by applying the transformation (w - min) ./ (max - min) to each vector w in bag b where max and min are computed using aggv(b, #'maxagg') and aggv(b, #'minagg') respectively.
zscore(Bag of Vector of Number b) -> Bag of Vector of Number
maxmin(Bag of Vector of Number b) -> Bag of Vector of Number
meansub((select {i, i/2 + 10}returns
from integer i
where i in iota(1, 5)));
{-2.0,-1.0}Principal Component Analysis is performed using the function pca():
{-1.0,-0.5}
{0.0,0.0}
{1.0,0.5}
{2.0,1.0}
scatter2(Bag of Vector v) -> Integer
scatter2l(Bag of Vector v) -> Integer
scatter2p(Bag of Vector v) -> Integer
scatter3(Bag of Vector v) -> Integer
scatter3l(Bag of Vector v) -> Integer
scatter3p(Bag of Vector v) -> Integer
scatter2(Vector of Integer projs, Bag of Vector v) -> Integer
scatter2l(Vector of Integer projs, Bag of Vector v) -> Integer
scatter2p(Vector of Integer projs, Bag of Vector v) -> Integer
scatter3(Vector of Integer projs, Bag of Vector v) -> Integer
scatter3l(Vector of Integer projs, Bag of Vector v) -> Integer
scatter3p(Vector of Integer projs, Bag of Vector v) -> Integer
csv_file_tuples(Charstring file) -> Bag of
Vector
1,2.3,a b c
4,5.5,d e f
{1,2.3,"a b c"}
{4,5.5,"d e f"}
writecsvfile(Charstring file, Bag b) -> Boolean
writecsvfile("myoutput.csv", bag({1,"a b",2.2},{3,"d
e",4.5})
1,a b,2.2
3,d e,4.5
read_ntuples(Charstring file) -> Bag of Vector
"This is the first line" another word
1 2 3 4.45 2e9
"This line is parsed into two fields" 3.14
{"This is the first line","another","word"}
{1,2,3,4.45,2000000000.0}
{"This line is parsed into two fields",3.14}
Syntax:
open-cursor-stmt ::=If the optional into clause is present in a fetch cursor statement, it will bind elements of the first result tuple to variables. There must be one variable for each element in the next cursor tuple.
'open' cursor-name 'for' expr
cursor-name ::=
variable
fetch-cursor-stmt ::=
'fetch' cursor-name [into-clause]
close-cursor-stmt ::=
'close' cursor-name
If no into clause is present in a fetch cursor statement a single result tuple is fetched and displayed.
For example:create person (name,age) instances :Viola ('Viola',38);
open :c1 for (select p from Person p where name(p) = 'Viola');
fetch :c1 into :Viola1;
close :c1;
name(:Viola1);
--> "Viola";
The close cursor statement
deallocates the scan.
Cursors allow iteration over very large bags created by queries
or function calls. For example,
set :b = (select iota(1,1000000)+iota(1,1000000));
/* :b contains a bag with 10**12 elements! */
open :c for :b;
fetch :c;
-> 2
fetch :c;
-> 3
etc.
close :c;
Cursors are implemented using a special datatype called Scan that allows iterating
over very large bags of tuples using limited memory. The
following functions are available for accessing the tuples in a
scan as vectors:
next(Scan s)->Vector
moves the cursor to the next tuple in a scan and returns the
cursor tuple. The fetch cursor
statement is syntactic sugar for calling next().
this(Scan s)->Vector
returns the current tuple in a scan without moving the cursor
forward.
returns the next tuple in a scan without moving the cursor
forward.
Syntax:
procedural-function-definition ::=For example, the procedural function creperson() creates a new person and sets the properties name() and income(), i.e. it is a constructor for persons:
block | procedure-stmt
procedure-stmt ::=
create-object-stmt |
delete-object-stmt |
for-each-stmt |
update-stmt |
add-type-stmt |
remove-type-stmt |
set-local-variable-stmt |
query |
if-stmt |
commit-stmt |
abort-stmt |
loop-stmt |
while-stmt |
open-cursor-stmt |
fetch-cursor-stmt |
close-cursor-stmt
block ::=
'begin'
['declare' variable-declaration-commalist ';']
procedure-stmt-semicolonlist
'end'
return-stmt ::=
'return' expr
for-each-stmt ::=
'for each' [for-each-option] variable-declaration-commalist
[where-clause] for-each-body
for-each-option ::= 'distinct' | 'copy'
for-each-body ::= procedure-body
if-stmt ::=
'if' expr
'then' procedure-body
['else' procedure-body]
set-local-variable-stmt ::=
'set' local-variable '=' expr
while-stmt ::=
'while' expr 'do' procedure-stmt-semicolonlist 'end while'
loop-stmt ::=
'loop' procedure-stmt-semicolonlist 'end loop'
leave-stmt ::=
'leave'
create function creperson(Charstring nm,Integer inc) -> Person pThe procedural function makestudent() makes a person a student and sets the student's score() property:
as
begin
create Person instances p;
set name(p)=nm;
set income(p)=inc;
return p;
end;
set :p = creperson('Karl',3500);
The for-each statement statement iterates over the result of the query specified by the variable declarations executing the for-each body for each result variable binding of the query. For example the following procedural function adds inc to the incomes of all persons with salaries higher than limit and returns their old incomes:
create function increase_incomes(Integer inc,Integer thres)The for-each statement does not return any value at all unless a return statement is called within its body as in increase_incomes().
-> Integer oldinc
as for each Person p, Integer i
where i > thres
and i = income(p)
begin
return i;
set income(p) = i + inc
end;
set_addfunction(Function f, Function
up)->Boolean
set_remfunction(Function f, Function up)->Boolean
set_setfunction(Function f, Function up)->Boolean
The function f is the function for which we wish to
declare a user update function and up is the
actual update procedure. The arguments of a user update procedures
is the concatenation of argument and result tuples of f.
For example, assume we have a function
create function
netincome(Employee e) -> Number
as income(e)-taxes(e);
Then we can define the following user update procedure:
create function
set_netincome(Employee e, Number i) -> Boolean
as begin
set taxes(e)= i*taxes(e)/income(e) +
taxes(e);
set income(e) =
i*(1-taxes(e))/income(e) +
income(e);
end;
The following declaration makes netincome() updatable with the set
statement:
set_setfunction(#'employee.netincome->number',
#'employee.number->boolean');
Now one can update netincome() with, e.g.:
Amos II databases can be manipulated using SQL as an
alternative to AmosQL. The SQL preprocessor translates SQL
commands to corresponding AmosQL statements. The SQL
preprocessor is called using a special foreign function:
sql(Charstring query)->Bag of vector result
To make it possible to use an Amos II function in SQL queries,
its name must be prefixed with 'sql:'. Thus an Amos II function whose name
is sql:<table> is regarded
from SQL as a table named <table>
and can be queried and updated using SQL statements passed as
argument to the foreign function sql.
For example, assume we define the stored functions:
create function sql:employee(Integer ssn) -> (Charstring name, Number Income, Integer dept) as stored;
create function sql:dept(Integer dno) -> Charstring dname as stored;
Then we can populate the tables by the following calls to the
sql function:
sql("insert into employee values (12345, ‘Kalle’, 10000, 1)");
sql("insert into employee values (12386, ‘Elsa’, 12000, 2)");
sql("insert into employee values (12493, ‘Olof’, 5000, 1)");
sql("insert into dept values(1,’Toys’)");
sql("insert into dept values(2,’Cloths’)");
Examples of SQL queries are:
sql("select ssn from employee where name = ‘Kalle’");
sql("select dname from dept, employee where dept = dno and name=’Kalle’");
The parser is based on the SQL-92 version of SQL. Thus, the SQL processor
allows an Amos II database be both updated and queried using
SQL-92. The parser passes most of the SQL-92 validation
test. However, SQL views are not supported. For further
details see http://www.it.uu.se/research/group/udbl/Theses/MarkusJagerskoghMSc.pdf.
The command line option
amos2 ... -q sql...
will make Amos II accept SQL as query language in the top loop
rather than AmosQL.
nameserver(Charstring name)->Charstring
The function makes the current stand-alone database into a name
server and registers there itself as a peer with the given name.
If name is empty ("") the name server will become anonymous
and not registered as a peer. It can be accessed under the peer
name "NAMESERVER" though.
listen()
The function starts the peer listening loop. It informs the name
server that this peer is ready to receive incoming messages. The
listening loop can be interrupted with CTRL-C and resumed again
by calling listen().
The name server must be listening before any other peer can
register.
register(Charstring name)->Charstring
The function registers in the name server the current
stand-alone database as a peer with the given name. The
system will complain if the name is already registered in the
name server. The peer needs to be activated with listen() to be able to
receive incoming requests. The name server must be running on
the local host.
register(Charstring name, Charstring
host)->Charstring
Registers the current database as a peer in the federation name
server running on the given host.
Signals error if peer with same name already registered in
federation.
reregister(Charstring name)->Charstring
reregister(Charstring name, Charstring
host)->Charstring
as register() but
first unregisters another registered peer having same name
rather than signaling error. Good when restarting peer
registered in name server after crash so the crashed peer will
be unregistered.
this_amosid()->Charstring name
Returns the name of the peer where the call is issued.
Returns the string "NIL" if issued in a not registered
standalone Amos II system.
other_peers()->Bag of Charstring name
Returns the names of the other peers in the federation
managed by the name server.
ship(Charstring peer, Charstring cmd)-> Bag
of Vector
Ships the AmosQL command cmd
for execution in the named peer. The result is shipped back to
the caller as a set of tuples represented as vectors. If
an error happens at the other peer the error is also shipped
back.
call_function(Charstring peer, Charstring fn, Vector args,
Integer stopafter)-> Bag of Vector
Calls the Amos II function named fn with argument list args in peer. The result is
shipped back to the caller as a set of tuples represented as
vectors. The maximum number of tuples shipped back is limited
by stopafter. If an
error happens at the other peer the error is also shipped
back.
send(Charstring peer, Charstring
cmd)-> Charstring peer
Sends the AmosQL command cmd
for asynchronous execution in the named peer without waiting for
the result to be returned. Errors are handled at the other peer
and not shipped back.
send_call(Charstring peer, Charstring fn, Vector args)->
Boolean
Calls the Amos II function named fn with argument list args asynchronously in
the named peer
without waiting for the result to be returned. Errors are
handled at the other peer and not shipped back.
broadcast(Charstring cmd)-> Bag of
Charstring
Sends the AmosQL command cmd
for asynchronous execution in all other peers. Returns the names
of the receiving peers.
gethostname()->Charstring name
Returns the name of the host where the current peer is running.
kill_peer(Charstring name)->Charstring
Kills the named peer. If the peer is the name server it will not
be killed, though. Returns the name of the killed peer.
kill_all_peers()->Bag of Charstring
Kills all peers. The name server and the current peer will still
be alive afterwards. Returns the names of the killed peers.
kill_the_federation()->Boolean
Kills all the peers in the federation, including the name server
and the peer calling kill_the_federation.
is_running(Charstring peer)->Boolean
Returns true if peer
is listening.
wait_for_peer(Charstring peer)->Charstring
Waits until the peer is running and then returns the peer name.
amos_servers()->Bag of Amos
Returns all peers managed by the name server on this computer.
You need not be member of federation to run the function.
Once you have established connections to Amos II peers you can
define views of data from your peers. You first have to import
the meta-data about selected types and functions from the peers.
This is done by defining proxy
types and proxy
functions [RJK03] using the
system function import_types:
import_types(Vector of Charstring typenames, Charstring
p)-> Bag of Type pt
defines proxy types pt
for types named typenames
in peer p. Proxy
functions are defined for the functions in p having the imported
types as only argument. Inheritance among defined proxy
types is imported according to the corresponding
inheritance relationships between imported types in the peer p.
Once the proxy types and functions are defined they can
transparently be queried. Proxy types can be references using @ notation to reference
types in other peers.
For example,
select
name(p) from Person@p1;
selects the name
property of objects of type Person in peer p1.
import_types
imports only those functions having one of typenames
as its single arguments. You can import other functions using
system function import_func:
import_func(Charstring fn, Charstring p)->Function pf
imports a function named fn
from peer p
returning proxy function pf.
On top of the imported types object-oriented multi-peer views
can be defined, as described in [RJK03]
consisting of derived types
[JR99a] whose extents are derived through
queries intersecting extents of other types, and IUTs [JR99b]
whose extents reconciles unions of other type extents. Notice
that the implementation of IUTs is limited. (In particular the
system flag latebinding('OR');
must be set before IUTs can be used and this may cause other
problems).
This chapter first describes multi-directional foreign functions [LR92], the basis for accessing external systems from Amos II queries. Then we describe how to query relational databases through Amos II. Finally some general types and functions used for defining wrappers of external sources are described.
Amos II provides a number of primitives for accessing
different external data sources by defining wrappers for
each kind external sources. A wrapper is a software module for
making it possible to query an external data source using
AmosQL. The basic wrapper interface is based on user defined multi-directional foreign
functions having various capabilities
used to access external data sources in different ways [LR92] depending on what variables are bound
or free in an execution plan, the binding patterns. On top of the basic foreign
function mechanism object oriented abstractions are defined
through mapped types [FR97]. A number of important query rewrite
techniques used in the system for scalable access to wrapped
sources, in particular relational databases, are described in [FR97]. Rewrites for handling scalable
execution of queries involving late bound function calls are
described in [FR95]. Multi-database views
are further described in [JR99a][JR99b]. The distributed query decomposer is
described in [JR02] and [RJK03].
A general wrapper for relational databases using JDBC is predefined in Amos II.
parents(Person)-> Bag of Personwe can ask these AmosQL queries:
parents(:p); /* Result is the bag of parents of :p */It is often desirable to make foreign Amos II functions multi-directional as well. As a very simple example, we may wish to ask these queries using the square root function sqroots above:
select c from Person c where :p in parents(c);
/* Result is bag of children of :p */
sqroots(4.0); /* Result is -2.0 and 2.0 */With simple foreign functions only the forward (non-inverse) function call is possible. Multi-directional foreign functions permit also the inverse to be called in queries.
select x from Number x where sqroots(x)=4.0;
/* result is 16.0 *
sqroots(4.0)=2.0;
/* Is the square root of 4 = 2 */
create function plus(Number x, Number y) -> Number rFor further details on how to define multidirectional foreign functions for different implementation languages see [Ris12][ER00].
as multidirectional
('bbf' key foreign 'plus--+') /* addition*/
('bfb' key foreign 'plus-+-') /* subtraction */
('fbb' key select x where y+x=r); /* Addition is commutative */
create function <cfn>(Function f, Vector bpat, Vector args)The cost function is normally called at compile time when the optimizer needs the cost and fanout of a function call in some query. The arguments and results of the cost function are:
-> (Integer cost, Integer fanout) as ...;
e.g.
create function typesofcost(Function f, Vector bpat, Vector args)
-> (Integer cost, Integer fanout) as foreign ...;
costhint(Charstring fn,Charstring bpat,Vector ch)->Booleane.g.
costhint("number.sqroots->number","bf",{4,2});fn is the full name of the resolvent.
costhint("number.sqroots->number","fb",{2,1});
There is a predefined wrapper for relational databases using the JDBC standard Java-based relational database interface. The JDBC wrapper is tested with MySQL Connector and Microsoft's SQLServer driver.
An instance of type Relational represents a relational database and functions of type Relational implements the interface to relational databases. The general wrapper Relational is an abstract wrapper in the sense that it does not implement an interface a a specific relational DBMS therefore has no instances. Several of the database interface functions of type Relational are defined as abstract functions. In the type hierarchy there is a specific implemented wrapper for JDBC represented by type Jdbc. The type Jdbc has one instance for each relational database JDBC connection. The type hierarchy is currently:
Datasource
|
Relational
|
Jdbc
If some other interface than JDBC (e.g. ODBC) is used for a
relational database it would require the implementation of a new
wrapper also being subtype to Relational.
The use of abstract functions type checking to find
equivalent implementations for different relational database
interfaces.
jdbc("db1","com.mysql.jdbc.Driver");Connecting the data source to a database
connect(Relational r, Charstring database, Charstring username, Charstring password) -> Relationalwhere r is the data source object, db is the identifier of the database to access, along with user name and password to use when accessing the database. For example, if the relational database called 'Personnel' resides on the local computer and MySQL is used for the managing it, the following opens a connection to the database for user 'U1' with password 'PW':
disconnect(Relational r) -> Booleanfor example:
relational_named(Charstring nm)-> Relationalfor example:
tables(Relational r)for example
-> Bag of (Charstring table, Charstring catalog,
Charstring schema, Charstring owner)
tables(relational_named("db1"));The function tables() returns a bag of tuples describing the tables stored in the relational database.
has_table(Relational r, Charstring table_name) -> Booleanfor example
has_table(relational_named("db1"),"SALES");
columns(Relational r, Charstring table_name)for example
-> Bag of (Charstring column_name, Charstring column_type)
cardinality(Relational r, Charstring table_name) -> Integerfor example
cardinality(relational_named("db1"),"SALES");
primary_keys(relational r, charstring table_name)for example:
-> Bag of (charstring column_name, charstring constraint_name)
primary_keys(relational_named("db1"),"CUSTOMER");
imported_keys(Jdbc j, Charstring fktable)for example
-> Bag of (Charstring pktable, Charstring pkcolumn, Charstring fkcolumn)
imported_keys(relational_named("db1"),"PERSON_TELEPHONES");The elements of the result tuples denote the following:
pktable
- The table referenced by the foreign
key. pkcolumn
- The column referenced by the foreign
key. fkcolumn
- The foreign key column in the table.
exported_keys(Jdbc j, Charstring pktable)for example
-> Bag of (Charstring pkcolumn, Charstring fktable, Charstring fkcolumn)
pkcolumn
- The primary key column in the table.
fktable
- The table whose foreign key
references the table. fkcolumn
- The foreign key column in the table
that references the table.SQL statements
The function sql() executes an arbitrary SQL
statement as a string:
sql(Relational r, Charstring query) -> Bag of Vector
results
The result is a bag of results tuples represented as vectors.
If the SQL statement is an update a single tuple containing one
number is returned, being the number of rows affected by the
update. Example:
sql(relational_named("db1"), "select NAME from PERSON where
INCOME > 1000 and AGE>50");
Parameterized SQL
statements
To execute the same SQL statement with different parameters
use:
sql(Relational r, Charstring query, Vector params) -> Bag
of Vector results
The parameters params are substituted into the corresponding
occurrences in the SQL statement, for example:
import_table(Relational r, Charstring table_name) -> Mapped_typefor example
import_table(relational_named("db1"),"SALES");The view is represented by a mapped type which is a type whose extent is defined by the rows of the table. Each instance of the mapped type corresponds to a row in the table. The name of the mapped type is constructed by concatenating the table name, the character _ and the data source name, for example Person_db1. Mapped type names are internally capitalized, as for other Amos II types.
For each columns in the mapped relational database table import_table()
will generate a corresponding derived wrapper function
returning the column's value given an instance of the mapped
type. For example, a table named person
having the
column ssn
will have a function
ssn(Person_db1)->Integerreturning the social security number of a person from the imported relational table.
import_table(Relational r,for example
Charstring table_name,
Boolean updatable)
-> Mappedtype mt
import_table(relational_named("db1"),"COUNTRY",true);If the flag updatable is set to true the functions in the view are transparently updatable so the relational database is updated when instances of the mapped type are created or the extent of some wrapper function updated. For example:
create Country_db1(currency,country) instances ("Yen","Japan");
set currency(c)= "Yen" from Country_db1 c where country(c)= "Japan";
The most general resolvent of import_table() is:
import_table(Relational r, Charstring catalog_name,
Charstring schema_name, Charstring table_name,
Charstring typename, Boolean updatable,
Vector supertypes) -> Mappedtype mt
The table resides in the given catalog and schema. If catalog
is "", the table is assumed not to be in a catalog. If schema
is "", the table is assumed not to be in a schema. The
parameter typename is the desired name of the mapped
type created, as alternative to the system generated
concatenation of table and data source name.
The
parameter updatable gives an updatable mapped type. The
parameter supertypes
is a vector of either type names or type
objects, or a mixture of both. The effect is that Amos II will
perceive the mapped type as an immediate subtype of the
supertypes.
import_table(Relational r,
Charstring table_name,
Charstring typename,
Boolean updatable,
Vector supertypes) -> Mappedtype mt
import_table(Relational r,
Charstring table_name,
Charstring type_name,
Boolean updatable) -> Mappedtype mt
All tables in relational databases do not correspond to
'entities' in an ER diagram and therefore cannot be directly
mapped to types. The most common case is tables representing
many-to-many relationships between entities. Typically such
tables have two columns, each of which is a foreign key imported
from two other tables, expressing the many-to-many relationship
between the two. Only entities are imported as types and special
types are not generated for such relationship tables. A
many-to-many relationship in a relational database corresponds
to a function returning a bag in AmosQL, and can be
imported using import_relation() rather than import_table():
import_relation(Relational r,
Charstring table_name, Charstring argument_column_name,
Charstring result_column_name, Charstring function_name,
Boolean updatable)
-> Function
table_name
- the name of the table containing
the relation.argument_column_name
- the name of the column
which is argument of the function.result_column_name
- the name of the column
which is result of the function.function_name
- the desired name of the
function.updatable
- whether the function should be
transparently updatable via set
, add
,
and remove
.For example, assume we have two entity types, person and telephone. Most of the time telephones are not regarded as entities in their own respect since nobody would care to know more about a telephone than its number. However, assume that also the physical location of the telephone is kept in the database, so telephones are an entity type of their own.
A person can be reached through several telephones, and every telephone may be answered by several person. The schema looks as follows:
ssn |
name |
... |
|
|
|
ssn |
ext_no |
|
|
|
|
ext_no |
location |
... |
|
|
|
import_table(my_relational, 'person');
import_table(my_relational, 'telephone');
import_relation(my_relational, 'telephone',
'ssn','ext_no','phones', false);
create function phones(person@my_relational p) ->
telephone@my_relational t as select t where phones(ssn(p)) =
ext_no(t);
Notice that only relationship functions with a single argument
and result can be generated, i.e. composite foreign keys are not
supported.
amos_type(Datasource ds, Charstring native_type_name) -> Type;
amos_type(relational_named("IBDS"),"VARCHAR");amos_type returns the Amos II type corresponding to the a specific data source.
wrapped_type(Datasource ds, Type t) -> Charstring typename;
wrapped_type(relational_named("IBDS"),typenamed("CHARSTRING"));returns the data source type corresponding to an Amos II type. Since one external type may correspond to more than one Amos II type, the function wrapped_type() is not the inverse of amos_type(). The most common relational types and their Amos II counterparts are provided by default. Both functions are stored functions that can be updated as desired for future wrappers.
=(Object x, Object y) -> Boolean (infix operator =)All objects can be compared. Strings are compared by characters, lists by elements, OIDs by identifier numbers. Equality between a bag and another object denotes set membership of that object. The comparison functions can be overloaded for user defined types.
!=(Object x, Object y) -> Boolean (infix operator !=)
>(Object x, Object y) -> Boolean (infix operator >)
>=(Object x,Object y) -> Boolean (infix operator >=)
<(Object x, Object y) -> Boolean (infix operator <)
<=(Object x,Object y) -> Boolean (infix operator <=)
abs(Number x) -> Number yiota() constructs a bag of integers between l and u.
div(Number x, Number y) -> Number z Infix operator /
max(Object x, Object y) -> Object z
minus(Number x, Number y) -> Number z Infix operator -
mod(Number x, Number y) -> Number z
plus(Number x, Number y) -> Number z Infix operator +
times(Number x, Number y) -> Number z Infix operator *
power(Number x, Number y) -> Number z Infix operator ^
iota(Integer l, Integer u)-> Bag of Integer z
sqrt(Number x) -> Number z
integer(Number x) -> Integer i Round x to nearest integer
real(Number x) -> Real r Convert x to real number
roundto(Number x, Integer d) -> Number Round x to d decimals
log10(Number x) -> Real y
for each Integer i where i in iota(1,n)
print(1);
aggfn(Bag of Type1 x) -> Type2
The following are examples of predefined aggregate functions:
sum(Bag of Number x) -> NumberAggregate functions can be used in subqueries or bags.
count(Bag of Object x) -> Integer
avg(Bag of Number x) -> Real
stdev(Bag of Number x) -> Real
max(Bag of Object x) -> Object
min(Bag of Object x) -> Object
in(Bag of Object b) -> Bag in(Vector v) -> BagFor example:
in({1,2,2}); => 1 2 2
count(Bag of Object b) -> IntegerFor example:
count(iota(1,100000)); => 100000
sum(Bag of Number b) -> NumberFor example:
sum(iota(1,100000)); => 705082704
avg(Bag of Number b) -> RealFor example:
avg(iota(1,100000)); => 50000.5
stdev(Bag of Number b) -> RealFor example:
stdev(iota(1,100000)); => 28867.6577966877
max(Bag of TP b) -> TP rThe type of the result is the same as the type of elements of argument bag. For example:
maxagg(Bag of Tp b) -> TP r
max(bag(3,4,2))+2; => 6maxagg() is an alias for max().
min(Bag of TP b) -> TP rThe type of the result is the same as the type of elements of argument bag. For example:
minagg(Bag of Tp b) -> TP r
minagg(bag(3,4,2))+2; => 4minagg() is an alias for min():
notany(Bag of Object b) -> Boolean bFor example:
notany(bag()); => TRUE notany(select n from number n where n>5 and n in {1,2,3}); TRUE
some(Bag of Object b) -> Boolean bFor example:
some(iota(1,1000000)); => TRUE some(select n from number n where n in {1,2,3} and n < 10); => TRUEThe aggregate function concatagg() makes a string of the elements in a bag b:
concatagg(Bag of Object b)-> Charstring sFor example:
concatagg(bag("ab",2,"cd")); => "ab2cd" concatagg(inject(bag("ab",2,"cd"),",")); => "ab,2,cd"
unique(Bag of TP b) -> Bag of TP rThe type of the result bag is the same as the type of elements of argument bag. For example:
unique(bag(1,2,1,4)); => 1 2 4The generalized aggregate function exclusive() extract non-duplicated elements from a bag b:
exclusive(Bag of TP b) -> Bag of TP rThe type of the result bag is the same as the type of elements of argument bag. For example:
exclusive(bag(1,2,1,4)); => 2 4The generalized aggregate function inject() inserts x between elements in a bag b:
inject(Bag of Object b, Object x) -> Bag of Object rFor example:
inject(bag(1,2,3),0); => 1 0 2 0 3The generalized aggregate functions topk() and leastk() return the k highest and lowest elements in a bag of key-value pairs p:
topk(Bag b, Number k) -> Bag of (Object rk,
Object value)
leastk(Bag b, Number k) -> Bag of (Object rk, Object
value)
If the tuples in b have only one attribute (the rk attribute) the value will
be nil. The limit clause of
select statements provide a more general way to do this, do these
functions are normally not used.
For example,
topk(iota(1, 100), 3);
returns
sort(Bag b)->VectorNotice that the result of sorting an unordered bag is a vector.
sort(Bag b, Charstring order)->Vector
Amos 1> sort(1-iota(1,3));The default first case is to sort the result in increasing order.
=> {-2,-1,0}
Amos 2> sort(1-iota(1,3),'dec');
{0,-1,-2}
Amos 3> sort(select i, 1-i from Number i where i in iota(1,3));
{{1,0},{2,-1},{3,-2}}
sortbagby(Bag b, Integer pos, Charstring order) -> VectorFor example:
sortbagby(Bag b, Vector of Integer pos, Vector of Charstring order)->Vector
Amos 1> sortbagby((select i, mod(i,2) from Number i where i in iota(1,3)),1,'dec');In the first case a single tuple ordering directive i specified by two arguments, one for the tuple position and one for the ordering direction. The result tuple positions are enumerated 1 and up.
=> {{3,1},{2,0},{1,1}}
Amos 2> sortbagby((select i, mod(i,2) from Number i where i in iota(1,3)),{2,1},{'inc','inc'});
=> {{2,0},{1,1},{3,1}}
sortvector(Vector v1, Function compfno) -> VectorExample:
sortvector(Vector v, Charstring compfn) -> Vector
sortbag(Bag b, Function compfno) -> Vector
sortbag(Bag b, Charstring compfn) -> Vector
create function younger(Person p1, Person p2) -> Boolean
as age(p1) < age(p2);
/* Sort all persons ordered by their age */
sortbag((select p from Person p), 'YOUNGER');
subtypes(Type
t) -> Bag of Type s
supertypes(Type t) -> Bag of Type s
returns the types immediately below/above type t in the
type hierarchy.
allsupertypes(Type t) -> Bag of Type s
returns all types above t in the type hierarchy.
typesof(Object o) -> Bag of Object t
returns the type set of an object.
typeof(Object o) -> Type t
returns the most specific type of an object.
typenamed(Charstring nm) -> Type t
returns the type named nm. Notice that type names are in
upper case.
name(Type t) -> Charstring nm
returns the name of the type t.
attributes(Type t) -> Bag of Function g
returns the generic
functions having a single argument of type t and a
single result.
methods(Type t) -> Bag of Function r
returns the resolvents having a single argument of type t
and a single result.
cardinality(Type t) -> Integer c
returns the number of object of type t and all its
subtypes.
objectname(Object o, Charstring nm) -> Boolean
returns true if the object o has the
name nm.
allfunctions() -> Bag of Function
returns all functions in the database.
functionnamed(Charstring
nm) -> Function
returns the object representing the function named nm.
Useful for second order
functions.
theresolvent(Charstring
nm) -> Function
returns the single resolvent
of a generic function named nm.
If there is more than one resolvent for nm an error is raised. If fn is the name of a
resolvent its functional is returned. The notation #'...'
is syntactic sugar for theresolvent('..');
name(Function f) -> Charstring
returns the name of the function f.
signature(Function
f) -> Charstring
returns the signature of f.
generic(Function f) -> Function
returns the generic function
of a resolvent.
resolvents(Function g) -> Bag of Function
returns the resolvents of an overloaded
function g.
resolvents(Charstring fn) -> Bag of Function
retturns the resolvents of an overloaded function named fn.
resolventtype(Function f) -> Bag of Type
returns the types of only the first
argument of the resolvents of function resolvent f.
arguments(Function r) -> Bag of Vector
returns vector describing arguments of signature of resolvent
r. Each element in the vector is a triplet (vector)
describing one of the arguments with structure {type,name,uniqueness}
where type is the type of the argument, name
is the name of the argument, and uniqueness is either key
or nonkey depending on the declaration of the argument..
For example,
arguments(#'timespan');
--> {{#[OID 371 "TIMEVAL"],"TV1","nonkey"},
{#[OID 371 "TIMEVAL"],"TV2","nonkey"}}
results(Function r) -> Bag of Vector
Analogous to arguments for result (tuple) of function.
arity(Function f)-> Integer
returns the number of arguments of function.
width(Function f) -> Integer
returns the width of the result tuple of function f.
sourcecode(Function f)
-> Bag of Charstring
sourcecode(Charstring fname) -> Bag of Charstring
returns the sourcecode of a function f, if
available. For generic
functions the sources of all resolvents are returned. For example:
sourcecode("sqrt");
To find all functions whose definitions contain the string
'tclose' use:
select sc
from Function f, Charstring sc
where like_i(sc,"*tclose*") and
source_text(f)=sc;
usedwhere(Function f) -> Function c
returns the functions calling the function f.
useswhich(Function f) -> Function c
returns the functions called from the function f.
userfunctions() -> Bag
of Function
returns all user defined functions in the database.
usertypes() -> Bag of
Type
returns all user defined types in the database.
allfunctions(Type t)-> Bag of Function
returns all functions where one of the arguments are of type t.
extent(Type t)->Bag of ObjectFor example, to count how many functions are defined in the database call:
count(extent(typenamed("function")));To get all surrogate objects in the database call:
shallow_extent(Type t)->Bag of ObjectFor example:
shallow_extent(typenamed("object"));returns nothing since type Object has no own instances.
extent(Function fn) -> Bag of VectorFor example,
extent(#'coercers');For stored functions the extent is directly stored in the local database. The example query thus returns the state of all stored functions. The state of the local database is this state plus the deep extent of type Object.
The extent is always defined for stored functions and can also be computed for derived functions through their function definitions. The extent of a derived function may not be computable, unsafe, in which case the extent function returns nothing.
The extent of a foreign function is always empty.
The extent of a generic function is the union of the extents of its resolvents.
optmethod(Charstring new) -> Charstring oldoptmethod returns the old setting of the optimization method.
sets the optimization method used for cost-based optimization in Amos II to the method named new.
Three optimization modes for AmosQL queries can be chosen:
"ranksort": (default) is fast but not always optimal.
"exhaustive": is optimal but it may slow down the optimizer considerably.
"randomopt": is a combination of two random optimization heuristics:
Iterative improvement and sequence heuristics [Nas93].
optlevel(Integer i,Integer j);where i and j are integers specifying number of iterations in iterative improvement and sequence heuristics, respectively. Default settings is i=5 and j=5.
reoptimize(Function f) -> Boolean
reoptimize(Charstring fn) -> Boolean
reoptimizes function named fn.
For example, in the following function there can be only one name per person:
create function name(Person)->Charstring as stored;By contrast, names() allow more than one name per person:
create function names(Person p)->Bag of Charstring nm as stored;Any other argument or result declaration can be suffixed with the keyword key to indicate the position of a unique index. For example, the following definition puts a unique index on nm to prohibit two persons to have the same name:
create function name(Person p)->Charstring nm key as stored;Indexes can also be explicitly created on any argument or result with a procedural system function create_index():
create_index(Charstring fn, Charstring arg, Charstring index_type,For example:
Charstring uniqueness)
create_index("person.name->charstring", "nm", "hash", "unique");The parameters of create_index() are:
create_index("names", "charstring", "mbtree", "multiple");
fn: The name of a stored function. Use the resolvent name for overloaded functions.
arg: The name of the argument/result parameter to be indexed. When unambiguous, the names of types of arguments/results can also be used here.
index_type: The kind of index to put on the argument/result. The supported index types are currently hash indexes (type hash), ordered main-memory B-tree indexes (type mbtree), and X-tree spatial indexes (type xtree). The default index for key/nonkey declarations is hash.
uniqueness: Index uniqueness indicated by unique for unique indexes and multiple for non-unique indexes.
Indexes are deleted by the procedural system function:
drop_index(Charstring functioname, Charstring argname);The meaning of the parameters are as for function create_index(). There must always be at least one index left on each stored function and therefore the system will never delete the last remaining index on a stored function.
To save space it is possible to delete the default index on the first argument of a stored function. For example, the following stored function maps parts to unique identifiers through a unique hash index on the identifier:
create type Part;partid() will have two indexes, one on p and one on id. To drop the index on p do the following:
create function partid(Part p)->Integer id as stored;
drop_index('partid', 'p');
For example, to cluster the properties name and address of persons one can define:
create function personprops(Person p) ->Clustering does not improve the execution time performance significantly. However, clustering can decrease the database size considerably.
(Charstring name,Charstring address) as stored;
create function name(Person p) -> Charstring nm
as select nm
from Charstring a
where personprops(p) = (nm,a);
create function address(Person p) -> Charstring a
as select a
from Charstring nm
where personprops(p) = (nm,a);
unload(Charstring filename)->Boolean
generates an AmosQL script that restores the current local
database state if loaded into an Amos II peer with the same
schema as the current one.
excluded_fns()->Bag of Function
set of functions that should not be unloaded. Can be updated by
user.
error(Charstring msg) -> Boolean
prints an error message on the terminal and raises an exception.
Transaction aborted.
print(Object x) -> Boolean
openwritefile(Charstring filename)->Boolean
quit;
quits Amos II. If the system is registered as a peer it will be
removed from the name server.
exit;
starts the multi-database browser GOOVI[CR01]. This works only under JavaAmos.
goovi();
The redirect statement reads AmosQL statements from a file:
redirect-stmt ::= '<' string-constantFor example
< 'person.amosql';load_amosql(Charstring filename)->Charstring
[CR01]K.Cassel and T.Risch: An Object-Oriented Multi-Mediator Browser. Presented at 2nd International Workshop on User Interfaces to Data Intensive Systems, Zürich, Switzerland, May 31 - June 1, 2001
[ER00] D.Elin and T. Risch: Amos II Java Interfaces, Uppsala University, 2000.
[FR95] S. Flodin and T. Risch, Processing Object-Oriented Queries with Invertible Late Bound Functions, Proc. VLDB Conf., Zürich, Switzerland, 1995.[JR99a] V.Josifovski, T.Risch: Functional Query Optimization over Object-Oriented Views for Data Integration Journal of Intelligent Information Systems (JIIS), Vol. 12, No. 2-3, 1999.
[JR99b] V.Josifovski, T.Risch: Integrating
Heterogeneous
Overlapping Databases through Object-Oriented Transformations.
In Proc. 25th Intl. Conf. On Very Large Databases,
Edinburgh, Scotland, September 1999.
[JR02] V.Josifovski, T.Risch: Query
Decomposition for a Distributed Object-Oriented Mediator
System . Distributed
and Parallel Databases J., Kluwer, May 2002.
[KJR03] T.Katchaounov, V.Josifovski, and T.Risch: Scalable View Expansion in a Peer Mediator System, Proc. 8th International Conference on Database Systems for Advanced Applications (DASFAA 2003), Kyoto, Japan, March 2003.
[LR92] W.Litwin and T.Risch: Main Memory Oriented Optimization of OO Queries Using Typed Datalog with Foreign Predicates, IEEE Transactions on Knowledge and Data Engineering, Vol. 4, No. 6, December 1992 ( http://user.it.uu.se/~udbl/publ/tkde92.pdf).
[Nas93] J.Näs: Randomized optimization of object oriented queries in a main memory database management system, MSc thesis, LiTH-IDA-Ex 9325 Linköping University 1993.
[Ris12] T.Risch: Amos
II C Interfaces, Uppsala University, 2012.
[Ris06]T.Risch: ALisp v2
User's Guide, Uppsala University, 2006.
[RJK03] T.Risch, V.Josifovski, and T.Katchaounov: Functional Data Integration in a Distributed Mediator System, in P.Gray, L.Kerschberg, P.King, and A.Poulovassilis (eds.): Functional Approach to Data Management - Modeling, Analyzing and Integrating Heterogeneous Data, Springer, ISBN 3-540-00375-4, 2003.