For the development of scientific programs it is desirable to use the common mathematical notation and operators.
This requires access to numeric packages and very general operator overloading, more general than traditional languages (C, C++, Fortran 90, Ada, ...) support.
For scientific programs, object-oriented features play a subordinate role:
The meaning of an operator is implied by its symbol ('*', for example, performs a multiplication). The type of the operands does not matter for the meaning: you can multiply integers, floats or matrices, and it makes no formal difference.
For classes it is the other way round. The name of the class (type) generally determines the allowed operations (methods), so the structuring criterion is the type and not the operator (method).
The COX language extension is upward compatible with C and supports a general operator concept.
The improvement in readability must not come at the cost of speed.
In the following text, COXHOME is the location of the COX system. This should be c:/cox on OS/2 systems and /usr/local/cox on UNIX systems.
/usr/local
for UNIX systems.
COXHOME/rtl
)
to LIBPATH
in the config.sys
file (OS/2 only).
COXHOME/etc/cox.cfg
in the section directories
.
The include entry must begin with the gcc fixinclude path (if COX is used with the gcc compiler or on systems with old-style prototypes in header files), followed by the system include path.
The dll entry must contain the libc library, either directly or as the path in which this library exists.
Add COXHOME/elisp to your EMACSLOADPATH in order to use the Emacs interactive interface for COX (coxi.el), and COXHOME/info to your INFOPATHS variable, because the integrated help system searches the paths specified there for keywords given to the help function.
Add COXHOME/bin to the PATH variable. On UNIX systems create a (symbolic) link from COXHOME/bin/cox to /usr/local/bin/cox.
The cox directory contains the following subdirectories and files:
coxcpp, geninit, cox, graphic
cox.cfg and a sample Startup.emx file for use with dmake version 3.8 (for OS/2).
BIAS, Profil, LAPACK, BLAS
Start the interpreter with cox -i. The interpreter will then process all header files included by startup.h.
Typing m = Hilbert 3; creates a variable m of type MATRIX which contains the Hilbert matrix of dimension 3.
You will see some warnings indicating that m is undeclared at this time and will be declared as a variable of type MATRIX (2):
Warning: undeclared identifier defaults to variable.
Warning: return-type of undeclared variable has been reconstructed: MATRIX m
Print this variable by typing print m;. This should produce the following output:
((1.000000e+00 ; 5.000000e-01 ; 3.333333e-01) (5.000000e-01 ; 3.333333e-01 ; 2.500000e-01) (3.333333e-01 ; 2.500000e-01 ; 2.000000e-01))
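The values printed above follow the usual definition of the Hilbert matrix, whose entry in row i and column j (counted from 1) is 1/(i+j-1). The following plain C sketch reproduces them; the name hilbert_fill is invented for this illustration and is not part of the COX packages.

```c
#include <assert.h>
#include <stddef.h>

/* Fill an n x n row-major array with the Hilbert matrix.
   With 0-based indices, entry (i, j) is 1 / (i + j + 1). */
void hilbert_fill(double *h, size_t n)
{
    assert(h != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            h[i * n + j] = 1.0 / (double)(i + j + 1);
}
```

Calling hilbert_fill on a nine-element array with n = 3 yields the three rows shown in the interpreter output.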
Entering print svd m; will print the singular values of matrix m (try help "svd"; for an explanation).
(1.408319e+00 ; 1.223271e-01 ; 2.687340e-03)
The expression (U,s,V) = svd m; will assign the left singular vectors to U, the singular values to s and the right singular vectors to V. The left-hand side of the assignment is an aggregate consisting of the components U, s and V.
The COX language is an extension of C and does not allow true matrix or vector literals. Such literals must be represented by strings. This means that the type of the variable must be already known to the system when assigning a matrix or vector literal.
Now create a vector: VECTOR v;
After this you can convert a string to a (column) vector: v = "1;2;3";
You can multiply matrix and vector as usual: print m*v;
(3.000000e+00 ; 1.916667e+00 ; 1.433333e+00)'
or just access the last two elements of the vector v: print v[2..]
(2.000000e+00 ; 3.000000e+00)'
char *
, which represents the
exact value the programmer typed.
aggregate
new
and delete
for memory management
break identifier, continue identifier
)
typeof(expression)
.
nullfix, prefix, postfix, infix, cast, operator, precedence, type, attribute, delete
forbidden
temporary, constructor, destructor, aggregate, new, template, typeof, inline, FctResult
general operator concept means: operators which differ in
may have the same names (symbols).
You may define new operators (e.g. the index operator ..), or redefine an existing one with a new kind (e.g. postfix !).
The operator name Id (identity) exists as a prefix operator (Id(3)) and as a nullfix operator (Id - R * A).
An allowed operator name may be any C-identifier or it may consist of operator symbols
{a-zA-Z_} ( {a-zA-Z_} | {0-9} )*
{| ^ ! $ % & / = + - * ~ < > . : ?}*
? : , . ... ** =* =& =-
Names that would be ambiguous with valid C expressions are forbidden (e.g. <- because of a <-3, which is allowed in C).
(Declaration)
result-type nullfix operator op-name ;
where
If the declaration of an identifier is omitted, this identifier defaults to a variable. The type will automatically be reconstructed, if possible. Since references and const-types need to be initialized at their declaration, the reconstructed type will never be a reference- or const-type.
(Example)
int nullfix operator Rand;
This declares a nullfix operator Rand, whose implementation should return a random number. Every evaluation triggers a new calculation of the result. Rand is not a simple variable which holds its value, but a function without parameters.
(Declaration)
result-type prefix operator op-name ( op-type );
result-type prefix operator op-name ( op-type ) precedence op-kind op-name2 ;
result-type prefix operator op-name ( op-type ) precedence num ;
where
prefix
, postfix
or infix
).
(Example)
double prefix operator Sin(double x ) precedence 28;
This example shows the declaration of a prefix operator Sin, which has the recommended precedence for functions (very high).
new
pointer-type prefix operator new( unsigned long );
(Use)
pointer = new ( type );
pointer = new ( type )[ no-elements ];
pointer = new ( type )( constructor-parameter );
pointer = new ( type )[ no-elements ]( constructor-parameter );
If there is no new operator defined, the function malloc will be called.
If you want to implement your own memory manager, you can overload the new operator with result type void *. This one will be called if an exactly matching operator does not exist.
Objects with constructors are initialized directly after the memory allocation.
If the objects need destructors, the number of created objects will be stored in an additional memory area (which is allocated automatically).
delete
void prefix operator delete( pointer-type );
(Use)
delete pointer ;
The precompiler first looks for an exactly matching operator, then for one with argument type void *. If neither exists, the function free is used instead.
Before the memory is discarded, all required destructors are called. This is why the additional memory area that stores the number of elements is needed.
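The bookkeeping described for new and delete can be pictured in plain C. This is only a sketch of the idea, not the code the precompiler generates: a header in front of the array stores the element count, so the release routine knows how many destructors to run. All names (vec_t, array_new, array_delete, live_objects) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { double *data; size_t dim; } vec_t;

/* Header stored in front of the elements; the extra union members only
   keep the elements that follow suitably aligned. */
typedef union { size_t count; long double ld; void *p; } array_header;

static int live_objects = 0;   /* demonstration counter only */

static void vec_construct(vec_t *v) { v->data = NULL; v->dim = 0; live_objects++; }

static void vec_destruct(vec_t *v)
{
    assert(live_objects > 0);
    free(v->data);
    v->data = NULL;
    live_objects--;
}

/* Allocate n elements, remember n, then run the constructors. */
vec_t *array_new(size_t n)
{
    array_header *h = malloc(sizeof *h + n * sizeof(vec_t));
    if (h == NULL)
        return NULL;
    h->count = n;
    vec_t *elems = (vec_t *)(h + 1);
    for (size_t i = 0; i < n; i++)
        vec_construct(&elems[i]);
    return elems;
}

/* Run the destructors in reverse order, then release the whole block. */
void array_delete(vec_t *elems)
{
    if (elems == NULL)
        return;
    array_header *h = (array_header *)elems - 1;
    for (size_t i = h->count; i > 0; i--)
        vec_destruct(&elems[i - 1]);
    free(h);
}
```

The caller only ever sees the pointer to the first element; the count stays hidden in front of it, which matches the "additional memory area" mentioned in the text.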
result-type postfix operator op-name ( op-type );
result-type postfix operator op-name ( op-type ) precedence op-kind op-name2 ;
result-type postfix operator op-name ( op-type ) precedence num ;
where
prefix
, postfix
or infix
).
(Example)
long postfix operator !(long n ) precedence 28;
This example declares the exclamation mark as a postfix operator. This notation is usually used for denoting the factorial.
result-type infix operator op-name ( lop-type ; rop-type );
result-type infix operator op-name ( lop-type ; rop-type ) precedence op-kind op-name2 ;
result-type infix operator op-name ( lop-type ; rop-type ) precedence num ;
where
prefix
, postfix
or infix
).
If the assignment operator is not overloaded (the normal case!), in many cases the precompiler can implicitly execute the assignment.
Result types which are not simple, like structures, are passed to the operator as a hidden reference parameter. In many C++ compilers a temporary object is generated on the heap at runtime. Here, every operator knows the address of its result. The precompiler manages the allocation (and deletion!) of temporary variables at compile time and can place them on the stack. There is no need for a garbage collector!
This method makes the treatment of simple types and other types uniform:
In simple expressions (e.g. A = B * C), there is no need for a temporary variable. The operator can write the result directly into the left-hand side.
This saves one temporary variable per expression, which amounts to 100 % in simple expressions. The programmer has to handle the case where the result is identical to one operand (A = B * A).
Copy assignments (A = B) are executed by the copy cast operator (if it exists, otherwise bitwise).
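How a non-simple result can be passed as a hidden reference parameter, and why the case A = B * A needs care, can be sketched in plain C. The mat2 type and the function name mat2_mul are made up for this illustration; the code the precompiler actually emits may look different.

```c
#include <assert.h>

typedef struct { double a[2][2]; } mat2;

/* The result is written through 'res', the analogue of the hidden
   reference parameter. For A = B * C the caller passes &A directly,
   so no temporary is needed. */
void mat2_mul(mat2 *res, const mat2 *l, const mat2 *r)
{
    assert(res != 0 && l != 0 && r != 0);
    mat2 tmp;
    /* If the result aliases an operand (A = B * A), writing into it
       directly would overwrite inputs that are still needed, so the
       product is built in a local variable first. */
    mat2 *out = (res == l || res == r) ? &tmp : res;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            out->a[i][j] = l->a[i][0] * r->a[0][j] + l->a[i][1] * r->a[1][j];
    if (out != res)
        *res = tmp;
}
```

Only the aliased case pays for a local copy; the common case A = B * C writes straight into the left-hand side.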
In the current version there are only two kinds of parenthesis (bracket) operators: () and [].
For the compiler the open paren is a prefix operator, which occurs in conjunction with the closing one.
You should not overload parentheses.
(Declaration)
result-type prefix operator ()( op-type );
result-type prefix operator []( op-type );
The open paren is an infix operator, which occurs in conjunction with the closing one. The left operand typically is an array or vector, the right operand often an (integer) index.
The right operand can also be an aggregate. This is useful for matrices because you can use two or more indices.
(Declaration)
result-type infix operator ()( lop-type ; rop-type );
result-type infix operator []( lop-type ; rop-type );
The call of a (classical) function is interpreted as an index operator (). The function name is a pointer to the concrete function.
Cast operators are used to convert one type into another (different) one. They can be called explicitly (e.g. p = (void *) &i) or the call is generated by the precompiler.
(Declaration)
type cast operator( op-type );
type cast operator( const type & );
where type and op-type may be any allowed types.
The lower example declares a copy cast operator. It is used for value parameters. It should copy its argument, so that the copy does not share any memory areas with the original.
This is necessary for all structures which contain pointers.
The copy cast operator is also used to execute a simple copy assignment (A = B), if no special assignment operator is defined.
If there is no copy cast operator defined for a type, a bitwise copy will be performed.
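Why a bitwise copy is not enough for structures that contain pointers can be shown in plain C. The sketch below uses a hypothetical vec type and a function vec_copy that plays the role of a copy cast operator; it is an illustration only and not the COX VECTOR implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { double *data; size_t dim; } vec;

/* Deep copy: the destination gets its own storage, so later changes to
   either object do not affect the other. Returns 0 on success and -1
   if the allocation fails. */
int vec_copy(vec *dst, const vec *src)
{
    assert(dst != NULL && src != NULL && dst != src);
    dst->dim = src->dim;
    dst->data = NULL;
    if (src->dim == 0)
        return 0;
    dst->data = malloc(src->dim * sizeof *dst->data);
    if (dst->data == NULL) {
        dst->dim = 0;
        return -1;
    }
    memcpy(dst->data, src->data, src->dim * sizeof *dst->data);
    return 0;
}
```

A plain structure assignment would copy only the pointer, so both objects would share the same elements; that is the situation the copy cast operator is meant to avoid.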
type constructor( void );
type constructor( argument-type );
void destructor( type & );
The argument of a constructor should contain the subtype information, such as the dimension of a vector.
The constructor is called before the variable is used for the first time.
At the end of its lifetime, a variable is deleted by the destructor.
The precedence of predefined operators can be changed. The precedence may not depend on the result or argument types, but only on the operator kind (prefix, postfix or infix) and its name (3).
If a precedence is not specified for an operator, the default value 5 is assumed and a warning message is printed.
Odd precedences imply an associativity from left to right, even precedences from right to left.
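The parity rule has a visible effect on chained expressions. The following plain C sketch is not taken from the COX sources; it merely evaluates a chain x0 - x1 - ... under both associativities to show the difference. The names assoc_of and chain_minus are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { LEFT_TO_RIGHT, RIGHT_TO_LEFT } assoc;

/* Rule from the text: odd precedence means left to right,
   even precedence means right to left. */
assoc assoc_of(int precedence)
{
    return (precedence % 2 != 0) ? LEFT_TO_RIGHT : RIGHT_TO_LEFT;
}

/* Evaluate x[0] - x[1] - ... - x[n-1] for a '-' operator that was
   declared with the given precedence. */
long chain_minus(const long *x, size_t n, int precedence)
{
    assert(x != NULL && n > 0);
    if (assoc_of(precedence) == LEFT_TO_RIGHT) {
        long acc = x[0];
        for (size_t i = 1; i < n; i++)
            acc = acc - x[i];          /* ((x0 - x1) - x2) ... */
        return acc;
    } else {
        long acc = x[n - 1];
        for (size_t i = n - 1; i > 0; i--)
            acc = x[i - 1] - acc;      /* x0 - (x1 - (x2 ...)) */
        return acc;
    }
}
```

For the operands 8, 3 and 2, an odd precedence gives (8 - 3) - 2 and an even precedence gives 8 - (3 - 2), so choosing the precedence number also fixes the grouping.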
You may not only overload argument or result types, but also the operator kind (general operator overloading).
For keeping the language readable, the argument and result types do not influence the operator kind.
Ambiguities
MATRIX prefix operator -(MATRIX);
MATRIX infix operator -(MATRIX; MATRIX);
MATRIX prefix operator Id(int);
IdTyp Id; /* nullfix operator Id */
The expression Id - R * A is ambiguous. The following meanings are possible:
(Id) - (R * A): - is an infix operator, Id is a nullfix operator
Id(-(R * A)): - is a prefix operator, Id is a prefix operator
In this example the precompiler defaults to the first meaning and prints a warning message. You may force either meaning by using parentheses.
An aggregate is a type which consists of components like a struct, but they need not be consecutive in memory.
This loose binding allows the combination of independent expressions into one aggregate, without copying.
Expression lists which are enclosed in parentheses are interpreted as an aggregate.
Aggregates may also appear on the left-hand side, so you can define a prefix operator which returns either the eigenvalues (w = eig A) or the eigenvalues and the eigenvectors of a matrix ((w,v) = eig A).
This feature completes the operator concept. Functions simply are prefix operators with one aggregate argument.
printf ( "s = %s", s );
Here printf is the prefix operator and ( "s = %s", s ) is the argument, an aggregate.
The only restriction is the ambiguity between aggregates and expression lists.
The traditional meaning can be forced by double parentheses or an explicit type cast. The expression
printf(("s = %s", s ));
is allowed in both languages (C and COX), but only results in the output of s (as format).
The possibility of overloading result types makes it easy to overload the result type of literals also.
The precompiler interprets floating-point literals either as float or double or as char * (string).
This allows you to access the exact value of a constant (e.g. 0.1) at runtime.
The following code fragment shows a possible application of this kind of overloading.
The second infix operator has to convert its right operand from a string to the desired data type (e.g. interval) on its own.
interval infix operator *(interval; interval);
interval infix operator *(interval; char * );
interval cast operator (double );
main()
{
  interval ia, ib, ic;
  double d;
  ia = ib * ic;   /* interval multiplication */
  ia = ib * d;    /* uses a cast operator from double to interval */
  ia = ib * 0.1;  /* multiplication (interval, char *) */
}
In C the only way to write vector-like literals is to write them as strings. The only way to pass a varying number of parameters to a function is an open parameter list (ellipsis), but there is no way for the function itself to determine the number (or types) of its arguments automatically.
Typed open parameter lists are like open parameter lists, but:
(Example)
VECTOR prefix operator VEC(int Dim, double ... );
v = VEC( 1.0, 2.0, 3.0 );
The function VEC is called with 3 for the formal parameter Dim, followed by the three double values.
The values can be accessed using the well-known va_start, va_arg and va_end macros, but here the number of actual parameters and their type are known.
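In plain C the count has to be passed by hand. The sketch below shows what the receiving side of a VEC-like function might look like, assuming that the number of actual parameters arrives in the first formal parameter as described above. The small_vector type, the VEC_MAX limit and the name vec_of are simplifications made up for this example and are not the COX VECTOR type.

```c
#include <assert.h>
#include <stdarg.h>

#define VEC_MAX 8

typedef struct { int dim; double elem[VEC_MAX]; } small_vector;

/* In COX the precompiler would supply 'dim'; in plain C the caller
   must pass it explicitly and must pass exactly 'dim' doubles. */
small_vector vec_of(int dim, ...)
{
    assert(dim >= 0 && dim <= VEC_MAX);
    small_vector v = { 0 };
    v.dim = dim;

    va_list ap;
    va_start(ap, dim);
    for (int i = 0; i < dim; i++)
        v.elem[i] = va_arg(ap, double);   /* every argument is a double */
    va_end(ap);
    return v;
}
```

A call such as vec_of(3, 1.0, 2.0, 3.0) corresponds to what the precompiler would generate for VEC( 1.0, 2.0, 3.0 ); note that in plain C an integer literal in the list would not be converted to double automatically.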
In C the break and continue statements only affect the current loop. Sometimes it is necessary to leave not only the current loop but also some of the enclosing loops. In COX this can be achieved with named loops.
(Example)
int fct(void)
{
  int i, j, k;

  loop_i: for (i=0; i<I; i++) {
    for (j=0; j<J; j++)
      for (k=0; k<K; k++)
        if ( !doit(i,j,k) ) break loop_i;
  }
}
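For comparison, plain C has no named loops; the usual way to leave several nested loops at once is a goto to a label behind the outermost loop. The following sketch shows that idiom. The function find_first and its parameters are invented for this example.

```c
#include <assert.h>

/* Search a rows x cols grid (row-major) for 'wanted'.
   Returns 1 and stores the position if found, 0 otherwise. */
int find_first(const int *grid, int rows, int cols, int wanted,
               int *row_out, int *col_out)
{
    assert(grid != 0 && row_out != 0 && col_out != 0);
    int found = 0;
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            if (grid[i * cols + j] == wanted) {
                *row_out = i;
                *col_out = j;
                found = 1;
                goto done;   /* plain C stand-in for "break loop_i;" */
            }
        }
    }
done:
    return found;
}
```

With a named loop the label sits in front of the loop that is left, which keeps the exit point next to the loop header instead of behind the loop body.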
Many algorithms are, if designed carefully, independent from the type of their arguments.
In compiled languages you can implement such algorithms using templates.
The compiler generates special instances for the type combinations needed for an operator.
(Declaration)
template type T attribute scalar simple , type T2 ;
T prefix operator $(const T2 & t);
where the template is only valid for (result) types with attribute scalar and/or simple. So you can use different templates for other (type) attributes.
If there does not exist an operator direct matching the operand types, the operand will be converted as follows:
int
to char
If a pure reference is needed (non-const reference parameter or the left-hand side of an assignment), the following result types are preferred: temporary T &, then T &, otherwise T.
For a non-lvalue type: T, then T &, otherwise temporary T &.
int fct(a,b) char a; double b; { ... }
a[]
) with constructor-initialized components. This
will be fixed.
#pragma directives are ignored, but nevertheless written to the output file. This means that you have to be very careful when using pragmas in the interpreter, especially if they affect the alignment or something similar.
extern "C" declared ones) will expect the struct result as a hidden (first) reference parameter. So be careful when calling them through a function pointer.
New (fixed) releases of the COX system are available at:
http://www.ti3.tu-harburg.de
You may have trouble using the const concept in the C++ programming language.
If you have a class VECTOR which implements an index operator [], the following two cases cause problems:
Example 1
friend double & operator [](VECTOR & v, int i) { ... }
void print(const VECTOR & v )
{
  int i;
  for (i=0; i<Dimension(v); i++)
    cout << v[i] << " "; // problem: [] needs v as a reference
    // use: cout << ((VECTOR &) v)[i] << " ";
}
Example 2
double VECTOR::operator [](int i) const { ... }
int main(void)
{
  VECTOR v;
  v[3] = 4;
}
COX supports overloading of result types, so you may have both [] operators. The system will call different operators for read and write access.
Since references are automatically dereferenced pointers, they also need only that much memory. They are not automatically dereferenced, if they occur as a parameter or as an argument of 'return'.
You may also write to non constant references (Lvalue). They may appear everywhere a variable may appear.
So it is possible to assign a value to a function result.
The existence of sub-typing implies the existence of partial references, e.g. if VECTOR is a subvector of another VECTOR, it references some of its elements, but has a different dimension. The (pre-) compiler uses them as normal variables.
const &) They are useful for avoiding a copy cast operator call for arguments.
Some operators (e.g. A += B) usually return a reference. In this case the argument (A) should be returned.
Some operators have to create new objects (e.g. x[3..7] = y, where x[3..7] is a subvector of x).
In C++ the new object must be created manually on the heap using the new operator (there are a few cases in which you can return a constructor call, which then performs the actual operation).
All objects on the heap have to be purged manually.
COX supports a temporary & result type.
This result type causes the (pre-) compiler to create a new object on the stack (this is an ordinary automatic variable) which will be purged automatically when leaving the scope. You may define special operators to manipulate such temporary objects.
The language extension distinguishes constant, normal and temporary references. So you can implement different operators for read and write access.
A typical example of subtyping is matrices. The general type matrix is dynamically (at runtime) divided into matrices with different dimensions. This subtype information is not known to the (pre-) compiler.
Some languages have problems with subtyping: they mix up constructors and cast operators.
In C++ the constructor arguments sometimes contain the new value. On the other hand, C++ uses constructors for type conversion.
This language extension is upward compatible with C, so plain C code is accepted.
The problem occurs when linking with C modules.
As in C++, identifiers must be supplemented with their type to make them unique at link level. A C compiler only adds a leading underscore (_).
So if you want to call a function from a C library, you must tell the (pre-) compiler not to use this name mangling by declaring the prototypes as follows:
extern "C" { /* Prototypes */ }
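A header that also has to serve a plain C compiler can guard the extern "C" block with the usual __cplusplus test. The sketch assumes that COX, which predefines __cplusplus according to the list of predefined preprocessor symbols, takes the guarded branch; this has not been verified against the precompiler. The function dot3 is a made-up example.

```c
#include <assert.h>

/* Guarded prototype: a compiler that defines __cplusplus sees the
   extern "C" wrapper, a plain C compiler sees only the prototype. */
#ifdef __cplusplus
extern "C" {
#endif

double dot3(const double *a, const double *b);

#ifdef __cplusplus
}
#endif

/* Implementation as it would live in the C library module. */
double dot3(const double *a, const double *b)
{
    assert(a != 0 && b != 0);
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```

Keeping the guard in the header means the same file can be included from the C module that implements the function and from the code that calls it.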
For writing libraries which can be used with both COX and C++, you must restrict yourself to portable language elements.
Non-portable language extensions are:
You can use the file coxcomp.h in the /cox/packages/Profil directory. This header file defines some symbols which are very useful for writing portable libraries; Profil itself is one.
For debugging at COX source level, tell the (pre-) compiler to write out the line numbers as line directives.
In the research phase of scientific programs you need a system with short response times for testing new (algorithmic) ideas, so you would choose an interactive system.
Properties of (typical) interactive languages:
In the implementation phase you need a short execution time. A compiled language is adequate for that.
Properties of (typical) compiled languages:
The properties are almost complementary. This forces the programmer to restrict himself to a small language subset if he has to implement an algorithm for both systems.
To avoid the use of two different languages and environments, one for the research phase and one for the implementation phase, we developed the language extension COX. This language satisfies both needs: you can interact with the interpreter and you can also create fast compiled programs from the same source.
The interpreter converts floating-point literals to the internal representation at run-time, so the result may differ according to the rounding mode. To avoid this you should set a function to be called just before doing the conversion (SetPreConversionHook). Look at startup.h for the default value (BiasRoundNear).
void PrintDLLNames(void);
PrintDLLNames();
void UnloadDLL(const char *);
UnloadDLL("profil.dll");
void UnloadDLLs(void);
UnloadDLLs();
void ReleaseAllCallHooks(void);
ReleaseAllCallHooks();
@fct), a call hook is allocated to interpret this function every time the address is called.
Since the system cannot exactly determine the last use of such a call hook, call hooks cannot be released automatically. So if you get an error saying that the system cannot allocate another call hook, you must first release previously used ones.
void SetPreConversionHook(void (*f)(void));
SetPreConversionHook( BiasRoundNear );
setjmp
and longjmp
.
va_start
and va_next
. This will be fixed.
The following preprocessor symbols are predefined:
__cox
__cplusplus
__cox_interactive, interactive mode only
The COX system first processes the environment variables for the location of the configuration file and the include paths, then the configuration file, then the other environment variables and finally the commandline options.
COXHOME/etc/cox.cfg
)
COXHOME/packages/startup.h
) (overrides the path given in the
configuration file).
COXHOME/bin/coxcpp) (overrides the path given in the configuration file). The C preprocessor used for the interpreter has to write out every preprocessed line immediately, so the normal GNU cccp can't be used (at least for the interpreter!).
The resulting include path consists of the C_INCLUDE_PATH and the CPLUS_INCLUDE_PATH variables preceded by the include path given in the configuration file.
size
properties
char
means signed char
.
const char *
const
), false if they have the type char * const
.
T fct();
, may have
parameters, false if not.
char
).
Otherwise the minimum of component-size and alignment-size will be taken.
directories
C_INCLUDE_PATH and CPLUS_INCLUDE_PATH environment variables preceded by the value of this keyword.
COX_CPP environment variable). The C preprocessor used for the interpreter has to write out every preprocessed line immediately, so the normal GNU cccp can't be used (at least for the interpreter!).
COX_STARTUP environment variable).
cpp
cpp
.
cpp
used for interactive use only.
startup.h
file.
warnings
print a warning message for
options
-o
and the name of the object file is used to compile
this source file into an appropriate object format.
additional types
Some compilers have additional predefined types. These types are not defined in any header file, but are used in some. The additional types depend heavily on the compiler and the operating system. This section gives you the possibility to tell the COX system which additional types exist.
[additional types]
__builtin_type = unsigned int
builtin functions
Some compilers have builtin functions which are not declared in any header file. In most cases these are functions to handle open parameter lists. This section gives you the possibility to tell the COX system which builtin functions your compiler uses.
[builtin functions]
__builtin_next_arg = void *
This shows the necessary entry for using GCC on LINUX.
The COX system first processes the environment variables, then the configuration file and finally the commandline options.
use '+' instead of '-' to invert meaning
-?, -h  this help
-o outfilename
-I includepath
-D define
-i  (interactive use)
-d  (generate .def file for OS/2 DLL)
-t  (generate tag-file)
-r  (redirect stderr to stdout)
-q  (quiet)
-s  (no signal handler)
-p  (open cpp as a pipe)
-v  (verbose mode)
-a  (generate ANSI instead of K&R C)
-l  (write source linenumber as line-directive instead of comment)
-Ws (turn warning s on)
-ws (turn warning s off)
s = {all, hidden-symbol, precedence, undeclared-ident, undeclared-ident-type, int-pointer, pointer-int, loose-digits, overload-variable, cannot-load-symbol, cannot-load-dll}
-- to be written ---
The COX language allows constructors even for static objects (like C++). Such constructors must be called before the first program statement.
The utility geninit collects the module initialization functions of all modules and generates a new module which only contains arrays for:
_COXGlobalInit
).
_COXGlobalClose
).
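The effect of such a generated module can be pictured as two tables of function pointers that are walked at program start and at program exit. The sketch below is a plain C illustration with invented names (init_table, close_table, run_table and the two sample modules). The actual layout of _COXGlobalInit and _COXGlobalClose is not described here, and running the close functions in reverse order is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*module_fn)(void);

/* Small trace buffer so the call order can be inspected afterwards. */
static int trace[8];
static int trace_len = 0;

static void record(int id)
{
    assert(trace_len < 8);
    trace[trace_len++] = id;
}

/* Stand-ins for the initialization and cleanup functions of two modules. */
static void mod_a_init(void)  { record(1); }
static void mod_b_init(void)  { record(2); }
static void mod_a_close(void) { record(-1); }
static void mod_b_close(void) { record(-2); }

/* NULL-terminated tables, as a generator might emit them. */
static const module_fn init_table[]  = { mod_a_init, mod_b_init, NULL };
static const module_fn close_table[] = { mod_b_close, mod_a_close, NULL };

/* Call every function in the table in order. */
void run_table(const module_fn *table)
{
    for (size_t i = 0; table[i] != NULL; i++)
        table[i]();
}
```

A startup routine would call run_table(init_table) before the first program statement and run_table(close_table) after the last one.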
For dynamic link libraries there exists a function which is called by the operating system (_DLL_InitTerm for OS/2). Under EMX, this function calls __ctordtorInit() and __ctordtorTerm(), which can be generated by geninit using the -d option.
You should use dmake if possible.
The following changes to startup.emx should be made(4):
# Global Constructor and Destructor array created by "init"
COXGLOBALS = cox_init$O
COXPRE := c:\cox\cox$E
COXFLAGS :=
COXGENINIT := c:\cox\bin\geninit$E

# Rule for making C Files from COX Files using the precompiler
%.c : %.cox ; $(COXPRE) $(COXFLAGS:s,\,/) $(<:s,\,/) -o $(s,\,/)

# Executables
.IF $(USE_COX)
%$E : ; $(COXGENINIT) $(&:s,\,/) -o$(COXGLOBALS) -l$(LD) \
    $(LDFLAGS) $(null,$(LDDIRS) $(null) -L$(LDDIRS:s,\,/,:t" -L")) \
    $(null,$(LDLIBS) $(null) -l$(LDLIBS:b:s,,,:t" -l")) -o $(s,\,/)
.ELSE
%$E : ; $(LD) $(LDFLAGS) $(&:s,\,/) -L$(LDDIRS:s,\,/,:t" -L") \
    -l$(LDLIBS:b:s,,,:t" -l") -o $(s,\,/)
.ENDIF
Syntax of the geninit call:
geninit filename {filename} [-oOutputname] [-d] [-lLinker options]
The first linker option must be the name of the linker itself.
-L instead of -l forces the ar parameter order.
The -d option enables the creation of __ctordtorInit and __ctordtorTerm, which are called automatically by the OS/2 operating system when loading a DLL.
Example for use with OS/2 and gcc for linking DLLs.
geninit RealOp.obj Error.obj Constants.obj Functions.obj Interval.obj -d -ocox_init.c -lgcc -L/emx/lib/mt -Zomf -Zdll -Zmtd -Zcrtdll -o Profil.dll Bias0.obj /emx/lib/mt/sys.lib Profil.def -s -lwrap
Example for use with LINUX and ar for linking archives.
geninit RealOp.o Error.o Constants.o Functions.o Interval.o -ocox_init.c -L ar Profil.a Bias0.o