Introduction

Gambol is a full-spectrum, progressively typed, general-purpose application programming language. It aims to bring together the fluidity and extensibility of scripting languages with the performance and safety of systems languages, without being either. Gambol builds on the state of the art and pushes it further, taking a novel, ground-up approach to reconciling those two worlds. As a result, it is the world's first general-purpose progressively typed programming language to be both AoT (Ahead-of-Time) compiled and to provide full AST (Abstract Syntax Tree) reflection at runtime.

Overview

Gambol offers a new approach to solving the two language problem. Using a self-caching compiler that lazily loads the live AST at runtime, Gambol can produce executables that are highly optimized, that do not suffer from the delays and inefficiencies of JIT compiled languages, and that at the same time offer the full dynamic experience of scripting languages. In addition, Gambol offers improvements in many areas, from syntax to the type system, language features and the standard library. To get a peek at what Gambol code looks like, check out the code snippets below.

Syntax

Gambol draws inspiration from a variety of contemporary languages and strives to provide a syntax that is familiar, but with improvements that conform to the principle of progressive disclosure of capability. Let's get started with some naming conventions. Type names start with an upper case letter and variable names with a lower case letter. Code can be written on one line or across multiple lines, with optional semicolons to separate instructions:

Hello World

w = `world`
print(`hello ` w ` from Gambol`)  #! prints hello world from Gambol

The above Hello World program (and that's the entire program) also demonstrates basic string interpolation and automatic type assignment. The following are a few more example programs to illustrate the look and feel of Gambol's syntax.

The second example might be considered too complicated this early in an introduction for some languages, and is not even supported in others, but hopefully Gambol's syntax makes it easy on the eyes: a user defined type with one property of type Int64 that overloads the += operator and the str method.

Operator Overloading

type MyType {

    += (s Self, o Self) -> Self { s.prop += o.prop }

    fun str(s Self) -> String { `value of prop is ` s.prop }

    prop Int64
}

x = MyType() {prop: 1}
y auto = MyType() {prop: 2}

x += y

print(x) #! prints value of prop is 3

You probably also noticed the explicit variable declaration for y with the keyword auto. It is always good practice to declare variables explicitly, but sometimes you may not have the time to invest. Here is another application demonstrating progressive disclosure of capability and type inference:

Progressive Disclosure

fun add_no_type(a, b) { a + b }

fun add_with_type(a Int64, b Int64) -> Int64 { a + b }

#Gambol.function.alwaysinline
fun add_parametric[T](a T, b Int16) -> T { return a + b }


print(add_no_type(1, 2))   #! prints 3
print(add_with_type(1 Int32, 2))   #! prints 3
print(add_parametric(1, 2))   #! prints 3

And yes you can declare types for literals too including your own types!

Here is a basic list comprehension:

Basic List Comprehensions

l = [3 * x for x in ..20 keeping x % 2 == 0]
print(l)    
#! prints [0, 6, 12, 18, 24, 30, 36, 42, 48, 54]
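For readers coming from Python, the comprehension above can be read roughly as follows. This is only an analogy: it assumes that `..20` denotes the half-open range from 0 up to (but not including) 20 and that `keeping` acts as a filter, which matches the printed output.

```python
# Assumption: Gambol's `..20` behaves like range(20) and `keeping` like `if`.
l = [3 * x for x in range(20) if x % 2 == 0]
print(l)  # prints [0, 6, 12, 18, 24, 30, 36, 42, 48, 54]
```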

There is a lot more to cover regarding the syntax, semantics, type system and more. If you'd like to learn more, follow the Getting Started Guide.

Type System

One of Gambol's primary goals is to expose every operation available in the target CPU. This includes vector operations exposed through SIMD and its derivatives, FMA (fused multiply-add) operations and other things normally not found in scripting languages or even in some systems languages.

This commitment to performance also means giving the programmer the ability to create types with value semantics as well as reference types, offering unsafe operations when necessary, and enabling the creation of zero cost abstractions and libraries.

To that end, Gambol offers a four-tiered gradual type system with automatic memory management that allows for both performance focused applications and dynamic operations wherever needed. When annotating a variable with a type, you can choose the level of specificity you like, indicating whether the variable is to conform strictly, structurally or nominally with that type. You can also create dynamic variables that can take on values of any type, as in a scripting language.
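To make the difference between these levels of conformance concrete, here is a rough sketch in Python rather than Gambol, since Gambol's annotation syntax for each tier is not shown in this introduction. The class names and helper functions are hypothetical, and the reading of "strict" as "exactly this type, no subtypes" is an assumption made for illustration only.

```python
from typing import Any, Protocol, runtime_checkable


class Meters:
    """A concrete type with a single `value` attribute."""

    def __init__(self, value: float) -> None:
        self.value = value


class PreciseMeters(Meters):
    """A subtype of Meters."""


class Feet:
    """Same shape as Meters, but an unrelated type."""

    def __init__(self, value: float) -> None:
        self.value = value


@runtime_checkable
class HasValue(Protocol):
    """Structural interface: anything with a `value` attribute conforms."""

    value: float


def conforms_strictly(obj: Any, cls: type) -> bool:
    # Assumed meaning of "strict": exactly this type, subtypes excluded.
    return type(obj) is cls


def conforms_nominally(obj: Any, cls: type) -> bool:
    # Nominal: the named type or any declared subtype.
    return isinstance(obj, cls)


def conforms_structurally(obj: Any) -> bool:
    # Structural: only the shape of the object matters, not its name.
    return isinstance(obj, HasValue)


print(conforms_strictly(PreciseMeters(1.0), Meters))   # False
print(conforms_nominally(PreciseMeters(1.0), Meters))  # True
print(conforms_nominally(Feet(1.0), Meters))           # False
print(conforms_structurally(Feet(1.0)))                # True

# Dynamic: a variable that may hold a value of any type.
anything: Any = Meters(2.0)
anything = "now a string"
```

The fourth tier corresponds to the last two lines: no conformance check is made at all, and the variable simply takes whatever value it is given.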

Some of the additional tools Gambol offers to reuse code, write safe code, draw maximal performance and reduce cognitive load include:

Automatic type assignment
Parametric polymorphism with automatic type inference
Compile time evaluation of expressions, branches and loops, enabling efficient use of hardware resources and conditional compilation
Hygienic macros
Strict or structural function interfaces
Support for object oriented and functional programming
Support for horizontal as well as vertical extension of types
The RAII idiom
Enum variants
Arbitrary precision arithmetic
Support for async/await with a generalized coroutine library
A threading library without limitations such as Python's GIL
Safe closures that capture by value and by reference
Zero cost exceptions with the full exception stack trace accessible at runtime
Function decorators and attributes
Low level features such as pointers, calling conventions, pass by reference etc.

The Mirror Library

The mirror library provides facilities to parse, compile and run new code at runtime. It also provides facilities to inspect every element of the live AST if necessary. The AST is live in the sense that variables in the program, e.g. fat references to objects or functions, point to the same AST. As mentioned earlier, AST nodes are loaded as needed to minimize memory usage.

The following is an example of using the mirror library to inspect a function.

Reflection

import mirror

fun my_fun(a Int32, b String) -> Int64 { a += b.len(); 123 * a }

f = mirror.Function(my_fun)

print(`function's symbol: ` f.symbol)   
print(`number of positional parameters: ` f.parameters.len())  
print(`name of the second parameter: ` f.parameters[1].symbol)   
print(`function's return type: ` f.return_type)   
print(`first instruction in the function: ` f.body.instruction(0))
print(`\nentire function: \n\n` f)

Output

function's symbol: my_fun
number of positional parameters: 2
name of the second parameter: b
function's return type: Int64
first instruction in the function: a += b.len()

entire function:

fun my_fun(a Int32, b String) -> Int64 {

  a += b.len()

  123 * a

};
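For comparison, the same questions can be asked of a function in Python using the standard `ast` module. This is a loose analogy and not the mirror API: Python parses the source text into a fresh tree here, whereas mirror is described as exposing the live AST of the compiled program. The function body is a hypothetical Python translation of my_fun.

```python
import ast

# Hypothetical Python translation of my_fun, kept as source text.
SOURCE = '''
def my_fun(a: int, b: str) -> int:
    a += len(b)
    return 123 * a
'''

tree = ast.parse(SOURCE)
func = tree.body[0]  # the ast.FunctionDef node for my_fun

symbol = func.name
param_names = [arg.arg for arg in func.args.args]
return_type = ast.unparse(func.returns)
first_statement = ast.unparse(func.body[0])

print("function's symbol:", symbol)
print("number of positional parameters:", len(param_names))
print("name of the second parameter:", param_names[1])
print("function's return type:", return_type)
print("first instruction in the function:", first_statement)
```

Note that `ast.unparse` requires Python 3.9 or later.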

But you can do even more with the mirror library. Here is an example of something that is talked about a lot, mostly in the movies (but also in scripting languages): self modifying code. Nonetheless, it has its use cases in the real world as well.

Self Modifying Code

from mirror import Program, Module

a = 12
print(a)  #!  prints 12

prg_string = `a = 1332`

module, _ = Module.parse(prg_string)
Program.analyze(module)
Program.generate_code(module)
Program.run_module(module)

print(a)     #!   prints 1332

In the above example new code is created from a string but there are more ways to create new code. You could for example use the API of the mirror library to create new functions programmatically.
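For readers coming from Python, the stages in the Self Modifying Code example loosely resemble parsing a string into an AST, compiling it and executing it against the program's own variables. The sketch below is a Python analogy only, not the mirror API. Mapping `ast.parse`, `compile` and `exec` onto `Module.parse`, `Program.generate_code` and `Program.run_module` is an assumption made for illustration, and Python has no separate step corresponding to `Program.analyze`.

```python
import ast

# The program's variables, held in an explicit namespace for clarity.
namespace = {"a": 12}
print(namespace["a"])  # prints 12

prg_string = "a = 1332"

module = ast.parse(prg_string)                 # source text -> AST
code = compile(module, "<generated>", "exec")  # AST -> bytecode
exec(code, namespace)                          # run against the same variables

print(namespace["a"])  # prints 1332
```

The key difference is that the Python version produces interpreted bytecode, whereas the Gambol example is described as generating code through the same compiler that built the running program.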

One important application of full runtime AST reflection is the possibility of extending the language to new hardware domains like GPUs, TPUs or custom ASICs. You could inspect the code of a function and then generate a binary for a custom target. This has been one of Python's advantages over systems languages, making projects such as Taichi Lang or Numba possible.

Other applications of fully reflective code include automatic differentiation of functions, probabilistic programming and generating documentation from the source code. As a readily available example of the latter, the entire documentation for the standard library on this website is generated from the source code using a script that uses mirror to find the signature of every function and type and their corresponding documentation.
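The idea behind such a documentation script can be sketched in Python with the standard `ast` module. This is not the script used for this website and it does not use mirror; the sample source and the `document` helper are hypothetical and only illustrate walking an AST to collect signatures and their documentation.

```python
import ast

# Hypothetical source to be documented.
SOURCE = '''
def area(width: float, height: float) -> float:
    """Return the area of a rectangle."""
    return width * height

class Stack:
    """A last-in, first-out container."""

    def push(self, item) -> None:
        """Add an item to the top of the stack."""
'''


def document(source: str) -> list[str]:
    """Return one 'signature: docstring' line per function and class."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            signature = f"def {node.name}({ast.unparse(node.args)})"
            if node.returns is not None:
                signature += f" -> {ast.unparse(node.returns)}"
        elif isinstance(node, ast.ClassDef):
            signature = f"class {node.name}"
        else:
            continue  # skip every node that is not a function or class
        doc = ast.get_docstring(node) or "(undocumented)"
        lines.append(f"{signature}: {doc}")
    return lines


docs = document(SOURCE)
for line in docs:
    print(line)
```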

A last hypothetical use of the mirror library is the creation of continuously self modifying code that runs fast with applications in optimization or Artificial Intelligence.

Interoperability

Gambol exposes the GNU libtool with additional tools to help find and call foreign functions easily. The ability to override the member access operator and to define properties with string literals as names helps to create wrapper modules that expose an idiomatic API for calling into foreign libraries. For demonstration purposes only, a Python library is included to show the idiomatic syntax for calling into Python libraries such as matplotlib below:

Python Extension

from python import *
plt = import(`matplotlib.pyplot`) 

plt.plot([1, 2, 3, 4])
plt.ylabel(`some numbers`)
plt.show()

There is a Jupyter kernel included that further demonstrates interactions with Python (in addition to running Gambol in Jupyter of course!). The kernel uses the mirror library to compile and run each code snippet.

It is also possible to define #extern functions in Gambol to be called from other programs.

In addition, Gambol can generate debug symbols and integrate with debugging tools such as ASan (AddressSanitizer) and others, which can help shed light on issues in situations that require low level debugging, e.g. in the presence of unsafe code.

Prior Art

Gambol's inception was sparked primarily by frustration with the arsenal of languages that dominate software development today. While languages such as Python, Ruby and Lua may seem to offer a much quicker path from thoughts to code, their lack of type safety and of a capable type system often makes it harder to perform that translation correctly once a project grows, or has the potential to grow, larger or more complex than initially planned.

Another important shortcoming of scripting languages is their inability to take full advantage of the CPU. When this becomes a priority, the current practice is to make use of the extensibility of these languages and rely on systems languages such as C, C++ or Rust to do the heavy lifting, which creates what is known as the two language problem.

Systems languages are not known for their ease of use. They impose a much higher cognitive burden on the programmer even for simple tasks, as they require tracking additional pieces of information throughout the code and sometimes present many different options at each step of coding that bifurcate, trifurcate or break the programmer's chain of thought. This not only takes away the joy of coding but can also hinder productivity in situations where that burden is completely unnecessary. For example, the level of control that systems languages provide may not be necessary at all merely to take full advantage of the hardware. It may be necessary to meet certain requirements when bootstrapping an operating system, but not when creating a higher level application. Some systems languages like C++ also suffer from a bloated syntax that has evolved over many decades to cram in as many new features as possible, as newer competing languages discovered useful higher level constructs.

With all that said there have been attempts to combine the best of the two worlds before.

Just AoT Compile it Instead?

One approach is to create an AoT compiler for existing scripting languages. Examples of these efforts are projects that imitate the syntax of scripting languages like Python or Ruby, with a compiler that infers and adds static types during compilation in order to produce performant code. Recent languages like Codon and Crystal fall into this category. This approach, however, does not address the deeper issue of marrying the dynamic features of the language with the safety and performance of static typing. Often, as in the examples above, dynamic elements of the parent language are omitted in these derivatives, and with that omission goes the flexibility and extensibility that scripting languages provide, as those traits arise out of more than just having a nice syntax.

Language Extensions

A different category of languages claims to gradually implement enough features to one day become supersets of the languages they try to emulate. Superset here means that the ideal language encompasses the entire syntax and semantics of the parent language, with additional features added, usually to improve the performance of the code. Cython (first released in 2007) is an example of such a language; it attempts to compile Python code into C. It falls short, however, of its goal of covering the entire Python semantics (you can't, for example, inspect Cython code like you can in Python), and the resulting program is not as efficient as one that is hand optimized in C. It also makes debugging such programs extremely difficult, as there are multiple compilers and languages involved in the process.

A more recent systems language, Mojo, uses the same approach to compile Python-looking programs to MLIR and eventually LLVM. While Mojo's static code runs fast due to compiler optimizations and its compile time evaluation capabilities, it outsources all dynamic functionality to the Python interpreter (CPython) and therefore loses the flexibility and extensibility that Python offers at runtime. It is, for example, incapable of full AST inspection at runtime, and it is not a truly dynamic language but tries to emulate dynamism by marshalling data back and forth between the Python interpreter and Mojo proper wherever such assignments are necessary.

At a higher level, trying to fully emulate another language carries a lot of risk, as the semantics of that other language may conflict with the concepts of static typing entirely, either now or in the future. The other language is a project of its own, with its own decision making processes, and may not necessarily take into account compatibility with derivative programming languages. Emulating another language faithfully also means you inherit all the flaws of that language by definition and have no chance to improve upon them.

JIT Compilers

A third group of languages provides dynamism through JIT (Just-in-Time) compilation and therefore, at the very least, suffers from delays at runtime by definition. The JIT compiler in these languages also may not produce the most efficient code possible, as there are timing constraints imposed on the compiler. Numba is a library that JIT compiles Python functions when a decorator is added to the function. This comes with many limitations and does not work for every function. Numba, for example, is not aware of the entire codebase, just the function passed to it, and therefore cannot make the inferences and deductions that a proper compiler can.

Julia is a JIT compiled programming language that offers full dynamism and has great performance. It can be demonstrated, however, that in certain situations Julia does not produce the fastest code possible to take full advantage of the hardware. Julia also lacks a fully-fledged type system, limiting its usability primarily to the domain of numerical computing. It has an unfamiliar syntax that is more suitable for mathematical notation than software engineering. The JIT compiler in Julia also consumes an enormous amount of memory at runtime. There are third party packages that attempt to precompile Julia code and produce an executable to address some of the above issues; however, at this time they are experimental and do not support all Julia features. It is unclear whether Julia's semantics even allow for the entire language to be precompiled in the first place.