John Wilkes edited this page May 13, 2025 · 116 revisions

Other Language References

Some interesting languages to get ideas from:

Wikipedia also has a list of programming languages.

Interesting Articles/Videos

Tasks

Features

  • New struct definition syntax
  • Built-in functions
  • Cast ints to pointers
  • Optional
  • trusted
  • Don't require var (and const?) to be defined when declared
  • Enums
  • New function definition syntax
  • Function overloading
  • Default function arguments
  • Named function arguments
  • For loop pointer iterator?
  • Add "methods"

Cleanup

  • Allow global constants to be declared in any order
  • Require () for if/elif, while, and for condition expressions
  • Use = instead of : for initializing struct members
  • Make function parameters immutable
  • Assignment operators should not be binary expressions
  • Const x = z != 0 && y / z == 2;
  • Print global constants in CHeaderPrinter?
  • Use 'void' instead of 'UnitType' in LLVM IR?
  • Should integers implicitly cast to floats?

Compiler Options

Create verb commands for the compiler similar to git. Ideas:

  • build - Compile
  • complete - Auto-complete for shells
  • doc - Build documentation
  • help - Print help for commands
  • init - Initialize a new project
  • run - Build and run the executable (maybe make this a sub-option to build like in the Zig compiler)
  • test - Run tests
  • version - Print the version

build and/or run should have an option to compile and run a program passed directly on the command line:

wip run -c 'print("Hi!");'

Types

Structs

Struct Declaration

Change struct declaration syntax:

const MyStruct = struct
{
    name str,
    age u8,
};

Initialization

Alternative struct initialization using ( and ) instead of { and }. This would prevent ambiguous situations.

var s = MyStruct
(
    name = "Al",
    age = 32,
);

- OR -

Use : after the struct type expression. Would this prevent all ambiguous situations?

var s = MyStruct:
{
    name = "Al",
    age = 32,
};

Layout

Add an attribute to allow users to specify the layout of struct members:

const S = struct @Layout("Packed")
{
    a i32,
    b bool,
    c f64,
    d bool,
};

Options for struct layout:

  • SmallAligned (default) - Members are laid out to make the struct's memory footprint as small as possible while still being aligned
  • Manual - Members are laid out in the order specified in the struct definition and aligned
  • Packed - Members are laid out in the order specified in the struct definition but are not aligned

Questions:

  • For the "SmallAligned" option, should bools be combined into a bitfield? How will taking the address of a bool member be handled?
  • Should nested struct members be re-arranged inside the parent struct?
  • Should there be an option to pack bits into a bit field?
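
To make the trade-offs concrete, here is a rough Python sketch (not WIP code) that computes the size of the example struct under each option. It assumes each member's alignment equals its size (i32 = 4, bool = 1, f64 = 8), that the struct is padded to its largest member alignment, and that SmallAligned simply sorts members by descending alignment. None of this is decided, and the helper names are made up for illustration.

```python
# Hypothetical member sizes in bytes; alignment is assumed equal to size.
MEMBERS = [("a", 4), ("b", 1), ("c", 8), ("d", 1)]  # i32, bool, f64, bool


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def aligned_size(members):
    """Size when members are placed in the given order and aligned."""
    offset = 0
    for _, size in members:
        offset = align_up(offset, size) + size
    # Pad the end so arrays of the struct stay aligned.
    return align_up(offset, max(size for _, size in members))


def layout_size(members, layout):
    if layout == "Packed":
        return sum(size for _, size in members)
    if layout == "Manual":
        return aligned_size(members)
    if layout == "SmallAligned":
        # One possible heuristic: largest alignment first.
        return aligned_size(sorted(members, key=lambda m: m[1], reverse=True))
    raise ValueError(layout)


for name in ("SmallAligned", "Manual", "Packed"):
    print(name, layout_size(MEMBERS, name))
```

Under these assumptions the sketch gives 16 bytes for SmallAligned, 24 for Manual, and 14 for Packed.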

Using

Use using to directly access nested struct members:

const A = struct
{
    x i32,
};

const B = struct
{
    a using A,
    y i32,
};

var b1 = B { x: 1, y: 2 };
var b2 = B { a: A { x: 1 }, y: 2 };

b1.x += 2;
b1.a.x += 2;

Views

Add a way to access the same memory in different ways. All members in a view must have the same size.

Syntax idea:

const Address = struct { a u8, b u8, c u8, d u8, };

const IPv4 = struct
{
    view
    {
        n u32,
        a []u8,
        s Address,
    }
};
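
For comparison, the same idea can be sketched in Python with a ctypes union (this is Python, not WIP). The array member is assumed to be exactly four bytes long so that all three members have the same size, and the byte order seen through the array depends on the platform.

```python
import ctypes
import sys


class Address(ctypes.Structure):
    _fields_ = [("a", ctypes.c_uint8), ("b", ctypes.c_uint8),
                ("c", ctypes.c_uint8), ("d", ctypes.c_uint8)]


class IPv4(ctypes.Union):
    # All three members are 4 bytes and share the same storage.
    _fields_ = [("n", ctypes.c_uint32),
                ("a", ctypes.c_uint8 * 4),
                ("s", Address)]


ip = IPv4()
ip.n = 0x7F000001
print(sys.byteorder, list(ip.a))  # byte order depends on the platform
print(ip.s.a, ip.s.d)             # same bytes, read through the struct
```

Writing through one member is visible through the others, which is why the members would need to have the same size.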

Self

Add the Self keyword which will be an alias for a struct's type in its definition.

const ListNode = struct
{
    prev &Self,
    next &Self,
};

Unions

Use an enum to designate which member is active, like Zig.

const U = union
{
    i i32,
    f f64,
};

Optional

Create an optional type from any type by prepending ?.

var int1 ?i32 = 123;
var int2 ?i32 = none;
var string ?str = "";

What should be the syntax for none? Options:

  • none
  • none(i32)
  • none'i32'
  • ?i32.none

Optional type properties:

  • HasValue - Boolean for whether or not a value exists
  • Value - Get the value if one exists; otherwise, aborts

The ?. operator returns the member value as an optional if the left-hand operand is not none; otherwise, it returns none.

fun getName(person ?&Person) ?str
{
    person?.Name
}

The ?? operator returns the left-hand operand if it is not none; otherwise, it returns the right-hand operand.

fun numOrZero(num ?i32) i32
{
    num ?? 0
}

# the function above is equivalent to the following

fun numOrZero2(num ?i32) i32
{
    if num.HasValue
    {
        num.Value
    }
    else
    {
        0
    }
}
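
The intended behavior of ?. and ?? can be approximated in Python, with None standing in for none. This is only a sketch of the semantics described above; the Person class and function names are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str


def get_name(person: Optional[Person]) -> Optional[str]:
    # Mirrors person?.Name: none propagates instead of aborting.
    return None if person is None else person.name


def num_or_zero(num: Optional[int]) -> int:
    # Mirrors num ?? 0. Testing "is None" matters: 0 is a valid value.
    return 0 if num is None else num


print(get_name(Person("Al")), get_name(None))
print(num_or_zero(5), num_or_zero(None))
```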

Methods

A struct method can be any function whose first parameter is a pointer to (or value of?) that struct type. A method can be called on either a pointer or a value.

const S = struct
{
    a i32,
    b i32,
};

fun add(self &S, c i32) &S
{
    self.a += c;
    self.b += c;
    self
}

fun test()
{
    var s = S { a: 1, b: 2 };
    s.add(2);
}

Tuples

# initializing a tuple
var t1 (i32, str, bool) = (19, "Joe", false);

# initializing a tuple with some named fields
var t2 (age i32, name str, bool) = (22, "Jack", true);

# fields can be accessed by index or name
var age1 i32 = t1.0;
var age2 i32 = t2.age;

# unpacking a tuple
var (i, s, b) = t1;

Anonymous Types

Do we need anonymous types? It seems like tuples have all the functionality of anonymous types and more.

var a = { x = 12, y = 13 };

Enums

Enum values start at 0 and increment by one unless explicitly set with the = (or :?) operator.

const Animal = enum
{
    Chicken, # = 0
    Cow = 5,
    Dog, # = 6
    Platypus = 12,
    Zebra = 7,
};
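
The numbering rule can be sketched in a few lines of Python (an illustration of the rule above, not compiler code; the function name is made up). A member without an explicit value takes the previous member's value plus one.

```python
def assign_enum_values(members):
    """members: list of (name, explicit_value_or_None) in declaration order."""
    values = {}
    next_value = 0
    for name, explicit in members:
        value = next_value if explicit is None else explicit
        values[name] = value
        next_value = value + 1
    return values


animal = assign_enum_values([
    ("Chicken", None), ("Cow", 5), ("Dog", None),
    ("Platypus", 12), ("Zebra", 7),
])
print(animal)
```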

Arrays

Should arrays have sizes declared at compile time? If so, I should rename what are currently called "arrays" to "slices".

Arrays declared on the stack should not be returned from a function. Arrays will be passed by reference. I need to come up with a syntax to denote passing mutable vs. immutable references. Ideas:

fun f1(a []u64) { } # immutable
fun f2(a mut []u64) { } # mutable?

What should be the syntax for multi-dimensional arrays? Ideas:

var a1 [,]i32 = ...; # 2D array
var a2 [,,]i32 = ...; # 3D array

Strings

Binary Strings

Add binary (ASCII) strings that can be specified with a b prefix: b"abc".

Raw Strings

Add raw strings that can be specified with an r prefix and have no escape sequences: r"C:\path\to\file.txt". Should raw strings also support multi-line strings?

OR

A raw string starts with an r followed by any number of double quotes ("). It ends with an equal number of double quotes and another r. It has no escape sequences. It supports multi-line strings.

r"He said, "How are you?""r
r"C:\path\to\file.txt"r

OR

A raw string starts with one or more r followed by a double quote ("). It ends with a double quote and an equal number of r. It has no escape sequences. It supports multi-line strings.

rrr"He said, "How are you?""rrr
r"C:\path\to\file.txt"r
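
A minimal Python sketch of how a lexer might read this variant: count the leading r characters, then scan for a double quote followed by the same number of r characters. The function name and error messages are hypothetical.

```python
def parse_raw_string(source):
    """Return the contents of a raw string literal such as rr"..."rr."""
    prefix_len = len(source) - len(source.lstrip("r"))
    if prefix_len == 0 or source[prefix_len:prefix_len + 1] != '"':
        raise ValueError("expected one or more r followed by a double quote")
    start = prefix_len + 1
    terminator = '"' + "r" * prefix_len
    end = source.find(terminator, start)
    if end == -1:
        raise ValueError("unterminated raw string")
    return source[start:end]


print(parse_raw_string('rrr"He said, "How are you?""rrr'))
print(parse_raw_string(r'r"C:\path\to\file.txt"r'))
```

Because the first matching terminator ends the literal, contents that contain a double quote followed by r would need a longer r prefix.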

OR

A raw string starts with one or more backticks. It ends with the same number of backticks.

var s1 = `raw string`;
var s2 =
```
raw string
```

Null escape sequence

Add \0 escape sequence for null char?

var s = "abc\0";

Control Flow

For Loop

Loop over an array by pointer:

for x & in array
{
    # x is a pointer to each element in array
}

break and continue with labels:

outer:
for x in array1
{
    inner:
    for y in array2
    {
        if x == y
        {
            break outer;
        }
    }
}

While Loop Default

In Rust, a break inside a loop expression can return a value. I could do something similar with a while loop if a default value could be specified for when the loop exits because its condition becomes false.

while i < 10
default 100
{
    # ...
    break 10;
    # ...
    i += 1;
}
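
Python's while/else is a close analogue: the else branch runs only when the loop ends without a break. The sketch below is Python, not WIP, and the function is a made-up example.

```python
def first_multiple_of_seven(start, limit):
    i = start
    while i < limit:
        if i % 7 == 0:
            result = i      # like "break i;" producing the loop's value
            break
        i += 1
    else:
        result = 100        # like "default 100": runs only without a break
    return result


print(first_multiple_of_seven(1, 10))
print(first_multiple_of_seven(1, 5))
```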

Match

A few different ideas for syntax are below:

Rust style:

match x
{
    1 => "a",
    2 => "b",
    _ => "x",
}

C#-ish style

match x
{
    case 1: "a",
    case 2: "b",
    default: "x",
}

This one involves less typing, but I would need to figure out how to deconflict it with type initialization.

match x
{
    1: "a",
    2: "b",
    else: "x",
}

Multiple values:

match x
{
              1: 10, # match 1
           2, 3: 20, # match 2 or 3
          5..<8: 30, # match 5, 6, or 7
         10..13: 40, # match 10, 11, 12, or 13
    20, 30..<33: 50, # match 20, 30, 31, or 32
           else:  0,
}
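
The arms above can be modeled in Python to check the intended range semantics. This sketch assumes ..< excludes the upper bound and .. includes it, as the comments in the example suggest; the data structure and names are only illustrative.

```python
# Each arm pairs a list of patterns with a result. A pattern is either a
# single value or a range object; 5..<8 becomes range(5, 8) and the
# inclusive 10..13 becomes range(10, 14).
ARMS = [
    ([1], 10),
    ([2, 3], 20),
    ([range(5, 8)], 30),
    ([range(10, 14)], 40),
    ([20, range(30, 33)], 50),
]


def match_value(x, arms=ARMS, default=0):
    for patterns, result in arms:
        for pattern in patterns:
            if isinstance(pattern, range):
                if x in pattern:
                    return result
            elif x == pattern:
                return result
    return default  # the else arm


print([match_value(x) for x in (1, 3, 7, 8, 13, 32, 33)])
```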

Operators

Declare and Assign

Potential new declare-and-assign syntax (similar to Go).

Variables:

x i32 := 123;

Constants:

x i32 ::= 123;

Does this lead to any ambiguous cases for more complex type names? If so, I could use a similar syntax to Jai:

Variables:

x : i32 = 123;

Constants:

x : i32 : 123;

Modulus vs. Remainder

Should the % operator calculate the modulus or the remainder (see Stack Overflow question)? I could make % the remainder and add another operator (e.g. %%) to calculate the modulus.
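
The two operations only differ when the operands have different signs. A small Python sketch, where remainder and modulus are illustrative names for the two candidate behaviors:

```python
def remainder(a, b):
    """Truncated remainder: the result takes the sign of the dividend."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def modulus(a, b):
    """Floored modulus: the result takes the sign of the divisor."""
    return a % b  # Python's % already works this way for integers


for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    print(a, b, remainder(a, b), modulus(a, b))
```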

Division

I'm thinking about implementing division similar to Python 3. The / operator would always return a floating-point number, even if its operands are integers, and a // operator would be added for integer division. I think this would make division less error-prone and the syntax more explicit. However, it's not obvious which floating-point type (single or double precision) / should return when both operands are integers.
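
For reference, this is how Python 3 behaves. Python avoids the precision question because its only built-in float type is double precision, and its // operator floors toward negative infinity, which is a separate choice WIP would have to make.

```python
true_div = 7 / 2        # true division of two ints yields a float
floor_div = 7 // 2      # floor division keeps the int type
neg_floor_div = -7 // 2 # floors toward negative infinity, not toward zero

print(true_div, floor_div, neg_floor_div)
```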

Chaining comparisons

Support chaining comparisons like Python and Julia. For example, x < y < z would be similar to (x < y) && (y < z) except that y would only be evaluated once.
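
Python's behavior shows the single-evaluation difference. In this sketch, y is a made-up function that counts its own calls:

```python
calls = 0


def y():
    global calls
    calls += 1
    return 5


chained = 1 < y() < 10               # y() is evaluated once
calls_after_chained = calls

expanded = (1 < y()) and (y() < 10)  # y() is evaluated twice
calls_after_expanded = calls - calls_after_chained

print(chained, calls_after_chained, expanded, calls_after_expanded)
```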

Functions

Function Definition

Make function definition more consistent with variable definition:

# without type
const add = fun(x i32, y i32) i32
{
    x + y
}

# with type
const add fun(x i32, y i32) i32 = fun(x i32, y i32) i32
{
    x + y
}

Type Syntax

Should I add -> before the return type? Would this prevent ambiguous parsing situations?

fun sum(a i32, b i32) -> i32
{
    return a + b;
}

Default Parameters

The current idea is that default values for parameters could be specified in a function definition using the = operator:

fun myFunction(a i32, b i32 = 1, c bool = false) i32
{
    # ...
}

All parameters having default values must be at the end of the parameter list. You would not be able to specify a parameter without a default value after a parameter with a default value:

# This would produce a compile error
fun myFunction(a i32, b i32 = 1, c bool) i32
{
    # ...
}
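
Python enforces the same ordering rule, which makes it easy to see the behavior being proposed. This is a Python sketch; my_function's body and the compiles helper are invented for the example.

```python
def my_function(a, b=1, c=False):
    return a - b if c else a + b


def compiles(source):
    """Return True if Python accepts the source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(my_function(10))                           # uses b = 1, c = False
print(compiles("def f(a, b=1, c=False): pass"))  # accepted
print(compiles("def f(a, b=1, c): pass"))        # rejected
```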

Named Arguments

Support named arguments when calling a function:

fun test(a i32, b i32, c bool) i32
{
    # ...
}

test(10, c = false, b = 42);

Should : be used (like C#) or = (like Python)? I prefer how = looks. However, could that get developers in trouble if they were trying to type ==? The only bad scenario I can think of is this:

fun test(b bool) bool
{
    # ...
}

# developer means to write this...
test(b == true);

# ...but accidentally writes this
test(b = true);

Variadic Functions

Ideas for variadic function syntax:

fun option1(param1 i32, param2 i32, params... []i32); # I prefer this one

fun option2(param1 i32, param2 i32, params... i32);

fun option3(param1 i32, param2 i32, params []i32...);

fun option4(param1 i32, param2 i32, params i32...);

Trusted

Add the trusted keyword to enable more error-prone features. Ideas for trusted features:

  1. Calling an extern function. An extern function must be called in a trusted block unless trusted is added to its declaration
extern fun e1();
extern trusted fun e2();

fun f()
{
    trusted
    {
        e1();
    }

    e2();
}
  2. Pointer math. This includes assigning an integer literal to a pointer, and performing arithmetic and bit operations on a pointer (e.g. +, -, <<, etc.)
fun f(p1 &i32) &i32
{
    var p2 &i32 = p1;

    trusted
    {
        p2 += 4;
    }

    return p2;
}
  3. Casting pointers
var p1 &i32 = &x;
trusted
{
    var p2 &i8 = cast(&i8, p1);
}
  4. Uninitialized variables
var x i32 = trusted;

It seems a little odd to use trusted here. Maybe undef would be better?

  5. Uninitialized arrays
var array []i32 = [1_024; trusted];
  6. Setting string members
fun f(chars []u8) str
{
    trusted
    {
        var s str = trusted;

        s.Data = chars;
        s.Size = chars.Size;

        return s;
    }
}

Built-Ins

Built-ins start with @. They can precede functions, structs, or struct members.

Should we have a way for users to write built-in functions?

const MyStruct = struct
@XmlSerializable
@JsonSerializable
{
    x i32,
    y bool,

    @NoSerialize
    z str,
};

Math

@e

The mathematical constant e (2.71828...).

@pi

The mathematical constant π (3.14159...).

@tau

The mathematical constant τ (2π or 6.28318...).

@abs(n int|float) int|float

Absolute value of int or float.

@max(values... any) any

Returns the maximum of the given values.

@min(values... any) any

Returns the minimum of the given values.

@fma(x float, y float, z float) float

Calculates fused multiply-add (x * y + z).

@sqrt(n float) float

Calculates the square root of a float.

@pow(n float, e float) float

Calculate n raised to the e power.

@log(n float, base float) float

Calculate the logarithm of n with base base.

@toDegrees(n float) float

Converts from radians to degrees.

@toRadians(n float) float

Converts from degrees to radians.

@sin(n float) float

Calculates the sine of a float.

@cos(n float) float

Calculates the cosine of a float.

@tan(n float) float

Calculates the tangent of a float.

@asin(x float) float

Calculates the arcsine of x.

@acos(x float) float

Calculates the arccosine of x.

@atan(x float) float

Calculates the arctangent of x.

@atan2(y float, x float) float

Calculates the arctangent of y / x, using the signs of both arguments to pick the quadrant. I might combine this with @atan().
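
One thing to keep in mind if the two are combined: a one-argument arctangent cannot distinguish opposite quadrants, so a merged built-in would presumably still need a two-argument form. Python's math module illustrates the difference:

```python
import math

# Both points have y / x == 1, but they lie in opposite quadrants.
first_quadrant = (math.atan(1 / 1), math.atan2(1, 1))
third_quadrant = (math.atan(-1 / -1), math.atan2(-1, -1))

print(first_quadrant)
print(third_quadrant)
```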

Bit Manipulation

@ones(t type) int

Returns a value of all ones for the given integer type.

@popCount(value int) int

Returns the population count (number of 1s) in the given int.

@rotLeft(value int, n int) int

Rotate value left by n bits.

@rotRight(value int, n int) int

Rotate value right by n bits.
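
A Python sketch of the intended results for these four built-ins. Python integers have no fixed width, so the bit width is passed explicitly here; in WIP it would presumably come from the integer type. Values are assumed to be non-negative and to fit in the given width.

```python
def ones(bits):
    """All ones for an integer type of the given width."""
    return (1 << bits) - 1


def pop_count(value):
    """Number of 1 bits in a non-negative integer."""
    return bin(value).count("1")


def rot_left(value, n, bits):
    n %= bits
    return ((value << n) | (value >> (bits - n))) & ones(bits)


def rot_right(value, n, bits):
    # Rotating right by n is rotating left by (bits - n).
    return rot_left(value, bits - (n % bits), bits)


print(bin(rot_left(0b1000_0001, 1, 8)), bin(rot_right(0b0000_0011, 1, 8)))
```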

Other

@assert(value bool, errorMessage str)

Run-time assert that value is true. Otherwise, print errorMessage and exit.

@bitCast(t type, value any) t

  • Cast ints <-> pointers
  • Cast ints <-> floats
  • Cast ints <-> ints

Types must be of the same size.
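
To illustrate the difference from a value-converting cast, here is a Python sketch that reinterprets the bits of a 32-bit float as a 32-bit unsigned integer using the struct module. The function names are placeholders.

```python
import struct


def bit_cast_f32_to_u32(value):
    """Reinterpret the 4 bytes of a float32 as an unsigned 32-bit int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bit_cast_u32_to_f32(bits):
    """Reinterpret an unsigned 32-bit int as a float32."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


print(hex(bit_cast_f32_to_u32(1.0)))  # the bit pattern, not the value 1
print(int(1.0))                       # a value conversion, for contrast
```

Both sides are 4 bytes, which matches the same-size requirement above.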

@cast(t type, value any) t

Replace current cast with this.

@compileAssert(value bool, errorMessage str)

Compile-time assert that value is true. Otherwise, causes a compile error and prints errorMessage.

@embedFile(filename str) []u8

Embed a file in the binary.

@error(message str)

Trigger a compiler error.

@sizeOf(value any) usize

Return the size in bytes (or bits?) of an identifier, struct member, or type.

@sourceInfo() SourceInfo

The source code info where this function is called. Returns the following struct:

const SourceInfo = struct
{
    Filename str,
    Line usize,
};

@typeInfo(t type) TypeInfo

Called on a type and returns the following struct:

const TypeInfo = struct
{
    Name str,
    Size usize, # size in bytes
    IsInt bool,
    IsSignedInt bool,
    IsUnsignedInt bool,
    IsFloat bool,
    IsPointer bool,
    IsStruct bool,
    IsArray bool,
    InnerType ?type,
    InnerTypeInfo ?TypeInfo,
};

@typeOf(value any) type

Return the type of an expression. Should there be any restrictions on what kind of expression this can be called on?

@Debug

Creates a ToString() function for the marked struct.

@Deprecated(message str)

Marks a function/struct as deprecated and prints a compiler warning if it's used.

@Deprecated("Do not use this!")
fun old_function() { }

@Layout(layoutType str)

See the description in the Layout section under Structs.

@Test(name str)

Marks a function as an automated test (unit test, regression test, etc.).

Namespaces

namespace Level1
{
    namespace Level2
    {
    }
}

namespace Level1.Level2
{
}

Or maybe do something completely different: Jon Blow's opinion.

Threading

Add a keyword to run a function on another thread. This should create a Task object with a handle to the thread. The function's return value could be safely accessed through the task object. Examples:

# run doStuff() on a new thread
run doStuff(1, 2, 3);

var task Task<i32> = run doStuff(1, 2, 3);
var x i32 = task.Result; # this will block until the thread completes
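
Python's concurrent.futures offers a similar shape, where a future plays the role of the Task object. This is a Python sketch of the idea, not a proposal for the WIP runtime; do_stuff is a stand-in function.

```python
from concurrent.futures import ThreadPoolExecutor


def do_stuff(a, b, c):
    return a + b + c


with ThreadPoolExecutor(max_workers=1) as executor:
    task = executor.submit(do_stuff, 1, 2, 3)  # like: run doStuff(1, 2, 3)
    x = task.result()  # blocks until the function has finished

print(x)
```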

Keyword ideas:

  • launch
  • run
  • spawn
  • start
  • thread

I like start as the keyword, but that could be a common variable name.

Random Thoughts

  • Multiple function implementations?
  • Do we need RAII?
  • Multiple memory allocators
  • defer keyword (Go, Jai)
  • Run code at compile time (e.g. #run directive) (D, Jai)
  • Static assert
  • Pack bools in structs
  • Option to init arrays (Jai)
  • Add int and uint
  • Set size of array index? (Jai)
  • for loop iterate by pointer (Jai)
  • Loop non-local break/continue by specifying iterator (Jai)
  • Enums (Jai)
  • Force functions to be inline (Jai)
  • Re-write C, C++, etc. in WIP?
  • Use composition instead of inheritance. Add syntax (using?) to decrease typing? (Jai)
  • Struct data hot/cold (Jai)
  • AOS (arrays of structures) vs. SOA (structure of arrays) (Jai)
  • Embedded assembly
  • Importing: import only a, b, c or import except x, y, z
  • Pass context into all functions with logger, allocator, etc.? (Jai)
  • Multiple return values (Jai)
  • Named return values? (Jai)
  • Loop over varargs (Jai)
  • Expand array into varargs
  • Efficient template code generation (Jai)
  • "Baking" template functions (Jai)
  • Debug module for attaching debugger and setting breakpoint
  • transmute operator (Odin)
  • try to return errors (Zig)
  • catch unreachable (Zig)
  • Why C++ namespaces are bad (Jai)
  • inline while (Zig)
  • Function parameters (Zig)