Tue 20 Sep 2016 — Fri 14 Oct 2016

Julia

https://github.com/JeffBezanson/phdthesis/raw/master/main.pdf

These are notes on the introduction to the Julia white paper (Bezanson's thesis).

It's about code selection vs performance. Dynamic dispatch (multi-methods). Automatic type specialization?

Bonus: code generation? Generated code can be inlined?

Why does technical computing tend to use code generation? Probably because thinking up abstractions is hard (sometimes it's not possible)? Does this lead to horrors?

Doesn't appear to have considered how the language will affect tooling (the Scala problem again…).

The gist is: "What can we do with a little bit of type information?"

Technical Computing

We use a dynamic language so that we can fudge things (guess the type etc.).

We need multiple binding stages.

Maths often doesn't fit into type systems.

Vectorisation

This has been used for performance in the past, but:

  • Can be weird to write.
  • Hard to know how to get good performance.
  • Sometimes doesn't give good performance compared to writing a loop.
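A sketch of that last point in Julia (my own example, not from the paper): the vectorised form allocates an intermediate array, while the plain loop does not.

```julia
xs = rand(1000)

# Vectorised style: concise, but `xs .^ 2` allocates a temporary array.
sumsq_vec(xs) = sum(xs .^ 2)

# Loop style: more code, but no intermediate allocation.
function sumsq_loop(xs)
    s = 0.0
    for x in xs
        s += x * x
    end
    return s
end
```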

Data Structures

C-like languages use data structures (performance).

Python-like languages use lots of pointers (flexibility).

Pointer-heavy data structures get really bad when you need to fit a lot of data in memory (every element costs a pointer plus a boxed object).
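A quick contrast of the two layouts in Julia syntax (my sketch):

```julia
# "C-like": fields stored inline; an array of these is one contiguous block.
struct Point2D
    x::Float64
    y::Float64
end
inline_points = [Point2D(1.0, 2.0), Point2D(3.0, 4.0)]

# "Python-like": every element sits behind a pointer to a boxed value,
# so memory use and cache behaviour are much worse at scale.
boxed_points = Any[Point2D(1.0, 2.0), Point2D(3.0, 4.0)]
```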

Types

Julia aims to let you change the type-system a lot?

It looks like you can overload methods with more specific types, and also get multiple dispatch.

By contrast, in R and Python etc. you have to do a lot of checking with conditionals.
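A minimal sketch of the contrast (function names are mine):

```julia
# Julia: one method per type combination; the compiler/runtime picks the
# most specific one for the argument types (multiple dispatch).
combine(a::Number, b::Number) = a + b
combine(a::String, b::String) = a * b          # * is string concatenation
combine(a::Vector, b::Vector) = vcat(a, b)

# The R/Python equivalent is one function full of conditionals:
#   if isa(a, Number) ... elseif isa(a, String) ... else ...
```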

Writing these extra types is the library author's job: we trade a bit of their time for extra compiler understanding, and the end user still gets to treat the library as a box they can copy and paste from.

This isn't quite the same as multimethods, because it allows specialization? I don't follow? Why not?

Types are Values

Fair enough.

Can use typejoin to compute the smallest common supertype of two types (Union gives the exact union, a different thing).
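For example (as I understand Julia's behaviour):

```julia
typejoin(Int64, Float64)   # Real — nearest common supertype
Union{Int64, Float64}      # the exact union, a smaller type than Real
```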

Specialisation

  1. Take a generic function
  2. The compiler generates some specific versions of it for faster performance
  3. Let the human override some of these with their own implementation

Their own implementation doesn't have to be just for performance - it could be for different behaviour.
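Step 3 in Julia terms (my example): a human-written method shadows whatever the compiler would have specialised, and it can change behaviour, not just speed.

```julia
describe(x) = "some value"                        # generic fallback
describe(x::Integer) = "the integer $x"           # override: new behaviour
describe(x::Matrix) = "a $(size(x,1))×$(size(x,2)) matrix"
```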

Because functions are specialized, their output is not generic. This means we know what the functions they call will be called with, which means we can analyse all the way down. Did I understand this right?

The important thing is we always choose the most specific version of a function available?

TODO Regular Expressions

Regular expressions denote sets? How so?
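Partial answer to my own question: each regex r denotes a language L(r), the set of strings it matches, built up compositionally, e.g.

```latex
L(a) = \{a\}, \qquad
L(r_1 r_2) = \{\, uv : u \in L(r_1),\; v \in L(r_2) \,\}, \qquad
L(a^{*}) = \{\varepsilon, a, aa, aaa, \dots\}
```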

TODO Symbolic Computing

Symbolic rewrite rules - not sure what this is.

Predicate Dispatch

This is basically evaluating some if statements to decide which function to run.
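A hand-rolled sketch of the idea (Julia itself dispatches on types, not arbitrary predicates; all names here are mine):

```julia
# Each "method" is a (guard, implementation) pair; we run the first
# implementation whose guard predicate holds for the argument.
const METHODS = [
    (x -> x isa Integer && iseven(x), x -> "even integer"),
    (x -> x isa Integer,              x -> "odd integer"),
    (x -> true,                       x -> "fallback"),
]

function pdispatch(x)
    for (guard, impl) in METHODS
        guard(x) && return impl(x)
    end
end
```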

Domain Theory

How do we get meaning out of programs? Domain theory.

Abstract interpretation used by compilers to get information about the program.

If we can discover some property about part of the program, we may be able to safely optimise it.

Humans may have more information about what can actually be in a part of a program than the computer does.
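Julia's @inbounds is a concrete instance of this (my example): the human asserts a property — the index is always in range — that the compiler may not be able to prove, and the bounds checks get dropped.

```julia
function unchecked_sum(xs::Vector{Float64})
    s = 0.0
    for i in eachindex(xs)
        @inbounds s += xs[i]   # human promises i is in bounds
    end
    return s
end
```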

Language

Data structures may be mutable or not.

It's got clever indexing/subsetting like R (calls it projection), and mutation of subsets too.
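E.g. (my sketch):

```julia
A = zeros(3, 3)
A[1:2, 3]          # project out a sub-block (like R's subsetting)
A[1:2, 3] .= 1.0   # mutate just that subset in place
```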

Types are parameterized.
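E.g. a container parameterized over its element type (my example):

```julia
struct Box{T}
    val::T
end
Box(1)      # a Box{Int64} (on a 64-bit machine)
Box("hi")   # a Box{String}
```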

Constructors

Types are tags plus things you can make with tags by combining them with Union and friends.

There is new, which takes a type (tag) and some data.

The type may be constructed programmatically (the example given is MyType(x+y)).

Usually you don't use new directly. A library author exposes it with wrapper functions.
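Sketch of that division of labour (my example):

```julia
struct Interval
    lo::Float64
    hi::Float64
    # inner constructor: the only place `new` is used, enforcing an invariant
    Interval(lo, hi) = lo <= hi ? new(lo, hi) : error("lo > hi")
end

# Wrapper the library author exposes instead of raw `new`:
around(mid, r) = Interval(mid - r, mid + r)
```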

Staged Programming

Add @generated

Use :() to quote an expression. (Only pure functional macros allowed.)

Interpolate inside quotation using $ inside :()

This works nicely (and efficiently), because we have lots of run-time type information anyway, since we are doing our specialization calculations.

Staging only works with concrete types.
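Putting those notes together (my example; inside an @generated body the argument names are bound to their types, which is why concrete types are needed):

```julia
# Returns the number of fields of x's type, computed at compile time.
@generated function nfields_of(x)
    n = fieldcount(x)   # here `x` is the (concrete) type of the argument
    return :( $n )      # quote with :(), splice the constant in with $
end

nfields_of((1, 2.0))   # the body is just the literal 2 — no run-time work
```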

Function Types

The thesis argues that arrow-style function types (A → B) are not useful here.