# Julia Nomenclature

These are some terms that get thrown around a lot by Julia programmers. This is a brief writeup of a few of them.

## Closures

A closure is created when a function (normally returned from another function) references variables from its enclosing scope. We say that the function closes over those variables.

### Simple

This closure closes over count.
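A minimal sketch of such a counter (the name make_counter is illustrative):

```julia
function make_counter()
    count = 0
    return function()
        count += 1  # mutates the closed-over variable
        return count
    end
end

counter = make_counter()
counter()  # 1
counter()  # 2
```

Each call to make_counter creates a fresh count, so independent counters do not interfere with each other.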


### Useful

I use this to control early stopping when training neural networks. It closes over best_loss and remaining_patience.
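A sketch of what such an early-stopping check might look like. Only the closed-over names best_loss and remaining_patience come from the text; make_earlystopping and the exact logic are assumptions:

```julia
function make_earlystopping(patience::Int)
    best_loss = Inf
    remaining_patience = patience
    function should_stop!(loss)
        if loss < best_loss
            best_loss = loss               # new best: remember it
            remaining_patience = patience  # and reset the patience
        else
            remaining_patience -= 1
        end
        return remaining_patience <= 0
    end
    return should_stop!
end

should_stop! = make_earlystopping(2)
should_stop!(10.0)  # false: new best loss
should_stop!(11.0)  # false: patience 2 -> 1
should_stop!(12.0)  # true: patience exhausted
```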


## You may be using closures without realising it

e.g. the following closes over model:

```julia
function runall(dates)
    model = Model()
    pmap(dates) do the_day
        simulate(model, the_day)
    end
end
```


# Parallelism

3 types:

• Multiprocessing / Distributed
• Multithreading
• Asynchronous / Coroutines

## Multiprocessing / Distributed

• this is pmap, remotecall, @spawn
• Actually starts separate Julia processes
• potentially on another machine
• Often has high communication overhead
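A minimal sketch (slow_double is an illustrative stand-in for real work):

```julia
using Distributed
addprocs(2)  # start two separate worker processes

# the function must be defined on every process:
@everywhere slow_double(x) = (sleep(1); 2x)

# the calls are farmed out across the workers:
pmap(slow_double, 1:4)  # [2, 4, 6, 8]
```

Note that the arguments and results are serialised between processes, which is where the communication overhead comes from.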

## Multithreading

• this is @threads
• Also coming in Julia 1.2 is PARTR
• Can be unsafe; care must always be taken to do things in a threadsafe way
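A minimal sketch; having each iteration write only to its own slot of a preallocated array is one common threadsafe pattern:

```julia
using Base.Threads

results = zeros(Int, 8)
@threads for i in 1:8
    results[i] = i^2  # each iteration writes its own slot: threadsafe
end
results  # [1, 4, 9, 16, 25, 36, 49, 64]
```

By contrast, having every iteration update a single shared accumulator would be a data race.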

## Asynchronous / Coroutines

• this is @async and asyncmap
• Does not actually allow two things to run at once, but allows tasks to take turns running
• Mostly safe
• Does not lead to speedup unless the “work” is done elsewhere
• e.g. in IO the time is spent filling network buffers / spinning up disks
• e.g. if you are spawning extra processes, like with run, the time is spent in those processes
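A minimal sketch using asyncmap; the sleep stands in for time spent waiting on IO:

```julia
# Three 1-second "downloads" overlap, so this takes about 1 second,
# not 3, even though only one task is ever running at an instant.
squares = asyncmap(1:3) do i
    sleep(1)  # stand-in for waiting on the network
    i^2
end
squares  # [1, 4, 9]
```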

# Dynamic Dispatch vs Static Dispatch

• If which method to call needs to be decided at runtime then it will be a dynamic dispatch
• i.e. if it needs to be decided by the values of the input, or by external factors
• If it can be decided at compile time it will be a static dispatch
• i.e. if it can be decided only by the types of the input
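An illustrative sketch (the types and functions are made up for the example):

```julia
struct Cat end
struct Dog end
sound(::Cat) = "meow"
sound(::Dog) = "woof"

# Static dispatch: the argument type (Cat) is known at compile time,
# so the compiler can resolve the call directly.
sound(Cat())  # "meow"

# Dynamic dispatch: the element type of `animals` is Any, so which
# method to call must be looked up at runtime for each element.
animals = Any[Cat(), Dog()]
[sound(a) for a in animals]  # ["meow", "woof"]
```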


## Type Stability

Closely related to Dynamic vs Static Dispatch

• If the return type can be decided at compile time then it is type stable
• i.e. if the return type is decided only by the types of the input
• If the return type can’t be decided until run time then it is type unstable
• i.e. if the return type is decided by the values of the input, or by external factors
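An illustrative pair of functions (made up for the example); @code_warntype on the unstable one would highlight the Union return type:

```julia
# Type stable: always returns a Float64, whatever the value of x.
stable(x::Float64) = x > 0 ? x : zero(x)

# Type unstable: returns a Float64 for x > 0 but an Int for x <= 0,
# so the return type depends on the *value* of the input.
unstable(x::Float64) = x > 0 ? x : 0

typeof(stable(-1.0))    # Float64
typeof(unstable(-1.0))  # Int64 (on a 64-bit machine)
```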


# Type Piracy

If your package did not define

• the function (name); or
• at least 1 of the argument types

then you are committing type piracy, and this is a bad thing.
By committing type piracy you can break code in other modules, even if they don’t import your definitions.


### Let's define a new method, to reduce the magnitude of the first element by the first argument and the second by the second

We are going to call it mapreduce because it is kind of mapping this reduction in magnitude. And because this is a slightly forced example.
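The original code cell is missing; a sketch consistent with the description and with the stacktrace shown below might be (the exact body is an assumption):

```julia
# Type piracy: we own neither the function (Base's mapreduce)
# nor any of the argument types.
function Base.mapreduce(dec_mag1, dec_mag2, xs::AbstractVector)
    sign.(xs) .* (abs.(xs) .- [dec_mag1, dec_mag2])
end

mapreduce(5, 2, [-60, 30])  # [-55, 28]
```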


### Let's sum some numbers

Calling sum on a vector of integers now throws, because sum is implemented in terms of mapreduce and our overly broad method captures the call:

```
DimensionMismatch("arrays could not be broadcast to a common size")

Stacktrace:
 [11] mapreduce(::Function, ::Function, ::Array{Int64,1}) at ./In[19]:3
 [12] _sum at ./reducedim.jl:653 [inlined]
 [13] _sum at ./reducedim.jl:652 [inlined]
 [14] #sum#550 at ./reducedim.jl:648 [inlined]
 [15] sum(::Array{Int64,1}) at ./reducedim.jl:648
 [16] top-level scope at In[21]:1
```


## Glue Packages

Sometimes to make two packages work together, you have to make them aware of each others types.

For example, to implement

```julia
convert(::Type{DataFrame}, axisarray::AxisArray)
```
where

• convert is from Base
• DataFrame is from DataFrames.jl
• AxisArray is from AxisArrays.jl

Then the only way to do this without committing type piracy is to do it in either DataFrames.jl or AxisArrays.jl. But that would mean one of them has to take a dependency on the other, which isn’t great.

So instead we have a glue package, e.g. DataFrameAxisArrayBuddies.jl, that adds this method. It is piracy, but it is fairly safe piracy, since it adds behaviour to a combination of types that would otherwise just be a MethodError. Misdemeanour type piracy.
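A toy sketch of the idea, with stand-in modules instead of the real packages (all names here are made up):

```julia
# Stand-in for a tables package:
module LibTables
    struct Table
        cols::Dict{Symbol, Any}
    end
end

# Stand-in for an arrays package:
module LibArrays
    struct NamedArray
        names::Vector{Symbol}
        data::Matrix{Float64}
    end
end

# The "glue package": it owns neither convert (Base's), Table,
# nor NamedArray, so this is technically piracy. But the method
# would otherwise just be a MethodError, so it is fairly safe.
module GlueTablesArrays
    using ..LibTables, ..LibArrays
    function Base.convert(::Type{LibTables.Table}, a::LibArrays.NamedArray)
        LibTables.Table(Dict{Symbol, Any}(
            name => a.data[:, i] for (i, name) in enumerate(a.names)))
    end
end

a = LibArrays.NamedArray([:x, :y], [1.0 2.0; 3.0 4.0])
t = convert(LibTables.Table, a)
t.cols[:x]  # [1.0, 3.0]
```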

## Wrapper Types and Delegation Pattern

I would argue that this is a core part of polymorphism via composition.

In the following example, we construct SampledVector, which is a vector-like type that has fast access to the total so that it can quickly calculate the mean. It is a wrapper of the Vector type, and it delegates several methods to it.

Even though it overloads Statistics.mean, and push!, size, and getindex from Base, we do not commit type piracy, as we always own one of the types: the SampledVector.
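The original cells are missing; a sketch of what such a wrapper might look like (the implementation details are assumptions):

```julia
using Statistics

# Wrapper around Vector that caches the running total,
# so mean needs no pass over the data.
mutable struct SampledVector{T} <: AbstractVector{T}
    total::T
    backing::Vector{T}
end
SampledVector{T}() where {T} = SampledVector{T}(zero(T), T[])

# Delegate the array interface to the backing Vector:
Base.size(v::SampledVector) = size(v.backing)
Base.getindex(v::SampledVector, i::Int) = v.backing[i]

function Base.push!(v::SampledVector, x)
    v.total += x       # keep the cached total in sync
    push!(v.backing, x)
    return v
end

# O(1) mean from the cached total. Not piracy: we own SampledVector.
Statistics.mean(v::SampledVector) = v.total / length(v.backing)

samples = SampledVector{Float64}()
push!(samples, 1.0)
push!(samples, 3.0)
mean(samples)  # 2.0
```

A fuller version would also overload setindex! so that overwriting an element keeps the total in sync.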


## Views vs Copies

In Julia, indexing to slice an array produces a copy. ys = xs[1:3, :] will allocate a new array with the first 3 rows of xs copied into it. Modifying ys will not modify xs. Further, ys is certain to suit fast CPU operations because of its contiguous striding. However, allocating the memory is itself quite slow.

In contrast, one can take a view into the existing array with @view or the function view. ys = @view xs[1:3, :] will make a SubArray, which acts like an array containing only the first 3 rows of xs. But creating it will not allocate (in Julia 1.5 literally not at all; prior to that it allocates a handful of bytes for a pointer). Further, mutating the content of ys will mutate the content of xs. It may or may not be able to hit the very fast CPU operations, depending on the striding pattern, but it will probably be pretty good nonetheless, since it avoids the slow allocation.
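A small demonstration of the difference:

```julia
xs = [10 20; 30 40; 50 60; 70 80]

ys = xs[1:3, :]        # copy: allocates a new 3×2 array
ys[1, 1] = 0           # xs is untouched

zs = @view xs[1:3, :]  # SubArray: no copy made
zs[1, 1] = 0           # mutates xs too: xs[1, 1] is now 0
```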

Note this is a difference from numpy, where slicing always creates views and you opt out by calling copy(x[...]), and from MATLAB, where slicing always copies (it has been long enough that I don’t remember how to opt into views).

The concept of views vs copies is more general than just arrays. Substrings are views into strings.

They also apply to DataFrames. Indexing into a DataFrame is fairly intuitive, though it looks complex when written down. @view and indexing work as with Arrays: normal indexing creates a new DataFrame (or a Vector, if just one column is indexed) with a copy of the selected region, and @view makes a SubDataFrame. But there is an additional interesting case: accessing a DataFrame column either by getproperty (as in df.name) or via ! indexing (as in df[!, :name]) creates what is conceptually a view into that column of the DataFrame. Even though it is AbstractVector typed (rather than SubArray typed), it acts like a view: creating it is non-allocating, and mutating it mutates the original DataFrame. Implementation-wise it is actually direct access to the DataFrame’s internal column storage, but semantically it is a view into that column of the DataFrame.

## Tim Holy Traits

Traits are something that naturally falls out of having functions that can be evaluated on types at compile time, combined with multiple dispatch. See the previous post for details, and a better post to come.
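A minimal sketch of the pattern (the names are illustrative):

```julia
# Trait types:
abstract type Wingedness end
struct HasWings <: Wingedness end
struct NoWings <: Wingedness end

struct Bird end
struct Worm end

# The trait function: computed purely from the type, so the
# compiler can resolve it at compile time.
wingedness(::Type{Bird}) = HasWings()
wingedness(::Type{Worm}) = NoWings()

# Re-dispatch on the trait value rather than on the type itself:
locomote(x::T) where {T} = locomote(wingedness(T), x)
locomote(::HasWings, x) = "flies"
locomote(::NoWings, x) = "crawls"

locomote(Bird())  # "flies"
locomote(Worm())  # "crawls"
```

The payoff is that unrelated types can opt in to shared behaviour just by defining the trait function, without needing a common abstract supertype.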