# Multi-Threading

This experimental interface supports Julia's multi-threading capabilities. Types and functions described here might (and likely will) change in the future.

`Base.Threads.threadid`

— Function.`Threads.threadid()`

Get the ID number of the current thread of execution. The master thread has ID `1`.

`Base.Threads.nthreads`

— Function.`Threads.nthreads()`

Get the number of threads available to the Julia process. This is the inclusive upper bound on `threadid()`.

`Base.Threads.@threads`

— Macro.`Threads.@threads`

A macro to parallelize a for-loop to run with multiple threads. This spawns `nthreads()` threads, splits the iteration space among them, and iterates in parallel. A barrier at the end of the loop waits for all threads to finish execution before the loop returns.
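As a minimal sketch of the three names above used together, the loop below records which thread ran each iteration. The array name `ran_on` and the loop size are illustrative assumptions; the assignment of iterations to threads depends on how many threads the process was started with.

```julia
using Base.Threads

# Record which thread ran each iteration (hypothetical array name).
ran_on = zeros(Int, 8)
@threads for i in 1:8
    ran_on[i] = threadid()
end

println("threads available: ", nthreads())
println("iteration -> thread: ", ran_on)
```

Each iteration writes to its own array slot, so no synchronization is needed here. Shared state that several iterations update requires an atomic or a lock.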

`Base.Threads.Atomic`

— Type.`Threads.Atomic{T}`

Holds a reference to an object of type `T`, ensuring that it is only accessed atomically, i.e. in a thread-safe manner.

Only certain "simple" types can be used atomically, namely the primitive boolean, integer, and floating-point types. These are `Bool`, `Int8`...`Int128`, `UInt8`...`UInt128`, and `Float16`...`Float64`.

New atomic objects can be created from non-atomic values; if none is specified, the atomic object is initialized with zero.

Atomic objects can be accessed using the `[]` notation:

**Examples**

```
julia> x = Threads.Atomic{Int}(3)
Base.Threads.Atomic{Int64}(3)
julia> x[] = 1
1
julia> x[]
1
```

Atomic operations use an `atomic_` prefix, such as `atomic_add!`, `atomic_xchg!`, etc.
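A typical use of an `Atomic` is a counter shared between the iterations of a threaded loop. The following is a small sketch (the variable name `counter` and the iteration count are illustrative):

```julia
using Base.Threads

# Shared counter; Atomic{Int}() with no argument would also start at zero.
counter = Atomic{Int}(0)

@threads for i in 1:1000
    atomic_add!(counter, 1)   # safe even when iterations run concurrently
end

println(counter[])   # prints 1000
```

A plain `Int` incremented with `+=` in the same loop could lose updates, because the read and the write are separate steps.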

`Base.Threads.atomic_cas!`

— Function.`Threads.atomic_cas!(x::Atomic{T}, cmp::T, newval::T) where T`

Atomically compare-and-set `x`

Atomically compares the value in `x` with `cmp`. If equal, writes `newval` to `x`. Otherwise, leaves `x` unmodified. Returns the old value in `x`. By comparing the returned value to `cmp` (via `===`) one knows whether `x` was modified and now holds the new value `newval`.

For further details, see LLVM's `cmpxchg` instruction.

This function can be used to implement transactional semantics. Before the transaction, one records the value in `x`. After the transaction, the new value is stored only if `x` has not been modified in the meantime.

**Examples**

```
julia> x = Threads.Atomic{Int}(3)
Base.Threads.Atomic{Int64}(3)
julia> Threads.atomic_cas!(x, 4, 2);
julia> x
Base.Threads.Atomic{Int64}(3)
julia> Threads.atomic_cas!(x, 3, 2);
julia> x
Base.Threads.Atomic{Int64}(2)
```
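The transactional pattern described above is usually written as a retry loop. The sketch below uses it to build an atomic multiply, an operation with no built-in `atomic_` function. The helper name `atomic_mul!` is hypothetical and not part of `Base.Threads`.

```julia
using Base.Threads

# Hypothetical helper: multiply the value in `x` by `factor` atomically.
function atomic_mul!(x::Atomic{T}, factor::T) where T
    while true
        old = x[]                  # record the value before the transaction
        desired = old * factor     # compute the new value from the recorded one
        # Store only if `x` still holds `old`; otherwise another thread
        # changed it in the meantime and the transaction is retried.
        if atomic_cas!(x, old, desired) === old
            return old             # like the built-in operations, return the old value
        end
    end
end

x = Atomic{Int}(3)
previous = atomic_mul!(x, 4)
println(previous, " -> ", x[])   # prints 3 -> 12
```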

`Base.Threads.atomic_xchg!`

— Function.`Threads.atomic_xchg!(x::Atomic{T}, newval::T) where T`

Atomically exchange the value in `x`

Atomically exchanges the value in `x` with `newval`. Returns the **old** value.

For further details, see LLVM's `atomicrmw xchg` instruction.

**Examples**

```
julia> x = Threads.Atomic{Int}(3)
Base.Threads.Atomic{Int64}(3)
julia> Threads.atomic_xchg!(x, 2)
3
julia> x[]
2
```
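Because the old value is returned, `atomic_xchg!` can decide which of several threads reaches a point first. In this sketch (the names `claimed` and `winners` are illustrative), exactly one iteration sees the old value `0`:

```julia
using Base.Threads

claimed = Atomic{Int}(0)   # 0 = unclaimed, 1 = claimed
winners = Atomic{Int}(0)   # counts iterations that saw the unclaimed state

@threads for i in 1:100
    # Set the flag unconditionally; only the first caller gets 0 back.
    if atomic_xchg!(claimed, 1) == 0
        atomic_add!(winners, 1)
    end
end

println(winners[])   # prints 1
```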

`Base.Threads.atomic_add!`

— Function.`Threads.atomic_add!(x::Atomic{T}, val::T) where T <: ArithmeticTypes`

Atomically add `val` to `x`

Performs `x[] += val` atomically. Returns the **old** value. Not defined for `Atomic{Bool}`.

For further details, see LLVM's `atomicrmw add` instruction.

**Examples**

```
julia> x = Threads.Atomic{Int}(3)
Base.Threads.Atomic{Int64}(3)
julia> Threads.atomic_add!(x, 2)
3
julia> x[]
5
```

`Base.Threads.atomic_sub!`

— Function.`Threads.atomic_sub!(x::Atomic{T}, val::T) where T <: ArithmeticTypes`

Atomically subtract `val` from `x`

Performs `x[] -= val` atomically. Returns the **old** value. Not defined for `Atomic{Bool}`.

For further details, see LLVM's `atomicrmw sub` instruction.

**Examples**

```
julia> x = Threads.Atomic{Int}(3)
Base.Threads.Atomic{Int64}(3)
julia> Threads.atomic_sub!(x, 2)
3
julia> x[]
1
```

`Base.Threads.atomic_and!`

— Function.`Threads.atomic_and!(x::Atomic{T}, val::T) where T`

Atomically bitwise-and `x` with `val`

Performs `x[] &= val` atomically. Returns the **old** value.

For further details, see LLVM's `atomicrmw and` instruction.

**Examples**

```
julia> x = Threads.Atomic{Int}(3)
Base.Threads.Atomic{Int64}(3)
julia> Threads.atomic_and!(x, 2)
3
julia> x[]
2
```

`Base.Threads.atomic_nand!`

— Function.`Threads.atomic_nand!(x::Atomic{T}, val::T) where T`

Atomically bitwise-nand (not-and) `x` with `val`

Performs `x[] = ~(x[] & val)` atomically. Returns the **old** value.

For further details, see LLVM's `atomicrmw nand` instruction.

**Examples**

```
julia> x = Threads.Atomic{Int}(3)
Base.Threads.Atomic{Int64}(3)
julia> Threads.atomic_nand!(x, 2)
3
julia> x[]
-3
```

`Base.Threads.atomic_or!`

— Function.`Threads.atomic_or!(x::Atomic{T}, val::T) where T`

Atomically bitwise-or `x` with `val`

Performs `x[] |= val` atomically. Returns the **old** value.

For further details, see LLVM's `atomicrmw or` instruction.

**Examples**

```
julia> x = Threads.Atomic{Int}(5)
Base.Threads.Atomic{Int64}(5)
julia> Threads.atomic_or!(x, 7)
5
julia> x[]
7
```

`Base.Threads.atomic_xor!`

— Function.`Threads.atomic_xor!(x::Atomic{T}, val::T) where T`

Atomically bitwise-xor (exclusive-or) `x` with `val`

Performs `x[] ⊻= val` atomically. Returns the **old** value.

For further details, see LLVM's `atomicrmw xor` instruction.

**Examples**

```
julia> x = Threads.Atomic{Int}(5)
Base.Threads.Atomic{Int64}(5)
julia> Threads.atomic_xor!(x, 7)
5
julia> x[]
2
```

`Base.Threads.atomic_max!`

— Function.`Threads.atomic_max!(x::Atomic{T}, val::T) where T`

Atomically store the maximum of `x` and `val` in `x`

Performs `x[] = max(x[], val)` atomically. Returns the **old** value.

For further details, see LLVM's `atomicrmw max` instruction.

**Examples**

```
julia> x = Threads.Atomic{Int}(5)
Base.Threads.Atomic{Int64}(5)
julia> Threads.atomic_max!(x, 7)
5
julia> x[]
7
```

`Base.Threads.atomic_min!`

— Function.`Threads.atomic_min!(x::Atomic{T}, val::T) where T`

Atomically store the minimum of `x` and `val` in `x`

Performs `x[] = min(x[], val)` atomically. Returns the **old** value.

For further details, see LLVM's `atomicrmw min` instruction.

**Examples**

```
julia> x = Threads.Atomic{Int}(7)
Base.Threads.Atomic{Int64}(7)
julia> Threads.atomic_min!(x, 5)
7
julia> x[]
5
```

`Base.Threads.atomic_fence`

— Function.`Threads.atomic_fence()`

Insert a sequential-consistency memory fence

Inserts a memory fence with sequentially-consistent ordering semantics. There are algorithms where this is needed, i.e. where an acquire/release ordering is insufficient.

This is likely a very expensive operation. Given that all other atomic operations in Julia already have acquire/release semantics, explicit fences should not be necessary in most cases.

For further details, see LLVM's `fence` instruction.

## ccall using a threadpool (Experimental)

`Base.@threadcall`

— Macro.`@threadcall((cfunc, clib), rettype, (argtypes...), argvals...)`

The `@threadcall` macro is called in the same way as `ccall` but does the work in a different thread. This is useful when you want to call a blocking C function without causing the main `julia` thread to become blocked. Concurrency is limited by the size of the libuv thread pool, which defaults to 4 threads but can be increased by setting the `UV_THREADPOOL_SIZE` environment variable and restarting the `julia` process.

Note that the called function should never call back into Julia.
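As a hedged illustration, the sketch below calls the C library's blocking `sleep` function from a task. It assumes a POSIX system whose C library exports `sleep` with the signature `unsigned int sleep(unsigned int)`; on other platforms a different blocking function would have to be substituted.

```julia
# Assumption: a POSIX libc that exports `sleep(unsigned int)`.
# The call blocks a libuv pool thread for one second, not the main thread.
t = @async @threadcall(:sleep, Cuint, (Cuint,), 1)

println("the main task keeps running while the C call blocks")

# `fetch` waits for the C call to finish and returns its result
# (the number of seconds left unslept, per the C function's contract).
remaining = fetch(t)
println("sleep returned ", remaining)
```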

# Low-level synchronization primitives

These building blocks are used to create the regular synchronization objects.

`Base.Threads.Mutex`

— Type.`Mutex()`

These are standard system mutexes for locking critical sections of logic.

On Windows, this is a critical section object; on pthreads, this is a `pthread_mutex_t`.

See also `SpinLock` for a lighter-weight lock.

`Base.Threads.SpinLock`

— Type.`SpinLock()`

Create a non-reentrant lock. Recursive use will result in a deadlock. Each `lock` must be matched with an `unlock`.

Test-and-test-and-set spin locks are quickest up to about 30 contending threads. If you have more contention than that, perhaps a lock is the wrong way to synchronize.

See also `Mutex` for a more efficient version on one core or if the lock may be held for a considerable length of time.
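A minimal sketch of matching each `lock` with an `unlock`, here guarding an update to a shared, non-atomic value. The names `l` and `total` are illustrative, and the `try`/`finally` form is one common way to guarantee the release, not a requirement stated above.

```julia
using Base.Threads

l = SpinLock()
total = Ref(0)   # plain (non-atomic) shared state

@threads for i in 1:1000
    lock(l)
    try
        total[] += i      # critical section: one thread at a time
    finally
        unlock(l)         # always release, even if the body throws
    end
end

println(total[])   # prints 500500
```

The critical section is kept short on purpose, since waiting threads spin while the lock is held.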