---
engine: julia
---

# Containers

Julia offers a wide selection of container types with largely similar interfaces.
We introduce `Tuple`, `Range`, and `Dict` here, and in the next chapter we will cover `Array`, `Vector`, and `Matrix`.

These containers are:

- **iterable:** You can iterate over the elements of the container:
```julia
for x ∈ container ... end
```
- **indexable:** You can access elements via their index:
```julia
x = container[i]
```
and some are also

- **mutable:** You can add, remove, and modify elements.

Furthermore, there are several common functions, e.g.,

- `length(container)` --- number of elements
- `eltype(container)` --- type of elements
- `isempty(container)` --- test whether container is empty
- `empty!(container)` --- empties container (only if mutable)
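
As a small illustration of these common functions, the following sketch applies them to a tuple, a range, and a dictionary. The variable names and values are chosen here for illustration only.

```julia
t = (1, 2.5, "three")           # tuple
r = 1:5                         # range
d = Dict("a" => 1, "b" => 2)    # dictionary (mutable)

@show length(t) length(r) length(d)
@show eltype(r) eltype(d)
@show isempty(d)

empty!(d)                       # works because a Dict is mutable
@show isempty(d)
```

Calling `empty!` on the tuple or the range would fail, since neither of them is mutable.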

## Tuples

A tuple is an immutable container of elements. It is therefore not possible to add new elements or change the value of an element.

```{julia}
t = (33, 4.5, "Hello")

@show t[2]                 # indexable

for i ∈ t println(i) end   # iterable
```
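
To make the immutability concrete, here is a minimal sketch of what happens when an element is assigned, and how a new tuple can be built from an old one instead. The names `err` and `t2` are only used for this illustration.

```julia
t = (33, 4.5, "Hello")

# Assigning to an element fails: there is no setindex! method for tuples.
err = try
    t[1] = 0
catch e
    e
end
@show err isa MethodError

# Instead, build a new tuple from the old one by splatting.
t2 = (t..., :extra)
@show t2 length(t2)
```

The original tuple `t` is unchanged; `t2` is a separate object with its own type.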

A tuple is a **heterogeneous** (inhomogeneous) container. Each element has its own type, which is reflected in the tuple's type:

```{julia}
typeof(t)
```

Tuples are frequently used as function return values to return more than one object.

```{julia}
# Integer division and remainder:
# quotient and remainder are assigned to variables q and r

q, r = divrem(71, 6)
@show q r;
```
As you can see here, parentheses can be omitted in certain constructs.
This *implicit tuple packing/unpacking* is also commonly used in multiple assignments:

```{julia}
x, y, z = 12, 17, 203
```

```{julia}
y
```
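
The same packing and unpacking works for user-defined functions. The following sketch uses a hypothetical helper `extremes` (not part of Julia or of this chapter) that returns two values, and also shows the common idiom for swapping two variables.

```julia
# Hypothetical helper: returns the smallest and largest element as a tuple.
function extremes(v)
    return minimum(v), maximum(v)   # packed into a tuple implicitly
end

lo, hi = extremes((4, 9, 1, 7))     # unpacked into two variables
@show lo hi

# Swapping two variables without a temporary:
a, b = 1, 2
a, b = b, a
@show a b
```

The right-hand side `b, a` is packed into a tuple first and only then unpacked, which is why the swap needs no helper variable.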

Some functions require tuples as arguments or always return tuples. Then you sometimes need a tuple with a single element.

This is written as:

```{julia}
x = (13,)   # a 1-element tuple
```

The comma, not the parentheses, makes the tuple.

```{julia}
x = (13)    # not a tuple
```
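
The difference can be checked directly by inspecting the types. This is a small sketch with illustrative variable names `a` and `b`:

```julia
a = (13,)   # tuple with one element
b = (13)    # just the integer 13 in parentheses

@show typeof(a) typeof(b)
@show a isa Tuple
@show b isa Tuple
@show a[1] == b
```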

## Ranges

We have already used *range* objects in numerical `for` loops.

```{julia}
r = 1:1000
typeof(r)
```
There are various *range* types. As you can see, they are parameterized types based on the numeric type; `UnitRange`, for example, is a *range* with step size 1. Their constructors are usually named `range()`.

The colon is special syntax for constructing ranges:

- `a:b` is parsed as `range(a, b)`
- `a:b:c` is parsed as `range(a, c, step=b)`
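
The correspondence between the colon syntax and `range()` can be verified as follows. This sketch assumes a reasonably recent Julia version (1.7 or later), in which `range` accepts `start` and `stop` as two positional arguments.

```julia
@show (1:10) == range(1, 10)
@show (1:2:9) == range(1, 9, step=2)

# `range` can also be given a number of points instead of a step:
r = range(0, 1, length=5)
@show collect(r)
@show typeof(1:10) typeof(1:2:9) typeof(r)
```

Note that the three ranges have different concrete types, even though they all behave like read-only vectors.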

*Ranges* are iterable and indexable, but not mutable.

```{julia}
(3:100)[20]   # the 20th element
```
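
A few more operations on a range, as a sketch. The exact exception type raised on assignment depends on the Julia version, so the example only records that some error occurred.

```julia
r = 3:100

@show first(r) last(r) step(r) length(r)
@show r[20]          # indexing computes the element on demand
@show r[2:4]         # indexing with a range yields a range again
@show 50 in r        # membership test

# Ranges are immutable: trying to overwrite an element throws an error.
failed = try
    r[1] = 0
    false
catch
    true
end
@show failed
```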

Recall the semantics of the `for` loop: `for i in 1:1000` does **not** mean:

- 'The loop variable `i` is incremented by one in each iteration' **but rather**
- 'The loop variable is successively assigned the values 1,2,3,...,1000 from the container'.

However, it would be very inefficient to actually create this container explicitly.

- _Ranges_ are "lazy" vectors that are never really stored as a concrete list anywhere. This makes them so useful as iterators in `for` loops: memory-efficient and fast.
- They are "recipes" or generators that respond to the query "Give me your next element!".
- In fact, the supertype `AbstractRange` is a subtype of `AbstractVector`.
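
The points above can be checked on a range with a billion elements. This is a sketch that assumes a 64-bit system, where `Int` is `Int64`; on such a system the range object itself occupies only a few bytes.

```julia
r = 1:1_000_000_000

@show r isa AbstractVector
@show AbstractRange <: AbstractVector
@show sizeof(r)      # size of the range object, not of a billion numbers
@show r[123_456]     # elements are computed on demand
@show sum(r)         # no billion-element list is ever stored
```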

The macro `@allocated` returns the number of bytes of memory allocated during the evaluation of an expression.

```{julia}
@allocated r = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
```

```{julia}
@allocated r = 1:20
```

The function `collect()` is used to convert to a "real" vector.

```{julia}
collect(20:-3:1)
```

Quite useful, e.g., when preparing data for plotting, is the *range* type `LinRange`.

```{julia}
LinRange(2, 50, 300)
```
`LinRange(start, stop, n)` generates an equidistant list of `n` values where the first and last are the specified limits.
With `collect()` you can also obtain the corresponding vector if needed.
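
As a sketch of the plotting use case mentioned above, the following block samples the sine function on 100 equidistant points. The names `xs` and `ys` are illustrative; no plotting package is needed for this part.

```julia
xs = LinRange(0, 2π, 100)     # 100 equidistant sample points on [0, 2π]
ys = sin.(xs)                 # broadcasting over a range yields a Vector

@show length(xs) first(xs) last(xs)
@show typeof(ys)
@show step(LinRange(0, 1, 5))
```

The pair `xs`, `ys` could then be passed to a plotting function; the range `xs` does not need to be collected first.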

## Dictionaries

- _Dictionaries_ (also known as associative arrays or lookup tables) are special containers.
- Entries in a vector `v` are addressable by an index 1,2,3,...: `v[i]`
- Entries in a _dictionary_ are addressable by more general _keys_.
- A _dictionary_ is a collection of _key-value_ pairs.
- Thus, _dictionaries_ in Julia have the parameterized type `Dict{S,T}`, where `S` is the type of _keys_ and `T` is the type of _values_.

They can be created explicitly:
```{julia}
# Population in 2020 in millions, source: Wikipedia

EW = Dict("Berlin" => 3.66, "Hamburg" => 1.85,
          "München" => 1.49, "Köln" => 1.08)
```

```{julia}
typeof(EW)
```

and indexed with the _keys_:

```{julia}
EW["Berlin"]
```

Of course, querying a non-existent _key_ is an error (a `KeyError`).
```{julia}
EW["Leipzig"]
```

You can also ask beforehand...
```{julia}
haskey(EW, "Leipzig")
```

... or use the function `get(dict, key, default)`, which does not throw an error for non-existent keys but returns the third argument.

```{julia}
@show get(EW, "Leipzig", -1) get(EW, "Berlin", -1);
```
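
A related function is `get!(dict, key, default)`, which additionally stores the default under the missing key. The sketch below contrasts the two on a small made-up dictionary `stock`, which is independent of `EW`.

```julia
# Illustrative data, independent of `EW` above
stock = Dict("apples" => 3)

@show get(stock, "pears", 0)      # default returned, dict unchanged
@show haskey(stock, "pears")

@show get!(stock, "pears", 0)     # default returned AND stored
@show haskey(stock, "pears")
@show stock
```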

You can also request all `keys` and `values` as special containers.
```{julia}
keys(EW)
```

```{julia}
values(EW)
```

You can iterate over the `keys`...
```{julia}
for i in keys(EW)
    n = EW[i]
    println("The city $i has $n million inhabitants.")
end
```

or directly over `key-value` pairs.
```{julia}
for (stadt, ew) ∈ EW
    println("$stadt: $ew million")
end
```

### Extending and Modifying

You can add additional `key-value` pairs to a `Dict`...
```{julia}
EW["Leipzig"] = 0.52
EW["Dresden"] = 0.52
EW
```

and change a `value`.
```{julia}
# Oh, the Leipzig number was from 2010, not 2020

EW["Leipzig"] = 0.597
EW
```

A pair can also be deleted via its `key`.
```{julia}
delete!(EW, "Dresden")
```
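
Two further operations that are often useful when modifying dictionaries are `pop!`, which removes a pair and returns its value, and `merge`, which combines two dictionaries into a new one. The following sketch uses made-up data in a dictionary called `scores`.

```julia
scores = Dict("anna" => 12, "ben" => 7, "cleo" => 9)

removed = pop!(scores, "ben")        # removes the pair and returns its value
@show removed scores

@show pop!(scores, "dora", 0)        # with a default, a missing key is no error

update = Dict("cleo" => 10, "dora" => 4)
combined = merge(scores, update)     # on duplicate keys the later dict wins
@show combined
```

`merge` leaves both arguments untouched; the in-place variant is `merge!`.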

Many functions work with a `Dict` just as they do with other containers.

```{julia}
maximum(values(EW))
```

### Creating an Empty Dictionary

Without type specification ...
```{julia}
d1 = Dict()
```

and with type specification:
```{julia}
d2 = Dict{String, Int}()
```
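
The practical difference between the two variants shows up when entries are added. The following sketch redefines `d1` and `d2` so that it is self-contained; the flags `bad_key` and `bad_val` are illustrative names.

```julia
d1 = Dict()                  # Dict{Any, Any}: accepts anything
d2 = Dict{String, Int}()     # keys must be strings, values integers

@show typeof(d1)
d1["one"] = 1
d1[2] = "two"

d2["one"] = 1
d2["two"] = 2.0              # converted to the Int 2
@show d2

# A key or value that cannot be converted is rejected with an error.
bad_key = try
    d2[3] = 3
    false
catch
    true
end
bad_val = try
    d2["x"] = 1.5
    false
catch
    true
end
@show bad_key bad_val
```

A typed dictionary catches such mistakes early and usually allows the compiler to generate faster code.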

### Conversion to Vectors: `collect()`

- `keys(dict)` and `values(dict)` are special data types.
- The function `collect()` converts them to a `Vector` type.
- `collect(dict)` returns a list of type `Vector{Pair{S,T}}`

```{julia}
collect(EW)
```

```{julia}
collect(keys(EW)), collect(values(EW))
```

### Ordered Iteration over a Dictionary

We sort the keys. As strings, they are sorted alphabetically. With `rev = true`, sorting is done in reverse order.
```{julia}
for k in sort(collect(keys(EW)), rev = true)
    n = EW[k]
    println("$k has $n million inhabitants")
end
```

We sort `collect(dict)`. This is a vector of pairs. With `by` we define what to sort by: the second element of the pair.

```{julia}
for (k,v) in sort(collect(EW), by = pair -> last(pair), rev=false)
    println("$k has $v million inhabitants")
end
```
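
A slightly more compact variant passes the function `last` directly as the `by` argument and numbers the result with `enumerate`. This sketch repeats the original four entries of `EW` under the illustrative name `population` so that it runs on its own.

```julia
# Same data as the initial `EW`, repeated so that the block is self-contained
population = Dict("Berlin" => 3.66, "Hamburg" => 1.85,
                  "München" => 1.49, "Köln" => 1.08)

# Sort the key-value pairs by value, largest first
ranked = sort(collect(population), by = last, rev = true)

for (rank, (city, n)) in enumerate(ranked)
    println("$rank. $city: $n million")
end
```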

### An Application of Dictionaries: Counting Frequencies

We do "experimental probability" with 2 dice:

Given `l`, a list with the results of 100,000 rolls of two dice, i.e., 100,000 numbers between 2 and 12.

How frequently do the numbers 2 to 12 occur?

We let the computer roll the dice:

```{julia}
l = rand(1:6, 100_000) .+ rand(1:6, 100_000)
```

We count the frequencies of the events using a dictionary. We take the event as the `key` and its frequency as the `value`.
```{julia}
# In this case, one could also solve this with a simple vector.
# A better illustration would be, e.g., word frequency in
# a text. Then i is not an integer but a word (a string).

d = Dict{Int,Int}()   # the dict for counting

for i in l            # for each i, d[i] is incremented
    d[i] = get(d, i, 0) + 1
end
d
```
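
The same counting pattern carries over to keys that are not integers. As a sketch of the word-frequency case mentioned in the code comment, the following block counts the words of a short made-up sentence; the names `text` and `counts` are illustrative.

```julia
text = "the quick brown fox jumps over the lazy dog the end"

counts = Dict{String,Int}()
for w in split(text)                   # split on whitespace
    counts[w] = get(counts, w, 0) + 1  # same pattern as for the dice
end

# Print the words, most frequent first
for (w, n) in sort(collect(counts), by = last, rev = true)
    println(rpad(w, 6), n)
end
```

Here a plain vector would not work as a counter, because the keys are strings rather than consecutive integers.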

The result:

```{julia}
using Plots

plot(collect(keys(d)), collect(values(d)), seriestype=:scatter)
```

##### The explanatory image:

[https://math.stackexchange.com/questions/1204396/why-is-the-sum-of-the-rolls-of-two-dices-a-binomial-distribution-what-is-define](https://math.stackexchange.com/questions/1204396/why-is-the-sum-of-the-rolls-of-two-dices-a-binomial-distribution-what-is-define)
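
The expected frequencies can also be computed exactly by enumerating all 36 equally likely outcomes of two dice. The following sketch reuses the counting pattern from above; the name `ways` is illustrative. A sum `s` can be produced in `6 - abs(s - 7)` ways, which explains the triangular shape of the simulated frequencies.

```julia
# Count in how many of the 36 equally likely outcomes each sum occurs.
ways = Dict{Int,Int}()
for a in 1:6, b in 1:6
    ways[a + b] = get(ways, a + b, 0) + 1
end

for s in 2:12
    println(lpad(s, 2), ": ", ways[s], "/36 = ", round(ways[s] / 36, digits = 4))
end
```

Multiplying these probabilities by 100,000 gives the counts one would expect to see, up to random fluctuations, in the simulation.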