---
engine: julia
---
# Containers
Julia offers a wide selection of container types with largely similar interfaces.
This chapter introduces `Tuple`, `Range`, and `Dict`; the next chapter covers `Array`, `Vector`, and `Matrix`.
These containers are:
- **iterable:** You can iterate over the elements of the container:
```julia
for x ∈ container ... end
```
- **indexable:** You can access elements via their index:
```julia
x = container[i]
```
and some are also
- **mutable:** You can add, remove, and modify elements.
Furthermore, there are several common functions, e.g.,
- `length(container)` --- number of elements
- `eltype(container)` --- element type
- `isempty(container)` --- test if container is empty
- `empty!(container)` --- empties the container (if mutable)
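As a minimal sketch of these common functions, the following applies them to a small vector. The variable names are chosen only for illustration, and vectors themselves are covered in the next chapter.

```julia
v = [3, 1, 4, 1, 5]       # a mutable container (Vector{Int})

n  = length(v)            # number of elements: 5
T  = eltype(v)            # element type: Int
e1 = isempty(v)           # false, the vector still holds elements

empty!(v)                 # removes all elements in place
e2 = isempty(v)           # true
```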
## Tuples
A tuple is an immutable container of elements. You cannot add new elements or change existing values.
```{julia}
t = (33, 4.5, "Hello")
@show t[2] # indexable
for i ∈ t println(i) end # iterable
```
A tuple is an **inhomogeneous** type. Each element has its own type, which is reflected in the tuple's type:
```{julia}
typeof(t)
```
Tuples are frequently used as function return values to return more than one object.
```{julia}
# Integer division and remainder:
# Assign quotient and remainder to variables `q` and `r`:
q, r = divrem(71, 6)
@show q r;
```
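Your own functions can return several values in the same way. The following is only a sketch: `value_span` is a hypothetical helper invented for this example, not a function from Base.

```julia
# Hypothetical helper: smallest element, largest element, and their difference
function value_span(v)
    lo, hi = extrema(v)        # extrema itself returns the tuple (min, max)
    return lo, hi, hi - lo     # parentheses omitted: this returns a 3-tuple
end

lo, hi, width = value_span([4, 9, 2, 7])   # unpack the returned tuple
result = value_span([4, 9, 2, 7])          # or keep it as a single tuple
```

Whether to unpack immediately or keep the tuple is a matter of taste; unpacking gives each value a descriptive name.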
Parentheses can be omitted in certain constructs.
This *implicit tuple packing/unpacking* is commonly used in multiple assignments:
```{julia}
x, y, z = 12, 17, 203
```
```{julia}
y
```
Some functions require tuples as arguments or always return tuples. A single-element tuple is written as:
```{julia}
x = (13,) # a 1-element tuple
```
The comma, not the parentheses, makes the tuple.
```{julia}
x = (13)   # not a tuple, just the integer 13
```
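The difference can be checked programmatically. The sketch below also illustrates the immutability mentioned at the start of this section; the variable names are chosen only for illustration.

```julia
a = (13,)    # trailing comma: a tuple with one element
b = (13)     # parentheses only group the expression: an integer

a_is_tuple = a isa Tuple     # true
b_is_tuple = b isa Tuple     # false
first_elem = a[1]            # 13

# Tuples are immutable: assigning to an element throws a MethodError
err = try
    a[1] = 99
catch e
    e
end
```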
## Ranges
We have already used *range* objects in numerical `for` loops.
```{julia}
r = 1:1000
typeof(r)
```
There are various *range* types. `UnitRange`, for example, is a *range* with step size 1. Their constructors are typically all named `range()`.
The colon is special syntax for constructing ranges:
- `a:b` is equivalent to `range(a, b)`
- `a:b:c` is equivalent to `range(a, c, step=b)`
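The correspondence between the colon syntax and `range()` can be checked directly. This is a small sketch; note that calling `range` with two positional arguments assumes Julia 1.7 or later.

```julia
r1 = 1:10
r2 = range(1, 10)             # two positional arguments require Julia 1.7 or later
r3 = 1:2:9
r4 = range(1, 9, step = 2)

same_unit = (r1 == r2)        # true
same_step = (r3 == r4)        # true
vals = collect(r3)            # [1, 3, 5, 7, 9]
```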
*Ranges* are iterable, immutable, and indexable.
```{julia}
(3:100)[20] # the 20th element
```
Recall the semantics of the `for` loop: `for i in 1:1000` does **not** mean 'increment the loop variable `i` by one each iteration'; **rather**, it means 'successively assign the values 1, 2, 3, ..., 1000 to the loop variable from the container'.
Creating this container explicitly would be very inefficient.
- _Ranges_ are "lazy" vectors never stored as concrete lists. This makes them ideal as `for` loop iterators: memory-efficient and fast.
- They are "recipes" or generators that respond to the query "Give me your next element!".
- In fact, the supertype `AbstractRange` is a subtype of `AbstractVector`.
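The points above can be illustrated with a short sketch. The concrete range and the variable names are chosen only for illustration, and the byte count assumes that a `StepRange` of integers stores exactly its start, step, and stop.

```julia
r = 10:10:1_000_000

is_vector = r isa AbstractVector   # true
n     = length(r)                  # 100000, derived from start, step, and stop
x     = r[3]                       # 30, computed on demand
s     = sum(r)                     # sum without materializing a vector
bytes = sizeof(r)                  # size of the range object itself
```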
The macro `@allocated` returns the number of bytes of memory allocated during the evaluation of an expression.
```{julia}
@allocated r = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
```
```{julia}
@allocated r = 1:20
```
The `collect()` function converts a range to a concrete vector.
```{julia}
collect(20:-3:1)
```
The *range* type `LinRange` is quite useful, e.g., when preparing data for plotting.
```{julia}
LinRange(2, 50, 300)
```
`LinRange(start, stop, n)` generates `n` equidistant values from `start` to `stop`. Use `collect()` to obtain the corresponding vector if needed.
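A small sketch with only five points makes the result easy to inspect. The second variant uses `range()` with the `length` keyword, which produces the same grid here; the positional `stop` argument assumes Julia 1.7 or later.

```julia
lr = LinRange(0, 1, 5)          # 5 equidistant points from 0 to 1

n    = length(lr)               # 5
ends = (first(lr), last(lr))    # (0.0, 1.0)
v    = collect(lr)              # concrete Vector{Float64}

# The same grid via range() with the `length` keyword
v2 = collect(range(0, 1, length = 5))
```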
## Dictionaries
- _Dictionaries_ (also known as associative arrays or lookup tables) are special containers.
- Whereas vector entries are addressed by integer indices (`v[i]`), dictionary entries are addressed by more general _keys_.
- A dictionary is a collection of _key-value_ pairs with parameterized type `Dict{S,T}`, where `S` is the key type and `T` is the value type.
Create a dictionary explicitly:
```{julia}
# Population in 2020 in millions, source: Wikipedia
Ppl = Dict("Berlin" => 3.66, "Hamburg" => 1.85,
           "München" => 1.49, "Köln" => 1.08)
```
```{julia}
typeof(Ppl)
```
Entries are accessed with their _keys_:
```{julia}
Ppl["Berlin"]
```
Querying a non-existent _key_ throws an error.
```{julia}
Ppl["Leipzig"]
```
Check beforehand with `haskey()`...
```{julia}
haskey(Ppl, "Leipzig")
```
Or use `get(dict, key, default)`, which returns the default value instead of throwing an error.
```{julia}
@show get(Ppl, "Leipzig", -1) get(Ppl, "Berlin", -1);
```
You can also request all `keys` and `values` as special containers.
```{julia}
keys(Ppl)
```
```{julia}
values(Ppl)
```
Iterate over the `keys`...
```{julia}
for i in keys(Ppl)
    n = Ppl[i]
    println("The city $i has $n million inhabitants.")
end
```
Or iterate directly over `key-value` pairs.
```{julia}
for (city, pop) ∈ Ppl
    println("$city : $pop Million.")
end
```
### Extending and Modifying
Add `key-value` pairs to a `Dict`...
```{julia}
Ppl["Leipzig"] = 0.52
Ppl["Dresden"] = 0.52
Ppl
```
Change a `value`:
```{julia}
# Update: Leipzig data was from 2010, not 2020
Ppl["Leipzig"] = 0.597
Ppl
```
Delete a pair by its `key`:
```{julia}
delete!(Ppl, "Dresden")
```
Many functions work with `Dict`s just as they do with other containers.
```{julia}
maximum(values(Ppl))
```
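A few more generic functions applied to a dictionary, as a sketch. The dictionary `pop_m` is a separate small example so that `Ppl` stays untouched.

```julia
pop_m = Dict("Berlin" => 3.66, "Hamburg" => 1.85, "Köln" => 1.08)

n      = length(pop_m)                       # number of key-value pairs: 3
total  = sum(values(pop_m))                  # sum over all values
has_hh = "Hamburg" in keys(pop_m)            # membership test on the keys: true
big    = filter(p -> last(p) > 1.5, pop_m)   # new Dict with the pairs that pass the test
```

Note that `filter` receives each entry as a `key => value` pair, so `last(p)` selects the value.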
### Creating an Empty Dictionary
Without explicit types:
```{julia}
d1 = Dict()
```
With explicit types:
```{julia}
d2 = Dict{String, Int}()
```
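The practical difference between the two variants shows up when inserting entries. The following sketch uses illustrative variable names: the untyped dictionary accepts anything, while the typed one converts values to `Int` where possible and otherwise rejects them.

```julia
d_any   = Dict()                 # Dict{Any, Any}: accepts any key and value
d_typed = Dict{String, Int}()    # keys must be String, values Int

d_any["a"] = 1
d_any[2]   = "two"               # fine: mixed key and value types

d_typed["a"] = 1
d_typed["b"] = 2.0               # 2.0 is converted to the Int 2

# A value that cannot be converted exactly is rejected
err = try
    d_typed["c"] = 2.5
catch e
    e
end
```

Typed dictionaries catch such mistakes early and are generally the better choice when the key and value types are known in advance.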
### Conversion to Vectors: `collect()`
- `keys(dict)` and `values(dict)` return special container types.
- `collect()` converts them to `Vector`s.
- `collect(dict)` returns a `Vector{Pair{S,T}}`.
```{julia}
collect(Ppl)
```
```{julia}
collect(keys(Ppl)), collect(values(Ppl))
```
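Since `keys(dict)` and `values(dict)` iterate in the same order, the collected vectors line up position by position. The sketch below, using a small example dictionary, checks this and rebuilds an equal dictionary from the two vectors.

```julia
d = Dict("a" => 1, "b" => 2, "c" => 3)

ks = collect(keys(d))       # Vector{String}
vs = collect(values(d))     # Vector{Int}
ps = collect(d)             # Vector{Pair{String, Int}}

# Position i of `ks` and `vs` belongs to the same pair
consistent = all(d[k] == v for (k, v) in zip(ks, vs))

# Rebuild an equal dictionary from the two vectors
d_again = Dict(zip(ks, vs))
```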
### Ordered Iteration over a Dictionary
The iteration order of a `Dict` is not defined. To iterate in a defined order, sort the keys first. String keys are sorted alphabetically; the keyword argument `rev = true` reverses the order.
```{julia}
for k in sort(collect(keys(Ppl)), rev = true)
    n = Ppl[k]
    println("$k has $n million inhabitants.")
end
```
Let's sort `collect(dict)`, a vector of pairs. Use `by` to specify the sort key: the second element of each pair.
```{julia}
for (k, v) in sort(collect(Ppl), by = pair -> last(pair), rev = false)
    println("$k has $v million inhabitants")
end
```
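As a variation, the following sketch ranks the cities by population in descending order and extracts the largest one. It uses its own example dictionary `pop_m`; the function `last` is passed directly as the sort key.

```julia
pop_m = Dict("Berlin" => 3.66, "Hamburg" => 1.85, "München" => 1.49, "Köln" => 1.08)

# `by = last` is shorthand for `by = pair -> last(pair)`
ranked = sort(collect(pop_m), by = last, rev = true)

top_city, top_pop = first(ranked)        # unpack the first Pair
cities = [first(p) for p in ranked]      # keys in descending order of value
```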
### An Application of Dictionaries: Counting Frequencies
Let's do "experimental stochastics" with 2 dice:
Let `l` be a vector containing 100,000 sums of two dice rolls (numbers from 2 to 12).
How frequently does each number from 2 to 12 occur?
Roll the dice:
```{julia}
l = rand(1:6, 100_000) .+ rand(1:6, 100_000)
```
Count event frequencies using a dictionary. Use the event as the `key` and its frequency as the `value`.
```{julia}
# In this case, a simple vector would also work.
# A better use case for dictionaries is word frequency in texts,
# where keys are strings instead of integers.
d = Dict{Int,Int}()        # dictionary for counting
for i in l                 # for each i, increment d[i]
    d[i] = get(d, i, 0) + 1
end
d
```
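The same counting pattern carries over to the word-frequency use case mentioned in the comment. This is a minimal sketch with a made-up sentence; real text would also need handling of punctuation and capitalization, which is omitted here.

```julia
text = "the quick brown fox jumps over the lazy dog the end"

counts = Dict{String, Int}()
for w in String.(split(text))            # split on whitespace
    counts[w] = get(counts, w, 0) + 1    # start at 0 for unseen words
end
```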
Result:
```{julia}
using Plots
plot(collect(keys(d)), collect(values(d)), seriestype=:scatter)
```
An explanatory image can be found at:
[https://math.stackexchange.com/questions/1204396/why-is-the-sum-of-the-rolls-of-two-dices-a-binomial-distribution-what-is-define](https://math.stackexchange.com/questions/1204396/why-is-the-sum-of-the-rolls-of-two-dices-a-binomial-distribution-what-is-define)
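For comparison with the simulation, the exact distribution can be computed by enumerating all 36 equally likely outcomes of two dice. The sketch below reuses the counting pattern from above; the names `ways`, `prob`, and `expected` are chosen only for illustration. The counts rise linearly to a peak at 7 and fall off symmetrically.

```julia
# Count how many of the 36 equally likely outcomes give each sum
ways = Dict{Int, Int}()
for a in 1:6, b in 1:6
    ways[a + b] = get(ways, a + b, 0) + 1
end

# Exact probability of each sum from 2 to 12
prob = Dict(s => n / 36 for (s, n) in ways)

# Expected counts for 100_000 throws, for comparison with the simulated `d`
expected = Dict(s => 100_000 * p for (s, p) in prob)
```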