{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "81cc02aa",
   "metadata": {},
   "source": [
    "# Container\n",
    "\n",
    "\n",
    "Julia bietet eine große Auswahl von Containertypen mit weitgehend ähnlichem Interface an. \n",
    "Wir stellen hier `Tuple`, `Range` und `Dict` vor, im nächsten Kapitel dann  `Array`, `Vector` und `Matrix`. \n",
    "\n",
    "Diese Container sind: \n",
    "\n",
    "- **iterierbar:** Man kann über die Elemente des Containers iterieren: \n",
    "```julia \n",
    "for x ∈ container ... end\n",
    "```\n",
    "- **indizierbar:** Man kann auf Elemente über ihren Index zugreifen: \n",
    "```julia\n",
    "x = container[i]\n",
    "``` \n",
    "und einige sind auch   \n",
    "\n",
    "- **mutierbar**: Man kann Elemente hinzufügen, entfernen und  ändern.\n",
    "\n",
    "Weiterhin gibt es eine Reihe gemeinsamer Funktionen, z.B.\n",
    "\n",
    "- `length(container)`  --- Anzahl der Elemente\n",
    "- `eltype(container)`  --- Typ der Elemente\n",
    "- `isempty(container)` --- Test, ob Container leer ist\n",
    "- `empty!(container)`  --- leert Container (nur wenn mutierbar)\n",
    "\n",
    "\n",
    "## Tupeln\n",
    "\n",
    "Ein Tupel ist ein nicht mutierbarer Container von Elementen. Es ist also nicht möglich, neue Elemente dazuzufügen oder den Wert eines Elements zu ändern.   \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ec893f0b",
   "metadata": {},
   "outputs": [],
   "source": [
    "t = (33, 4.5, \"Hello\")\n",
    "\n",
    "@show   t[2]               # indizierbar\n",
    "\n",
    "for i ∈ t println(i) end   # iterierbar"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21549769",
   "metadata": {},
   "source": [
    "Ein Tupel ist ein **inhomogener** Typ. Jedes Element hat seinen eigenen Typ und das zeigt sich auch im Typ des Tupels:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5add8abb",
   "metadata": {},
   "outputs": [],
   "source": [
    "typeof(t)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e8bc972e",
   "metadata": {},
   "source": [
    "Man verwendet Tupel gerne als Rückgabewerte von Funktionen, um mehr als ein Objekt zurückzulieferen.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85b3f1fd",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Ganzzahldivision und Rest: \n",
    "# Quotient und Rest werden den Variablen q und r zugewiesen\n",
    "\n",
    "q, r = divrem(71, 6)\n",
    "@show  q  r;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d144679d",
   "metadata": {},
   "source": [
    "Wie man hier sieht, kann man in bestimmten Konstrukten die Klammern auch weglassen.\n",
    "Dieses *implict tuple packing/unpacking* verwendet man auch gerne in Mehrfachzuweisungen:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2ef18811",
   "metadata": {},
   "outputs": [],
   "source": [
    "x, y, z = 12, 17, 203"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "759e8783",
   "metadata": {},
   "outputs": [],
   "source": [
    "y"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2bed8606",
   "metadata": {},
   "source": [
    "Manche Funktionen bestehen auf Tupeln als Argument oder geben immer Tupeln zurück. Dann braucht man manchmal ein Tupel aus einem Element. \n",
    "\n",
    "Das notiert man so:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e985c23d",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = (13,)         # ein 1-Element-Tupel"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "174aed3a",
   "metadata": {},
   "source": [
    "Das Komma - und nicht die Klammern -- macht das Tupel. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8359e689",
   "metadata": {},
   "outputs": [],
   "source": [
    "x= (13)         # kein Tupel"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a5fb2b62",
   "metadata": {},
   "source": [
    "## Ranges\n",
    "\n",
    "Wir haben *range*-Objekte schon in numerischen `for`-Schleifen verwendet.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ba823baf",
   "metadata": {},
   "outputs": [],
   "source": [
    "r = 1:1000\n",
    "typeof(r)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6dd2ebf4",
   "metadata": {},
   "source": [
    "Es gibt verschiedene *range*-Typen. Wie man sieht, sind es über den Zahlentyp parametrisierte Typen und `UnitRange` ist z.B. ein  *range* mit der Schrittweite 1. Ihre Konstruktoren heißen in der Regel `range()`. \n",
    "\n",
    "Der Doppelpunkt ist eine spezielle Syntax. \n",
    "\n",
    "- `a:b`  wird vom Parser umgesetzt zu `range(a, b)` \n",
    "- `a:b:c` wird umgesetzt zu `range(a, c, step=b)`\n",
    "\n",
    "\n",
    "*Ranges* sind  offensichtlich iterierbar, nicht mutierbar aber indizierbar.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5ee0a9ef",
   "metadata": {},
   "outputs": [],
   "source": [
    "(3:100)[20]   # das zwanzigste Element"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f4524b63",
   "metadata": {},
   "source": [
    "Wir erinnern an die Semantik der `for`-Schleife: `for i in 1:1000` heißt **nicht**:\n",
    "\n",
    "- 'Die Schleifenvariable `i` wird bei jedem Durchlauf um eins erhöht' **sondern** \n",
    "- 'Der Schleifenvariable werden nacheinander die Werte 1,2,3,...,1000 aus dem Container zugewiesen'. \n",
    "\n",
    "Allerdings wäre es sehr ineffektiv, diesen Container tatsächlich explizit anzulegen. \n",
    "\n",
    "- _Ranges_ sind \"lazy\" Vektoren, die nie wirklich irgendwo als konkrete Liste abgespeichert werden. Das macht sie als Iteratoren in `for`-Schleifen so nützlich: speichersparend und schnell.\n",
    "- Sie sind 'Rezepte' oder Generatoren, die auf die Abfrage 'Gib mir dein nächstes Element!' antworten.\n",
    "- Tatsächlich ist der Muttertyp `AbstractRange`  ein Subtyp von `AbstractVector`.\n",
    "\n",
    "Das Macro `@allocated` gibt aus, wieviel Bytes an Speicher bei der Auswertung eines Ausdrucks alloziert wurden.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7049d454",
   "metadata": {},
   "outputs": [],
   "source": [
    "@allocated [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dbc66a57",
   "metadata": {},
   "outputs": [],
   "source": [
    "@allocated 1:20"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3d98f5f4",
   "metadata": {},
   "source": [
    "Zum Umwandeln in einen 'richtigen' Vektor dient die Funktion `collect()`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fe033c47",
   "metadata": {},
   "outputs": [],
   "source": [
    "collect(20:-3:1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a94e5258",
   "metadata": {},
   "source": [
    "Recht nützlich, z.B. beim Vorbereiten von Daten zum Plotten, ist der *range*-Typ `LinRange`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e4b40085",
   "metadata": {},
   "outputs": [],
   "source": [
    "LinRange(2, 50, 300)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bde8b38f",
   "metadata": {},
   "source": [
    "`LinRange(start, stop, n)` erzeugt eine äquidistante Liste von `n` Werten von denen der erste und der letzte die vorgegebenen Grenzen sind. \n",
    "Mit `collect()` kann man bei Bedarf auch daraus den entsprechenden Vektor gewinnen.\n",
    "\n",
    "\n",
    "## Dictionaries \n",
    "\n",
    "- _Dictionaries_ (deutsch: \"assoziative Liste\" oder \"Zuordnungstabelle\" oder ...) sind spezielle Container. \n",
    "- Einträge in einem Vektor `v` sind durch einen Index 1,2,3.... addressierbar: `v[i]`\n",
    "- Einträge in einem _dictionary_ sind durch allgemeinere _keys_ addressierbar. \n",
    "- Ein _dictionary_ ist eine Ansammlung von _key-value_-Paaren.\n",
    "- Damit haben _dictionaries_ in Julia  den parametrisierten Typ `Dict{S,T}`, wobei `S` der Typ der _keys_ und `T` der Typ der _values_ ist\n",
    "\n",
    "\n",
    "Man kann sie explizit anlegen:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "25c7aefc",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Einwohner 2020 in Millionen, Quelle: wikipedia\n",
    "\n",
    "EW = Dict(\"Berlin\" => 3.66,  \"Hamburg\" => 1.85, \n",
    "          \"München\" => 1.49, \"Köln\" => 1.08)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9620f365",
   "metadata": {},
   "outputs": [],
   "source": [
    "typeof(EW)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3b4d2af",
   "metadata": {},
   "source": [
    "und mit den _keys_ indizieren:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "62938a1b",
   "metadata": {},
   "outputs": [],
   "source": [
    "EW[\"Berlin\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73ddf61d",
   "metadata": {},
   "source": [
    "Das Abfragen eines nicht existierenden _keys_ ist natürlich ein Fehler.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6e16eb41",
   "metadata": {},
   "outputs": [],
   "source": [
    "EW[\"Leipzig\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49a6d001",
   "metadata": {},
   "source": [
    "Man kann ja auch vorher mal anfragen...\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e43aec5d",
   "metadata": {},
   "outputs": [],
   "source": [
    "haskey(EW, \"Leipzig\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a1cefaa1",
   "metadata": {},
   "source": [
    "... oder die Funktion `get(dict, key, default)` benutzen, die bei nicht existierendem Key keinen Fehler wirft sondern das 3. Argument zurückgibt. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e31c792c",
   "metadata": {},
   "outputs": [],
   "source": [
    "@show get(EW, \"Leipzig\", -1)   get(EW, \"Berlin\", -1);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f625eaaa",
   "metadata": {},
   "source": [
    "Man kann sich auch alle `keys` und `values` als spezielle Container geben lassen.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "21460e3a",
   "metadata": {},
   "outputs": [],
   "source": [
    "keys(EW)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a87d9788",
   "metadata": {},
   "outputs": [],
   "source": [
    "values(EW)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "97b96369",
   "metadata": {},
   "source": [
    "Man kann über die `keys` iterieren...\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "41e9189f",
   "metadata": {},
   "outputs": [],
   "source": [
    "for i in keys(EW)\n",
    "    n = EW[i]\n",
    "    println(\"Die Stadt $i hat $n Millionen Einwohner.\")\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b2fe653a",
   "metadata": {},
   "source": [
    "odere gleich über `key-value`-Paare.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0b445207",
   "metadata": {},
   "outputs": [],
   "source": [
    "for (stadt, ew) ∈ EW\n",
    "    println(\"$stadt : $ew  Mill.\")\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6fcd4019",
   "metadata": {},
   "source": [
    "### Erweitern und Modifizieren\n",
    "\n",
    "Man kann in ein `Dict` zusätzliche `key-value`-Paare eintragen...\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "94c4f076",
   "metadata": {},
   "outputs": [],
   "source": [
    "EW[\"Leipzig\"] = 0.52\n",
    "EW[\"Dresden\"] = 0.52 \n",
    "EW"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ed90d684",
   "metadata": {},
   "source": [
    "und einen `value`  ändern.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c57e8dd2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Oh, das war bei Leipzig die Zahl von 2010, nicht 2020\n",
    "\n",
    "EW[\"Leipzig\"] = 0.597\n",
    "EW"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3d0eadd7",
   "metadata": {},
   "source": [
    "Ein Paar kann über seinen `key` auch gelöscht werden.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0dfd9f2d",
   "metadata": {},
   "outputs": [],
   "source": [
    "delete!(EW, \"Dresden\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3db2aa1f",
   "metadata": {},
   "source": [
    "Zahlreiche Funktionen können mit `Dicts` wie mit anderen Containern arbeiten.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1204aaab",
   "metadata": {},
   "outputs": [],
   "source": [
    "maximum(values(EW))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6a4946b4",
   "metadata": {},
   "source": [
    "### Anlegen eines leeren Dictionaries\n",
    "\n",
    "Ohne Typspezifikation ...\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1f82ac79",
   "metadata": {},
   "outputs": [],
   "source": [
    "d1 = Dict()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "27c98eaf",
   "metadata": {},
   "source": [
    "und mit Typspezifikation:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1b8f201e",
   "metadata": {},
   "outputs": [],
   "source": [
    "d2 = Dict{String, Int}()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7fe71a3b",
   "metadata": {},
   "source": [
    "### Umwandlung in Vektoren: `collect()`\n",
    "\n",
    "- `keys(dict)` und `values(dict)` sind spezielle Datentypen. \n",
    "-  Die Funktion `collect()` macht daraus eine Liste vom Typ `Vector`. \n",
    "-  `collect(dict)` liefert eine Liste vom Typ `Vector{Pair{S,T}}` \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2dc85bdd",
   "metadata": {},
   "outputs": [],
   "source": [
    "collect(EW)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "100dbbf1",
   "metadata": {},
   "outputs": [],
   "source": [
    "collect(keys(EW)), collect(values(EW))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f9703f73",
   "metadata": {},
   "source": [
    "### Geordnetes Iterieren über ein Dictionary\n",
    "\n",
    "Wir sortieren die Keys. Als Strings werden sie alphabetisch sortiert. Mit dem `rev`-Parameter wird rückwärts sortiert.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3c02f0ca",
   "metadata": {},
   "outputs": [],
   "source": [
    "for k in sort(collect(keys(EW)), rev = true)\n",
    "    n = EW[k]\n",
    "    println(\"$k hat $n Millionen Einw.  \")\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c20f0654",
   "metadata": {},
   "source": [
    "Wir sortieren `collect(dict)`. Das ist ein Vektor von Paaren. Mit `by` definieren wir, wonach zu sortieren ist: nach dem 2. Element des Paares. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "89ee04e9",
   "metadata": {},
   "outputs": [],
   "source": [
    "for (k,v) in sort(collect(EW), by = pair -> last(pair), rev=false)\n",
    "    println(\"$k hat $v Mill. EW\")\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94ba9752",
   "metadata": {},
   "source": [
    "### Eine Anwendung von Dictionaries: Zählen von Häufigkeiten\n",
    "\n",
    "Wir machen 'experimentelle Stochastik' mit 2 Würfeln: \n",
    "\n",
    "Gegeben sei `l`, eine Liste mit den Ergebnissen von 100 000 Pasch-Würfen, also 100 000 Zahlen zwischen 2 und 12.\n",
    "\n",
    "Wie häufig sind die Zahlen 2  bis 12?\n",
    "\n",
    "\n",
    "Wir (lassen) würfeln:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7ff829d",
   "metadata": {},
   "outputs": [],
   "source": [
    "l = rand(1:6, 100_000) .+ rand(1:6, 100_000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "48a58503",
   "metadata": {},
   "source": [
    "Wir zählen mit Hilfe eines Dictionaries die Häufigkeiten der Ereignisse. Dazu nehmen wir das Ereignis als `key` und seine Häufigkeit als `value`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dd2bcd79",
   "metadata": {},
   "outputs": [],
   "source": [
    "# In diesem Fall könnte man das auch mit einem einfachen Vektor\n",
    "# lösen. Eine bessere Illustration wäre z.B. Worthäufigkeit in\n",
    "# einem Text. Dann ist i keine ganze Zahl sondern ein Wort=String\n",
    "\n",
    "d = Dict{Int,Int}()     # das Dict zum 'reinzählen'\n",
    "\n",
    "for i in l                    # für jedes i wird d[i] erhöht.      \n",
    "    d[i] = get(d, i, 0) + 1   \n",
    "end\n",
    "d"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "62fe53b9",
   "metadata": {},
   "source": [
    "Das Ergebnis:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "154252d3",
   "metadata": {},
   "outputs": [],
   "source": [
    "using Plots\n",
    "\n",
    "plot(collect(keys(d)), collect(values(d)), seriestype=:scatter)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd482553",
   "metadata": {},
   "source": [
    "##### Das Erklär-Bild dazu:\n",
    "\n",
    "[https://math.stackexchange.com/questions/1204396/why-is-the-sum-of-the-rolls-of-two-dices-a-binomial-distribution-what-is-define](https://math.stackexchange.com/questions/1204396/why-is-the-sum-of-the-rolls-of-two-dices-a-binomial-distribution-what-is-define)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.10.2",
   "language": "julia",
   "name": "julia-1.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}