JuliaKurs23/nb/numerictypes.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "3a993968",
   "metadata": {},
   "source": [
    "# Maschinenzahlen\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f4dbb6a",
   "metadata": {},
   "outputs": [],
   "source": [
    "for x ∈ ( 3, 3.3e4,  Int16(20), Float32(3.3e4), UInt16(9) )\n",
    "    @show x sizeof(x) typeof(x)\n",
    "    println()\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "82bbdd01",
   "metadata": {},
   "source": [
    "## Ganze Zahlen *(integers)*\n",
    "\n",
    "Ganze Zahlen werden grundsätzlich als Bitmuster fester Länge gespeichert. Damit ist der Wertebereich endlich. \n",
    "**Innerhalb dieses Wertebereichs** sind Addition, Subtraktion, Multiplikation und die Ganzzahldivision mit Rest \n",
    "exakte Operationen ohne Rundungsfehler. \n",
    "\n",
    "Ganze Zahlen gibt es in den zwei Sorten `Signed` (mit Vorzeichen) und `Unsigned`, die man als Maschinenmodelle  für ℤ bzw. ℕ auffassen kann.\n",
    "\n",
    "\n",
    "### *Unsigned integers*\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab84c4da",
   "metadata": {},
   "outputs": [],
   "source": [
    "subtypes(Unsigned)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f458fcd",
   "metadata": {},
   "source": [
    "UInts sind Binärzahlen mit  n=8, 16, 32, 64 oder 128 Bits Länge und einem entsprechenden Wertebereich von\n",
    "$$\n",
    "0 \\le x < 2^n \n",
    "$$\n",
    "Sie werden im *scientific computing* eher selten verwendet. Bei hardwarenaher Programmierung dienen sie z.B. dem Umgang mit Binärdaten und Speicheradressen. Deshalb werden sie von Julia standardmäßig auch als Hexadezimalzahlen (mit Präfix `0x` und Ziffern `0-9a-f`) dargestellt.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3b6b30a0",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = 0x0033efef\n",
    "@show x typeof(x) Int(x)\n",
    "\n",
    "z = UInt(32)\n",
    "@show z typeof(z);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "25774520",
   "metadata": {},
   "source": [
    "### *Signed Integers* \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "341e9267",
   "metadata": {},
   "outputs": [],
   "source": [
    "subtypes(Signed)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a8ea7009",
   "metadata": {},
   "source": [
    "*Integers* haben den Wertebereich\n",
    "$$\n",
    "-2^{n-1} \\le x < 2^{n-1}  \n",
    "$$\n",
    "\n",
    "\n",
    "In Julia sind ganze Zahlen standardmäßig 64 Bit groß:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "79490021",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = 42\n",
    "typeof(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "295048cd",
   "metadata": {},
   "source": [
    "Sie haben daher den Wertebereich: \n",
    "$$\n",
    "-9.223.372.036.854.775.808 \\le x \\le 9.223.372.036.854.775.807\n",
    "$$\n",
    "\n",
    "Negative Zahlen werden im Zweierkomplement dargestellt:\n",
    "\n",
    "$x \\Rightarrow -x$ entspricht: _flip all bits, then add 1_\n",
    "\n",
    "Das sieht so aus:\n",
    "\n",
    "::: {.content-visible when-format=\"html\"}\n",
    "![Eine Darstellung des fiktiven Datentyps `Int4`](../images/Int4.png){width=50%}\n",
    ":::\n",
    "\n",
    "::: {.content-visible when-format=\"pdf\"}\n",
    "![Eine Darstellung des fiktiven Datentyps `Int4`](../images/Int4.png){width=50%}\n",
    ":::\n",
    "\n",
    "Das höchstwertige Bit zeigt das Vorzeichen an. Sein „Umkippen“ entspricht allerdings nicht der Operation `x → -x`.\n",
    "\n",
    ":::{.callout-important}\n",
    "Die gesamte Ganzzahlarithmetik ist eine __Arithmetik modulo $2^n$__\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "677353d7",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = 2^62 - 10 + 2^62"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7da20d70",
   "metadata": {},
   "outputs": [],
   "source": [
    "x + 20"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2cdce6d7",
   "metadata": {},
   "source": [
    "Keine Fehlermeldung, keine Warnung! Ganzzahlen fester Länge liegen nicht auf einer Geraden, sondern auf einem Kreis!\n",
    "\n",
    ":::\n",
    "\n",
    "Schauen wir uns ein paar *Integers* mal an:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "098d82f4",
   "metadata": {},
   "outputs": [],
   "source": [
    "using Printf\n",
    "\n",
    "for i in (1, 2, 7, -7, 2^50, -1, Int16(7), Int8(7))\n",
    "    s = bitstring(i)\n",
    "    t = typeof(i)\n",
    "    @printf \"%20i %-5s ⇒  %s\\n\" i t s \n",
    "end "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "388bed6e",
   "metadata": {},
   "source": [
    "... und noch etwas mehr *introspection*:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e164bb25",
   "metadata": {},
   "outputs": [],
   "source": [
    "typemin(Int64), typemax(Int64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2c94b3df",
   "metadata": {},
   "outputs": [],
   "source": [
    "typemin(UInt64), typemax(UInt64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1d2f77c9",
   "metadata": {},
   "outputs": [],
   "source": [
    "typemin(Int8), typemax(Int8)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "74d88d56",
   "metadata": {},
   "source": [
    "## Arithmetik ganzer Zahlen\n",
    "\n",
    "#### Addition, Multiplikation\n",
    "Die Operationen `+`,`-`,`*` haben die übliche exakte Arithmetik **modulo $2^{64}$**.\n",
    "\n",
    "#### Potenzen `a^b`\n",
    "\n",
    "- Potenzen `a^n` werden für natürliche Exponenten `n` ebenfalls modulo $2^{64}$ exakt berechnet. \n",
    "- Für negative Exponenten ist das Ergebnis eine Gleitkommazahl.\n",
    "- `0^0` ist [selbstverständlich](https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero#cite_note-T4n3B-4) gleich 1.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "46b5e26a",
   "metadata": {},
   "outputs": [],
   "source": [
    "(-2)^63, 2^64, 3^(-3), 0^0 "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0187fb1e",
   "metadata": {},
   "source": [
    "- Für natürliche Exponenten wird [*exponentiation by squaring*](https://de.wikipedia.org/wiki/Bin%C3%A4re_Exponentiation) verwendet, so dass z.B. `x^23` nur 7 Multiplikationen benötigt: \n",
    "$$\n",
    "x^{23} = \\left( \\left( (x^2)^2 \\cdot x \\right)^2  \\cdot x \\right)^2  \\cdot x \n",
    "$$\n",
    "\n",
    "#### Division\n",
    "\n",
    "- Division `/` erzeugt  eine Gleitkommazahl.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "67cb4661",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = 40/5"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94408807",
   "metadata": {},
   "source": [
    "#### Ganzzahldivision und Rest\n",
    "\n",
    "- Die Funktionen `div(a,b)`,  `rem(a,b)`  und `divrem(a,b)` berechnen den Quotient der Ganzzahldivision, den dazugehörigen Rest *(remainder)* bzw. beides als Tupel.  \n",
    "- Für `div(a,b)` gibt es die Operatorform `a ÷ b` (Eingabe: `\\div<TAB>`) und für `rem(a,b)` die Operatorform `a % b`.  \n",
    "- Standardmäßig wird bei der Division „zur Null hin gerundet“, wodurch der dazugehörige Rest dasselbe Vorzeichen wie der Dividend `a` trägt:  \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3ccdabcf",
   "metadata": {},
   "outputs": [],
   "source": [
    "@show divrem( 27, 4)\n",
    "@show ( 27 ÷  4,  27 %  4)   \n",
    "@show (-27 ÷  4, -27 %  4 )\n",
    "@show ( 27 ÷ -4,  27 % -4);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5222cddb",
   "metadata": {},
   "source": [
    "- Eine von `RoundToZero` abweichende  Rundungsregel kann bei den Funktionen als optionales 3. Argument angegeben werden. \n",
    "- `?RoundingMode` zeigt die möglichen Rundungsregeln an. \n",
    "- Für die Rundungsregel `RoundDown` („in Richtung  minus unendlich\"), wodurch der dazugehörige Rest dasselbe Vorzeichen wie der Divisor `b` bekommt, gibt es auch die Funktionen `fld(a,b)` *(floored division)* und `mod(a,b)`:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "be29f9d4",
   "metadata": {},
   "outputs": [],
   "source": [
    "@show divrem(-27, 4, RoundDown)\n",
    "@show (fld(-27,  4), mod(-27,  4))\n",
    "@show (fld( 27, -4), mod( 27, -4));"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84e604f7",
   "metadata": {},
   "source": [
    "Für alle Rundungsregeln gilt:\n",
    "```\n",
    "div(a, b, RoundingMode) * b + rem(a, b, RoundingMode) = a\n",
    "```\n",
    "\n",
    "#### Der Datentyp `BigInt`\n",
    "\n",
    "Der Datentyp `BigInt` ermöglicht Ganzzahlen beliebiger Länge. Der benötigte Speicher wird dynamisch allokiert.\n",
    "\n",
    "Numerische Konstanten haben automatisch einen ausreichend großen Typ:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad97d8a3",
   "metadata": {},
   "outputs": [],
   "source": [
    "z = 10\n",
    "@show typeof(z)\n",
    "z = 10_000_000_000_000_000    # 10 Billiarden\n",
    "@show typeof(z)\n",
    "z = 10_000_000_000_000_000_000  # 10 Trillionen\n",
    "@show typeof(z)\n",
    "z = 10_000_000_000_000_000_000_000_000_000_000_000_000_000   # 10 Sextilliarden\n",
    "@show typeof(z);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b7da46f1",
   "metadata": {},
   "source": [
    "Meist wird man allerdings den Datentyp `BigInt` explizit anfordern müssen, damit nicht modulo $2^{64}$ gerechnet wird:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "89d35c00",
   "metadata": {},
   "outputs": [],
   "source": [
    "@show 3^300        BigInt(3)^300;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16d1d9cc",
   "metadata": {},
   "source": [
    "*Arbitrary precision arithmetic* kostet einiges an Speicherplatz und Rechenzeit.\n",
    "\n",
    "Wir vergleichen den Zeit- und Speicherbedarf bei der Aufsummation von 10 Millionen Ganzzahlen als `Int64` und als `BigInt`. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d7e4a838",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 10^7 Zufallszahlen, gleichverteilt zwischen -10^7 und 10^7 \n",
    "vec_int = rand(-10^7:10^7, 10^7)\n",
    "\n",
    "# Dieselben Zahlen als BigInts\n",
    "vec_bigint = BigInt.(vec_int)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5df8fda6",
   "metadata": {},
   "source": [
    "Einen ersten Eindruck vom Zeit- und Speicherbedarf gibt das `@time`-Macro:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e37fa19f",
   "metadata": {},
   "outputs": [],
   "source": [
    "@time x = sum(vec_int)\n",
    "@show x typeof(x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a58adeb1",
   "metadata": {},
   "outputs": [],
   "source": [
    "@time x = sum(vec_bigint)\n",
    "@show x typeof(x);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17081476",
   "metadata": {},
   "source": [
    "Durch die Just-in-Time-Compilation von Julia ist die einmalige Ausführung einer Funktion wenig aussagekräftig. Das Paket `BenchmarkTools` stellt u.a. das Macro `@benchmark` bereit, das eine Funktion mehrfach aufruft und die Ausführungszeiten als Histogramm darstellt.\n",
    "\n",
    ":::{.ansitight}\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a5944727",
   "metadata": {},
   "outputs": [],
   "source": [
    "using BenchmarkTools\n",
    "\n",
    "@benchmark sum($vec_int)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eb32dbfa",
   "metadata": {},
   "outputs": [],
   "source": [
    "@benchmark sum($vec_bigint)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11e20069",
   "metadata": {},
   "source": [
    ":::\n",
    "Die `BigInt`-Addition ist mehr als  30 mal langsamer. \n",
    "\n",
    "\n",
    "\n",
    ":::{.content-hidden unless-format=\"xxx\"}\n",
    "Die folgende Funktion soll die Summe aller Zahlen von 1 bis n mit der Arithmetik des Datentyps T berechnen.\n",
    "Auf Grund der *type promotion rules* reicht es für `T ≥ Int64` dazu aus, die Akkumulatorvariable mit einer Zahl vom Typ T zu initialisieren.   \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad42844a",
   "metadata": {},
   "outputs": [],
   "source": [
    "function mysum(n, T)\n",
    "    s = T(0)\n",
    "    for i = 1:n\n",
    "        s += i    \n",
    "    end\n",
    "    return s\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "184457e8",
   "metadata": {},
   "source": [
    "Einen ersten Eindruck vom Zeit- und Speicherbedarf gibt das `@time`-Macro:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c92af0db",
   "metadata": {},
   "outputs": [],
   "source": [
    "@time x = mysum(10_000_000, Int64)\n",
    "@show x typeof(x);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e58be090",
   "metadata": {},
   "outputs": [],
   "source": [
    "@time x = mysum(10_000_000, BigInt)\n",
    "@show x typeof(x);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bfe545fd",
   "metadata": {},
   "source": [
    "Durch die Just-in-Time-Compilation von Julia ist die einmalige Ausführung einer Funktion wenig aussagekräftig. Das Paket `BenchmarkTools` stellt u.a. das Macro `@benchmark` bereit, das eine Funktion mehrfach aufruft und die Ausführungszeiten als Histogramm darstellt.\n",
    "\n",
    ":::{.ansitight}\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3ae14185",
   "metadata": {},
   "outputs": [],
   "source": [
    "using BenchmarkTools\n",
    "\n",
    "@benchmark mysum(10_000_000, Int64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d7513870",
   "metadata": {},
   "outputs": [],
   "source": [
    "@benchmark mysum(10_000_000, BigInt)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4835712f",
   "metadata": {},
   "source": [
    "Die Berechnung von $\\sum_{n=1}^{10000000} n$  dauert auf meinem PC mit den Standard-64bit-Integern im Schnitt 2 Nanosekunden, in *arbitrary precision arithmetic* über eine Sekunde, wobei auch noch fast 500MB Speicher allokiert werden. \n",
    "\n",
    ":::\n",
    ":::\n",
    "\n",
    "\n",
    "## Gleitkommazahlen\n",
    "\n",
    "Aus _floating point numbers_  kann man im Deutschen  **[Gleit|Fließ]--[Komma|Punkt]--Zahlen** machen \n",
    "und tatsächlich kommen alle 4 Varianten in der Literatur vor. \n",
    "\n",
    "In der Numerischen Mathematik spricht man auch gerne von **Maschinenzahlen**.\n",
    "\n",
    "\n",
    "### Grundidee \n",
    "\n",
    "- Eine „feste Anzahl von Vor- und Nachkommastellen“ ist für viele Probleme ungeeignet.\n",
    "- Eine Trennung zwischen „gültigen Ziffern“ und Größenordnung wie in der wissenschaftlichen Notation\n",
    "    $$ 345.2467 \\times 10^3\\qquad 34.52467\\times 10^4\\qquad \\underline{3.452467\\times 10^5}\\qquad 0.3452467\\times 10^6\\qquad 0.03452467\\times 10^7$$ \n",
    "  ist wesentlich flexibler. \n",
    "- Zur Eindeutigkeit legt man fest: Die  Darstellung mit genau einer Ziffer vor dem Komma ist die __Normalisierte Darstellung__.\n",
    "- Bei Binärzahlen `1.01101`: ist diese Ziffer  immer gleich Eins und man kann auf das Abspeichern dieser Ziffer verzichten.\n",
    "\n",
    "\n",
    ":::{.callout-note }\n",
    " \n",
    "## Maschinenzahlen\n",
    "\n",
    "Die Menge der Maschinenzahlen  $𝕄(b, m, e_{min}, e_{max})$ ist charakterisiert durch die verwendete Basis $b$,  die Mantissenlänge $m$ und den Wertebereich des Exponenten $\\{e_{min}, ... ,e_{max}\\}$.   \n",
    "\n",
    "In unserer Konvention hat die Mantisse einer normalisierten Maschinenzahl eine Ziffer (der Basis $b$) ungleich Null vor dem Komma und $m-1$ Nachkommastellen. \n",
    "\n",
    "Wenn $b=2$ ist, braucht man daher nur $m-1$ Bits zur Speicherung der Mantisse normalisierter Gleitkommazahlen. \n",
    "\n",
    "Der Standard  IEEE 754, der von der Mahrzahl der modernen Prozessoren und Programmiersprachen implementiert wird,   definiert \n",
    "\n",
    "- `Float32` als $𝕄(2, 24, -126, 127 )$ und \n",
    "- `Float64` als $𝕄(2, 53, -1022, 1023 ).$\n",
    "\n",
    ":::\n",
    "\n",
    "### Aufbau von `Float64` nach [Standard IEEE 754](https://de.wikipedia.org/wiki/IEEE_754)\n",
    "\n",
    "\n",
    "::: {.content-visible when-format=\"html\"}\n",
    "![Aufbau einer `Float64` (Quelle:^[Quelle: <a href=\"https://commons.wikimedia.org/wiki/File:IEEE_754_Double_Floating_Point_Format.svg\">Codekaizen</a>, <a href=\"https://creativecommons.org/licenses/by-sa/4.0\">CC BY-SA 4.0</a>, via Wikimedia Commons ])\n",
    "](../images/1024px-IEEE_754_Double_Floating_Point_Format.png)\n",
    ":::\n",
    "\n",
    "::: {.content-visible when-format=\"pdf\"}\n",
    "![Aufbau einer `Float64`  \\mysmall{(Quelle: \\href{https://commons.wikimedia.org/wiki/File:IEEE_754_Double_Floating_Point_Format.svg}{Codekaizen}, \\href{https://creativecommons.org/licenses/by-sa/4.0}{CC BY-SA 4.0}, via Wikimedia Commons)}\n",
    "](../images/1024px-IEEE_754_Double_Floating_Point_Format.png){width=\"70%\"}\n",
    ":::\n",
    "\n",
    "\n",
    "- 1 Vorzeichenbit $S$\n",
    "- 11 Bits für den Exponenten, also $0\\le E \\le 2047$\n",
    "- die Werte $E=0$ und $E=(11111111111)_2=2047$ sind reserviert für die Codierung von Spezialwerten wie\n",
    "$\\pm0, \\pm\\infty$, NaN _(Not a Number)_ und denormalisierte Zahlen.\n",
    "- 52 Bits für die Mantisse $M,\\quad 0\\le M<1$, das entspricht etwa 16 Dezimalstellen\n",
    "- Damit wird folgende Zahl dargestellt:\n",
    "$$ x=(-1)^S \\cdot(1+M)\\cdot 2^{E-1023}$$\n",
    "\n",
    "Ein Beispiel:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8433da22",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = 27.56640625\n",
    "bitstring(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7be4f1fc",
   "metadata": {},
   "source": [
    "Das geht auch schöner:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "98311a9b",
   "metadata": {},
   "outputs": [],
   "source": [
    "function printbitsf64(x::Float64)\n",
    "    s = bitstring(x)\n",
    "    printstyled(s[1], color = :blue, reverse=true)\n",
    "    printstyled(s[2:12], color = :green, reverse=true)\n",
    "    printstyled(s[13:end], color=:red, bold=true, reverse=true)\n",
    "    print(\"\\n\")\n",
    "end\n",
    "\n",
    "printbitsf64(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "728867d2",
   "metadata": {},
   "source": [
    "und wir können S (in blau),E (grün) und M (rot)  ablesen:\n",
    "$$\n",
    "\\begin{aligned}\n",
    "S &= 0\\\\\n",
    "E &= (10000000011)_2 = 2^{10} + 2^1 + 2^0 = 1027\\\\\n",
    "M &= (.1011100100001)_2 =  \\frac{1}{2} + \\frac{1}{2^3} + \\frac{1}{2^4} + \\frac{1}{2^5} + \\frac{1}{2^8} + \\frac{1}{2^{12}}\\\\\n",
    "x &=(-1)^S \\cdot(1+M)\\cdot 2^{E-1023}\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "\n",
    "... und damit die Zahl rekonstruieren:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f4403c1c",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = (1 + 1/2 + 1/8 + 1/16 + 1/32 + 1/256 + 1/4096) * 2^4"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9346d953",
   "metadata": {},
   "source": [
    "- Die Maschinenzahlen 𝕄 bilden eine endliche, diskrete Untermenge von ℝ. Es gibt eine kleinste und eine größte Maschinenzahl und abgesehen davon haben alle x∈𝕄 einen Vorgänger und Nachfolger in 𝕄.\n",
    "- Was ist der Nachfolger von x in 𝕄? Dazu setzen wir  das kleinste Mantissenbit von 0 auf 1. \n",
    "- Die Umwandlung eines Strings aus Nullen und Einsen in die entsprechende Maschinenzahl ist z.B. so möglich:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "be8c52e0",
   "metadata": {},
   "outputs": [],
   "source": [
    "ux = parse(UInt64, \"0100000000111011100100010000000000000000000000000000000000000001\", base=2)\n",
    "reinterpret(Float64, ux)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "165a8689",
   "metadata": {},
   "source": [
    "Das kann man in Julia allerdings auch einfacher haben:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3901ed32",
   "metadata": {},
   "outputs": [],
   "source": [
    "y = nextfloat(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d8b2e005",
   "metadata": {},
   "source": [
    "Der Vorgänger von x ist:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ed502537",
   "metadata": {},
   "outputs": [],
   "source": [
    "z = prevfloat(x)\n",
    "println(z)\n",
    "printbitsf64(z)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "973c7660",
   "metadata": {},
   "source": [
    "## Maschinenepsilon\n",
    "\n",
    "- Den Abstand zwischen `1` und dem Nachfolger `nextfloat(1)` nennt man [**Maschinenepsilon**](https://en.wikipedia.org/wiki/Machine_epsilon). \n",
    "- Für `Float64` mit einer Mantissenlänge von 52 Bit, ist  $\\epsilon=2^{-52}$.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "53a4703b",
   "metadata": {},
   "outputs": [],
   "source": [
    "@show nextfloat(1.) - 1   2^-52   eps(Float64);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b27e5f1",
   "metadata": {},
   "source": [
    "- Das Maschinenepsilon ist ein Maß für den relativen Abstand zwischen den Maschinenzahlen und quantifiziert die Aussage: „64-Bit-Gleitkommazahlen haben eine Genauigkeit von etwa 16 Dezimalstellen.“\n",
    "- Das Maschinenepsilon ist etwas völlig anderes  als die kleinste von Null verschiedene Gleitkommazahl:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6c6247c5",
   "metadata": {},
   "outputs": [],
   "source": [
    "floatmin(Float64)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ace02061",
   "metadata": {},
   "source": [
    "- Ein Teil der Literatur verwendet eine andere Definition des Maschinenepsilons, die halb so groß ist. \n",
    "$$\n",
    "\\epsilon' = \\frac{\\epsilon}{2}\\approx 1.1\\times 10^{-16}\n",
    "$$ \n",
    "ist der maximale relative Fehler, der beim Runden einer reellen Zahl auf die nächste Maschinenzahl entstehen kann.\n",
    "- Da  Zahlen aus dem Intervall $(1-\\epsilon',1+\\epsilon']$ auf die Maschinenzahl $1$ gerundet werden, kann man $\\epsilon'$ auch definieren als: *die größte Zahl, für die in der Maschinenzahlarithmetik noch gilt: $1+\\epsilon' = 1$.*  \n",
    "\n",
    "Auf diese Weise kann man das Maschinenepsilon auch berechnen:\n",
    "\n",
    ":::{.ansitight}\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "94351455",
   "metadata": {},
   "outputs": [],
   "source": [
    "Eps = 1\n",
    "while(1 != 1 + Eps)\n",
    "    Eps /= 2\n",
    "    println(1+Eps)\n",
    "end\n",
    "Eps"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ce03b152",
   "metadata": {},
   "source": [
    "oder als Bitmuster:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de6c5e53",
   "metadata": {},
   "outputs": [],
   "source": [
    "Eps=1\n",
    "while 1 != 1 + Eps\n",
    "    Eps/= 2\n",
    "    printbitsf64(1+Eps)\n",
    "end\n",
    "Eps"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "23b7914a",
   "metadata": {},
   "source": [
    ":::\n",
    "\n",
    "Die größte und die kleinste positive normalisiert darstellbare Gleitkommazahl eines Gleitkommatyps kann man abfragen:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "10296953",
   "metadata": {},
   "outputs": [],
   "source": [
    "@show floatmax(Float64)\n",
    "printbitsf64(floatmax(Float64))\n",
    "\n",
    "@show floatmin(Float64)\n",
    "printbitsf64(floatmin(Float64))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac56b395",
   "metadata": {},
   "source": [
    "## Runden auf Maschinenzahlen\n",
    "\n",
    "- Die Abbildung rd: ℝ $\\rightarrow$  𝕄 soll zur nächstgelegenen darstellbaren Zahl runden. \n",
    "- Standardrundungsregel: _round to nearest, ties to even_   \n",
    "  Wenn man genau die Mitte zwischen zwei Maschinenzahlen trifft *(tie)*, wählt man die, deren letztes Mantissenbit 0 ist. \n",
    "- Begründung: damit wird statistisch in 50% der Fälle auf- und in 50% der Fälle abgerundet und  so ein „statistischer Drift“ bei längeren Rechnungen vermieden.   \n",
    "\n",
    "##  Maschinenzahlarithmetik\n",
    "\n",
    "Die Maschinenzahlen sind als Untermenge von ℝ nicht algebraisch abgeschlossen. Schon die Summe zweier Maschinenzahlen wird in der Regel keine Maschinenzahl sein. \n",
    "\n",
    ":::{.callout-important}\n",
    "Der Standard IEEE 754 fordert, dass die Maschinenzahlarithmetik das *gerundete exakte Ergebnis* liefert:\n",
    "\n",
    "Das Resultat muss gleich demjenigen sein, das bei einer exakten Ausführung der entsprechenden Operation mit anschließender Rundung entsteht.\n",
    "$$\n",
    " a \\oplus b = \\text{rd}(a + b)\n",
    "$$ \n",
    "Analoges muss für die Implemetierung der Standardfunktionen wie \n",
    "wie `sqrt()`, `log()`, `sin()` ... gelten: Sie liefern ebenfalls die Maschinenzahl, die dem exakten Ergebnis am nächsten kommt.\n",
    ":::\n",
    "\n",
    "\n",
    "Die Arithmetik ist *nicht assoziativ*:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dbe01c9d",
   "metadata": {},
   "outputs": [],
   "source": [
    "1 +  10^-16 + 10^-16 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cd4dea94",
   "metadata": {},
   "outputs": [],
   "source": [
    "1 + (10^-16 + 10^-16)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c5f1bc1c",
   "metadata": {},
   "source": [
    "Im ersten Fall (ohne Klammern) wird von links nach rechts ausgewertet:\n",
    "$$\n",
    "\\begin{aligned}\n",
    "1 \\oplus 10^{-16} \\oplus 10^{-16} &=\n",
    " (1 \\oplus 10^{-16}) \\oplus 10^{-16} \\\\  &= \n",
    "\\text{rd}\\left(\\text{rd}(1 + 10^{-16}) + 10^{-16} \\right)\\\\\n",
    "&= \\text{rd}(1 + 10^{-16})\\\\\n",
    "&= 1\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "Im zweiten Fall erhält man:\n",
    "$$\n",
    "\\begin{aligned}\n",
    "1 \\oplus \\left(10^{-16} \\oplus 10^{-16}\\right) &=\n",
    "\\text{rd}\\left(1 + \\text{rd}(10^{-16} + 10^{-16}) \\right)\\\\\n",
    "&= \\text{rd}(1 + 2*10^{-16})\\\\\n",
    "&= 1.0000000000000002\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "\n",
    "Es sei auch daran erinnert, dass sich selbst „einfache“ Dezimalbrüche häufig nicht exakt als Maschinenzahlen darstellen lassen:\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "(0.1)_{10} &= (0.000110011001100110011001100...)_2 = (0.000\\overline{1100})_2 \\\\\n",
    " (0.3)_{10} &= (0.0100110011001100110011001100..)_2 = (0.0\\overline{1001})_2 \n",
    " \\end{aligned}\n",
    " $$\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3d65d23",
   "metadata": {},
   "outputs": [],
   "source": [
    "printbitsf64(0.1)\n",
    "printbitsf64(0.3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4941993c",
   "metadata": {},
   "source": [
    "Folge:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c2d4306a",
   "metadata": {},
   "outputs": [],
   "source": [
    "0.1 + 0.1 == 0.2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "92671384",
   "metadata": {},
   "outputs": [],
   "source": [
    "0.2 + 0.1 == 0.3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a3ae81f9",
   "metadata": {},
   "outputs": [],
   "source": [
    "0.2 + 0.1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2ea75313",
   "metadata": {},
   "source": [
    "Bei der Ausgabe einer Maschinenzahl muss der Binärbruch in einen Dezimalbruch entwickelt werden. Man kann sich auch mehr Stellen dieser Dezimalbruchentwicklung anzeigen lassen: \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "17c80acc",
   "metadata": {},
   "outputs": [],
   "source": [
    "using Printf\n",
    "@printf(\"%.30f\", 0.1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3409a939",
   "metadata": {},
   "outputs": [],
   "source": [
    "@printf(\"%.30f\", 0.3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "51bd9e73",
   "metadata": {},
   "source": [
    "Die Binärbruch-Mantisse einer Maschinenzahl kann eine lange oder sogar unendlich-periodische Dezimalbruchentwicklung haben. Dadurch \n",
    "sollte man sich nicht  eine „höheren Genauigkeit“ suggerieren lassen! \n",
    "\n",
    ":::{.callout-important}\n",
    "Moral: wenn man `Float`s auf Gleichheit testen will, sollte man fast immer eine dem Problem angemessene realistische Genauigkeit `epsilon` festlegen und \n",
    "darauf testen:\n",
    "\n",
    "```julia\n",
    "epsilon = 1.e-10\n",
    "\n",
    "if abs(x-y) < epsilon\n",
    "   # ...\n",
    "end\n",
    "```\n",
    ":::\n",
    "\n",
    "## Normalisierte und Denormalisierte Maschinenzahlen\n",
    "\n",
    "Zum Verständnis nehmen wir ein einfaches Modell: \n",
    "\n",
    "- Sei 𝕄(10,4,±5)  die Menge der Maschinenzahlen zur Basis 10 mit 4 Mantissenstellen (eine vor dem Komma, 3 Nachkommastellen) und dem Exponentenbereich -5 ≤ E ≤ 5.\n",
    "- Dann ist die normalisierte Darstellung (Vorkommastelle ungleich 0)\n",
    "  von z.B. 1234.0  gleich  1.234e3 und von  0.00789 gleich 7.890e-3.\n",
    "- Es ist wichtig, dass die Maschinenzahlen bei jedem Rechenschritt normalisiert gehalten werden. Nur so wird die Mantissenlänge voll ausgenutzt und die Genauigkeit ist maximal.\n",
    "- Die kleinste positive normalisierte Zahl in unserem Modell ist `x = 1.000e-5`. Schon `x/2` müsste auf 0 gerundet werden. \n",
    "- Hier erweist es sich für viele Anwendungen als günstiger, auch denormalisierte *(subnormal)* Zahlen zuzulassen und `x/2` als `0.500e-5` oder `x/20` als `0.050e-5` darzustellen. \n",
    "- Dieser *gradual underflow* ist natürlich mit einem Verlust an gültigen Stellen und damit Genauigkeit verbunden.\n",
    "\n",
    "\n",
    "Im `Float`-Datentyp werden solche *subnormal values* dargestellt durch ein Exponentenfeld, in dem alle Bits gleich Null sind:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4cf6ecae",
   "metadata": {},
   "outputs": [],
   "source": [
    "#| echo: false\n",
    "#| output: false\n",
    "flush(stdout)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "baddd014",
   "metadata": {},
   "source": [
    ":::{.ansitight}\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1d0377d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "using Printf\n",
    "\n",
    "x = 2 * floatmin(Float64)   # 2*kleinste normalisierte Gleitkommazahl > 0\n",
    "\n",
    "while x != 0\n",
    "    x /= 2\n",
    "    @printf \"%14.6g  \" x\n",
    "    printbitsf64(x)\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "246e9987",
   "metadata": {},
   "source": [
    ":::\n",
    "\n",
    "## Spezielle Werte\n",
    "\n",
    "Die Gleitkommaarithmetik kennt einige spezielle Werte:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7a7ad264",
   "metadata": {},
   "outputs": [],
   "source": [
    "for x ∈ (NaN, Inf, -Inf, -0.0)\n",
    "    @printf \"%6g  \" x\n",
    "    printbitsf64(x)\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "806100b7",
   "metadata": {},
   "source": [
    "- Ein Exponentenüberlauf *(overflow)* führt zum Ergebnis `Inf` oder `-Inf`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c4381f5a",
   "metadata": {},
   "outputs": [],
   "source": [
    "2/0, -3/0, floatmax(Float64) * 1.01, exp(1300)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01836c3b",
   "metadata": {},
   "source": [
    "- Damit kann weitergerechnet werden:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f6d21942",
   "metadata": {},
   "outputs": [],
   "source": [
    "-Inf + 20, Inf/30, 23/-Inf, sqrt(Inf),  Inf * 0, Inf - Inf"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b8dd245d",
   "metadata": {},
   "source": [
    "- `NaN` *(Not a Number)* steht für das Resultat einer Operation, das undefiniert ist. Alle weiteren Operationen mit `NaN` ergeben ebenfalls `NaN`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ed988054",
   "metadata": {},
   "outputs": [],
   "source": [
    "0/0, Inf - Inf, 2.3NaN, sqrt(NaN)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a5cdc827",
   "metadata": {},
   "source": [
    "- Da `NaN` einen undefinierten Wert repräsentiert, ist es zu nichts gleich, nichtmal zu sich selbst. Das ist sinnvoll, denn wenn zwei Variablen `x` und `y` als `NaN` berechnet wurden, sollte man nicht schlußfolgern, dass sie gleich sind. \n",
    "- Zum Testen auf `NaN` gibt es daher die boolsche Funktion `isnan()`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c1fddee9",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = 0/0\n",
    "y = Inf - Inf\n",
    "@show x==y   NaN==NaN  isfinite(NaN) isinf(NaN) isnan(x) isnan(y);  "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7620f690",
   "metadata": {},
   "source": [
    "- Es gibt eine „minus Null“. Sie signalisiert einen Exponentenunterlauf *(underflow)* einer betragsmäßig zu klein gewordenen *negativen* Größe. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "448f1740",
   "metadata": {},
   "outputs": [],
   "source": [
    "@show 23/-Inf   -2/exp(1200)    -0.0==0.0;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc1c8f29",
   "metadata": {},
   "source": [
    "## Mathematische Funktionen\n",
    "\n",
    "Julia verfügt über die [üblichen mathematischen Funktionen](https://docs.julialang.org/en/v1/manual/mathematical-operations/#Rounding-functions)  \n",
    "`sqrt, exp, log, log2, log10, sin, cos,..., asin, acos,..., sinh,..., gcd, lcm, factorial,...,abs, max, min,...`,\n",
    "\n",
    "darunter z.B. die [Rundungsfunktionen](https://de.wikipedia.org/wiki/Abrundungsfunktion_und_Aufrundungsfunktion)\n",
    "\n",
    "- `floor(T,x)` = $\\lfloor x \\rfloor$\n",
    "- `ceil(T,x)`  = $\\lceil x \\rceil$\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a22d5739",
   "metadata": {},
   "outputs": [],
   "source": [
    "floor(3.4),   floor(Int64, 3.5),   floor(Int64, -3.5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "175b33d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "ceil(3.4),    ceil(Int64, 3.5),    ceil(Int64, -3.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "40ce370d",
   "metadata": {},
   "source": [
    "Es sei noch hingewiesen auf `atan(y, x)`, den [Arkustangens mit 2 Argumenten](https://de.wikipedia.org/wiki/Arctan2), Er ist in anderen Programmiersprachen oft als Funktion mit eigenem Namen *atan2* implementiert. \n",
    "Dieser löst das Problem der Umrechnung von kartesischen in Polarkoordinaten ohne lästige Fallunterscheidung. \n",
    "\n",
    "- `atan(y,x)`  ist Winkel der Polarkoordinaten von (x,y) im Intervall $(-\\pi,\\pi]$. Im 1. und 4. Quadranten ist er also gleich `atan(y/x)`\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "518c24db",
   "metadata": {},
   "outputs": [],
   "source": [
    "atan(3, -2),    atan(-3, 2),    atan(-3/2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f61060f4",
   "metadata": {},
   "source": [
    "##  Umwandlung  Strings $\\Longleftrightarrow$ Zahlen\n",
    "\n",
    "Die Umwandlung ist mit den  Funktionen `parse()` und `string()` möglich.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "96490c3b",
   "metadata": {},
   "outputs": [],
   "source": [
    "parse(Int64, \"1101\", base=2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "814ac0ed",
   "metadata": {},
   "outputs": [],
   "source": [
    "string(13, base=2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "90b64397",
   "metadata": {},
   "outputs": [],
   "source": [
    "string(1/7)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fb1ebb4d",
   "metadata": {},
   "outputs": [],
   "source": [
    "string(77, base=16)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "029d437e",
   "metadata": {},
   "source": [
    "Zur Umwandlung der numerischen Typen ineinander kann man die Typnamen verwenden. Typenamen sind auch Konstruktoren:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "df5b20f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = Int8(67)\n",
    "@show x   typeof(x);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "27314b21",
   "metadata": {},
   "outputs": [],
   "source": [
    "z = UInt64(3459)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5687c82a",
   "metadata": {},
   "outputs": [],
   "source": [
    "y = Float64(z)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "075b91d7",
   "metadata": {},
   "source": [
    "## Literatur\n",
    "\n",
    "- D. Goldberg, [What Every Computer Scientist Should Know About Floating-Point Arithmetic](http://www.validlab.com/goldberg/paper.pdf)\n",
    "- C. Vuik, [Some Disasters caused by numerical errors](http://ta.twi.tudelft.nl/users/vuik/wi211/disasters.html)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.8.5",
   "language": "julia",
   "name": "julia-1.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}