english translation started

2026-02-22 18:22:46 +01:00
parent 19c67a3c9b
commit 9e418381ca
17 changed files with 1915 additions and 1907 deletions
--- a/chapters/10_Strings.qmd
+++ b/chapters/10_Strings.qmd
@@ -11,96 +11,97 @@ import QuartoNotebookWorker
 Base.stdout = QuartoNotebookWorker.with_context(stdout)
 ```

-# Zeichen, Strings und Unicode
+# Characters, Strings, and Unicode

-## Zeichencodierungen (Frühgeschichte)
+## Character Encodings (Early History)

-Es gab - abhängig von Hersteller, Land, Programmiersprache, Betriebsssystem,... - eine große Vielzahl von Codierungen. 
+Depending on manufacturer, country, programming language, operating system, etc., a wide variety of encodings existed.

-Bis heute relevant sind:
+The following are still relevant today:


 ### ASCII 
-Der _American Standard Code for Information Interchange_ wurde 1963 in den USA als Standard veröffentlicht. 
+The _American Standard Code for Information Interchange_ was published as a standard in the USA in 1963.

- Er definiert $2^7=128$ Zeichen, und zwar:  
-  - 33 Steuerzeichen, wie `newline`, `escape`, `end of transmission/file`, `delete`
-  - 95 graphisch darstellbare Zeichen:
-    - 52 lateinische Buchstaben `a-z, A-Z`
-    - 10 Ziffern `0-9`
-    -  7 Satzzeichen `.,:;?!"`
-    - 1 Leerzeichen  ` `
-    - 6 Klammern  `[{()}]`
-    - 7 mathematische Operationen `+-*/<>=`
-    - 12 Sonderzeichen ``` #$%&'\^_|~`@ ``` 
+- It defines $2^7=128$ characters, namely:
+  - 33 control characters, such as `newline`, `escape`, `end of transmission/file`, `delete`
+  - 95 graphically printable characters:
+    - 52 Latin letters `a-z, A-Z`
+    - 10 digits `0-9`
+    - 7 punctuation marks `.,:;?!"`
+    - 1 space ` `
+    - 6 brackets `[{()}]`
+    - 7 mathematical operators `+-*/<>=`
+    - 12 special characters ``` #$%&'\^_|~`@ ```

- ASCII ist heute noch der "kleinste gemeinsame Nenner" im Codierungs-Chaos.
- Die ersten 128 Unicode-Zeichen sind identisch mit ASCII.
+- ASCII is still the "lowest common denominator" in the encoding chaos.
+- The first 128 Unicode characters are identical to ASCII.

-### ISO 8859-Zeichensätze
+### ISO 8859 Character Sets

- ASCII nutzt nur 7 Bits. 
- In einem Byte kann man durch Setzen des 8. Bits weitere 128 Zeichen unterbringen. 
- 1987/88 wurden im ISO 8859-Standard verschiedene 1-Byte-Codierungen festgelegt, die alle ASCII-kompatibel sind, darunter:
+- ASCII uses only 7 bits.
+- Setting the 8th bit of a byte makes room for another 128 characters.
+- In 1987/88, various 1-byte encodings were standardized in ISO 8859, all ASCII-compatible, including:

 :::{.narrow}
-  |Codierung | Region  | Sprachen|
+  |Encoding | Region  | Languages|
  |:-----------|:----------|:-------|
-   |ISO 8859-1 (Latin-1)  |  Westeuropa | Deutsch, Französisch,...,Isländisch
-   |ISO 8859-2 (Latin-2)  |  Osteuropa  | slawische Sprachen mit lateinischer Schrift
-   |ISO 8859-3 (Latin-3)  | Südeuropa   | Türkisch, Maltesisch,...
-   |ISO 8859-4 (Latin-4)  | Nordeuropa  | Estnisch, Lettisch, Litauisch, Grönländisch, Sami
-   |ISO 8859-5 (Latin/Cyrillic) | Osteuropa | slawische Sprachen mit kyrillischer Schrift
+   |ISO 8859-1 (Latin-1)  |  Western Europe | German, French,..., Icelandic
+   |ISO 8859-2 (Latin-2)  |  Eastern Europe  | Slavic languages with Latin script
+   |ISO 8859-3 (Latin-3)  | Southern Europe   | Turkish, Maltese,...
+   |ISO 8859-4 (Latin-4)  | Northern Europe  | Estonian, Latvian, Lithuanian, Greenlandic, Sami
+   |ISO 8859-5 (Latin/Cyrillic) | Eastern Europe | Slavic languages with Cyrillic script
   |ISO 8859-6 (Latin/Arabic) | |
   |ISO 8859-7 (Latin/Greek)  | |
   |...| | 
-   |ISO 8859-15 (Latin-9)| | 1999: Revision von Latin-1: jetzt u.a. mit Euro-Zeichen 
+   |ISO 8859-15 (Latin-9)| | 1999: Revision of Latin-1: now including Euro sign 
   
 :::
   
+
 ## Unicode    

-Das Ziel des Unicode-Consortiums ist eine einheitliche Codierung für alle Schriften der Welt.
+The goal of the Unicode Consortium is a uniform encoding for all scripts of the world.

- Unicode Version 1 erschien 1991
- Unicode Version 15.1 erschien 2023 mit 149 813 Zeichen, darunter: 
-   - 161 Schriften 
-   - mathematische und technische Symbole
-   - Emojis und andere Symbole, Steuer- und Formatierungszeichen
- davon entfallen über 90 000 Zeichen auf die CJK-Schriften (Chinesisch/Japanisch/Koreanisch)     
-   
-   
-### Technische  Details
+- Unicode version 1 was published in 1991
+- Unicode version 15.1 was published in 2023 with 149,813 characters, including:
+   - 161 scripts
+   - mathematical and technical symbols
+   - Emojis and other symbols, control and formatting characters
+- Over 90,000 characters are assigned to the CJK scripts (Chinese/Japanese/Korean)

- Jedem Zeichen wird ein `codepoint` zugeordnet. Das ist einfach eine fortlaufende Nummer.
- Diese Nummer wird hexadezimal notiert
-   - entweder 4-stellig als `U+XXXX` (0-te Ebene) 
-   - oder 6-stellig als `U+XXXXXX`  (weitere Ebenen)
- Jede Ebene geht von `U+XY0000`  bis `U+XYFFFF`, kann also $2^{16}=65\;534$ Zeichen enthalten.    
- Vorgesehen sind bisher 17 Ebenen `XY=00` bis `XY=10`, also der  Wertebereich von `U+0000` bis `U+10FFFF`.
- Damit sind maximal 21 Bits pro Zeichen nötig.
- Die Gesamtzahl der damit möglichen Codepoints ist etwas kleiner als 0x10FFFF, da aus technischen Gründen gewisse Bereiche nicht verwendet werden. Sie beträgt etwa 1.1 Millionen, es ist also noch viel Platz. 
- Bisher wurden nur Codepoints aus den Ebenen 
-     - Ebene 0 = BMP _Basic Multilingual Plane_  `U+0000 - U+FFFF`,
-     - Ebene 1 = SMP _Supplementary Multilingual Plane_  `U+010000 - U+01FFFF`,
-     - Ebene 2 = SIP _Supplementary Ideographic Plane_    `U+020000 - U+02FFFF`, 
-     - Ebene 3 = TIP _Tertiary Ideographic Plane_     `U+030000 - U+03FFFF`   und
-     - Ebene 14 = SSP _Supplementary Special-purpose Plane_ `U+0E0000 - U+0EFFFF` 
-   vergeben.
- `U+0000` bis `U+007F` ist identisch mit ASCII
- `U+0000` bis `U+00FF` ist identisch mit ISO 8859-1 (Latin-1)

-### Eigenschaften von Unicode-Zeichen
+### Technical Details

-Im Standard wird jedes Zeichen beschrieben duch
+- Each character is assigned a `codepoint`. This is simply a sequential number.
+- This number is written in hexadecimal
+    - either 4-digit as `U+XXXX` (0th plane)
+    - or 6-digit as `U+XXXXXX` (further planes)
+- Each plane ranges from `U+XY0000` to `U+XYFFFF` and can thus contain $2^{16}=65\;536$ characters.
+- So far, 17 planes `XY=00` to `XY=10` are defined, giving the value range `U+0000` to `U+10FFFF`.
+- A character therefore needs at most 21 bits.
+- The total number of possible codepoints is slightly less than 0x10FFFF, because certain ranges are not used for technical reasons. It is about 1.1 million, so there is still plenty of room.
+- So far, only codepoints from the planes
+      - Plane 0 = BMP _Basic Multilingual Plane_  `U+0000 - U+FFFF`,
+      - Plane 1 = SMP _Supplementary Multilingual Plane_  `U+010000 - U+01FFFF`,
+      - Plane 2 = SIP _Supplementary Ideographic Plane_    `U+020000 - U+02FFFF`,
+      - Plane 3 = TIP _Tertiary Ideographic Plane_     `U+030000 - U+03FFFF` and
+      - Plane 14 = SSP _Supplementary Special-purpose Plane_ `U+0E0000 - U+0EFFFF`
+    have been assigned.
+- `U+0000` to `U+007F` is identical to ASCII
+- `U+0000` to `U+00FF` is identical to ISO 8859-1 (Latin-1)

-  - seinen Codepoint (Nummer) 
-  - einen Namen (welcher nur aus ASCII-Großbuchstaben, Ziffern und Minuszeichen besteht) und 
-  - verschiedene Attributen wie
-    - Laufrichtung der Schrift 
-    - Kategorie: Großbuchstabe, Kleinbuchstabe, modifizierender Buchstabe, Ziffer, Satzzeichen, Symbol, Seperator,....
+### Properties of Unicode Characters

-Im Unicode-Standard sieht das dann so aus (zur Vereinfachung nur Codepoint und Name):
+In the standard, each character is described by
+
+- its codepoint (number)
+- a name (which consists only of ASCII uppercase letters, digits, spaces, and hyphens) and
+- various attributes such as
+  - writing direction
+  - category: uppercase letter, lowercase letter, modifier letter, digit, punctuation, symbol, separator, ...
+
+In the Unicode standard, this looks as follows (simplified to codepoint and name only):
 ```
 ...
 U+0041 LATIN CAPITAL LETTER A
@@ -118,29 +119,29 @@ U+21B4 RIGHTWARDS ARROW WITH CORNER DOWNWARDS
 ...
 ```

-Wie sieht 'RIGHTWARDS ARROW WITH CORNER DOWNWARDS' aus?
+What does 'RIGHTWARDS ARROW WITH CORNER DOWNWARDS' look like?

-Julia verwendet `\U...` zur Eingabe von Unicode Codepoints.
+In Julia, Unicode codepoints are entered with `\U...`.

 ```{julia}
 '\U21b4'
 ```


-### Eine Auswahl an Schriften 
+### A Selection of Scripts

 ::: {.content-visible when-format="html"}

 :::{.callout-note}
-Falls im Folgenden einzelne Zeichen oder Schriften in Ihrem Browser nicht darstellbar sind, müssen Sie geeignete 
-Fonts auf Ihrem Rechner installieren. 
+If individual characters or scripts below do not display in your browser, you need to install suitable
+fonts on your computer.

-Alternativ können Sie die PDF-Version dieser Seite verwenden. Dort sind alle Fonts eingebunden.
+Alternatively, you can use the PDF version of this page, in which all fonts are embedded.
 :::

 :::

-Eine kleine Hilfsfunktion:
+A small helper function:
 ```{julia}
 function printuc(c, n)
    for i in 0:n-1
@@ -149,14 +150,15 @@ function printuc(c, n)
 end
 ```

-__Kyrillisch__
+__Cyrillic__


 ```{julia}
 printuc('\U0400', 100)
 ```

-__Tamilisch__
+__Tamil__
+

 :::{.cellmerge}
 ```{julia}
@@ -177,21 +179,21 @@ printuc('\U0be7',20)

 :::

-__Schach__
+__Chess__


 ```{julia}
 printuc('\U2654', 12)
 ```

-__Mathematische Operatoren__
+__Mathematical Operators__


 ```{julia}
 printuc('\U2200', 255)
 ```

-__Runen__
+__Runes__


 ```{julia}
@@ -200,12 +202,11 @@ printuc('\U16a0', 40)

 :::{.cellmerge}

-__Scheibe (Diskus) von Phaistos__
-
- Diese Schrift ist nicht entziffert. 
- Es ist unklar, welche Sprache dargestellt wird.
- Es gibt nur ein einziges Dokument in dieser Schrift: die Tonscheibe von Phaistos aus der Bronzezeit 
+__Phaistos Disc__

+- This script has not been deciphered.
+- It is unclear which language it represents.
+- Only a single document in this script exists: the clay Phaistos Disc from the Bronze Age.


 ```{julia}
@@ -226,54 +227,54 @@ printuc('\U101D0', 46 )

 :::

-### Unicode transformation formats: UTF-8, UTF-16, UTF-32
+### Unicode Transformation Formats: UTF-8, UTF-16, UTF-32

-_Unicode transformation formats_ legen fest, wie eine Folge von Codepoints als eine Folge von Bytes dargestellt wird. 
+_Unicode transformation formats_ define how a sequence of codepoints is represented as a sequence of bytes.

-Da die Codepoints unterschiedlich lang sind, kann man sie nicht einfach hintereinander schreiben. Wo hört einer auf und fängt der nächste an? 
+Since the codepoints are of different lengths, they cannot simply be written down one after the other. Where does one end and the next begin?

- __UTF-32__: Das einfachste, aber auch speicheraufwändigste, ist, sie alle auf gleiche Länge zu bringen. Jeder Codepoint wird in 4 Bytes = 32 Bit kodiert.   
- Bei __UTF-16__ wird ein Codepoint entweder mit 2 Bytes oder mit 4 Bytes dargestellt. 
- Bei __UTF-8__  wird ein Codepoint mit 1,2,3 oder 4 Bytes dargestellt. 
- __UTF-8__ ist das Format mit der höchsten Verbreitung. Es wird auch von Julia verwendet. 
+- __UTF-32__: The simplest but also most memory-intensive approach is to give all codepoints the same length. Each codepoint is encoded in 4 bytes = 32 bits.
+- In __UTF-16__, a codepoint is represented either with 2 bytes or with 4 bytes.
+- In __UTF-8__, a codepoint is represented with 1, 2, 3, or 4 bytes.
+- __UTF-8__ is the most widely used format. Julia uses it as well.


 ### UTF-8

- Für jeden Codepoint werden 1, 2, 3 oder 4 volle Bytes verwendet. 
+- For each codepoint, 1, 2, 3, or 4 full bytes are used.

- Bei einer Codierung mit variabler Länge muss man erkennen können, welche Bytefolgen zusammengehören:
-    - Ein Byte der Form 0xxxxxxx  steht für einen ASCII-Codepoint der Länge 1.
-    - Ein Byte der Form 110xxxxx  startet einen 2-Byte-Code.
-    - Ein Byte der Form 1110xxxx  startet einen 3-Byte-Code.
-    - Ein Byte der Form 11110xxx  startet einen 4-Byte-Code.
-    - Alle weiteren Bytes eines 2-,3- oder 4-Byte-Codes haben die Form 10xxxxxx. 
+- With variable-length encoding, one must be able to recognize which byte sequences belong together:
+    - A byte of the form 0xxxxxxx represents an ASCII codepoint of length 1.
+    - A byte of the form 110xxxxx starts a 2-byte code.
+    - A byte of the form 1110xxxx starts a 3-byte code.
+    - A byte of the form 11110xxx starts a 4-byte code.
+    - All further bytes of a 2-, 3-, or 4-byte code have the form 10xxxxxx.

- Damit ist der Platz, der für den Codepoint zur Verfügung steht (Anzahl der x):
-     - Ein-Byte-Code:  7 Bits
-     - Zwei-Byte-Code: 5 + 6 = 11 Bits
-     - Drei-Byte-Code: 4 + 6 + 6 = 16 Bits
-     - Vier-Byte-Code: 3 + 6 + 6 + 6 = 21 Bits
+- This leaves the following space for the codepoint (number of x bits):
+     - One-byte code:  7 bits
+     - Two-byte code: 5 + 6 = 11 bits
+     - Three-byte code: 4 + 6 + 6 = 16 bits
+     - Four-byte code: 3 + 6 + 6 + 6 = 21 bits

- Damit ist jeder ASCII-Text automatisch auch ein korrekt codierter UTF-8-Text.
+- Thus, every ASCII text is automatically also a correctly encoded UTF-8 text.

- Sollten die bisher für Unicode festgelegten 17 Ebenen (= 21 Bit = 1.1 Mill. mögliche Zeichen) mal erweitert werden, dann wird UTF-8 auf 5- und 6-Byte-Codes erweitert.  
-  
-
-## Zeichen und Zeichenketten in Julia
-
-### Zeichen: `Char` 
-
-Der Datentyp `Char`  kodiert ein einzelnes Unicode-Zeichen. 
-
- Julia verwendet dafür einfache Anführungszeichen:  `'a'`.  
- Ein `Char` belegt 4 Bytes Speicher und 
- repräsentiert einen Unicode-Codepoint.
- `Char`s können  von/zu `UInt`s umgewandelt werden und 
- der Integer-Wert ist gleich dem Unicode-codepoint.
+- If the 17 planes (= 21 bits = 1.1 million possible characters) defined for Unicode so far are ever extended, UTF-8 could be extended to 5- and 6-byte codes.


-`Char`s können  von/zu `UInt`s umgewandelt werden.
+## Characters and Character Strings in Julia
+
+### Characters: `Char`
+
+The `Char` type encodes a single Unicode character.
+
+- Julia uses single quotes for this: `'a'`.
+- A `Char` occupies 4 bytes of memory.
+- It represents a Unicode codepoint.
+- `Char`s can be converted to/from `UInt`s.
+- The integer value is equal to the Unicode codepoint.
+
+
+`Char`s can be converted to/from `UInt`s.

 ```{julia}
 UInt('a')
@@ -284,10 +285,10 @@ UInt('a')
 b = Char(0x2656)
 ```

-### Zeichenketten: `String`
+### Character Strings: `String`

- Für Strings verwendet Julia doppelte Anführungszeichen: `"a"`.
- Sie sind UTF-8-codiert, d.h., ein Zeichen kann zwischen 1 und 4 Bytes lang sein.
+- For strings, Julia uses double quotes: `"a"`.
+- They are UTF-8 encoded, i.e., one character can be between 1 and 4 bytes long.


 ```{julia}
@@ -297,21 +298,21 @@ b = Char(0x2656)



-__Bei einem Nicht-ASCII-String unterscheiden sich Anzahl der Bytes und Anzahl der Zeichen:__
+__For a non-ASCII string, the number of bytes and the number of characters differ:__


 ```{julia}
 asciistr = "Hello World!"
@show length(asciistr) ncodeunits(asciistr);
 ```
-(Das Leerzeichen zählt natürlich auch.)
+(The space, of course, also counts.)

 ```{julia}
 str = "😄 Hellö 🎶"
@show length(str) ncodeunits(str);
 ```

-__Iteration über einen String iteriert über die Zeichen:__
+__Iterating over a string iterates over the characters:__


 ```{julia}
@@ -320,76 +321,76 @@ for i in str
 end
 ```

-### Verkettung von Strings
+### Concatenation of Strings

-"Strings mit Verkettung bilden ein nichtkommutatives Monoid."
+"Strings with concatenation form a non-commutative monoid."

-Deshalb wird in Julia die Verkettung multiplikativ geschrieben.
+Therefore, Julia writes concatenation multiplicatively.
 ```{julia}
 str * asciistr * str
 ```

-Damit sind auch Potenzen mit natürlichem Exponenten definiert.
+Powers with natural exponents are thus also defined.

 ```{julia}
 str^3,  str^0
 ```

-### Stringinterpolation
+### String Interpolation

-Das Dollarzeichen hat in Strings eine Sonderfunktion, die wir schon oft in 
-`print()`-Anweisungen  genutzt haben. MAn kann damit eine Variable oder einen Ausdruck interpolieren:
+The dollar sign has a special function in strings, which we have already used often in
+`print()` statements. It interpolates a variable or an expression:


 ```{julia}
 a = 33.4
 b = "x"

-s = "Das Ergebnis für $b ist gleich $a und die verdoppelte Wurzel daraus ist $(2sqrt(a))\n"
+s = "The result for $b is equal to $a and the doubled square root of it is $(2sqrt(a))\n"
 ```

-### Backslash escape sequences 
+### Backslash Escape Sequences

-Der _backslash_ `\` hat in Stringkonstanten ebenfalls eine Sonderfunktion. 
-Julia benutzt die von C und anderen Sprachen bekannten _backslash_-Codierungen für Sonderzeichen und für Dollarzeichen und Backslash selbst:
+The _backslash_ `\` also has a special function in string constants.
+Julia uses the backslash escapes known from C and other languages for special characters, and for the dollar sign and the backslash itself:


 ```{julia}
-s = "So bekommt man \'Anführungszeichen\" und ein \$-Zeichen und einen\nZeilenumbruch und ein \\ usw... "
+s = "This is how one gets \'quotes\" and a \$ sign and a\nline break and a \\ etc... "
 print(s)
 ```



-### Triple-Quotes
+### Triple Quotes

-Man kann Strings auch mit Triple-Quotes begrenzen. 
-In dieser Form bleiben Zeilenumbrüche und Anführungszeichen erhalten:
+Strings can also be delimited with triple quotes.
+In this form, line breaks and quotes are preserved:


 ```{julia}
 s = """
- Das soll
-ein "längerer"  
-  'Text' sein.
+ This should
+be a "longer"  
+  'text'.
 """

 print(s)
 ```

-### Raw strings
+### Raw Strings

-In einem `raw string` sind alle backslash-Codierungen außer `\"` abgeschaltet:
+In a `raw string`, all backslash escapes except `\"` are disabled:


 ```{julia}
-s = raw"Ein $ und ein \ und zwei \\ und ein 'bla'..."
+s = raw"A $ and a \ and two \\ and a 'bla'..."
 print(s)
 ```

-## Weitere Funktionen für Zeichen und Strings (Auswahl)
+## Further Functions for Characters and Strings (Selection)

-### Tests für Zeichen
+### Tests for Characters


 ```{julia}
@@ -397,9 +398,9 @@ print(s)
@show isnumeric('½') iscntrl('\n') ispunct(';');
 ```

-### Anwendung auf Strings
+### Application to Strings

-Diese Tests lassen sich z.B. mit `all()`, `any()` oder `count()` auf Strings anwenden:
+These tests can be applied to strings with, e.g., `all()`, `any()`, or `count()`:


 ```{julia}
@@ -408,7 +409,7 @@ all(ispunct, ";.:")


 ```{julia}
-any(isdigit, "Es ist 3 Uhr! 🕒" )
+any(isdigit, "It is 3 o'clock! 🕒" )
 ```


@@ -416,7 +417,7 @@ any(isdigit, "Es ist 3 Uhr! 🕒" )
 count(islowercase, "Hello, du!!")
 ```

-### Weitere String-Funktionen
+### Other String Functions


 ```{julia}
@@ -438,50 +439,50 @@ count(islowercase, "Hello, du!!")


 ```{julia}
-split("π ist irrational.")
+split("π is irrational.")
 ```


 ```{julia}
-replace("π ist irrational.", "ist" => "ist angeblich")
+replace("π is irrational.", "is" => "is allegedly")
 ```

-## Indizierung von Strings
+## Indexing of Strings

-Strings sind nicht mutierbar aber indizierbar. Dabei gibt es ein paar Besonderheiten.
+Strings are immutable but indexable. Indexing has a few peculiarities.

- Der Index nummeriert die Bytes des Strings. 
- Bei einem nicht-ASCII-String sind nicht alle Indizes gültig, denn
- ein gültiger Index adressiert immer ein Unicode-Zeichen.
+- The index numbers the bytes of the string.
+- For a non-ASCII string, not all indices are valid.
+- A valid index always addresses the first byte of a Unicode character.

-Unser Beispielstring:
+Our example string:
 ```{julia}
 str
 ```

-Das erste Zeichen
+The first character:
 ```{julia}
 str[1]
 ```

-Dieses Zeichen ist in UTF8-Kodierung 4 Bytes lang. Damit sind 2,3 und 4 ungültige Indizes. 
+This character is 4 bytes long in UTF-8 encoding. Thus, 2, 3, and 4 are invalid indices.
 ```{julia}
 str[2]
 ```

-Erst das 5. Byte ist ein neues Zeichen:
+Only the 5th byte starts a new character:

 ```{julia}
 str[5]
 ```

-Auch bei der Adressierung von Substrings müssen Anfang und Ende jeweils gültige Indizes sein, d.h., der Endindex muss ebenfalls das erste Byte eines Zeichens indizieren und dieses Zeichen ist das letzte des Teilstrings. 
+When addressing substrings, too, start and end must both be valid indices, i.e., the end index must also point to the first byte of a character, and that character is the last one of the substring.

 ```{julia}
 str[1:7]
 ```

-Die Funktion  `eachindex()` liefert einen Iterator über die gültigen Indizes:
+The function `eachindex()` returns an iterator over the valid indices:

 ```{julia}
 for i in eachindex(str)
@@ -490,24 +491,24 @@ for i in eachindex(str)
 end
 ```

-Wie üblich macht collect() aus einem Iterator einen Vektor.
+As usual, `collect()` turns an iterator into a vector.

 ```{julia}
 collect(eachindex(str))
 ```

-Die Funktion `nextind()` liefert den nächsten gültigen Index.
+The function `nextind()` returns the next valid index.
 ```{julia}
@show nextind(str, 1) nextind(str, 2);  
 ```

-Warum verwendet Julia einen Byte-Index und keinen Zeichenindex? Der Hauptgrund dürfte die Effizienz der Indizierung sein.
+Why does Julia use a byte index instead of a character index? The main reason is probably the efficiency of indexing.

- In einem langen String, z.B. einem Buchtext, ist die Stelle `s[123455]` mit einem Byte-Index schnell zu finden. 
- Ein Zeichen-Index müsste in der UTF-8-Codierung den ganzen String durchlaufen, um das n-te Zeichen zu finden, da die Zeichen 1,2,3 oder 4 Bytes lang sein können.
+- In a long string, e.g., a book text, the position `s[123455]` can be found quickly with a byte index.
+- With a character index, UTF-8 encoding would require scanning the string from the beginning to find the n-th character, since the characters can be 1, 2, 3, or 4 bytes long.


-Einige Funktionen liefern Indizes oder Ranges als Resultat. Sie liefern immer gültige Indizes:
+Some functions return indices or ranges as results. They always return valid indices:


 ```{julia}
@@ -529,9 +530,8 @@ str2 = "αβγδϵ"^3
 n = findfirst('γ', str2)
 ```

-So kann man  ab dem nächsten nach `n=5` gültigen Index weitersuchen:
+This way, the search can continue from the next valid index after `n=5`:

 ```{julia}
 findnext('γ', str2, nextind(str2, n))
 ```
-
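
The last example of the chapter (`findfirst` followed by `findnext` from `nextind`) can be repeated in a loop to find every occurrence of a character. The following is only a minimal sketch: the helper name `findallchar` is hypothetical and not part of the chapter or of Julia's standard library. It relies on `findnext` returning `nothing` when there is no further match.

```julia
# Hypothetical helper (not from the chapter): collect the byte indices
# of every occurrence of the character c in the string s.
function findallchar(c::Char, s::AbstractString)
    hits = Int[]
    i = findfirst(c, s)              # `nothing` if c does not occur at all
    while i !== nothing
        push!(hits, i)
        # nextind(s, i) is the index at which the following character starts;
        # searching from there never lands in the middle of a character.
        i = findnext(c, s, nextind(s, i))
    end
    return hits
end

str2 = "αβγδϵ"^3
positions = findallchar('γ', str2)
@show positions;
```

Each character in `str2` is 2 bytes long in UTF-8, so the returned values are byte indices, not character counts, and each of them is a valid index into `str2`.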