---
engine: julia
---
```{julia}
#| error: false
#| echo: false
#| output: false
using InteractiveUtils
import QuartoNotebookWorker
Base.stdout = QuartoNotebookWorker.with_context(stdout)
```
```{julia}
#| error: false
#| echo: false
#| output: false
flush(stdout)
```
# Input and Output
```{julia}
#| error: false
#| echo: false
#| output: false
# https://github.com/JuliaLang/julia/blob/master/base/show.jl#L516-L520
# https://github.com/JuliaLang/julia/blob/master/base/show.jl#L3073-L3077
using InteractiveUtils
import QuartoNotebookWorker
Base.stdout = QuartoNotebookWorker.with_context(stdout)
myactive_module() = Main.Notebook
Base.active_module() = myactive_module()
```
## Console
The operating system typically provides three channels (_streams_) for a program:
- Standard input (`stdin`)
- Standard output (`stdout`)
- Standard error (`stderr`)
When executed in a terminal, console, or shell, the program reads keyboard input through `stdin` and outputs to the terminal via `stdout` and `stderr`.
- Writing to `stdout`: `print()`, `println()`, `printstyled()`
- Writing to `stderr`: `print(stderr,...)`, `println(stderr,...)`, `printstyled(stderr,...)`
- Reading from `stdin`: `readline()`
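As a minimal sketch of these write targets, the hypothetical helper `report` below writes to whatever stream it is given. An `IOBuffer` stands in for a stream whose content can be inspected afterwards:
```{julia}
# `report` is an illustrative helper, not part of Julia
report(io::IO, msg) = println(io, "[info] ", msg)

report(stdout, "goes to standard output")
report(stderr, "goes to standard error")
printstyled("highlighted text\n"; color = :red, bold = true)

# Any IO object works, e.g. an in-memory buffer
buf = IOBuffer()
report(buf, "captured")
captured = String(take!(buf))
```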
### Input
The _Python_ language provides an `input()` function:
```{.python}
ans = input("Please enter a positive number!")
```
It prints the prompt, waits for input, and returns a `string`.
In Julia, you can implement this function as follows:
```{julia}
function input(prompt = "Input:")
    println(prompt)
    flush(stdout)
    return chomp(readline())
end
```
**Comments**
- Write operations may be buffered. `flush(stdout)` empties the buffer and forces the write to complete immediately, so the prompt is visible before the program waits for input.
- `readline()` returns the line without its trailing newline (`\n`) by default; only `readline(keep=true)` retains it. `chomp()` removes a single trailing line break if one is present, so it is redundant here but harmless.
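A small sketch of the newline handling, using an `IOBuffer` as a stand-in for keyboard input (an assumption for illustration; reading from `stdin` follows the same rules):
```{julia}
io = IOBuffer("34 56\nsecond line\n")

line1 = readline(io)                # newline is dropped by default
line2 = readline(io; keep = true)   # newline is kept
line2_clean = chomp(line2)          # removes the trailing newline again
```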
```{julia}
#| eval: false
a = input("Please enter two numbers!")
```
```{julia}
#| echo: false
a = "34 56"
```
### Processing the Input
> `split(str)` splits a string into "words", returning a string array:
```{julia}
av = split(a)
```
> `parse(T, str)` tries to convert `str` to type `T`:
```{julia}
v = parse.(Int, av)
```
`parse()` throws an error if the string cannot be parsed as type `T`. You can catch the error with
`try/catch`, or use `tryparse(T, str)`, which returns `nothing` in such cases. Test the result with `isnothing()`.
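A minimal sketch of the `tryparse` variant. The helper name `parse_ints` is hypothetical; it silently skips words that are not integers:
```{julia}
function parse_ints(str)
    result = Int[]
    for word in split(str)
        value = tryparse(Int, word)   # `nothing` if `word` is not an integer
        if !isnothing(value)
            push!(result, value)
        end
    end
    return result
end

parsed = parse_ints("34 abc 56")
```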
## Formatted Output with the `Printf` Package
You often need to output numbers or strings with strict formatting: total length, decimal places, alignment, etc.
For this purpose, the `Printf` package defines the macros `@sprintf` and `@printf`, which work similarly to the corresponding C functions.
```{julia}
using Printf
x = 123.7876355638734
@printf("Output right-aligned with max. 10 character width and 3 decimal places: x = %10.3f", x)
```
The first argument is a string containing placeholders (here: `%10.3f`) for the variables, followed by the variables themselves.
Placeholders have the form:
```
%[flags][width][.precision]type
```
where entries in square brackets are optional.
**Type specifications in placeholders**
| Type | Meaning |
|:--|:------------|
|`%s`| `string`|
|`%i`| `integer`|
|`%o`| `integer, octal (base 8)`|
|`%x, %X`| `integer, hexadecimal (base 16), digits 0-9a-f or 0-9A-F`|
|`%f`| `floating point`|
|`%e`| `floating point, scientific notation`|
|`%g`| `floating point, %f or %e as appropriate`|
: {.striped .hover}
**Flags**
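| Flag | Effect |
|:----|:-----|
|(none)| right-aligned (default) |
|Minus sign `-`| left-aligned |
|Plus sign `+`| always prints the sign, also for positive numbers |
|Zero `0`| pads with leading zeros instead of spaces |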
: {.striped .hover}
**Width**
```
Minimum number of characters used (more will be taken if necessary)
```
### Examples
```{julia}
using Printf # Load the package first
```
```{julia}
@printf("|%s|", "Hello") # string with placeholder for string
```
The vertical bars are not part of the placeholder; they indicate the output field boundaries.
```{julia}
@printf("|%10s|", "Hello") # Minimum length, right-aligned
```
```{julia}
@printf("|%-10s|", "Hello") # left-aligned
```
```{julia}
@printf("|%3s|", "Hello") # Length specification can be exceeded
# Better a badly formatted table than wrong values!
```
```{julia}
j = 123
k = 90019001
l = 3342678
@printf("j = %012i, k = %-12i, l = %12i", j, k, l) # 0-flag for leading zeros
```
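The examples so far do not use the plus flag or the non-decimal integer types. A short sketch with `@sprintf`, using the arbitrary sample value 255:
```{julia}
using Printf

num = 255
s_signed = @sprintf("%+i %+i", num, -num)       # sign is always printed
s_bases  = @sprintf("%o %x %X", num, num, num)  # octal, hex lower/upper case
s_sci    = @sprintf("%.3e", 12345.678)          # scientific notation
(s_signed, s_bases, s_sci)
```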
`@printf` and `@sprintf` can be called like functions:
```{julia}
@printf("%i %i", 22, j)
```
or as macros, i.e., without parentheses or commas:
```{julia}
@printf "%i %i" 22 j
```
`@printf` accepts a stream as an optional first argument. The remaining arguments are:
- format string with placeholders
- variables matching the placeholders in number and type
```{julia}
@printf(stderr, "First result: %i %s\nSecond result %i",
        j, "(estimated)", k)
```
The macro `@sprintf` does not print; it returns the formatted string:
```{julia}
str = @sprintf("x = %10.6f", π );
```
```{julia}
str
```
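Because `@sprintf` returns a string, its results can be collected and combined before printing. A sketch that builds aligned table rows; the names `labels` and `amounts` and their values are made up for illustration:
```{julia}
using Printf

labels  = ["alpha", "beta"]
amounts = [1.5, 22.25]

# One formatted row per pair: left-aligned label, right-aligned number
rows = [@sprintf("%-8s|%8.2f", lab, amt) for (lab, amt) in zip(labels, amounts)]
println(join(rows, "\n"))
```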
### Formatting Floating-Point Numbers
The _precision_ value specifies:
- `%f` and `%e` formats: number of decimal places
- `%g` format: maximum number of significant digits (integer part + decimal places)
```{julia}
x = 123456.7890123456
@printf("%20.4f %20.4e", x, x) # 4 decimal places
```
```{julia}
@printf("%20.7f %20.7e", x, x) # 7 decimal places
```
```{julia}
@printf("%20.7g %20.4g", x, x) # 7 and 4 digits total, respectively
```
## File Operations
Files are handled by:
- Opening $\Longrightarrow$ creation of a new _stream_ object (in addition to `stdin`, `stdout`, `stderr`)
- Reading from and writing to this _stream_
- Closing $\Longrightarrow$ detachment of the _stream_ object from the file
```
stream = open(path, mode)
```
- path: filename or path
- mode:
```
"r" read, opens at file beginning
"w" write, opens at file beginning (file is created or overwritten)
"a" append, opens to continue writing at file end
```
Let's write a file:
```{julia}
file = open("myfile.txt", "w")
```
```{julia}
@printf(file, "%10i\n", k)
```
```{julia}
println(file, " second line")
```
```{julia}
close(file)
```
Let's look at the file:
```{julia}
;cat myfile.txt
```
...and now we open it again for reading:
```{julia}
stream = open("myfile.txt", "r")
```
`readlines(stream)` returns all lines of a text file as a string vector.
`eachline(stream)` returns an iterator over the file lines.
```{julia}
n = 0
for line in eachline(stream)    # Read line by line
    n += 1
    println(n, line)            # Print with line number
end
close(stream)
```
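The append mode `"a"` and `readlines()` are not shown above. A minimal sketch, assuming a scratch file obtained from `tempname()` so that no existing file is touched:
```{julia}
path = tempname()            # scratch file, chosen for illustration

f = open(path, "w")          # create the file
println(f, "first line")
close(f)

f = open(path, "a")          # reopen and continue at the end
println(f, "appended line")
close(f)

lines = readlines(path)      # all lines as a vector of strings
rm(path)                     # clean up
lines
```
`readlines()` also accepts a path directly, in which case the file is opened and closed internally.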
## Packages for File Formats
Julia packages for various file formats include:
- [PrettyTables.jl](https://ronisbr.github.io/PrettyTables.jl/stable/): output formatted tables
- [DelimitedFiles.jl](https://docs.julialang.org/en/v1/stdlib/DelimitedFiles/): read and write matrices
- [CSV.jl](https://csv.juliadata.org/stable/): read and write CSV files
- [XLSX.jl](https://felipenoris.github.io/XLSX.jl/stable/tutorial/): read and write Excel files
and many more...
### DelimitedFiles.jl
This package offers convenient functions for saving and reading matrices using `writedlm()` and `readdlm()`.
```{julia}
using DelimitedFiles
```
Generate a 200×3 matrix of random numbers:
```{julia}
A = rand(200,3)
```
and save it:
```{julia}
f = open("data2.txt", "w")
writedlm(f, A)
close(f)
```
The written file starts like this:
```{julia}
;head data2.txt
```
Reading it back is simple:
```{julia}
B = readdlm("data2.txt")
```
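Both functions accept a delimiter character, and `readdlm()` optionally an element type. A sketch with a small hand-made matrix `M` and a scratch file from `tempname()` (both chosen for illustration):
```{julia}
using DelimitedFiles

M = [1.5 2.0; 3.25 4.0]
path = tempname()

writedlm(path, M, ',')             # comma instead of the default tab
M2 = readdlm(path, ',', Float64)   # read back as Matrix{Float64}
rm(path)

M2 == M
```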
In Julia, the `do` notation is often used for file handling (see @sec-do). The `open()` function has methods whose first argument is a function that takes the stream. This function is applied to the stream, which is then closed automatically, even if an error occurs. The `do` notation lets you define this function anonymously:
```{julia}
# file name assumed to match the example above
open("data2.txt", "w") do io
    writedlm(io, A)
end
```
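The `do` form works for reading as well, and `open()` returns whatever the block returns. A minimal sketch with made-up file content in a scratch file from `tempname()`:
```{julia}
path = tempname()

open(path, "w") do io
    println(io, "1 2 3")
    println(io, "4 5 6")
end                          # the stream is closed here

# The value of the block becomes the value of `open`
total = open(path, "r") do io
    sum(sum(parse.(Int, split(line))) for line in eachline(io))
end
rm(path)
total
```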
### CSV and DataFrames
- The CSV format provides tables readable by MS Excel and other applications.
- Example: the weather and climate database _Meteostat_.
- The [DataFrames.jl](https://dataframes.juliadata.org/stable/) package handles tabular data conveniently.
```{julia}
using CSV, DataFrames, Downloads
# Weather data from Westerland (see https://dev.meteostat.net/bulk/hourly.html)
url = "https://bulk.meteostat.net/v2/hourly/10018.csv.gz"
http_response = Downloads.download(url)
file = CSV.File(http_response, header=false);
```
The data looks like this:
```{julia}
# https://dev.meteostat.net/bulk/hourly.html#endpoints
#
# Column 1  Date
#        2  Time (hour)
#        3  Temperature
#        5  Humidity
#        6  Precipitation
#        8  Wind direction
#        9  Wind speed
df = DataFrame(file)
```
```{julia}
#| error: false
#| echo: false
#| output: false
#| eval: false
describe(df)
```
For convenient plotting and date/time handling, we load two packages:
```{julia}
using StatsPlots, Dates
```
We create a new column that combines date (from column 1) and time (from column 2):
```{julia}
# new column combining col. 1 and 2 (date & time)
df[!, :datetime] = DateTime.(df.Column1) .+ Hour.(df.Column2);
```
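The same broadcasting pattern can be tried without the downloaded data. A sketch with a few made-up dates and hours in place of the two columns:
```{julia}
using Dates

dates = [Date(2023, 9, 1), Date(2023, 9, 1), Date(2023, 9, 2)]
hours = [0, 13, 23]

# Element-wise: convert each date to a DateTime at midnight, then add the hour
stamps = DateTime.(dates) .+ Hour.(hours)
```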
```{julia}
#| error: false
#| echo: false
#| output: false
#| eval: false
@df df plot(:datetime, :Column3)
```
The resulting plot:
```{julia}
@df df plot(:datetime, [:Column9, :Column6, :Column3],
    xlims = (DateTime(2023,9,1), DateTime(2024,5,30)),
    layout=(3,1), title=["Wind" "Rain" "Temp"],
    legend=:none, size=(800,800))
```