Engineer IT: Rust

Thursday, August 20, 2020

Overflow and underflow in Rust

In mathematics, addition is usually defined over the integers, and the set of integers is infinite. Computers don't have infinite memory for every integer, so every programming language needs some way to approximate real integers as well as it can. There are two methods for this approximation:

Some (mostly interpreted) languages like JavaScript and Python have unbounded integers. (The default integer type of Python is unbounded and JavaScript has BigInt, which is also unbounded.) The problem with unbounded integers is that they can't be implemented efficiently. We have to pay for the possibility of arbitrarily large integers after (almost) every mathematical operation.
Most languages have bounded integer types with different bit lengths. Bounded integers are efficient, but they have to use modular arithmetic instead of ordinary arithmetic.
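To make "bounded" concrete, here is a minimal sketch that prints the ranges of a few of Rust's integer types and checks whether a value fits into 8 bits. The helper name fits_in_u8 is made up for this illustration.

```rust
use std::convert::TryFrom;

/// Returns true if `value` can be stored in a u8 without losing information.
/// (Hypothetical helper, only for this illustration.)
fn fits_in_u8(value: u32) -> bool {
    u8::try_from(value).is_ok()
}

fn main() {
    // Every bounded integer type has a smallest and a largest value.
    println!("u8:  {}..={}", u8::MIN, u8::MAX);
    println!("i32: {}..={}", i32::MIN, i32::MAX);
    println!("u64: {}..={}", u64::MIN, u64::MAX);

    println!("{}", fits_in_u8(255)); // true
    println!("{}", fits_in_u8(300)); // false
}
```

Anything outside the printed range cannot be represented in that type, and this is exactly where overflow and underflow come from.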

If you choose your numeric types correctly, you won't notice a difference between ordinary and modular arithmetic, but even the biggest companies make mistakes which result in overflow or underflow in commercial systems, so we shouldn't underestimate this source of errors.

Rust was designed for programming critical systems in resource-poor environments, so the designers of Rust came up with a third method for dealing with overflow and underflow: Rust has bounded integer types, but the ordinary operators on them are not supposed to wrap around silently. If your code causes an overflow or underflow, the program panics and exits. (If you need wrapping behavior, check the Wrapping struct.) Sadly, this check is only enabled by default in debug builds and not in release builds. Please try to run the following code with cargo run and with cargo run --release:


fn main() {
    let mut a : u8 = 0;
    for _i in 0..300 { 
        a += 1;
    }
    println!("{}", a);
}

Building and running the code with cargo run produces a debug binary, which you can find at target/debug/<project_name>.exe. It produces the following output:


PS D:\rust\draft\hello_cargo> cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.03s
     Running `target\debug\hello_cargo.exe`
thread 'main' panicked at 'attempt to add with overflow', src\main.rs:4:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
error: process didn't exit successfully: `target\debug\hello_cargo.exe` (exit code: 101)

Building and running the code with cargo run --release produces an optimized release binary, which you can find at target/release/<project_name>.exe. It compiles and runs without error and prints 44. (300 is congruent to 44 modulo 256.) The reason behind this is that checking for overflow and underflow after every arithmetic operation has a real cost.
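If you want the same behavior in both build modes, you can state your intent explicitly instead of relying on the += operator. The following sketch uses methods from the standard library (wrapping_add, checked_add, saturating_add and the Wrapping struct); the function names count_wrapping and count_checked are made up for this example.

```rust
use std::num::Wrapping;

/// Adds 1 to a u8 counter `steps` times with explicit wrapping arithmetic.
fn count_wrapping(steps: u32) -> u8 {
    let mut a: u8 = 0;
    for _ in 0..steps {
        a = a.wrapping_add(1);
    }
    a
}

/// Same loop, but returns None as soon as an addition would overflow.
fn count_checked(steps: u32) -> Option<u8> {
    let mut a: u8 = 0;
    for _ in 0..steps {
        a = a.checked_add(1)?;
    }
    Some(a)
}

fn main() {
    println!("{}", count_wrapping(300)); // 44
    println!("{:?}", count_checked(300)); // None
    println!("{:?}", count_checked(200)); // Some(200)
    println!("{}", 250u8.saturating_add(10)); // 255
    println!("{}", (Wrapping(250u8) + Wrapping(10u8)).0); // 4
}
```

These calls give the same result with cargo run and with cargo run --release, because the overflow behavior is part of the method and not of the build profile.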

To explore how big this cost is, we need a disassembler. I am not particularly familiar with assembly (I programmed PICs in assembly ten years ago), so if you know nothing about it, don't be afraid: we won't go too deep and you don't need any previous experience. I can recommend onlinedisassembler.com; you don't have to download anything and it has some nice features. You can start by pressing the Start Disassembling! button. After this, you have to upload your binary, but before we do that, please change the code to the following:


fn main() {
    let mut a : u8 = 0;
    for _i in 0..300 { 
        a += 39; // 39 == 0x27
    }
    println!("{}", a);
}

The generated code will contain a lot of operations where something is increased by one, so it would be hard to find this for loop unless we use some rare magic number in it. 39 is a good choice (67 would also be a good choice, there is nothing special about 39); we will find it easily in the assembly code. (This strategy can be handy if you would like to reverse engineer anything, like communication protocols of compilers. You can use some magic number which you will then find in the communication stream.) So first compile this code with the command cargo run and upload the produced binary from the debug folder.
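Before searching the disassembly, it helps to know what the magic number looks like in hexadecimal and what the release build should print. This is only a sketch, assuming that the release build wraps modulo 256 as it did in the first experiment; expected_release_output is a made-up helper name.

```rust
/// Predicts the final value of a u8 counter that starts at 0 and has `step`
/// added `iterations` times, assuming every addition wraps modulo 256.
/// (Hypothetical helper, only for this illustration.)
fn expected_release_output(step: u32, iterations: u32) -> u8 {
    ((step * iterations) % 256) as u8
}

fn main() {
    // The constants we will search for in the disassembly.
    println!("{:#x}", 39); // 0x27
    println!("{:#x}", 67); // 0x43

    // Predicted outputs of the release builds.
    println!("{}", expected_release_output(1, 300)); // 44
    println!("{}", expected_release_output(39, 300)); // 180
}
```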

You can upload it by selecting "Upload file" from the File menu and selecting your binary. After that you should press Ok, and Ok again. (You should leave everything at its default value.) The online interface is quite slow, so I recommend downloading the asm code with File/Download Disassembly and saving it into the project directory.

After opening the file, we can see that it is about 74000 lines long. Now we have to search for the number 0x27 (0x27 is the hexadecimal value of 39) and we will easily find our for loop. Or maybe not so easily, because there are 75 occurrences of 0x27. Let's check them. Luckily, there is only one line where both the add instruction and 0x27 occur. You can verify it by changing 39 to anything else and doing the disassembly again. Our for loop is somewhere around the add $0x27 instruction. (I put the comments there and exchanged the memory addresses for labels after every jump instruction.) If you need a reference for the Asm instructions, I can recommend you this.


  sub    $0xd8,%rsp          # Reducing rsp by 216. Rsp points to the top of the current stack frame
  movb   $0x0,0x5f(%rsp)     # This is indirect addressing. We move 0 to the address rsp + 0x5f, which is the address of our variable a.
  movl   $0x0,0x60(%rsp)     # The difference between movl and movb is that movb moves only 8 bits, movl moves 32 bits
  movl   $0x12c,0x64(%rsp)   # 0x12c is 300, so these 3 instructions set a to 0 and store the two ends of the range 0..300
  mov    0x60(%rsp),%ecx
  mov    0x64(%rsp),%edx
  callq  0x140001970
  mov    %eax,0x58(%rsp)
  mov    %edx,0x54(%rsp)
  mov    0x58(%rsp),%eax
  mov    %eax,0x68(%rsp)
  mov    0x54(%rsp),%ecx
  mov    %ecx,0x6c(%rsp)
Label5:                      # This is where our loop starts
  lea    0x68(%rsp),%rcx
  callq  0x1400018d0
  mov    %edx,0x74(%rsp)
  mov    %eax,0x70(%rsp)
  mov    0x70(%rsp),%eax
  mov    %eax,%ecx
  test   %rcx,%rcx
  je     Label1
  jmp    Label2
Label2:
  jmp    Label3
Label1:
  mov    0x1bbd7(%rip),%rax     
  lea    0x5f(%rsp),%rcx
  mov    %rcx,0xb8(%rsp)
  mov    0xb8(%rsp),%rcx
  mov    %rcx,0xd0(%rsp)
  lea    0x199c3(%rip),%rdx     
  mov    %rax,0x48(%rsp)
  callq  0x140001120
  mov    %rax,0x40(%rsp)
  mov    %rdx,0x38(%rsp)
  jmp    Label6
  ud2   
Label3:
  mov    0x74(%rsp),%eax
  mov    %eax,0xc4(%rsp)
  mov    %eax,0xc8(%rsp)
  mov    %eax,0xcc(%rsp)
  mov    0x5f(%rsp),%cl     # We load the value of a into %cl
  add    $0x27,%cl          # Here is our addition. It sets CF (carry flag) to one if an overflow occurs

  setb   %dl                # This instruction sets %dl to 1 if CF (carry flag) is one, otherwise to 0.
  test   $0x1,%dl           # It ANDs %dl with 1 and sets ZF (zero flag) if the result is 0, so if there was an overflow, ZF is 0.
  mov    %cl,0x37(%rsp)     # We move the modified value to a temporary memory address (rsp + 0x37)
  jne    Label4             # This instruction jumps to the error handling routine if an overflow occurred (ZF=0)
  mov    0x37(%rsp),%al     # There was no error, so we load the increased value from the temporary address into %al

  mov    %al,0x5f(%rsp)     # and we move %al back to 0x5f, the address of a
  jmpq   Label5             # This is just an unconditional jump. (Like goto.)
Label6:
  mov    0x40(%rsp),%rax
  mov    %rax,0xa8(%rsp)
  mov    0x38(%rsp),%rcx
  mov    %rcx,0xb0(%rsp)
  lea    0xa8(%rsp),%rdx
  lea    0x78(%rsp),%rcx
  mov    0x48(%rsp),%r8
  mov    %rdx,0x28(%rsp)
  mov    %r8,%rdx
  mov    $0x2,%r8d
  mov    0x28(%rsp),%r9
  movq   $0x1,0x20(%rsp)
  callq  0x140001180
  lea    0x78(%rsp),%rcx
  callq  0x140006780
  nop
  add    $0xd8,%rsp
  retq   
Label4:
  lea    0x1babb(%rip),%rcx     
  lea    0x1ba94(%rip),%r8      
  mov    $0x1c,%edx
  callq  0x140016e20
  ud2

This asm block shows that to detect overflow and underflow, we need 5 extra instructions (setb, test, mov, jne and mov, directly after the add) after every arithmetic operation. This won't cause too much trouble in a testing environment, but it's not acceptable in production for a language meant for programming hardware, so overflow checks are turned off by default in release builds.
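The check in the disassembly can be expressed in Rust itself. The sketch below is not what the compiler generates; it only mirrors the logic of the add, setb and jne instructions with overflowing_add from the standard library. The names add_like_debug_build and first_overflow are made up for this illustration.

```rust
/// Mirrors the debug-build sequence: add, capture the carry, branch on it.
/// (Hypothetical helper; the real binary calls the panic handler instead.)
fn add_like_debug_build(a: u8, step: u8) -> Result<u8, &'static str> {
    // Like `add $0x27,%cl` followed by `setb %dl`: the sum plus the carry.
    let (sum, carry) = a.overflowing_add(step);
    if carry {
        // Like `jne Label4`: leave the normal path when the carry is set.
        return Err("attempt to add with overflow");
    }
    Ok(sum)
}

/// Runs the loop and returns the zero-based iteration in which the
/// addition overflows, or None if it never does.
fn first_overflow(step: u8, iterations: u32) -> Option<u32> {
    let mut a: u8 = 0;
    for i in 0..iterations {
        match add_like_debug_build(a, step) {
            Ok(next) => a = next,
            Err(_) => return Some(i),
        }
    }
    None
}

fn main() {
    println!("{:?}", first_overflow(39, 300)); // Some(6)
    println!("{:?}", first_overflow(1, 300)); // Some(255)
    println!("{:?}", first_overflow(1, 200)); // None
}
```

With a step of 39 the counter reaches 234 after six additions, so the seventh addition is the one that would trigger the panic in the debug build.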

Monday, August 10, 2020

Starting with Rust - Hello Cargo, hello debug

In the previous post, we created the source file manually, used rustc to compile it and used Visual Studio Code (VSC) only for its beautiful colors. (We did install a Language Server, but we used it for nothing special.) These steps were good enough for a "hello world"-level program, but we should step forward.

First of all, we have to create a normal project structure, where the source code, the libraries we use and the compiled binaries have their own place. Secondly, we will need some easy-to-use build tool. (We don't really want to use the command line arguments of rustc. It's not too efficient.) And thirdly, we will need some tool which will handle our dependencies. (Again, doing that manually is neither fun nor efficient.) Luckily, Rust has Cargo. (If you are familiar with Node.js, think of Cargo as something like npm.) I won't give you a manual for Cargo (you can find it in the previous link), I will only show you the basic functions which are needed for creating a new project.

Please, navigate into your learning directory with a command line and type the following command:
cargo new hello_cargo
There is not much output on the console, but this command will save us a lot of time. It creates the whole project structure and initializes a git repository.

Now, open the newly created folder with Visual Studio Code. (File -> Open Folder)
We can see the project structure on the left pane:

The src folder contains our sources. You have to create your source files in there. The target folder contains the binaries. It's not under version control. Cargo.toml contains the metadata and dependencies of the project, and Cargo.lock records the exact versions of the resolved dependencies. (These files may be familiar to Node.js users; they have roles similar to package.json and package-lock.json.)

To continue the setup of our IDE, we first have to make our example code a little bit more complex. (Debugging a one-line function is not so interesting.) Let's replace the content of the src/main.rs file with the following code:

  
fn main() {
    println!("Hello, world!");
    let a = 10;
    let b = 20;
    let c = a+b;
    println!("{}", c);
}

The code contains three variables (for testing the watch window) and an additional println! statement. If you paste the code into VSC and you configured it correctly, you should see something like this:

The gray i32 type hints are generated by the VSC Rust extension. The type system and parameter declaration of Rust are interesting enough to deserve a post of their own, but at this point, just accept a, b and c as 32-bit signed integers.
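If you would rather not rely on the editor's hints, you can check the inferred type from code. This is a small sketch; sum_with_width is a made-up function name, and the explicit annotation on b is optional and only there to show the syntax.

```rust
/// Returns the sum of two integers together with the size of the result
/// in bytes. (Hypothetical helper, only for this illustration.)
fn sum_with_width() -> (i32, usize) {
    let a = 10; // no annotation: inferred as i32
    let b: i32 = 20; // explicit annotation, same type
    let c = a + b;
    (c, std::mem::size_of_val(&c))
}

fn main() {
    let (c, bytes) = sum_with_width();
    println!("{} is stored in {} bytes", c, bytes); // 30 is stored in 4 bytes
}
```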

For the debugger, we will need the C/C++ plugin.

After the plugin is installed, we have to compile our new code with the cargo build command. (The command must be executed in the project folder from the command line.) We can use a separate console, or we can use the console built into VSC. We can open the built-in terminal from the Terminal menu. (Terminal -> New terminal.) By default this opens a PowerShell (under Windows) in the root folder of the project. We can type the compile command (cargo build) here.
We could use cargo run, too, which compiles and runs the Rust project, but it won't debug it.

We have to enable the usage of breakpoints (if we didn't do it previously):

There is only one more boring part left before we can use the debugger: we have to set it up. This can be done by creating a new configuration with Run -> Add configuration.
Here, we have to select C++ (Windows or GDB depending on your OS) and edit the newly created launch.json. We have to modify its 11th row (the "program" key) and write the path of our binary there ("${workspaceFolder}/target/debug/hello_cargo.exe").

  
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "(Windows) Launch",
            "type": "cppvsdbg",
            "request": "launch",
            "program": "${workspaceFolder}/target/debug/hello_cargo.exe",
            "args": [],
            "stopAtEntry": false,
            "cwd": "${workspaceFolder}",
            "environment": [],
            "externalConsole": false
        }
    ]
}

After this, we can add some breakpoints to our code by clicking at the beginning of a code line, and then start debugging. (Run -> Start debugging or F5.)

After these first two posts, you are capable of writing simple Rust projects, compiling them, running them and debugging them in Visual Studio Code. The next post will be about the type system and parameter declaration of Rust. It's a relatively simple topic, but it has some unique features compared to other languages.

Sunday, August 9, 2020

Starting with Rust - Compile Hello world

This post is the first part of my Rust programming series. (See the labels.) I will use Windows 10 but everything should work on other platforms, too. The main goal of this series is not to create a new Rust tutorial (I couldn't do better than this), but to help you set up your environment, to give readers who already have solid programming knowledge a quick introduction to the language, and to show you the main ideas, weaknesses and advantages of this language.

In the first post of this series I will show you how to write a simple Hello World program in Rust and how to set up Visual Studio Code for code editing.

First, we could download the Rust compiler from its page (https://www.rust-lang.org/tools/install), but we won't. We will use the rustup command. I won't explain it in detail; please check the link. Under Linux, don't forget to run the command source $HOME/.cargo/env, too.

After the install, you can run the following programs from cmd:

The compiler is rustc.exe. After the compiler is installed, we have to create a new file called main.rs. (It is recommended to create it inside a folder called helloworld.) The file has to contain our first, classic "hello world" program:


fn main() {
    println!("Hello, world!");
}

We can compile the file with the

rustc main.rs

command, and get the main (or main.exe) file. (Of course, you can run it if you want, and you will see the long-awaited "Hello, world!" message on the console.)

Rust is quite self-explanatory at this level, so I won't analyze this code. It's more important to set up a GUI and continue learning there. In this tutorial, I will use Visual Studio Code, which is a general-purpose, cross-platform environment.

After installing Visual Studio Code (VSC), we can open the previously created folder where the main.rs file can be found.

VSC recognizes Rust out of the box and can color the source code.

Coloring is good and useful, but VSC can do much more for us if we install the Rust plugin. (It should install rust-analyzer or RLS as a dependency. If it doesn't, you should install one of them manually.)

Now, we can compile and run a simple Rust program in console and we can edit it in Visual Studio Code with a Language Server. In the next part, we will use cargo to generate a new project, compile it and we will use VSC to debug our code.