Introduction to "Safe and Secure Coding in Rust: A Comparative Analysis of Rust and C/C++"

In the domain of systems programming, ensuring code safety and security is crucial. This book delves into the programming languages used for such tasks, with a special focus on Rust—a modern language celebrated for its ability to maintain high levels of safety and security. We explore typical issues encountered in traditional languages like C or C++, and how Rust's advanced features effectively mitigate these risks.

Rust as a Safe and Secure System Language

Rust stands out as a system programming language designed with an emphasis on safety and security. It effectively addresses many common pitfalls in systems programming, such as memory errors and concurrency bugs, which are often encountered in C/C++ environments. Through its ownership model, borrowing rules, and type system, Rust ensures memory safety and thread safety, achieving these without compromising on performance.

Understanding the Safety and Security Concepts

Safety in Coding aims to prevent the system from harming people. Typical measures include validating inputs, handling errors reliably, protecting stored and transmitted data against accidental corruption with checks such as a Cyclic Redundancy Check (CRC), and avoiding hazardous coding practices.

Security in Coding focuses on protecting the system against malicious human activities. It encompasses rigorous input validation, robust error management, strong authentication and authorization mechanisms, securing data storage and transmission through Cipher-based Message Authentication Code (CMAC), and preventing issues like memory leaks and buffer overflows.

It's crucial to note the distinction between Safe and Secure Coding practices, despite their overlapping areas. Excellence in software for dependable systems requires a fusion of both approaches.
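To make the CRC mention above concrete, here is a minimal sketch of a bitwise CRC-32 (the widely used reflected variant with polynomial 0xEDB88320). The function name crc32 is chosen for illustration only. A CRC detects accidental corruption but offers no protection against deliberate tampering, which is why the security side relies on a keyed code such as CMAC instead.

```rust
/// Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and final XOR 0xFFFFFFFF).
/// Illustrative sketch: real projects would normally use a table-driven or library version.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // Shift out one bit and apply the polynomial when the dropped bit was set.
            crc = if crc & 1 != 0 {
                (crc >> 1) ^ 0xEDB8_8320
            } else {
                crc >> 1
            };
        }
    }
    !crc
}

fn main() {
    let payload = b"123456789";
    let checksum = crc32(payload);
    println!("CRC-32: {:#010X}", checksum);

    // A single flipped bit changes the checksum, so the receiver can detect the corruption.
    let mut corrupted = payload.to_vec();
    corrupted[0] ^= 0x01;
    assert_ne!(crc32(&corrupted), checksum);
}
```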

What This Book Won't Teach You

This book is dedicated to highlighting how Rust addresses safety and security concerns in system programming, but it is not an exhaustive guide to all of Rust's features. It intentionally avoids declaring Rust as superior to C/C++ or diminishing the value of other programming languages. The focus is on showing the specific problems in C/C++ programming and how Rust's design helps in preventing these issues, rather than claiming Rust as the only solution or the best language for all programming challenges.

Through a comparative analysis of Rust and C/C++, this book aims to provide readers with a clear understanding of how Rust's safety and security features can mitigate the risks associated with systems programming, offering practical insights and guidelines for adopting safer and more secure coding practices.

Contact

If you have remarks, questions, or ideas for improving the book, feel free to open an issue or pull request in the book's repository.

Happy Reading! Lukasz

Quick Rust Overview

Brief Overview of Rust

Rust is a systems programming language focused on safety, speed, and concurrency. It was originally sponsored by Mozilla, was first announced in 2010, and reached its 1.0 stable release in 2015. It aims to provide memory safety without garbage collection and concurrency without data races, relying on ownership rules that are checked at compile time rather than on runtime overhead. Its key features include:

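  • Ownership and Borrowing: Rust's ownership model enforces rules at compile time that eliminate several classes of bugs found in other systems programming languages, such as dangling pointers, double frees, and data races. Memory leaks become less likely because values are cleaned up automatically at the end of their scope, but safe Rust does not rule them out entirely.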
  • Type Safety and Inference: Rust's type system prevents null pointer dereferences and guarantees thread safety, among other safety checks. Its powerful type inference allows for concise code without sacrificing expressiveness or safety.
  • Zero-Cost Abstractions: Rust provides high-level abstractions without introducing runtime overhead. This means you can write high-level code that compiles down to low-level machine code as efficient as that written in C or C++.
  • Fearless Concurrency: Rust's ownership and type systems, along with its safe abstractions, make concurrent programming more approachable and less error-prone, enabling developers to take full advantage of modern multicore processors.
  • Ecosystem and Tooling: Rust offers a growing ecosystem with Cargo, its package manager and build system, and Crates.io, a repository of libraries (crates) that extend Rust's capabilities. Rust's tooling also includes robust documentation, format, and linting tools, making development in Rust productive and enjoyable.
  • Memory Safety Without Garbage Collection: Rust achieves memory safety without needing a garbage collector, making it suitable for performance-critical applications where controlling resource use is essential.
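As a small illustration of the type-safety point in the list above, the sketch below shows how Option replaces null: a possibly missing value has to be unpacked before it can be used. The helper names find_index and describe are hypothetical and exist only for this example.

```rust
/// Hypothetical lookup helper: returns the position of `needle`, or None when it is absent.
fn find_index(haystack: &[i32], needle: i32) -> Option<usize> {
    haystack.iter().position(|&x| x == needle)
}

/// Turns the lookup result into a message.
fn describe(haystack: &[i32], needle: i32) -> String {
    // Both cases must be handled before the index can be used;
    // there is no way to dereference a "null" result by accident.
    match find_index(haystack, needle) {
        Some(i) => format!("found {} at index {}", needle, i),
        None => format!("{} not found", needle),
    }
}

fn main() {
    let data = [10, 20, 30];
    println!("{}", describe(&data, 20));
    println!("{}", describe(&data, 99));
}
```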

Comparison with C/C++

  • Memory Safety: Unlike C and C++, Rust enforces memory safety at compile time. This means many of the common vulnerabilities in C/C++ programs, such as use-after-free errors and data races, are caught before the code is even run. Rust's compiler enforces ownership and borrowing rules that prevent use-after-free, double-free, and null dereference errors that are common in C/C++. This drastically reduces the potential for security vulnerabilities in Rust programs.
  • Concurrency: Rust's approach to concurrency is safer and more straightforward, thanks to its ownership model, which prevents data races at compile time. In contrast, C/C++ requires developers to manage synchronization primitives manually, which is error-prone.
  • Modern Tooling: Rust comes with Cargo, which simplifies dependency management, building, testing, and documentation. C/C++ has various build systems and package managers, but none are as integrated with the language ecosystem as Cargo.
  • Learning Curve: Rust has a steeper learning curve than C/C++, primarily due to its strict compiler checks and ownership model. However, these same features lead to fewer runtime errors and more reliable software.
  • Runtime Performance: Rust and C/C++ offer comparable runtime performance. Rust's zero-cost abstractions mean that, in theory, anything written in C/C++ could be written in Rust without sacrificing speed.
  • Community and Ecosystem: C/C++ has been around for decades, leading to a vast ecosystem and a wide range of applications, from operating systems to game development. Rust is newer but has seen rapid growth in its community and ecosystem, with increasing adoption in systems programming, WebAssembly, and embedded systems.
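The zero-cost abstraction claim in the list above can be illustrated with two ways of writing the same computation. This is only a sketch with hypothetical function names; it shows that both forms give the same result, and the expectation that an optimizing build compiles them to similar machine code is an assumption that should be checked on the target in question rather than taken from this example.

```rust
/// Index-based loop, written the way it might be written in C.
fn sum_of_squares_loop(values: &[u32]) -> u32 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i] * values[i];
    }
    total
}

/// Iterator chain expressing the same computation at a higher level.
fn sum_of_squares_iter(values: &[u32]) -> u32 {
    values.iter().map(|v| v * v).sum()
}

fn main() {
    let values = [1, 2, 3, 4];
    assert_eq!(sum_of_squares_loop(&values), sum_of_squares_iter(&values));
    println!("sum of squares: {}", sum_of_squares_iter(&values));
}
```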

Conclusion

Rust presents a compelling alternative to C/C++ for systems programming, offering memory safety, concurrency features, and modern tooling, all without sacrificing performance. While Rust's learning curve may be steeper due to its strict compiler and unique concepts like ownership and borrowing, the benefits in terms of safety and productivity are considerable. For new projects, especially those where safety and concurrency are critical, Rust is an excellent choice.

Rust Compiler Safety Features Overview

Rust's compiler is designed with safety as a primary goal, employing several key features to prevent common bugs and security vulnerabilities that plague systems programming. These features enforce strict compile-time checks, ensuring that only safe code gets executed unless explicitly marked otherwise. Below, we explore some of Rust's compiler safety features with examples.

Ownership and Borrowing

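Rust's unique approach to memory management is enforced at compile time through its ownership and borrowing system, which eliminates a wide array of bugs related to memory usage, such as dangling pointers, double frees, and use-after-free errors. Values are released automatically when their owner goes out of scope, which also makes leaks less likely, although safe Rust does not strictly prevent them.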

Example: Ownership 1

fn main() {
    let a: String = String::from("Hello");
    let b = a; // a's ownership is moved to b
    println!("{}", b);
    // println!("{}", a); // This line would cause a compile-time error
}
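
For comparison, the same assignment in C++ copies the string, so both a and b remain usable afterwards:

#include <iostream>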
#include <string>

int main() {
    std::string a = "hello";
    std::string b = a;  // Duplicate the data in a.
    std::cout << b << std::endl;
    std::cout << a << std::endl;
    return 0;
}

Example: Ownership 2

fn greet(name: String) {
    println!("Hello {name}")
}

fn main() {
    let name = String::from("Tom");
    greet(name);
    // greet(name);
}

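In the first example, ownership of the string a is moved to b. Attempting to use a after this point results in a compile-time error, preventing use-after-move bugs. The second example shows the same rule at a function boundary: passing name to greet moves it into the function, so the commented-out second call greet(name) would be rejected by the compiler.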
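When a value really is needed in two places, the copy has to be requested explicitly. The sketch below uses clone() for that; the helper greet_twice is hypothetical, and greet is changed to return the greeting so that the result can be inspected.

```rust
fn greet(name: String) -> String {
    format!("Hello {name}")
}

fn greet_twice(name: String) -> (String, String) {
    // clone() makes an explicit deep copy, so `name` itself is not moved by the first call.
    let first = greet(name.clone());
    // The second call moves the original; `name` cannot be used after this line.
    let second = greet(name);
    (first, second)
}

fn main() {
    let (first, second) = greet_twice(String::from("Tom"));
    println!("{first}");
    println!("{second}");
}
```

Unlike the implicit copy in the C++ version, the cost of duplicating the string is visible at the call site.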

Example: Borrowing 1

fn calculate_length(s: &String) -> usize {
    s.len()
}

fn main() {
    let a = String::from("Hello");
    let len = calculate_length(&a); // a is borrowed
    println!("The length of '{}' is {}.", a, len); // a can still be used here
}

Here, a is borrowed by calculate_length, allowing a to be used afterward because it wasn't moved but merely borrowed.

Example: Borrowing 2

fn append_world(s: &mut String) {
    s.push_str(" world"); // s is now a mutable reference, allowing us to modify the original String
}

fn main() {
    let mut a = String::from("Hello");
    append_world(&mut a); // a is mutably borrowed
    println!("The new value of 'a' is {}.", a); // a can still be used here because the mutable borrow ends at the end of the `append_world` scope
}

Here, a is mutably borrowed by append_world, allowing a to be modified inside and to be used afterward.
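The two borrowing examples follow one underlying rule: at any point a value may have either any number of shared references or exactly one mutable reference. A minimal sketch of how the compiler applies it, using a hypothetical helper total_len:

```rust
/// Sums the lengths of all strings in the slice (read-only access).
fn total_len(words: &[String]) -> usize {
    words.iter().map(|w| w.len()).sum()
}

fn main() {
    let mut words = vec![String::from("Hello"), String::from("world")];

    // Any number of shared borrows may coexist.
    let first = &words[0];
    let second = &words[1];
    println!("{} {}", first, second);

    // Mutation is allowed here because the shared borrows above are not used again.
    words.push(String::from("again"));

    // Uncommenting the next line is rejected by the compiler: `first` would then be
    // live across the push above, which may reallocate the vector's storage.
    // println!("{}", first);

    println!("total length: {}", total_len(&words));
}
```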

Lifetimes

In Rust, references have lifetimes that ensure they don't outlive the data they point to, thanks to the borrow checker. Lifetimes can be:

  • Implicit, where Rust automatically figures out the lifespan of references for you.
  • Explicit, used in complex scenarios, where you guide Rust with lifetime annotations (like 'a) to resolve ambiguities.

The compiler uses these annotations to enforce safe reference usage, preventing errors related to invalid data access. Essentially, Rust's system manages reference validity for you, stepping in only when you need to clarify lifetimes in tricky situations.
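The difference between implicit and explicit lifetimes can be seen on a single function. In this sketch (function names are hypothetical) both signatures mean the same thing; the first relies on lifetime elision, the second spells the lifetime out.

```rust
// Elided form: the compiler infers that the result borrows from `s`.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The same signature with the lifetime written explicitly.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let text = String::from("hello world");
    println!("{}", first_word(&text));
    println!("{}", first_word_explicit(&text));
}
```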

Example 0: Borrow Checker

fn main() {
    let result;                     // ---------+-- 'a
    {                               //          |
        let tmp = 42;               // -+-- 'b  |
        result = &tmp;              //  |       |
    }                               // -+       |
    println!("result: {}", result); //          |
}                                   // ---------+

In this example, the variable result is intended to have a longer lifetime, labeled 'a, extending over the entire main function. Inside a nested block, we create tmp with a shorter lifetime, 'b. We attempt to assign a reference to tmp to result. However, 'b is much shorter than 'a because tmp goes out of scope once the block ends, but result is used outside of this block.

Rust checks lifetimes at compile time and identifies that result is supposed to live longer than tmp, based on their respective scopes. Since result is a reference to tmp, which has a shorter lifespan, Rust prevents this by design, to avoid dangling references. Essentially, Rust disallows the program because the data result points to (tmp) does not exist for the entirety of result's lifetime. This ensures memory safety by preventing access to invalid or deallocated memory.
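Two common ways to satisfy the borrow checker in a situation like Example 0 are sketched below: copy the value out of the inner block instead of borrowing it, or declare the referent in a scope that outlives the reference. The function name answer is hypothetical.

```rust
fn answer() -> i32 {
    let result;
    {
        let tmp = 42;
        result = tmp; // copies the value out; no reference to `tmp` escapes the block
    }
    result
}

fn main() {
    // Alternative fix: keep the referent in the same scope as the reference.
    let tmp = 42;
    let result = &tmp;
    println!("result: {}", result);
    println!("answer: {}", answer());
}
```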

Example 1:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let string1 = String::from("abcd");
    {
        let string2 = "xyz";

        let result = longest(string1.as_str(), string2);
        println!("The longest string is {}", result);
    }
}

This function signature tells Rust that the returned reference is valid only while both input references are valid, that is, for the shorter of the two input lifetimes. The compiler can therefore check that result is never used after either string1 or string2 has gone out of scope.

Example 2:

struct User<'a> {
    username: &'a str,
}

struct Tweet<'a> {
    content: &'a str,
    author: &'a User<'a>,
}

impl<'a> Tweet<'a> {
    fn is_tweet_by_user(&self, user: &'a User) -> bool {
        self.author.username == user.username
    }
}

fn main() {
    let user = User { username: "johndoe" };
    let tweet = Tweet {
        content: "Hello, world!",
        author: &user,
    };

    if tweet.is_tweet_by_user(&user) {
        println!("This tweet is by {}", user.username);
    } else {
        println!("This tweet is not by {}", user.username);
    }
}

This example demonstrates how explicit lifetime annotations guide the Rust compiler to enforce memory safety in scenarios where relationships between data (like tweets and their authors) are managed through references. We create a User and a Tweet, then call the method is_tweet_by_user to check whether the tweet was authored by that user. The lifetime annotations let the compiler verify that the references held in Tweet remain valid whenever they are accessed.

Match Control Flow

The match control flow construct forces handling of all possible cases when used with enums, reducing the chances of bugs from unhandled cases.

Example: Match with Enums

enum Command {
    Start(String), // Contains a message
    Stop,
    Restart { delay_secs: u32 }, // Contains named fields
}

fn execute_command(command: Command) {
    match command {
        Command::Start(message) => println!("Starting: {}", message),
        Command::Stop => println!("Stopping"),
        Command::Restart { delay_secs } => println!("Restarting in {} seconds", delay_secs),
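        // No wildcard (_) arm: every variant is handled above, so adding a new variant
        // to Command makes this match fail to compile until the new case is handled.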
    }
}

fn main() {
    let start_command = Command::Start(String::from("Hackathon 2024"));
    execute_command(start_command);

    let stop_command = Command::Stop;
    execute_command(stop_command);

    let restart_command = Command::Restart { delay_secs: 5 };
    execute_command(restart_command);
}
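
The exhaustiveness check also applies when match is used as an expression. In the following sketch, which uses a hypothetical Light enum, the match lists every variant and has no wildcard arm, so adding a fourth variant would be reported as a compile-time error at this match rather than surfacing as an unhandled case at run time.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Light {
    Red,
    Yellow,
    Green,
}

/// Returns the light that follows `light` in the cycle.
fn next(light: Light) -> Light {
    // `match` is an expression: each arm yields the function's return value.
    match light {
        Light::Red => Light::Green,
        Light::Green => Light::Yellow,
        Light::Yellow => Light::Red,
    }
}

fn main() {
    let mut light = Light::Red;
    for _ in 0..3 {
        light = next(light);
        println!("{:?}", light);
    }
}
```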

Safe Concurrency

Rust's ownership and type system ensure safe concurrency, preventing data races at compile time.

Example: Safe Concurrency

use std::sync::{Arc, atomic::{AtomicUsize, Ordering}};
use std::thread;

fn main() {
    let counter = Arc::new(AtomicUsize::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter_clone = Arc::clone(&counter);
        let handle = thread::spawn(move || {
            // Safely increment the counter
            counter_clone.fetch_add(1, Ordering::Relaxed);
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Counter: {}", counter.load(Ordering::Relaxed));
}

This example uses Arc (an atomically reference-counted pointer) to share ownership of the counter across threads, and AtomicUsize to update it without a data race.
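Atomics only cover simple values. For shared data structures, a common pattern is to combine Arc with Mutex, as in the sketch below. The function collect_ids is hypothetical; the compiler would reject an attempt to push to the vector from several threads without the lock.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each record their id in a shared vector.
fn collect_ids(workers: usize) -> Vec<usize> {
    let shared = Arc::new(Mutex::new(Vec::<usize>::new()));

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // lock() returns a guard; the lock is released when the guard is dropped.
                shared.lock().unwrap().push(id);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Threads may finish in any order, so sort before returning.
    let mut ids: Vec<usize> = shared.lock().unwrap().clone();
    ids.sort();
    ids
}

fn main() {
    println!("{:?}", collect_ids(4));
}
```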

Conclusion

Rust's compiler safety features are central to its promise of safe systems programming, effectively addressing many of the pitfalls common in other languages. Through ownership, lifetimes, match statements, and safe concurrency, Rust empowers developers to write more reliable and secure code by default, catching potential errors early in the development cycle. These features, backed by a thorough compile-time checking system, make Rust an appealing choice for projects where safety and performance are paramount.

Rust example app, syntax overview

use std::fmt::{self, Display, Formatter};
use std::sync::mpsc::{self, Receiver};
use std::thread;
use std::time::{Duration, Instant};

// Define an enum to represent the status of a sensor, showcasing Rust's enum and pattern matching.
enum SensorStatus {
    Running,
    Stopped,
    Error(String),
}

// Implement the Display trait for SensorStatus to enable easy printing.
impl Display for SensorStatus {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        match self {
            SensorStatus::Running => write!(f, "Sensor is running"),
            SensorStatus::Stopped => write!(f, "Sensor is stopped"),
            SensorStatus::Error(msg) => write!(f, "Sensor error: {}", msg),
        }
    }
}

// A struct representing a vehicle temperature sensor.
struct TemperatureSensor {
    name: &'static str,
    value: i32,
}

impl TemperatureSensor {
    const MIN_TEMP: i32 = -30;
    const MAX_TEMP: i32 = 150;
}

// A struct representing a vehicle speed sensor.
struct SpeedSensor {
    name: &'static str,
    value: u32,
}

impl SpeedSensor {
    const MAX_SPEED_LIMIT: u32 = 180;
}

// Implement the Display trait for Sensors, enabling descriptive output.
impl Display for TemperatureSensor {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        write!(f, "[{}] TemperatureSensor: {}", self.name, self.value)
    }
}

impl Display for SpeedSensor {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        write!(f, "[{}] SpeedSensor: {}", self.name, self.value)
    }
}

// Define a trait for diagnostic tools, demonstrating Rust's trait system for polymorphism.
trait DiagnosticTool {
    fn diagnose(&self) -> SensorStatus;
}

// Implement the DiagnosticTool trait for Sensors, showcasing trait implementations.
impl DiagnosticTool for TemperatureSensor {
    fn diagnose(&self) -> SensorStatus {
        if self.value > TemperatureSensor::MAX_TEMP {
            SensorStatus::Error(format!("{} sensor exceeds max limit:{} over {} [C]!", self.name, self.value, (self.value - TemperatureSensor::MAX_TEMP)))
        } else if self.value < TemperatureSensor::MIN_TEMP {
            SensorStatus::Error(format!("{} sensor fell below min limit: {} is {} under [C]!", self.name, self.value, (TemperatureSensor::MIN_TEMP - self.value)))
        } else {
            SensorStatus::Running
        }
    }
}

impl DiagnosticTool for SpeedSensor {
    fn diagnose(&self) -> SensorStatus {
        if self.value > SpeedSensor::MAX_SPEED_LIMIT {
            SensorStatus::Error(format!("[{}] sensor exceeds max limit:{} over {} [km/h]!", self.name, self.value, (self.value - SpeedSensor::MAX_SPEED_LIMIT)))
        } else {
            SensorStatus::Running
        }
    }
}

// A function taking a dynamic trait object, demonstrating dynamic polymorphism.
fn run_diagnostic(tool: &dyn DiagnosticTool) -> SensorStatus {
    tool.diagnose()
}


// Define a struct for processing rear camera images, illustrating Rust's generic type parameters.
struct RearCameraImageProcessor<T> {
    data: T,
}

// Define a trait for image processing functionality.
trait ImageProcessing {
    fn process(&mut self);
}

// Implement methods for RearCameraImageProcessor, demonstrating ownership and borrowing.
impl<T: Display + Clone> RearCameraImageProcessor<T> {
    // Constructor method takes ownership of data.
    fn new(data: T) -> Self {
        RearCameraImageProcessor { data }
    }

    // Borrow self immutably to read data.
    fn read(&self) -> &T {
        &self.data
    }

    // Borrow self mutably to modify data.
    fn write(&mut self, data: T) {
        self.data = data;
    }
}

// Implement the ImageProcessing trait for RearCameraImageProcessor.
impl<T: Display + Clone> ImageProcessing for RearCameraImageProcessor<T> {
    // Sample processing method which just clones and displays the data.
    fn process(&mut self) {
        let processed_data = self.data.clone();
        println!("Processing image data: {}", processed_data);
    }
}

// Demonstrates the use of closures to modify data, showcasing Rust's closure capabilities.
fn adjust_brightness<F, T>(adjustment_closure: F, processor: &mut RearCameraImageProcessor<T>)
where
    F: Fn(T) -> T,
    T: Display + Clone,
{
    let current_data = processor.read().clone();
    println!("Current image brightness: {}", current_data);
    let adjusted_data = adjustment_closure(current_data);
    processor.write(adjusted_data);
}



fn main() {
    // Simulate sensors that continuously send data for 5 seconds.
    let (tx_temp, rx_temp): (mpsc::Sender<TemperatureSensor>, Receiver<TemperatureSensor>) = mpsc::channel();
    let (tx_speed, rx_speed): (mpsc::Sender<SpeedSensor>, Receiver<SpeedSensor>) = mpsc::channel();
    // Spawn a thread for the temp_sensor
    let tx_temp_clone = tx_temp.clone();
    let temp_handle = thread::spawn(move || {
        let start = Instant::now();
        while start.elapsed() < Duration::new(5, 0) {
            let value = (start.elapsed().as_secs() * 10) as i32; // Simulate increasing temp
            let sensor = TemperatureSensor { name: "Engine Temperature", value: value };
            if let Err(e) = tx_temp_clone.send(sensor) {
                eprintln!("Error sending temp data: {}", e);
                break;
            }
            thread::sleep(Duration::from_millis(500)); // Simulate data sent every 500ms
        }
    });
    // Spawn a thread for the car_speed_sensor
    let tx_speed_clone = tx_speed.clone();
    let speed_handle = thread::spawn(move || {
        let start = Instant::now();
        let mut factor: u64 = 1;
        while start.elapsed() < Duration::new(5, 0) {
            let value = (start.elapsed().as_secs() * 1 * factor) as u32; // Simulate increasing speed
            let sensor = SpeedSensor { name: "Car Speed", value: value };
            if let Err(e) = tx_speed_clone.send(sensor) {
                eprintln!("Error sending car speed data: {}", e);
                break;
            }
            factor = factor + 1;
            thread::sleep(Duration::from_millis(100)); // Simulate data sent every 100ms
        }
    });

    // Main thread acts as a diagnostic tool that processes sensors data
    let start = Instant::now();
    while start.elapsed() < Duration::new(5, 0) {
        if let Ok(sensor) = rx_temp.try_recv() {
            println!("Received {}: {}", sensor.name, sensor.value);
            // Here you can also run diagnostics on the received sensor data
            let status = run_diagnostic(&sensor);
            println!("Diagnostic result: {}", status);
        }
        if let Ok(sensor) = rx_speed.try_recv() {
            println!("Received {}: {}", sensor.name, sensor.value);
            // Here you can also run diagnostics on the received sensor data
            let status = run_diagnostic(&sensor);
            println!("Diagnostic result: {}", status);
        }
        // Simulate processing other tasks in the main thread
        thread::sleep(Duration::from_millis(100));
    }

    // Instantiate a RearCameraImageProcessor and demonstrate processing.
    let mut camera_processor = RearCameraImageProcessor::new("Initial image data".to_string());
    camera_processor.process();

    // Use a closure to adjust the brightness of the image data.
    adjust_brightness(|data| format!("{} + brightness adjusted", data), &mut camera_processor);
    println!("After adjustment: {}", camera_processor.read());

    // Join threads here
    if let Err(e) = temp_handle.join() {
        eprintln!("Error joining temperature sensor thread: {:?}", e);
    }

    if let Err(e) = speed_handle.join() {
        eprintln!("Error joining speed sensor thread: {:?}", e);
    }

    println!("Simulation completed.");
}

Memory Safety in Rust

The issue of memory safety in C++ poses significant risks, often being a root cause for numerous security vulnerabilities. This concern has even prompted advisories from entities like the U.S. government, recommending caution or alternatives when considering C++ for new projects. For C++ to maintain its relevance and applicability in modern software development, it's crucial that the language evolves to offer memory safety guarantees by default. Implementing mechanisms for explicit control over low-level operations could preserve the language's flexibility for specific use cases where direct memory manipulation is necessary.

In this context, Rust emerges as a compelling solution. Designed with memory safety as a foundational principle, Rust eliminates many common pitfalls associated with C++ development, such as use-after-free, buffer overflows, and data races. Rust achieves this through its ownership model, borrow checker, and lifetimes, which together enforce memory safety at compile time. This not only minimizes security vulnerabilities but also alleviates the burden on developers to manually manage memory safety, allowing them to focus on the logic and performance of their applications. By providing these safety guarantees without sacrificing the low-level control and performance critical in systems programming, Rust presents a viable, modern alternative for projects where security and reliability are paramount.
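Not every check happens at compile time. For buffer overflows in particular, safe Rust relies on bounds checks at run time: get returns an Option, and indexing with an out-of-range run-time index panics instead of reading adjacent memory. A minimal sketch, with a hypothetical helper read_at:

```rust
/// Returns the byte at `index`, or None when the index is out of bounds.
fn read_at(buffer: &[u8], index: usize) -> Option<u8> {
    buffer.get(index).copied()
}

fn main() {
    let buffer = [1u8, 2, 3, 4];

    println!("{:?}", read_at(&buffer, 2)); // within bounds
    println!("{:?}", read_at(&buffer, 10)); // out of bounds, reported as None

    // Indexing with an out-of-range run-time index (for example `buffer[i]` with i == 10)
    // would stop the program with a panic rather than read memory outside the buffer.
}
```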

In the next chapter, I compare C++ and Rust on specific memory safety issues.

Undefined Behavior

Example 1 - Writing to unallocated memory

  • CPP
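    • The std::transform call that writes through dst.begin() leads to undefined behavior: reserve only allocates capacity and does not create any elements, so dst is still empty and the algorithm writes past the vector's size. Even if the writes appear to succeed, dst.size() remains 0 and the returned vector is empty.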
#include <iostream>
#include <algorithm>
#include <vector>
#include <cmath>

auto f(const std::vector<int>& src) -> std::vector<int> {
    std::vector<int> dst;
    dst.reserve(src.size()); // Reserve space to avoid reallocations
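    // Correct alternative (commented out): std::back_inserter appends to dst and
    // grows it as it writes. It requires #include <iterator>.
    // std::transform(src.begin(), src.end(), std::back_inserter(dst), [](int i) {
    //     return std::pow(i, 2);
    // });
    // Undefined behavior: dst is still empty here, so this writes past its size.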
    std::transform(src.begin(), src.end(), dst.begin(), [](int i) {
        return std::pow(i, 2);
    });
    return dst;
}

int main() {
    std::vector<int> vec = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    auto res = f(vec);
    for (const auto& v : res) {
        std::cout << v << " ";
    }
    std::cout << std::endl;
    return 0;
}
  • RUST
    • Rust's compiler prevents the kind of mistake seen in the C++ example. For educational purposes, the attempt below mirrors the misuse of std::transform with an uninitialized destination; it does not compile, because dst is used before it is initialized (error E0381).
    • Safe Rust does not allow access to uninitialized memory or writes outside a buffer. Use of an uninitialized variable is rejected at compile time, and operations on collections like vectors are bounds-checked, so they cannot silently touch memory outside the allocation.
fn f(src: &Vec<i32>) -> Vec<i32> {
    let mut dst: Vec<i32>; // declared but never initialized: using it below is a compile-time error
    // Fix: let mut dst: Vec<i32> = Vec::with_capacity(src.len());
    src.iter()
        .map(|&x| x.pow(2))
        .for_each(|xx| dst.push(xx));
    dst
}

pub fn main() {
    let src = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    let res = f(&src);
    println!("{:?}", res);
}

To prevent the problem in C++, the C++ Core Guidelines recommend not declaring a variable before there is a value to initialize it with. An idiomatic way to write this code in C++23 would be:

#include <iostream>
#include <ranges>
#include <vector>
#include <cmath>

using namespace std;

auto f(const std::vector<int>& src) -> std::vector<int> {
    return src | views::transform([](int v) { return pow(v, 2); }) | ranges::to<std::vector<int>>();
}

int main() {
    std::vector<int> vec = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    auto res = f(vec);
    for (const auto& v : res) {
        std::cout << v << " ";
    }
    std::cout << std::endl;
    return 0;
}

This style is also nicer in Rust, because it does not require dst to be mutable and is shorter and more succinct:

fn f(src: &Vec<i32>) -> Vec<i32> {
    src.iter().map(|&x| x.pow(2)).collect()
}

pub fn main() {
    let src = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    let res = f(&src);
    println!("{:?}", res);
}

Example 2 - Object Slicing and Polymorphism

  • CPP Object slicing occurs when an object of a derived class is assigned to a base class object, leading to the loss of the derived part of the object. This can be particularly insidious in C++ because it often doesn't prevent compilation, but it can lead to unexpected runtime behavior.
#include <iostream>
#include <memory>
#include <vector>

class Base {
public:
    virtual void print() const {
        std::cout << "Base class" << std::endl;
    }
    virtual ~Base() = default;
};

class Derived : public Base {
public:
    void print() const override {
        std::cout << "Derived class" << std::endl;
    }
};

void process(const Base& obj) {
    obj.print(); // Polymorphic call
}

int main() {
    Derived d;
    process(d); // Expected polymorphic behavior

    std::vector<Base> vec;
    vec.push_back(d); // Object slicing here
    vec.back().print(); // Calls Base::print() instead of Derived::print(), due to object slicing

    std::vector<std::unique_ptr<Base>> ptr_vec; // distinct name: 'vec' is already declared above
    ptr_vec.push_back(std::make_unique<Derived>(d)); // No object slicing, polymorphism preserved
    ptr_vec.back()->print(); // Calls Derived::print(), preserving polymorphism

    return 0;
}
  • RUST In Rust, polymorphism without object slicing is achieved with trait objects or generics. Trait objects are stored behind pointers such as Box, which enables dynamic dispatch while keeping the whole value intact.
use std::fmt::Debug;

trait Printable: Debug {
    fn print(&self);
}

#[derive(Debug)]
struct Base;
impl Printable for Base {
    fn print(&self) {
        println!("Base struct");
    }
}

#[derive(Debug)]
struct Derived;
impl Printable for Derived {
    fn print(&self) {
        println!("Derived struct");
    }
}

fn process(obj: &dyn Printable) {
    obj.print(); // Polymorphic call
}

pub fn main() {
    let d = Derived;
    process(&d); // Polymorphism in action

    let mut vec: Vec<Box<dyn Printable>> = Vec::new();
    vec.push(Box::new(d)); // No object slicing, dynamic dispatch works

    vec[0].print(); // Calls Derived::print(), preserving polymorphism

    // Using the debug trait to illustrate that the whole object is preserved
    println!("{:?}", vec[0]);
}
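
The text above also mentions generics as an alternative to trait objects. The sketch below shows that variant with static dispatch. It redefines a simplified Printable trait with a describe method that returns a String; that method is an assumption made here so the result can be checked, and it is not part of the example above.

```rust
trait Printable {
    fn describe(&self) -> String;
}

struct Base;
struct Derived;

impl Printable for Base {
    fn describe(&self) -> String {
        String::from("Base struct")
    }
}

impl Printable for Derived {
    fn describe(&self) -> String {
        String::from("Derived struct")
    }
}

// Generic version: the compiler generates one copy of `process` per concrete type
// (monomorphization), so the call is resolved at compile time.
fn process<T: Printable>(obj: &T) -> String {
    obj.describe()
}

fn main() {
    println!("{}", process(&Base));
    println!("{}", process(&Derived));
}
```

The trade-off is that a generic function cannot be used to store mixed types in one Vec; that case still calls for Box<dyn Printable>.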

Example 3 - Lambdas - Dangling Reference

  • CPP
    • Capturing local variables by reference in a lambda that outlives the scope of those variables typically leads to a dangling reference. Accessing a dangling reference is undefined behavior because the variable it refers to no longer exists.
#include <iostream>

auto f() {
    int v = 42;
    return [&]() {
    //return [=]() mutable {
        v += 100;
        return v;
    };
}

int main() {
    auto res = f();
    std::cout << res() << std::endl;
    return 0;
}
  • RUST
    • Rust's design inherently prevents the creation of dangling references or captures by ensuring that any captured variables live as long as the closure itself. This is achieved through Rust's ownership and borrowing rules, which enforce compile-time checks for lifetimes and ownership.
fn f() -> impl Fn() -> i32 {
    let v = 42;
    || {
        // This will not compile because `v` does not live long enough
        v + 100
    }
}

// Rust forces us to capture the variable by value, using `move`:
// fn f() -> impl Fn() -> i32 {
//     let v = 42;
//     move || {
//         v + 100
//     }
// }

pub fn main() {
    let res = f();
    println!("{}", res());
}
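
The commented-out C++ variant `[=]() mutable` modifies its own copy of v on every call. A minimal Rust sketch of the same idea follows; make_counter is a hypothetical name. Because the closure mutates its captured state, it is returned as FnMut rather than Fn.

```rust
fn make_counter() -> impl FnMut() -> i32 {
    let mut v = 42;
    // `move` copies `v` into the closure, so nothing refers to the stack frame of
    // `make_counter` after it returns.
    move || {
        v += 100;
        v
    }
}

fn main() {
    let mut counter = make_counter();
    println!("{}", counter()); // prints 142
    println!("{}", counter()); // prints 242
}
```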

Example 4 - Dangling iterators

  • CPP
    • Erasing elements from a container (e.g., using erase method) invalidates iterators pointing to the erased elements and potentially beyond, depending on the container type.
#include <iostream>
#include <vector>
#include <string>

int main() {
    std::vector<int> v = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    auto it_beg = v.begin();
    auto it = v.begin() + 4;
    auto it_last = v.end();       // past-the-end iterator: dereferencing it is undefined behavior even before any erase
    v.erase(it);                  // 'it' and every iterator after it, including end(), are invalidated
    std::cout << "1) it_beg: "<< *it_beg << " it: " << *it << " it_last: " << *it_last << std::endl;  // Dereferencing 'it' and 'it_last' is undefined behavior
    v.erase(it_beg);              // all iterators into 'v' are now invalidated
    std::cout << "2) it_beg: "<< *it_beg << " it: " << *it << " it_last: " << *it_last << std::endl;  // Dereferencing any of them is undefined behavior
    v.erase(it);                  // erasing through an invalidated iterator is undefined behavior
    std::cout << "3) it_beg: "<< *it_beg << " it: " << *it << " it_last: " << *it_last << std::endl;  // Undefined behavior
    // v.erase(it);
    // std::cout << "4) it_beg: "<< *it_beg << " it: " << *it << " it_last: " << *it_last << std::endl;  // Undefined behavior
    // v.erase(it);
    // std::cout << "5) it_beg: "<< *it_beg << " it: " << *it << " it_last: " << *it_last << std::endl;  // Undefined behavior
    // v.erase(it);
    // std::cout << "6) it_beg: "<< *it_beg << " it: " << *it << " it_last: " << *it_last << std::endl;  // Undefined behavior
    // v.erase(it);
    // std::cout << "7) it_beg: "<< *it_beg << " it: " << *it << " it_last: " << *it_last << std::endl;  // Undefined behavior
    return 0;
}
  • RUST
pub fn main() {
    let mut v = vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9];

    v.remove(4); // This is analogous to v.erase(it) in C++

    // Safe accesses using `get`
    println!("1) it_beg: {:?}, it: {:?}, it_last: {:?}", v.get(0), v.get(4), v.get(v.len()));

    v.remove(0); // Removes the first element, shifting all others left
    println!("2) it_beg: {:?}, it: {:?}, it_last: {:?}", v.get(0), v.get(4), v.get(v.len()));

    // Remove the element at index 4 only if it still exists.
    // Direct indexing (`v[4]`) would panic on a missing index, while `get` returns `None`.
    match v.get(4) {
        Some(_) => {
            v.remove(4); // Safe: the element exists
            println!("Element at index 4 removed");
        },
        None => println!("No element at index 4, cannot remove"),
    }

    println!("3) it_beg: {:?}, it: {:?}, it_last: {:?}", v.get(0), v.get(4), v.get(v.len()));

    //v.remove(9); // Would panic: index 9 is out of bounds for the shortened vector
}
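
The Rust version above sidesteps iterators by using indices. As a hedged sketch of what happens when an iterator is actually held during a removal, the following shows the rejected pattern as comments and one possible alternative using Vec::retain. The function name remove_evens is an assumption made for illustration.

```rust
// Hypothetical helper: removes every even number, keeping the rest in order.
fn remove_evens(v: &mut Vec<i32>) {
    // The direct translation of the C++ bug does not compile:
    //
    // for x in v.iter() {
    //     if x % 2 == 0 {
    //         v.remove(0); // rejected: `*v` is already borrowed by the iterator
    //     }
    // }
    //
    // `retain` performs the removal without exposing any iterator
    // that could be invalidated.
    v.retain(|x| x % 2 != 0);
}

fn main() {
    let mut v = vec![0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
    remove_evens(&mut v);
    println!("{:?}", v);
}
```

The borrow checker treats the live iterator as a shared borrow of the vector, so any mutation while it is in use is a compile-time error rather than undefined behavior.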

Example 5 - Maybe not undefined but weird std::map operator [] behavior

GODBOLT

  • CPP
    • When you use the indexing operator ([]) on a std::map in C++ to access an element by its key, and if that key does not exist in the map, a new element with that key will be automatically created and initialized to its default value.
#include <iostream>
#include <map>
#include <string>

// using EngineConfigMap = std::map<std::string, int>;

// class EngineController {

// public:
//     EngineController(const EngineConfigMap& config)
//     : m_config(config)
//     {
//         std::cout << "Initializing with default config: \n"
//                 << "max_temp: " << m_config["timeout"] << std::endl
//                 << "max_rpm: " << m_config["max_rpm"] << std::endl;
//     }
// private:
//     EngineConfigMap m_config;
// };

int main() {
    std::map<std::string, int> ids_map;
    ids_map["id1"] = 12;
    std::cout << ids_map["id2"] << std::endl;

    // EngineConfigMap empty_config;
    // EngineController ec(empty_config);

    return 0;
}
  • RUST
    • If "id2" does not exist in ids_map, it will be inserted with a default value of 0, and then the value is printed. This approach is idiomatic in Rust and provides a safe and explicit way to handle potential missing keys in HashMaps.
use std::collections::HashMap;

pub fn main() {
    let mut ids_map = HashMap::new();
    ids_map.insert("id1".to_string(), 12);

    // Using entry() and or_insert() to insert a default value for "id2" if it doesn't exist
    let id2_value = ids_map.entry("id2".to_string()).or_insert(0);
    println!("{}", id2_value);
}
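
When a lookup should not modify the map at all, HashMap::get is an option. The sketch below assumes a small helper named lookup, which is not part of the original example; it returns None for a missing key instead of inserting a default.

```rust
use std::collections::HashMap;

// Hypothetical helper: looks up a key without modifying the map.
fn lookup(map: &HashMap<String, i32>, key: &str) -> Option<i32> {
    // `get` returns `Option<&i32>`; `copied` turns it into `Option<i32>`.
    map.get(key).copied()
}

fn main() {
    let mut ids_map = HashMap::new();
    ids_map.insert("id1".to_string(), 12);

    println!("{:?}", lookup(&ids_map, "id1"));
    println!("{:?}", lookup(&ids_map, "id2"));
    // The failed lookup did not add an entry.
    println!("entries: {}", ids_map.len());
}
```

Because the map is passed as a shared reference, the compiler guarantees that the lookup cannot insert anything as a side effect.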

Buffer Overflow

Example 1:

GODBOLT

  • C - A classic buffer overflow using an array in C. This code compiles, but accessing arr[10] is undefined behavior, leading to a potential security vulnerability.
#include <stdio.h>

int main() {
    int arr[5] = {1, 2, 3, 4, 5};
    // Accidental buffer overflow
    arr[10] = 10;
    printf("%d\n", arr[10]);
    return 0;
}
  • CPP - Undefined behavior as well, because std::array::operator[] performs no bounds check
#include <iostream>
#include <array>

int main() {
    std::array<int, 5> arr = {1, 2, 3, 4, 5};
    // Accidental buffer overflow
    arr[10] = 10;
    std::cout << arr[10] << std::endl;
    return 0;
}

// Compilation error: std::get<10> checks the index at compile time, similar to the Rust compiler
int main() {
    std::array<int, 5> arr = {1, 2, 3, 4, 5};
    // Compiler error due to out of bounds access
    std::get<10>(arr) = 10;
    std::cout << std::get<10>(arr) << std::endl;
    return 0;
}
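  • RUST - This code will not compile: the index is a constant, so the compiler can prove that the access is out of bounds for the fixed-size array and rejects it. Indices that are only known at run time are bounds-checked at run time instead.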
fn main() {
    let mut arr = [1, 2, 3, 4, 5];
    // Compile-time error for out-of-bounds access
    arr[10] = 10;
    println!("{}", arr[10]);
}

Example 2:

GODBOLT

  • CPP - Attempting to print 10 elements from a 5-element array. This loop goes beyond the array's bounds, leading to undefined behavior, which could cause crashes or unpredictable outputs.
#include <iostream>
#include <array>

int main() {
    std::array<int, 5> arr = {1, 2, 3, 4, 5};
    for (int i = 0; i < 10; i++) {
        std::cout << arr[i] << std::endl;
    }
    return 0;
}

 // Runtime error: at() throws std::out_of_range, similar to a panic in Rust.
 int main() {
     std::array<int, 5> arr = {1, 2, 3, 4, 5};
     // Runtime error due to out of bounds access
     for (auto i = 0; i < 10; i++)
         std::cout << arr.at(i) << std::endl;
     return 0;
 }
  • RUST - Attempting to access arr[5] results in a runtime error because the index 5 is out-of-bounds for an array of length 5. Rust's safety mechanisms detect this at runtime and cause the program to panic, preventing it from continuing with invalid memory access.
fn main() {
    let arr = [1, 2, 3, 4, 5];
    // Causes the program to panic for out-of-bounds access
    for i in 0..=5 {
        println!("{}", arr[i]);
    }
}
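
Where a panic is not the desired reaction, the checked accessor get lets the caller decide what to do with an out-of-range index. The following is a minimal sketch; the function name sum_first is an assumption made for illustration.

```rust
// Hypothetical helper: sums the first `count` elements,
// stopping early if the slice is shorter than requested.
fn sum_first(arr: &[i32], count: usize) -> i32 {
    let mut total = 0;
    for i in 0..count {
        match arr.get(i) {
            Some(value) => total += value,
            None => break, // index past the end: no panic, no invalid read
        }
    }
    total
}

fn main() {
    let arr = [1, 2, 3, 4, 5];
    // Asking for 10 elements from a 5-element array is handled gracefully.
    println!("{}", sum_first(&arr, 10));
}
```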

Example 3: strncpy overflow

GODBOLT

  • CPP
#include <iostream>
#include <cstring>
#include <array>

int main() {
    std::array<char, 10> buf;
    const char* input = "This is way too long for the buffer";

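    // Bug: the limit is the length of the source, not the size of buf,
    // so strncpy writes far past the end of the 10-byte buffer
    strncpy(buf.data(), input, strlen(input));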
    std::cout << buf.data() << std::endl;

    return 0;
}
  • RUST
fn main() {
    let mut buf = [0u8; 10];
    let input = "This is way too long for the buffer".as_bytes();

    // Panics at runtime: copy_from_slice requires both slices to have the same length
    buf.copy_from_slice(&input[..input.len()]);
    println!("{:?}", &buf);
}
//https://doc.rust-lang.org/src/core/slice/mod.rs.html#3648-3650
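
The Rust version above panics instead of overflowing, because copy_from_slice refuses slices of different lengths. If truncation is the intended behavior, one possible sketch looks like this; the function name copy_truncated is an assumption, not a standard library API.

```rust
// Hypothetical helper: copies as many bytes as fit into `buf`
// and returns how many bytes were copied.
fn copy_truncated(buf: &mut [u8], input: &[u8]) -> usize {
    let n = buf.len().min(input.len());
    // Both sub-slices have length `n`, so `copy_from_slice` cannot panic.
    buf[..n].copy_from_slice(&input[..n]);
    n
}

fn main() {
    let mut buf = [0u8; 10];
    let input = "This is way too long for the buffer".as_bytes();

    let n = copy_truncated(&mut buf, input);
    println!("{}", String::from_utf8_lossy(&buf[..n]));
}
```

The length to copy is derived from both slices, so the destination size can never be exceeded.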

Example 4: Buffer overflow- undefined behavior

GODBOLT

  • CPP
#include <iostream>
#include <cstring>
#include <array>

constexpr size_t k_bufSize = 5;
const std::array<char, k_bufSize> buf = {'A', 'B', 'C', 'D', 'E'};

bool exists_in_buffer(char v)
{
    // returns true in one of the first 5 iterations, or hits UB at i == 5 due to out-of-bounds access
    for (auto i = 0; i <= k_bufSize; ++i) {
        if (buf[i] == v)
            return true;
    }

    return false;
}


int main() {
    std::cout << exists_in_buffer('\0') << std::endl;
    return 0;
}
  • RUST
const K_BUF_SIZE: usize = 5;
const BUF: [char; K_BUF_SIZE] = ['A', 'B', 'C', 'D', 'E'];

fn exists_in_buffer(v: char) -> bool {
    // Same off-by-one as the C++ version: the last index is out of bounds and panics
    for i in 0..K_BUF_SIZE+1 {
        if BUF[i] == v {
            return true;
        }
    }
    false
}

pub fn main() {
    println!("{}", exists_in_buffer('\0'));
}

This is an example of Rust’s memory safety principles in action. In many low-level languages, this kind of check is not done, and when you provide an incorrect index, invalid memory can be accessed. Rust protects you against this kind of error by immediately exiting instead of allowing the memory access and continuing.
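
The off-by-one error itself can be avoided by not writing the index arithmetic at all. As a small sketch, assuming the same buffer contents as above, the slice method contains searches the elements without any manual bounds:

```rust
const BUF: [char; 5] = ['A', 'B', 'C', 'D', 'E'];

fn exists_in_buffer(v: char) -> bool {
    // No index, so there is no bound to get wrong.
    BUF.contains(&v)
}

fn main() {
    println!("{}", exists_in_buffer('C'));
    println!("{}", exists_in_buffer('\0'));
}
```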

Example 5: std::io

GODBOLT

  • CPP
#include <iostream>

int main() {
    char buffer[10];
    std::cin >> buffer;
    std::cout << "Input: " << buffer << std::endl;
    return 0;
}
  • RUST
use std::io::{self, Read};

fn main() -> io::Result<()> {
    let mut buffer = [0; 10];
    let n = io::stdin().read(&mut buffer)?;
    println!("Input: {}", String::from_utf8_lossy(&buffer[..n]));
    Ok(())
}

//fn main() -> io::Result<()> {
   //let mut input = String::new();
   //io::stdin().read_line(&mut input)?;
   //// Trim the newline from the end of the input
   //let input = input.trim_end();
   //println!("Input: {}", input);
   //Ok(())
//}

Rust's standard library prevents a buffer overflow here: read fills at most buffer.len() bytes and returns the number of bytes actually read.

Example 6: DLT daemon buffer overflow issue

The use of fscanf(handle, "%s", str1) without specifying a limit for the number of characters to read into str1 and apid/ctid arrays poses a risk of buffer overflow. If the file contains a string longer than the buffer size (DLT_COMMON_BUFFER_LENGTH and DLT_ID_SIZE), it will overflow, leading to undefined behavior, which can be exploited.

  • C
DltReturnValue dlt_filter_load(DltFilter *filter, const char *filename, int verbose)
{
    if ((filter == NULL) || (filename == NULL))
        return DLT_RETURN_WRONG_PARAMETER;
    FILE *handle;
    char str1[DLT_COMMON_BUFFER_LENGTH];
    char apid[DLT_ID_SIZE], ctid[DLT_ID_SIZE];
    PRINT_FUNCTION_VERBOSE(verbose);
    handle = fopen(filename, "r");
    if (handle == NULL) {
        dlt_vlog(LOG_WARNING, "Filter file %s cannot be opened!\n", filename);
        return DLT_RETURN_ERROR;
    }
    /* Reset filters */
    filter->counter = 0;
    while (!feof(handle)) {
        str1[0] = 0;

        if (fscanf(handle, "%s", str1) != 1)
            break;

        if (str1[0] == 0)
            break;
        printf(" %s", str1);
        if (strcmp(str1, "----") == 0)
            dlt_set_id(apid, "");
        else
            dlt_set_id(apid, str1);

        str1[0] = 0;

        if (fscanf(handle, "%s", str1) != 1)
            break;

        if (str1[0] == 0)
            break;
        printf(" %s\r\n", str1);
        if (strcmp(str1, "----") == 0)
            dlt_set_id(ctid, "");
        else
            dlt_set_id(ctid, str1);
        if (filter->counter < DLT_FILTER_MAX) {
            dlt_filter_add(filter, apid, ctid, verbose);
        }
        else {
            dlt_vlog(LOG_WARNING,
                     "Maximum number (%d) of allowed filters reached, ignoring rest of filters!\n",
                     DLT_FILTER_MAX);
        }
    }
    fclose(handle);
    return DLT_RETURN_OK;
}
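
For comparison, here is a minimal Rust sketch of the same parsing loop. It is not taken from the DLT project: the names parse_filters, ID_SIZE and FILTER_MAX and their values are assumptions made for illustration, and the input is passed as a string (which could come from std::fs::read_to_string) to keep the example self-contained.

```rust
// Hypothetical limits for illustration; the real DLT constants may differ.
const ID_SIZE: usize = 4;
const FILTER_MAX: usize = 30;

// Parses whitespace-separated "apid ctid" pairs; "----" means "no ID".
fn parse_filters(contents: &str) -> Vec<(String, String)> {
    // Tokens are borrowed string slices of any length, so there is no
    // fixed-size buffer to overflow. Each ID is cut to ID_SIZE characters.
    let to_id = |token: &str| -> String {
        if token == "----" {
            String::new()
        } else {
            token.chars().take(ID_SIZE).collect()
        }
    };

    let mut filters = Vec::new();
    let mut tokens = contents.split_whitespace();
    while let (Some(apid), Some(ctid)) = (tokens.next(), tokens.next()) {
        if filters.len() >= FILTER_MAX {
            eprintln!("Maximum number ({}) of allowed filters reached", FILTER_MAX);
            break;
        }
        filters.push((to_id(apid), to_id(ctid)));
    }
    filters
}

fn main() {
    let contents = "APP1 CTX1\n---- AVERYLONGCONTEXTIDENTIFIER\n";
    for (apid, ctid) in parse_filters(contents) {
        println!("apid={:?} ctid={:?}", apid, ctid);
    }
}
```

An overlong token is truncated explicitly instead of silently overwriting adjacent memory.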

Use After Free types of bugs

"Use after Free" (UAF) is a memory corruption issue where a program tries to access memory that has already been freed. This typically happens due to programming errors when referencing memory that was deallocated. It can lead to crashes, data corruption, or even security vulnerabilities. Detection is challenging, and mitigation involves proper memory management practices and programming language features that enforce memory safety, such as Rust's ownership system.

Example 1: Use After Free

GODBOLT

  • CPP (raw pointer) - This code compiles but leads to undefined behavior by accessing memory that has been freed.
#include <iostream>

int main() {
    int *ptr = new int(10);
    delete ptr;
    // Undefined behavior: use after free
    std::cout << *ptr << std::endl;
    return 0;
}
  • CPP - Using std::unique_ptr prevents this kind of UAF
#include <iostream>
#include <memory>

int main() {
    auto ptr = std::make_unique<int>(10);
    // No manual delete is needed: the memory is freed when ptr goes out of scope, so this use after free cannot happen.
    std::cout << *ptr << std::endl;
    *ptr = 11;
    std::cout << *ptr << std::endl;
    return 0;
}
  • RUST
fn main() {
    let ptr = Box::new(10);
    // `ptr` owns the heap allocation and is valid for the whole of `main`
    println!("{}", ptr);
    // Memory is automatically freed when `ptr` goes out of scope at the closing brace.
    // Using `ptr` after it has been dropped or moved is a compile-time error,
    // so Rust's ownership system prevents "use after free" by design
}
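
To make the comparison with delete more direct, the following sketch frees the allocation explicitly with drop. The function name read_then_free is an assumption made for illustration; the rejected line is left as a comment so that the example compiles.

```rust
// Hypothetical helper: reads a boxed value, then frees the box early.
fn read_then_free() -> i32 {
    let ptr = Box::new(10);
    let value = *ptr; // copy the integer out while the box is alive
    drop(ptr);        // frees the heap allocation now, similar to `delete ptr`

    // println!("{}", ptr); // does not compile: `ptr` was moved into `drop`
    value
}

fn main() {
    println!("{}", read_then_free());
}
```

Calling drop moves the box into the function, so any later use of ptr is rejected at compile time instead of reading freed memory.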

Example 2: Memory bounds, dangling pointer or even use after free

GODBOLT

  • CPP
#include <iostream>
#include <vector>
#include <optional>
#include <stdexcept>

// Function to modify the vector by adding a new value
void modify(std::vector<int>& vec, int value) {
    vec.push_back(value);
}


// Function to safely get a value from the vector by index, returns std::optional
std::optional<int> get(const std::vector<int>& vec, size_t index) {
    if (index < vec.size()) { // You need to write THIS
        return vec[index];
    } else {
        return std::nullopt;
    }
}


int unsafe_get(const std::vector<int>& vec, size_t index) {
    return vec[index];
}

int safe_get(const std::vector<int>& vec, size_t index) {
    return vec.at(index);
}

int main() {
    std::vector<int> data = {1, 2, 3};

    //// 4) Get an internal data reference
    // auto &x = data[0];
    // //auto *x = &data[0];
    // std::cout << "x(1) = " << x << ", &data[0] = " << &data[0] << std::endl;
    //// `push_back` may reallocate the backing storage of `data`, leaving `x` dangling (use after free). C++ compiles this without complaint.
    // data.push_back(4);

    // 1) Modifying the vector
    modify(data, 10);
    std::cout << "Modified data: ";
    for (auto& n : data) {
        std::cout << n << " ";
    }
    std::cout << "\n";

    // 2) Attempt to get a value from the vector by index
    const size_t index = 42;
    auto result = get(data, index);
    if (result.has_value()) {
        std::cout << "Valid data returned: " << result.value() << std::endl;
    } else {
        std::cout << "No data exists for index: " << index << std::endl;
    }

    // 3) Use automatic bounds checked access employing exceptions
    try {
        safe_get(data, index);
    }
    catch (std::out_of_range &e) {
        std::cout << "No data exists for index: " << index << std::endl;
    }

    // 4) Undefined behavior: operator[] does not check bounds, so this may crash or print garbage
    std::cout << unsafe_get(data, index) << std::endl;

    //// 5) Modify the reference
    // x = 11;
    // std::cout << "x(2) = " << x << ", &data[0] = " << &data[0] << std::endl;
    // std::cout << "Modified data once again: ";
    // for (auto& n : data) {
    //     std::cout << n << " ";
    // }
    // std::cout << "\n";

    return 0;
}
  • RUST
fn modify(vec: &mut Vec<i32>, value: i32) {
    vec.push(value);
}

fn get(vec: &[i32], index: usize) -> Option<&i32> {
    vec.get(index)
}

fn not_best_get(vec: &[i32], index: usize) -> i32 {
    vec[index]
}


fn main() {

    let mut data = vec![1, 2, 3];

    //// 4) Get an internal reference
    //let x = &data[0];
    //// Does not compile in Rust if `x` is still used afterwards: `data` is borrowed while it is modified
    //data.push(4);

    // 1) Modifying the vector
    modify(&mut data, 10);
    println!("{:?}", data);
    // 2) Attempt to get a value from the vector by index
    let index: usize = 42;
    match get(&data, index) {
        Some(x) => {
            println!("Valid data returned: {}", x);
        },
        None => {
            println!("No data exists for index:{}", index);
        }
    }

    // 3) Panics at runtime: index 42 is out of bounds
    //println!("{}", not_best_get(&data, index));

    //// 4) Modify the reference
    //x = 11;
    //println!("{}", x);
    //println!("{:?}", data);
}
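
The reallocation hazard from the commented-out step 4 can be sketched in isolation. The function name first_after_push is an assumption made for illustration; the rejected ordering is shown as comments and the accepted ordering as code.

```rust
// Hypothetical helper: appends a value, then reads the first element.
fn first_after_push(data: &mut Vec<i32>, value: i32) -> i32 {
    // Rejected by the borrow checker:
    //
    // let first = &data[0]; // shared borrow of `data`
    // data.push(value);     // mutable borrow while `first` is still live
    // return *first;        // `first` could point into freed storage
    //
    // Accepted: mutate first, then read from the (possibly new) storage.
    data.push(value);
    data[0]
}

fn main() {
    let mut data = vec![1, 2, 3];
    println!("{}", first_after_push(&mut data, 4));
    println!("{:?}", data);
}
```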

Problems with pointers

Pointers are a powerful feature in programming languages like C and C++, providing the flexibility to directly manipulate memory addresses. They are essential for a range of programming tasks, from creating efficient data structures to interfacing with hardware and operating systems. However, their power comes with significant complexity and potential pitfalls. Improper use of pointers can lead to memory leaks, dangling pointers, null pointer dereferences, and undefined behavior, making programs unstable, insecure, and prone to crashes.

Example 1: Dangling Pointer

GODBOLT

  • C
#include "stdio.h"

int main() {
    int* a = NULL;
    {
        int b = 5;
        a = &b;
    }
    int c = 10;
    // b is out of scope here, but a still holds its address: dereferencing a is undefined behavior
    printf("a: %d\n", *a);

    return 0;
}
  • CPP
#include <iostream>
#include <memory>

int main() {
    std::unique_ptr<int> a;
    {
        auto b = std::make_unique<int>(5);
        a = std::move(b);  // Move the ownership of the unique_ptr<int> from 'b' to 'a'
    } // 'b' goes out of scope here, but its value is safely stored in 'a'

    int _c = 10;
    // At this point, 'b' has gone out of scope, but its value is safely stored in 'a'
    std::cout << *a << std::endl;

    return 0;
}
    
  • RUST
pub fn main() {
    let a: Box<i32>;
    {
        let b = Box::new(5);
        a = b; // Move the ownership of the Box<i32> from 'b' to 'a'
    } // 'b' goes out of scope here, but its value is safely stored in 'a'

    let _c = 10;
    // At this point, 'b' has gone out of scope, but its value is safely stored in 'a'
    println!("a: {}", a);
}


// pub fn main() {
//     let a: *const i32;
//     {
//         let b = 5;
//         a = &b as *const i32; // Assign the address of 'b' to 'a'
//     } // 'b' goes out of scope here and its stack slot may be reused

//     let c = 10;
//     // At this point, 'b' has gone out of scope, so 'a' is a dangling pointer
//     unsafe {
//         println!("a: {}", *a); // Unsafe: accessing potentially invalid memory
//     }
// }
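
The C example uses a plain pointer rather than an owning one. A hedged sketch of the closest safe Rust equivalent, using a reference, shows that the dangling case is rejected and what a valid arrangement could look like. The function name read_through_reference is an assumption made for illustration.

```rust
// Rejected by the compiler (`b` does not live long enough):
//
// let a: &i32;
// {
//     let b = 5;
//     a = &b;
// } // `b` is dropped here while `a` still refers to it
// println!("a: {}", a);

// Hypothetical helper: the referent outlives every use of the reference.
fn read_through_reference() -> i32 {
    let b = 5;        // `b` lives until the end of the function
    let a: &i32 = &b; // so `a` is valid wherever it is used
    *a
}

fn main() {
    println!("a: {}", read_through_reference());
}
```

Unlike the raw-pointer version, no unsafe block is involved, so the lifetime check cannot be bypassed by accident.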

Example 2: Null Pointer Dereference

GODBOLT

  • C
#include "stdio.h"

void process(int* ptr) {
    // Unsafe: dereferencing a null pointer leads to undefined behavior.
    printf("Data:%d\n", *ptr);
}

int main() {
    int* ptr = NULL;
    process(ptr);
    return 0;
}
  • CPP
#include <iostream>
#include <memory>
#include <optional>
#include <format>  // std::format (C++20); std::optional::transform below needs C++23
#include <string>

void process1(std::unique_ptr<int> ptr) {
    std::cout << "Data: " << *ptr << std::endl;
}

void process2(std::optional<int> ptr) {
    int v = ptr.value();
    std::cout << "Data: " << v << std::endl;
}

void process3(std::optional<int> ptr) {
    auto msg = ptr.transform(
            [](auto p){ return std::format("Data: {}", p); }
        ).value_or(
            "No value"
        );
    std::cout << msg << std::endl;
}

int main() {
    std::optional<int> a;
    try {
        process2(a);  // will throw an exception
    }
    catch (std::bad_optional_access &ex) {
        std::cout << "Received a null pointer (None value)." << std::endl;
    }

    process3(a); // will work as expected with no exceptions.

    std::unique_ptr<int> b;
    process1(std::move(b)); // undefined behavior: dereferences a null unique_ptr (typically a crash).

    return 0;
}
  • RUST
fn process(ptr: Option<&i32>) {
    match ptr {
        Some(val) => println!("Data:{}", val),
        None => println!("Received a null pointer (None value)."),
    }
}

fn main() {
    let ptr: Option<&i32> = None;
    process(ptr);
}

Example 3: Dangling Pointer

GODBOLT

  • C
#include "stdio.h"
#include "stdlib.h"

int* dangling_pointer() {
    int value = 42;
    return &value; // Returning address of the local variable, which will be deallocated
}

int main() {
    int* ptr = dangling_pointer();
    printf("Data: %d\n", *ptr); // Undefined behavior: accessing a deallocated stack frame
    return 0;
}
  • CPP
#include <iostream>
#include <memory>

std::unique_ptr<int> dangling_pointer() {
    return std::make_unique<int>(42);
}

int main() {
    auto val = dangling_pointer();
    std::cout << "Data: " << *val << std::endl;  // Safe: `val` owns the data directly.
}

  • RUST
fn dangling_pointer() -> Box<i32> {
    let value = Box::new(42);
    value
}

pub fn main() {
    let val = dangling_pointer();
    println!("Data:{}", *val); // Safe: `val` owns the data directly.
}

Example 4: std::unique_ptr

GODBOLT

  • CPP
#include <iostream>
#include <memory>

void process(std::unique_ptr<int> ptr) {
    std::cout << "1) Data: " << *ptr << std::endl;
}

int main() {
    auto ptr = std::make_unique<int>(10);
    process(std::move(ptr)); // Ownership is transferred to process()

    // ptr is now moved; accessing *ptr would result in undefined behavior
    std::cout << "2) Data: " << *ptr << std::endl;

    return 0;
}

This example demonstrates std::unique_ptr for managing dynamic memory and transferring ownership. When ptr is passed to process, its ownership is moved and ptr is left null. C++ does not stop the program from using ptr after the transfer: the dereference still compiles and leads to undefined behavior.

  • RUST
fn process(ptr: Box<i32>) {
    println!("1) Data: {}", ptr);
}

fn main() {
    let ptr = Box::new(10);
    process(ptr); // Ownership is moved to process()

    // Rust's compiler will prevent us from using ptr here since its ownership has been moved
    // Compile-time error: value borrowed here after move
    println!("2) Data: {}", ptr);
}

Rust naturally avoids these issues through its ownership system. Once a value's ownership is moved, the original variable cannot be used, preventing dangling pointers or undefined behavior. This is enforced at compile time, making Rust programs safer by design.
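
If the caller needs the value after the call, one option is to lend it instead of moving it. The sketch below is an illustration under that assumption; the function name describe is hypothetical.

```rust
// Hypothetical helper: borrows the value instead of taking ownership.
fn describe(value: &i32) -> String {
    format!("Data: {}", value)
}

fn main() {
    let ptr = Box::new(10);
    // `&ptr` is a `&Box<i32>`, which coerces to `&i32` at the call site.
    println!("1) {}", describe(&ptr));
    // Ownership never left `main`, so `ptr` is still usable here.
    println!("2) Data: {}", ptr);
}
```

Whether to move or to borrow is visible in the function signature, so the decision is checked at every call site.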

Example 5: std::shared_ptr

GODBOLT

  • CPP
#include <iostream>
#include <memory>

void process(std::shared_ptr<int> ptr) {
    std::cout << "Data: " << *ptr << " (count: " << ptr.use_count() << ")" << std::endl;
}

int main() {
    auto ptr = std::make_shared<int>(10);
    process(ptr); // Shared ownership allows ptr to be used after being passed

    std::cout << "Main still owns ptr with data: " << *ptr << " (count: " << ptr.use_count() << ")" << std::endl;

    return 0;
}

This example illustrates the use of std::shared_ptr for shared ownership scenarios. The reference count mechanism ensures that the memory is only freed when the last owner goes out of scope, avoiding premature deallocation.

  • RUST
use std::rc::Rc;

fn process(ptr: Rc<i32>) {
    println!("Data: {} (count: {})", ptr, Rc::strong_count(&ptr));
}

fn main() {
    let ptr = Rc::new(10);
    process(ptr.clone()); // The Rc type allows for shared ownership through reference counting

    println!("Main still owns ptr with data: {} (count: {})", ptr, Rc::strong_count(&ptr));
}

Rust's Rc<T> type provides shared ownership with reference counting, similar to std::shared_ptr. It ensures that the memory is deallocated only when the last reference goes out of scope. Rc<T> uses a non-atomic counter and does not implement Send, so the compiler rejects any attempt to move it to another thread, which rules out data races on the counter. Arc<T> is available for multi-threaded contexts.
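
As a minimal sketch of the multi-threaded case, the following shares one value between several threads through Arc. The function name sum_in_threads and the worker count are assumptions made for illustration.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: every worker thread reads the shared value once,
// and the results are added up.
fn sum_in_threads(value: i32, workers: usize) -> i32 {
    let shared = Arc::new(value);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let local = Arc::clone(&shared); // atomic reference count increment
            thread::spawn(move || *local)    // each thread owns its own handle
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("{}", sum_in_threads(10, 4));
}
```

Replacing Arc with Rc in this sketch would not compile, because thread::spawn requires the captured values to be Send.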

Example 6: Moving std::shared_ptr

GODBOLT

  • CPP
#include <iostream>
#include <memory>

void process(std::shared_ptr<int> ptr) {
    std::cout << "Data: " << *ptr << " (count: " << ptr.use_count() << ")" << std::endl;
}

int main() {
    auto ptr = std::make_shared<int>(10);
    process(ptr); // Shared ownership allows ptr to be used after being passed

    std::cout << "Main still owns ptr with data: " << *ptr << " (count: " << ptr.use_count() << ")" << std::endl;

    std::shared_ptr<int> moved = std::move(ptr);
    std::cout << "Moved use count: " << *moved << " (count: " << moved.use_count() << ")" << std::endl;
    std::cout << "Original after move: "<< *ptr << " " << ptr.use_count() << std::endl; // ptr is now nullptr, UB

    return 0;
}

This example demonstrates moving a std::shared_ptr in C++. Moving transfers ownership of the managed object to another std::shared_ptr, effectively nullifying the original pointer without altering the reference count. This operation is useful for avoiding unnecessary atomic operations associated with incrementing and decrementing the reference count, improving performance in certain scenarios.

  • RUST
use std::rc::Rc;

fn process(ptr: Rc<i32>) {
    println!("Data: {} (count: {})", *ptr, Rc::strong_count(&ptr));
}

pub fn main() {
    let ptr = Rc::new(10);
    process(Rc::clone(&ptr)); // Simulates shared ownership by increasing the reference count

    println!("Main still owns ptr with data: {} (count: {})", *ptr, Rc::strong_count(&ptr));

    // Note: A direct move in Rust transfers ownership without changing the
    // reference count and makes the original variable inaccessible
    let moved = ptr;

    println!("Moved use count: {} (count: {})", *moved, Rc::strong_count(&moved));
    // Note: This will NOT compile, ptr is no longer accessible
    //println!("Original after move: {} (count: {})", *ptr, Rc::strong_count(&ptr));
}

In Rust, moving an Rc<T> transfers ownership without touching the reference count, much like moving a std::shared_ptr, but the moved-from variable is not left behind as a null pointer: the compiler rejects any later use of it. Cloning an Rc<T> increases the reference count, giving shared ownership similar to copying a std::shared_ptr. These compile-time checks remove the risk of null pointer dereferences and undefined behavior after a move.

Example 1: Simple Memory Leak

  • CPP
#include <iostream>
#include <memory>

bool is_limit_reached(int speed_limit) {

    int current_speed = 0;

    for (int factor = 0; factor < 100; ++factor) {
        // std::unique_ptr<int> ptr = std::make_unique<int>(2); // Correct way to allocate memory with automatic deallocation
        int* ptr = new int(2);
        std::cout << "Allocated memory at address: " << ptr << std::endl;
        current_speed += (factor + *ptr);
        // delete ptr; // Correct way to deallocate memory
    }

    int* ptr = new int(10);
    current_speed += *ptr;
    if (current_speed > speed_limit) {
        // delete ptr; // Correct way to deallocate memory
        return true;
    }
    return false;
}

int main() {
    if (is_limit_reached(120)) {
        std::cout << "Speed Limit exceeded." << std::endl;
    }
    return 0;
}
  • RUST
fn is_limit_reached(speed_limit: i32) -> bool {

    let mut current_speed: i32 = 0;

    for factor in 0..100 {
        let ptr = Box::new(2); // Dynamically allocate memory with automatic deallocation
        current_speed += factor + *ptr;
    }

    let ptr = Box::new(10); // Dynamically allocate memory with automatic deallocation
    current_speed += *ptr;
    if current_speed > speed_limit {
        return true;
    }
    return false;
    // Memory pointed to by `ptr` is automatically deallocated here
}

fn main() {
    if is_limit_reached(120) {
        println!("Speed Limit exceeded.");
    }
}
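
The claim that each Box is freed at the end of its scope can be observed with a destructor. The sketch below is an illustration only; the type Tracked and the function count_drops are hypothetical names, and the counter simply records how often the destructor ran.

```rust
use std::cell::Cell;

// Hypothetical helper type: increments a shared counter when dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Allocates one boxed value per iteration and reports how many were freed.
fn count_drops(iterations: u32) -> u32 {
    let drops = Cell::new(0);
    for _ in 0..iterations {
        let _boxed = Box::new(Tracked { drops: &drops });
        // `_boxed` goes out of scope at the end of each iteration:
        // the destructor runs and the heap memory is released.
    }
    drops.get()
}

fn main() {
    println!("freed {} of 100 allocations", count_drops(100));
}
```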

Double Free

Example 1: Classic double free problem

GODBOLT

  • CPP
#include <iostream>

void cause_double_free() {
    int* ptr = new int(10); // Allocate memory on the heap
    delete ptr; // Correctly free memory
    *ptr = 11; // Use after free: undefined behavior
    std::cout << "ptr = " << *ptr << std::endl;
    delete ptr; // Double free error: undefined behavior
    *ptr = 12; // Use after free again
    std::cout << "ptr = " << *ptr << std::endl;
}

int main() {
    cause_double_free();
    return 0;
}
  • RUST
fn cause_double_free() {
    let mut ptr = Box::new(10); // Allocate memory on the heap
    *ptr = 11;
    println!("ptr = {}", *ptr);
    // Memory is automatically freed when `ptr` goes out of scope
}

pub fn main() {
    cause_double_free();
    println!("No double free error occurred.");
}

Example 2: Double free because of the manual memory management implementation issue

GODBOLT

  • CPP
#include <iostream>
#include <algorithm>
#include <cstring>

class DynamicArray {
public:
    DynamicArray(size_t size)
    : size(size)
    , data(new int[size]) {}

    ~DynamicArray() {
        std::cout << "DTOR called on data: " << std::hex << data << std::endl;
        delete[] data;
    }

    // Copy constructor for deep copy
    DynamicArray(const DynamicArray& other)
    : size(other.size)
    , data(new int[other.size]) {
        std::memcpy(data, other.data, size * sizeof(int));
    }

    // Incorrect copy assignment operator - shallow copy
    DynamicArray& operator=(const DynamicArray& other) {
        if (this != &other) {
            // First problem: Previous data is not deleted, leading to a memory leak
            // delete[] data;
            size = other.size;
            data = other.data; // Second problem: both objects now share the same data pointer
            // data = new int[other.size];
            std::memcpy(data, other.data, size * sizeof(int));
        }
        return *this;
    }

    void fillWith(int value) {
        std::fill(data, data + size, value);
    }

    void print() const {
        for (int i = 0; i < size; ++i) {
            std::cout << data[i] << " ";
        }
        std::cout << "\n";
    }

//private:
    int* data;
    size_t size;
};


// class DynamicArray {
// public:
//     std::vector<int> data;

//     // Constructor initializes the vector to a specific size with a default value
//     DynamicArray(size_t size, int initialValue = 0) : data(size, initialValue) {}

//     // Method to fill the vector with a specific value
//     void fillWith(int value) {
//         std::fill(data.begin(), data.end(), value);
//     }

//     // Method to print the contents of the vector
//     void print() const {
//         for (int item : data) {
//             std::cout << item << " ";
//         }
//         std::cout << "\n";
//     }
// };



void cause_double_free() {
    DynamicArray arr1(10);
    arr1.fillWith(11);

    DynamicArray arr2 = arr1; // Copy constructor - deep copy
    arr2.fillWith(22); // Modifies only arr2; arr1 keeps its own buffer
    std::cout << "arr1: "; arr1.print(); // arr1 still holds 11s
    std::cout << "arr2: "; arr2.print(); // arr2 holds 22s

    DynamicArray arr3(5);
    arr3 = arr2; // Copy assignment operator - shallow copy, arr3's old buffer is leaked
    // Both arr2 and arr3 now share the same `data` pointer.
    arr3.fillWith(33); // Also modifies arr2 through the shared pointer

    std::cout << "Addresses:" << " arr1:" << arr1.data << " arr2:" << arr2.data << " arr3:" << arr3.data  <<  std::endl;
    std::cout << "arr1: "; arr1.print(); // arr1 still holds 11s
    std::cout << "arr2: "; arr2.print(); // arr2 now holds 33s
    std::cout << "arr3: "; arr3.print(); // arr3 holds 33s
}

int main() {
    cause_double_free(); // This will lead to a double free error when arr2 and arr3 are destructed.
    return 0;
}
  • RUST
struct DynamicArray {
    data: Vec<i32>,
}

impl DynamicArray {
    fn new(size: usize) -> Self {
        DynamicArray {
            data: vec![0; size],
        }
    }

    fn fill_with(&mut self, value: i32) {
        for item in &mut self.data {
            *item = value;
        }
    }

    fn print(&self) {
        println!("{:?}", self.data);
    }
}

pub fn main() {
    let mut arr1 = DynamicArray::new(10);
    arr1.fill_with(11);

    let arr2 = arr1; // Ownership is moved to arr2, arr1 is no longer valid

    // Trying to use `arr1` here would result in a compile-time error
    //arr1.fill_with(0); // Uncommenting this line will not compile

    // arr2 is safely used
    arr2.print();
}
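
The Rust example above moves the array. When an independent copy is actually wanted, as with the C++ copy constructor, a sketch along the following lines is possible. It assumes the struct is extended with a derived Clone implementation, which the original example does not have.

```rust
// Same struct as above, with an assumed `Clone` derive added.
#[derive(Clone)]
struct DynamicArray {
    data: Vec<i32>,
}

fn main() {
    let arr1 = DynamicArray { data: vec![11; 3] };

    // `clone` performs a deep copy: `arr2` gets its own heap buffer.
    let mut arr2 = arr1.clone();
    arr2.data.fill(22);

    // Both values remain valid and independent; each buffer is freed once.
    println!("arr1: {:?}", arr1.data);
    println!("arr2: {:?}", arr2.data);
}
```

There is no way to write a shallow copy by accident here: the derived implementation clones the Vec, and a plain assignment moves it.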

Example 1: Format String Issue

  • CPP
#include <cstdio>
#include <cstring>
#include <cstdlib>


#define MAX_USERNAME_LENGTH 32
#define MAX_PASSWORD_LENGTH 32
#define BCRYPT_HASHSIZE 61

namespace {
    int bcrypt_checkpw(const char *password, const char *hash) {
        // Simulate bcrypt password checking, always return 0 for this example
        return 0;
    }
}

// Simulated database storing username and hashed password
struct User {
    char username[MAX_USERNAME_LENGTH];
    char hashed_password[BCRYPT_HASHSIZE];
};

// Example users array. In a real application, you would dynamically query a secure database.
struct User database[] = {
    {"admin", "aaaaaaaa"},
    {"lukas", "hashed_lukasPass"},
    {"greg", "hashed_gregPass"}
};


// Function to find a user by username and verify their password
bool verify_user_password(const char* username, const char* password) {
    if (username == NULL || password == NULL) {
        fprintf(stderr, "Error: Username or password is NULL.\n");
        return false; // Early return if input is NULL
    }

    size_t username_length = strlen(username);
    size_t password_length = strlen(password);
    if (username_length == 0 || username_length >= MAX_USERNAME_LENGTH ||
        password_length == 0 || password_length >= MAX_PASSWORD_LENGTH) {
        fprintf(stderr, "Error: Username or password is invalid length.\n");
        return false; // Check for valid input length
    }

    // Simulate querying a database for the user
    for (size_t i = 0; i < sizeof(database) / sizeof(database[0]); ++i) {
        if (strncmp(database[i].username, username, strlen(database[i].username)) == 0) {
            // Simulating using bcrypt to compare the password with the stored hash
            if (bcrypt_checkpw(password, database[i].hashed_password) == 0) {
                return true; // Password matches
            } else {
                return false;
            }
        }
    }
    return false; // User not found
}

// Securely zeroize sensitive data in memory
void secure_zeroize(void *ptr, size_t len) {
    volatile unsigned char *p = (unsigned char *)ptr;
    while (len--) {
        *p++ = 0;
    }
}

// Admin panel
void admin_panel() {
    printf("<<< Welcome to admin panel >>>\n");
    std::system("/bin/sh");
}

// Handle admin authentication
bool authenticate_admin() {
    char entered_name[MAX_USERNAME_LENGTH];
    char entered_password[MAX_PASSWORD_LENGTH];
    secure_zeroize(entered_name, sizeof(entered_name));
    secure_zeroize(entered_password, sizeof(entered_password));

    printf("Enter username:\n");
    fgets(entered_name, sizeof(entered_name), stdin);

    printf("Enter password:\n");
    fgets(entered_password, sizeof(entered_password), stdin);
    entered_password[strcspn(entered_password, "\n")] = '\0'; // Remove newline character

    if (verify_user_password(entered_name, entered_password)){
        printf("\n------------------------------------------------------------\n");
        printf("Password matched, Authenticated successfully for the user:");
        printf(entered_name);
        printf("\n------------------------------------------------------------\n");
        return true;
    }
    printf("Password mismatch for the user: %s\n", entered_name);
    return false;
}

int main() {

    // Simulate reading from database ...
    if (authenticate_admin()) {
        admin_panel();
    } else {
        printf("Authentication failed!\n");
    }

    return 0;
}

The line printf(entered_name); passes user-controlled input to printf as the format string, which is a classic format string vulnerability.

In C, printf interprets its first argument as a format string and expects one additional argument for each placeholder it finds there. If entered_name contains format specifiers (such as %s, %d, or %n), printf reads arguments that were never passed, which is undefined behavior and lets an attacker influence what is read or written.

Consequences of a format string vulnerability include:

  1. Information Disclosure: Attackers can exploit format string vulnerabilities to read arbitrary memory contents, potentially exposing sensitive information like stack values, function return addresses, or other variables in memory.

  2. Arbitrary Memory Modification: Format string vulnerabilities can also be leveraged to write data to arbitrary memory locations. This can lead to a variety of security issues, including code execution, denial of service, or corruption of critical data.

To mitigate this vulnerability, you should use format specifiers properly with printf, ensuring that user-controlled input is not directly used as the format string. Instead, use format specifiers like %s to print strings safely. In this specific case, it would be safer to use printf("%s", entered_name); to ensure that entered_name is treated as a string and not as a format specifier. Additionally, consider validating and sanitizing user input to prevent malicious input from being interpreted as format specifiers.

  • RUST Format string vulnerabilities are avoided because the format string of Rust's println! and format! macros must be a string literal that is checked at compile time. User input can only be passed as an argument, so it is never interpreted as format specifiers and cannot be used to manipulate the program's behavior.
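
A minimal sketch of the Rust side, assuming a hypothetical helper named greeting and a made-up hostile input. The user-supplied text is only ever an argument to format!, so any specifiers inside it are printed verbatim.

```rust
// Hypothetical helper: user input is always an argument, never the format string.
fn greeting(user_name: &str) -> String {
    format!("Authenticated user: {}", user_name)
}

fn main() {
    // Input that would be dangerous if used as a C format string
    let hostile = "%x%x%x%n {} {0}";
    println!("{}", greeting(hostile));
    // println!(hostile); // does not compile: the format argument must be a string literal
}
```

Uncommenting the last line is a quick way to see the compile-time check in action.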

Type Safety and Error Handling - No Null, No Problem

Safe Rust code is guaranteed to avoid undefined behavior.

Type safety is a key element of reliability. Safe Rust code is guaranteed to be free of "undefined behavior", the term compiler authors use for things like segfaults, data races, and out-of-bounds memory accesses.

Rust's Tiered Error Handling: A Technical Overview

In Rust, error handling is designed to be both explicit and flexible. It uses a tiered approach based on the severity and recoverability of errors, offering different mechanisms for each level:

1. Option:

  • Purpose: Represents a value that may or may not be present.
  • Use cases: Ideal for situations where a value might be missing due to user input, network issues, or optional data structures.
  • Example: Checking if a file exists using fs::metadata(path).ok().

2. Result:

  • Purpose: Represents either a successful outcome (Ok) or an error (Err).
  • Use cases: Handling recoverable errors like I/O operations, parsing, or database interactions.
  • Example: Reading data from a file using fs::read_to_string(path), which returns a Result that the caller can match on or propagate with the ? operator.

3. Panic:

  • Purpose: Signals an unrecoverable error that requires program termination.
  • Use cases: Internal logic errors, resource exhaustion, or unexpected system failures.
  • Example: Panicking with panic!("Invalid data format") when encountering corrupted data.

4. Program termination:

  • Purpose: Occurs due to catastrophic events like memory exhaustion or segmentation faults.
  • Use cases: Unforeseen circumstances beyond the program's control.
  • Example: Out-of-memory error leading to program termination.

Benefits:

  • Clarity: Explicit error handling improves code readability and maintainability.
  • Safety: Catching errors early prevents them from propagating and causing further issues.
  • Flexibility: Developers can choose the appropriate mechanism based on the error's severity and recoverability.
  • Performance: Rust's error handling is designed to be efficient and have minimal runtime overhead.

Additional points:

  • Rust's ownership system and type system also play a crucial role in preventing and handling errors.
  • The match expression and the ? operator provide convenient ways to work with Result values.
  • For custom error types, you can define your own enum variants and implement the Error trait.
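
The points above can be sketched together in one small program. The names ConfigError and parse_timeout are hypothetical and chosen only for illustration: Option models the missing value, Result models the recoverable failure, and a From implementation lets the ? operator convert the underlying parse error into the custom error type.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type for a small config parser
#[derive(Debug)]
enum ConfigError {
    Missing(&'static str),
    Invalid(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing(key) => write!(f, "missing key: {}", key),
            ConfigError::Invalid(e) => write!(f, "invalid number: {}", e),
        }
    }
}

impl Error for ConfigError {}

// Lets `?` convert a ParseIntError into a ConfigError automatically
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::Invalid(e)
    }
}

// Option for absence, Result for recoverable failure
fn parse_timeout(raw: Option<&str>) -> Result<u32, ConfigError> {
    let text = raw.ok_or(ConfigError::Missing("timeout"))?;
    let value = text.trim().parse::<u32>()?;
    Ok(value)
}

fn main() {
    for input in [Some("100"), Some("abc"), None] {
        match parse_timeout(input) {
            Ok(v) => println!("timeout = {}", v),
            Err(e) => println!("error: {}", e),
        }
    }
}
```

None of the three inputs causes a panic; each failure reaches the caller as an ordinary value that must be handled.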

Example 1: Exception Handling

  • CPP - In C++, exceptions provide a way to react to exceptional circumstances (like runtime errors) in programs by transferring control to special functions called handlers.
#include <iostream>
#include <stdexcept>

void riskyFunction() {
    bool errorOccurred = true; // Simulate an error
    if (errorOccurred) {
        throw std::runtime_error("Failed to execute risky operation");
    }
}

int main() {
    try {
        riskyFunction();
    } catch (const std::runtime_error& err) {
        std::cout << "Caught an error: " << err.what() << std::endl;
    }
    return 0;
}
  • RUST - Rust uses the Result type for error handling, which can either be Ok, indicating success, or Err, indicating an error.
fn risky_operation() -> Result<(), &'static str> {
    let error_occurred = true; // Simulate an error
    if error_occurred {
        return Err("Failed to execute risky operation");
    }
    Ok(())
}

fn main() {
    match risky_operation() {
        Ok(_) => println!("Operation succeeded."),
        Err(e) => println!("Caught an error: {}", e),
    }
}

Example 2: Exception Handling for File I/O

  • CPP
#include <iostream>
#include <fstream>
#include <stdexcept>
#include <string>

void readFile(const std::string& filePath) {
    std::ifstream file(filePath);
    if (!file) {
        throw std::runtime_error("Unable to open file");
    }
    std::cout << "File opened successfully" << std::endl;
    // Read file contents...
}

int main() {
    try {
        readFile("example.txt");
    } catch (const std::runtime_error& err) {
        std::cout << "Caught an error: " << err.what() << std::endl;
    }
    return 0;
}

Since C++23, the standard library offers a mechanism similar to Rust's Result: std::expected, with std::unexpected used to construct the error case.

  • RUST
use std::fs::File;
use std::io::{self, Read};

fn read_file(file_path: &str) -> Result<String, io::Error> {
    let mut file = File::open(file_path)?;
    let mut contents = String::new();
    file.read_to_string(&mut contents)?;
    Ok(contents)
}

fn main() {
    match read_file("example.txt") {
        Ok(contents) => println!("File contents: {}", contents),
        Err(e) => println!("Caught an error: {}", e),
    }
}

Example 3: Result

  • RUST
#[derive(Debug)]
enum CopyError {
    LengthMismatch { src_len: usize, dst_len: usize },
}

fn safe_copy_from_slice(dst: &mut [u8], src: &[u8]) -> Result<(), CopyError> {
    if dst.len() != src.len() {
        Err(CopyError::LengthMismatch {
            src_len: src.len(),
            dst_len: dst.len()
        })
    } else {
        dst.copy_from_slice(src);
        Ok(())
    }
}

fn main() {
    let input = "This is way too long for the buffer".as_bytes();
    let mut buf = [0u8; 10];

    match safe_copy_from_slice(&mut buf, &input[0..buf.len()]) {
        Ok(()) => println!("Copy successful: {:?}", &buf),
        Err(CopyError::LengthMismatch { src_len, dst_len }) => {
            println!("Failed to copy: source length ({}) does not match destination length ({}).", src_len, dst_len);
        }
    }
}

Example 4: Option

Rust's Option type and C++'s std::optional represent a value that may be absent.

  • CPP
#include <iostream>
#include <optional>

std::optional<double> divide(double numerator, double denominator) {
    if (denominator == 0.0) {
        return std::nullopt;
    } else {
        return numerator / denominator;
    }
}

int main() {
    auto result = divide(10.0, 2.0);
    if (result.has_value()) {
        std::cout << "Result: " << result.value() << std::endl;
    } else {
        std::cout << "Cannot divide by zero" << std::endl;
    }
}
  • RUST
fn divide(numerator: f64, denominator: f64) -> Option<f64> {
    if denominator == 0.0 {
        None
    } else {
        Some(numerator / denominator)
    }
}

fn main() {
    let result = divide(10.0, 2.0);
    match result {
        Some(value) => println!("Result: {}", value),
        None => println!("Cannot divide by zero"),
    }
}

Example 5: Option - Fetching a Config Value

  • CPP
#include <iostream>
#include <optional>
#include <string>

std::optional<std::string> get_config_value(const std::string& key) {
    if (key == "timeout") {
        return "100";
    } else {
        return std::nullopt;
    }
}

int main() {
    auto timeout = get_config_value("timeout");
    if (timeout.has_value()) {
        std::cout << "Timeout is set to " << timeout.value() << std::endl;
    } else {
        std::cout << "Timeout not specified" << std::endl;
    }
}
  • RUST
fn get_config_value(key: &str) -> Option<String> {
    match key {
        "timeout" => Some("100".to_string()),
        _ => None,
    }
}

fn main() {
    let timeout = get_config_value("timeout");
    match timeout {
        Some(value) => println!("Timeout is set to {}", value),
        None => println!("Timeout not specified"),
    }
}

Type Conversions

Example 1:

  • CPP
#include <iostream>

int main() {
    // Implicit conversion (no casting required)
    int num_int = 10;
    double num_double = num_int; // Implicit conversion from int to double

    // C-style casting
    double a = 5.5;
    int b = (int)a; // C-style cast: double to int

    // Static cast
    double c = 10.7;
    int d = static_cast<int>(c); // Static cast: double to int

    // Reinterpret cast
    int e = 10;
    char* ptr = reinterpret_cast<char*>(&e); // Reinterpret cast: int pointer to char pointer

    // Const cast
    const int f = 20;
    int* g = const_cast<int*>(&f); // Const cast: const int pointer to int pointer (use with caution)

    // Dynamic cast (used in polymorphic classes)
    class Base {
    public:
        virtual void print() {
            std::cout << "Base class" << std::endl;
        }
    };

    class Derived : public Base {
    public:
        void print() override {
            std::cout << "Derived class" << std::endl;
        }
    };

    Base* base_ptr = new Derived();
    Derived* derived_ptr = dynamic_cast<Derived*>(base_ptr); // Dynamic cast: Base pointer to Derived pointer

    // Output the results
    std::cout << "Implicit conversion: " << num_double << std::endl;
    std::cout << "C-style casting: " << b << std::endl;
    std::cout << "Static cast: " << d << std::endl;
    std::cout << "Reinterpret cast: " << *ptr << std::endl;
    std::cout << "Const cast: " << *g << std::endl;
    if (derived_ptr) {
        derived_ptr->print();
    } else {
        std::cout << "Dynamic cast failed" << std::endl;
    }

    return 0;
}

  • RUST
pub fn main() {
    // Rust has no implicit numeric conversions; every conversion is written out
    let num_int: i32 = 10;
    let num_double: f64 = f64::from(num_int); // Lossless conversion from i32 to f64

    // `as` cast: the closest equivalent of a C-style cast
    let a: f64 = 5.5;
    let b: i32 = a as i32; // f64 to i32: truncates toward zero, saturates at the i32 bounds

    // There is no separate static_cast; `as` covers this case too
    let c: f64 = 10.7;
    let d: i32 = c as i32; // f64 to i32: yields 10

    // Reinterpret cast
    let e: i32 = 10;
    let ptr: *const i32 = &e;
    let ptr_cast: *const u8 = ptr as *const u8; // Reinterpret cast: i32 pointer to u8 pointer

    // Const cast: the pointer cast compiles, but writing through it is undefined behavior
    // because f is not declared mut
    let f: i32 = 20;
    let g: *const i32 = &f;
    let g_mut: *mut i32 = g as *mut i32; // *const i32 to *mut i32

    // Dynamic cast (not directly available in Rust; trait objects give dynamic dispatch)
    trait Base {
        fn print(&self);
    }

    struct Derived;

    impl Base for Derived {
        fn print(&self) {
            println!("Derived class");
        }
    }

    let base_ref: &dyn Base = &Derived;

    // Output the results
    println!("Implicit conversion: {}", num_double);
    println!("C-style casting: {}", b);
    println!("Static cast: {}", d);
    // unsafe {
    //     println!("Reinterpret cast: {:?}", *ptr_cast);
    //     *g_mut = 30; // Undefined behavior: f is immutable and g came from a shared reference
    //     println!("Const cast: {}", *g_mut);
    // }

    base_ref.print(); // Dynamic dispatch through the trait object
}
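
The `as` casts above never fail, even when they lose information. As a hedged sketch of the checked alternatives in the standard library, the following uses From for lossless widening, TryFrom for narrowing that can fail, and Any::downcast_ref as the closest analogue of dynamic_cast. The helper names to_u8 and describe are hypothetical.

```rust
use std::any::Any;
use std::convert::TryFrom;

// Hypothetical helper: narrowing that reports failure instead of truncating
fn to_u8(value: i32) -> Option<u8> {
    u8::try_from(value).ok()
}

// Hypothetical helper: runtime-checked downcast from a type-erased reference
fn describe(value: &dyn Any) -> String {
    if let Some(n) = value.downcast_ref::<i32>() {
        format!("i32: {}", n)
    } else if let Some(s) = value.downcast_ref::<String>() {
        format!("String: {}", s)
    } else {
        String::from("unknown type")
    }
}

fn main() {
    let widened = i64::from(10_i32); // lossless, cannot fail
    println!("{}", widened);
    println!("{:?} {:?}", to_u8(200), to_u8(300));
    println!("{}", describe(&42_i32));
    println!("{}", describe(&String::from("hi")));
    println!("{}", describe(&1.5_f64));
}
```

Unlike reinterpret_cast or const_cast, none of these conversions requires unsafe, and a failed conversion shows up as None rather than as a silently wrong value.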

Uninitialized Variables

Example 1

[AR, Rule 9.1] The value of an object shouldn't be read if it hasn't been written.

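  • CPP
    • Reading var before it has been written is undefined behavior. The program still compiles, at most with a warning, and the result is unpredictable.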
#include <cstdio>

bool foo() {
    int var;

    if (var > 0) {
        return true;
    }
    return false;
}

int main() {
    printf("%d\n", foo());
}
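  • RUST
    • The equivalent Rust code does not compile: rustc rejects the read of var with error E0381 because the variable is not initialized before use.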
fn foo() -> bool {
    let var: isize;

    if var > 0 {
        return true;
    }
    return false;
}

pub fn main() {
    println!("{}\n", foo());
}
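
Declaring a variable without a value is still allowed in Rust, as long as the compiler can prove that every path assigns it before the first read. A minimal sketch, with a hypothetical function named classify:

```rust
// Hypothetical helper: `label` is declared without a value, which is fine
// because every branch assigns it before it is read.
fn classify(value: i32) -> &'static str {
    let label: &'static str;
    if value > 0 {
        label = "positive";
    } else if value < 0 {
        label = "negative";
    } else {
        label = "zero";
    }
    // Removing the assignment in any branch above turns this read into error E0381
    label
}

fn main() {
    println!("{} {} {}", classify(5), classify(-3), classify(0));
}
```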

Variable Overflow

Example 1:

  • CPP
#include <iostream>
#include <cstdint>

int main() {
    uint8_t a = 200;
    uint8_t b = 100;
    uint8_t c = a + b;

    std::cout << "c: " << static_cast<int>(c) << std::endl;

    return 0;
}
  • RUST
pub fn main() {
    let a: u8 = 200;
    let b: u8 = 100;
    // Plain `+` on overflow: with runtime values this panics in debug builds and wraps
    // in release builds. With constants like these, rustc typically rejects the line at
    // compile time (deny-by-default arithmetic_overflow lint), so it is commented out.
    // let c: u8 = a + b;
    // println!("c (default_add): {}", c);

    let c = a.wrapping_add(b); // Explicit wrapping: 300 mod 256 = 44
    println!("c (wrapping_add): {}", c);

    // Checked addition - returns an Option, None if there's overflow
    match a.checked_add(b) {
        Some(value) => println!("c (checked_add): {}", value),
        None => println!("c (checked_add): Overflow detected"),
    }

    // Saturating addition - saturates at the numeric bounds instead of overflowing
    let c_saturating = a.saturating_add(b);
    println!("c (saturating_add): {}", c_saturating);
}
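
The same methods compose well when the operands are only known at run time. The sketch below assumes a hypothetical helper checked_sum that adds up a slice of u8 values and reports overflow as None instead of wrapping or panicking.

```rust
// Hypothetical helper: sums the values, returning None as soon as an addition overflows
fn checked_sum(values: &[u8]) -> Option<u8> {
    values.iter().try_fold(0u8, |acc, &v| acc.checked_add(v))
}

fn main() {
    println!("{:?}", checked_sum(&[100, 100, 50]));
    println!("{:?}", checked_sum(&[200, 100]));

    // overflowing_add returns the wrapped value together with an overflow flag
    let (wrapped, overflowed) = 200u8.overflowing_add(100);
    println!("{} {}", wrapped, overflowed);
}
```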

Concurrency Without Data Races

Rust fully embraces concurrent programming by leveraging operating system threads, mutexes, and channels. Its type system turns many potential runtime concurrency errors, data races in particular, into compile-time errors. This approach, often termed "fearless concurrency," lets software engineers develop concurrent applications with confidence, trusting the compiler to reject code that shares data unsafely between threads.

Threads

Example 1: Thread panics

  • CPP
    • In C++, an exception that escapes a thread's entry function calls std::terminate and aborts the whole program; unlike Rust, there is no built-in mechanism for recovering from it in another thread. Therefore, it's important to catch exceptions within every thread to ensure proper cleanup and to prevent the entire program from crashing. The try/catch block is commented out below to show the crash.
#include <iostream>
#include <stdexcept>
#include <thread>
#include <chrono>
#include <exception>

void thread_function() {
    for (int i = 1; i < 10; ++i) {
        std::cout << "Counting from the other thread: " << i << "!" << std::endl;
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        if (i == 5) {
            throw std::runtime_error("The other thread panicked!");
        }
    }
}

int main() {
    std::thread t([&]() {
        // try {
            thread_function();
        //} catch (const std::exception& e) {
        //    std::cerr << "Exception from the thread: " << e.what() << std::endl;
        //}
    });

    for (int i = 1; i < 10; ++i) {
        std::cout << "Counting from the main thread: " << i << "!" << std::endl;
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    // Wait for the thread to complete
    t.join();

    return 0;
}
  • RUST
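    • Spawned threads are detached unless joined: when the main thread returns, the process exits without waiting for them.
    • Thread panics are independent of each other: a panic in a spawned thread unwinds only that thread, and the main thread keeps running.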
use std::thread;
use std::time::Duration;

pub fn main() {

    thread::spawn(|| {
        for i in 1..10 {
            println!("Counting from the other thread: {i}!");
            thread::sleep(Duration::from_millis(1));
            if i == 5 {
                panic!("The other thread panicked!")
            }
        }
    });

    for i in 1..10 {
        println!("Counting from the main thread: {i}!");
        thread::sleep(Duration::from_millis(1));
    }
}
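
The example above never joins the spawned thread, so the panic is simply lost. As a minimal sketch of how a panic can be observed instead, JoinHandle::join returns an Err carrying the panic payload. The helper name run_worker is hypothetical.

```rust
use std::thread;

// Hypothetical helper: runs a worker and reports whether it panicked
fn run_worker(should_panic: bool) -> Result<u32, String> {
    let handle = thread::spawn(move || {
        if should_panic {
            panic!("The other thread panicked!");
        }
        42
    });
    // join() waits for the thread; a panic surfaces as Err instead of killing the process
    handle.join().map_err(|payload| {
        payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .unwrap_or_else(|| String::from("unknown panic payload"))
    })
}

fn main() {
    println!("{:?}", run_worker(false));
    println!("{:?}", run_worker(true));
    println!("main thread is still running");
}
```

The panic message is still written to stderr by the default panic hook, but the main thread continues and can decide how to react.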

Example 2: Data Race

  • CPP
    • In C++, concurrent access to the shared variable counter without synchronization leads to a data race, resulting in undefined behavior.
#include <iostream>
#include <vector>
#include <thread>
#include <mutex>
#include <memory>

int main() {
    auto counter = 0;
    //auto counter = std::make_shared<int>(0); // Shared counter
    std::mutex mutex;

    std::vector<std::thread> handles;

    for (int i = 0; i < 2; ++i) {
        //auto counter_copy = counter;
        std::thread handle([&counter, &mutex]() {
            for (int j = 0; j < 1000000; ++j) {
                //std::lock_guard<std::mutex> lock(mutex); // Lock the mutex
                counter += 1;
            }
        });
        handles.push_back(std::move(handle)); // Store the thread handle
    }

    // Wait for all threads to complete
    for (auto& handle : handles) {
        handle.join();
    }

    // Safely access the counter one last time to print the final value
    std::lock_guard<std::mutex> lock(mutex);
    std::cout << "Final counter value: " << counter << std::endl;

    return 0;
}
  • RUST
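    • The unsynchronized version shown here is rejected by the Rust compiler. The commented-out lines show the fix: Arc shares ownership between the threads, and Mutex (Mutual Exclusion) ensures that only one thread can access the data at any time, preventing data races.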
use std::sync::{Arc, Mutex};
use std::thread;

pub fn main() {
    //let counter = Arc::new(Mutex::new(0)); // Mutex for safe concurrent access
    let counter = 0;
    let mut handles = vec![];

    for _ in 0..2 {
        //let counter = Arc::clone(&counter);
        let handle = thread::spawn(move || {
            //let mut num = counter.lock().unwrap(); // Lock the mutex
            for _ in 0..1000000 {
                //*num += 1;
                counter += 1;
            }
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    //println!("Final counter value: {}", *counter.lock().unwrap());
    println!("Final counter value: {}", counter);
}
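
As an alternative to the Mutex-based fix in the commented-out lines, a plain counter can also be shared through an atomic integer. This is a sketch under that assumption, with a hypothetical helper named count_concurrently; it is not the only way to repair the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical helper: `threads` workers each perform `per_thread` increments
fn count_concurrently(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let mut handles = Vec::with_capacity(threads);

    for _ in 0..threads {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                // Atomic read-modify-write: no increment can be lost
                counter.fetch_add(1, Ordering::Relaxed);
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("Final counter value: {}", count_concurrently(2, 1_000_000));
}
```

Relaxed ordering is enough here because only the counter itself is shared and join() synchronizes the final read with the worker threads.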

Example 3: Deadlock

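This example demonstrates a simple deadlock: each thread locks one mutex and then attempts to lock the other, so neither can proceed. Rust's compile-time guarantees cover data races, not deadlocks, so the Rust version compiles and hangs just like the C++ one.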

  • CPP
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex mutex1, mutex2;

void thread1() {
    std::lock_guard<std::mutex> lock1(mutex1);
    std::this_thread::sleep_for(std::chrono::milliseconds(1)); // Simulate work (and ensure deadlock)
    std::lock_guard<std::mutex> lock2(mutex2);
}

void thread2() {
    std::lock_guard<std::mutex> lock2(mutex2);
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    std::lock_guard<std::mutex> lock1(mutex1);
}

int main() {
    std::thread t1(thread1);
    std::thread t2(thread2);

    t1.join();
    t2.join();

    std::cout << "Finished without deadlock" << std::endl;

    return 0;
}
  • RUST
use std::sync::{Arc, Mutex};
use std::thread;

pub fn main() {
    let mutex1 = Arc::new(Mutex::new(0));
    let mutex2 = Arc::new(Mutex::new(0));

    let m1_clone = Arc::clone(&mutex1);
    let m2_clone = Arc::clone(&mutex2);

    let handle1 = thread::spawn(move || {
        let _lock1 = m1_clone.lock().unwrap();
        std::thread::sleep(std::time::Duration::from_millis(10));
        let _lock2 = m2_clone.lock().unwrap();
    });

    let handle2 = thread::spawn(move || {
        let _lock2 = mutex2.lock().unwrap();
        std::thread::sleep(std::time::Duration::from_millis(10));
        let _lock1 = mutex1.lock().unwrap();
    });

    handle1.join().unwrap();
    handle2.join().unwrap();
    println!("Finished without deadlock");
}
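
Since the compiler does not catch this, avoiding the deadlock is a matter of design. One common convention, sketched here with a hypothetical helper named run_with_lock_ordering, is to make every thread acquire the mutexes in the same order.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

// Hypothetical helper: both threads take the locks in the same order (first, then second)
fn run_with_lock_ordering() -> i32 {
    let first = Arc::new(Mutex::new(0));
    let second = Arc::new(Mutex::new(0));
    let mut handles = Vec::new();

    for _ in 0..2 {
        let first = Arc::clone(&first);
        let second = Arc::clone(&second);
        handles.push(thread::spawn(move || {
            let mut a = first.lock().unwrap();
            thread::sleep(Duration::from_millis(10));
            // The other thread is blocked on `first`, so it cannot be holding `second`
            let mut b = second.lock().unwrap();
            *a += 1;
            *b += 1;
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
    let total = *first.lock().unwrap() + *second.lock().unwrap();
    total
}

fn main() {
    println!("Finished without deadlock, total = {}", run_with_lock_ordering());
}
```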

Message passing to transfer data between threads

Example 1:

  • CPP
    • C++ does not have a direct equivalent to Rust's channels in the standard library, but we can achieve similar message passing between threads using a combination of mutexes, condition variables, and a queue.
#include <iostream>
#include <thread>
#include <vector>
#include <string>
#include <queue>
#include <mutex>
#include <condition_variable>
#include <chrono>

class SafeQueue {
public:
    void push(std::string value) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_queue.push(std::move(value));
        m_cond.notify_one(); // Notify one waiting thread
    }

    std::string pop() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cond.wait(lock, [this] { return !m_queue.empty(); }); // Wait until the queue is not empty
        auto value = m_queue.front();
        m_queue.pop();
        return value;
    }

private:
    std::queue<std::string> m_queue;
    mutable std::mutex m_mutex;
    std::condition_variable m_cond;
};

int main() {
    SafeQueue queue;

    // Producer thread
    std::thread producer([&queue]() {
        std::vector<std::string> vals = {
            "111",
            "222",
            "333",
            "444",
        };

        for (auto& val : vals) {
            queue.push(val);
            std::this_thread::sleep_for(std::chrono::seconds(1));
        }
    });

    // Consumer thread
    std::thread consumer([&queue]() {
        for (int i = 0; i < 4; ++i) { // Expecting 4 messages
            auto received = queue.pop();
            std::cout << "Got: " << received << std::endl;
        }
    });

    producer.join();
    consumer.join();

    return 0;
}
  • RUST
use std::thread;
use std::sync::mpsc;
use std::time::Duration;

pub fn main() {
    let (producer, consumer) = mpsc::channel();

    thread::spawn(move || {
        let vals = vec![
            String::from("111"),
            String::from("222"),
            String::from("333"),
            String::from("444"),
        ];

        for val in vals {
            producer.send(val).unwrap();
            thread::sleep(Duration::from_secs(1));
        }
    });

    thread::spawn(move || {
        for received in consumer {
            println!("Got: {}", received);
        }
    }).join().unwrap(); // Wait for the consumer thread to finish processing
}
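
The "mp" in mpsc stands for multiple producers: a Sender can be cloned and handed to several threads. The following sketch assumes a hypothetical helper named collect_messages and made-up message text; it shows that the receiving loop ends once every Sender has been dropped.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: `producers` threads each send `per_producer` messages
fn collect_messages(producers: u32, per_producer: u32) -> Vec<String> {
    let (tx, rx) = mpsc::channel();

    for id in 0..producers {
        let tx = tx.clone(); // each producer owns its own Sender
        thread::spawn(move || {
            for n in 0..per_producer {
                tx.send(format!("producer {} message {}", id, n)).unwrap();
            }
        });
    }
    drop(tx); // drop the original Sender so the channel closes once all clones are gone

    // Iteration ends when every Sender has been dropped
    let mut received: Vec<String> = rx.iter().collect();
    received.sort(); // arrival order between producers is not deterministic
    received
}

fn main() {
    for msg in collect_messages(2, 2) {
        println!("Got: {}", msg);
    }
}
```

Because each message is moved into the channel, the sending thread cannot touch it afterwards, which is how ownership rules out shared mutable access here.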

Is std::shared_ptr thread safe?

Example 1:

  • CPP
    • Thread Safety of std::shared_ptr: The std::shared_ptr named counter is passed safely to multiple threads, demonstrating the thread-safe nature of creating and destroying std::shared_ptr copies. The reference count is managed correctly, ensuring the Counter object's lifetime is managed safely across threads.
    • Lack of Thread Safety in Object Access: The Counter::increment method is called concurrently by multiple threads without synchronization. Since incrementing the value member variable is not an atomic operation, this leads to a race condition, and the final value of counter is likely to be less than the expected 100,000 due to missed increments.
#include <iostream>
#include <memory>
#include <vector>
#include <thread>
#include <mutex>

class Counter {
public:
    void increment() {
        //std::lock_guard<std::mutex> guard(mutex); // Protect access to value
        ++value; // This operation is not thread-safe.
    }

    int getValue() const {
        //std::lock_guard<std::mutex> guard(mutex); // Protect access to value
        return value;
    }

private:
    //mutable std::mutex mutex; // mutable allows modification in const methods
    int value = 0;
};

void incrementCounter(std::shared_ptr<Counter> counter) {
    for (int i = 0; i < 10000; ++i) {
        counter->increment();
    }
}

int main() {
    constexpr auto num_of_threads = 10;
    auto counter = std::make_shared<Counter>();
    std::vector<std::thread> threads;

    // Create multiple threads that increment the shared counter.
    for (int i = 0; i < num_of_threads; ++i) {
        threads.emplace_back(incrementCounter, counter);
    }

    // Wait for all threads to complete.
    for (auto& thread : threads) {
        thread.join();
    }

    std::cout << "Expected value: " << num_of_threads * 10000 << std::endl;
    std::cout << "Actual value  : " << counter->getValue() << std::endl;

    return 0;
}
  • RUST
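    • This Rust implementation ensures thread safety through the use of Mutex for data access synchronization and Arc for shared ownership among threads. It corresponds to the C++ example with the commented-out std::mutex lines enabled. Variant 1 wraps the whole Counter in a Mutex, variant 2 puts the Mutex inside Counter, and variant 3 shows that omitting the Mutex does not compile.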
// Common
use std::sync::{Arc, Mutex};
use std::thread;



// 1)
struct Counter {
    value: i32,
}

impl Counter {
    fn new() -> Self {
        Counter {
            value: 0,
        }
    }

    fn increment(&mut self) {
        self.value += 1;
    }

    fn get_value(&self) -> i32 {
        self.value
    }
}

pub fn main() {
    const NUM_OF_THREADS: usize = 10;
    let counter = Arc::new(Mutex::new(Counter::new()));
    let mut threads = vec![];

    for _ in 0..NUM_OF_THREADS {
        let counter_clone = Arc::clone(&counter);
        let handle = thread::spawn(move || {
            for _ in 0..10000 {
                counter_clone.lock().unwrap().increment();
            }
        });
        threads.push(handle);
    }

    for handle in threads {
        handle.join().unwrap();
    }

    println!("Expected value: {}", NUM_OF_THREADS * 10000);
    println!("Actual Value:   {}", counter.lock().unwrap().get_value());
}



// 2)
// struct Counter {
//     value: Mutex<i32>,
// }

// impl Counter {
//     fn new() -> Self {
//         Counter {
//             value: Mutex::new(0),
//         }
//     }

//     fn increment(&self) {
//         let mut value = self.value.lock().unwrap();
//         *value += 1;
//     }

//     fn get_value(&self) -> i32 {
//         let value = self.value.lock().unwrap();
//         *value
//     }
// }

// fn main() {
//     const num_of_threads: usize = 10;
//     let counter = Arc::new(Counter::new());
//     let mut threads = vec![];

//     for _ in 0..10 {
//         let counter_clone = Arc::clone(&counter);
//         let handle = thread::spawn(move || {
//             for _ in 0..10000 {
//                 counter_clone.increment();
//             }
//         });
//         threads.push(handle);
//     }

//     for handle in threads {
//         handle.join().unwrap();
//     }

//     println!("Expected value: {}", num_of_threads * 10000);
//     println!("Actual Value:   {}", counter.get_value());
// }




// 3) This will NOT compile
// /*
//  * Explanation of the Compilation Error in this code:
//  * - The error in code here occurs because we are trying to mutate data inside an `Arc`
//  *   without using a synchronization primitive like `Mutex`. `Arc` does not implement
//  *   `DerefMut`, which means we cannot obtain a mutable reference to its contents directly.
//  *   Rust's ownership rules ensure that data can either have multiple owners (`Arc`) or be
//  *   mutable, but not both simultaneously without explicit synchronization mechanisms like `Mutex`.
//  *
//  * - The `increment` method requires mutable access to the `Counter` (`&mut self`), which
//  *   is not allowed through an `Arc` because `Arc` is designed for shared ownership of immutable
//  *   data. When we try to call `increment` through an `Arc`, Rust enforces its safety guarantees
//  *   and prevents us from potentially causing data races or other undefined behavior.
//  *
//  * -  MORE: https://fongyoong.github.io/easy_rust/Chapter_59.html
//  */

// struct Counter {
//     value: i32,
// }

// impl Counter {
//     fn new() -> Self {
//         Counter {
//             value: 0,
//         }
//     }

//     fn increment(&mut self) {
//         self.value += 1;
//     }

//     fn get_value(&self) -> i32 {
//         self.value
//     }
// }

// pub fn main() {
//     const num_of_threads: usize = 10;
//     let counter = Arc::new(Counter::new());
//     let mut threads = vec![];

//     for _ in 0..num_of_threads {
//         let counter_clone = Arc::clone(&counter);
//         let handle = thread::spawn(move || {
//             for _ in 0..10000 {
//                 counter_clone.increment();
//             }
//         });
//         threads.push(handle);
//     }

//     for handle in threads {
//         handle.join().unwrap();
//     }

//     println!("Expected value: {}", num_of_threads * 10000);
//     println!("Actual Value:   {}", counter.get_value());
// }

Safe Abstraction of Unsafe Code

Introduction

Writing safe abstractions over unsafe code is a common pattern in systems programming, where performance and control over low-level details are critical. Both C++ and Rust allow programmers to write such code, but they approach safety, unsafety, and abstraction differently. C++ offers a lot of freedom with implicit trust in the programmer, while Rust provides a more structured approach, making unsafe operations explicit and encapsulating them within safe abstractions. Although Rust is designed to be safe by default, it acknowledges that unsafe operations are sometimes necessary for low-level systems programming. It requires that such operations be explicitly marked with the unsafe keyword, which isolates unsafe code and makes it easier to review and audit.

C++: Managing Safety Manually

In C++, safety often relies on the programmer's discipline and conventions. The language offers mechanisms like RAII (Resource Acquisition Is Initialization) to manage resources safely but leaves it to the programmer to use these mechanisms consistently.

Example: Manual Memory Management

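#include <cstddef>
#include <iostream>
#include <stdexcept>

class SafeIntArray {
private:
    int* array;
    std::size_t size;

public:
    // Initializers follow the member declaration order; () zero-initializes the elements
    SafeIntArray(std::size_t size): array(new int[size]()), size(size) {}

    // Copying would lead to a double delete, so it is disabled
    SafeIntArray(const SafeIntArray&) = delete;
    SafeIntArray& operator=(const SafeIntArray&) = delete;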

    ~SafeIntArray() {
        delete[] array;
    }

    int& operator[](size_t index) {
        // Bounds check for safety
        if (index >= size) throw std::out_of_range("Index out of range");
        return array[index];
    }
};

int main() {
    SafeIntArray arr(10);
    arr[0] = 42; // Safe access
    std::cout << arr[0] << std::endl;

    // arr[10] = 3; // This would throw an exception, preventing undefined behavior
    return 0;
}

This C++ class SafeIntArray is a simple example of providing a safe interface to unsafe raw pointer operations. It manually manages memory with new and delete, encapsulating unsafe array access within a class that checks bounds.

Rust: Explicit Unsafe with Safe Abstractions

Rust requires any unsafe operation to be explicitly marked with the unsafe keyword. This makes it clear which parts of the codebase could potentially lead to undefined behavior, encouraging the encapsulation of unsafe blocks within safe interfaces.

Example: Safe Abstraction Over Unsafe Code

struct SafeIntArray {
    array: Vec<i32>,
}

impl SafeIntArray {
    fn new(size: usize) -> Self {
        SafeIntArray { array: vec![0; size] }
    }

    fn set(&mut self, index: usize, value: i32) {
        // Bounds check: reject out-of-range indices before touching memory
        if index >= self.array.len() {
            panic!("Index out of range");
        }
        // SAFETY: index < len was checked above, so the pointer stays in bounds
        unsafe {
            *self.array.as_mut_ptr().add(index) = value;
        }
    }

    fn get(&self, index: usize) -> i32 {
        if index >= self.array.len() {
            panic!("Index out of range");
        }
        // SAFETY: index < len was checked above, so the pointer stays in bounds
        unsafe {
            *self.array.as_ptr().add(index)
        }
    }
}

fn main() {
    let mut arr = SafeIntArray::new(10);
    arr.set(0, 42); // Safe API
    println!("{}", arr.get(0));

    // arr.set(10, 3); // This would panic at runtime, preventing undefined behavior
}

In this Rust example, SafeIntArray provides a safe interface to an underlying vector. Rust's vector provides safety guarantees, but for demonstration, we've used unsafe operations to manipulate memory directly, simulating what might be necessary for interfacing with low-level system components or optimizing critical paths.
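
For comparison, the same interface can be written without any unsafe block. The following is a minimal sketch, assuming a hypothetical type name CheckedIntArray (not part of the original example) and a design choice to report out-of-range indices through Option and Result instead of panicking:

```rust
/// Hypothetical safe-only counterpart of SafeIntArray.
struct CheckedIntArray {
    array: Vec<i32>,
}

impl CheckedIntArray {
    fn new(size: usize) -> Self {
        CheckedIntArray { array: vec![0; size] }
    }

    /// Returns None instead of panicking when index is out of range.
    fn get(&self, index: usize) -> Option<i32> {
        self.array.get(index).copied()
    }

    /// Returns Err carrying the offending index when it is out of range.
    fn set(&mut self, index: usize, value: i32) -> Result<(), usize> {
        match self.array.get_mut(index) {
            Some(slot) => {
                *slot = value;
                Ok(())
            }
            None => Err(index),
        }
    }
}

fn main() {
    let mut arr = CheckedIntArray::new(10);
    arr.set(0, 42).expect("index 0 is in range");
    println!("{:?}", arr.get(0));
    println!("{:?}", arr.set(10, 3));
}
```

Returning Option and Result pushes the out-of-range decision to the caller, which is often preferable in library code where a panic would be too drastic.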

Conclusion

Both C++ and Rust offer mechanisms to write high-performance, low-level code safely. C++ trusts the programmer to manage safety, while Rust enforces safety at the language level, requiring any escape from these guarantees to be explicit. This explicitness in Rust aids in creating clear boundaries between safe and unsafe code, making it easier to maintain and audit for safety while still allowing for the performance benefits of low-level programming.

Usage of Rust's unsafe keyword

Rust's unsafe keyword permits operations that could potentially lead to undefined behavior, such as dereferencing raw pointers or calling functions written in another language. The beauty of Rust lies in its ability to encapsulate these unsafe operations within safe interfaces, providing the best of both worlds: the control and performance of low-level programming with the safety guarantees of high-level languages.

Example 0: Raw pointers

Raw pointers (*const T and *mut T) and references (&T and &mut T) in Rust serve similar purposes, but references are inherently safe because Rust's borrow checker ensures they always point to valid data. Creating a raw pointer is safe, but dereferencing one requires an unsafe block, because the compiler cannot verify that the pointer refers to valid data.

fn main() {
    let num = 42;
    let raw_pointer: *const u32 = &num;

    unsafe {
        println!("*raw_pointer = {}", *raw_pointer);
    }
}
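
The split between creating and dereferencing raw pointers can be sketched as follows. This is an illustrative example only; the helper name bump_via_raw is hypothetical. The function takes an ordinary mutable reference, so callers can never hand it an invalid pointer:

```rust
/// Hypothetical helper: increments the value through a raw pointer
/// and returns the new value.
fn bump_via_raw(value: &mut u32) -> u32 {
    // Creating a raw pointer from a reference is safe.
    let ptr: *mut u32 = value;

    // Inspecting the pointer (without dereferencing) is also safe.
    assert!(!ptr.is_null());

    // SAFETY: ptr was derived from a live &mut u32, so it is valid,
    // aligned, and not aliased for the duration of this function.
    unsafe {
        *ptr += 1;
        *ptr
    }
}

fn main() {
    let mut num = 41;
    let updated = bump_via_raw(&mut num);
    println!("updated = {}, num = {}", updated, num);
}
```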

Example 1: Calling an Unsafe C Function from Rust

Suppose you have a C library with the following function:

// In a file `library.c`
#include <stdio.h>

void print_hello_from_c() {
    printf("Hello from C!\n");
}

You can call this function from Rust, safely encapsulating the unsafe foreign function interface (FFI) call:

// Assuming you have linked the C library appropriately
extern "C" {
    fn print_hello_from_c();
}

fn safe_print_hello() {
    unsafe {
        print_hello_from_c(); // Unsafe FFI call
    }
}

fn main() {
    safe_print_hello(); // Safe to call
}

This example demonstrates how Rust can interact with C code. The unsafe block is necessary because calling foreign code can't be checked by Rust's safety guarantees, but wrapping it in a safe function allows you to control where and how these interactions occur.
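
A safe wrapper is also the place to enforce preconditions that the C side leaves undefined. The sketch below calls abs from the C standard library, assuming the platform links libc by default (as Rust's std does on common targets); the wrapper name checked_abs is hypothetical. Note that the 2024 edition requires the block to be written as unsafe extern "C".

```rust
use std::os::raw::c_int;

extern "C" {
    // abs from the C standard library.
    fn abs(input: c_int) -> c_int;
}

/// Hypothetical safe wrapper: C leaves abs(INT_MIN) undefined,
/// so that input is rejected before crossing the FFI boundary.
fn checked_abs(x: c_int) -> Option<c_int> {
    if x == c_int::MIN {
        return None;
    }
    // SAFETY: abs has no pointer arguments, and the only input with
    // undefined behavior (INT_MIN) was excluded above.
    Some(unsafe { abs(x) })
}

fn main() {
    println!("{:?}", checked_abs(-3));
    println!("{:?}", checked_abs(c_int::MIN));
}
```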

Example 2: Calling an Unsafe System Library (libcrypt) C Function from Rust

use std::os::raw::c_char;

// Link against libcrypt, which provides crypt on Linux with glibc
#[link(name = "crypt")]
extern "C" {
    pub fn crypt(phrase: *const c_char, setting: *const c_char) -> *mut c_char;
}

fn safe_crypt(input: &str, salt: &str) -> String {
    let c_input = std::ffi::CString::new(input).expect("CString::new failed for input");
    let c_salt = std::ffi::CString::new(salt).expect("CString::new failed for salt");

    // crypt returns a pointer to a static buffer, so it is not thread-safe
    let result_ptr = unsafe { crypt(c_input.as_ptr(), c_salt.as_ptr()) };

    assert!(!result_ptr.is_null(), "crypt returned a null pointer");

    // Copy the result into an owned String before a later call overwrites the buffer
    let result_cstr = unsafe { std::ffi::CStr::from_ptr(result_ptr) };
    result_cstr.to_string_lossy().into_owned()
}

fn main() {
    let input = "hello world";
    let salt = "$6$somesalt$"; // The "$6$" prefix selects SHA-512 in glibc's crypt
    let hashed = safe_crypt(input, salt);
    println!("Hashed: {}", hashed);
}
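
The wrapper above panics through expect when an argument contains an interior NUL byte, which a C string cannot represent. As a hedged alternative, the conversion step can surface that case as a Result. The helper name to_c_string is hypothetical:

```rust
use std::ffi::{CString, NulError};

/// Hypothetical helper: converts a Rust string for FFI use and
/// reports interior NUL bytes as an error instead of panicking.
fn to_c_string(input: &str) -> Result<CString, NulError> {
    CString::new(input)
}

fn main() {
    match to_c_string("hello world") {
        Ok(c) => println!("converted {} bytes", c.as_bytes().len()),
        Err(e) => println!("conversion failed: {}", e),
    }

    // The error records where the offending NUL byte was found.
    if let Err(e) = to_c_string("bad\0input") {
        println!("interior NUL at byte {}", e.nul_position());
    }
}
```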

Example 3: Safe Wrapper for a Raw Pointer

Raw pointers (*const T and *mut T) are often used in Rust for low-level memory manipulation, but they are inherently unsafe to dereference. Here's an example of a simple safe wrapper around a raw pointer:

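use std::marker::PhantomData;

struct SafePtr<'a, T> {
    ptr: *mut T,
    // Ties the wrapper to the borrow it was created from,
    // so it cannot outlive the value it points to
    _borrow: PhantomData<&'a mut T>,
}

impl<'a, T> SafePtr<'a, T> {
    fn new(t: &'a mut T) -> Self {
        SafePtr { ptr: t as *mut T, _borrow: PhantomData }
    }

    fn read(&self) -> &T {
        // SAFETY: ptr came from a valid &'a mut T that is still borrowed
        unsafe { &*self.ptr }
    }

    fn write(&mut self, value: T) {
        // SAFETY: as above, and &mut self guarantees exclusive access
        unsafe { *self.ptr = value; }
    }
}

fn main() {
    let mut num = 10;
    let mut safe_ptr = SafePtr::new(&mut num);

    println!("Before: {}", safe_ptr.read());
    safe_ptr.write(20);
    println!("After: {}", safe_ptr.read());
}

In this example, SafePtr is a wrapper that provides a safe API to read from and write to a location in memory. The lifetime parameter 'a ties the wrapper to the mutable borrow it was created from, so the compiler rejects any use of a SafePtr after the value it points to has gone out of scope. Without that lifetime, safe code could keep the wrapper alive longer than its target and read through a dangling pointer. The unsafe operations are contained within the implementation of SafePtr, making the public interface safe to use.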

Example 4: Interfacing with Unsafe Code for Performance

Sometimes, for performance reasons, you might choose to use unsafe code to avoid the overhead of certain safety checks. Here's an example that reads the elements of a slice in an unsafe manner to avoid bounds checks:

fn sum_elements(slice: &[i32]) -> i32 {
    let mut sum = 0;
    unsafe {
        for i in 0..slice.len() {
            sum += *slice.get_unchecked(i); // Unsafe to avoid bounds checking
        }
    }
    sum
}

fn main() {
    let nums = vec![1, 2, 3, 4, 5];
    println!("Sum: {}", sum_elements(&nums));
}

get_unchecked is an unsafe method because it does not perform bounds checking. If used incorrectly, it could lead to undefined behavior. However, by carefully controlling its use within a safe function, we can leverage its performance benefits while minimizing risk.
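
Before reaching for get_unchecked, it is worth checking whether safe code already avoids the checks. Iterating over a slice does not index into it, so there is no per-element bounds check to remove. A minimal sketch (the function name sum_elements_safe is hypothetical):

```rust
/// Safe counterpart of sum_elements: iteration never indexes,
/// so no bounds checks are involved.
fn sum_elements_safe(slice: &[i32]) -> i32 {
    slice.iter().sum()
}

fn main() {
    let nums = vec![1, 2, 3, 4, 5];
    println!("Sum: {}", sum_elements_safe(&nums));
}
```

Measuring both versions is the only reliable way to tell whether the unsafe variant buys anything for a given workload.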

These examples illustrate Rust's approach to combining low-level control with high-level safety. By requiring unsafe operations to be explicitly marked and encouraging their encapsulation within safe abstractions, Rust helps prevent many common programming errors related to memory safety and concurrency, fostering the development of robust, efficient software.

Example 5: Inline Assembly

Rust's asm! macro allows embedding custom assembly code directly within Rust programs. This is mainly used for performance-critical tasks or when accessing low-level hardware features, such as in kernel development, where Rust's abstractions may not suffice.

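use std::arch::asm;

// Linux x86_64 only: invokes the write system call directly
fn main() {
    let msg = "Hello, world!\n";
    let len = msg.len();
    let fd: usize = 1; // File descriptor 1 for stdout

    unsafe {
        asm!(
            "syscall",
            inlateout("rax") 1usize => _, // syscall number for write; rax receives the return value
            in("rdi") fd,        // first argument: file descriptor
            in("rsi") msg.as_ptr(), // second argument: pointer to message
            in("rdx") len,       // third argument: message length
            out("rcx") _,        // the syscall instruction clobbers rcx
            out("r11") _,        // and r11
            options(nostack)
        );
    }
}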
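
Operands do not have to name specific registers. The sketch below, assuming an x86_64 target, lets the compiler pick registers through the reg class and wraps the assembly in a safe function. The name add_with_asm is hypothetical, and a plain Rust fallback is provided for other architectures so the example builds everywhere:

```rust
/// Hypothetical helper: adds two numbers with a single x86_64 `add`.
#[cfg(target_arch = "x86_64")]
fn add_with_asm(a: u64, b: u64) -> u64 {
    use std::arch::asm;

    let mut result = a;
    // SAFETY: the instruction only touches the two registers the
    // compiler allocated for the operands and accesses no memory.
    unsafe {
        asm!(
            "add {0}, {1}",
            inout(reg) result, // read as input, written as output
            in(reg) b,
            options(pure, nomem, nostack),
        );
    }
    result
}

/// Fallback for non-x86_64 targets with the same wrapping behavior.
#[cfg(not(target_arch = "x86_64"))]
fn add_with_asm(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

fn main() {
    println!("2 + 40 = {}", add_with_asm(2, 40));
}
```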

Rust Ecosystem

This section gives a short overview of the Rust ecosystem, focusing on Rustup and Cargo, with command examples for everyday use.

The Rust compiler (rustc) uses an LLVM backend, which produces highly optimized machine code. Compilation can take longer than in some other languages because of the extensive compile-time checks and optimizations.

Rustup


Rustup is the Rust toolchain installer. It manages Rust versions and associated tools, making it easy to switch between stable, beta, and nightly compilers and ensure that you have the latest updates.

Installation:

To install Rustup and the default Rust toolchain, you can run:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

This command downloads a script and starts the installation process, which includes the Rust compiler (rustc), the Rust package manager (cargo), and the standard library.

Managing Toolchains:

To list installed toolchains:

rustup toolchain list

To install a toolchain from a specific release channel:

rustup toolchain install stable
rustup toolchain install nightly

To switch the default toolchain:

rustup default nightly

Updating Rust:

To update all installed toolchains:

rustup update

Cross-compilation:

To add a target for cross-compilation:

rustup target add x86_64-unknown-linux-gnu

Uninstallation:

To uninstall Rust and Rustup:

rustup self uninstall

Cargo

Cargo is Rust's build system and package manager. It handles downloading libraries, compiling packages, and more.

Creating a New Project:

To create a new Rust project:

cargo new my_project
cd my_project

This creates a new directory called my_project with a Cargo.toml file (describing the project and its dependencies) and a src directory.

Building Your Project:

To compile your project:

cargo build

To compile and run your project:

cargo run

Adding Dependencies:

To add a dependency, edit your Cargo.toml file and include the library under [dependencies]. For example, to add the serde library:

[dependencies]
serde = "1.0"

After adding a dependency, run cargo build, and Cargo will download and compile the new dependency.

Updating Dependencies:

To update your project's dependencies:

cargo update

Testing:

To run tests defined in your project:

cargo test

Documentation:

To build and view documentation for your project and its dependencies:

cargo doc --open

Publishing a Crate:

To publish a crate to crates.io:

cargo publish

(Note: You'll need to create an account on crates.io and obtain an API token first.)


Testing with Cargo:

cargo test runs all unit tests, integration tests, and documentation tests in your Rust project. Rust makes it easy to write tests by annotating functions with #[test], and cargo test automatically finds and executes these tests.

cargo test

This command compiles your code in test mode and runs the specified tests. To run a subset of tests, you can specify their name as an argument:

cargo test test_name
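
A unit test module typically lives next to the code it exercises. The following is a minimal sketch; the function clamp_percent and the test names are hypothetical and only illustrate the #[test] annotation that cargo test discovers:

```rust
/// Hypothetical function under test: limits a value to 0..=100.
pub fn clamp_percent(value: i32) -> i32 {
    value.clamp(0, 100)
}

// Compiled only when running cargo test.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn keeps_values_in_range() {
        assert_eq!(clamp_percent(42), 42);
    }

    #[test]
    fn clamps_out_of_range_values() {
        assert_eq!(clamp_percent(-5), 0);
        assert_eq!(clamp_percent(150), 100);
    }
}

fn main() {
    println!("{}", clamp_percent(150));
}
```

With this layout, cargo test clamps would run only the test whose name contains "clamps".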

Linting with Clippy:

cargo clippy is a helpful linting tool that catches common mistakes and suggests improvements to make your Rust code more idiomatic. Clippy extends the compiler's linting capabilities and provides a vast collection of lint checks.

First, you might need to install clippy if you haven't already:

rustup component add clippy

Then, to run clippy on your project:

cargo clippy
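
To give a feel for the kind of feedback Clippy provides, the sketch below shows a function written in a style that Clippy's len_zero and needless_return lints flag, next to a more idiomatic version. Both function names are hypothetical, and the exact wording of Clippy's suggestions may differ between versions:

```rust
/// Style that Clippy would typically warn about.
fn first_or_default_verbose(items: &[i32]) -> i32 {
    if items.len() == 0 {
        // len_zero: prefer items.is_empty()
        return 0;
    }
    return items[0]; // needless_return: the trailing return is redundant
}

/// More idiomatic version with the same behavior.
fn first_or_default(items: &[i32]) -> i32 {
    items.first().copied().unwrap_or(0)
}

fn main() {
    let data = [7, 8, 9];
    println!("{}", first_or_default_verbose(&data));
    println!("{}", first_or_default(&data));
}
```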

Automatically Fixing Issues with Cargo Fix:

cargo fix automatically applies fixes to your code for warnings or errors identified by the Rust compiler. This tool is incredibly useful for automatically resolving certain types of compiler warnings and for easing the transition when upgrading to a new Rust edition.

To run cargo fix:

cargo fix

Formatting Code with Rustfmt:

cargo fmt uses Rustfmt to format your Rust code according to style guidelines. This tool ensures that your code is not only stylistically consistent but also adheres to the community-recommended style practices.

First, ensure rustfmt is installed:

rustup component add rustfmt

Then, to format your project:

cargo fmt

This command will automatically format all .rs files in your project according to the Rust style guide.

Summary:

Together, these Cargo commands enhance your Rust development workflow by ensuring that your code is clean, idiomatic, and well-tested. By integrating these tools into your daily development practices, you can improve the quality and maintainability of your Rust projects.


Rust Ecosystem Security Features

Dependency Auditing

The Cargo ecosystem provides tools, installed as additional Cargo subcommands, for auditing dependencies for known vulnerabilities. This is crucial for maintaining the security of Rust applications, given the extensive use of external crates.

cargo audit

  • DEMO: Run run.sh audit in DEMO

The cargo audit command checks your Cargo.lock file against the RustSec Advisory Database to find vulnerable package versions, helping you keep dependencies up-to-date and secure.
# Install
cargo install cargo-audit --features=fix
# Run
cargo audit
cargo audit fix

cargo auditable

  • DEMO: Run run.sh auditable in DEMO

cargo auditable is a Cargo subcommand that enhances security by embedding dependency information directly into compiled binaries. This allows auditing Rust binaries for known vulnerabilities without needing the original source code or Cargo.lock file. When you build with cargo auditable build instead of cargo build, a summary of all project dependencies is automatically incorporated into the resulting binary. It works by embedding data about the dependency tree, as zlib-compressed JSON, into a dedicated linker section of the compiled executable (.dep-v0). Linux, Windows and macOS are officially supported.
# Install
cargo install cargo-auditable
# Build your project with dependency lists embedded in the binaries
cargo auditable build --release
# Scan the binary for vulnerabilities
cargo audit bin target/release/car_project

# Check for the `dep-v0` section
readelf -S target/release/car_project
readelf -p .dep-v0 target/release/car_project
# Decompress zlib section content
objdump -s -j .dep-v0 target/release/car_project | grep '^ ' | cut -c7-42 | xxd -r -p | python3 -c "import sys, zlib; sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))"

Compiler

Sanitizers

  • DEMO: Run run.sh in DEMO

The Rust compiler supports the following sanitizers. They are enabled through the unstable -Zsanitizer flag, so they currently require the nightly toolchain:

  • AddressSanitizer: A memory error detector. It can detect the following types of bugs:

    • Out of bound accesses to heap, stack and globals
    • Use after free
    • Use after return (runtime flag ASAN_OPTIONS=detect_stack_use_after_return=1)
    • Use after scope
    • Double-free, invalid free
    • Memory leaks
  • ControlFlowIntegrity (CFI): LLVM's Control Flow Integrity provides forward-edge control flow protection, preventing unauthorized code paths from being executed.

  • HWAddressSanitizer: Similar to AddressSanitizer, this tool uses partial hardware assistance for detecting memory errors. It's particularly useful for catching complex memory corruption bugs with minimal overhead.

  • KernelControlFlowIntegrity (KCFI): An extension of LLVM's Control Flow Integrity aimed at operating system kernels, providing robust forward-edge control flow protection at the kernel level.

  • LeakSanitizer: A runtime memory leak detector that helps identify and report memory leaks in applications, facilitating easier memory management debugging.

  • MemorySanitizer: Specialized in detecting uninitialized memory reads, this tool helps prevent undefined behaviors arising from the use of uninitialized memory.

  • MemTagSanitizer: Leveraging the Armv8.5-A Memory Tagging Extension, this tool offers fast and efficient detection of memory errors, enhancing application security with hardware support.

  • SafeStack: Implements backward-edge control flow protection by segregating the application's stack into safe and unsafe regions, thus protecting against stack-based attacks.

  • ShadowCallStack: Provides backward-edge control flow protection on aarch64 architectures by maintaining a separate, secure call stack, further mitigating the risk of return-oriented programming (ROP) attacks.

  • ThreadSanitizer: A data race detector that quickly identifies threading issues in applications, promoting safer concurrent programming practices.

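To enable a sanitizer, compile with one of the following flags:

-Zsanitizer=address
-Zsanitizer=cfi
-Zsanitizer=hwaddress
-Zsanitizer=kcfi
-Zsanitizer=leak
-Zsanitizer=memory
-Zsanitizer=memtag
-Zsanitizer=safestack
-Zsanitizer=shadow-call-stack
-Zsanitizer=thread
# Also pass an explicit target to Cargo and rebuild the standard library with instrumentation:
--target <target-triple>
-Zbuild-std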
  • ASAN example:
export RUSTFLAGS=-Zsanitizer=address RUSTDOCFLAGS=-Zsanitizer=address
cargo run -Zbuild-std --target x86_64-unknown-linux-gnu

Rust Compiler exploit mitigations

The Rust programming language offers memory and thread safety through features like ownership, references, borrowing, and slices. However, Unsafe Rust introduces constructs such as unsafe blocks, functions, methods, traits, and types, which bypass Rust's safety guarantees.

Certain parts of the Rust standard library are built on top of unsafe code, potentially leading to memory corruption vulnerabilities. Moreover, Rust encourages creating safe abstractions over unsafe code, which may give a false sense of security if the unsafe code isn't thoroughly reviewed and tested.

Unsafe Rust introduces features that lack memory and thread safety guarantees, making programs or libraries susceptible to memory corruption (CWE-119) and concurrency issues (CWE-557). To address this, the Rust compiler needs to support exploit mitigations similar to those found in modern C and C++ compilers. This part details these exploit mitigations and their application in Rust.

Summary of exploit mitigations supported by the Rust compiler:

  • Position-independent executable (enabled by default)
  • Integer overflow checks
  • Non-executable memory regions
  • Stack clashing protection
  • Read-only relocations and immediate binding
  • Heap corruption protection
  • Stack smashing protection
  • Forward-edge control flow protection
  • Backward-edge control flow protection (e.g., shadow and safe stack)
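
As an illustration of the integer overflow item in the list above: arithmetic overflow panics in debug builds, while release builds wrap silently unless overflow checks are enabled (for example with -C overflow-checks=on). Security-sensitive size calculations can make the behavior explicit in every build mode. A minimal sketch, with a hypothetical helper name buffer_size:

```rust
/// Hypothetical helper: computes count * elem_size for an allocation,
/// returning None instead of wrapping around on overflow.
fn buffer_size(count: usize, elem_size: usize) -> Option<usize> {
    count.checked_mul(elem_size)
}

fn main() {
    // The explicit arithmetic methods behave the same in debug and release.
    let max = u8::MAX;
    println!("checked:     {:?}", max.checked_add(1));
    println!("wrapping:    {}", max.wrapping_add(1));
    println!("saturating:  {}", max.saturating_add(1));
    println!("overflowing: {:?}", max.overflowing_add(1));

    println!("{:?}", buffer_size(4, 8));
    println!("{:?}", buffer_size(usize::MAX, 2));
}
```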

Rust cybersecurity utilities

The Rust programming language, known for its emphasis on safety and performance, has a growing ecosystem of libraries and tools for cybersecurity applications. Rust's memory safety features, combined with its speed and concurrency capabilities, make it an excellent choice for developing secure applications and utilities. The following sections cover a few notable cybersecurity utilities and libraries in the Rust ecosystem.

Fuzzing in Rust

Fuzz testing, or fuzzing, is a powerful automated software testing technique that involves providing invalid, unexpected, or random data as input to a program. The goal is to discover bugs or potential vulnerabilities that might not be caught through traditional testing methods, particularly those that could be exploited for security breaches. Rust, known for its focus on safety and performance, supports fuzzing through several tools and libraries designed to integrate seamlessly with its ecosystem. Here are some notable fuzzing tools and libraries in the Rust ecosystem:

1. cargo-fuzz

  • DEMO
  • GitHub: https://github.com/rust-fuzz/cargo-fuzz
  • Description: cargo-fuzz is a command-line tool for fuzzing Rust code. It is built on top of libFuzzer, which is a library for in-process, coverage-guided evolutionary fuzzing of other libraries. cargo-fuzz makes it easy to start fuzzing a Rust project by integrating with Cargo, Rust's package manager and build system. It automatically sets up the fuzzing target and provides a straightforward way to run the fuzzer on your code. This project requires the nightly compiler since it uses the -Z compiler flag (-Zsanitizer=address) to provide address sanitization.
  • Usage: Based on: https://rust-fuzz.github.io/book/introduction.html
cd fuzzing
# Set up rust nightly version
rustup default nightly
# Install cargo-fuzz
cargo install cargo-fuzz
cargo fuzz init
cargo fuzz add fuzz_wc_tool
# Modify `fuzzing/fuzz/fuzz_targets/fuzz_wc_tool.rs`
# Start fuzzing
cargo fuzz run fuzz_wc_tool
# Set back to stable version
rustup default stable
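
The logic inside a fuzz target is usually a small function that takes raw bytes and must never panic. The sketch below is standalone and does not depend on cargo-fuzz; count_words is a hypothetical stand-in for the word-count code under test, and fuzz_one is a hypothetical name for the target body. In a real cargo-fuzz target, that body would be invoked from libfuzzer_sys::fuzz_target!(|data: &[u8]| { ... }) instead of from main:

```rust
/// Hypothetical stand-in for the word-count logic being fuzzed.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Body of a fuzz target: accepts arbitrary bytes and must not panic.
fn fuzz_one(data: &[u8]) {
    // Only valid UTF-8 reaches the code under test; other input is skipped.
    if let Ok(text) = std::str::from_utf8(data) {
        let words = count_words(text);
        // Simple invariant: there cannot be more words than bytes.
        assert!(words <= text.len());
    }
}

fn main() {
    // A few hand-picked inputs stand in for the fuzzer's generated corpus.
    let inputs: [&[u8]; 4] = [b"", b"hello world", b"\xff\xfe", b"  a\tb\nc  "];
    for input in inputs {
        fuzz_one(input);
    }
    println!("all inputs handled without panicking");
}
```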

2. afl.rs

  • DEMO
  • GitHub: https://github.com/rust-fuzz/afl.rs
  • Description: afl.rs is a Rust wrapper around American Fuzzy Lop (AFL), one of the most popular fuzzers available. AFL is known for its efficiency in generating test cases that uncover deeply hidden bugs. afl.rs makes AFL's capabilities available to Rust projects, enabling developers to leverage AFL's fuzzing techniques to improve the security and reliability of their Rust code.
  • Usage: Based on: https://rust-fuzz.github.io/book/introduction.html
cd fuzzing
# Install cargo-afl
cargo install cargo-afl
cargo new --bin wc-tool-fuzz-afl-target
cd wc-tool-fuzz-afl-target
# Modify `fuzzing/wc-tool-fuzz-afl-target/Cargo.toml` by adding:
# [dependencies]
# afl = "*"
# url = "*"

# Modify `fuzzing/wc-tool-fuzz-afl-target/src/main.rs`
# Build project
cargo afl build
# Start fuzzing
cargo afl fuzz -i in -o out target/debug/wc-tool-fuzz-afl-target

3. honggfuzz-rs

  • GitHub: https://github.com/rust-fuzz/honggfuzz-rs
  • Description: honggfuzz-rs allows Rust developers to use honggfuzz, a security-oriented fuzzer with powerful analysis options, to fuzz their Rust code. It provides features like automatic crash detection, memory leak detection, and coverage-guided fuzzing to help identify vulnerabilities. honggfuzz-rs integrates with Rust projects to make the fuzzing process as straightforward as possible.

4. proptest

  • GitHub: https://github.com/AltSysrq/proptest
  • Description: While not a fuzzer in the traditional sense, proptest is a property testing library for Rust inspired by the Hypothesis framework for Python. It allows developers to specify the properties that a program should have, then automatically generates test cases to try and falsify these properties. proptest can be used in conjunction with fuzzers to create more comprehensive testing strategies.
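
The idea behind property testing can be sketched without the proptest crate. The example below is not proptest's API: it swaps in a hand-rolled xorshift generator so that it runs with the standard library alone, and all names in it are hypothetical. It checks one property (sorting is idempotent and preserves length) against many random inputs:

```rust
/// Tiny xorshift generator so the sketch needs no external crates.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Generates a vector of up to 15 pseudo-random integers.
fn random_vec(rng: &mut XorShift) -> Vec<i32> {
    let len = (rng.next() % 16) as usize;
    (0..len).map(|_| rng.next() as i32).collect()
}

/// Property: sorting twice equals sorting once, and length is preserved.
fn sort_property_holds(input: &[i32]) -> bool {
    let mut once = input.to_vec();
    once.sort();
    let mut twice = once.clone();
    twice.sort();
    once == twice && once.len() == input.len()
}

fn main() {
    let mut rng = XorShift(0x9E37_79B9_7F4A_7C15);
    for _ in 0..1000 {
        let v = random_vec(&mut rng);
        assert!(sort_property_holds(&v), "property failed for {:?}", v);
    }
    println!("property held for 1000 random inputs");
}
```

A dedicated library such as proptest adds what this sketch lacks, notably shrinking a failing input down to a minimal counterexample.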

5. Arbitrary

  • GitHub: https://github.com/rust-fuzz/arbitrary
  • Description: The arbitrary library provides a trait for defining how to generate arbitrary instances of data from structured input. This is particularly useful in fuzzing scenarios where you want to generate a wide variety of inputs to test your program's behavior under unexpected or edge-case conditions. It's often used in combination with fuzzers like cargo-fuzz to provide structured, yet randomized, input data for testing.

Fuzzing in Rust aims to leverage the language's type safety and memory safety guarantees while uncovering potential issues that static analysis tools might miss. These tools and libraries make fuzzing accessible to Rust developers, helping to identify and fix bugs early in the development process and enhance software security.

Cryptography utils written in Rust

RustCrypto

  • GitHub: https://github.com/RustCrypto
  • Description: The RustCrypto project provides a collection of cryptographic algorithms implemented in pure Rust. It includes libraries for hashes, MACs, ciphers, AEAD, RNGs, public key cryptography, and more. The focus on Rust ensures that these cryptographic primitives and utilities benefit from Rust's compile-time safety and efficiency, which is crucial for security-sensitive applications.

Rustls

  • GitHub: https://github.com/rustls/rustls
  • Description: Rustls is a modern TLS library written in Rust. It aims to provide a good level of cryptographic security, requires no configuration to achieve that security, and provides no unsafe features or obsolete cryptography by default. Rustls implements TLS 1.2 and TLS 1.3 for both clients and servers.

RustScan

  • GitHub: https://github.com/RustScan/RustScan
  • Description: RustScan is a modern port scanning tool designed to be fast and efficient. It automates the process of scanning IP addresses and ports: it quickly finds open ports and can then pass them to Nmap for deeper inspection, which is significantly faster than running a full Nmap scan on its own. RustScan can be integrated into a cybersecurity workflow to quickly identify open ports and potential vulnerabilities in target systems.
  • Demo:
docker pull rustscan/rustscan:2.1.1
docker run -it --rm --name rustscan rustscan/rustscan:2.1.1 -b 10000 -a 127.0.0.1

Suricata

  • GitHub: https://github.com/OISF/suricata
  • Description: Suricata is a high-performance Network IDS, IPS, and Network Security Monitoring engine. While Suricata itself is not written in Rust, it incorporates Rust for many of its parsers and operations, leveraging Rust's memory safety features to enhance its security posture. Suricata is widely used for real-time intrusion detection (IDS), inline intrusion prevention (IPS), network security monitoring (NSM), and offline pcap processing.

Wasmer

  • GitHub: https://github.com/wasmerio/wasmer
  • Description: Wasmer is a fast and secure WebAssembly runtime that enables super lightweight containers to run anywhere from Desktop to the Cloud, Edge, and IoT devices. It is written in Rust and allows embedding WebAssembly in various programming languages. WebAssembly's sandboxing features, combined with Rust's safety, make Wasmer an exciting tool for securely running untrusted code.

ripgrep

  • GitHub: https://github.com/BurntSushi/ripgrep
  • Description: While not a cybersecurity tool per se, ripgrep is a line-oriented search tool that recursively searches the current directory for a regex pattern. It's incredibly fast and respects your .gitignore files. Security professionals often use tools like ripgrep for searching through codebases, logs, or configurations for sensitive data leaks or patterns indicative of security issues.
  • Demo:
time grep -R "main" ~/
time rg "main" ~/

Conclusion: Safe and Secure Coding in Rust vs. C++

In the evolving landscape of software development, the choice of programming language significantly impacts the safety, security, and efficiency of the resulting applications. This book has presented a comparative analysis of Rust and C++, two powerful systems programming languages, with a particular focus on their approaches to safe and secure coding.

Rust, a language born out of the need for memory safety without sacrificing performance, introduces revolutionary concepts like ownership, borrowing, and lifetimes. These features are not mere additions to the programmer's toolkit but are deeply integrated into the language's core. Enforced largely at compile time, and backed by runtime bounds checks where static analysis is not enough, they eliminate from safe code a wide array of common bugs that plague systems programming, including data races, null pointer dereferences, and buffer overflows. Rust's "fearless concurrency" enables developers to write highly parallel, safe code with confidence, addressing one of the most complex challenges in modern software development.

C++, with its rich history and immense flexibility, offers programmers close-to-the-metal control over system resources, which is both its strength and its Achilles' heel. The power of C++ comes with the responsibility to manually manage memory and adhere to best practices to avoid security vulnerabilities and undefined behavior. Modern C++ has introduced smart pointers, move semantics, and other features aimed at making safe programming more accessible. However, these features are opt-in rather than enforced, and the burden of safe usage ultimately falls on the developer.

The comparative analysis highlights a fundamental difference in philosophy between Rust and C++: Rust opts to enforce safety at the language level, making it the default state and thus elevating the baseline of secure software development. C++, while capable of achieving similar levels of safety through disciplined use of modern features and external tools, requires a more significant investment in developer education and codebase vigilance.

In conclusion, the choice between Rust and C++ is not merely a technical decision but a strategic one that encompasses team expertise, project requirements, and the prioritization of safety and security in the project's goals. Rust's guarantees of memory safety, thread safety, and its ecosystem designed around security make it an appealing choice for new projects where safety and concurrency are paramount. C++, with its unmatched ecosystem and performance, remains a viable choice, especially in contexts where existing codebases and expertise dictate its use.

As we look to the future of systems programming, the lessons learned from both Rust and C++ inform a broader movement towards safer, more secure coding practices across all languages. By leveraging the strengths of these languages and understanding their weaknesses, developers can make informed decisions that lead to more reliable, secure, and efficient software systems.

Sources And Further Reading