Filtering in Rust

Over the past few days, I’ve been busy migrating my old server. The new server needs some sort of monitoring. I’m using Graphite, which works best with collectd. After a quick installation, I went through the different modules and the data collectd sends.

I love seeing traffic data. However, collectd’s interface module is insufficient for my needs: it only shows errors and the packets that were sent. For another project, though, I had already written something in Rust to get this data: Netstat. It does, however, send the entries of every interface to the Graphite server. So I deployed it and was good to go.

However, the next day a problem appeared: the data had filled up the entire disk. Why? It turns out that using Docker causes quite a lot of interfaces to be created, and every Docker interface adds a lot of overhead. In particular, when you restart containers often, new interfaces keep being created.

After deleting the unnecessary interface data, I had to do something. The main problem: the program has no filter.

It was on my to-do list anyway, so let’s add a simple filter to reduce the noise of the program.

pub fn read_traffic_file() -> Vec<String> {
  // Do we need to re-open the file each time or can we just get updates from here?
  let mut f = File::open("/proc/net/dev").unwrap();

  let mut data_to_return: Vec<String> =  Vec::new();
  let mut buffer = String::new();
  let mut _new_data: String = String::new();
  f.read_to_string(&mut buffer).unwrap();
  // This can be done with the lines() method
  // https://doc.rust-lang.org/std/string/struct.String.html#method.lines
  let mut data: Vec<&str> = buffer.split("\n").collect();

  // Cleanup the Vector, remove unnecessary lines
  data.reverse();
  data.pop();
  data.pop();
  data.reverse();
  data.pop();

  // At this point our Vec only contains interface parameters.
  for line in 0..data.len()  {
    _new_data = clean_up(data[line]).clone();
    data_to_return.push(_new_data);
  }
  data_to_return
}

This is the code that reads through the list of interfaces, and it is where we’re going to add the interface filter.

I’m not sure whether this could be done better, but it was my first code for reading lines, so it isn’t the best way of doing it. When I finally had time to work on it, the focus was on fixing the bug instead of improving the code quality. We update the function parameters to accept a list of interfaces to filter, stored in a Vec.
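The comment in the code already hints at lines(). As a minimal sketch of that idea, and not the code that is deployed, the reverse/pop cleanup could be replaced by skipping the two header lines. The function name interface_lines and the sample text are made up for illustration, and the clean_up step is left out because it is not shown here.

```rust
/// Returns the per-interface lines of a /proc/net/dev style text,
/// skipping the two header lines and any blank lines.
/// (Hypothetical helper, not part of the original program.)
fn interface_lines(contents: &str) -> Vec<&str> {
    contents
        .lines()
        .skip(2) // the first two lines are column headers
        .map(|line| line.trim())
        .filter(|line| !line.is_empty())
        .collect()
}

fn main() {
    // Made-up, shortened sample in the shape of /proc/net/dev.
    let sample = "Inter-| Receive\n face |bytes packets\n    lo: 100 1\n  eth0: 200 2\n";
    for line in interface_lines(sample) {
        println!("{}", line);
    }
}
```

Taking the file contents as a &str also makes it possible to feed in a test string instead of the real file.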

Writing this, I was thinking about the &str type used for the entries of that Vec. It works well because I define the list within the code. My initial idea was a String, which makes sense when you read the list from a command-line parameter or a configuration file. Anyway, it works for the moment.
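Should the list ever come from a configuration file, one option would be a function that accepts both kinds of entries. This is only a sketch under that assumption; is_listed is a hypothetical name, and AsRef<str> is the standard library trait that both &str and String implement.

```rust
/// Returns true if `name` equals one of the entries in `filters`.
/// Works for slices of `&str` as well as slices of `String`.
/// (Hypothetical helper, not part of the original program.)
fn is_listed<S: AsRef<str>>(name: &str, filters: &[S]) -> bool {
    filters.iter().any(|entry| {
        let entry: &str = entry.as_ref();
        entry == name
    })
}

fn main() {
    // Hard-coded list, as in the post.
    let hard_coded = vec!["lo", "docker0"];
    // Owned strings, as they would arrive from a file or the command line.
    let from_config = vec![String::from("lo"), String::from("docker0")];

    println!("{}", is_listed("lo", &hard_coded));
    println!("{}", is_listed("eth0", &from_config));
}
```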

Let’s move on and filter lo first.

pub fn read_traffic_file(filter_interfaces: &Vec<&str>) -> Vec<String> {
  // Reads the /proc/net/dev file and extracts the interfaces
  // ToDo: add the filter here
  // Do we need to re-open the file each time or can we just get updates from here?
  // ToDo: Allow custom files for testing
  let mut f = File::open("/proc/net/dev").unwrap();


  // Debug that we have the list of filter interfaces 
  for interface in filter_interfaces {
      println!("interface: {}", interface);
  }

  let mut data_to_return: Vec<String> =  Vec::new();
  let mut buffer = String::new();
  let mut _new_data: String = String::new();
  f.read_to_string(&mut buffer).unwrap();
  // This can be done with the lines() method
  // https://doc.rust-lang.org/std/string/struct.String.html#method.lines
  let mut data: Vec<&str> = buffer.split("\n").collect();

  // Cleanup the Vector, remove unnecessary lines
  data.reverse();
  data.pop();
  data.pop();
  data.reverse();
  data.pop();

  // At this point our Vec only contains interface parameters.
  for line in 0..data.len()  {
    _new_data = clean_up(data[line]).clone();
    println!("Interface data: {}", _new_data);
    // Here we do the filter magic: we skip adding the data when we have a positive match
    let mut _a = _new_data.split(" ");
    let _interface = _a.next();
    //println!("{:?}", _interface.unwrap());
    if _interface.unwrap() != filter_interfaces[0] {
        println!("True");
        data_to_return.push(_new_data);
    }
  }
  data_to_return
}

This works well for a single entry; however, I also need to filter an arbitrary number of interfaces. So we put this statement into a loop. At this point I ran into the following error:

no implementation for `str == &str`

The problem here is that the comparison tries to compare a str with a &str. So what now? I guessed it was the way I looped through the Vec, so I tried to use iter() instead.
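A minimal sketch of one way this kind of mismatch can come up, assuming the loop iterated over the borrowed Vec directly. In that case each element is a &&str, one reference level deeper than the &str it is compared with. The function name is_filtered is hypothetical.

```rust
/// Returns true if `name` is exactly one of the entries in `filters`.
/// (Hypothetical helper, not part of the original program.)
fn is_filtered(name: &str, filters: &[&str]) -> bool {
    for entry in filters {
        // `entry` is a `&&str` here, so `name == entry` would not compile:
        // the two sides differ by one level of reference.
        // Dereferencing once yields a `&str`, which compares fine.
        if name == *entry {
            return true;
        }
    }
    false
}

fn main() {
    let filters = vec!["lo", "docker0"];
    println!("{}", is_filtered("lo", &filters));
    println!("{}", is_filtered("eth0", &filters));
}
```

Indexing, as in filter_interfaces[0], yields the &str itself, which would explain why the single-entry version compiled.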

But after working through the line, I noticed that I had just built the same comparison by different means… The first mistake was not knowing how the range loop works. I had Python’s range() in mind; in Rust, however, this is written as 0..n. It is actually easier, I had just forgotten.
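For comparison, here is a small sketch of both loop styles over a made-up filter list. The function names are hypothetical; the range syntax and enumerate() are standard Rust.

```rust
/// Index-based loop, the counterpart of Python's `for i in range(len(xs))`.
fn label_by_index(filters: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    for i in 0..filters.len() {
        out.push(format!("{}:{}", i, filters[i]));
    }
    out
}

/// Same result without manual indexing.
fn label_by_iter(filters: &[&str]) -> Vec<String> {
    filters
        .iter()
        .enumerate()
        .map(|(i, entry)| format!("{}:{}", i, entry))
        .collect()
}

fn main() {
    let filters = ["lo", "docker0", "veth"];
    println!("{:?}", label_by_index(&filters));
    println!("{:?}", label_by_iter(&filters));
}
```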

Afterwards the loop was working. However, next there was an issue with adding _new_data. What did I do wrong? Tinkering with the if did not solve it. So I thought that maybe the if was the problem and tried match instead.

I’m not yet comfortable with Rust’s match expression. Somehow it doesn’t feel natural yet, even though it seems better than relying only on if statements.

That was nice at first. What happened, however, was that _new_data was borrowed, meaning that I would have to take care of the ownership. For such a small code block that seems a bit too much. Instead, I added a helper variable, check, that is evaluated before the data is added to the list.
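The helper-variable idea can be sketched in isolation. This is not the original match attempt, only an illustration under the assumption that each line starts with the interface name followed by a space; drop_interface is a hypothetical name. The point is to reduce the borrowed value to a plain bool first and to move the String afterwards.

```rust
/// Keeps every line whose first field is not `unwanted`.
/// (Hypothetical helper, not part of the original program.)
fn drop_interface(lines: Vec<String>, unwanted: &str) -> Vec<String> {
    let mut kept = Vec::new();
    for line in lines {
        // `first` borrows from `line`, so `line` must not be moved
        // while `first` is still in use.
        let first = line.split(' ').next();

        // Turn the borrowed value into a bool. After this statement
        // the borrow is no longer needed.
        let keep = match first {
            Some(name) => name != unwanted,
            // `split` always yields at least one item, so this arm
            // only exists to make the match exhaustive.
            None => false,
        };

        if keep {
            kept.push(line); // moving `line` is fine now
        }
    }
    kept
}

fn main() {
    let lines = vec![String::from("lo 100 1"), String::from("eth0 200 2")];
    println!("{:?}", drop_interface(lines, "lo"));
}
```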

The completed loop of the function looks like this:

  for line in 0..data.len()  {
    _new_data = clean_up(data[line]).clone();

    // The interface name is the first space-separated field.
    let mut _a = _new_data.split(" ");
    let _interface = _a.next();

    // Keep the line unless one of the filters matches.
    let mut check: bool = true;
    for n_filter_interface in 0..filter_interfaces.len() {
      if _interface.unwrap().contains(filter_interfaces[n_filter_interface]) {
        check = false;
      }
    }

    if check {
      data_to_return.push(_new_data);
    }
  }
  data_to_return

The most important lesson here is the use of .contains(). It is a bit like Python’s if a in b: it matches when the pattern occurs anywhere in the string, rather than requiring an exact match. This way I do not need to add any more complex regular expressions. It allows a simple and easy filter. It might turn into a problem when the string I try to filter on is too ambiguous, but for the moment this should be more than enough.
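To make the ambiguity concrete, here is a small sketch that compares the substring filter with a stricter prefix variant. Both function names are hypothetical, and the prefix variant is only one possible alternative, not something the program currently does. The interface name wlo1 is just an example of a name that happens to contain lo.

```rust
/// Substring filter, as in the loop above: drop `name` if it contains any entry.
fn dropped_by_contains(name: &str, filters: &[&str]) -> bool {
    filters.iter().any(|entry| name.contains(*entry))
}

/// Stricter variant: drop `name` only if it starts with an entry.
fn dropped_by_prefix(name: &str, filters: &[&str]) -> bool {
    filters.iter().any(|entry| name.starts_with(*entry))
}

fn main() {
    let filters = ["lo", "docker", "veth"];
    for name in ["lo", "docker0", "veth1a2b", "wlo1", "eth0"] {
        println!(
            "{}: contains={} prefix={}",
            name,
            dropped_by_contains(name, &filters),
            dropped_by_prefix(name, &filters)
        );
    }
}
```

With the substring filter, an entry like lo also drops wlo1, while the prefix variant keeps it.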

Deploying this change also came with a bit of a problem. After building it and installing it on the VM, a strange error appeared.

~# ./NetStats 
./NetStats: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by ./NetStats)
./NetStats: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./NetStats)

What just happened? Searching the internet for this error message did not turn up much. It took me some time to figure out: I had built on my local laptop against a newer version of glibc than the target system offers. I’m uncertain whether cargo can target a specific glibc version, so I did it the old-school way: copy the code onto the VM and build it there.

Now data is only generated for the actual interfaces and not for bridges or the Docker interfaces of containers. I also excluded lo. This reduced the amount of data massively.

This is also why I need a CI/CD pipeline to create releases for different systems.

So far,
akendo

Edit note: I’ve found a good explanation of the match statement on Stack Overflow[0]. The key takeaway is that match is used for pattern matching. In my understanding, it checks a value against patterns (literals, variants such as Some(..) and None, and so on) instead of comparing two values with ==. The typical if x == y, however, relies on both sides having comparable types, as seen in the error message shown above.

[0] https://stackoverflow.com/a/49889545
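The pattern matching mentioned in the edit note can be sketched on the Option<&str> that split(" ").next() returns. This is an illustration only; classify and its labels are hypothetical and not part of the program.

```rust
/// Sorts an optional interface name into a rough category.
/// (Hypothetical helper, not part of the original program.)
fn classify(interface: Option<&str>) -> &'static str {
    match interface {
        // A string literal is a pattern that matches an equal `&str`.
        Some("lo") => "loopback",
        // A guard adds a condition to a pattern.
        Some(name) if name.starts_with("docker") => "docker",
        // Any other name.
        Some(_) => "other",
        // No name at all.
        None => "missing",
    }
}

fn main() {
    println!("{}", classify(Some("lo")));
    println!("{}", classify(Some("docker0")));
    println!("{}", classify(Some("eth0")));
    println!("{}", classify(None));
}
```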