8.13.2015

Rust and the most elegant FSM

Finite state machines should be a foundation upon which Software Engineers build complex systems. In this post I want to show how Rust's enum type supports building FSM's in a very simple and elegant manner.


FSM, what?

For anyone who has studied Computer Science, finite state machines are drilled into your head without mercy (or should have been, go ask for you $'s back if these were not drilled into your head). The first CS class I took in college was basically all FSM's, DFA's (deterministic finite automaton) and Turing machines. I don't want to get into the difference between all of these, so I'm going to just refer to FSM's in general, of which Turing machines and DFA's are subsets. In this example I'm more specifically defining a DFA, as all of the states are known, and there is a begin and end.

Ok, but why is this important? FSM's can simplify problems where you have a known input and known outputs. Specifically what I want to use one for in this example is parsing, boring old parsing. It actually wasn't until I was in the industry that I truly began to appreciate how much simpler and more maintainable your code became if you took the time to define state machines to determine how to move from one state to another.

Background

Firstly, I'm still somewhat new to the Rust language, so when a "real" Rust person reads this, they'll probably complain that I've done xyz wrong, or should do it another way. Let me know! I'd love to understand what I could be doing better. I've been searching for a meaty project to work on to hone my skills in this language, and then I finally found one; It seems like there are security advisories that come out around DNS and specifically Bind all the time. Here is the list of all known Security Advisories in Bind9. Reading through some of the issues made me think, wtf, I'll just write a new DNS server in Rust, b/c you know, why not and it will be safe implicitly, right? While doing that, I was writing the parsers for the binary record data from the DNS rfc's and fell on a really nice way of doing FSM's in Rust.

Now in Java I've used lots of different FSM generators, because I wanted strong language guarantees, for reference, my favorite right now was written by a friend of mine and is annotation based called Tron. This is useful because Java's enums and other basic constructs are not quite as expressive as some of the expressions you can make in Rust.

On to the FSM!

The definition of what we need to parse from RFC1035:

3.1. Name space definitions
Domain names in messages are expressed in terms of a sequence of labels.
Each label is represented as a one octet length field followed by that
number of octets. Since every domain name ends with the null label of
the root, a domain name is terminated by a length byte of zero. The
high order two bits of every length octet must be zero, and the
remaining six bits of the length field limit the label to 63 octets or
less.
To simplify implementations, the total length of a domain name (i.e.,
label octets and label length octets) is restricted to 255 octets or
less. Although labels can contain any 8 bit values in octets that make up a
label, it is strongly recommended that labels follow the preferred
syntax described elsewhere in this memo, which is compatible with
existing host naming conventions. Name servers and resolvers must
compare labels in a case-insensitive manner (i.e., A=a), assuming ASCII
with zero parity. Non-alphabetic codes must match exactly.

There are some more sections that provide details on things like pointers, you can read the spec if you want. Here are the states (I used http://madebyevan.com/fsm/ to build this, which doesn't have edge avoidance so I added some extra states to keep the lines in order):

This state diagram basically represents the above, so I decided to translate this (minus the 'offset' and 'store' states) to Rust enums.

This ended up being pretty elegant. but first a side note on enums in Rust.

'enum' really?

So I've noticed this come up in discussions online a few times now. With questions like, "how do you enumerate a Rust enum?" (which has a really funny answer, IMO): You can't! Wait what? You can't enumerate a Rust enum? Why is it called an enum? No, seriously, I don't have an answer to this, why was it called 'enum', when you can't enumerate it's values (other then writing them in order, you can't in code treat them as an array or series as you can in other languages, Java/C/C++ to name a few).

I'm sure someone has an answer to that, but in the mean time I needed to clarify them in my mind. I think they should have been called 'union's because that is what they are most closely related to. In C a union is a data type which which occupies enough memory space to only hold the largest member of the union, but what C doesn't give you is a way to know which thing in that union it really is! (read more here: C unions)

What's cool in Rust is that it does tell you what's stored in the, ahem, enum. So you can write matching (or destructuring) logic like this:

if let MyEnum::Type1 = my_var {
   do_something_really_awesome();
}
Which is very powerful and cool.

FSM Now!

Ok so now for the super awesomeness of Rust. My enum is going to have four states, LabelLengthOrPointer == start, Label, Pointer, Root == end.

/// This is the list of states for the label parsing state machine
enum LabelParseState {
  LabelLengthOrPointer, // basically the start of the FSM
  Label(u8),   // storing length of the label
  Pointer(u8), // location of pointer in slice,
  Root,        // root is the end of the labels list, aka null
}

Ok, now for the code to run the state machine:

/// parses the chain of labels
/// this has a max of 255 octets, with each label being less than 63.
/// all names will be stored lowercase internally.
/// This will consume the portions of the Vec which it is reading...

pub fn parse(slice: &mut Vec<u8>) -> Result<Name, FromUtf8Error> {
  let mut state: LabelParseState = LabelParseState::LabelLengthOrPointer;
  let mut labels: Vec<String> = Vec::with_capacity(3); // www.example.com
  
  // assume all chars are utf-8. We're doing byte-by-byte operations,
  //   no endianess issues...
  // reserved: (1000 0000 aka 0800) && (0100 0000 aka 0400)
  // pointer: (slice == 1100 0000 aka C0), then 03FF & slice = offset
  // label: 03FF & slice = length; slice.next(length) = label
  // root: 0000

  loop {
    state = match state {
      LabelParseState::LabelLengthOrPointer => {
        // determine what the next label is
        match slice.pop() {
          Some(0) | None => LabelParseState::Root,
          Some(byte) if byte & 0xC0 == 0xC0 => 
                                           LabelParseState::Pointer(byte & 0x3F),
          Some(byte) if byte <= 0x3F => LabelParseState::Label(byte),
          _ => unimplemented!(),
        }
      },
      LabelParseState::Label(count) => {
        labels.push(try!(util::parse_label(slice, count))); 
        // reset to collect more data
        LabelParseState::LabelLengthOrPointer
      },
      LabelParseState::Pointer(offset) => {
        // lookup in the hashmap the label to use
        unimplemented!()
      },
      LabelParseState::Root => {
        // technically could return here...
        break;
      }
    }
  } 
  Ok(Name { labels: labels })
}
code here.

So I'll just point out a couple of things that Rust does for us here: 1) guarantee that all states are considered 2) that each state has a result because of the assignment to the state. You'll notice that there are some unimplemented!()'s in there, that's because this is a work in progress, but I was so excited about how easy it was to write an FSM in Rust with just the standard language semantics, that I just had to share.

Basically, the sweet sauce here is that the 'state' can carry context implicitly in the enum as part of it's tuple definition. You can obviously make that more complex than what I did here, but this was such a simple and elegant solution to a common problem.

Conclusion

Rust continues to impress me while learning it. There are definitely some oddities like enum's being called enum's when they really are something else. Also, I continue to fight the compiler on mutability around ownership, etc. It's not that I don't get the ownership model, I do... it's that after working with a GC in Java for so long, I have to train myself to think about it each time I pass a reference to something, and as far as I can tell 90% of the time I'm getting it wrong the first time.

Anyway, getting back to hacking DNS now.


6.23.2015

Software engineers are lazy bastards (pt. 3)

I decided to make this a three part piece. The first one concerns componentization and following good practices when building software as an argument for Software Engineering as a legitimate Engineering field. The second covers proper testing that all Software Engineers should be following. In this post I'll talk about Development and Operations, i.e. DevOps.

Ok so DevOps became a thing. If I understand it properly, which I'm sure lots of people will say I do not, it's the idea that proper development practices are brought to bear on operations problems. This means things like using source code management systems (Git), having code reviews, following a process around work that needs to be done. These are all great things, and it's way better than way that people did this before. But here's the irony, and something I find kinda sad, computers and software were built to automate things. They were built to run factories, to make it easier to compute flight paths, to help with order fulfillment; I'm not going to list out all the places they help, I hope you get the point. So the irony is that DevOps is needed to help automate the systems that we (Software/Hardware/Computer People) built to help automate away other issues. I can't be the only one who sees this as ironically sad. To really make a point here, I'm going to start with an analogy, trains.

Remember how complicated it was to drive a steam train?

Yeah, me too. You had to fill it with coal or wood, or whatever combustible thing was available at hand. And it was really hot, always sweating, that's why I stopped being and engineer and became a lazy software engineer. Ok, not funny, but still steam trains were really hard to operate. Look at all these dials and levers:


I'm going to guess that the big red lever is the brake (no, no, the one on the right not the left), Oh, and the dial on the left makes it go faster (the heavily used looking one). But seriously, I have no idea. The only thing I'm pretty sure of from this photo is where you put the fuel, and I'm guessing coal.

So, why am I talking about trains as a Software Engineer who knows little to nothing about trains? Because that picture above is what us lazy software engineers have been giving the operations folks for years. In fact, operations became so complicated that the operations folks needed to adopt development methodologies to manage systems at scale, and thus was born DevOps. Which is a great thing, using good standards in operations like SCM, releases of tools, declarative systems, etc. these are all great advancements, but why haven't they always been there? And why are they needed? Because as developers we've given operators so many dials to properly run our software (think on every config your require someone to write, every command line option...).

DevOps should not be necessary

I blame Java for the state of the world (don't get me wrong, I still love Java though Scala is growing on me as a JVM language). Java and many of the interpreted languages out there put us in a state where the developer actively divorced themselves from the system they were building software for. This was great for being able to build things faster, test it once and expect it to work the same on any platform you deploy to after that. I remember when I had to build a common client/server system in C++ with a network stack that was portable between Windows, Linux and SunOS on x86, amd64, and sparc. Having to build something that worked on all of those platforms and think about how it would run under COM in Windows, daemontools in Linux/SunOS was a major pain and caused a ton of bugs where some needed to be debugged after shipping the software to the customer. I do not want to go back to that, so these VM based languages are always going to have a place for portable code, but I do ask myself this question all the time; Do I need portability? I have only shipped software on one platform for the past 11 years, Linux. But I do think one day I'll probably end up needing to support BSD, so let's just say that I have no intention of supporting a non-POSIX OS from here on in my development career (enter interest in Rust). When software engineers experience the pain of running their own code, they will change it so that it's easier to operate. Why? Because at heart, they are lazy, they want to get back to writing code, not supporting this crappy thing in production. As a comparison, look at modern train controls, even I could probably figure out how to stop this or make it go faster:


The point is that Software Engineers have gotten lazy in how their code is deployed and run. There have been great advancements over the last four years in the operations world. Everyone has heard of Docker, some rkt, appc, LXC, LXD, etc, all of these are giving us lazy people easy methods of packaging our software. It's easier to implement actual deployment tests upfront now than any point in the past. You don't need to chroot, it's done for you; you don't need to do any port mapping, you can get a unique IP per container or VM; you don't need to guess about installation, you can use the installation tools to create the container image. What this means is that as a developer there is no reason anymore to hand these jobs off to anyone else to validate the functionality of your system. As a Software Engineer today you should know exactly how your software is going to be deployed, run, executed, what your data persistence requirements are, etc. You can not ignore this, and it's never been easier with all the tools that exist out there now. There are lots of methods, you should research what you want. I'll say this, I've been using LXC for over four years, and containers are the way to go. I can't be happier about the OpenContainer spec that was recently announced.

Where is this train going?

I'm watching these things: CoreOS, rkt, Atomic, Nix. I think that Nix is probably going to be the standard way of declaring system dependencies in containers. I think Nix or Atomic (with OSTree) will be the standard way of managing the BaseOS, but CoreOS is probably good enough for now. Work with your operations or DevOps teams to make this possible, as it will help immensely down the road. In other words, the operating system and application environment should be completely declarative.


Software Engineers are not actually lazy

I know I called software engineers lazy bastards, and that probably stopped half of them from reading these posts, but I actually don't think we're lazy bastards. We got caught up in the methods and complexities of the day and forgot to stay grounded. Basically if you want to be lazy, it's important to be lazy after your software is built and functioning properly. Which means to be a good Software Engineer you need to keep these three areas in mind when building software:
  • Modular and Component based systems (pt. 1)
  • Testing without relying on others (pt. 2)
  • Design your software to be easily deployed and managed (pt. 3, this one)
If you make good decisions in each of those categories, most likely you'll get to go back to being lazy and working on what you really want, and isn't that ultimately everyone's goal?

6.12.2015

Software engineers are lazy bastards (pt. 2)

I decided to make this a three part piece. The first one is here if you're interested in reading it. It concerns componentization and following good practices when building software as an argument for Software Engineering as a legitimate Engineering field. In this post I'm going to cover proper testing that all Software Engineers should be following. The final post is on DevOps

Let me start off by explaining why I am calling Software Engineers lazy. This stems from the general principle that people are going to generally do the least amount of work possible to accomplish a given task. Software Engineers are no different; Computer Scientists on the other hand are perfectionists always looking for the most elegant solution to a problem. Perhaps even creating a new theorem or axiom, it's my job as a Software Engineer to understand and utilize these new advancements (and perhaps one day I will create some, I did get a degree in Computer Science after all). Like the CAP theorem or Raft consensus protocol, it's definitely my responsibility to understand the theories behind these and be able to implement them if necessary, but I look at these as things like I beams in construction that can be bought off the shelf. My job as a Software Engineer is to take disparate pieces of technology and put them together to build a larger system. But why are we inherently lazy? Corners will be cut in order to ship software, systems will have components that aren't complete, because we have limited time and money which constrains our abilities to be perfect.

In other worlds, the real world intervenes and so what we need to do is find a way to mitigate issues down the road. This is why I'm so adamant about componentization or modularization in my own code, and any team that I'm leading. If that piece isn't perfect, go back and refactor. This post isn't about that, what it's about is how to make each of those components as high quality as possible. But before that let's discuss the quality of airplanes.

We need some chicken guns!

Chicken guns, created in the 1950s to test the strength of different components of an airplane.
F16 canopy birdstrike test
They're used to check the strength of both engines and windshields in the event that a large bird hits the airplane. It's exactly what it sounds like, a large pressure gun that shoots chickens at a high velocity to simulate a midair strike (like that potato gun you used to accidentally break the window of the garage at your friends house, no, not me...).  This is obviously important, you don't want to discover during flight with 150+ people on board that the windshield or engines can't withstand something that's fairly common in the air. So the chicken gun is used to make sure we're all safe in case of a mid-air strike with a bird. Cool right?

The wingbend test is used to make sure that the aircraft wings aren't going to fall off or apart in mid-flight. It's that fear you have in the back of your mind when the plane is bouncing around in crazy turbulence that one of the wings is just going to pop off and you all go spiraling down to your deaths. Aren't you happy that the wings will bend up to a degree that is so mindbogglingly extreme that you could try a 900 like Tony Hawk on them? But even before this test, they already have subjected the aluminum composites being used to understand their tensile strengths. Testing all the way down to each panel before it's attached to the plane.

There are tons of tests that are run through before a plane can even fly. Checking that all the electrical systems, the flaps, etc., are fully functional. Then they actually take the plane into the air for a battery of flight tests. These tests are critical to determine the safety of a plane. Now imagine for a moment that the tensile strength or longevity of the metal wasn't known before the flight test, that the pressures of the initial flight test shows that the internal ribbing of the plane needs to be replaced. How difficult would that be? How much more expensive is it to replace that at the end, than to realize it at the beginning and just choose a different material in the first place? Code quality should be treated in the same way.

What is quality code?

This question is always answered in a lot of different ways. Years ago I was working on a project in which I needed to modify some of DJB's code. He writes a quality of code that is impressive. C that's platform independent, big-endian and little-endian compatible. It was a really cool experience to see that amount of thought put into writing software, but I wouldn't say it's easy to grok. While I have the highest respect for what he's done, in my case what I think is just as important is writing code which is easily maintainable. Components help with this, but so do tests. A friend and former coworker of mine introduced me to this idea of a testing pyramid. For a basic overview of this, here's an ok blog post. I'm going to modify that in some important ways, especially since I'm not a UI developer, I won't go into UI testing at all (though I would suggest you need this same structure on the UI side, with actual selenium tests or similar at the top of the pyramid). I'm a backend systems architect, <sarcasm>when do I need UI</sarcasm>. But in the vein of fighting the lazy software engineer, I think it's important that we software engineers have built tests around each of these areas, here is the pyramid that I implement my code by:



I'll run through each of these sections, but the important thing is that what this represents is that most of your test coverage occurs at the bottom of the pyramid, with ever smaller numbers of tests flowing to the top. From a code complexity standpoint though, the tests at the top of the pyramid are more difficult to write and maintain, therefor there should be fewer of them. I'll start from the bottom up, in fact I'll spend more time describing things bottom up too.

Unit test your code, fool!

If you're not writing unit tests around every class you write, you are a fool. Yes, you are, I know you're saying, "but that's just a simple function that adds two numbers and returns the result". But if a function as simple as:

int myAdd(int i, int j) {
  int k = i + j;
  return k;
}

still could have a bug! Hell, technically speaking you have an overflow problem that this isn't dealing with where i and j could be max ints and overflow the size of int. Also, it's just as easy to accidentally write 'return i;' and cause a bug in your most trivial code. I'm not suggesting to go crazy, just write a quick sanity test.

There is one rule that you must follow though when writing Unit tests, they must be clean of any external dependencies. Don't get stuck initializing a massive tree of dependencies, use Mocks for this stuff (if your a Scala or Java dev and you don't know Mockito, then learn it, right now, it will save your life). If you have to launch a DB to make your unit tests work, those are not unit tests (they are more likely functional or integration tests). So don't do that! You just slow down your development speed because you've made it harder to run your tests, and it takes longer to setup your development environment. I follow these rules on unit tests:

  • No external dependencies (i.e. DBs, or other remote services)
  • Don't require shared state between tests (or as little as possible)
  • Always make sure your tests can be run in your IDE with no other actions needed than compiling
  • Test all expected code paths in a method (i.e. you don't have to go overboard with exception cases)
  • If it's expected to be multi-threaded, write some threaded tests (these are fun, and will teach you a lot about making your code testable)

By-the-way, a target of 80% code coverage is generally good for unit tests, more than that and you start making reformatting code more difficult. Also, more than this and there are diminishing returns on the tests themselves, you're aiming for checks to make sure your code is going to function properly in production, with tests based on real world issues. Unit tests are your base set of tests, they test the tensile strength of the materials which your later going to use to put together your larger project. Like the aluminum in the plane, if you discover issues this early in your development, your saving yourself a lot of pain (and time) down the road, and your lazy, right? You don't want to spend any extra time writing code than you need to...

(Quick aside, I dissed Ruby for it's speed in my first post on this, but to be fair the Rubyists really nailed tests living with your code. Rust has taken testing to another level where you can actually include tests in comments, and then have examples that are always confirmed to be uptodate with the code, super cool) 

But functional tests are way better!


Ok, not really. Functional tests are the next order of tests. As an example, this would be where you test your SQL against your DB (or NOSQL against your non-ACID whatever). Functional tests should only test one thing. I was working on a large project before that had a lot of various external components to work with, one being DNS. It used the dynamic DNS protocol to update the DNS records. First the DNS module was a distinct component in the code such that it had one responsibility, talk to DNS. The functional tests only tested this one component, they did not have any other side-affects, e.g. the DB code et. al. were not dependencies for the tests on this component. From our plane example, this is like the wing-bend test. The plane is put together, but specific functions are being tested to make sure it will be ok in the air, sadly I don't think Tony Hawk could skate on any of the code I've written.

Cool, but functional tests have side-affects, don't they? For DBs and DNS or similar persistent state systems, how do you make sure your tests are starting from a known state? Virtualization! VMs are cool, use them. Use Vagrant to setup your VM, use installation (RPM, DEB, Chef, Docker, etc.) scripts to install the packages, cache the installed VM (so that those steps aren't long in the future), and then clean up all persistent data in the VM before running your tests. It sounds like a lot of pain, but the advantages are awesome; you get to figure out at the beginning of your dev work how your going to install and bootstrap each system, and your guaranteeing that everyone testing is starting from a known point. Pretty awesome and saves you a ton of time later.

My rules for functional tests:
  • One component is being tested
  • Always start from a known state by cleaning up persistent information each time (at the beginning, not the end!)
  • Install your external components now the way you plan to in the future (save yourself some time and heartache later, hell you might decide it's so ridiculously complex that you changed your mind about using it)
  • Only test high level functionality, the unit tests cover code paths, these tests make sure that your assumptions about the thing your using are correct.
What's great about these tests is that when you discover bugs in the interaction between your code and the external system (yes, there will be bugs this doesn't catch), you get to come back to these and replicate the bugs here. Isolated directly to just this one component without having to worry about other interactions, overtime hardening this component to ridiculous degrees.

OMG, all my inter-team dependencies just got easier!


If you're following along, then you'll notice something that you just got for free here. Scrum teams are all the rage these days in large development organizations. If you follow these nice boundaries that your functional tests are helping to guarantee, separate scrum teams can actually work on separate components that will be more likely to work together in the end. It allows those inter-dependent teams to pull in each others tests early and use them to help with their integration tests early. Great for helping deliver code faster in larger organizations. This will make your manager really happy, because they get to focus on all that HR stuff instead of technical problems (which is what they love, right?). And happy managers => promotions => more money => happy and lazy software engineer.

Ok, but I still don't know if my whole system works


Yeah, this is getting repetitive. Integration tests are similar to functional, but this is where you bring things together to make sure they all work. Essentially this just makes sure that when you put your DB in place and your DNS service in place that calling your REST API endpoint actually flows through each component and performs the action you expect. And why is this easy to set up? I know it sounds tedious, but you already have your VMs from your functional tests, reuse them! And again, you get to validate your assumptions about how all of these things will interoperate and what connectivity you need between them before production. Again cool, right?

Rules are the same as for functional tests, but for this you are writing a very minimal set of tests. This is the only place where you do end-to-end tests, in the plane this is akin to the full system tests performed before the flight tests.

Do we get to fly yet?

Ok, yes, you can fly now. Smoke tests, these validate that your system, after being deployed in production, is working as expected. I used to think that it's enough to run these post deployment only, but then I realized something, there are all sorts of monitoring pieces that need to run constantly to make sure your application is actually working. Use your smoke tests for your monitoring needs! What's the difference between these and integration tests you ask? Not much. The only rule here is that your smoke tests should not damage your production systems, so design your system and the tests to not over time take down your production instances. Oh, and guess what, you can virtualize your system just like you did for your integration tests to test these locally as well. Virtualization is an excellent way of validating your stuff before ever hitting a real deployment, and you can do it while sitting on the beach disconnect from a network on your laptop. If your not using VMs to test your deployment steps, then your just making it harder on yourself to verify this since you'll need to share time on actual full test beds or production.

But I'm a lazy bastard, remember?

So here's the key to getting to be lazy. What's harder, replacing the ribbing on the plane after it's built, or when you actually choose that material initially? It's the same here, if you have to track down crazy bugs because you never figured out if your stuff actually does the right thing at the most basic levels, debug time in production is much harder. You also didn't give yourself an easy place to replicate production issues to verify a fix before releasing. We've all been there, logging into production systems at 3am trying to figure out what's wrong with the damn system, definitely not lazy, compound that with an enterprise system where you as a developer are not even allowed to touch the production system, so you have to ask someone with production access to check stuff for you, all over the phone (or similar remote technology, because you were happily sleeping at home when this random problem just caused the entire site to crash and of course it was on your on-call night and you'd much rather get back to that dream and sleeping in, and oh my god now I have kids and they'll be up to less that two hours because debugging a system through someone else has already wasted an hour and you've only just gotten the heapdump, because it had to be uploaded to a safe directory so that you could download it, because you're not allowed to log into production! This is definitely not lazy). Remember, at each layer of the pyramid it is more difficult (and thus affects your ability to be lazy) to debug and fix issues as you get higher up the pyramid's stack. So the moral? Don't be lazy at the beginning and skip your testing, be lazy at the end when it all just works. And yes, even after all of this, eventually there will be a battery that ended up catching on fire for (initially) no obvious reason. At least it wasn't the engine blowing up mid-flight...

In the third part of this series I'll get into development's relationship with operations, yes, DevOps.

6.08.2015

Software engineers are lazy bastards (pt. 1)

I decided to make this a three part piece. This one is about components and modularization, the next about testing, and then I will have a final one on development and operations.

Not too long ago, 2000 let's say, it was common for developers of software to be so lazy that they didn't even know if their software worked. They wouldn't write unit tests, they wouldn't even bother testing their code. They would throw it over the fence to quality assurance engineers and expect them to say, "Oh my god this is the greatest code ever, and it works perfectly". With the exception of the most simple program, this has never actually been the case, hell even Grace Murray Hopper (go Brewers!) couldn't account for literal bugs in the system. This has often caused great angst among real engineers out there and even today people in the Software industry continue to say that Software engineering is not "Real Engineering". This is too easy of a copout and let's people off the hook for designing bad software, or using bad techniques.

It's not like we're building Bridges!

This is a classic argument. Bridges are ridged and have to be designed up front to meet the engineering requirements to span what they are being built for. They are static entities that never change, right? Well, that's not totally accurate, even great engineering feats aren't perfect and need to be fixed, look at the new eastern span of the Bay Bridge. There are a bunch of issues that weren't properly accounted for and now need to be fixed. So, bridges aren't perfect, and need to go back and be fixed, sounds a lot like software. To try and match the way that buildings and bridges are engineered, the Waterfall Method was created (I remember this being taught to me in college as the new cool awesomeness for development). It really seemed like the greatest thing ever.

Years later I started working at a large company, when I got there they hadn't shipped their planned release for over a year. It was stuck in this cycle of Dev -> QA -> Dev -> Product -> Dev -> QA that was never ending. So this was first hand experience of this awesome technique that I was taught in college, and it's abject failure to deal with real world issues. What's the problem? Software has a million moving parts. It's dynamic by nature. This is why the Bridge analogy breaks down so quickly, but it's just the static bridges that don't match this. Not all bridges are made this way, check out the living bridges in India.

What's my point? The issue with Software Engineering isn't that it's not actual engineering, it's that it's Dynamic Engineering. It's constantly adapting and changing to new uses and needs. When you built that website tool it only expected one user at a time, then it blew up and you had like a whopping ten, so you decided it needed to run in parallel. Did you go back and rewrite the entire thing? No, you went back and made it parallel so that all those ten people weren't waiting on each other when they went to your site. Now of course we deal with thousands of connections at a time in an application and so that simple synchronization you did needs to give way to optimistic locking and atomic reads/writes to drastically increase the performance of your application. No locking; again, did you rewrite the entire thing? No, you fixed the weak spots (well unless you wrote it in Ruby and realized that you needed it to actually run fast).

Along came TDD

I remember when Test Driven Development came around. If you know all your inputs and outputs, you can just write the tests for it and then the code will just flow to match your test cases. The problem with this is it shoves Software Engineering back into this idea that it is static and non-changing. The one case I've found it useful is parsing and finite-state-machines, you know all the inputs and expected outputs, so it really does help you validate and write your code faster (let me know if you know of other really good fits). In most cases though, you actually don't know what your inputs are until you design the interface and then try to write the code behind it, only to discover that the interface you designed doesn't allow for nice code to be written after that. In fact, I have found TDD to actually be about as slow as Waterfall in terms of delivery. Ok, so TDD doesn't work, but it did pass on something really important, Test Oriented Development. This term is rising in popularity, but what it means is that the code you write is easily tested through the use of Mocks and Unit tests. So why is this an important development in the field? It shows that there is a discipline to coding, and technique which allows for software to be verified to be correct within whatever bounds you've declared. Ok, now we're getting closer to becoming engineers.

Dude, where's my Bridge?

In the American Civil War (War of Northern Aggression for you Southerners) the Confederate army burned many railroad bridges to try and slow the advance of the Union armies. The Trestle bridge construction was used to build bridges quickly and reliably in order to get the Union trains moving. So this is where we come to software as components in a larger system. Object-Oriented design, code reuse, I'll avoid Microservices for right now, maybe a future post. The point is that Components, like the trestle, allow for systems (bridges), to be built faster. The nice thing about this is that like an individual trestle, any component that fails or needs to be changed can be done so without rewriting the entire system. Components must have strong interfaces by which it is known which is in use in software (one reason why I've been attracted to OSGi in the past). So we can build bridges with individual components because each one can be verified individually before inserting into the whole. Unit tests on your components that cover what the expected input and outputs of the components mean that now you have a easy method of verifying a rewrite of that component, and any future rewrites after that.

So am I an engineer?

I build things, I follow strong rules about how I build these things, and I verify that the things I build are correct and fully functional. The entire point of writing this is that I find it annoying that people keep saying that we software engineers are not engineers, that at best we're designers or worst hacks. This is also used as an excuse by lazy developers to not put in proper design and controls in the software they build. This is not true and it's possible to build software both quickly and with a regard to future changes when required. Agility is key and the organization needs to be amenable to Agile development methodologies, with constant iteration to hone in on the ideal system. I follow these rules when building software:
  • Use Components to create strong Separation of Concerns and boundaries
  • Unit and Functional tests to verify those Components
  • All data is Immutable by default for ease of threading
  • Any system operation or high level operation should be idempotent
What these allow me to do is follow fast iterative design. Create a quick top level design, use that to guide the development of each component down the stack. When guiding large architectures across teams, this is essential. Get initial code out as quickly as possible such that the real world stresses can be discovered and fixed as issues come up, and because you have test harnesses and good boundaries around your component it's easy (ish) to replace. I am a Software Engineer, I follow strong engineering practices around what I do, do you?

What I've left to the reader, proper test design (checkout TestNG for Java and Mockito) and methodology, good Agile development methodologies (pick one). Do your own research.

I decided to make this a three part piece. The next one deals with testing if you're interested in reading it. The final is about DevOPs

5.14.2015

Rust from a Java dev

A little background


Let me say first off, I am not one of those developers that jumps on the next cool language. I have a very specific reason why I am interested in Rust. Rust represents an evolution of systems languages of which our options have been limited now for decades. A systems language in my opinion, and I'm sure there is lots of disagreement about this, but the traits important to me are:
  • Constant and predictable runtime
  • Direct access to OS primitives and interfaces
  • Low overhead for the language runtime

System software is computer software designed to operate and control the computer hardware, and to provide a platform for running application software. System software includes software categories such as operating systems, utility software, device drivers, compilers, and linkers.
That definition seems weak to me, but the line between systems and application languages had been blurred long ago. When choosing a language for any project, there should be specific criteria by which you determine what you need based on what you need to accomplish. Most languages will allow you to accomplish any task you want, but they offer features that may make what you want to do either easier or harder. Picking the right language is important, how many external libraries are available to support what you're building, how many developers exist to support you if you need to hire/contract work, how many unknown bugs or issues remain undiscovered in the language. For most of my use cases Java has always been the right language, but this post isn't about Java, it's about Rust. Also, I don't want to get into defense of Java/JVM over other languages like Go, Ruby, Scala, Clojure, etc. I've been developing in Java since 1997, and I stopped using C/C++ for anything major after 2002.

I discovered Rust while exploring Go. This was only a few months ago, which make me a late-comer to this game. Remember the two bullet points above; constant and predictable runtime, direct access to OS primitives. Go has always confused me, is it a systems language? It's got a Garbage Collector (GC means unpredictable runtime) and it's awkward access to C structures means that it can not easily interface with the OS. It was the latter issue that brought me to Rust which allows for direct access to C structures through it's FFI (Foreign Function Interface). For you Java devs who've always hated working with JNI this is a godsend (Yes, JNA made it easier and Panama should help immensely). Go ends up being better at potentially replacing Java and the JVM.

On to Rust

What makes Rust unique in this landscape is that it's a new language with no GC, but it at the same time it handles the memory management for you. It allows for the same direct access to C as C++ does, with no significant hurdles. Coming from a Java landscape where there are a multitudes of libraries in addition to the massive JDK standard library, what this means is you have access to not just Rust elements, but also the massive C library out there. I think of Rust as having two core features, memory safety and management. But by no means is this the limit to what Rust provides. As a Java developer not having to worry about memory allocation and deallocation is nice. I've gotten quite used to a GC which makes sure that I don't have to worry about seg faults and memory leaks any more (yes, people create memory leaks in Java too, but you have to make some very poor choices for that).

Option<T>


Rust is trying to position it self as a replacement for C, and I've become convinced that unlike C++ and some other languages out there, it actually can. This is because of the fact that Rust gives you access to all the pointer primitives you have in C, but guarantees that accessing the data behind those pointers is always valid. One cool thing that I really like about Rust is that it does not allow for null, this was a little mind blowing, though after building a small project in Swift I learned how nice it is to know that the data you're accessing is either there or not.

Option is a primitive in Rust that is used to return or pass around data which is optionally there or not. It's an enum (union) type in Rust. When in Java we gave up segmentation faults, what we replaced them with were NullPointerExceptions, the only advancement that gives us in the end is the stacktrace. Rust simply says, this data may or may not exist behind the Option type. This tells the developer immediately that you need to handle those two cases. It also gives you some handy functions at your disposal to handle the null case elegantly:

fn unwrap_or(self, def: T) -> T
Which returns the value T if there is nothing, i.e. the Option is None. In Java 1.8 there is now the Optional type which acts the same way. I for one am going to start using this in all my future Java code as this is a very nice feature. Scala has had this from it's foundations as well.

Enums in Rust are cool

Enumerations in Rust are cool, they are a union type. There's not a great corollary to this in Java. The first thing to understand is that an enum in Rust allows you to define values that are stored with the enum, this is done very efficiently in the compiled code, but I won't get into that.

Look at the Option enum:

pub enum Option<T> {
    None,
    Some(T),
}
The first thing you'll notice as a Java dev, is that enums can be Genericized. I've wanted this so many times in Java and end up doing funky stuff to work around it. Now when using this you need to do something that was unfamiliar to me, destructuring, the unwrap() function looks like this:

fn unwrap(self) -> T {
    if let Some(v) = self {
        return v;
    }
    panic!("This is None, can't unwrap()");
}
v is a locally scoped variable defined to unwrap the enum where the enum is of type Some. This type destructuring is neat and allows for very flexible usage of enums in Rust. Ok, this introduces panic! which is essentially similar to throwing a RuntimeError in Java (the ! says that this is a macro). This introduces an issue I have with the language, no exceptions (well, except for panic!). Basically, Rust allows you to throw one type of Exception, and comparing it to Errors in Java is good. Errors should not be caught in Java, they are non-recoverable. Panic is similar, and similarly should almost never be used. But they are there, and at Rust deals the the stack unwinding properly so we don't need to worry about dangling pointer issues. I'm torn as to what I'd prefer to see, and this article is a great writeup on the issue.

Errors in Rust are a little juvenile, but still really neat

So this is where you can see the roots of the desire for Rust to be a bit like C. In C there are no Exceptions, there are return codes. What makes Rust's errors better than C is that since they are enums, there is no question to if the result is good or bad. Objects are generally passed in by reference and then the data returned through that referred parameter. Rust is similar, but instead uses an enum for the result:

enum Result<T, E> {
    Ok(T),
    Err(E),
}
Where there might be an error you return Result. I actually like this, I've noticed you're code becomes much more condensed than the try{} finally {} blocks in Java, though the AutoCloseable stuff has made that nicer. There is no finally, it seems that there has been a long debate in the Rust community about this. If you need finalization, then you need to implement the trait Drop for your struct.

The reason I say errors are juvenile in Rust is that it becomes hard to write functions which join different error types and return those errors. In Java it's easy to wrap any Exception and pass that up the call stack as a cause of the Exception. In Rust this still feels very awkward. In Rust you can't extend objects, all structs are final. What this means is that if you want to create a new Error type in rust, you need to write a bunch of code, or maybe I'm doing it wrong. I imagine this will get better over the coming releases.

Ownership and Lifetimes

This is where Rust get's a little difficult. For Java devs it's important to think about the lifetime of objects because your always concerned with how complex the Garbage Collector will have to work, or not make dumb mistakes that lead to memory leaks. Ownership is something that you have to be concerned with when working with Threads in Java. It's also something to make sure where you don't have any dangling references. In Rust it's not just something you need to be cognizant of, you have to abide by it's strict rules related to these two concepts.

Rust introduced me to a concept of tracking Lifetimes. Initially reading the docs and specs, I thought this was going to be the hard thing to wrap my head around, but in the end it's the Ownership rules that I find to be the difficult piece. See this question I posted in Stack Overflow to get a bit of an understanding to the complexities of Ownership rules.

LLVM

Rust compiles to the LLVM byte code. The LLVM is a really awesome concept and reality. I've been following it since Apple decided to start using it in their OS, and it's clear to mean that this is itself a game changer. The concept behind the LLVM is basically to produce architecture agnostic intermediate code which is then compiled to for the specific architecture on which the code needs to run. This is powerful because it allows for all high level languages only worry about parsing, but the machine code generation is only written once for each architecture. The GCC started doing something similar a while back, but the LLVM has, IMO, done a better job.

As an example, asm.js is a new JavaScript spec that defines a subset of JS which can be optimized by the interpreter to be run like machine code. As a consequence there is no need to do Garbage Collection and some of the numbers I've seen show asm.js applications running at 50% native speeds. But, it comes at a cost, asm.js is complex, writing it by hand is difficult, but this is where the LLVM shines, the Emscripten tool is a compiler for translating C and C++ into asm.js. What this means is that for us people who like strongly typed languages, we will be able to use them instead of JS to wrote for the Web, which is really awesome. By extension, this means that Rust too could be used for writing web code. 

When should you start using Rust?

Now, depending on what you need it for. For me, having a predictable runtime (no GC) is important to a project I'm working on. Having that also run at native C like speed is also desirable. If you have a project that also needs this, Rust seems like the perfect fit. If on the other hand you have a high application to build and time critical deadlines to get it into production I'd still stick with Java or Scala to deliver that. Rust doesn't have the same amount of libraries and hasn't been beaten around in production to discover all it's bugs yet. It's also possible that the language will change a bit over time, but this will probably all calm down after the 1.0 release.