Computer Science, Math and Clever Dolphins: February 2012

Friday, February 24, 2012

BIOS primer

Most of us know what a BIOS looks like, and have a some bits and pieces of an idea about what its supposed to be doing. This needs to rememdied, so, read the rest of this article!

What happens when you hit the power button on a computer?

It doesn't actually go directly to the CPU, first, the BIOS code is loaded, because that's what's able to hoist up the CPU, hard drive, display, etc.

The BIOS is contained on an external chip (that's why some of you see your motherboard's manufacturer's name when you see the BIOS).

A BIOS not only lets you set the boot order (which was was most of us have used it for), it has a couple of jobs.

First of all, it configures your hardware.

Some hardware is dependent on others, has specific settings, etc.
All of this is handled by the BIOS, so that is ready for the bootloader.

Also, all (well, nearly all) computers have a system clock, which doesn't actually tell you the "time" (again, it could), it counts ticks from a certain date, such as the epoch. The BIOS sets this up.

It also selects what devices it can use as bootable, because there are certain types of storage devices that a BIOS cannot load of off (for example, random storage, volatile memory), and it identifies which are bootable.

Then, comes the main job of the BIOS. It calls a bit of code that resides on the selected storage devices on the first 512 bytes of the first sector.

In the early days of computers, 512 bytes was enough code to load the operating system, and things worked wonderfully.

Of course, this is no longer the case considering that the Linux kernel is almost 15 million lines of code.

So, these 512 bytes (known as the MBR) usually call *another* piece of code which then loads your operating system. Or, it could hold a list of sectors on your hard drive from which another piece of is then called.

The BIOS also provides a small API for the MBR to use to write to the screen, make some interrupts, etc.

Its pretty cool stuff...

Sunday, February 19, 2012

C Programmers: Don't write macros

Take pretty much any large C project today.

If you look through a couple of files, you'll notice some things are written in CAPS LOCK AS IF THEY'RE YELLING AT YOU.

These are macros.

They're not as powerful as Lisp macros; they just replace themselves with a bit of code.

C Macros are actually quite nice if they're used in moderation, because they allow you to type less.

But when you have sixteen macros in one line, you really need to stop.

Please.

So, this is my plea to all of you C programmers, please stop writing more macros, because you're going to regret them when you have to go hire a new guy and he'll spend 19 weeks trying to figure out what all the YELLING ACTUALLY MEANS.

Monday, February 13, 2012

My stance on JVM Languages

There have been JVM languages popping up all the time; there's Scala, Clojure, Groovy, etc.

These are all excellent languages, without a doubt they are. I use them all the time, and I have come to enjoy them a lot.

But, sometimes, I wonder about the very long-term (well, at least in the software world) consequences on using these languages.

Java, as many of us consider it these days, is an "uncool" language. It is too verbose, too complicated and has too much over engineered crap that no one uses.

It lacks the simplicity of something like C (notice I did not say C++ :P), where the manual for the ENTIRE LANGUAGE was a hundred something pages long.

But, there is a ton of existing code in Java that no one wants to let go of, because people spent so many years trying to develop that code, and it took take many more years to replicate it in another language.

So, people created JVM languages which run on the Java Virtual Machine, and they can interface pretty easily with Java. Then, you can write all your new code in a fast, hip language like Scala, but, you can still keep all of your legacy stuff around, and you don't have to rewrite a thing!

This works wonderfully, except...

Java, in my opinion, is slowly turning into another COBOL. Heavily used in large, high capital industries where a change is made every few centuries (roughly speaking), but, not used much elsewhere because its just too obsolete.

So, if one creates languages on top of such a language, you're still hooking into the same old architecture. That's like writing a language that compiles to a COBOL virtual machine, people would laugh at you.

Does this change anything for us, right now?

Absolutely not.

The term we're talking about (for Java to turn into COBOL), is just so long, that most companies won't even live to face the day.

I'm still going to keep using JVM languages, because they're just plain awesome, but, I just wanted to put my thoughts out there.

Sunday, February 12, 2012

Graph databases explained quickly

Graph databases seem to be pretty important these days, with a social network popping up every time someone says "Facebook", so, I decided I'd do a small article on them.

Take a traditional, SQL database.

What's it good at?

Its good at specifying tables in which specific types of data lie.

Okay, what's it bad at?

Well, a lot of things, one of which being that having one piece of data "linked" to others is a complete pain.

What does that mean?

If you're a person, and and you have 500 friends, you need to be linked to 500 other people, all of which, in turn, are linked to you AND a bunch of other people.

Now, let's consider a graph database.

In a graph database, there are things called nodes, properties and edges.

Nodes are close like objects/structs to the OOP people, in that they are entities that can represent people, accounts, etc.

Properties are information that nodes have, for example, if one of the nodes was "Person" its properties might be "Name", "Age" and "Address".

Edges are the things that connect the nodes together, which are really the most important, since in a graph database, it matters a lot more about how the nodes are related than what the nodes actually contain.

Hopefully that made sense, if not, drop a comment below!

Follow if you liked it :)

Saturday, February 4, 2012

What's referential transparency?

You've probably heard about it before from all those crazy functional programming people, but, you don't *really* know what it means... I'm here to fix that!

Consider a procedure/method written in an imperative language.

In such a procedure, the program can do anything it wants!

Which means it could multiply two numbers, crash your car, and then save those numbers to a database.

The things that it does that affect the outside world are called "side effects".

In a purely functional programming language, side effects don't exist.

But, in "almost" purely functional programming languages (in which you can actually do something practical stuff like writing programs that do things), the things that have "side effects" are confined to a couple of functions that interact with the outside, but, the rest of the functions (called pure functions) are entirely separated from the outside world, which means that they can be executed anywhere can we can expect the same results.

Consider math for second.

Mathematics is a purely functional language, because there's no way it can affect the "outside world" (atleast, mainstream math can't).

Say you define a function called $f$:

$f(x) = x^2$

Now, what do you get when you compute $f(2)$?

$f(2) = 4$

Okay, so, what's so important about that?

That simple statement means that no matter what the rest of the world is in, if you give the function $f$ a value of $2$, you can bet the farm that it will return $4$.

The same holds true for pure functions in functional programming languages!

And, that's what referential transparency means.

Basically, we're saying that if we plug in some value $m$ into some function $f$, we can always expect to get the same value back ($f(m)$), regardless of the state of the rest of the stuff we're dealing with.

Why's that useful?

First of all, that's a heck of a lot an improvement for compilers/interpreters.

If you're finding the value of $f(2)$ 40 million times, it will (usually) only find it once (this also has to do with how parse trees are replaced when parsing functional programming languages, but, the basic idea is here).

This also lifts the idea of code running sequentially, because it can be run however it would run the fastest.

Referential transparency makes code a lot easier to reason about and debug, since you KNOW that something will return a certain value when it receives a certain set of values as input.

There you go, you should hopefully understand referential transparency.

Friday, February 3, 2012

Functional programming explained

I've noticed a lot of confusion over why functional programming languages seem to be what the cool kids use these days, and why they can actually make a difference in your productivity as a developer, and I hope I can clear that up with this.

Consider a standard, run-of-the-mill imperative language, like Java, or C.

Your procedures or methods typically look like this:

Basically, you're taking two numbers as arguments, adding them, and returning them, and saving the result to a database somewhere in the middle.

Not too shabby, right?

Consider this situation. For some reason (your application isn't working), you need to check if this particular method is working.

What's the problem? Well, you can't test this method separately, because it needs access to the Database class, which needs access to about 6 billion other things!

The ability to test things separately from others is called "decoupling".

Another issue with imperative languages is that they can modify all kinds of state in one function!

So, in effect, when you call the function addNumbers, it could delete your database, empty your bank account and draw a circle on the screen. Why is that a problem, you ask? Well, that's a pain to debug, because you have no idea what types of functions can modify what types of state!

Another issue are global/static variables, which cause no end of problems for Java programmers who find themselves changing static fields in certain objects affecting all the others in adverse ways, and this takes a ton of work to weed out.

How does functional programming solve these problems?

The first thing to understand about functional programming is that in a completely pure functional programming language, there is no way you can modify external state.

That means you can't print to the screen (that's changing the state of the screen), access the web (changing socket state), or anything particularly useful.

But, this wouldn't be terribly useful since you wouldn't be able to get any output shown, so, it would be practically useless (people get around this by writing languages that printing every calculation to the screen as a part of the language implementation, but, this just creates more confusion).

However, that's not to say that they aren't used. An example of an ancient and hugely successful purely functional programming language is mathematics (there's no "print" command in math).

So, languages like Haskell use this special idea (borrowed from mathematics, see my abstract algebra posts) called a Monad to "mark" special functions that do modify state so that they can be separated cleanly from the rest of the program.

This nearly kills off the decoupling problem!

Because there are very few non-pure functions in a typical functional piece of code, everything else IS decoupled from the environment!

This is heaven when you need something debugged!

And, this cleans up what functions can do and can't do; every pure function MUST return a value, so, you can expect something in return, not just a void (because that would mean its changing state, is therefore not a pure function), so, if you try to write to the database from a pure function, you can't do that, which keeps your code squeaky clean.

Usually, functional languages come with malleable and nimble static typing, which means that you don't have to write out the types when you're declaring functions, but, they're there, so, when you try to pass a String to a function that wants an Integer, the compiler will tell you.

All of these restrictions make it much more difficult to compile, but, once it does, it sure as heck won't crash (it might not do what you want it to, but, that's what unit tests are for).

In functional languages, variables aren't really variables in that they can't be changed.

You're probably saying "WHAT?!! How am I supposed to implement a counter then?!", and that's solved entirely by the fact that you can use recursion.

So, that's why functional languages are awesome, and you should go check out Haskell!

Thursday, February 2, 2012

Why is server configuration so annoying?

A lot of programmers these days are switching over to dynamic languages like Python and Ruby, mostly for their ease of development.

Sure, it entails that one must make sure that you aren't passing the wrong types to wrong functions, or you won't find out until runtime, but, if you're willing to take that risk, dynamic typing can be amazing to speed up your work process.

And, just like languages, nearly every other aspect of computer science/engineering has evolved for the simpler, more elegant approach, and, in this shift, many have profited.

For example, Twilio's business model is based entirely on the fact that building phone systems in a modern programming language is just much too difficult for an average programmer, and, by simplifying the process by which developers can take advantage of phones, they've made quite a company.

And, people work everyday to simplify things that were previously immensely complicated, into black boxes in which the details are abstracted away.

As another example, take the humble Python list. It seems like a pretty simple idea; you can push and pop things, and you can delete things from the middle, you can sort and search them.

But, implementing a data structure like that in C is a formidable task, especially with all the optimization work that the Python folk have put into it.

And, things like this have lead to an explosion in Python's popularity.

But, I see one area where this type of stuff is immensely lacking, and that's server configuration.

First of all, why on Earth does Apache still use XML?! Is there literally no other format they can use?

Then, SSL shouldn't be so difficult to set up. It should take, at most, one shell script, or ANYTHING like that that'll generate a basic SSL virtualhost.

There need to be sensible defaults for log-files per vhost and so on.

There need to be macros that we can use to have chunks of configuration with variables plugged in.

And, in general, a lot of things need to be abstracted away into more usable and easy components.

That's all I've got to say, but, do tell me what you think in the comment section below.

Please follow if you liked this (top right).

Wednesday, February 1, 2012

The IRC mentality

In recent years, more hacker forums have sprouted up, moving away from the traditional "somebody asks, a bunch of people reply, and then there are replies to that", to some newer ideas.

For example, there's Stackoverflow.

StackOverflow is an excellent idea.

By introducing gamification into the idea of answering other people's questions for free, they've changed the entire scene of programmer forums. This causes people to think twice (well, for the most part) before posting useless questions, and the same for question answerers.

This works with amazing efficiency; ask a fairly easy to understand question on StackOverflow (SO), and you'll get an answer within a couple of minutes (unless its something incredibly unpopular like running ARM assembly on a MBR), and the answers are immediately filtered out because the upvote-downvote system.

As a added bonus, the interface is clean, succinct, and clear.

Frankly, its just all out awesome.

Coming from places like Daniweb (ads above the fold; aarrgh), its a welcome breath of fresh air.

But, that's not where I started.

I started on IRC.

IRC is facing popularity issues these days; with all the new solutions for collaborating (for example, Stripe uses Campfire by 37signals).

I'm not going to talk about how IRC is outdated, and, I'm not going to tell you that its the only "real" way to chat with people, because that's your decision.

But, I will tell you about the community around IRC.

When you're a beginning programmer, and you're starting to walk around the intricacies of a language, and you're faced with a problem you can't solve with, say, 10 minutes of work, what do you do?

You try to ask for some help.

Consider two scenarios, one on a website like StackOverflow, and one on IRC.

You post it on SO, and you get a response in five or six minutes that tells you how to solve the problem, and tells you what you're doing wrong, and you're able to get your code working, you hope that the person who gave the answer has a nice dinner, and move on.

Second, you go onto IRC, and you ask the question. Someone comes to your assistance. Then, the first thing they usually tell you is to check the "docs", and if you don't know what this is, they'll point you towards where it is. You search around for 10 minutes, find something you think might be right, and the person helps you get it working. But, he just gives you a few hints like "see, you're trying to pass in a string right there, but, $num is supposed to be a number", and at the end of an hour, you've finally solved your problem, and you feel amazing.

What would you pick?

I would pick the IRC experience.

It teaches you HOW to go about solving problems in general, not just the solution to a particular problem that you're facing right at that current moment (well, that too), and you get a much more deep understanding of the problem you're facing, and you'd be apply the same knowledge of "check docs, figure out function, pass in right types" to hundreds of other problems you'll face pretty soon.

Now, this is for a new programmer.

If you're an experience programmer, and have been trying to solve something for 4 hours without success and think you're missing one small quote somewhere, or you're getting a NULL pointer, you'd be better off on SO, but, for new programmers, IRC is important.

That's what I think, do tell me what you think in the comments section below!

Follow if you like it (top right).