FREE BURMA!

( ? , qUeStIoNMaRk )

Seeking for a sustainable amount of chaos. AKA an electronic stream of consciousness about software engineering, open source, life. By Marco Fabbri.

March 18, 2008

Martian Headsets

Joel wrote, as usual, a thoughtful and witty article on the web, standards, interoperability and the upcoming mother of all flame wars; this is a must-read for everyone concerned with web-related software development (web designers, web programmers, information architects, marketeers…) .

As usual, the idealists are 100% right in principle and, as usual, the pragmatists are right in practice. The flames will continue for years.

Joel goes into a lengthy explanation, driven by an extra-terrestrial catchy case study, of what are the possible “cardinalities” of “market standards” (One-to-One - all is fine and simple, One-to-Many - yet fine, Sequence-to-Many - a story of pain and backward compatibility, Many-to-Many - you know, PurePain ™), why a standard without a reference implementation it’s not that standard, and why in the long run being conservative in what you do, and being liberal in what you accept from the others potentially ends in deployment issues kicking your conservative yet liberal butt. In the meanwhile you get also acquainted with some real-world compatibility issues between rabbis from different ultra-orthodox communities:

If you’ve ever visited the ultra-orthodox Jewish communities of Jerusalem, all of whom agree in complete and utter adherence to every iota of Jewish law, you will discover that despite general agreement on what constitutes kosher food, that you will not find a rabbi from one ultra-orthodox community who is willing to eat at the home of a rabbi from a different ultra-orthodox community. And the web designers are discovering what the Jews of Mea Shearim have known for decades: just because you all agree to follow one book doesn’t ensure compatibility[…]

As a very brief personal memorandum: Real Standards must have Real Reference Implementations (because “reality siphons off excess complexity1) and although Postel’s Robustness Principle is (imho) still much valuable for the wide spread of the internet/web it has been able to sustain so far, it should be carefully balanced - “in medium stat virtus” - with having very, very strict standards and “components” positively obnoxious about pointing them all out to you; maybe we (as developers/engineers) should resort to some sort of “carrot and stick” principle.

NOTE 1: The full citation from David H. Gelernter’s Mirror Worlds (a wonderful and fascinating book narrating a vision of computing and information of extraordinary elegance - that is by other words a good combination of simplicity and power) is:

“Information structures are, potentially, the most complicated structures known to man. Precisely because software is so easy to build, complexity is a deadly software killer.
The same problem exists for hardware machines, but it lacks comparable significance. Physical reality is the overflow valve that siphons off excess complexity before the whole system blows.[…]”.

January 16, 2008

ACM Classic Books Series (free PDFs)

Via Lambda the Ultimate: ACM Classic Books Series. The ACM posted PDF versions of some books in its Classic Books Series, which are available to anyone who creates a free ACM Web Account.

Among the currently available books, LtU readers are likely to be particularly interested in Hoare and Jones’s Essays in computing science, Adele Goldberg and David Robson’s Smalltalk-80: the language and its implementation, and Dahl, Dijkstra, and Hoare’s Structured programming.

Long time readers will also know that I highly recommend Papert’s Mindstorms: children, computers, and powerful ideas to anyone interested with the effect computers might have on education. Papert’s Logo remains to this day the best children oriented programming language, but even if you disagree with me about this, his book is a must read.

Among the soon to be released there are Dijkstra’s “Selected writings on computing: a personal perspective” and Yourdon’s “Classics in software engineering”; tasteful food for your (computer scientist/software engineer) brain.

August 22, 2007

Erlang in the evening

Let’s unveil the (not so) misterious bug from yesterday Erlang post…
The auto-increment counter controller receives two kind of messages:

a start message which starts the timered auto-increment;
a stop message which stops the auto-increment.

An implicit assumption on the sequence of messages has been made; no two following messages are equal, i.e. assuming the auto_counter process starting as inactive every message history is produced by the following grammar:

S, scope
S -> start A
A -> stop, S | stop

What happens if the auto_counter process starts as inactive and the message history is of the kind start, stop, stop, start?
When the first start message is received the process starts incrementing (the message match the pattern in the receive expression in the inactive auto_counter function clause (auto_counter(inactive, CounterP, DeltaT)). The process stops incrementing when the first stop message (the message matches the stop pattern in the receive expression and the inactive auto_counter function is called) . Hence, when the second stop message arrives, the function clause being evaluated is auto_counter(inactive, CounterP, DeltaT) which holds indefinitely until a start message is received; bottom-line the second stop message is left in the process’ message box. When the second start message it matches the start pattern and the active auto_counter function gets called (remember: the second stop message is waiting in the message box for a “disaster” to happen :-) ). At this point as the active auto_counter function is being evaluated (which comprises a receive start expression) the process receives the second stop message which causes the auto-increment to stop. Is this behaviour supposed to be correct?
The last post didn’t go into a deep analysis of the problem and its requirements, but now has the problem is in the spotlight time has come to take a decision, the actual implemented behaviour is… WRONG! (an assumption holding true for fairly high percent of the programs being developed). I (the self-appointed project leader) expect the auto-increment process to filter out every stop message received when stopped and every start message when started. This requirements sounds a lot like having a filter on the messages sent to the auto_counter process, so let’s define a process with this capability:

Two “states”, active/inactive which translates in two functions clauses: increment_filter(active,...), increment_filter(inactive,...);
increment_filter(inactive,...) discards stop messages (which means after receiving one calls increment_filter(inactive,...)) and when a start message is received a start message is sent to the auto_counter process and increment_filter(active,...) gets called;
increment_filter(active,...) discards start messages (which means after receiving one calls increment_filter(active,...)) and when a stop message is received a stop message is sent to the auto_counter process and increment_filter(inactive,...) gets called;
knowledge of the auto_counter process identefier.

Here it follows the straightforward implementation:

autoinc_filter(active, Auto_counterP) ->
    receive start ->
            autoinc_filter(active, Auto_counterP);
            stop ->
            Auto_counterP ! stop,
            autoinc_filter(inactive, Auto_counterP)
    end;
autoinc_filter(inactive, Auto_counterP) ->
    receive stop ->
            autoinc_filter(inactive, Auto_counterP);
            start ->
            Auto_counterP ! start,
            autoinc_filter(active, Auto_counterP)
    end.

The impact of the change on the previously developed system (even in the case that it was running) is minimal: only a new process is required to be spwaned; this definetely makes the case for true incremental development and it resembles a kind of living system (interesting reading: The Pinocchio Problem), capable of gracefully growing and evolving.

Another approach to the fixing of the spotted bug would have been discarding the “un-interesting” messages right into the auto_counter function:

auto_counter(active, CounterP, DeltaT) ->
        Activity = receive
                stop -> inactive;
                start -> active
        after DeltaT ->
                CounterP ! inc,
                active
        end,
        auto_counter(Activity, CounterP, DeltaT);
auto_counter(inactive, CounterP, DeltaT) ->
        receive
            start -> auto_counter(active, CounterP, DeltaT);
            stop  -> auto_counter(inactive, CounterP, DeltaT)
        end

This solution seems at first sight correct: althougth the change at the inactiveclause is rather simple, the change at the active clause is trickier enough you end up with a failing system. Every time a start message arrives the timeout gets reset, so the increment doesn’t happen every Delta and the timing is rather faulty; if the arrival rate of the “un-interesting” messages is high enough, let’s say greater than 1/(DeltaT-1) , the counter gets never incremented no matter how long the process keeps on evaluating the auto_counter(active,...) function. This ending osservation is purposed to point out that if you try to twist an inherently counterrent problem in a sequential way the odds you end up screwed are quite high. Erlang encourages to keep concurrent things concurrent, and this is, in my humble opinion, an important point (in software systems engineering) that Erlang gets it right. The world is parallel, better to keep it in mind ;-) .

August 21, 2007

Erlang in the morning

Due to my interest in concurrent and parallel programming I recently started investigating the Erlang programming language, reading and practicing the book by Joe Armstrong "Programming Erlang, Software for a Concurrent World" (the book is awesome, easy and involving writing style, or in editor’s words: "Joe I want you to imagine you’re sitting at a terminal with your friend beside you, explaining how things work" - more on the book). Erlang is a concurrency oriented programming language (based on a mix between actors and functional paradigms) meaning you get first class (and really well engineered) abstractions for process and process comunication, pattern matching, higher order functions, and absolute lack of shared variables, processess communicate by (asynchronous) message passing. Coming from an object oriented (imperative) programming language (e.g. Java, C#) Erlang just feels kind of esoteric and alien, but having a sound Prolog background you can advance in the process of learning quite easily (time to express some deep appreciation for prof. Viroli’s Prolog lessons from "Computational Langauges and Models" graduate course). To have a quick grasp of the language and of some features I value most, let’s take a look at the code that realizes a counter in Erlang (who doesn’t love counters? :-) ). Personal note: I found the "Hello Counter" a really nice example to start exploring the computational landscape of a programming paradigm; it’s basic and quite universal (it resembles a Minsky Machine). This is a function modelling a cyclic counter:

counter(Value, EndVal) -> 
    NewValue = receive
        inc ->
                (Value + 1) rem EndVal;
        {get_value, PID} -> 
                PID ! {value, Value},
                Value
        end,
    counter(NewValue, EndVal).

The function has two arguments:

  • Value represents the state of the counter and it is passed through the (tail) recursive function call;
  • EndVal is the limit for the cyclic counter;
and receives two kind of messages, through the use of the communication primitive receive (which is blocking, i.e. it stops the execution flow until a message matching one of the specified patterns is received):
  • an inc atom which causes the increment of the counter value argument of the recursive function call. Here you have to keep in mind that receive evaluates to the value of the expression following the -> of the pattern matching the message. The expression following the inc pattern is the expression that represent the next value of a cyclic counter: value(n+1) = (value(n) + 1) % limit rem is the infix operator for the modulo operation. Hence, in the case an inc message is received, the variable NewValue is unified with the value of the previous expression, i.e. the counter increments;
  • a {get_value, PID} tuple where get_value is an atom and PID is a variable unifying with the second field of the tuple, the identifier of the process requesting the current counter value. Here the expression following the -> is made of two expressions: in the first one the current counter value is sent (through the ! operator) to the requesting process in a tuple of the kind {value, Value}, the second simply evaluates to the current counter value, which NewValue is unified to (otherwise NewValue would have been unified with {value, Value}, being that the last expression). A note on !: contrary to receive it is not blocking, hence if you want to create a rendezvous you have to send a request message and then wait for an answer message.
The last expression is a tail recursive function call, whith NewValue as value of the counter. Now, after saving the code in the counter module, a counter process can be spawned using the the counter function as the process body (the variable CounterP is unified to the process identifier of the spawned process):

CounterP = spawn(counter, counter, [0,42])

incremented by sending a message:

CounterP ! inc

and questioned on its current value (the self() function evaluates to the process identifier of the calling process):

CounterValue = CounterP ! {get_value, self()}

The cyclic counter here defined can be quite puzzling at first (for those who come from an OO background) but has some interesting properties about concurrency and distribution due to the Erlang programming model/platform, hereafter one to cath them all: this program is yet ready to be run as part of a concurrent/distributed system. As a proof of concept to this bold statement let’s implement an auto-increment controller (optionally remote) for the cyclic counter (auto-increment controller is a pompous definition for introducing a start/stop auto-increment cycle). The controller can be thought as another process incrementing the counter on a given time interval, switched on/off by sending it start/stop messages. Through the use of pattern matching in function definition different behaviour can be distinguished, id est the function auto_counter can have multiple clauses for different patterns of its arguments, as follows:

auto_counter(active,…) -> incrementig the counter every delta time until a stop message is received auto_counter(inactive,…) -> doing nothing until a start message is received
Here it follows the actual code for the (remote) auto-increment counter controller (the do X until a Y message is received is modelled through the use of a complete receive clause which comprises after T -> Expression):

auto_counter(active, CounterP, DeltaT) ->
        Activity = receive
                stop -> inactive
        after DeltaT ->
                CounterP ! inc,
                active
        end,
        auto_counter(Activity, CounterP, DeltaT);
auto_counter(inactive, CounterP, DeltaT) ->
        receive
                start -> auto_counter(active, CounterP, DeltaT)
        end.

The system just developed is "distribution ready": the counter process and the auto-increment process can be in the same Erlang runtime, in different runtimes on the same machine, in different machines, no change in their definition is needed to go through these different scenarios. This does not happen because the distribution/concurrency of the system is hide by some magic trickery trying to over-smart the programmer in believing remote entities could be considerated as near as local (the Fallacies of Distributed Computing), but it does happen because Erlang builds on the assumption that (from "Programming Erlang - Sofware for a Concurrent World":

The world is parallel If we want to write programs that behave as other objects behave in the real world, then these programs will have a concurrent structure.
This is just a small taste of Erlang and there is much more in the reliance/healing/scalability capabilities (granted by OTP), however it should be clear that: high-level abstractions are offered (and needed) to capture in a syntethic and powerful manner problems: communication primitives, pattern matching, first-order functions, list comprehensions; (very cheap and lightweight) Processes are first-class citizens, and systems are "naturally" modelled as concurrent; avoiding side effects (as long as it is possible… more on this in next posts) leads easens scalabilty and load capability. A last note on the presented program, the auto_counter function does some over-restrictive assumption on the sequence of messages it receives (exercise left to the reader :-) : try to figure out what the problem is) tomorrow a solution will be proposed. Ending a special thank to prof. Alessandro Ricci and Giulio Piancastelli for the tolerance shown towards my tendency to "express" (face-to-face/online) every random thougth about concurrency happening to strike my brain. (more…)

August 20, 2007

Motivational Posters

Filed under: engineering, software, fun

I found on flickr three great “motivational” posters by Bill De Hora. This one should be in every introductory book/course on software engineering/programming:

I Pity the fool who does'nt write test cases!!

This should be in front of the entrance of the software lab/office:

I pity the fool who breaks the BUILD!

And the last one is definetely my favourite:

I love when a build comes together

February 3, 2006

Requirements Engineering (for Dilbert)

Getting the purpose of a software system right is an evergreen theme in software engineering, and one of the fuzziest issues in this beautiful discipline. My friend Marco some times ago posted a great toon on the subject, with a brief and sharp commentary.
On sunday Dilbert posted an awesome strip on the tensions between engineers, customers and requirements:

  • engineers want customers to express requirements clearly, possibly in a non-ambigous language (a set of differential equations wuold be perfect ;) ).
  • customer wants engineers to guess what they are thinking their problem is.

This would seem a self-referencing non-terminating problem (sounds huge), but Dilbert has the perfect solution:

  • As the designed software can do whatever the engineer designs it to do, the engineer should design a software to tell himself customer requirement.

Striking clear, not?

Get free blog up and running in minutes with Blogsome | Theme designs available here