FREE BURMA!

( ? , qUeStIoNMaRk )

Seeking for a sustainable amount of chaos. AKA an electronic stream of consciousness about software engineering, open source, life. By Marco Fabbri.

March 18, 2008

Martian Headsets

Joel wrote, as usual, a thoughtful and witty article on the web, standards, interoperability and the upcoming mother of all flame wars; this is a must-read for everyone concerned with web-related software development (web designers, web programmers, information architects, marketeers…) .

As usual, the idealists are 100% right in principle and, as usual, the pragmatists are right in practice. The flames will continue for years.

Joel goes into a lengthy explanation, driven by an extra-terrestrial catchy case study, of what are the possible “cardinalities” of “market standards” (One-to-One - all is fine and simple, One-to-Many - yet fine, Sequence-to-Many - a story of pain and backward compatibility, Many-to-Many - you know, PurePain ™), why a standard without a reference implementation it’s not that standard, and why in the long run being conservative in what you do, and being liberal in what you accept from the others potentially ends in deployment issues kicking your conservative yet liberal butt. In the meanwhile you get also acquainted with some real-world compatibility issues between rabbis from different ultra-orthodox communities:

If you’ve ever visited the ultra-orthodox Jewish communities of Jerusalem, all of whom agree in complete and utter adherence to every iota of Jewish law, you will discover that despite general agreement on what constitutes kosher food, that you will not find a rabbi from one ultra-orthodox community who is willing to eat at the home of a rabbi from a different ultra-orthodox community. And the web designers are discovering what the Jews of Mea Shearim have known for decades: just because you all agree to follow one book doesn’t ensure compatibility[…]

As a very brief personal memorandum: Real Standards must have Real Reference Implementations (because “reality siphons off excess complexity1) and although Postel’s Robustness Principle is (imho) still much valuable for the wide spread of the internet/web it has been able to sustain so far, it should be carefully balanced - “in medium stat virtus” - with having very, very strict standards and “components” positively obnoxious about pointing them all out to you; maybe we (as developers/engineers) should resort to some sort of “carrot and stick” principle.

NOTE 1: The full citation from David H. Gelernter’s Mirror Worlds (a wonderful and fascinating book narrating a vision of computing and information of extraordinary elegance - that is by other words a good combination of simplicity and power) is:

“Information structures are, potentially, the most complicated structures known to man. Precisely because software is so easy to build, complexity is a deadly software killer.
The same problem exists for hardware machines, but it lacks comparable significance. Physical reality is the overflow valve that siphons off excess complexity before the whole system blows.[…]”.

March 4, 2008

Highway to #Hell

Filed under: life, fun, geek, programming

Introduction: velocity is a beautiful templating language… (easily turnable into a not-so-general-purpose programming language); #macro is the syntax to call the macro named macro.

via My Mood Swings - Highway to #hell

  • Tizio #1: this xwiki thing has all the potential to quickly turn into…
  • Tizio #2: Soylent Green?
  • Tizio #1: a macro #hell
  • Tizio #2: Highway to #hell… http://youtube.com/watch?v=erJc4dzZ3IA

February 3, 2008

Real Programmers

Filed under: fun, geek, programming

I do love this web-comic (xkcd), I really really do.

Real Programmers

Real programmers set the universal constants at the start such that the universe evolves to contain the disk with the data they want.

I am fairly confident it also exists the emacs command “C-x M-C M-answer-to-life-the-universe-and-everything”… you know the result.

By the way, spot the quote in the post ;-) .

December 3, 2007

Password Strength Meter Kata

DISCLAIMER: quite long, programming related post.

A few days ago Marco Ramilli reviewed a useful password strength checker written in JavaScript. The JavaScript "score" function is (here without any DOM manipulation-"side effect"):

function passwordStrength(password)
{
   
    var desc = new Array();
        desc[0] = "Very Weak";
        desc[1] = "Weak";
        desc[2] = "Better";
        desc[3] = "Medium";
        desc[4] = "Strong";
        desc[5] = "Strongest";

        var score   = 0;

        //if password bigger than 6 give 1 point
        if (password.length > 6) score++;

        //if password has both lower and uppercase characters give 1 point     
        if ( ( password.match(/[a-z]/) ) && ( password.match(/[A-Z]/) ) ) score++;

        //if password has at least one number give 1 point
        if (password.match(/\d+/)) score++;

        //if password has at least one special caracther give 1 point
        if ( password.match(/.[!,@,#,$,%,^,&,*,?,_,~,-,(,)]/) ) score++;

        //if password bigger than 12 give another 1 point
        if (password.length > 12) score++;

    return {"score": score, "desc": desc[score] } ;

}

This function can be written in a fairly more compact manner taking avdantage of some nice JavaScript programming fatures. (Remembering that JavaScript is the World’s Most Misunderstood Programming Language)

First you can declare the array with a single assignament:

var desc = ["Very Weak", "Weak", "Better", "Medium", "Strong", "Strongest"]

Second you can avoid repeating the "if (condition) score++" (if you are lazy like me you should hate writing the same code over and over), have the tests stored in an array and executed via "eval" in a "for (element in collection)" statement, this way you keep the tests separated from the score evaluation; this is no big deal in a so simple and short program, but it’s a good practice (and attitude) to have rules and rules engine logic separeted.

var tests = [
    "(password.length > 6)",
    "( ( password.match(/[a-z]/) ) && ( password.match(/[A-Z]/) ) )",
    "(password.match(/\d+/))",
    "password.match(/.[!,@,#,$,%,^,&,*,?,_,~,-,(,)]/) )",
    "(password.length > 12)",
]

for (test in tests){
    if eval(test) {
        score++;
        }
}

Now the password strength checker function can be rewritten as:

function passwordStrength(password)
{
   
    var desc = ["Very Weak", "Weak", "Better", "Medium", "Strong", "Strongest"];
   
    var tests = [   
        "(password.length > 6)",
        "( ( password.match(/[a-z]/) ) && ( password.match(/[A-Z]/) ) )",
        "(password.match(/\d+/))",
        "password.match(/.[!,@,#,$,%,^,&,*,?,_,~,-,(,)]/) )",
        "(password.length > 12)",
    ];
   
    var score = 0;
    //every passing test gets the score incremented of 1 point
    for (test in tests){
        if eval(test) {
            score++;
            }
        };
   
    return {"score": score,"desc": desc[score]};
   
}

This is a more compact formulation than the original one, which means less bytes down the tube which turns to less bandwidth consumption, and it is also more manageable (the initial disclaimer is always valid: the example is quite simple so you get no big savings, it’s more about the attitude).

A note on eval: (ab)use of eval is somewhat considered harmful and watched with despise in some circles, so you can easily substitute the expression eval(test), relying on a more explicit use of the "metaprogramming" facilities of the language, with:

(new Function("return " + test))()

Here you build an anonoymous Function whose body is made out of the return statement (otherwise, as no return statement is specified in the test body, the return type of the function would be undefined, kind of void for the Java-inclined) and the test body; the function is then called () and the test gets evaluated.

By having "rules engine" and rules separated you can further improve the code robustness: an implicit assumption is made on the number of tests and descriptions  you provide, i.e. you must provide N tests and N+1 descriptions. How can this become more flexible? You can broad the initial assumption by stating you can have an arbitrary number of tests (each one with a separate score) and an indipendent number of level descriptions. For example’s sake you have 4 descriptions (you throw away the "Better" and the "Strongest" ones), 6 tests and the last one gets you 2 points if passed.

First you have a simple design decision to make: how you relate the strength description to the score?

Opting for a simple design you can fully specify the mappings (defining the minimum score for each description), e.g. using a dictionary / hashmap like {0: "Very Weak" , 2: "Weak", 3: "Medium", 5: "Strong"} for the descriptions. Hence you have:

var desc = {0: "Very Weak" , 2: "Weak", 3: "Medium", 5: "Strong"}

The score for each test can be specified taking advantage of an hashmap in wich the body of the test is the key and the score the value.

var tests = {
    "(password.length > 6)": 1,
    "( ( password.match(/[a-z]/) ) && ( password.match(/[A-Z]/) ) )": 1,
    "(password.match(/\d+/))": 1,
    "password.match(/.[!,@,#,$,%,^,&,*,?,_,~,-,(,)]/) )": 1,
    "(password.length > 12)": 2,
    }
   
To take into account the custom score for each test you also have to make a fairly trivial change to the "rules engine", instead of score++ you have:

score += tests[test];

Now that the score has been calculated you have to retrieve the description; descriptions are "indexed" in the hashmap by their minimum score level, so you have to retrieve the entry with the maximum possible key value less or equal than your actual score. This problem can  be solved by a helper function that starts with retrieving the entry for the actual score and in case of failure (i.e. desc[score] evaluates to undefined) it calls recursively itself with the score decremented by one unit, and so on until it eventually reaches a valid key. A note on  the ? operator and associative arrays in JavaScript, the test fails for the value undefined, to which the expression desc[score] evaluates if there is no entry for the key score.

function desc_f(score)
{
    return desc[score] ? desc[score] : desc_f(score-1);
}

Now you can substitute desc[score] with desc_f(score) in the return statement of passwordStrength and your’re done with satisfying the evolved requirements.

function passwordStrength(password)
{
    var desc = {0: "Very Weak" , 2: "Weak", 3: "Medium", 5: "Strong"};
       
    var tests = {
        "(password.length > 6)": 1,
        "( ( password.match(/[a-z]/) ) && ( password.match(/[A-Z]/) ) )": 1,
        "(password.match(/\d+/))": 1,
        "password.match(/.[!,@,#,$,%,^,&,*,?,_,~,-,(,)]/) )": 1,
        "(password.length > 12)": 2,
        };
   
    var score = 0;
   
    for (test in tests){
        if eval(test) {
            score += tests[test];
            }
        };
       
    function desc_f(score)
    {
        return desc[score] ? desc[score] : desc_f(score-1);
    };
   
    return {"score": score,"desc": desc_f(score)};
   
}

If you  didn’t get overly bored you should have appreciated the power and simplicity offered by a "responsible use" of JavaScript. If you really didn’t get overly bored and you feel like expereminting something on your own, you can implement your own strategy for description-score mapping; there is plenty of room to introduce arbitrarly complex strategies.

These ramblings have been inspired by some (off|on)line chat with Giulio Piancastelli (who suggested me the Function() alternative to the "evil eval"), his Kata Four in JavaScript post and by these nice presentations on JavaScript Metaprogramming: @media Ajax by Dan Webb and  @Columbus Ruby Brigade by Adam McCrea.

November 19, 2007

Mixed Feelings

Filed under: life, java, geek, programming

Somewhere in a Java source file

// content does not include the quotes or curly braces around the string!
private String content;
/**
* content includes the quotes or curly braces around the string!
*
* @param content
*/
BibtexString(BibtexFile file, String content) {
super(file);
this.content = content;
}

This code features a self-contradicting couple of comments: “does not include” / “includes” referred to the same object, what can be called a Schroendiger’s String; i.e. until you open the box… hmmm… take a look at the actual code, content is a marvelous data structure able to and not to include the quotes or curly braces around the string at the same time. A look at the actual code, a few lines below, leads to the death of this quantum string, revealing it does not contain the quotes or curly braces around the string.

// is this really a number?
try {
Integer.parseInt(content);
writer.print(content);
} catch (NumberFormatException nfe) {
writer.print('{');
writer.print(content);
writer.print('}');

August 22, 2007

Erlang in the evening

Let’s unveil the (not so) misterious bug from yesterday Erlang post…
The auto-increment counter controller receives two kind of messages:

a start message which starts the timered auto-increment;
a stop message which stops the auto-increment.

An implicit assumption on the sequence of messages has been made; no two following messages are equal, i.e. assuming the auto_counter process starting as inactive every message history is produced by the following grammar:

S, scope
S -> start A
A -> stop, S | stop

What happens if the auto_counter process starts as inactive and the message history is of the kind start, stop, stop, start?
When the first start message is received the process starts incrementing (the message match the pattern in the receive expression in the inactive auto_counter function clause (auto_counter(inactive, CounterP, DeltaT)). The process stops incrementing when the first stop message (the message matches the stop pattern in the receive expression and the inactive auto_counter function is called) . Hence, when the second stop message arrives, the function clause being evaluated is auto_counter(inactive, CounterP, DeltaT) which holds indefinitely until a start message is received; bottom-line the second stop message is left in the process’ message box. When the second start message it matches the start pattern and the active auto_counter function gets called (remember: the second stop message is waiting in the message box for a “disaster” to happen :-) ). At this point as the active auto_counter function is being evaluated (which comprises a receive start expression) the process receives the second stop message which causes the auto-increment to stop. Is this behaviour supposed to be correct?
The last post didn’t go into a deep analysis of the problem and its requirements, but now has the problem is in the spotlight time has come to take a decision, the actual implemented behaviour is… WRONG! (an assumption holding true for fairly high percent of the programs being developed). I (the self-appointed project leader) expect the auto-increment process to filter out every stop message received when stopped and every start message when started. This requirements sounds a lot like having a filter on the messages sent to the auto_counter process, so let’s define a process with this capability:

Two “states”, active/inactive which translates in two functions clauses: increment_filter(active,...), increment_filter(inactive,...);
increment_filter(inactive,...) discards stop messages (which means after receiving one calls increment_filter(inactive,...)) and when a start message is received a start message is sent to the auto_counter process and increment_filter(active,...) gets called;
increment_filter(active,...) discards start messages (which means after receiving one calls increment_filter(active,...)) and when a stop message is received a stop message is sent to the auto_counter process and increment_filter(inactive,...) gets called;
knowledge of the auto_counter process identefier.

Here it follows the straightforward implementation:

autoinc_filter(active, Auto_counterP) ->
    receive start ->
            autoinc_filter(active, Auto_counterP);
            stop ->
            Auto_counterP ! stop,
            autoinc_filter(inactive, Auto_counterP)
    end;
autoinc_filter(inactive, Auto_counterP) ->
    receive stop ->
            autoinc_filter(inactive, Auto_counterP);
            start ->
            Auto_counterP ! start,
            autoinc_filter(active, Auto_counterP)
    end.

The impact of the change on the previously developed system (even in the case that it was running) is minimal: only a new process is required to be spwaned; this definetely makes the case for true incremental development and it resembles a kind of living system (interesting reading: The Pinocchio Problem), capable of gracefully growing and evolving.

Another approach to the fixing of the spotted bug would have been discarding the “un-interesting” messages right into the auto_counter function:

auto_counter(active, CounterP, DeltaT) ->
        Activity = receive
                stop -> inactive;
                start -> active
        after DeltaT ->
                CounterP ! inc,
                active
        end,
        auto_counter(Activity, CounterP, DeltaT);
auto_counter(inactive, CounterP, DeltaT) ->
        receive
            start -> auto_counter(active, CounterP, DeltaT);
            stop  -> auto_counter(inactive, CounterP, DeltaT)
        end

This solution seems at first sight correct: althougth the change at the inactiveclause is rather simple, the change at the active clause is trickier enough you end up with a failing system. Every time a start message arrives the timeout gets reset, so the increment doesn’t happen every Delta and the timing is rather faulty; if the arrival rate of the “un-interesting” messages is high enough, let’s say greater than 1/(DeltaT-1) , the counter gets never incremented no matter how long the process keeps on evaluating the auto_counter(active,...) function. This ending osservation is purposed to point out that if you try to twist an inherently counterrent problem in a sequential way the odds you end up screwed are quite high. Erlang encourages to keep concurrent things concurrent, and this is, in my humble opinion, an important point (in software systems engineering) that Erlang gets it right. The world is parallel, better to keep it in mind ;-) .

August 21, 2007

Htmlize your * code buffer

Filed under: weblog, programming

I attempted to use the htmlize package for Emacs as described in Htmlize your Erlang code buffer, but I managed to get the code syntax hilighted only in the preview window, once the post gets published the syntax hilighting is no more effective. Any suggestions?

Erlang in the morning

Due to my interest in concurrent and parallel programming I recently started investigating the Erlang programming language, reading and practicing the book by Joe Armstrong "Programming Erlang, Software for a Concurrent World" (the book is awesome, easy and involving writing style, or in editor’s words: "Joe I want you to imagine you’re sitting at a terminal with your friend beside you, explaining how things work" - more on the book). Erlang is a concurrency oriented programming language (based on a mix between actors and functional paradigms) meaning you get first class (and really well engineered) abstractions for process and process comunication, pattern matching, higher order functions, and absolute lack of shared variables, processess communicate by (asynchronous) message passing. Coming from an object oriented (imperative) programming language (e.g. Java, C#) Erlang just feels kind of esoteric and alien, but having a sound Prolog background you can advance in the process of learning quite easily (time to express some deep appreciation for prof. Viroli’s Prolog lessons from "Computational Langauges and Models" graduate course). To have a quick grasp of the language and of some features I value most, let’s take a look at the code that realizes a counter in Erlang (who doesn’t love counters? :-) ). Personal note: I found the "Hello Counter" a really nice example to start exploring the computational landscape of a programming paradigm; it’s basic and quite universal (it resembles a Minsky Machine). This is a function modelling a cyclic counter:

counter(Value, EndVal) -> 
    NewValue = receive
        inc ->
                (Value + 1) rem EndVal;
        {get_value, PID} -> 
                PID ! {value, Value},
                Value
        end,
    counter(NewValue, EndVal).

The function has two arguments:

  • Value represents the state of the counter and it is passed through the (tail) recursive function call;
  • EndVal is the limit for the cyclic counter;
and receives two kind of messages, through the use of the communication primitive receive (which is blocking, i.e. it stops the execution flow until a message matching one of the specified patterns is received):
  • an inc atom which causes the increment of the counter value argument of the recursive function call. Here you have to keep in mind that receive evaluates to the value of the expression following the -> of the pattern matching the message. The expression following the inc pattern is the expression that represent the next value of a cyclic counter: value(n+1) = (value(n) + 1) % limit rem is the infix operator for the modulo operation. Hence, in the case an inc message is received, the variable NewValue is unified with the value of the previous expression, i.e. the counter increments;
  • a {get_value, PID} tuple where get_value is an atom and PID is a variable unifying with the second field of the tuple, the identifier of the process requesting the current counter value. Here the expression following the -> is made of two expressions: in the first one the current counter value is sent (through the ! operator) to the requesting process in a tuple of the kind {value, Value}, the second simply evaluates to the current counter value, which NewValue is unified to (otherwise NewValue would have been unified with {value, Value}, being that the last expression). A note on !: contrary to receive it is not blocking, hence if you want to create a rendezvous you have to send a request message and then wait for an answer message.
The last expression is a tail recursive function call, whith NewValue as value of the counter. Now, after saving the code in the counter module, a counter process can be spawned using the the counter function as the process body (the variable CounterP is unified to the process identifier of the spawned process):

CounterP = spawn(counter, counter, [0,42])

incremented by sending a message:

CounterP ! inc

and questioned on its current value (the self() function evaluates to the process identifier of the calling process):

CounterValue = CounterP ! {get_value, self()}

The cyclic counter here defined can be quite puzzling at first (for those who come from an OO background) but has some interesting properties about concurrency and distribution due to the Erlang programming model/platform, hereafter one to cath them all: this program is yet ready to be run as part of a concurrent/distributed system. As a proof of concept to this bold statement let’s implement an auto-increment controller (optionally remote) for the cyclic counter (auto-increment controller is a pompous definition for introducing a start/stop auto-increment cycle). The controller can be thought as another process incrementing the counter on a given time interval, switched on/off by sending it start/stop messages. Through the use of pattern matching in function definition different behaviour can be distinguished, id est the function auto_counter can have multiple clauses for different patterns of its arguments, as follows:

auto_counter(active,…) -> incrementig the counter every delta time until a stop message is received auto_counter(inactive,…) -> doing nothing until a start message is received
Here it follows the actual code for the (remote) auto-increment counter controller (the do X until a Y message is received is modelled through the use of a complete receive clause which comprises after T -> Expression):

auto_counter(active, CounterP, DeltaT) ->
        Activity = receive
                stop -> inactive
        after DeltaT ->
                CounterP ! inc,
                active
        end,
        auto_counter(Activity, CounterP, DeltaT);
auto_counter(inactive, CounterP, DeltaT) ->
        receive
                start -> auto_counter(active, CounterP, DeltaT)
        end.

The system just developed is "distribution ready": the counter process and the auto-increment process can be in the same Erlang runtime, in different runtimes on the same machine, in different machines, no change in their definition is needed to go through these different scenarios. This does not happen because the distribution/concurrency of the system is hide by some magic trickery trying to over-smart the programmer in believing remote entities could be considerated as near as local (the Fallacies of Distributed Computing), but it does happen because Erlang builds on the assumption that (from "Programming Erlang - Sofware for a Concurrent World":

The world is parallel If we want to write programs that behave as other objects behave in the real world, then these programs will have a concurrent structure.
This is just a small taste of Erlang and there is much more in the reliance/healing/scalability capabilities (granted by OTP), however it should be clear that: high-level abstractions are offered (and needed) to capture in a syntethic and powerful manner problems: communication primitives, pattern matching, first-order functions, list comprehensions; (very cheap and lightweight) Processes are first-class citizens, and systems are "naturally" modelled as concurrent; avoiding side effects (as long as it is possible… more on this in next posts) leads easens scalabilty and load capability. A last note on the presented program, the auto_counter function does some over-restrictive assumption on the sequence of messages it receives (exercise left to the reader :-) : try to figure out what the problem is) tomorrow a solution will be proposed. Ending a special thank to prof. Alessandro Ricci and Giulio Piancastelli for the tolerance shown towards my tendency to "express" (face-to-face/online) every random thougth about concurrency happening to strike my brain. (more…)

Get free blog up and running in minutes with Blogsome | Theme designs available here