MetaPundit

Wow, just wow

This article at the Weekly Standard is completely worth reading. For those who have followed the Charles Enderlin saga there won't be much new information but the attitudes of French journalists towards their sole non-negotiable professional obligation (to tell the truth) are still staggering. If you aren't aware of the controversy about the famous Al-Durra shooting at the beginning of the second Intifada, of course, this will make even more interesting reading...

Permalink: Jul 1st 2008, 01:58 PM PST

Smoke

I thought this pic would make a good companion to the meta-wife's firemap.  The area shown is north of where I live but the whole central valley has been a smoke-filled bowl for the last week and a half. It's actually clearing up pretty good in the last day or two but we skipped Mo-Band (a weekly concert in the park) last Thursday because the air was so bad...

Permalink: Jul 1st 2008, 01:57 PM PST

You're killing me, man

OK, you're killing me out there!

I'm turning into a crank, I know, but the quality of conversation around PHP development lately is really bugging me.

I ranted at length in the comments at the Sitepoint blog post about microbenchmarks and thought I got it out of my system.

No such luck - this morning in my feeds was a link to a blog post titled in_array is quite slow.  DO NOT WANT!

I'm going to explain one more time what's bugging me and then start ignoring it...

First take a look at the really badly titled in_array post. Basically the author is repeatedly importing a large dataset from XML to a database and wants to eliminate duplicate records. The script starts running slowly (as in hours) and he discovers that the duplicate elimination logic consists of building an array of all the existing unique ids in the database and another array of all the ids from the XML data. Then for each id in the list of new ids it searches the array of existing ids to see if the id already exists, inserting it only if it's new ...

With me so far? Notice how I cleverly didn't use the words "in_array". In fact, let's write some sample code to accomplish this algorithm without resorting to in_array.


foreach($new_ids as $new_id)
{
    $found = false;
    foreach($old_ids as $old_id)
    {
        if($old_id == $new_id)
        {  
            $found = true;
            break;
        }
    }
    if(!$found) insert_new_id($new_id);
}

So what's wrong with that code? Nothing, if my datasets stay small. The problem is that it's an O(n²) algorithm. This is Big O notation and completely worth reading about if you are or intend to be a programmer. Big O notation gives you a way to categorize the speed of algorithms. I've heard that O stands for "on the order of" but I don't see that on the wikipedia page. In any event, Big O analysis doesn't care about the actual speed of an algorithm, only about the way in which the number of operations varies as the dataset the algorithm is performed on varies. In this case it is pretty easy to see that if each of our lists has 10 items the inner loop will run up to 10x10 times (say if there were no matches between the two lists). If there are 1000 items in each list it will run 1000x1000 times. And to abstract this - if there are "n" items in each list the inner loop will run n² times - hence the Big O label. N squared is really bad because quadratic growth doesn't scale the way we intuitively want things to scale. Instinctively I always assume algorithms are linear. If processing 1000 items in a list takes 1 second, processing 2000 items ought to take 2 seconds, right? That's true for O(n) algorithms only! For an O(n²) algorithm, doubling the input roughly quadruples the work.
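If you want to see the n-squared growth rather than take my word for it, here is a rough sketch that counts comparisons in the worst case (no matches between the two lists). The helper name count_comparisons() and the test data are my own inventions for illustration, not anything from the original post.

```php
<?php
// Hypothetical helper: counts how many comparisons the nested-loop
// duplicate check performs for two given lists of ids.
function count_comparisons(array $old_ids, array $new_ids): int
{
    $comparisons = 0;
    foreach ($new_ids as $new_id) {
        foreach ($old_ids as $old_id) {
            $comparisons++;
            if ($old_id == $new_id) {
                break; // found a match, stop scanning this pass
            }
        }
    }
    return $comparisons;
}

// Worst case: the lists share no ids, so every scan runs to the end.
foreach ([10, 100, 1000] as $n) {
    $old_ids = range(1, $n);          // made-up existing ids
    $new_ids = range($n + 1, 2 * $n); // made-up incoming ids, all new
    echo $n, " items: ", count_comparisons($old_ids, $new_ids), " comparisons\n";
}
```

Ten times the data means a hundred times the comparisons, which is exactly the n-squared behavior described above.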

So back to the blog post - what's wrong with in_array() that makes it run so slowly? Nothing - in_array() is a search function - probably more sophisticated than my for loop code but essentially the same idea. Using in_array() in an inner loop gives you n-squared runtimes!
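For comparison, this is roughly what the in_array() flavor of the same loop might look like. I haven't seen the original author's code, so treat this as my guess at its shape; find_new_ids() is a made-up name, and it returns the ids that would be inserted instead of touching a database.

```php
<?php
// Sketch of the in_array() version of the duplicate check.
// find_new_ids() is a hypothetical name used only for this example.
function find_new_ids(array $old_ids, array $new_ids): array
{
    $to_insert = [];
    foreach ($new_ids as $new_id) {
        // in_array() searches $old_ids from the start on every call,
        // so this is still a loop inside a loop.
        if (!in_array($new_id, $old_ids)) {
            $to_insert[] = $new_id;
        }
    }
    return $to_insert;
}

print_r(find_new_ids([1, 2, 3], [2, 3, 4, 5]));
```

The explicit inner loop is gone from the source, but the search is still happening once per new id.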

Now using the $old_ids list as a hash-table instead (values in the keys) lets you convert this back to a linear runtime. Replacing the for loop (or in_array) with


    if(!isset($old_ids[$new_id])) insert_new_id($new_id);

makes the code run much faster. Down from hours to .8 seconds in this case. But notice that this has nothing to do with in_array - instead it has to do with choices of data-structures and algorithms. It doesn't help that PHP's hashtable and list type are the same structure (that's not per se bad, as long as you are aware that there are two different such data structures)! We now have an O(n) algorithm because the speed of a hashtable lookup is independent of the number of items it contains. The whole "inner loop" becomes a single constant time operation that doesn't vary with the length of the input data. And of course a better blog title would be "Picking the right data structure" or perhaps even "Duh! Searching is slower than lookup!"
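For completeness, here is one way the "values in the keys" table might be built from a plain list of ids. This is a sketch under my own assumptions: find_new_ids_by_key() is a made-up name, the ids are assumed to be integers or strings (array_flip() only accepts those as values), and it returns the new ids rather than inserting them.

```php
<?php
// Sketch of the hash-table version of the duplicate check.
// find_new_ids_by_key() is a hypothetical name used only for this example.
function find_new_ids_by_key(array $old_ids, array $new_ids): array
{
    // array_flip() swaps keys and values, so each old id becomes a key.
    // Assumes ids are integers or strings.
    $old_lookup = array_flip($old_ids);

    $to_insert = [];
    foreach ($new_ids as $new_id) {
        // A key lookup does not scan the array, unlike in_array().
        if (!isset($old_lookup[$new_id])) {
            $to_insert[] = $new_id;
        }
    }
    return $to_insert;
}

print_r(find_new_ids_by_key([1, 2, 3], [2, 3, 4, 5]));
```

Building the lookup table costs one pass over the old ids, and after that each check is a single key lookup, so the whole thing stays linear.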

And here's where my rant comes in. I'm not going to speculate about the causes (ease of use and ubiquitous deployment, in my book, plus the lingering awkwardness of the language that causes some of the solid developers to defect (eg: I still have Paul Bissex and Simon Willison filed under PHP in my RSS reader...)) but the quality of commentary in the PHP community is bothering me. Shallowness and bikeshedding abound. The sitepoint post I mentioned at the beginning was pointing to yet another PHP microbenchmark discussion - should you use single quotes or double quotes around strings? Is while faster than foreach? To reference or not to reference when passing data structures around... All discussed un-ironically as "PHP best practices".

Stop it already people!

Here's some semi-constructive advice. Don't ever write about syntax unless you use the term newbie in the title. (The "for loop" for newbies is just fine). For everybody else - syntax is not programming! Saying you are a programmer because you "know PHP" (ie understand the syntax of the language) is like me saying I'm a painter because I can name all the colors. Especially don't pair discussions of syntax and speed - in fact don't talk about speed at all! Unless of course you cite the rules for optimization:

  1. Don't!
  2. (for experts only) Don't yet

Or at least Knuth's aphorism ("premature optimization is the root of all evil"). In fact... I'm coining my own aphorism here (naming your statements makes them sound more official). Henceforth metapundit's law must be respected: don't optimize (or blog about optimization) unless you know who Knuth is. And if you've got some solid additions to my PHP feeds (is Harry Fuecks still writing?) I'd be glad to hear them.

Permalink: Jun 6th 2008, 01:13 PM PST

The History of Christianity

Thanks to nothing_to_say (and check out her new blog focusing on New Testament studies) I just finished reading Justo Gonzalez' The Story of Christianity. This is a two volume survey of Church history that covers the birth of the Church in Acts through to the 20th century. I heartily recommend the book; it is an easy read, rarely polemical or particularly academic and the author makes the personalities and controversies of the Church come alive.

There's no way to comprehensively review a book that covers so much ground so I'll just record a few of the impressions I formed as I read.

One of the fun things about reading history is when you slot pieces of information into what you already know. I had this happen a lot. For example: Everybody knows more or less the history of Henry VIII, his wives and divorce, the birth of the Church of England and his two daughters who became Queens: Mary and Elizabeth. I was aware Mary was Catholic and Elizabeth was Protestant (got to love her Motto: "I see, and say nothing"). I never quite put the obvious pieces together however: Why was Mary Catholic? Why was Elizabeth Protestant? Personal belief probably had nothing to do with it - Mary was the child of Henry's marriage to Catherine of Aragon. Henry tried to have his marriage to Catherine annulled and eventually split from the Pope over his (just) refusal. Elizabeth was the daughter of Henry's second wife Anne Boleyn. The Pope held that Elizabeth was illegitimate (and therefore a pretender to the throne) so of course she had to support the Church of England. Mary on the other hand was supported by the Catholic Church which rejected the annulment of her mother's marriage. Of course she would be strongly Catholic... An obvious thing but an aha moment for me none the less.

Another thing I enjoy about reading history is when it causes me to rethink today in light of the past. One of the issues that Church history raises for me is the place of tradition.  Chesterton describes a respect for tradition as democracy where the dead can vote! I'm coming from the radical protestant wing of Christianity (and if you'd read Gonzalez' book you'd know what that means) so I obviously have a different relationship to tradition than did Chesterton. It's worth thinking about though: Christians can be (I know I am sometimes) insular and complacent... Assuming that what I believe is obviously the truth and most Christians agree with me.

Tradition and history can give complacency a much needed smack on the head. I feel less certain of my theological positions when I discover that the vast majority of Christians throughout time would profoundly disagree with me. Take spiritual disciplines and exercises: how does it inform how I live my faith to discover the variety and consistency of spiritual discipline in the Church throughout history? Reading about early Church practices like weekly fast days makes me think about the value of spiritual disciplines. Similarly - finding out that the early Church typically made applicants for baptism go through two years of instruction should inform how I view baptism.

I also had the opposite reaction when I located some beliefs of the Church in their historical context. I identify with the Internet Monk (who has long been one of my favorite bloggers but recently has gotten even better: don't look now but I think he's trending anabaptist...) when he expresses affection for Catholicism because of its non-trendy I-am-what-I-am nature. American Evangelicalism is ridiculously trendy and there's a certain attraction to a Church who just says "This is who we are. Deal with it."

That said - it's funny to read about the (relatively) recent status of some Catholic dogma that most bothers me. Transubstantiation (the doctrine that the elements of communion physically change to become the actual body and blood of Christ)? That didn't become dogma till the fourth Lateran council in 1215! A recent innovation... Papal infallibility (the idea that the Pope when speaking Ex Cathedra is free from error)? Declared by Pope Pius IX in 1870! This declaration gave force to his earlier Ex Cathedra pronouncement on the Immaculate Conception of Mary (ie: Mary was born free from original sin). In 1950 Papal infallibility allows Pope Pius XII to dogmatically declare the Assumption of Mary (the doctrine that Mary was taken up into heaven at the end of her life)...

The point is not that the ideas are recent innovations - but their establishment as Dogma (things you must believe to be Catholic) is a recent innovation. On the scale of Church history, at least, the Catholic Church has also had its evolutions.

That said - it is good for me to read a non-partisan history of Christianity. My ignorance of Church History is at least partially a Protestant ignorance. Protestant tellings of Church history (at a popular level) have tended to caricature: there was the early Church which obviously started right but gradually went off track and by the time Constantine shows up was completely screwed up until Martin Luther started straightening things out... In between (313-1517) there were a few good guys (I think we like St. Francis) but mostly the Church was a mess.

There's even a little bit of truth to that caricature: it is disheartening to read about the Church involved in wars and political intrigues - ecclesiastical figures with no interest in Christianity, division and confusion (three Popes at the same time!) and so on (and don't get cocky Protestants! All I have to say is 30 Years War!). Despite all the failures and fiascos the Church didn't disappear and the history of the less interesting but more devout leaders of the Church is worth reading for the reminder that there have been faithful through all times and circumstances. It is one Church catholic - though I sometimes forget - and especially for those of us who don't really tie our Christianity to history it is a worthy thing to be reminded and aware of tradition. I'm not sure that the dead get to vote, but they should at least get to nudge us from time to time.

Permalink: Jun 2nd 2008, 11:58 AM PST

Social Engineering at Google

I just saw an awesome display of social engineering at Google in San Francisco (I'm waiting for the Google App Engine event to start). This might have been white hat - an authorized person just getting what they needed - but there's no way to tell.

A guy walked up to the receptionist desk, backpack in hand and asked "Hey are there any hotdesks open?"

The receptionist didn't seem to know what he meant so the "attacker" babbled on for a moment about meeting up with some people who might be moving to San Francisco google and who were "squatting" here. The receptionist offered to call somebody to check it out and the attacker quickly assured her that they were here unofficially so far and only X (didn't catch the name) knew they were here. "They said 4th floor though - are there any empty cubes back that way?" (pointing towards a secured entrance and walking towards it...)

"Yeah I think so..." responds the receptionist and buzzes him through.

As he's headed through he calls back "Hey is there a microkitchen down here?" And the receptionist gives him directions.

No ID. No name even given that I caught. And based on what knowledge did the attacker gain access to the googleplex? Just lingua franca - not all of it successful even! Say "hotdesk", "squatting", "microkitchen" and a mid-level Googler's name confidently enough and you either A) don't have to wait for your unofficial arrangement to be approved or B) score the google lifestyle (kitchen, cube, wifi) without actually working for them! Very nicely done...

Permalink: May 16th 2008, 10:35 AM PST
