Rise and Fall of the Black Box Developer


Let me start by stating that this is not a rant; it’s a look at my personal experience with interviewing candidates for technical Java positions. I first started to think about the following concepts when I read James Donelan’s post “Can Programmers Program?”. In that post he elaborates on some statistics from coding tests performed by developers applying for jobs. Those results indicated that many experienced developers were not able to solve simple problems. At first I thought that those statistics were blowing things way out of proportion, but then I took a trip down memory lane and began to doubt.

I first started interviewing Java candidates back in 2006, eight years ago. I’m proud to say that the very first candidate I ever interviewed is to this day a very dear friend. In these 8 years I’ve seen outstanding, great, regular, mediocre and really bad sets of skills – and that’s only to be expected. However, I do think that candidates were very different in 2006 than they are today, the reason being that frameworks and libraries have come a long way in these 8 years.

The 2006 landscape

Back in the day, Hibernate and Spring were only in version 2 and starting to dazzle the standard Java EE stack. Struts 1.2 was the most popular MVC framework, with JSF and Spring MVC as distant contenders. Of course there were other frameworks around, but these 3 basically constituted the backbone of most Java apps in those days. What those frameworks did really well was move complexity from the code (in some cases boilerplate code) to the configuration – quite a verbose configuration, I might add. On the data side, relational databases were the only way of storing data, with Oracle, IBM and all the traditional heavyweight providers ruling, while MySQL first became a serious option with the release of version 5.0. In this context, knowing how to identify long-running queries and to determine which indexes needed to be created was a basic skill. Finally, sites like Stack Overflow and other forums existed but were not as popular as they are today, and even Google was less sharp when it came to searching for technical information. I remember that getting a copy of a Manning “In Action” e-book was something similar to finding the one and only source of truth. In summary, there were a lot of tools, but you needed to know what you were doing.

The Rise

Time went by and frameworks got smarter, requiring less and less configuration through the use of annotations, fluent APIs, DSLs and “convention over configuration” models. A Manning book is not as valuable an asset as it once was, since there are now countless blog posts, forums and examples around on the internet. Relational databases now compete with NoSQL engines and full-text engines (like Lucene and Elasticsearch) which are just fast, no questions asked. And thus the black box was born: developers all around the globe are now able to quickly build very powerful applications in relatively little time, having to concentrate on little but their own business logic. And that’s awesome.

The Fall

The problem is that when I interview today, I see more and more developers who claim to be senior and very experienced, but all they know is how to use those frameworks. Here are some examples:

  • A question I frequently ask is how you would sort a 100GB file with all the phone numbers in the US, using nothing but a laptop with 1GB of memory, a 1TB hard drive, a text editor and a Java compiler. One candidate replied with something like: “This is very simple! I would just load the file into a DB and retrieve the results in a select query with an ORDER BY clause”. After reminding the candidate that his only tool was a Java compiler, I decided to go along with it and asked: “How would a DB engine manage to resolve the ORDER BY clause of a 100GB table with only 1GB of RAM?”. The candidate’s face turned pale and after some minutes staring at the table he said “I don’t know”.

    As a pointer, I asked back: “How does a DB index work?”.

    “Indexes make queries faster”.

    I replied: “The question is how they work, not what they do. How do they make queries faster?”.

    After some silence, the interview finished.

 

  • In another interview I met a candidate whose resume claimed he was some kind of expert in Hibernate. So, just as a conversation starter, I asked: “Could you name all the fetch strategies, how they impact performance and which one you would use in each case?”.

    “What’s a fetch strategy?” he said.

    I recently had the honor of having lunch with Gavin King, creator of the Hibernate framework, and I told him this story. He was really surprised and commented that “setting the wrong fetch strategy could be tremendously harmful for an application”. As a side note, I’m completely aware that Hibernate is not well suited for certain types of applications, but at the same time I have met many developers and architects claiming that “we need to remove Hibernate because it makes the app too slow” when those performance problems went away with just some small tweaks.
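For readers curious about the sorting question in the first example, one common approach is an external merge sort: split the big file into chunks that fit in memory, sort each chunk and write it to disk, then merge the sorted chunks while holding only one line per chunk in memory. What follows is only a minimal sketch under stated assumptions, not the one right answer. The class name ExternalSort, the maxLinesInMemory parameter and the temp-file naming are all made up for illustration, and the code assumes one fixed-width phone number per line, so that plain string ordering matches numeric ordering.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper: an external merge sort over the lines of a text file.
final class ExternalSort {

    /** Sorts the lines of input into output, buffering at most maxLinesInMemory lines per chunk. */
    static void sort(Path input, Path output, int maxLinesInMemory) throws IOException {
        List<Path> runs = splitIntoSortedRuns(input, maxLinesInMemory);
        try {
            mergeRuns(runs, output);
        } finally {
            for (Path run : runs) {
                Files.deleteIfExists(run);
            }
        }
    }

    // Phase 1: read a chunk that fits in memory, sort it, write it to a temp file (a "run").
    private static List<Path> splitIntoSortedRuns(Path input, int maxLinesInMemory) throws IOException {
        List<Path> runs = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input)) {
            List<String> chunk = new ArrayList<>();
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() >= maxLinesInMemory) {
                    runs.add(writeSortedRun(chunk));
                    chunk.clear();
                }
            }
            if (!chunk.isEmpty()) {
                runs.add(writeSortedRun(chunk));
            }
        }
        return runs;
    }

    private static Path writeSortedRun(List<String> chunk) throws IOException {
        Collections.sort(chunk);
        Path run = Files.createTempFile("run-", ".txt");
        Files.write(run, chunk);
        return run;
    }

    // An open run together with the line currently at its head.
    private static final class RunCursor implements Comparable<RunCursor> {
        final BufferedReader reader;
        String head;

        RunCursor(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }

        boolean advance() throws IOException {
            head = reader.readLine();
            return head != null;
        }

        @Override
        public int compareTo(RunCursor other) {
            return head.compareTo(other.head);
        }
    }

    // Phase 2: k-way merge. The heap holds one line per run, never a whole run.
    private static void mergeRuns(List<Path> runs, Path output) throws IOException {
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter writer = Files.newBufferedWriter(output)) {
            PriorityQueue<RunCursor> heap = new PriorityQueue<>();
            for (Path run : runs) {
                BufferedReader reader = Files.newBufferedReader(run);
                readers.add(reader);
                RunCursor cursor = new RunCursor(reader);
                if (cursor.head != null) {
                    heap.add(cursor);
                }
            }
            while (!heap.isEmpty()) {
                RunCursor smallest = heap.poll();
                writer.write(smallest.head);
                writer.newLine();
                if (smallest.advance()) {
                    heap.add(smallest);
                }
            }
        } finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
        }
    }
}
```

A call such as ExternalSort.sort(input, output, 1_000_000) would keep roughly a million lines in memory at a time; the right chunk size for a real 1GB machine is something you would have to measure. A database engine faced with an ORDER BY that does not fit in RAM typically has to do something along these lines as well, spilling sorted runs to disk and merging them.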

 

I could go on and on with stories like this. Stories of candidates who were supposed to be really strong on OOP but couldn’t model something as simple as a composite filter, or who couldn’t draft a scalable architecture. I even met a candidate who claimed that web servers spawn one thread per request and had absolutely no idea of what a thread pool was. But that’s not the point I want to make.
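As a small illustration of the thread pool idea mentioned above: instead of creating a new thread for every request, a fixed set of worker threads is reused for all of them. The sketch below uses the standard java.util.concurrent executors; the class name ThreadPoolDemo and the method distinctWorkerThreads are hypothetical names chosen for this example, and the "requests" are just trivial tasks standing in for real work.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: many simulated requests served by a small, fixed pool of threads.
final class ThreadPoolDemo {

    /** Runs requestCount simulated requests on poolSize threads and returns how many distinct threads did the work. */
    static int distinctWorkerThreads(int poolSize, int requestCount) throws InterruptedException {
        Set<String> workerNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < requestCount; i++) {
            // Each task records the name of the thread that happened to run it.
            pool.execute(() -> {
                workerNames.add(Thread.currentThread().getName());
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("pool did not finish in time");
        }
        return workerNames.size();
    }
}
```

Calling ThreadPoolDemo.distinctWorkerThreads(4, 100) runs 100 tasks, yet the number of threads that served them never exceeds 4. Real web servers differ in the details (and, as with any design, pooling is not the only possible model), but this reuse of a bounded set of workers is the core of it.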

So what am I saying? Nothing but a plea…

So what’s my point? My point is that having teams of developers who are so caught up with the abstraction that frameworks provide allows companies to quickly build minimum viable products. But because those devs are not aware of the inner machinery underneath, because their minds are bound by the limits of the black box, when the time comes to address performance issues, to handle scalability problems, to make sure that the object model is flexible enough to easily accommodate changes – in those times – these developers won’t have the necessary tools to be successful. So are frameworks evil? Were we better off in the pre-2006 era? NO! Actually, my family eats because of one of these tools! Mule ESB and all the related solutions that are made for Anypoint Platform are part of these post-2006 tools.

What I’m making is a plea for developers to not lose their curiosity. Just because it doesn’t make sense to re-invent the wheel doesn’t mean that from time to time we shouldn’t stop and ask ourselves what it is that makes the wheel so great and why it is that it spins – because only if you understand why it spins can you know which size of wheel you need.

So this is my advice to any developer who is bored enough to have read this far: don’t lose your curiosity. To whoever is about to create a DB index, please take a minute to read about how those work, which types are available and which suits you better. To whoever is using Hibernate, when you do auto-complete and see a little attribute called fetchStrategy, go ahead and Google what that is. Use frameworks but look at their code, see how they work, experiment with the impact of small or big configuration changes. Be curious. Be bold. Even attempt your very own version of those products. It doesn’t matter if your product sucks big time; you will learn quite a few things you would never have learned from just using an existing one. Be curious. Do not forget that, as Plato puts it in Socrates’ mouth: “Philosophy begins in wonder”.

Thank you for reading!




15 Responses to “Rise and Fall of the Black Box Developer”

  1. Excellent!

  2. Very good. I think it is not uncommon for developers to learn-on-the-fly which is a bit of a siren’s call (to crash on the rocks). These developers tend to have very narrow views of technology, which is bad, but then they tend to generalize from this narrowness, which is worse.

  3. Great advice to keep in mind, awesome!

  4. Great Post!
    I like the point of view that you take on this topic.
    But I also think that as the society is becoming more demanding and urgent for technological solutions, it is something expected that the tools also evolve in order to build more things with less effort and counterattack this demand.

  5. Shouldn’t it be “plea”? Good article!

  6. Great post! I feel the same here in Brazil.

  7. Thank you Mariano for sharing such an insightful post. This is absolutely true and I call it consumerism of programming.

  8. True separation between creators and users that claim to be creators!
    Thanks

  9. I think it is a bad idea and concept to use the terminology of ‘black box’ when developing. Understanding how a framework or tool works, and its consequences, is imperative to use it effectively. Using a black box approach is tantamount to chucking a grenade in the mix and hoping for the best.

    It’s also not helpful to be one of those developers that will only use notepad for everything because they’re such cool purists, and tend to disregard efficiency to stroke their own egos ;).

    A pragmatic programmer that is focused on client short term (minimum viable product) and long term objectives (maintainability, scalability, etc) will develop with their eyes open. They will know the consequences of flicking the light switch without needing to understand the physics of electricity (metaphorically speaking).

  10. I can only agree to a point, besides some corporate programmers working on a fixed project nowadays programmers are supposed to work on very wide range of projects and finish everything in asap.. They never can have time to dig deep into how hibernate or a web server works.

  11. I too can only agree somewhat. These days you ask curious questions and often get “premature optimization”, this angers me so much that I wrote my own post: http://www.xenoterracide.com/2015/01/premature-optimization-is-not-evil.html

    however, I find some of your questions fairly depthy, sometimes, most of the time, the smart answer is to leverage a framework. Also different people have different specialties, for example I’ve failed almost every single algorithm question asked in an interview, because I specialized in Object Oriented Design, Software Architecture, System Administration, Security Engineering, and Relational Databases. I completely neglected algorithms, and it seems to be on the top 10 list of what I get asked in interviews, and yet I’ve yet to need to know how to do it in a job.

    I don’t know that I could answer your sort question, though I could guess, my database uses btree’s and have a minor idea of how that works, realistically though, I can’t give you an in depth.

    On the hibernate issue, per my blog post, I recently resolved a performance problem caused by someone querying the data, and then requerying it inside a loop, it took 20s to load a page, I took it down to about 4, and then suggested that we change our UI to not pre-generate reports, because the next 3.5 seconds was spent generating a pdf on page load whether you’d download it or not.

    I’ve had to spend time arguing with someone that I’m not going to do 2 queries when I can do it in one (and did).

    recently I wanted to get native Postgres UUID type support working with Hibernate and use H2 as a test database, what a pain, and no you can’t google it (well maybe you can now, I submitted a patch and left an answer on SO). I had to dig into Hibernate and Postgres JDBC drivers to figure out what wasn’t happening properly. Still need to consider further work on that.

    as far as Thread Pooling, as if that were even the only model to do web servers…

    and yet, all this and I’m not sure I could answer your questions, why? because I really don’t know algorithms and how they work… I think you should focus on a different kind of question (and get more disappointed) my favorite now is “what is SQL Injection and how do you prevent it?”.

  12. My first computer ‘role’ involved writing a drawing manager……. in QBasic…. with no database. It kept track of 1000 or so drawings, in a fixed-length-record text file. Sorting used a bubble sort algorithm, but sorted backwards, as it only needed one pass through the file if only one drawing was added….. if more than a few drawings were added, it sorted using a radix sort instead. This was at a time when I had no idea what SQL was…., so everything was coded manually.

    And, this was the first, and last time I ever used those sort algorithms in a professional setting… that was back in 1993!…. if you asked me now, how to sort those 100gb of phone numbers, given those restrictions, I would possibly go for the radix sort, as It would more efficiently use the available memory – QuickSort is the faster algorithm, but I am unsure how well it would perform when it is swapping items on disk, where a radix sort could implement a ‘bin’ in memory, and make a scan of the 100gb file, populating the bin in memory, then dumping that bin to disk, followed by another scan off the 100gb to a different bin in memory. This should cut down on disk thrashing….. Do I get the job?

  13. Hmmmm……. on further pondering the problem, I would imagine that the vast majority of those 100gb’s of numbers are contiguous integers…. so, I wonder if it might be possible to store the lot in a large hashmap of areacodes, and from/to numbers. So, 100gb file read in, sorted and compressed to a hashmap which hopefully never expands over the 1gb limit. Not the way a database would do it….. but, it’d be far faster!

  14. If I may put your comment related to re-inventing the wheel into another form: re-inventing the wheel is not something you should do for a living. But it’s a hell of an exercise if you want to become a better programmer.

  15. Hey Mariano!
    Was referred to this article by the awesome Mike Stowe!

    This is a very nice article, I came through these too while interviewing people down here on my end and I’m very much inline with your thoughts!

    “Just because it doesn’t make sense to re-invent the wheel doesn’t mean that from time to time we shouldn’t stop and ask ourselves what is it that makes the wheel so great and why is it that it spins”

    Good stuffs!