I thought I’d write a little about programming today. Usually, I use the blog for announcements and such, but maybe expanding a little will be a good thing. Let me know what you think.
I was going to post this on the discussion board, but I thought it might be of more general interest to discuss some programming topics. So, I am posting it here (it automatically gets sent to the discussion board, too).
Some thoughts on Sagelight and Programming
There have been a few general discussions on the discussion board (www.sagelightforum.com) about computer programming and Sagelight, and I thought I’d discuss a few things here.
It’s an interesting concept, especially for such a large program consisting of well over 800,000 lines of code, 2500 little graphic icons, and years of hard work.
I decided to write on a few topics after finishing the Cinepan Player. The Cinepan Player is a much smaller project than Sagelight, but it is still a fairly large program and a discrete piece of code, so it really showed me a few things that I’ve learned with Sagelight and that could be great for discussion.
I Learned a Lot More Than I Expected Writing Sagelight
When I started Sagelight, I had already had a long career as a consultant, programming on various projects. I have a great reputation for being able to solve hard-to-find problems that other people have looked at with little or no success, as well as for writing very complex, fast code in large systems. I have worked on all aspects of products, both enterprise-level and low-level embedded.
I really shine when it comes to crisp, fast, and sophisticated algorithms
That’s one of the reasons I decided to write Sagelight. The idea of writing a Lightroom/Photoshop-level program was very intriguing and fun. Plus, since I was very into Photoshop and mathematics, as well as photography, it was a great fit.
New Technology is Stunning, and Keeps me on my Toes
Because image-processing algorithms are so intensive, one of the first things I realized when writing Sagelight was that I really needed to understand new technology. For example, a few years earlier, you never used the floating-point processor unless it was really necessary. But, that’s old-school.
Another thing you avoided was using memory, whenever possible. In a way, that’s still the case, but CPUs now have so much cache that you want to use some memory but not a lot of memory, which is more of a radical change in the thought process than some people might think.
The Bokeh Was Slow Until I Realized Something about Caching
I actually wrote the Bokeh function twice. The first pass was just too slow. It was taking too long, and I just couldn’t figure it out. I profiled the code, and it really should have been running much faster than it was.
Then I realized it wasn’t the instructions I needed to be looking at, but the memory usage. The Bokeh algorithm was looking at maps for the shape of the blur. These maps are necessary and they are stored in memory. With a high-enough radius, the size of the maps (which grows as N^2, and that is always a time-killer) slowed down the process far too much.
I wasn’t quite sure what to do. But, then I realized that there really is a lot of core memory available in the CPU cache, but not enough to store these shape maps.
Then I decided to compress the maps and decompress them on-the-fly. The CPU cache can store the compressed maps. Now, when you’re doing a Bokeh function, Sagelight is literally decompressing data constantly as it creates your final image. Most of the time, the CPU never has to leave the cache to read from the main memory where the maps are stored, even though the code itself is referring to the maps constantly.
It turns out that with newer CPU architecture, the amount of memory available in the cache, and the speed of cache memory, it was much, much faster to actually perform a decompression algorithm than to just store the shape maps uncompressed!
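To make the idea concrete, here is a minimal sketch. It is not the actual Sagelight code. It assumes simple run-length encoding as the compression scheme (the real scheme may be quite different), and the names Run, compressMap(), and weightedSum() are made up for illustration. The point is only that the inner loop walks a small compressed map and expands it as it goes, instead of reading a large uncompressed one.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One run of identical weights in a row-major shape map.
// (Hypothetical format, assumed for this sketch only.)
struct Run {
    std::uint8_t  weight;  // blur weight shared by every pixel in the run
    std::uint16_t length;  // number of consecutive pixels with that weight
};

// Compress a row-major shape map into runs (plain run-length encoding).
std::vector<Run> compressMap(const std::vector<std::uint8_t>& map) {
    std::vector<Run> runs;
    for (std::uint8_t w : map) {
        if (!runs.empty() && runs.back().weight == w &&
            runs.back().length < 0xFFFF) {
            ++runs.back().length;      // extend the current run
        } else {
            runs.push_back({w, 1});    // start a new run
        }
    }
    return runs;
}

// Walk the compressed map, expanding it on the fly, and accumulate the
// weighted sum of the samples it covers. The uncompressed map is never
// rebuilt in memory.
std::uint64_t weightedSum(const std::vector<Run>& runs,
                          const std::vector<std::uint8_t>& samples) {
    std::uint64_t sum = 0;
    std::size_t i = 0;  // index into samples
    for (const Run& r : runs) {
        if (r.weight == 0) {           // an empty run contributes nothing,
            i += r.length;             // so skip it in one step
            continue;
        }
        for (std::uint16_t k = 0; k < r.length && i < samples.size(); ++k, ++i)
            sum += static_cast<std::uint64_t>(r.weight) * samples[i];
    }
    return sum;
}
```

A blur shape with large flat areas compresses into a handful of runs, so the whole map can stay in cache even at a large radius.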
Sagelight has made me a much better programmer
Without realizing it, Sagelight has made me into a much better programmer. As the lines of code mounted into the hundreds of thousands, I needed to get organized. Very organized.
I started writing Sagelight in a highly Object-Oriented manner. First of all, I love object-oriented programming. It’s just great, and (along with some great, free libraries out there like TiffLib, Jpeg, and others) it is what makes it possible for one person to write something like Sagelight, where it just wouldn’t have been possible a few years ago.
Even though at least 30-40% of the code is written in multi-processing SSE assembly language (where OO has little impact), making the code very OO-based is key to keeping track of and re-using all of that code, whether it is SSE, assembly, or C++.
With such an object-oriented structure, I was able to write code to be very re-usable. In many companies, you see a lot of cut-and-paste strategies, where code is just stolen from one part of the project and copied into another area with just one little tweak.
I have seen this over and over, and also seen it become the bane of the project over and over, too.
Thank goodness I had that experience, so I knew beforehand what a trap copying-and-pasting becomes when you use it as an easy way to re-use code because you don’t want to take the time to expand a function that other functions are already calling.
Being able to write such re-usable and expandable code (such as through set() and get() functions, as well as virtual functions) becomes a way to build on what you’re doing rather than having it become confusing, as cut-and-paste actions do. That is one of the things you tend to ‘know’ as a programmer, but sometimes don’t want to follow because it is time-consuming up-front, which is always…well… a bummer.
The reason many programmers do this sort of thing in company environments is deadlines. They will often go to their manager and say that function X() needs to change to support a new set of code. But the manager knows that changing function X() even a little means testing the entire system over again, which means going upstairs to let them know more money will need to be spent just to make a little change. So, the manager says to the programmer, “Just cut-and-paste function X() and call it function X2().” The manager just saved the company some money in the short-term, but cost the project a lot, possibly the entire set of code, because doing this as a common strategy causes code to crash in on itself.
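As a small illustration of expanding function X() rather than cloning it into X2(), here is a hedged sketch. The names scalePixels() and ScaleOptions are hypothetical and are not taken from Sagelight. It shows one common way to do it: add the new behavior as an option whose default reproduces the old behavior, so every existing caller keeps working untouched.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical options added to the shared function. The defaults reproduce
// the original behavior, so existing callers do not need to change.
struct ScaleOptions {
    int maxValue = 255;  // new: callers may clamp to a lower ceiling
};

// The single shared implementation (this stands in for "function X()").
// New needs are met by extending the options, not by copying the body.
std::vector<int> scalePixels(const std::vector<int>& in, int factor,
                             const ScaleOptions& opt = ScaleOptions()) {
    std::vector<int> out;
    out.reserve(in.size());
    for (int v : in)
        out.push_back(std::min(v * factor, opt.maxValue));
    return out;
}
```

Old call sites keep calling scalePixels(pixels, 2). Only the new code passes a ScaleOptions, and there is still just one function to test and fix.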
Don’t get me wrong. I used to be that way, too, and writing Sagelight – because of the needs of such a project and the desire to make it work (that is, suddenly being manager, programmer, and that guy upstairs) – caused me to really think about long-term strategies and form some great processes that make such an incredible difference.
I’ve also learned that certain standards are there for a reason, even when they seem like they are in the way.
Every time I write a new piece of code, such as the Cinepan Player I just released – or even a smaller item as part of a larger project like Sagelight – I am faced with the same issue.
It goes like this: if I just want to get this function done, I can just go with it and write it out. But, if I want to start it right, by writing the exception handlers, implementing get() and set() functions, and so forth, the setup for it will take hours. Hours with zero tangible results.
So many times, it’s just easier to dive right in and start programming.
And I pay for it every time.
Here’s a good example. When I wrote the Cinepan Player, there were also a number of smaller items I wrote as console programs (i.e. non-Windows; just DOS/Cmd). These were meant as temporary programs for the creation of the Cinepan Panoramas.
But, then I realized that I needed to add support for adding your own panoramas, and these programs were not only needed for the overall program, but needed to be put inside the program itself.
Suddenly, I needed this code in the main program. I couldn’t just take a console program with all of its printf() calls and global variables (I didn’t OO it, either) and fit it into the Cinepan Player. Also, I had an error mechanism that just bailed to DOS on any error, where now it would need to clean up memory and present the error message to the user in a nice Windows box.
I then had to go through these programs and create a class structure, wrap the printf() calls into a virtual function (which was easy since, luckily, I wrote a printf() wrapper a long time ago for the Windows libraries I use), and then (the hard part) go through the entire set of code and make sure the error-handling caught any error and piped it back to the main program instead of bailing to DOS.
This took me nearly two days. If I had just stuck with what I know to be absolute standards from the beginning, it would have taken me just a couple of hours, because that’s what you get when you follow good coding standards.
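Here is a rough sketch of what that kind of conversion can look like. It is an illustration only, not the Cinepan Player code. The class names Output, ConsoleOutput, and BufferOutput and the functions buildPanorama() and runTool() are all hypothetical, and using C++ exceptions to carry the error back is an assumption. The tool writes through a virtual function instead of calling printf() directly, and reports errors by throwing instead of exiting.

```cpp
#include <cstdarg>
#include <cstdio>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical output interface: the tool writes through this instead of
// calling printf() directly, so the host decides where the text goes.
class Output {
public:
    virtual ~Output() = default;
    virtual void write(const std::string& text) = 0;

    // printf-style wrapper that formats the text and forwards it to write().
    void print(const char* fmt, ...) {
        char buf[512];
        va_list args;
        va_start(args, fmt);
        std::vsnprintf(buf, sizeof buf, fmt, args);
        va_end(args);
        write(buf);
    }
};

// Console version: behaves like the original command-line tool.
class ConsoleOutput : public Output {
public:
    void write(const std::string& text) override {
        std::fputs(text.c_str(), stdout);
    }
};

// Stand-in for the GUI version: collects the text for the host to display.
class BufferOutput : public Output {
public:
    std::string text;
    void write(const std::string& t) override { text += t; }
};

// Hypothetical stand-in for one of the panorama tools. It throws on error
// instead of exiting, so the caller can clean up and report the problem.
void buildPanorama(int imageCount, Output& out) {
    if (imageCount <= 0)
        throw std::runtime_error("no source images");
    out.print("stitching %d images\n", imageCount);
}

// Host-side wrapper: every error is caught here and handed back as a
// message rather than terminating the whole program.
bool runTool(int imageCount, Output& out, std::string& error) {
    try {
        buildPanorama(imageCount, out);
        return true;
    } catch (const std::exception& e) {
        error = e.what();
        return false;
    }
}
```

The console program would pass a ConsoleOutput. The main program would pass its own Output subclass and show the returned error text in a message box.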
I think I might do a post just about this subject, as it turns out to be very interesting, especially when taking a look at even small practices, such as making sure there is only one exit point of a function, or wrapping all memory allocations and that sort of thing.
More in a couple days
I actually intended this blog entry to be on something else. I wanted to compare the Cinepan Player code to writing modules for Sagelight, but I guess I had more of a stream-of-consciousness going on than I thought.
Please send me your feedback! Let me know if this is a good place to post this sort of blog entry (or not). I can always keep it on the discussion board. But, I’ve wanted to do a ‘programmer’s blog’ for a long time now, to discuss some of the interesting things (to me, anyway) going on with programming, in general, as I create Sagelight.