Monday, September 29, 2008

What are you afraid of? (i.e. apply for internship at Google)

So I really ran out of time this time. I've been spending the entire day idling and biking, so I didn't get the chance to write anything. Then I realized that today (Sunday) is officially my last day guest blogging here. To close my week of guest blogging, I should share, with whoever is willing to read, my internship experience at Google. Note that whatever I say here is from the point of view of a software engineering intern. There are other types of interns working here as well, such as marketing, business, or research interns.

The past four months have been the craziest time in my life. Depending on your luck, at Google you immediately get thrown into a project. If you're lucky, you will get a toy project (a small project that is supposed to gently bring you up to speed with your team's "ways of life", where "life" == "coding"); if not, you get your main project immediately. In either case, you have your host to help you out all the time. Apparently, that last part is not true for all interns: some interns have hosts who have to go on trips to other Google offices, or a host who is a product manager (and therefore very busy), etc. I'm pretty lucky that my host actually sits next to me (well, most interns have similar arrangements), so I get to ask questions easily.

Coding-wise, most hosts will expect a good sense of coding. They expect you to know your object-oriented programming and, if you have not yet learned the language you're going to code in, they expect you to pick it up on the fly. It's not as bad as it sounds. I had absolutely no experience with C++ before I got into Google and, right now, I would say that I'm confident with my C++ (not an expert or anything, but good enough to program almost anything you throw at me). Fortunately, at Google, we try to avoid the weird, crazy parts of a language. C++, for example, can be used in procedural ways, in object-oriented ways, in functional ways, and for template meta-programming. Functional programming and template meta-programming are pretty much up there in the bizarre world of the advanced (or, more accurately, obscure) parts of the C++ standard, and only a few people would know how to code in this style. Fortunately, we don't need to. That's one of the best things about working at Google: we get to read other people's code easily because we don't pepper our code with those obscurities (well, sometimes it can't be helped, but those cases should be very, very rare). I also got to brush up my Javascript and HTML a lot. I would say that I've gone from being okay with Javascript to very good with it.

It's okay if you don't know what design patterns are, or what C++ templates are, or that for-in in Javascript does not do what you expect when used with arrays. But you're expected to learn them and apply them. It definitely helps a lot to have a strong OOP grounding.

Most days will be filled with coding, so if coding is your cup of tea, you'll have a good time here. If you hate coding, try to like it before coming here. You definitely need to code here (less if you're doing research, a lot if you're doing software engineering).

As interns, we have most of the privileges full-time Googlers have. This is really good! We are not slowed down by an inability to access, say, code that some other Googler wrote. (That seems to be a problem at some other companies, where everything is so confidential that interns need to go through several layers of bureaucracy just to get access to a piece of code, a wiki page, a design document, etc.) Unfortunately, whatever I worked on at Google stays within Google. Inside, we're very open and talk about almost everything. You get to learn whatever you want to learn (and can squeeze into your brain) in addition to learning whatever your host throws at you. You get to discuss things freely with other Googlers (interns included). But once you talk to non-Googlers, everything becomes a black box. I found it really hard to talk about my work to non-Googlers, since I have to resort to generalities instead of specifics. For example, I can say I'm working on something utilizing something like MapReduce, but not exactly MapReduce. Fortunately, Google has open-sourced quite a few things and published papers on others. The things that are already out in the open, we can freely discuss with anyone we know.

If you're lucky enough to get an internship at Google, I can guarantee you one thing: if you have the right attitude, you will learn mountains of stuff. And it is really, really interesting stuff. You also get to learn "common sense" stuff (programming techniques, points of view, knowledge, etc.) that is apparently not very common elsewhere (it really is just common sense, nothing extraordinary). Furthermore, we have a series of internal talks (dubbed Tech Talks) that any intern can attend. These tech talks range from authors' talks, to visiting research scientists' work, to Google tools, to development being done by other teams. They also include some really, really interesting talks that will benefit you outside Google. Unfortunately, again, I can't provide you with any sample talks (the NDA is one scary thing). On average, I attended 1.5 tech talks a week.

Oh, and did I tell you that you get a workstation to yourself for your work? (Of course the workstation is not yours, but you're the only one who will be using it while you're there for the internship.) You have the same amount of table space and shelf space as anyone else near you too! You might get additional privileges depending on the kind of work that you do. Our code repository is pretty awesome (it beats cvs or svn by orders of magnitude!). We can submit code to wherever we want (there is a standard procedure to submit code, but it doesn't differentiate between full-time Googlers and interns, and the procedure is pretty straightforward).

Now, in addition to things going into our brains, we interns also have things going into our stomachs! Yes, like all other Googlers, we get to eat free, gourmet buffet meals in any of the 18+ cafes at Google, 3 times a day, 5 days a week (during weekends, only 1 cafe is open, serving lunch and dinner). The food here is awesome. Just imagine spending $25-$50 outside for a meal, 3 times a day, 5 days a week. That's probably the right equivalent value of Google food. And the food is very, very healthy! We get access to all the micro-kitchens (pantries that are stocked with tonnes of snacks and cereals). Get ready for a 15-pound increase in your total weight over 12 weeks of internship. Other privileges include gym access (to attempt to shave off those 15 pounds), the infinite pool (a water treadmill), the internal mailing lists (ranging from a road bikers mailing list to photography to technical stuff; there is even a Singaporeans mailing list), intern-only events (events full-time Googlers don't even have access to ;)), and the TGIF.

The TGIF is a weekly event (you guessed it, it's held every Friday: Thank God It's Friday!) where the founders, Larry and Sergey, share the significant things Google did as a company in the past week. The contents of TGIF are mostly confidential (think Google Chrome, just a few weeks before the general public release).

In short, if you're an intern at Google, you can consider yourself a Googler (after the first few weeks; for the first few weeks, you're called a Noogler rather than a Googler; don't ask me what Noogler stands for, I don't know!). As my internship is 7 months long, I'm lucky enough that right now I'm working as if I'm a full-timer, doing the stuff any other full-timer does (minus some, but not much).

I would encourage the programmers out there to apply to Google. If you're a non-programmer, but really have a passion for programming and are willing to slog through and learn a programming language quickly, you're most welcome to apply too. But Google is so difficult to get into, you have to be the smartest student around! Or so you complain. How exactly do you define smartness? Is it CAP 5.0? Is it programming skills? Is it common sense? If you don't try, you never know.

That said, you should know yourself best. So I'll just throw out some stereotypes of Google interns. Interns are usually smart (in the sense that if you throw a complex algorithm at them, they should figure it out easily, and if you throw a difficult open-ended problem at them, they should figure out how to solve it before too long), we usually code well (emphasis on usually; not all interns code that well), we are independent (there are many learning resources available at Google, and you are responsible for picking the best learning methods; that includes asking other Googlers ;)), we are fun-loving (most interns have things they always do during weekends; I love manga and biking, a friend of mine plays volleyball every day, yet another loves water rafting, etc.), and we don't give up, yet know when to give up (think that one out yourself).

Check out Google's jobs page if you are interested. You will need to submit your resume online (during the stipulated period), and the recruitment team will do a first screening based on the resume. If you pass that stage, you will be assigned a recruiter who will be your friend throughout the entire application and selection period. You'll have 2 phone interviews, both technical. That means five minutes of chit-chat, 50 minutes of technical questions, and five minutes for the interviewer to answer your questions. Mind you, only your answers to the technical part matter (they don't care if you are not the most chit-chatty person on earth, or if you don't have any questions for them). You can't really bluff your way through the technical interview. They'll make sure you get questions you don't know how to answer. Thus, preparing by reading past interview questions will not help you; on the other hand, brushing up on your basic algorithms (the first half or so of this book, a book I recommend all computer science students have on their bookshelf) will help you a long way. Be confident and talk! Don't keep mum throughout the interview. The interviewers want to know how you think, not just your final answer. It helps if you're particularly strong in an object-oriented programming language (e.g. Java, C++, Python, Objective-C, or Javascript). If you get the offer, you can expect at least 12 weeks of exciting life in one of the Google offices (you get to pick which one, but acceptance is based on internship openings in the particular office(s) you applied to). You apply as a software engineer, though, and don't get to choose which team you'll be working with.

The last thing I want to bring up is the fact that a lot of NUS (or SoC) students give up before they even try! I've heard that NTU consistently manages to send a few students to Google for internships; SoC, nay! This year, we only have 1 undergraduate and 1 PhD student interning here! Are SoC students not as good as NTU students? I don't think so! But we do seem to give up when we hear the name "Google".

"No, it's too hard."
"No, it's impossible for me to get in, why apply?" "
"It's just a waste of time."

If that's what's in your mind right now, kick those thoughts out. Be honest with yourself: if you think you have a 5-10% chance of getting into Google, then that's good enough! Again, a qualifier: you do need to know your own limits. If you're consistently scoring below 3.0 in CAP (hey, if you're smart, you should at least get 3.5 easily without studying), then maybe you should be rethinking your priorities. You should be pulling that CAP up first! After all, if I recall correctly, the jobs page indicates that you need a GPA of 3.0/4 to get in (that is a B/B+ average).

To put you at ease: I'm probably one of the unlikeliest people to get into Google, but I did anyway. I had just managed to pull my CAP up to 3.83 from 3.66 the previous semester right before I applied to Google, so that saved me a little trouble. (Right now, I've managed to pull it up to 4.02, but hey, that's beside the point! At the time, my CAP was borderline!) Nobody would think of me as someone who's smart (at least not in terms of CAP). I did have dreams of working in places like Google, but didn't actually have real hope of getting in with that kind of CAP; I was lucky to have a professor who kept insisting that I apply. So I did. And here I am. If I can, you can too! (Wow, what a cliche!)

So, I guess, this will be the end of my week of blogging! It's been great posting all sorts of weird posts here. Hopefully, next year, someone will tell me that he/she has been accepted for an internship at Google. If that's my only achievement through this series of posts, I'm happy enough. (:

Ja ne!

As a final parting gift, here is something pretty cool to watch! (:

- Chris

Friday, September 26, 2008

Faster Web: CSS Sprites

I've been really busy at work, so this is likely a short post (but hopefully an interesting one). Almost all of us use the Internet on a daily basis, but a lot of the websites out there are very inefficient and slow! There are two ways to improve latency for these websites. One way is to tweak the backend code to be more efficient. For example, eliminate SQL JOINs and use less normalized tables; yeah I know, against CS2102S principles of normalization, but guess what, table JOINs suck! A JOIN constructs a table in memory, and if you're not careful, that table could be huge (I'm talking about INNER JOIN here; don't even ask about CROSS JOIN). Another example is to use proper INDEXes in your database. Yes, all these are good, but you know what, after some point, tweaking the backend becomes a really hard problem that can take months (imagine Google or Facebook scale) to complete, only to reduce latency by a small 5-10%. The better way to improve latency is to improve frontend latency! That is, how your website gets to your users' web browsers, how it is rendered, how to speed up the Javascript, etc. Improving frontend latency can speed up your website by at least 20%. Just with the technique I describe here, some websites could load up to 50% faster (your mileage will vary though, as you'll see).

Yes, in this article, I'm going to show just one way to improve your website's latency. (For more, visit Steve Souders' High Performance Web Sites website or buy his book.) Just one, but I assure you, it'll help you write faster websites. (I actually wanted to do two, but decided not to do the other one right now.)

The rule is: reduce the number of HTTP requests.

Yes, it's a simple rule, but guess what, almost no one follows it. Don't believe me? Download the Firebug extension for Firefox and use the Net tab to analyze websites. Let's just try it with a website we are all familiar with: www.comp.nus.edu.sg! Yes, the School of Computing's website! Open the website, open Firebug, open the Net tab and enable it. Press Ctrl+Shift+R (or Cmd+Shift+R in OS X) to force a cache reload (most visitors to the SoC website can be assumed to be first-time visitors who want to evaluate SoC as a choice of school). Count the number of requests. Believe me now? Yes! The SoC website makes 42 HTTP requests! Unbelievable! (Some websites and blogs can have up to 100 to 200 HTTP requests, especially those *ahem*download*ahem* websites with tonnes of ads.)

SoC website's # of HTTP requests
Taken with WebKit
42 requests! That's the number of requests the SoC website made. The top image is taken from FF3's Firebug Net tab, which is not very accurate; the bottom picture is a more accurate representation taken from WebKit's (Safari nightlies) resource inspector. Note the time taken to send the HTTP request (lighter colour) and the actual time taken to transfer the data (darker colour). Neither is as accurate as sniffing, but I don't have the right tool on this computer.


So how do we decrease the number of HTTP requests? If your website has tonnes of small images, the best way is to use CSS sprites! In the past, web designers sliced and diced their images to create smaller images that were arranged with tables (and whatnot) to form the full image. The reason: the cable that connected the home to the internet was slow! 56Kbps to download a huge 500KB image? That would take ages! But today is a completely different age. We now sprite images together. We combine all those small, few-kilobyte images into a bigger 50KB-100KB file. Why? The cable is now fat enough that negotiating TCP connections and HTTP requests can be the bottleneck (remember, HTTP requests contain a lot of headers, including that huge header that everyone is afraid of: Cookies!).

How does this work exactly? Well, recall some CSS. With CSS, you can set the background of a "box" (a div or an img) using the background property. At the same time, you can size the box with height and width. You get the idea? Yes: set the background image to the sprited image, and size the box to the size of the image that you want. That's not all, is it? Right, that's not all. At this point, you are probably showing the top-left part of the sprite. Now we want to adjust the background image's position. No problem! We can set the top and left offsets of the background image to negative numbers! Yes! Let's combine all of that together, shall we? For this example, let's show a rectangle that is part of the image above. The CSS I'm going to use is:

#the-sprite {
  background: url('the-image.jpg') -75px -80px;
  height: 80px;
  width: 100px;
}

That's it? That's it! (Well, rinse and repeat for the other divs and imgs; the main idea is that you're only downloading 'the-image.jpg' instead of a lot of different images.) Let's see it in action! Here is the resulting image:



Now, I'm using a div for simplicity, but you can also use img, li, and some other elements. If you're using img, however, you need to use a placeholder image as the src attribute. Maybe a 1px by 1px jpeg image (use this same image everywhere so that you only need to load it once)? Also note that while I use inline CSS for this example, it is better to set the id of the div or img and use an external stylesheet.

That's way cool, right? Try it the next time you design a website! A rule of thumb: a big image is better left alone. A dynamic image (e.g. an image in a blog post that may not appear every time you load the blog) is better left alone too. So do this for all those static, small images you use. This method is very versatile; you can even combine it with your Javascript skills to create cool stuff like sliding images, hover menus, etc. Your imagination is your limit!

As with any technique, there are drawbacks. What's the drawback of this one? All of the sprited images are loaded together! So before the download of the sprite is complete, none of the images are displayed. If your page only downloads one or two resources other than the HTML itself (say a stylesheet and the sprite), this method may make loading slower instead of faster. The reason is that you can't parallelize the downloads. Remember that modern browsers usually download two to six resources in parallel (FF2 downloads two in parallel, FF3 downloads four or five). So play with your resources. Arrange them so that you enjoy the parallelization of downloads while at the same time minimizing the number of HTTP requests (remember, although browsers parallelize downloads, they only download a few at a time, so if there are tens of requests...). Remember to exclude Javascript from your calculation. Javascript is treated differently; the way it is treated warrants another article on its own. I may not have the time to write that one up though. ):

Another drawback: spriting images is not an easy task! Furthermore, if you add new images, you have to add them to the sprite. It's troublesome, and that's why not many websites use it (mind you, a lot of the larger websites do use this technique!).

This technique alone could reduce the SoC website's number of HTTP requests by more than 20! (I did a rough count and found 24 static images that shouldn't change until the website gets redesigned.) It is harder to apply the technique to blogs with pictures that change all the time, but it is still possible to sprite the static images (e.g. buttons, navigation images).

Well, that's it for today! Hope this post helps you in some way or another the next time you design a website.

P.S. A List Apart has an article on making a navigation bar with CSS sprites here. A friend told me it's pretty cool (I haven't read it myself).

- Chris

Thursday, September 25, 2008

120psi awesomeness

Okay, let's do something totally random today. d:

Now, if you have read my introductory "essay", you'd remember that I cycle practically everywhere over here, including to and from work every day. It is considered part of life here in Mountain View (a lot of people cycle; not as many as in Europe, but far more than in Singapore). For road cyclists, tire pressure is of utmost importance. You should inflate your 19/23/28mm tires to at least 100psi. I'm quite conservative about that and usually ride at 105-110psi, which feels okay. So today, I decided to be less conservative and pump the tires up to just above 120psi! That is actually recommended for someone like me, since I have quite a heavy build (umm, just say fat!). You know what? The ride was awesome. The rolling resistance of my Vittoria Zaffiro Pro tires dropped significantly. Cranking up the speed became so much easier, even on slight uphills.

It was a little scary! It's like the first time I got my road bike and the first time I sped above 20mph (that's over 30km/h). This morning, I definitely hit a new top speed (well, at least on a flat road; downhill, it's even scarier!).

Let me deflate my own happiness then. Having a high(er) tire pressure actually gives the illusion of going faster. That's because more of the vibration caused by the uneven road surface is channeled up to your handlebar and saddle. It makes it seem as if you're going faster when actually it's all an illusion. Sigh. But the bit about reducing rolling resistance and actually going faster still holds true. The only way to be sure is to time yourself, though (and lo and behold, I got faster by ~11%!)

It wasn't just because of the increase in tire pressure. Over the past weekend, I've been busy hunting for parts and upgrading my bike. My hands were all oily and greasy (plus full of grime), but the results were worth all the hard work. So what exactly did I do? I upgraded the front crankset and the entire front drivetrain!!

The old FSA crankset and chain
The old FSA compact 50-34T crankset (for sale) and chain. You can see the chain tool on top right, along with part of the chain cut from the new chain.


Originally, I had an FSA Energy compact crankset with two chainrings, 50T and 34T (along with a Shimano 105 front derailleur). It was noisy and it was horrible on uphills. 34T is not small enough on a steep uphill (especially for a fat person like me). I had to constantly pump and waste energy when going uphill. Bad. So I decided to upgrade to a Shimano Ultegra triple crankset! That is, one that provides me with 52T, 39T, and 30T chainrings. That meant I had to upgrade the crankset, the front derailleur, and the chain! Yes, it was an expensive upgrade, but it was all worth it.

With the new 52-39-30 combination, I can ride on a flat road at a faster speed with a better gearing ratio (using the 39T middle ring). On a downhill, the 52T gives a really good gear (and much faster speed; those 2 extra teeth give a little extra speed, and coupled with the descent, it's just much faster now). Most importantly, the 30T! Yes, the so-called granny ring (demeaning, I know, but who cares, I can conquer more hills with the "granny" ring) enables me to go uphill much better than before. No pumping. Just constant spinning (pedaling as fast as you can). It doesn't build strength as much as the larger gears, but it gives a good aerobic workout.

The new Ultegra front drivetrain
The new Ultegra crankset with the derailleur just above it (top); top view of the triple crankset, where you can see the three chainrings with the chain currently on the lowest gear (bottom).


Now, the Shimano Ultegra 6603-series front drivetrain (consisting of the crankset and the front derailleur) was paired with Shimano's top-of-the-line Dura-Ace chain (and the stock Dura-Ace 10-speed rear derailleur). The chain runs so smoothly that it was quite eerie riding it the first time, especially after hearing, day in and day out, the noisy FSA crankset's friction with the old chain. The 10-speed rear derailleur gives me a total of 30 gear combinations! Now that's something. Not that I need all of them (I haven't touched the smallest 5 gears or the largest 3 gears, ever). The only combination that gives more gearing is a Campy 11-speed rear derailleur coupled with a triple crankset (33 gears! wth!!).

Oh, and btw, being your own bike mechanic is fun! If you have a bike of your own and want to upgrade or replace parts of it, try it yourself! Ask your bike shop's mechanics for some pointers and read up online (and get the right tools!). Sheldon Brown's website would be a great place to start.

-Chris

Tuesday, September 23, 2008

C++ Corner - Smart pointer #1: scoped pointer

Now, this post is gonna be pretty dry and boring for some of you and exciting for the rest. One thing's for sure: it's gonna get a little technical and you might need to google some stuff along the way.

Before we start, let's read something more light-hearted first. Please read this. No, seriously read it. It'll lighten up your mind a little and make you happier.

It's a funny, manly bike ad posted on craigslist recently (archived to my web domain). (If it wasn't funny, well, you can bash... nah, you can't do anything to me, too bad!)

Okay, let's get movin'. This is my first technical write-up aimed at a more general audience, so do provide feedback when you think of any! (:



Umm. Pointers? Smart? Woot? Worry not: by the end of this post, you'll know about pointers and smart pointers well enough to feel confident coding with them (and, of course, to find out that you're dead wrong when faced with a wall of g++ compile errors and segfault-ing C++ executables . . kidding! I know you guys are smart enough to handle C++).

C and C++ are known for their notoriously leak-prone memory management. This is because you, as the developer, are in charge of doing the memory management. That is not as easy as you might think, as we will see later. There is the issue of when to deallocate allocated memory, and of what to do to make sure that the deletion is actually executed (what if there is an exception thrown earlier in the function?). Smart pointers attempt to make memory management a lot more manageable (pun not intended). By the end of this "article" (yes, it's gonna be long!), you should know a lot more about C++ pointers and hopefully you will be more confident diving into C++. Even if you don't know C++, it's a good idea to know about these smarty pants.

Stack vs. free store

We start with the very basic memory lesson: stack vs. free store. Memory allocation in C++ (and, as you'll see later, in Java and in simPL/rePL in the CS3000-something Programming Language class) is divided into two types: automatic allocation and manual allocation. Let's check out the easy one first: automatic allocation.
void Foo() {
  int number = 1;                 // auto variable
  for (int i = 0; i < 10; ++i) {  // i is also an auto variable
    string str("Hello");          // this string object is auto!
  }
  number = i;   // wait a minute, i no longer exists!
  cout << str;  // hey, str no longer exists too!
}

Sounds pretty normal, right? As the name suggests, you needn't do anything; everything is managed automatically by the compiler. These variables are allocated on the program stack. There is a problem though. What if you create a new object inside the Foo function and pass str into it? You could pass str by copying it, and all would be fine. But copying is slow, so you would most likely want to pass anything that is not a simple primitive (primitives == int, double, float; non-primitives == objects) by reference. Java defaults to exactly that: primitives are passed by value and objects are passed by reference. But if you do pass str by reference:
class ClassA {
 public:
  ClassA(string* str) : str_(str) {}  // constructor, assign str to str_.
  void PrintString() { cout << *str_; }
 private:
  string* str_;
};

ClassA* CreateA() {
  string str("Test");       // auto variable!
  return new ClassA(&str);  // pass the memory address of str.
}

int main() {
  ClassA* a = CreateA();
  a->PrintString();  // fails! segmentation fault!
}

Why does it fail (a segfault means that you accessed a memory address that doesn't belong to you)? Because by the time PrintString is called, the string pointed to by str_ (which is str) has gone out of scope and is no longer available on the program stack! It has been destroyed! This is true for both primitives and objects. (Copying the object works, though, and you can copy primitives too; but copying objects would be too expensive almost all the time.)

So auto variables don't work here. This is where the free store comes into play. Memory allocated in the free store remains available for the duration of the program (or until you explicitly deallocate it). Anything that is created with new (or malloc in C-style code) is allocated in the free store. Let's rewrite CreateA now:
ClassA* CreateA() {
  string* str = new string("test");  // str is now a pointer!
  return new ClassA(str);
}

There you go! Now the program works just fine. A pointer is something that contains the memory address of an object allocated by new (or has the memory address of an object assigned to it, e.g. string* str = new string("test"); string* str2 = str;). Thus you can pass pointers around and everyone will be able to access the same object (like in Java). Happy!

Um, no, it's not a happy ending (yet). What is the problem with this? The problem is the phrase "until you explicitly deallocate" the allocated memory. This is where memory leaks come from. To deallocate an object allocated by new, you call delete on it (to be exact, on any of the pointers that point to the object), e.g. string* a = ...; delete a; will delete whatever object the pointer a points to. Now let's do something simple:

string* str = new string("test");
// do something . . .
str = new string("test2");

Wait a minute. Simple my ar*e! Do you see anything wrong with this code? Assuming str was the only pointer to "test" when you reassigned it to the new string "test2", you just lost your only pointer to the string object "test"! Congrats! You just had a memory leak! Imagine similar code in a loop:
string* str;
for (int i = 0; i < 10000; ++i) {
  str = new string("hello");
}

Wow! You just lost 9999 instances of the string "hello" without knowing it!! (A dumb example, I know, but similar, more complicated code has caused nightmares elsewhere.)
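By the way, the manual fix, sketched below, is to remember to delete before every reassignment, and once more after the loop; exactly the kind of bookkeeping that's easy to get wrong, which is why smart pointers exist.

string* str = NULL;
for (int i = 0; i < 10000; ++i) {
  delete str;  // okay to delete NULL; frees the previously allocated string
  str = new string("hello");
}
delete str;    // don't forget the last one!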

Important note: an auto variable will always allocate the object and call its constructor. You can think of an auto variable as calling new and assigning the object to the variable. The difference is that the object is allocated in the stack space (thus, you can write things like vector<int> a; vector<int> b(a); where b copies vector a by calling vector<int>'s copy constructor). There are more complicated things going on with copy assignment, but let's leave that aside for now. (Take a look at this article if you want to know more.)

Note: This stack vs. free store split works much the same way in Java. Everything allocated with new is allocated in the free store. Everything that is not (int, double, float, char, etc.) is allocated on the stack (and passed by value, i.e. the value is copied instead of passed by reference). The difference is that the Java Virtual Machine (JVM) has a garbage collector (GC) that checks the free store periodically and deletes objects that are no longer reachable from the application. (Note: GC is slower than manual memory management! If you don't tune your JVM's GC parameters correctly, a memory-hungry program may freeze for several seconds during the garbage collection phase.)

The first smart pointer: scoped_ptr

The first of the smart pointers that will help you manage pointers is the scoped pointer.

When should you be using it? When the ownership of the object is clear! That means you can say for sure that object A should be the owner of the object obj, and that object A will outlive every function and object that needs to use obj. Otherwise, of course, a segmentation fault may occur.

How does it work?
We are going to look at how Boost (a C++ library) implements the scoped pointer. Let's take a look (I have taken the liberty of stealing Boost's interface and implementing it myself, cutting some corners):

// Note the use of templates here, similar to Java generics.
template <typename T>
class scoped_ptr : noncopyable {
 public:
  // Constructor assigns p to the p_ private variable. Now we own the object.
  explicit scoped_ptr(T* p = NULL) : p_(p) {}

  // Destructor will delete the object.
  ~scoped_ptr() {
    // Delete the object being pointed to by this scoped_ptr.
    delete p_;  // okay to delete NULL
  }

  void reset(T* p = NULL) {
    // Check that the passed pointer is not the same!
    if (p_ != p) {
      // Delete the pointer we are holding before assigning the new one.
      delete p_;
      p_ = p;
    }
  }

  T& operator*() const { return *p_; }
  T* operator->() const { return p_; }
  T* get() const { return p_; }

 private:
  T* p_;
};

Now we can use a scoped_ptr as an auto variable (do not allocate the scoped_ptr itself with new). Whenever the scoped_ptr goes out of scope, its destructor (~scoped_ptr) will always be called. Therefore the object pointed to by the scoped_ptr is deleted automatically!

scoped_ptr owns the pointer. Thus it is free to delete the pointer when it is being destroyed (the ~scoped_ptr destructor) or when it is being reset (the reset method). We also provide operator* and operator-> so that we can use a scoped pointer as if it were a normal pointer (this is something you can't do in Java: operator overloading). If we need the raw pointer itself, e.g. to pass to another method that expects a pointer, we call the get method.

Also note that scoped_ptr is non-copyable. You can probably imagine what would happen if two scoped pointers held the same object: the object would be deleted twice! Segfault! (There is a method that you can use to swap the contents of two scoped_ptrs, aptly named swap.)
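Before we get to a bigger example, here is a minimal sketch of scoped_ptr in everyday use (Widget and its Process method are just hypothetical placeholders, and I'm using Boost's header rather than the toy implementation above):

#include <boost/scoped_ptr.hpp>

void DoWork() {
  // The scoped_ptr itself is an auto variable; the Widget it owns lives
  // in the free store.
  boost::scoped_ptr<Widget> widget(new Widget());
  widget->Process();  // use it just like a normal pointer
  // No explicit delete needed: ~scoped_ptr runs when widget goes out of
  // scope, even if Process() throws.
}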

How to use scoped pointer
Consider two things before using scoped pointers. One: do I clearly know who owns the object (if not, consider using a shared pointer, coming up in another post)? Two: could I use an auto variable instead? If the answer to both questions is 'yes', then use a scoped pointer, and place it with the owner.

Wait a minute. If I can use an auto variable, why would I want to use a scoped pointer? Answer: to avoid unnecessary construction and copying! Remember, an auto variable always calls the constructor of the object: e.g. ClassA a;, a seemingly side-effect-less statement, actually calls the default constructor of ClassA, which might do a lot of work! Let's illustrate with an example, shall we?

// Usage illustration for scoped_ptr.
class ThunderHawk {
 private:
  // Allocating and calling the constructor of AddressBook may be
  // expensive. Additionally, address_book_ may not be used at all,
  // so we want to lazily instantiate it.
  scoped_ptr<AddressBook> address_book_;

  MailSender sender_;

 public:
  // Constructor will only call the scoped_ptr constructor, which is cheap.
  // Note the constructor of MailSender is called whenever the ThunderHawk
  // is created; here the constructor that accepts one int argument (the
  // port number) is called.
  ThunderHawk() : address_book_(NULL), sender_(25) {}

  // Destructor
  ~ThunderHawk() { /* do nothing for now. */ }

  // . . . Many methods here . . .

  // This method is one of the two methods that use address_book_.
  void SendEmailUsingAlias(
      const string& msg, const string& recipient) {
    // Now we need to find the e-mail address from the address book.
    if (address_book_.get() == NULL) {
      // Try loading the address book (by calling LoadAddressBook).
      address_book_.reset(LoadAddressBook());

      // If we could not load the address book, load the default one.
      // Another advantage of a pointer is you can use inheritance!
      if (address_book_.get() == NULL)
        address_book_.reset(new DefaultAddressBook());
    }

    // . . . Now we can use address_book_ like any normal pointer.
    address_book_->Validate();

    // We can also pass address_book_ to the sender_ object,
    // without worrying about address_book_ going out of scope.
    // Note that this call is non-blocking/asynchronous.
    sender_.AsynchronousSend(msg, address_book_.get());
  }

  AddressBook* LoadAddressBook() {
    // Note here, while LoadAddressBook creates the object, it does
    // not own it; instead it passes ownership to the caller.
    try {
      return new AddressBook();
    } catch (...) { return NULL; }
  }
};

What is the primary advantage of the scoped pointer? The pointer contained in it is guaranteed to be deleted when the scoped pointer goes out of scope, even if the enclosing object's destructor somehow throws an exception. scoped_ptr's destructor should never throw, but ThunderHawk's destructor might, before you get the chance to delete address_book_. Without a scoped pointer, there is no guarantee that the delete address_book_ below is executed:
AddressBook* address_book_;

~ThunderHawk() {
  // Do some cleanup here; this may throw an exception.
  delete address_book_;  // what if an exception was thrown earlier!?
}
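With the scoped_ptr member instead, the destructor body no longer needs to remember anything; here is a sketch of the same idea:

scoped_ptr<AddressBook> address_book_;

~ThunderHawk() {
  // Nothing to delete here: ~scoped_ptr runs on address_book_ when the
  // members are destroyed, so the AddressBook is freed automatically,
  // even if cleanup code in this destructor body throws first.
}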

Furthermore, you can free your mind to track more important things than making sure every pointer gets deleted!

So how was that? Does the scoped pointer's goodness make you feel all warm and fuzzy inside? It did for me the first time I saw it (and no, it's not love, though you might mistake it for love). It has a very smart and twisted implementation. If you want to use scoped pointers, consider using the Boost C++ library instead of writing one yourself (which is prone to errors). Alternatively, use auto_ptr, a more problematic cousin of Boost's scoped_ptr, but one included directly in the C++ standard header <memory>.

scoped_ptr: use it as often as you can!

Coming up sometime later this week will be shared_ptr, a more powerful, reference-counted smart pointer that enables shared ownership of an object (it should be used more sparingly though; needing shared pointers everywhere could mean that your program's ownership design is muddled).
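Just to whet your appetite, here is a tiny preview of shared ownership using Boost's shared_ptr (again, Widget is just a placeholder class for illustration):

#include <boost/shared_ptr.hpp>

void SharedPreview() {
  boost::shared_ptr<Widget> first(new Widget());  // reference count is 1
  boost::shared_ptr<Widget> second(first);        // reference count is 2
  first.reset();   // count drops to 1; the Widget is still alive
  second.reset();  // count drops to 0; now the Widget is deleted
}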

- Chris

Monday, September 22, 2008

Why OS X users shouldn't switch to Linux

Note: this is in response to Way of the Penguin

You can stop reading now if you're not interested in OS X. Disclaimer: I'm an anti-Windows/IE person (note: no mention of Office, Visual Studio, ASP.NET, etc., just Windows, just IE). I've used Linux for 5 years, but found the OS for me in Apple's OS X Tiger/(Snow) Leopard.

Answer: because there is no perfectly good reason to do so! This post will discuss why OS X is the better choice if you already have it (it might be useful for people considering getting one too). The reason I'm writing this post is that a lot of people have told me that a Mac is a waste of money and I should just stick with Linux. There are reasons why OS X costs money. And they are pretty good reasons (unlike that other operating system that costs a lot more and causes much more pain).

If you already have a Mac, you can do practically everything a Linux machine can do! Yes, that's right! And more. That's why you pay more for the OS, right? This article aims to bring those reasons out (wow! so high and mighty... what a typical OS X user! eh, what the heck did I just type).

Now, the first thing I check when I consider buying a computer is: does it have built-in ssh? (Ruiwen explained what ssh is a while ago, so I shall happily skip over it.) Does OS X have it built in? Well, I ended up with 3 OS X computers now (a Macbook Pro and an iMac, and another Macbook Pro on loan from Google), so yes damnit, of course it has ssh!

ssh on a Mac

ssh on a Mac, coupled with a gorgeous, ever-changing desktop background d:


Another benefit of having a Mac: Adobe design software. Yes, it's a pain doing things with GIMP (mind you, I've used Linux for five years now, and am still using it at work and virtualized on my Mac sometimes, and to this day GIMP is still a pain). And I do use Adobe software a lot: Photoshop for all my design needs and photo touch-ups, Bridge to keep track of all my photos, InDesign for pretty, pretty docs, and Dreamweaver (well, in the past at least; I no longer use Dreamweaver and prefer coding my HTML by hand, to keep the HTML and CSS file sizes to a minimum. Notice, no mention of Javascript; Javascript deserves its own post). If you're a graphic designer who has lived your entire life with Adobe Photoshop and Illustrator, you're better off sticking with what you have now; or, if you have a Windows machine, consider switching to a Mac and do away with monitor and colour calibration.

Great UI. Yes, the OS X UI is still unmatched (Vista even blatantly stole ideas from the OS X UI and slapped its dirty hands on them to form Aero). Not only is it pretty, it's very user-friendly. The amount of user experience research (usually referred to as UX research) that goes into OS X's design is tremendous. Having been directly involved with UI design several times, I know it is no easy task to make something that good. The thought they put into shortcut keys is amazing as well. I'm used to using the keyboard to navigate and do things, so good shortcut keys are a must.

(As an aside, while talking about keyboard shortcuts: for emacs users, there is another advantage to using OS X. You emacs users are very used to C-something and M-something keyboard shortcuts, which are the Ctrl and Alt keys respectively. In Windows and Linux, Ctrl is used for a lot of system shortcuts, making it difficult to use in-application and out-of-application shortcuts at the same time. In OS X, however, the most common shortcuts use the Cmd key, the little Apple key, so Ctrl and Alt can be freely used in emacs without causing confusion.)

Now, the thing about the OS X UI is that it is not limited to the operating system. A lot of third-party software for the Mac, free or otherwise, has amazing UI too! OS X attracts many kinds of developers, but the kind it attracts most of all is programmers with great UI and design taste. Look at Adium X (compare it with Pidgin), or look at the powerful TextMate text editor's soft-looking UI (and great choice of font, btw: Monaco). Or Keynote FTW (strictly speaking, Keynote is not third-party).

By the way, to balance things out: while the OS X UI is amazing, I'm not saying it is perfect. There are things I wish they had put more thought into, such as allowing right-clicking in a grid-view stack, an easy way to modify Dock and Desktop icons, allowing switching of Spaces while 'show desktop' is on, and some other minor details that bother me (those are rare moments, mind you).

Now, for me, the deal was sealed when a friend showed me the Terminal (look at the ssh picture above; that black window is the unfathomable, fluffy world of the command-line terminal). Yes, OS X is BSD-based. It has all of the standard BSD command-line tools (from the tired old cp, mv, rm, and ls, to the venerable find, xargs, and sed). Top that with the terminal-based emacs text editor (I have never, ever used emacs outside the terminal; even in Linux, I run (x)emacs with the -nw switch, which means no X GUI). With the terminal, all the reasons I had chosen Linux over Windows broke down, and I started saving money like hell to get my first Mac. So from an OS X perspective, we have just eliminated one reason to switch to Linux: the ease of the command line. Now, for those of you still stuck in the GUI world: yes, the command line does increase your productivity by a lot, even when you only apply it to day-to-day stuff. Moving/copying/deleting files in subdirectories several levels deep in your home directory can be done easily with tab auto-completion. The more daunting tasks, such as mass renaming, can be done on the command line much faster than renaming each file one by one (hint: a combination of find, xargs, and sed).

What else? Oh yeah, the built-in Apache2, FTP server, and Bonjour. With Apache2 and an FTP server built in, you have almost everything you'd possibly want to serve web pages and files (to yourself or to the world). I've recently set up TWiki on my own computer for my private use, and the experience was bliss. Take a look:

TWiki on my http://wiki

This webpage is accessible on my computer by simply typing http://wiki (and don't try to access wiki.chrishenry.com from your own computer; that page doesn't exist outside my computer, and in fact it will bring you to some weird website asking you to register the domain)


All that was done in about half an hour, and that was the first time I had ever set up a Perl-based CMS (why Perl, you ask? Simply because anything other than PHP-based is fine by me: Perl, Python, C++, Java, ASP.NET). Oh, and did I mention that TWiki is awesome? (No, I didn't mention it, in case you're wondering, but now I do.)

Now, Bonjour is arguably the strongest plug-and-play networking daemon ever written for home networks (and office networks, when administered properly). Ever wondered how your iTunes can detect other iTunes libraries on the network (yes, even on Windows; Bonjour has been ported to Windows by Apple), or how your computers can easily detect Bonjour-enabled network printers, Time Machine drives, and other Macs in the vicinity? That's all Bonjour. (On the flip side, Bonjour can be a network security nightmare in corporate networks if it is not administered correctly.)

Let's put it in a context more familiar to most SoC students: programming. For a very basic text editor with syntax highlighting, TextWrangler is a viable free alternative to TextMate. After trying different text editors, though, I've settled on emacs. It is powerful and fully customizable once you learn the shortcuts (oh, and CS1101S students rejoice! You can customize emacs easily using the LISP programming language, the more complex alternative to MIT/Scheme). Eclipse is also available for the Mac for those Java programmers out there. In addition, there is one language that you can only really code on a Mac: Objective-C. Objective-C is a very pretty object-oriented programming language with a dynamic type system. By very pretty, I mean the code looks pretty and is highly readable.

id person = [[Person alloc] init];
[person setFirstName:@"Chris" lastName:@"Henry"];


In Java, the equivalent of the above code is:

Person person = new Person();
person.setFirstName("Chris");
person.setLastName("Henry");


Objective-C is used everywhere in OS X. iPhone developers should also be familiar with Objective-C. So the Mac opens up one avenue of programming that you can't pursue on other platforms.

Now, what about C#!? ASP.NET? There's no Visual Studio 2008 for the Mac, is there? No. There isn't. Chotto matte kudasai! There is Parallels or Fusion! Both are virtualization software that enables you to run other operating systems on top of OS X. Recent multi-core CPUs also contain technology that allows the hypervisor to perform much better when running multiple OSes at the same time (Intel VT or AMD-V). You can virtually run Windows XP/Vista or any flavour of Linux on your Mac! They run pretty fast, and you can literally sleep the guest operating system and wake it up in seconds. If you max out your Mac's RAM to 4GB, running two guest operating systems is no longer a dream. (Recently, I ran Windows XP quite a bit to try out Chrome, and it runs faster than my Firefox 3 running natively on the Mac; it's an unfair comparison though, since I usually open 50-100 tabs in FF3 while I only opened 3-4 Javascript-heavy websites in Chrome, but still, virtualization is a very real alternative.)

Google Chrome on Fusion

Google Chrome running on Windows XP 64, which in turn runs on VMware Fusion, on a Mac!


The last thing I want to bring up is DarwinPorts, a package manager for OS X. Quite frankly, the one thing I do miss after moving to OS X from Linux is a package manager. I came from the Debian camp, so I'm used to typing apt-get install firefox3 xemacs21 nmap apache2 to download and install stuff automatically (Linux software dependencies are a b*tch in themselves! A piece of software you want may depend on x number of libraries and other software, each of which may in turn depend on more; without a package manager, they are just too hard to install manually, or . . maybe we are just too pampered by these package managers; I mean, people used to live without them, though back then the world was so much simpler). Luckily, on the Mac, most software comes with an installer. But if you want to use Linux tools on a Mac, that's gonna be a different story. What? Linux tools on a Mac? How? Well, most open source tools available for Linux are made to be portable across operating systems. You just need to have the dependencies installed, gcc (available in OS X), and the source code. Run the configure script, compile, run make install, and bam! It's installed. Sounds easy. But how the heck do I get all the dependencies installed? Manually? Right. That's where DarwinPorts fits in. It is a FreeBSD-like package manager for OS X with quite a huge library of packages (not as huge as the Debian repository though; the Debian repo easily has 13,000+ packages).

Well, so are there reasons for OS X users to switch to Linux? Almost none. But sure, even for a die-hard OS X user like me, there are perks in using Linux too:

  • Linux has this awesome feature where you can just hover your mouse over another window for it to gain focus, instead of clicking. This feature is awesome if you can touch type and your screen is small. The benefit is that while the window gains focus, it's not brought to the top. So I could hover over a half-visible text editor and type what I saw in my webcast while still having the webcast cover almost the entire screen. Now that's one bad-ass feature.

  • Compiz Fusion is really pretty. Compiz basically uses your usually idle 3D graphics card to do cool stuff with your desktop, e.g. wobbly windows when you drag them, accelerated window transparency, and a lot of other effects.

  • Speed: Linux is fast. Period. If it's not fast enough, recompile your kernel and remove all the useless modules (compile as many parts of the kernel as possible as built-ins and not as modules). Recompile everything! You can literally recompile everything optimized for your CPU, thus squeezing every ounce of performance you can get from your PC. (Or use Gentoo Linux, whose emerge tool will automatically compile whatever package you're installing; alternatively, Debian-based distros can use apt to compile from source, and Arch has the Arch Build System, which is pretty similar to FreeBSD ports.)

  • It's free and free. Yeah sure. I know the pain of saving enough money to actually get a Mac. Free is good. (Quiz: why did I say "free and free", isn't it redundant?)

Alright. Now we have one post for Linux and another for OS X. Who will be the one coming up with a post for Windows? (smirk d: oops)

Well, all right, I have to work tomorrow, so I shall end here. This post is by no means comprehensive; there is no mention of Spotlight, or Exposé, or Xcode, or Cover Flow, or . . . (ellipsis indicating I'm running out of ideas but refusing to admit it)

Btw, in case you are wondering what those (:, ):, d:, etc. are, they are reversed emoticons. I don't like MSN and GTalk trying to convert every smiley I type into a picture, so I started reversing them. I'm so used to reversing them that I just use them all the time now.

For reading 'til the end, doomo arigato gozaimasu. Ja mata ne.

- Chris

Hi there...

I guess I'm this week's guest blogger. Yep. Well, I was asked to do this short intro as my first post. I thought of doing a short, paragraph-or-two intro, but then I saw Ruiwen's intro and, 'holy crap!', it's long!! So I shall not hold back (after reading this thing, if you survive 'til the end that is, you'll probably guess how noisy I am; though you'd probably guess wrong).

I'm Chris, a rising senior (yes, I'm still a 'rising' senior and not a senior yet, since I took a leave of absence this semester) in the School of Computing's Computer Science stream. If you're wondering why I call myself a rising senior instead of someone who's going to be a final year or something, it's because I'm addicted to saying 'rising senior'. Ever since I moved to Silicon Valley a few months ago, people have been introducing themselves as 'rising sophomore' (going to be second year), 'rising junior' (if you couldn't guess it already, going to be third year), and 'rising senior', like me. I think it sounds so much better than calling yourself first-, second-, third-, or fourth-year.

So let's get on with the introduction. I entered NUS SoC's Computing programme in July 2005, and streamed to Computer Science a year later (well, it was quite 'exciting'; my grades barely made it for CS). I was also accepted into the University Scholars Programme (USP). (Now, I'll talk more about this in another post, but for now, a shameless plug: USP is a wonderfully awesome programme and, if you are a first year and think that you want to know more about it, let me know! Right now, USP is opening applications for freshmen for acceptance in their second semester. Application is, unfortunately, by recommendation only. But do drop me an e-mail at 'chrishenry.ni' et 'gmail' dot 'com', or chat it out on gtalk, and I might be able to help you find a sponsor ;).) You can also consider me a weird student who likes to waste his time. By my 6th semester, I had completed 143MCs worth of modules, but yet couldn't graduate in 7 semesters. Why? Because I took extra non-requirement modules! Yes, that's right! I took one USP module in excess, did Chinese I and II, did weird USP Advanced Modules that don't satisfy CS requirements, etc. But that's the point of university life, isn't it? To try out as many things as you can and enjoy yourself while doing them.

I stayed in hall for my first three years, meaning I just moved out of hall at the end of last semester. I stayed in Kent Ridge Hall, did many crazy things with them, and joined an unbelievable club. Cheerleading. Yes. It just so happened that a few seniors founded a cheerleading team (the real kind, the one where you lift girls and throw them without hesitation) and dragged some of us freshmen (at the time) into joining. The seniors were pretty cool and I had a really good time for the two and a half years I was there. We even took second in the nationals. Demo... (read: but...) my grades suffered quite a bit. I ended up being scolded by my AI professor at the end of my fourth semester, but was given a second chance and allowed to be an undergraduate tutor for CS1101S. Now, if you think teaching is a waste of time, think again! It was so darn fun! You can ask my ex-students how much fun I had teaching CS1101S (oh, and you earn quite a bit of money from it too). I decided to go all out and taught CS1102S and CS3216 concurrently the next semester. At the same time, my grades went up and my life hit its most exciting stretch (yet).

So the same prof asked me what I wanted to do this summer. And I just absent-mindedly said I wanted to do an internship to earn some money! Yeah, an internship in Singapore is, most of the time, just about earning money. You learn precious little (yes, you do learn, though it's not much). For most companies, interns are slaves! So guess what? My prof 'tekan'-ed me into applying for an internship at Google! I figured, what the heck, it only needed a resume and I already had one almost ready, so I just sent it out. A few months later (and two interviews later, plus hassling, torrential preparations, from visa to plane tickets), I'm here in Mountain View, at Google HQ, doing an internship with the Google Finance team until the end of the year. And life has been sort of an exciting up-and-down ride here. Yes, we are still slaves, though we slave much less. But we learn a lot and get to meet super intelligent people from schools like MIT, Berkeley, Stanford, etc.! Oh, and by learning a lot, I mean really as much as you want to learn and as much as you can squeeze into your brain in those short months interning.

I have this nagging feeling that one of the reasons I was invited to guest blog here is this internship. Yes, I was pretty much an unknown entity in SoC (except for my reputation as the guy who skipped more than half of his classes in his first two years; oh wait, my reputation just went down the drain, didn't it?), and I was always more into USP than SoC, so I was pleasantly surprised to be asked to write here. But the idea is cool and Ruiwen has ack'ed it by being its first guest blogger, so why not? Plus, you know, as a piece of inside information, I realized that the guest blogger roster consists of pretty awesome people who have done so much more than me! The only thing I wasn't so sure about was the tone I should take while writing my thoughts here. To be really honest, most of the time I say what's in my heart and mind without thinking, and refine it afterwards. So a lot of the time, I've written pretty honest opinions in term papers and whatnot, some of which can be a pain for SoC or USP or the university in general. Do wish me luck with my posts. ;) Still, thank you Alexia and Juliana for inviting me to post here.

Anyway, I wrote this post on Friday though you probably will only see it on Monday. So what do I do on weekends? I bike! Bike as in bicycle, not motorbike. The difference between Singapore and NorCal (Northern California) is that the weather here is so damn nice and the roads are bike-friendly. I got myself a road bike and have been biking extensively; I bike almost everywhere. And yes, tomorrow, my plan is to bike north to San Francisco and beyond (hopefully I'll reach Sausalito, a really nice town north of SF). That would be a total of 60 miles of riding and it will be pretty tiring. The one thing you need to bike that long a distance is mental concentration, especially on the downhills towards the end of the ride. At 20-30mph, any slight mishandling of your bike can send you tumbling right to the hospital's door. It almost happened to me a while ago, going down a descent on a hybrid bike with not-so-decent wide tires and lousy brakes; I lost my footing on the pedal and the bike swerved violently. I ended up with an injured heel, but it wasn't serious enough to stop me from biking.

Well anyway, I think I've written enough, if not too much, of an introduction. So this week expect some posts on life at Mountain View/Google, biking, teaching, technical but interesting CS stuff, OSX/Linux, and USP life (obviously I probably won't be able to write about all of these in a week). Also expect at least one overly persuasive post to help you decide that you want to apply for a job or an internship at Google. (What you shouldn't expect: any confidential data about Google or Google's internal technical information; I'm under an NDA and it would be bad, no, really, really bad, if I managed to leak something through this blog.) Comments and constructive debates are always welcome. Private comments are welcome too at 'chrishenry.ni' et 'gmail' dot 'com', which is also my gtalk.

- Chris

Way of the Penguin

Linux.

That's an interesting word. Have you heard of it? Maybe you have. Maybe it's floating around somewhere in the back seat of your subconscious, lurking in the shadows. You know it's there, you've heard of it, but you just don't know what it is, or what it looks like. Or maybe you do. From a lecture somewhere, perhaps. Or maybe you've read about it online. You know, that operating system. The free one. You do know it's an operating system, right?


Tux, the Linux mascot,
tends to hide in the shadows

Well, anyway, if you said yes, you'd be wrong.

Technically, "Linux" is the kernel of the operating system, first written by Linus Torvalds when he was a student, back in 1991. But hey, we don't want to nit-pick, so for the purposes of this post, we'll just use "Linux" to generically refer to that whole group of OSes that use Linux as its kernel. You might have heard of some of them before, Ubuntu, Fedora, Debian, openSUSE... any one ring a bell? Or hey, you might even be running Linux on your computer right now. If so, good for you!

The thing is, not many people are using Linux at this point in time, and many more have not even heard of it, much less used it. As such, Linux is still out there, lurking at the fringes, sometimes peeking its head out a little. Unfortunately, Linux still isn't really what you might call "mainstream", which means a large majority of the population (SoC students included) is unfamiliar with Linux: what it is, what it looks like, or what it can do. And we all know people are afraid of what's unfamiliar to them.

So tell you what. Let's use this post to bring Linux out into the light. Don't be afraid, Tux won't bite.

Maybe I'll start off with a little myth that I've heard about Linux in general. Mention Linux to Joe User on the street and you'll probably get the opinion that Linux is only for geeks and technical users. They say it's difficult to use, command-line based, and all-around user-unfriendly. Does that scare you? An OS that is "technical"? Wait, which faculty did you say you were from again? Oh, the School of Computing. The School of Computing? And you're afraid of getting your hands dirty with technical details? Fine, in my next post, I'll let the Dentistry students know to stop looking at teeth, all right?


Compiling software on Linux.
Who's afraid of the big, bad terminal?


Jokes aside though, that's outright wrong. Today, Linux-based OSes have progressed to a stage where their graphical user interfaces (GUIs) are as user-friendly as, if not more user-friendly than, either of the other two market-leading OSes. People tend to judge by appearances, so how good or bad something looks determines what they think of it. Granted, earlier versions of Linux might have been seriously butt-ugly, but not any more. Allow me to show you what I mean.



This is Linux
Screenshot by STAR_LIGHT2007 on Flickr





And so is this
Screenshot by enlightener on Flickr





Oh, and this too
Screenshot by Vulturo on Flickr




Guess what this is. Definitely Linux.
Screenshot by Filip Knežić on Flickr

(For more eye candy goodness, check out Flickr)


Ugly? Command line only? Difficult to use? Really?

Well, of course, eye candy alone does not make a good operating system. What's the use of looking good if you don't have the substance to back it up, right? People often ask me, "So, what's so good about Linux anyway? How does it compare to Windows, or OS X?" Usually, they're worried about ease of use and, more importantly, compatibility with the other OSes. For students, it's usually worries about being able to use Linux for their assignments, whether it's able to handle the Microsoft document formats, and so on. I tend to use myself as an example in my replies. I've survived four years in NUS, including one year spent abroad, and I've never had any problems submitting assignments or reports of any kind. Most Linux distributions come with the requisite software out of the box to get almost anything you'd want done. OpenOffice churns out documents and reports, GIMP does decent image editing, and Pidgin handles all your instant messaging needs without breaking a sweat. Oh, and just for the Computing students, you'll be delighted to know that it's almost ridiculously easy to set up a development environment for your favourite programming language and/or framework in Linux. Java? Check. C/C++? Check. Python? Ruby (on Rails)? PHP? Perl? Erlang? Checkcheckcheckcheckcheck. Apache, Tomcat, Postfix, MySQL? Oh yes, indeed. Oh, and did I mention that even the most basic text editor provided in most Linux distributions has support for syntax highlighting and auto-indentation for code?


Download what, ah?

I hear this most often during development projects. When we're about to begin coding, friends who are using other operating systems (especially the one developed in Redmond) usually start by running around the web downloading all sorts of packages from vendors' websites to get themselves going. All right, we need the MySQL server and MySQL Administrator from here, PHP from here, Apache from there... and so on. And then of course, sometimes there's the "Eh, you got Dreamweaver? Can lend me? Got crack?" And once they've got them all downloaded, the installation begins: doubleclick-yesyesyes-doubleclick-yesyesyes-doubleclick-doubleclick...

Linux users, comparatively, and contrary to common belief, simply point and click their way through a GUI, select the packages they want installed, hit Apply, and then sit back and let the machine download, install and configure the necessary software, all in one go. That's the best bit about using Linux and its related software, really. Thanks to the distributions' huge software repositories (stores of software packages usually maintained by the distribution's maintainers), Linux users are able to find almost any bit of software they need to get their job done with almost indecent ease. A sampling of the repository I'm looking at lists about 25,000 software packages, all of which can be installed via a simple point-and-click GUI.


Screenshot of the Synaptic Package Management GUI on Ubuntu Linux
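
If you prefer the terminal to Synaptic, the same repositories can be driven from the command line as well. Here's a minimal sketch of what installing the web stack from the scenario above might look like; I'm assuming a Debian/Ubuntu-style system with apt, and the package names (apache2, mysql-server, php5, php5-mysql) are the usual Ubuntu 8.04 ones, so treat them as illustrative rather than gospel. Other distributions have their own equivalents (yum, zypper and friends).

    # refresh the package index so we know about the latest versions
    sudo apt-get update

    # install Apache, MySQL and PHP (plus the PHP-MySQL bindings) in one line;
    # dependencies are resolved and basic configuration is done for you
    sudo apt-get install apache2 mysql-server php5 php5-mysql

    # not sure what a package is called? search the repository
    apt-cache search mysql administrator

One command, a few minutes of downloading, and the whole stack is up and running; no hunting around vendor websites, and certainly no cracks required.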

Not only do you get easy access to loads of software; because packages downloaded from the official repositories are digitally signed, you can also rest assured that whatever you are downloading is reliable and trustworthy, and not riddled with spyware and what-not. Furthermore, because we're all about Free (as in speech) software, all the software packages found in the repository are unencumbered by restrictive licensing and are therefore great candidates for redistribution and sharing. No worries about hunting down cracked license keys there.


How to connect, ah?

Apart from being friends to programmers, Linux distributions also play nicely with remote servers. I once took a module in which we had to deploy our projects onto the Solaris Zones provided by SoC. Now, anyone who's worked with one of these Zones before will know that just about the only way in to manage the files on the Zone is via SSH. That's all well and good until you realise that you'll need to edit the files on the Zone. How might you do this then? Should you copy a file onto your hard drive, edit it as required, then copy it back to the Zone? That's all well and good for a single file, but remember, you're dealing with an entire project here. Or perhaps you could use one of the built-in editors in the Zone's environment and just edit the files on the Zone itself. Anyone remember vi? Either way, such workflows are too clumsy and aren't too appealing to most of us. Especially when we've got a deadline hanging over our heads, and two other projects due at the same time.

Thankfully, for those of us who need to deal with files on remote servers, there is a solution. Most Linux distributions offer a way to seamlessly mount a remote server such that it appears as just another directory in the file system. From there, copying and moving files between different machines is as easy as transferring them between two directories. Drag-and-drop easy, in fact. What's more, since the remote server is now represented like a local directory, code editors can seamlessly open, edit and save remote files just as easily as they would local files. To the programmer, apart from a slight lag due to network latency incurred while saving the file, the fact that the file actually resides on a different machine is entirely transparent. Most of the common network protocols are supported, too: you can just as easily mount a remote server over SSH as over FTP, Windows file sharing, or even WebDAV. This of course means you save loads of time simply by not needing to manually manage your files. Oh wait, did I copy that over? Maybe I should just copy it over again. Oh wait, do I have the latest version? You get what I mean.


Right-click menu showing option to edit a file
on a mounted remote server (simulated using a virtual machine)



Drag and drop copying of files from one
remote server to another
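
The screenshots above show the GUI route; in GNOME, for example, you can type a location such as ssh://user@host/path straight into the file manager and browse the remote machine like any other folder. For the terminal-inclined, the same trick can be done with sshfs, the FUSE-based SSH filesystem. Here's a rough sketch; the host name and paths below are made up purely for illustration, and I'm again assuming a Debian/Ubuntu-style system for the install step.

    # install the sshfs client (Debian/Ubuntu package name)
    sudo apt-get install sshfs

    # create a local mount point and mount the remote project directory over SSH
    mkdir -p ~/myzone
    sshfs user@myzone.example.com:project ~/myzone

    # from here on, your editor sees just another local directory
    ls ~/myzone

    # unmount when you're done
    fusermount -u ~/myzone

Once mounted, there's no copy-edit-copy-back dance, and no wrestling with vi on the Zone unless you actually want to.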


So how?

As you can see, there are indeed advantages to using a Linux system, as compared to other more "mainstream" operating systems. Of course, these are only two features out of a host of many, many more, far too many to ever cover in a single blog post. The point, at least, is to show that Linux is, at this time, a viable alternative to the other OSes you find on the market. A large majority of people are afraid to venture into the unknown, and as a result miss out on giving things like Linux a try. Now that I've introduced a few features of a Linux system that I find handy, perhaps you might find that you'd like to give it a try as well. As they say, don't knock it till you've tried it; you never know, you might like it!

Anyway, Linux really isn't a new thing in the computing world. Since its birth in 1991, it's been steadily growing in terms of functionality, stability and usability (and all other forms of -ities). While you might not see it often on the consumer desktop at this point, it's handy to know that Linux has become a server operating system of choice for a number of government agencies worldwide. The list includes the Swedish Armed Forces, the Government of Switzerland and the Government of Japan. The city governments of Berlin and Munich in Germany have also announced plans to use Linux on their desktops. The National Security Agency (NSA) and the National Aeronautics and Space Administration (NASA) in the USA also use Linux in their operations. And of course, there's the (most likely!) largest Linux user of all, Google.

Like it or not, Linux and other free and open source technologies are slowly gaining ground. For the pragmatic folk, if you're looking for a job in the tech industry after graduation, you know what you should be brushing up on. For the rest of us, it's always good to gain a little more exposure to available technologies instead of being locked into a proprietary monoculture, oblivious to anything else.

--
There are a number of ways to get started with Linux, the easiest of which is via a LiveCD. A LiveCD is simply a Linux installation on a bootable CD. Just boot off the CD, and when that's done, you'll be running Linux. Not to worry though, LiveCDs won't touch the existing data on your hard drive, and the Linux environment you see is only temporarily loaded in memory. Take the time to explore the system, play around with the applications, and get comfortable. When you've seen enough, all you need to do is reboot your computer, remove the LiveCD, and you'll find yourself back where you started. One of the more popular Linux distributions, Ubuntu, provides a LiveCD for download. Should you wish to install Ubuntu on your computer, the live environment has a handy Install icon on the desktop that you can click to initiate the installation process. Needless to say, as with any other major operation on your computer, back up your data before you commence the installation.

For more resources on Linux in NUS:
--
Notes:
  • With the exception of the screenshots taken from Flickr, all screenshots were taken on Ubuntu 8.04 Hardy Heron.
  • If you'd like to reuse the screenshots from Flickr, I've linked them to their Flickr pages, where you can find licensing information.
  • The screenshots used are licensed under forms of the Creative Commons license that allow reuse, and have been attributed to their owners.

- Ruiwen

Thursday, September 18, 2008

Openness in Education

Hands up, if you've heard about MIT's OpenCourseWare initiative. Anyone?

All right, for those not in the know, here's a description of OpenCourseWare from the project's About page:


What is MIT OpenCourseWare?
MIT OpenCourseWare is a free publication of MIT course materials that reflects almost all the undergraduate and graduate subjects taught at MIT.

  • OCW is not an MIT education.
  • OCW does not grant degrees or certificates.
  • OCW does not provide access to MIT faculty.
  • Materials may not reflect entire content of the course.

So, in essence, they're putting up (almost) all their course material online, for the general public to view and download.

And they have a huge range of courses too, somewhere in the region of 1800 different courses.

All right, now hands up, those who thought, "Hey cool. Why doesn't NUS have something like that?" Good question, really. I have no idea. Of course, if NUS does indeed have something like MIT's OpenCourseWare, then this post is all hot air, so feel free to flame me in the comments. =)

Otherwise though, it does seem for now that NUS' course material exists in the walled and gated garden of IVLE. Oh wait, it's not even a garden where you can roam around freely once you're past the gates. I think maybe... dungeon... is the more appropriate word. With securely segregated cells into which nobody may enter without explicit permission. Have you faced this situation before? You'd like to check out the course material for a module that you're not taking at this particular moment, but when you try to access its Workbin on IVLE, all you get is a sign saying you do not have the access rights to view its contents. All right, that was helpful.

I'm not sure I understand the restriction of access to course materials though, especially for students of NUS. I had the impression that universities were institutes of learning and exploration, not of restriction. Even if course material is not opened to the public like MIT's OpenCourseWare (which, I guess, would be the ideal scenario), why are students restricted from viewing course materials by default? In an article on ZDNet Asia, dated 20 February 2008, Ravi Chandran of the Centre for Instructional Technology here at NUS mentioned that the University leaves it to the lecturers to decide whether or not to open up their course material, citing a "bottom-up approach". And indeed, certain lecturers do have their course materials listed on the open web (CS2105: Introduction to Computer Networks was actually the only one I found in a short survey of about 10 module pages, e.g. http://www.comp.nus.edu.sg/~cs2105). Also in the article, Chandran mentions that, "while laudable, open education appears to be more advantageous for educators, who [can] now reference other material while developing course material. Students... may not fully experience the learning by just downloading content from the Web". He also expressed concern that learners may face difficulties in discerning the "authenticity or accuracy" of said material. But surely, even if you don't have the lecturer standing in front of you, it helps to be able to reference the material as and when you need it? I presume, as university students, we don't exactly need our lecturers to spoonfeed us if we are to learn. Besides, if the material were published by the University's lecturers on an established source, for example the University's own OpenCourseWare project page, I guess, just maybe, we as students could put a little trust in the "authenticity and accuracy" of that material?


Screenshot of the CS2105 public webpage

In this day and age, I see no reason (to the best of my knowledge, at least; I acknowledge that there may be factors at play of which I am ignorant) why educational material should be kept under lock and key, or in IVLE's case, user-id and password. I have heard it mentioned that course material like lecture notes is not freely distributed due to issues of intellectual property rights. But what intellectual property are we talking about here? Are there concepts discussed in the notes that have been invented by NUS teaching staff but have not been published in a public, peer-reviewed journal? If not, why the secrecy? I guess the claim about protecting intellectual property rights is valid if there is licensed content, bought from other providers and used in lecture notes, to which NUS or its teaching staff do not have the rights. After all, even the MIT OpenCourseWare page states that material on the site "may not reflect the entire content of the course". Apart from that, however, I would assume that most, if not all, of the concepts taught in lectures are public knowledge anyway, and there should therefore be no reason to restrict access to them.

So here's a hope for the future: that course material for NUS modules will eventually be put up one day in a fashion similar to MIT's OpenCourseWare. After all, if we would like to make the claim that we are a world class institution, I guess it wouldn't hurt to take a page out of MIT's book?

--
For more information regarding OpenCourseWare and other similar initiatives, here are a few resources that might be handy:
--
So yup, once again, if I've missed out on anything, do feel free to point me in the correct direction in the comments.

- Ruiwen

Wednesday, September 17, 2008

Learn and Explore

I'd have to say, one of the more defining moments in my time here in NUS was being accepted into the NUS Overseas College (NOC) Programme back in 2006. At the very least, going on the NOC Programme allowed me to experience life outside of Singapore for an entire year and opened my eyes to the world. At the other end of the spectrum, I could say that the experience on the Programme was pretty much life-changing.

In brief, the NOC Programme hopes to help students develop a sense of entrepreneurship by sending them off to work in various locations around the world, namely Silicon Valley, Bio Valley (Philadelphia, USA), Shanghai and Stockholm. (A fifth location, Bangalore, is also available, but only for graduate students via the Graduate Research and Internship Programme (GRIP), also offered by the NUS Overseas Colleges.) While overseas, these students work in various startup companies, take courses at the local partner universities, and at the same time organise and attend events that focus on entrepreneurship.

Just a short disclaimer though. I currently serve as Vice-President, Events, on the NOC Alumni Executive Committee. However, once again, the views expressed in this post are entirely mine.

Initially, I never quite gave the NOC Programme much thought. Sure, I'd heard of it, but honestly, it was never one of my priorities. To me, one year abroad was just time away from friends and family, and away from my life here in Singapore. And life was going pretty well for me at that point. linuxNUS had just been founded, and I was scraping by, coping with my modules. When the time came though, I sent in my application all the same. I didn't harbour very high hopes of being admitted into the programme at the time, I guess. I wasn't a scholar, wasn't a Dean's Lister, and I simply wasn't anywhere close to being an 'A' student. And then, a few weeks later, I received an email telling me that I had been selected. Good grades or not, it looked like I had made it through.

As I soon found out, being an intern on the NOC Programme is far from the traditional concept people tend to have of "interns" in general. Far from being tasked with menial and mundane chores like filing paperwork and photocopying documents, NOC interns are often given projects to work on that involve the operations of the company. And far from being "just another worker", there are times when the companies would have just been founded, and the intern ends up being directly involved with helping the company establish itself.

Now that's all well and good, but some of you might be asking, "What does that have to do with being an SoC student? I study Computing, not how to start companies."

True, but the real value, I feel, lies in being made to overstep your own boundaries. Which, frankly, is a terrifying thought for quite a number of people. Here in Singapore, we're all conditioned very well to respect the boundaries. The boundaries are sacred lines that should never be crossed, and the boundaries are law. Over time, however, certain boundaries, if enforced too often, become our own limits. Much like the dog so used to being chained up that, even if the chain is removed, he never learns to move beyond its length. We should avoid becoming that dog and recognise that there is a whole world beyond what we sometimes limit ourselves to.

Take, for example, the use of a certain programming language in development projects here in SoC. We start our basic courses like CS1101 with Java. Then CS1102, in Java. Then CS2103, also in Java. Or at least that's how it went for me. And the list goes on into the higher-level modules. What this arrangement sometimes results in, however, are students in their third or fourth year who are more or less only familiar with Java. Not that there's anything wrong with the language. It's just that after three to four years in a school that provides education in what is perhaps the fastest-moving industry, some SoC students are familiar with only one single technology. These students have become so comfortable with what they are familiar with that they have never learnt to explore outside their comfort zone, and end up trapped within their own limits. Programming languages are just an example, of course, but the scenario described is pretty real.

So, even as Computing students, we really shouldn't restrict ourselves to the world of technology any more than we should restrict our own learning to the technologies taught in the lecture hall. Information technology pervades all aspects of the world today, and gaining an insight into as many fields as possible (business administration, marketing, publicity) may well put you in a position to see the proverbial "bigger picture". After all, aside from pure academic research, technical knowledge is best used when applied to a particular field. How would I use IT to improve a business process? How can my knowledge of web technologies help the company with its marketing efforts? Indeed, overstepping those boundaries, while perhaps initially uncomfortable, may well lead to greater insight into the "real world". Besides, if you never give yourself the chance to try something new, you might never find out what possibilities lie beyond. As a personal example, I started off as the normal SoC student, having done projects mainly in Java. I hadn't much web programming experience, and I tended to stay away from web technologies at that point in time. However, one of the first projects given to me at my internship company was to develop a web-based system. I sat down to pick up the technology, and have since come to realise that I enjoy developing web applications instead.

So indeed, it doesn't really matter whether you end up liking it or not. More important, I guess, is the experience gained from going beyond familiar ground. Just like our poor chained dog. Without the chain, he is free to explore beyond his usual territory. If he likes it outside, well and good. And if he doesn't, there's always the familiar and comfortable ground to retreat to.

And the great part about the NOC Programme is that it provides exactly these kinds of opportunities for students to explore new ground. I suppose, speaking as a person who has lived in Singapore all his life, spending a year abroad was almost the symbolic loss of the chain that bound me to what was familiar. If anything at all, it was an opportunity to start afresh, so to speak, free of the walls I had so comfortably lived within my whole life. Furthermore, when you allow yourself to be immersed in so many new experiences, and speak to so many new people (there are loads of opportunities to network while on the NOC Programme), you begin to realise that the world is a whole lot larger than what we can see from our little island in the sun. The people you meet (they might be startup founders and CEOs, or venture capitalists, or industry veterans) might share their experiences with you, and you gain from that. While in Stockholm, we were lucky enough to speak to people like Joe Armstrong, one of the creators of the Erlang programming language, Henrik Torstensson from Stardoll and Andy Smith from Jaiku (Jaiku was acquired by Google in Oct 2007), among many others, and to hear their stories.

Hej! 2007 about to start

Time in university is best spent learning, but then again, we cannot learn if we restrict our horizons to the confines of the tutorial room or lecture hall, or even the campus grounds. Similarly, we cannot learn about the world by staring at our feet. So yes, I would like to encourage you, dear reader, to learn by exploring and experiencing as much as possible. If there's something that you really find interesting, maybe you could skip your next lecture to go explore that interest. And if you're game for it, why not apply for the NOC Programme?

- Ruiwen