Project-Based Learning

Are we Artisans?

Computer Science has an interesting quality that sometimes makes it feel more akin to an art than an actual science: the ability to practice without risky consequences. Think about the plethora of practical sciences that exist. In fields like Medicine, Structural Engineering, or Nuclear Physics, beginners do not simply start by operating on a patient, building their first skyscraper or getting involved in the calculations for a nuclear reactor, respectively. Failure is catastrophic in these practical sciences, even at a small scale. You do not begin to practice your field until you are well beyond the beginner stage, and in some cases, you are not even allowed to take part in it until you have mastered it.

Computer Science is different, because the cost of failure is extremely low. You can learn about recursive Fibonacci sequences, write a command-line calculator or create a unit-testing framework without fear of catastrophic consequences. Unless you are working with production or live code, the worst that can happen is that you render your computer unusable. More commonly, you will end up with a nasty bug (or ten) and a few life lessons. Computer Science can feel more akin to an art such as drawing or sculpting, because as long as you have access to the medium you can keep practicing indefinitely. You can use any kind of paper to practice sketching, and inexpensive materials to learn sculpting. When learning to code, all you need is RAM, CPU and disk space (the last one is optional, since you could work without ever writing to disk), resources which are readily available even on phones. But coders are more than just artisans, because we also have eureka moments that either show up unannounced or creep up after hours of pondering. It is truly the best of both worlds.

Context is Key

Let us return to the Art analogy. If you want to learn how to draw realistic and accurate portraits you should not jump straight into drawing faces. That will lead to a lot of disappointment and hard work. Instead, you should first focus on learning the basics of drawing: straight lines, near-perfect circles or perspective. Once you have a better grasp of foundational shapes you can move on to more complex objects such as faces. This rule can be applied to any practical skill: Learn the foundations or basics, practice them continuously and then move on to more complex tasks. The two key aspects are: Continuous practice and incremental learning.

What is a face? A sum of its parts: ears, eyes, mouth, nose, head, hair, etc. And how do you draw a face? Incrementally. You would need to practice the overall shape of a head, and then slowly incorporate the remaining parts. Practicing each individual part without ever combining it with the others will rarely lead to realistic portraits. Portraits require that all individual parts blend in together. A frowning or crying face requires that the individual parts all move in unison.

The same analogy can be applied to learning any programming skill. You cannot practice or learn about Unit Tests without writing actual code to test. Containerization is of little use without proper CI/CD, or an understanding of microservices, because that is where its greatest potential lies. Mocking sounds stupid and is close to impossible without Dependency Injection. The list goes on and on. Attempting to learn coding skills in isolation leads to shallow theoretical knowledge, and a lot of frustration. You can keep drawing as many eyes as you want, but that will make you a good eye artist and not a good portraitist. The key ingredient missing from a lot of learning material, ranging from online videos and blogs to University courses, is context.

Toys are for Kids

Combining my own experience as both a mentee and mentor, I can attest that there is no better way to learn real-world programming than building projects from scratch. The one issue with University assignments or coding exercise sites like LeetCode is their reliance on toy problems. Toy problems are very good at encapsulating specific concepts, but their usefulness stops there. Toy problems will not teach you about the effects of bad decisions early in a project, or how good deployment practices can save you hundreds of man-hours in the long term. Abstractions can be extremely useful to simplify a problem, or make it more approachable, but if you abstract away too many concepts, it becomes a toy problem. An abstraction such as “model human population growth but exclude the effects of food and disease” can be useful under certain scenarios or could be a great start to a project where you slowly add complexity. A toy problem that sounds like “Find the first matching left and right pair of socks of the same color from an infinite queue of socks. You cannot sort, and your solution must run in \(O(n \log n)\) time or better” is rarely useful in real life. Unfortunately, problems like these are used in interviews, which is akin to saying “I want a good programmer that can document, deploy and test our software safely. Let’s put them under heavy time pressure to work on a specific toy problem completely untethered from real-life problems”.

Good coders are not defined by the number of lines in a program or the number of commits in a project. A good and competent programmer must have a significant impact in multiple areas of the Software Development Life Cycle, as opposed to only being a good implementer. “Unmaintainable good code”, “untestable wonderful code” and “undeployable well-written code” are all oxymorons. Mature code must satisfy the demands of every stage of the Software Development Life Cycle. This is why the solutions to a lot of toy problems cannot be considered good code.

Figure: A diagram of the Software Development Life Cycle. There is no canonical Life Cycle, as different companies have different approaches. Project management methodologies, such as 'Waterfall' or 'Agile', also affect the order of the cycle.

Working on projects that take you from one end of the Software Development Life Cycle to the other is the only way to learn about the entire cycle. Toy problems will only teach you specific and narrow paths of the cycle, just like becoming an eye artist instead of a portraitist. As an experienced and competent programmer, you must be able to plan and design a program before you begin execution, given a set of requirements or constraints. You should be able to implement it, test it and deploy it without compromising its long-term maintainability. You must foresee how decisions in one step of the cycle will affect future steps. “But,” you may ask yourself, “how can I get experience in bigger projects if no one is willing to give me a shot?” It’s very easy: the art of Computer Science has plenty of blank canvases for you.

Defining Maturity

What makes a program “mature”? How do you know when your code has moved from the Proof-of-Concept stage to the Production code stage? From a business perspective this occurs when real end-users begin to interact with the program or when the program begins interacting with other production-grade programs. This definition of production code introduces risks to our processes because the stability, maintainability and robustness of the code have neither been defined nor tested. In the same way you can “smell” when food is going bad and you instinctively know that it is not safe for consumption, you can also “smell” code and, with enough experience, know when it’s not mature. There are many attributes that we could cover, but I will only focus on a handful:

Resiliency: Mature code tends to be resilient. Common errors or expected problems, such as losing the network connection or receiving badly formatted input, should not crash the program or render it unusable.
Observability: Programs must not be built on the assumption that errors will never occur, because this is a fool’s errand. Programs will invariably encounter situations where it is not possible to safely recover, and exiting or raising an error is the best course of action. Unexpected situations will arise, and the program may explode into flames when it does not know what to do. Leaving behind a post-mortem is of incredible value to developers, as it can save them an enormous number of man-hours spent replicating the issue. Logging is a sign of mature code, because the authors have looked ahead and decided to save themselves debugging time. But beyond logging, observability is important. As programs start becoming services that interact over a network, being able to monitor these systems in near real-time becomes more crucial. Health endpoints, log shipping and log rotation may not be the sexiest computer science topics, but they are a sign of mature code.
Maintainability: Programs are rarely built to never be updated or improved. Unless you are working on very specific or niche areas of Computer Science, you will have the ability to improve upon released code. Maintainable programs tend to follow consistent patterns across their entire codebase, and a codebase that does not follow consistent coding conventions is considerably harder to maintain. If one section of the codebase expects all constants to be named in uppercase, while another section expects you to prefix constants with an underscore, you will not only waste time context switching but also increase the risk of introducing errors into the codebase. Beyond codebases, maintainability covers many auxiliary processes. If your deployment pipeline is controlled by a single bash script made up of hundreds of functions that no one has bothered to document, your maintainability will suffer because making changes to your deployment pipeline will require a lot of time. Your entire Software Development Life Cycle, from start to end, is only as strong as its weakest link.
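
To make the first two attributes a little more concrete, here is a minimal Python sketch. It is an illustration under assumptions rather than a recipe: the names parse_number and with_retries, the logger name and the default retry count are all made up for this example, and a lost network connection is assumed to surface as Python's built-in ConnectionError.

```python
import logging
import time

logger = logging.getLogger("calculator")  # hypothetical logger name


def parse_number(raw):
    """Return a float, or None when the input is badly formatted."""
    try:
        return float(raw.strip())
    except ValueError:
        # Expected problem: log it and let the caller decide what to do.
        logger.warning("Ignoring badly formatted input: %r", raw)
        return None


def with_retries(operation, attempts=3, delay_seconds=0.0):
    """Call operation(), retrying on ConnectionError up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            logger.warning("Attempt %d of %d failed", attempt, attempts)
            if attempt == attempts:
                # Not recoverable: leave a post-mortem (with traceback) and re-raise.
                logger.exception("Giving up after %d attempts", attempts)
                raise
            time.sleep(delay_seconds)
```

The warnings cover the expected problems, while logger.exception records the traceback before the error is re-raised, which is the post-mortem a developer will want when trying to replicate the issue later.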

The attributes we just covered tend to be overlooked when preparing developers for the professional world. Universities and other tertiary education institutions rarely cover these subjects in significant depth. Why? I do not know, and I have never understood the reasoning. Companies tend to value these attributes in codebases. In fact, one of the things that tends to differentiate a new developer from a more seasoned one is the ability to look ahead and add logging and robust error handling, and to make their code more maintainable in the long term.

Companies want their developers to have these skills and are willing to reject less-seasoned developers for their lack of “maturity”, yet they rarely test these skills during interviews. Their narrow focus on Toy Problems leads to developers grinding through hours of exercises that rarely resemble real-world problems. Real-life problems will benefit more from log rotation, monitoring of health endpoints and good Git commit messages than from reversing a linked list without being able to traverse it backwards.

Project-Based Learning

Beyond being able to grind through Toy Problems, we need to begin understanding the entire Software Development Life Cycle. And to have a grasp of the cycle, you must have experienced it. Instead of aiming for small scripts or toy problems, you should aim to work on more complex projects that can guide you through all steps of the cycle. As you go through each step, you should spend time understanding what the step entails and what the current best practices are. The plural in “practices” is on purpose because there is rarely a single good practice that encompasses all cases.

You can start your learning journey by building a calculator. We will incrementally add features while slowly traversing through the Software Development Life Cycle.

The calculator reads 2 numbers and an operation from a string, in a specific order. Humble beginnings. The simplicity of the features makes the planning extremely simple, making it a gentle introduction. You will do some straightforward string parsing.
The calculator must support both interactive mode and standard input. Your program now has 2 different “entry-points”. You will learn about exit codes, standard input, standard output and standard error.
The calculator must work with parentheses or brackets, and combinations of operations. This really raises the bar now. Your string parsing needs to be much more elegant, and you are now dealing with data structures other than arrays (spoiler: stacks).
The calculator must have passing unit tests. You now have a multi-step Software Development cycle. Unless you wrote the calculator with unit tests in mind from the start, you will need to reorganize some of your code to be more test friendly. Mocking may also enter your programming vocabulary, depending on how you structured your program. You will also learn about the nuances of testing using your programming language of choice for the project.
The calculator must have an x% code coverage of tests. You are now measuring the coverage of your tests, which takes you a step further into mature software development. You may or may not learn the sad lesson that coverage does not equal test quality.
The calculator should now work with decimals, up to a certain accuracy. You will now deal with floating-point numbers, decimals, rounding and accuracy. You may need to add CLI flags and learn to parse them.
The calculator should be tested automatically as part of a pipeline. This is the birth of automated pipelines and CI/CD. Calling it a single step is an oversimplification because it will involve learning best practices and picking some kind of framework or system to run these pipelines. This is the start of the Deploy step of the cycle.
The calculator should provide binaries and/or target multiple systems. This fully opens the floodgates of the Deploy stage of the cycle. The “and/or” is there because it differs per language. You cannot really get a binary if your calculator is written in Python, but you can if it is written in Go or C. Targeting multiple systems means that you need to learn to package your program differently: .exe, .deb, .rpm, etc. You will also need to learn about target systems: where settings should be stored, where the actual binaries should be stored, etc. Again, calling it a single step is an oversimplification.
The binaries or packages should be generated automatically. This would join the automated tests and give you a very robust deployment pipeline. If you connect this pipeline to your master branch you now have Continuous Deployment!
Merges into the main/master branch should have policies. You are now tackling a fully automated pipeline. Merging to main/master should trigger automatic builds and tests that block the merge if something goes wrong. Failed test? No merge. Failed build? No merge. If you want to take it even further, you can add linting.
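
To give a feel for where the first few steps can lead, here is a rough Python sketch of the calculator core. Treat it as one possible shape under assumptions, not the solution: the names tokenize, apply_top, evaluate and main are hypothetical, it uses two stacks (one for values, one for operators) to handle parentheses and precedence, it relies on the standard decimal module for decimals, and it deliberately leaves out things like negative numbers and CLI flags.

```python
import re
import sys
from decimal import Decimal

# A token is a number (optionally with a decimal part), an operator, or a parenthesis.
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*/()])")
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(expression):
    """Split the expression into tokens, raising ValueError on unknown text."""
    tokens, position = [], 0
    expression = expression.rstrip()
    while position < len(expression):
        match = TOKEN.match(expression, position)
        if match is None:
            raise ValueError(f"Cannot parse input near position {position}")
        tokens.append(match.group(1))
        position = match.end()
    return tokens


def apply_top(values, operators):
    """Pop one operator and two operands, then push the result."""
    operator = operators.pop()
    if len(values) < 2:
        raise ValueError("Malformed expression")
    right, left = values.pop(), values.pop()
    if operator == "+":
        values.append(left + right)
    elif operator == "-":
        values.append(left - right)
    elif operator == "*":
        values.append(left * right)
    else:
        values.append(left / right)


def evaluate(expression):
    """Evaluate an infix expression using a value stack and an operator stack."""
    values, operators = [], []
    for token in tokenize(expression):
        if token == "(":
            operators.append(token)
        elif token == ")":
            while operators and operators[-1] != "(":
                apply_top(values, operators)
            if not operators:
                raise ValueError("Unbalanced parentheses")
            operators.pop()  # discard the matching "("
        elif token in PRECEDENCE:
            # Apply pending operators of equal or higher precedence first.
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]):
                apply_top(values, operators)
            operators.append(token)
        else:
            values.append(Decimal(token))
    while operators:
        if operators[-1] == "(":
            raise ValueError("Unbalanced parentheses")
        apply_top(values, operators)
    if len(values) != 1:
        raise ValueError("Malformed expression")
    return values[0]


def main(lines, out=None, err=None):
    """Evaluate one expression per line and return a process exit code."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    exit_code = 0
    for line in lines:
        if not line.strip():
            continue
        try:
            print(evaluate(line), file=out)
        except (ValueError, ArithmeticError) as error:
            # Results go to standard output, problems go to standard error.
            print(f"error: {error}", file=err)
            exit_code = 1
    return exit_code
```

With this sketch, evaluate("(2 + 3) * 4") returns Decimal('20') and evaluate("0.1 + 0.2") returns Decimal('0.3'). A script could end with sys.exit(main(sys.stdin)) so that piped input and exit codes both work, and an interactive mode would be a second entry point feeding lines to the same evaluate function.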

Once you complete this project, you will have covered all six steps of the cycle. Your planning and design skills will improve as you tackle more and more projects. You will make awful mistakes that will come back to bite you in the future (when you least expect it), especially when you are starting out in your programming journey. Fail fast and fail often; this is the optimal way to learn. As you write and rewrite code you will be covering the implementation step. You will be annoyed at your own requirements, you will lower or raise expectations, you will tear your hair out over features you thought were easy and have a laugh about features you thought were hard. Testing will, hopefully, become a practice that you take with you everywhere. Implementing tests from the start, making your code test-friendly and having strict code coverage rules will increase the quality of your code. You will learn about the value of automatic deployments by messing up manual deployments. You will learn the difficulties of distributing software or binaries. And finally, having to deal with your very own spaghetti code will hopefully force you to improve your own coding standards. All of this from a simple calculator.
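
As a small illustration of what “test-friendly” can look like, here is a hedged Python sketch using the standard unittest and unittest.mock modules. The functions calculate and run_once are hypothetical names chosen for this example. The idea it tries to show is that the arithmetic lives in a pure function, while reading and writing are passed in as arguments, so a test can swap them for a mock.

```python
import unittest
from unittest.mock import Mock


def calculate(left, operator, right):
    """Pure function: no input() and no print(), so it is trivial to test."""
    if operator == "+":
        return left + right
    if operator == "-":
        return left - right
    if operator == "*":
        return left * right
    if operator == "/":
        if right == 0:
            raise ValueError("Division by zero")
        return left / right
    raise ValueError(f"Unsupported operator: {operator!r}")


def run_once(read_line, write_line):
    """Read '<number> <operator> <number>' and write the result.

    The I/O functions are injected; a real program might pass input and print.
    """
    left, operator, right = read_line().split()
    write_line(str(calculate(float(left), operator, float(right))))


class CalculatorTests(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(calculate(2, "+", 3), 5)

    def test_division_by_zero_is_reported(self):
        with self.assertRaises(ValueError):
            calculate(1, "/", 0)

    def test_unknown_operator_is_rejected(self):
        with self.assertRaises(ValueError):
            calculate(1, "%", 2)

    def test_run_once_uses_injected_io(self):
        write_line = Mock()
        run_once(read_line=lambda: "2 + 3", write_line=write_line)
        write_line.assert_called_once_with("5.0")
```

Saved in a file, these tests can be run with python -m unittest. Note that four passing tests say nothing about cases such as missing operands, which is a small taste of why coverage does not equal test quality.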

We make picking a project harder than it really is. You do not need to write a containerization engine or a programming language; you can start with simple things that slowly grow, like a calculator. Think of everyday tasks you would like to automate, or cool simulations you would like to learn about. More importantly, fail often, fail fast and don’t be scared to ditch your projects. Any coder would be lying if they told you that they completed all their projects, just like any artist would be lying if they claimed they completed all their artworks. Just like an artist, have fun sketching but make sure you become a good portraitist and not just an eye artist.