Hi this is John with this week’s Coding Challenge.
🙏 Thank you for being one of the 66,545 software developers who have subscribed. I’m honoured to have you as a reader. 🎉
If there is a Coding Challenge you’d like to see, please let me know by replying to this email📧
Welcome To Coding Challenges - From The Challenges!
In this Coding Challenges newsletter I’m sharing some of the common mistakes I see software engineers make when tackling the Coding Challenges.
I’m sharing the mistakes people make and some thoughts on how you can avoid making the same mistakes when taking on the coding challenges or when writing software professionally.
Aside: Performance Quiz
If we re-wrote Docker in Python, what impact would it have on the performance of the containers you run?
If you want to understand Docker and figure out the answer, check out the build your own Docker challenge. By building your own Docker, you’ll gain a deeper understanding of Docker and become a better software engineer.
If you’d prefer a version of build your own Docker that has automated tests, then check out codecrafters.io build your own Docker. You can get 40% off as a Coding Challenges reader plus it helps support Coding Challenges.
If you’d rather I added automated tests, hit reply to this email or comment on Substack and let me know!
Recapping The Build Your Own Sort Coding Challenge
In the build your own sort coding challenge the goal was to write your own version of the Unix command line tool sort!
It sounds like a simple utility, but if you build everything in this challenge you’ll be using five different sorting algorithms and at least three different data structures.
If you fancy giving it a go, check out the build your own sort challenge now and come back to this when you’re done.
Five Common Mistakes Software Engineers Make Solving The Sort Coding Challenge
I’ve pulled together this list of common mistakes from the hundreds of submissions I’ve been sent privately and the many shared in the Coding Challenges Shared Solutions GitHub Repo.
Mistake 1 - Binary / Artefacts In The Repo
This is a recurring issue across many different submissions to many different coding challenges!
Source code repos are for source code. Binaries do not belong in them. It should be possible to re-create the binary from the source code so there is no need for the binary. Equally it’s not obvious from a binary if the current binary is built from the current version of the code or an earlier one. Or even if the binary has been compromised.
Mistake 2 - Pivot Selection For Quicksort
Don’t just pick the first element of the partition to be the pivot. This leads to worst-case behaviour on already sorted (or nearly sorted) input, which is common. Instead, pick a random index or read up on the research into alternative pivots.
When it comes to quicksort, you should also consider large inputs. Quicksort is recursive, so on a large input with poor pivots the recursion can get deep enough to overflow the stack.
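Here is a minimal sketch of both ideas in Python: a random pivot, plus recursing only into the smaller partition so the stack depth stays around log2(n) even on big inputs. The function names and the choice of the Lomuto partition scheme are mine for illustration, not taken from any submission.

```python
import random


def partition(items, lo, hi):
    """Partition items[lo..hi] around a randomly chosen pivot (Lomuto scheme)."""
    # Pick a random pivot and park it at the end of the range.
    pivot_index = random.randint(lo, hi)
    items[pivot_index], items[hi] = items[hi], items[pivot_index]
    pivot = items[hi]

    store = lo
    for i in range(lo, hi):
        if items[i] < pivot:
            items[i], items[store] = items[store], items[i]
            store += 1

    # Move the pivot into its final position and report where it landed.
    items[store], items[hi] = items[hi], items[store]
    return store


def quicksort(items, lo=0, hi=None):
    """Sort items in place between indexes lo and hi (inclusive)."""
    if hi is None:
        hi = len(items) - 1
    while lo < hi:
        p = partition(items, lo, hi)
        # Recurse into the smaller side and loop on the larger one,
        # which bounds the recursion depth at roughly log2(n).
        if p - lo < hi - p:
            quicksort(items, lo, p - 1)
            lo = p + 1
        else:
            quicksort(items, p + 1, hi)
            hi = p - 1
```

One caveat with this sketch: input with a lot of duplicate keys still degrades towards quadratic time, which is where a three-way partition is worth reading up on.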
Mistake 3 - Reading The Whole File Into Memory And Sorting That
A common mistake I see people making for all of the Unix command line tools is to load the whole file into memory.
The problem with this approach is that it doesn’t scale. If someone tries to use the program on a large enough file, the program crashes having run out of memory.
If you’re writing code to handle files, do remember to check you can handle large files. Sure it’s not easy to test this - you might not have the disk space for a 100GB test file for example, but there are ways around this. For example, to test sort with a large file we can leverage the power of the Unix command line tools to generate one on the fly without actually taking up any disk space.
Here’s how:
seq 1 300000 | xargs -Inone cat test.txt | sort
To address this mistake there are two steps:
Plan for it and create tests - if your software can read any file, consider what that means, including arbitrary size.
Unless you really need to read the whole file, always process files incrementally. For text that might be line by line; for record-based files it might be record by record. For speed it might be page by page (which refers to the memory page).
As I understand it, the original version of sort handles large files by splitting any input too big to fit in memory into several smaller files, each sized to some portion of the total available RAM. It then sorts the smaller files and combines them.
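To make the split, sort and combine idea concrete, here is a rough sketch in Python. It is not how the real sort is implemented. The function name external_sort and the chunk_size parameter are made up for this example, and it sizes chunks by line count rather than by available RAM to keep things short.

```python
import heapq
import tempfile
from itertools import islice


def external_sort(lines, chunk_size=100_000):
    """Yield the given lines in sorted order, holding at most
    chunk_size lines in memory at any one time."""
    chunk_files = []
    source = iter(lines)
    try:
        while True:
            # Read the next chunk that fits in memory.
            chunk = list(islice(source, chunk_size))
            if not chunk:
                break
            # Make sure every line ends with a newline so lines
            # stay separate once written to the chunk file.
            chunk = [l if l.endswith("\n") else l + "\n" for l in chunk]
            chunk.sort()

            # Write the sorted chunk to its own temporary file.
            f = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
            f.writelines(chunk)
            f.seek(0)
            chunk_files.append(f)

        # Lazily merge the sorted chunk files, one line at a time.
        yield from heapq.merge(*chunk_files)
    finally:
        for f in chunk_files:
            f.close()
```

You could wire this up with something like sys.stdout.writelines(external_sort(sys.stdin)). Two things to bear in mind: it compares strings by code point rather than by locale, and it keeps every chunk file open during the merge, so a truly huge input would need the merge doing in several passes.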
Mistake 4 - Not Reading/Understanding The Requirements
Part of the benefit of building real-world software is having a clear, unambiguous set of requirements. You make your solution have the same interface and produce the same output as the real thing.
When you ignore the specification, whether it’s written in the coding challenge or implicit in the thing you’re cloning, you build software that doesn’t work to specification.
Mistake 5 - Not Testing Edge Cases
It’s easy as software engineers to take a requirement, code up the solution and check it works.
But that’s not enough. We should also think about the edge cases and error conditions, write code to detect and handle them, and test that handling.
It’s great to see tests in the submitted solutions, but it would be even better to see tests that check edge conditions. For sorting, at the very least consider:
An empty file
A very big file
A sorted file
An unsorted file
Replace file with dataset if you’re testing the sort function rather than the whole program.
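As a sketch of what those tests might look like at the function level, here is a small unittest suite in Python. The sort_lines function is a hypothetical stand-in for whatever you’ve written, so swap in your own implementation.

```python
import random
import unittest


def sort_lines(lines):
    """Hypothetical stand-in for the sort function under test."""
    return sorted(lines)


class SortEdgeCases(unittest.TestCase):
    def test_empty_dataset(self):
        self.assertEqual(sort_lines([]), [])

    def test_already_sorted_dataset(self):
        self.assertEqual(sort_lines(["a", "b", "c"]), ["a", "b", "c"])

    def test_unsorted_dataset(self):
        self.assertEqual(sort_lines(["c", "a", "b"]), ["a", "b", "c"])

    def test_very_big_dataset(self):
        # Generate the data on the fly with a fixed seed so the
        # test is repeatable and needs no fixture file.
        rng = random.Random(42)
        data = [f"{rng.randrange(10**9):09d}" for _ in range(200_000)]
        result = sort_lines(data)
        self.assertEqual(len(result), len(data))
        self.assertTrue(all(a <= b for a, b in zip(result, result[1:])))
```

Save it as a test module and run it with python -m unittest. How big the "very big" case needs to be depends on your implementation, so treat 200,000 as a placeholder.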
Request for Feedback
I’m writing these coding challenges and this new From The Challenges series to help you develop your skills as a software engineer based on how I’ve approached my own personal learning and development.
What works for me, might not be the best way for you - so if you have suggestions for how I can make these challenges more useful to you and others, please get in touch and let me know. All feedback greatly appreciated.
You can reach me on Twitter, LinkedIn or through Substack.
Thanks and happy coding!
John
P.S. If You Enjoy Coding Challenges Here Are Four Ways You Can Help Support It
Refer a friend or colleague to the newsletter. 🙏
Sign up for a paid subscription - think of it as buying me a coffee ☕️ twice a month, with the bonus that you also get 20% off any of my courses.
Buy one of my courses that walk you through a Coding Challenge.
I run a YouTube channel sharing advice on software engineering.