Coding Challenges

Share this post

Coding Challenge #6 - Build Your Own Sort Tool

codingchallenges.substack.com

Discover more from Coding Challenges

A weekly Coding Challenge to help software engineers level up.
Over 14,000 subscribers
Continue reading
Sign in

Coding Challenge #6 - Build Your Own Sort Tool

John Crickett
Apr 22, 2023
8
Share this post

Coding Challenge #6 - Build Your Own Sort Tool

codingchallenges.substack.com
Share

This challenge is to build your own version of the Unix command line tool sort!

It sounds like a simple utility, but if you build everything in this challenge you’ll be using five different sorting algorithms and at least three different data structures.

The Unix command line tools are a great metaphor for good software engineering and they follow the Unix Philosophies of:

  • Design for simplicity; add complexity only where you must.

  • Design programs to be connected to other programs - each tool can be easily connected to other tools to create incredibly powerful compositions.

  • Design programs to be connected to other programs - each tool can be easily connected to other tools, via files and streams, to create incredibly powerful compositions.

Following these philosophies has made the simple unix command line tools some of the most widely used software engineering tools which can be chained together to create far more complex and powerful set of tools that you’d expect.

You can read more about the Unix Philosophy on the Coding Challenges blog.

The Challenge - Building sort

The functional requirements for sort are concisely described by it’s man page - give it a go in your local terminal now:

man sort

The TL/DR version is: the sort utility sorts text files by lines.  A line is a record separated from the subsequent record by a newline.

We’re going to ignore binary files for this challenge, though I’d encourage you to consider extending your implementation to support them once you’ve finished the challenge.

Step Zero

Like all good software engineering we’re zero indexed! In this step you’re going to set your environment up ready to begin developing and testing your solution.

I’ll leave you to setup your IDE / editor of choice and programming language of choice. After that here’s what I’d like you to do to be ready to test your solution.

Download the this text from Project Gutenberg and save it as test.txt.

Then we will create a list of all the words in the file and save it as words.txt.

% tr -s '[[:punct:][:space:]]' '\\n' < test.txt |sed '/^[0-9]/d' > words.txt

Please note how powerful the unix command line tools are. If you’re not familiar with them you’re missing out on some incredibly power tools that can do a lot data munging. It’s a great topic for self study.

Step 1

In this step your goal is to implement the essence of the sort program, we want to be able to run sort and have it open a file and output the lines in the file sorted lexicographically.

Your goal is to be able to run the following command and see the same output:

% sort words.txt | uniq | head -n5
A
ACTUAL
AGREE
AGREEMENT
AND

Step 2

In this step your goal is to implement the same functionality as the above but with the unique functionality built into sort, i.e. the -u option we saw when we did man sort.

% sort -u words.txt | head -n5
A
ACTUAL
AGREE
AGREEMENT
AND

You can achieve this whilst sorting, before sorting or after sorting the list. Think about the algorithms and data structures involved and the trade-offs you’ll be making in terms of computational complexity, space complexity and extensibility of the software.

Hint: just like writing code professionally, sometimes it pays to check you have all the known requirements before you start coding.

Step 3

In this step your goal is to allow the user of your sort program to select which sort algorithm to use. Checking man sort, we see that the standard Unix version offers radix sort, merge sort, quicksort and heapsort.

Implement as many of the different sort algorithms as you like and think carefully about how they interact with the -u flag.

There are thousands of words in our test text, so you might like to implement some unit testing / using the different sorting algorithms as a way to learn or practice test-drive development using your stacks testing library.

Step 4

In this step your goal is to implement random sort. Random sort is defined by the man page of sort as:

-R, -random-sort, -sort=random
Sort by a random order.  This is a random permutation of the inputs except that the equal keys sort together.  It is implemented by hashing the input keys and sorting the hash values. The hash function is chosen randomly.  The hash function is randomized by /dev/random content.

It’s random so your results won’t match mine, but it should look something like this:

% sort --random-sort -u words.txt | head -n5
gales
represent
satisfied
Wealth
awful

Share Your Solutions!

If you think your solution is an example of the developers can learn from please share it, put it on GitHub, GitLab or elsewhere. Then let me know - ping me a message via Twitter or LinkedIn or just post about it there and tag me.

Request for Feedback

I’m writing these challenges to help you develop your skills as a software engineer based on how I’ve approached my own personal learning and development. What works for me, might not be the best way for you - so if you have suggestions for how I can make these challenges more useful to you and others, please get in touch and let me know. All feedback greatly appreciated.

You can reach me on Twitter, LinkedIn or through SubStack

Thanks and happy coding!

John

8
Share this post

Coding Challenge #6 - Build Your Own Sort Tool

codingchallenges.substack.com
Share
Previous
Next
Comments
Top
New
Community

No posts

Ready for more?

© 2023 John Crickett
Privacy ∙ Terms ∙ Collection notice
Start WritingGet the app
Substack is the home for great writing