Coding Challenge #92 - LOC Counter
This challenge is to build your own lines-of-code counting tool.
Hi, this is John with this week’s Coding Challenge.
🙏 Thank you for being one of the 85,826 software developers who have subscribed, I’m honoured to have you as a reader. 🎉
If there is a Coding Challenge you’d like to see, please let me know by replying to this email📧
Coding Challenge #92 - Line of Code Counter
This challenge is to build your own version of the tools cloc, sloc and scc. These tools count lines of code and produce statistics on the total number of lines in the source code, the lines of code, the lines of comments, the blank lines and so on.
Some also calculate the COCOMO 81 and COCOMO II estimates for the software being analysed. If you’re not familiar with it, the COCOMO model was developed by Barry W. Boehm to estimate the effort, cost and schedule for software projects. I wouldn’t rely on these numbers to plan a software project, but they’re an interesting tool to compare existing projects and get a feel for the size and scope of them.
Counting the lines of code in a software project sounds trivial and, quite honestly, seems like something you could do in a short bash command, e.g.:
% find . -name '*.go' | xargs wc -l | sort -nr
However, if you want to do it accurately and fast, you can get into some interesting computer science challenges. And when it comes to scc, I mean blazingly fast!
But Why Count Lines Of Code?
TL;DR: It’s useful as a gauge of the size and complexity of a project. If you want much more detail, Ben Boyter, the author of scc, wrote a blog post explaining why he put so much effort into building a tool to count lines of code.
📌 Systems Programming (Redis) Course Starts July 14th
Would you like to build a network server from scratch with me?
And learn about network programming, concurrency, testing, and systems software development along the way?
If so check out my course: Build A Redis Server Clone: Master Systems Programming Through Practice.
It is designed to be intense! It’s 11 hours of instructor time over two weeks, with the goal of having you build a clone of the original Redis server by the end.
If you sign up before 30th June you can get $100 off using the code: EARLYBLIJ
If You Enjoy Coding Challenges Here Are Three Ways You Can Help Support It
Refer a friend or colleague to the newsletter. 🙏
Sign up for a paid subscription - think of it as buying me a coffee ☕️, with the bonus that you also get 20% off any of my courses.
Buy one of my courses that walk you through a Coding Challenge.
The Challenge - Building A Tool To Count Lines Of Code
In this project, we’re going to build a tool to count lines of code in each file in a directory and its subdirectories.
The tool should identify blank, comment and code lines and be able to provide a value per file and a summary of the whole project with output in text or JSON.
For example, here’s the output of scc on the Go version of the Redis clone I use in my Build A Redis Server Clone: Master Systems Programming Through Practice course:
% scc
────────────────────────────────────────────────────────────────────────
Language       Files    Lines   Blanks  Comments     Code
────────────────────────────────────────────────────────────────────────
Go                11     2229      202        23     2004
License            1       21        4         0       17
Markdown           1        2        0         0        2
YAML               1       25        6         0       19
────────────────────────────────────────────────────────────────────────
Total             14     2277      212        23     2042        429
────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $57,168
Estimated Schedule Effort (organic) 4.64 months
Estimated People Required (organic) 1.10
────────────────────────────────────────────────────────────────────────
Processed 62447 bytes, 0.062 megabytes (SI)
────────────────────────────────────────────────────────────────────────
Interesting to see that it estimates the cost of building the software as 57k USD!
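The challenge asks for output in text or JSON, so here is a minimal Python sketch of a JSON summary. The field names and the summarise function are my own illustration, not scc’s actual JSON format; the numbers are the ones from the scc run above.

```python
import json


def summarise(rows):
    """Wrap per-language rows in a report dict with a computed total."""
    keys = ("files", "lines", "blanks", "comments", "code")
    total = {key: sum(row[key] for row in rows) for key in keys}
    return {"languages": rows, "total": total}


# Per-language figures copied from the scc output above.
rows = [
    {"language": "Go", "files": 11, "lines": 2229, "blanks": 202, "comments": 23, "code": 2004},
    {"language": "License", "files": 1, "lines": 21, "blanks": 4, "comments": 0, "code": 17},
    {"language": "Markdown", "files": 1, "lines": 2, "blanks": 0, "comments": 0, "code": 2},
    {"language": "YAML", "files": 1, "lines": 25, "blanks": 6, "comments": 0, "code": 19},
]

print(json.dumps(summarise(rows), indent=2))
```

Keeping the counts in plain dicts like this means the text report and the JSON report can both be produced from the same data.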
Step Zero
Like all C-based programming languages, we’re zero-indexed at Coding Challenges! In this step you’re going to set your environment up, ready to begin developing and testing your solution.
I’ll leave you to set up your IDE / editor of choice and programming language of choice. While you’re doing that, give some thought to the programming language or languages that you’d find it useful to be able to count the lines of code for.
Step 1
In this step your goal is to scan a file system from a starting directory and identify all the source files in it.
Your program should accept a single argument for the starting directory. For this step I suggest you list out all the matching files. To check your code works you could pipe the output to a file and compare to the bash command:
% find . -name '*.go' | sort
Replace go with the extension of the programming language you’re going to focus on counting.
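If it helps to see the shape of this step, here is a minimal Python sketch. The names find_source_files and main are hypothetical, and the default extension of .go is only there to match the find command above.

```python
import os


def find_source_files(root, extensions=(".go",)):
    """Return the sorted paths under root whose names end with a wanted extension."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(tuple(extensions)):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def main(argv):
    """Print every matching file under the directory given as the first argument."""
    start = argv[1] if len(argv) > 1 else "."
    for path in find_source_files(start):
        print(path)
```

Calling main(sys.argv) from a script entry point turns this into the command-line tool. Note that sort may order names differently depending on your locale, so small ordering differences from the bash output are not necessarily a bug.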
Step 2
In this step your goal is to differentiate between blank and non-blank lines in each file. You should scan each file in the directory and its subdirectories, then print either a summary (the default option) or a breakdown per file.
For example:
─────────────────────────────────────────────────────────────
File                                Lines  Blanks
─────────────────────────────────────────────────────────────
internal/parser/parser.go             463      78
~al/interpreter/interpreter.go        334      63
internal/resolver/resolver.go         248      41
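A minimal Python sketch of the blank / non-blank split might look like this. The function names are hypothetical, and I’m assuming a line that contains only whitespace counts as blank.

```python
def count_blanks(text):
    """Return (total_lines, blank_lines) for the text of one file."""
    lines = text.splitlines()
    # A line is blank if nothing is left after stripping whitespace.
    blanks = sum(1 for line in lines if not line.strip())
    return len(lines), blanks


def count_file(path):
    """Read one file and count its lines; undecodable bytes are replaced."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_blanks(handle.read())


sample = "package main\n\nfunc main() {\n}\n"
print(count_blanks(sample))  # prints (4, 1)
```

One thing to watch: splitlines counts a final line that has no trailing newline, whereas wc -l counts newline characters, so the totals can differ by one on such files.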
Step 3
In this step your goal is to split the non-blank lines into code and comments. As a first step, treat a line as a comment when its first non-whitespace character starts a comment.
If your language supports it (e.g. C, C++, Java, etc.) don’t forget support for multi-line comments.
Once you have that, check your output and produce an updated report, for example:
────────────────────────────────────────────────────────────────────────
File                                Lines  Blanks  Comments   Code
────────────────────────────────────────────────────────────────────────
internal/parser/parser.go             463      78         1    384
~al/interpreter/interpreter.go        334      63         5    266
internal/resolver/resolver.go         248      41         0    207
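Here is a minimal Python sketch of this first pass, assuming C-style comment markers by default. The function name and parameters are hypothetical. It only looks at how each line starts, and it treats blank lines inside a multi-line comment as comment lines, which is a choice you may want to make differently.

```python
def classify_lines(text, line_comment="//", block_start="/*", block_end="*/"):
    """Return counts of blank, comment and code lines.

    A line is a comment only if it starts with a comment marker or sits
    inside a multi-line comment; everything else non-blank is code.
    """
    counts = {"blank": 0, "comment": 0, "code": 0}
    in_block = False
    for raw in text.splitlines():
        line = raw.strip()
        if in_block:
            counts["comment"] += 1
            if block_end in line:
                in_block = False
        elif not line:
            counts["blank"] += 1
        elif line.startswith(line_comment):
            counts["comment"] += 1
        elif line.startswith(block_start):
            counts["comment"] += 1
            # Stay in the block unless it is closed on the same line.
            in_block = block_end not in line[len(block_start):]
        else:
            counts["code"] += 1
    return counts
```

A line such as /* note */ x = 1; is counted as a comment by this version, because only the start of the line is inspected.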
Step 4
In this step your goal is to handle a line that may contain both code and the start of a multi-line comment, for example:
i++; /*
comment
*/
I would count this as one line of code and two lines of comment. The important thing is to recognise that a multi-line comment begins on the first line, and then not count the last two lines as code.
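One way to handle this is to scan along each line rather than only looking at how it starts. The Python sketch below is a minimal illustration with C-style markers hard-coded, and count_c_style is a hypothetical name. It does not handle comment markers inside string literals, which a real tool would need to.

```python
def count_c_style(text):
    """Count blank, comment and code lines, allowing code and comment on one line."""
    counts = {"blank": 0, "comment": 0, "code": 0}
    in_block = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line and not in_block:
            counts["blank"] += 1
            continue
        has_code = False
        i = 0
        while i < len(line):
            if in_block:
                end = line.find("*/", i)
                if end == -1:
                    break                 # the comment runs past this line
                in_block = False
                i = end + 2
            elif line.startswith("/*", i):
                in_block = True
                i += 2
            elif line.startswith("//", i):
                break                     # the rest of the line is a comment
            else:
                if not line[i].isspace():
                    has_code = True
                i += 1
        # Any code on the line makes it a code line; otherwise it is comment.
        counts["code" if has_code else "comment"] += 1
    return counts


print(count_c_style("i++; /*\ncomment\n*/\n"))
# prints {'blank': 0, 'comment': 2, 'code': 1}
```

The in_block flag is carried from one line to the next, which is what stops the last two lines of the example being counted as code.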
Step 5
In this step your goal is to handle plain text files and Markdown files. Plain text should be easy: just count blank and non-blank lines. For Markdown you should identify multi-line code sections and count the lines of code in them.
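For the Markdown part, here is a minimal Python sketch that tracks fenced code blocks. The function name is hypothetical, and I’m making two simplifying assumptions: the fence lines themselves are counted as text, and a closing fence is any line starting with the same three characters as the opening one. Indented code blocks are ignored.

```python
def count_markdown(text):
    """Count blank, text and code lines, where code means inside a fenced block."""
    counts = {"blank": 0, "text": 0, "code": 0}
    fence = None  # the marker that opened the current code block, if any
    for raw in text.splitlines():
        line = raw.strip()
        if fence is None and line.startswith(("```", "~~~")):
            fence = line[:3]          # entering a code block
            counts["text"] += 1
        elif fence is not None and line.startswith(fence):
            fence = None              # leaving the code block
            counts["text"] += 1
        elif not line:
            counts["blank"] += 1
        elif fence is not None:
            counts["code"] += 1
        else:
            counts["text"] += 1
    return counts
```

Plain text files fall out of the same function: with no fences present, every non-blank line is counted as text.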
Step 6
In this step your goal is to handle one or more additional programming languages. Think carefully about how you do this. If you approach it the right way, it’s possible to make adding a new programming language a simple matter of adding some config to a languages file. Hint: a table-driven state machine!
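To make the hint concrete, here is a minimal Python sketch, not a definitive design. The LANGUAGES table, its field names and the functions are all hypothetical. Each language entry is expanded into a small transition table of the form {state: [(token, next_state), ...]}, and one generic scanner walks that table. A fuller version would add states for string literals and nested comments.

```python
import os

# Adding a language means adding one entry here.
LANGUAGES = {
    "Go": {"extensions": [".go"], "line": ["//"], "block": [("/*", "*/")]},
    "Python": {"extensions": [".py"], "line": ["#"], "block": []},
    "HTML": {"extensions": [".html", ".htm"], "line": [], "block": [("<!--", "-->")]},
}


def language_for(path):
    """Return the language name for a path, or None if the extension is unknown."""
    ext = os.path.splitext(path)[1].lower()
    for name, lang in LANGUAGES.items():
        if ext in lang["extensions"]:
            return name
    return None


def build_table(lang):
    """Expand a language entry into {state: [(token, next_state), ...]}."""
    table = {"code": [], "line_comment": []}
    for token in lang["line"]:
        table["code"].append((token, "line_comment"))
    for n, (start, end) in enumerate(lang["block"]):
        state = f"block{n}"
        table["code"].append((start, state))
        table[state] = [(end, "code")]
    return table


def count_lines(text, lang):
    """Count blank, comment and code lines using the language's transition table."""
    table = build_table(lang)
    counts = {"blank": 0, "comment": 0, "code": 0}
    state = "code"
    for raw in text.splitlines():
        line = raw.strip()
        if not line and state == "code":
            counts["blank"] += 1
            continue
        saw_code = False
        i = 0
        while i < len(line):
            for token, next_state in table[state]:
                if line.startswith(token, i):
                    state = next_state
                    i += len(token)
                    break
            else:
                # No transition fired: the character belongs to the current state.
                if state == "code" and not line[i].isspace():
                    saw_code = True
                i += 1
        counts["code" if saw_code else "comment"] += 1
        if state == "line_comment":
            state = "code"  # a line comment ends with the line
    return counts
```

For example, count_lines(source, LANGUAGES[language_for("main.go")]) counts a Go file, and supporting another language is only a matter of adding its extensions and comment markers to the table.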
Going Further
To take this further, configure support for multiple languages and make your project available for download. Be sure to provide instructions for users to add support for a programming language and open a pull request on your repo.
Beyond that, add support for calculating COCOMO for the code.
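As a starting point, here is a minimal Python sketch of the basic COCOMO 81 calculation for an “organic” project, using Boehm’s published organic-mode coefficients. The average_wage and overhead values are assumptions on my part (I believe they match scc’s documented defaults, but check before relying on them), and basic_cocomo is a hypothetical name.

```python
def basic_cocomo(code_lines, average_wage=56286, overhead=2.4):
    """Basic COCOMO 81 estimate for an 'organic' project.

    average_wage (per year) and overhead are assumed inputs, not part of
    the COCOMO model itself.
    """
    a, b, c, d = 2.4, 1.05, 2.5, 0.38    # organic-mode coefficients
    kloc = code_lines / 1000
    effort = a * kloc ** b               # person-months
    schedule = c * effort ** d           # calendar months
    people = effort / schedule           # average team size
    cost = effort * (average_wage / 12) * overhead
    return {"effort": effort, "schedule": schedule, "people": people, "cost": cost}


estimate = basic_cocomo(2042)
print(f"Cost ${estimate['cost']:,.0f}, schedule {estimate['schedule']:.2f} months, "
      f"people {estimate['people']:.2f}")
```

With the 2,042 code lines from the scc run shown earlier, this gives a schedule of about 4.64 months and about 1.10 people, in line with that output, and a cost estimate close to its $57,168.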
Two Other Ways I Can Help You:
I write another newsletter Developing Skills that helps you level up the other skills you need to be a great software developer.
I have a YouTube channel sharing advice on software engineering.
Share Your Solutions!
If you think your solution is an example other developers can learn from, please share it: put it on GitHub, GitLab or elsewhere. Then let me know via Bluesky or LinkedIn, or just post about it there and tag me. Alternatively, please add a link to it in the Coding Challenges Shared Solutions GitHub repo.
Request for Feedback
I’m writing these challenges to help you develop your skills as a software engineer based on how I’ve approached my own personal learning and development. What works for me, might not be the best way for you - so if you have suggestions for how I can make these challenges more useful to you and others, please get in touch and let me know. All feedback greatly appreciated.
You can reach me on Bluesky, LinkedIn or through Substack.
Thanks and happy coding!
John