I want to know how big the files in my repository are, in terms of lines of code, to get a sense of the 'health' of the repository.
To answer this, I would like to see a distribution (visualised or not) of the number of files per line-count range (the range width can be 1):
#lines of code #files
1-10 1
11-20 23
etc...
(A histogram of this would be nice)
Is there a quick way to get this, for example with cloc or any other command-line tool?
So the goal was to get a histogram of the sizes (in lines of code) for all the files in a directory. Since our project is a React Native project, we are concerned with .ts and .tsx files. All the test files (also .ts and .tsx files) can be skipped.
Also, show the 5 largest files, so we know where our attention is needed.
What we basically did was traverse the directory recursively and, for every file we're interested in, 1) calculate its size (in lines of code), 2) work out which 'bin'/'bar' the file belongs to and 3) add it to that bin. Meanwhile, we keep track of all the sizes so that we can display the 5 largest files.
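To make steps 2 and 3 concrete, here is a minimal sketch of just the binning and the "5 largest" bookkeeping. It assumes the line counts have already been collected into a dict; the names `bin_label` and `summarize`, the default bin width of 10, the separate "0" bin for empty files and the sample file names are all my own illustration, not part of any tool.

```python
from collections import Counter
import heapq


def bin_label(line_count, bin_width=10):
    """Return the label of the bin that a file with `line_count` lines falls into."""
    if line_count == 0:
        return "0"  # assumption: empty files get their own bin
    index = (line_count - 1) // bin_width  # 1-10 -> 0, 11-20 -> 1, ...
    low = index * bin_width + 1
    high = (index + 1) * bin_width
    return f"{low}-{high}"


def summarize(sizes, bin_width=10, top_n=5):
    """Take a mapping of path -> line count; return (histogram, largest files)."""
    histogram = Counter(bin_label(n, bin_width) for n in sizes.values())
    largest = heapq.nlargest(top_n, sizes.items(), key=lambda item: item[1])
    return histogram, largest


if __name__ == "__main__":
    # Hypothetical sample data standing in for the real directory walk.
    sample = {"App.tsx": 7, "api.ts": 15, "store.ts": 20, "Home.tsx": 134}
    histogram, largest = summarize(sample)
    for label, count in histogram.items():
        print(f"{label:>10} {'#' * count} ({count})")
    for path, lines in largest:
        print(f"{lines:>6} {path}")
```

Subtracting 1 before the integer division is what makes a 10-line file land in 1-10 rather than 11-20. With `bin_width=1` every distinct line count gets its own bar.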
The following Python script worked perfectly for my use case: