The First Step of Refactoring a Rails Application
Refactoring is an important step in software development. It helps to make sure that existing code is kept updated and clean, even after a herd of people trample over it.
Recently I've been asked a lot as to which tools I use to refactor Rails apps. I don't use any one tool to refactor, but instead have a toolbox of several tools. Having several tools at your disposal gives you more flexibility to find what code needs to be refactored.
Intuition
The first and primary tool I use is intuition. If you know your code really well, you will already know which areas cause problems and need to be refactored.
If you are new to a code base though, you won't have this intuition and need to build it up before you can use it. Luckily there are several tools to find problem code that is just waiting for you to refactor them.
wc - the Unix word counter
The first tool is a simple Unix tool called wc
. wc
is the word count command and it's installed on almost all Unixes by default. It does more than count words though, it also counts the number of lines in a file. In my experience, classes and files with the most lines tend to need the most refactoring. If you think about it, more lines means more code which means a greater likelihood of bad code.
Using wc -l
you can easily get line counts for all of the major areas of your Rails project. I typically just look in the controllers and models since that's where most development takes place but you might also want to check your lib and vendor directories. What you are looking for are large and complex classes.
If you have a lot of classes, like Redmine does, it can be useful to pipe the output to sort -n
which sorts it numerically.
$ wc -l app/controllers/*.rb | sort -n 25 app/controllers/auto_completes_controller.rb 25 app/controllers/ldap_auth_sources_controller.rb 26 app/controllers/project_enumerations_controller.rb ... snip ... 248 app/controllers/wiki_controller.rb 271 app/controllers/projects_controller.rb 272 app/controllers/account_controller.rb 322 app/controllers/repositories_controller.rb 324 app/controllers/timelog_controller.rb 326 app/controllers/issues_controller.rb 412 app/controllers/application_controller.rb 5381 total
$ wc -l app/models/*.rb | sort -n 22 app/models/document_observer.rb 22 app/models/group_custom_field.rb 22 app/models/issue_observer.rb ... snip ... 200 app/models/changeset.rb 202 app/models/wiki_page.rb 215 app/models/repository.rb 234 app/models/version.rb 336 app/models/mail_handler.rb 416 app/models/user.rb 442 app/models/mailer.rb 643 app/models/query.rb 774 app/models/project.rb 863 app/models/issue.rb 7412 total
Reviewing the results from wc
You want to look for a few things with raw line counts:
- is there a class with a disproportionate number of lines compared to the others?
- are there more lines in the controllers than the models?
In general, a well factored Rails application will have a majority of its code in the models and most classes will be the same size. There might be a few major classes that are larger than the rest, which are the primary domain objects. Using Redmine as an example, you would expect any Project, Issue, and User classes to be large since it's a project management and bug tracking application. Looking at the line counts from wc
, the Timelog controller and Query model have a considerable number of lines even though they aren't the primary domain objects. In fact, because of my intuition from working with Redmine for years, I know both of these classes need some major refactoring.
Reading the code to find refactorings
After you see which classes are the largest, you will want to read through the class itself. Look for how "thick" the class is and see what you can learn about it's structure.
- Are there a few large method that contain most of the code for the class? Might be able to use extract method and extract class to shrink their size.
- Are there a lot of short methods? Maybe the code was over-factored or the class is trying to do too much and needs to be split.
- Are there a lot of comments? Since
wc
counts comments as lines too,wc
might report that a class is large but there might not be that much actual code there.
After reading through some classes you will start to get a feel for what areas to refactor and if there are any common problems. You should also be able to start working on a few refactorings as you go.
Next time I'll talk about using some tools that understand Ruby and what makes Ruby code complex.
This is a guest post by Eric Davis. Eric is a member of the Redmine team and has written an ebook Refactoring Redmine to show Rails developers what refactoring real Rails code looks like. He writes about the different refactorings done to Redmine every day at http://theadmin.org.
October 8th, 2010 at 1:25 am
If you have any refactoring comments or suggestions, I'd like to try to answer them in some future posts. Just submit them here for me.
Thanks.
October 8th, 2010 at 5:35 pm
Some typos:
"a greater likely hood" -> "a greater likelihood"
"a majority of it's code" -> "a majority of its code"
Feel free to delete this comment after correcting them. :)
October 8th, 2010 at 5:43 pm
"Since wc counts comments as lines too, wc might report that a class is large but there might not be that much actual code there."
You could grep out any comment-only lines (i.e. those starting with "#") by piping it through this:
grep -ve"^#"
(-v means "exclude any matching lines"; -e means "lines matching the following regular expression"; the regexp "^#" means a line starting with "#")
So, to get a sorted line-count (excluding lines that are entirely comments) for all controllers, you could do:
wc -l app/controllers/*.rb | sort -n | grep -ve"^#"
October 8th, 2010 at 5:57 pm
good start, waiting for next post, with much more details )
October 8th, 2010 at 8:28 pm
Joe Grossberg,
Thanks for that suggestion. Adding grep on the end of that command only matches the output of wc and not the file contents itself.
A more advanced grep command is needed to remove indented comments or blank lines. I can't figure out a good way to get total line counts from it though (without piping each file to wc).
# Start of line + any number of white space + #.
grep -ve "^[[:space:]]*#" path/app/controllers/application_controller.rb | grep -v '^$'
This is why I try to keep it simple and just use wc. There are better tools to do a more in depth search.
October 12th, 2010 at 11:59 am
Isn't all tis grep regexp you guys are trying to tweak already covered by "rake stats"?
Not only that, but rake stats already provides you the methods per class and the LOC per method ratios.
Just a thought.
October 12th, 2010 at 3:53 pm
Luis Lavena:
rake stats only gives the summary for the entire application, not per class. We want something like: "ControllerA has 100 lines, ControllerB has 10 lines". rake stats is great to use when you are trying to find which layer most of the code is at (e.g. Controller, Model, or Libraries).
October 13th, 2010 at 2:44 am
Eric Davis:
Good point, never payed too much attention to that.
Then a pure-ruby extension to that instead of fiddling with grep regexp will be a better candidate I believe (but breakdown the stats, not totalizing them)
October 13th, 2010 at 5:45 pm
Luis Lavena:
Exactly. That's why I recommend using a basic wc command to get a quick feel for the application and then digging into it by either reading the code or using more ruby specific tools. I'm working on a follow up post on using flog to get better stats, since it care parse and understand ruby code.