Thursday, September 27, 2007

Word, Character, Line Counter Ruby Script

I've been on a few projects lately that involve loading data files that I get from other teams. When I get a file, it's nice to know how big it is so I have an idea of how long the load will take. Opening the file in a text editor and jumping to the end to read the line number can be a pain, especially for big files. Also, I'm on a Windows desktop, so tools like "wc" aren't built in, and launching Cygwin and getting to the right directory can be a pain in itself (cd /cygdrive/c/Documents\ and\ Settings/ran488/Desktop), so I whipped up this quick Ruby script to give me all the file info I want.

Here's the Ruby script (fileinfo.rb) to get the count of lines, characters, and words in a text file. It also reports the shortest and longest line lengths. The source can be downloaded at http://www.frontiernet.net/~nicholson150/fileinfo.rb.


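#!/usr/bin/env ruby
# fileinfo.rb: report line, word, and character counts for a text file,
# plus the shortest and longest line lengths.

if __FILE__ == $0
  if ARGV.length < 1
    puts "Usage: #{$0} filename"
    exit 1
  end

  words = 0
  chars = 0
  # Start both at nil rather than 0: with minline = 0, no line can ever be
  # shorter, so the shortest line would always be reported as 0.
  minline = nil
  maxline = nil

  filename = ARGV.first
  # File.readlines closes the file when it is done reading.
  results = File.readlines(filename)

  puts "#{filename} has..."
  puts " -> #{results.size} lines."

  results.each do |line|
    chars += line.length # includes the line terminator
    words += line.split.length

    # Check min and max independently so the first line counts toward both.
    maxline = line.length if maxline.nil? || line.length > maxline
    minline = line.length if minline.nil? || line.length < minline
  end

  puts " -> #{words} words."
  puts " -> #{chars} characters."
  # An empty file has no lines, so fall back to 0 for both lengths.
  puts " -> #{minline || 0} character shortest line length."
  puts " -> #{maxline || 0} characters longest line length."
end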


Here's an example of the output:

C:\cvs\MqThrottler>fileinfo.rb PA_CSS_Test_Load_0911.txt
PA_CSS_Test_Load_0911.txt has...
 -> 10008 lines.
 -> 58021 words.
 -> 4356721 characters.
 -> 0 character shortest line length.
 -> 489 characters longest line length.
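
For really big files, the same numbers can be gathered without holding every line in memory. Here's a minimal sketch of that idea. It is not part of the downloadable fileinfo.rb: the file_stats method name and the hash keys are made up for illustration, and it assumes Ruby 1.9 or later for the hash syntax. It relies on File.foreach, which reads one line at a time.

```ruby
# Hypothetical helper (not in the original script): stream a file and
# collect the same statistics without loading every line into memory.
def file_stats(filename)
  stats = { lines: 0, words: 0, chars: 0, minline: nil, maxline: nil }

  # File.foreach yields one line at a time and closes the file afterwards.
  File.foreach(filename) do |line|
    len = line.length # includes the line terminator, as in fileinfo.rb
    stats[:lines] += 1
    stats[:words] += line.split.length
    stats[:chars] += len
    stats[:minline] = len if stats[:minline].nil? || len < stats[:minline]
    stats[:maxline] = len if stats[:maxline].nil? || len > stats[:maxline]
  end

  stats
end

# When run directly with a filename, print each statistic on its own line.
if __FILE__ == $0 && ARGV.first
  file_stats(ARGV.first).each { |name, value| puts " -> #{name}: #{value}" }
end
```

Returning a hash instead of printing right away keeps the counting separate from the output, so the same method could feed a report or a quick sanity check before a data load. For an empty file, minline and maxline stay nil.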
