Never mind. I was too eager to find problems to fix. Since I wrote the text below, that IP did a download-if-needed of a packaged app (a test?). That has a very distinctive pattern in the HTTP logs that I don’t yet capture in my automated log analysis. How to go from zero to hero overnight.
You are downloading shoes.exe with your Ruby script when nothing has changed, and you’re banging on the pkg.rb script many times per day for no purpose. That’s annoying. I may block you soon if you don’t respect the robots.txt. You’re not the worst I’ve seen, just the one I found first. Respect the robots.txt. OK? Perhaps you could download it before crawling.
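For anyone writing a Ruby crawler, here is a minimal sketch of what "download it before crawling" could look like. It is deliberately simplified: it only reads Disallow lines under `User-agent: *` and ignores Allow rules, wildcards and agent-specific groups. The sample robots.txt content and the paths are made up for illustration and are not this site's actual rules.

```ruby
# Collect Disallow prefixes from the "User-agent: *" group only.
# Simplified on purpose: no Allow rules, no wildcards, no per-agent groups.
def disallowed_prefixes(robots_txt)
  prefixes = []
  in_star_group = false
  robots_txt.each_line do |line|
    line = line.sub(/#.*/, '').strip          # drop comments and whitespace
    next if line.empty?
    field, value = line.split(':', 2).map(&:strip)
    value ||= ''                              # line had no colon at all
    case field.downcase
    when 'user-agent'
      in_star_group = (value == '*')
    when 'disallow'
      prefixes << value if in_star_group && !value.empty?
    end
  end
  prefixes
end

# True when no Disallow prefix matches the start of the path.
def allowed?(robots_txt, path)
  disallowed_prefixes(robots_txt).none? { |prefix| path.start_with?(prefix) }
end

# Hypothetical robots.txt content, not taken from any real site.
SAMPLE_ROBOTS = <<~ROBOTS
  # sample rules
  User-agent: *
  Disallow: /private/
ROBOTS

puts allowed?(SAMPLE_ROBOTS, '/public/page.html')
puts allowed?(SAMPLE_ROBOTS, '/private/secret.html')
```

In a real script the text would come from the site itself, for example with `Net::HTTP.get(URI(...))` pointed at the site's /robots.txt, fetched once before any other request.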
For everyone else tuning in: I now have the ability to process Apache combined logs and identify “problems” (my definition of problem), and 22.214.171.124 is far from the worst. It got a free pass because it’s Ruby.
There are many fish to fry.
I’m probably not the right person to talk about this. I hate SQL for many reasons. I hate it even more when I have to do things like this:
sql = "insert into users (ip, cat, reqdate, path)\
Escape anything that might be numeric unless it’s a datetime. Ugly. Would ActiveRecord hide that, if only I would learn to accept its beauty? I’m not a fan of that ‘do it this way’ baggage either.
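One way around the hand-escaping, short of ActiveRecord, is to let the database driver do the quoting through `?` placeholders. The sketch below only builds the SQL string and the list of values. The column names come from the snippet above; the helper name `build_insert`, the sample values and the database file name are made up for illustration, and the commented-out call assumes the sqlite3 gem.

```ruby
# Build an INSERT that uses "?" placeholders instead of hand-escaped values.
# Returns the SQL text plus the values in matching column order.
# Table and column names are interpolated, so they must come from your own
# code, never from the log data.
def build_insert(table, row)
  columns = row.keys
  marks = (['?'] * columns.size).join(', ')
  sql = "insert into #{table} (#{columns.join(', ')}) values (#{marks})"
  [sql, row.values]
end

# Hypothetical values, for illustration only.
row = {
  ip: '192.0.2.10',
  cat: 'friend',
  reqdate: '2015-06-01 12:00:00',
  path: '/public/select/pkg.rb'
}
sql, binds = build_insert('users', row)
puts sql

# With the sqlite3 gem (an assumption here), the pair would be used like:
#   db = SQLite3::Database.new('logs.db')   # hypothetical file name
#   db.execute(sql, binds)
```

The values never get pasted into the SQL text, so whether something is numeric or a datetime stops mattering for quoting.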
Griping and future SQL syntax errors aside, soon I’ll be able to analyze the log files in a way that MIGHT be useful. Then I can get back to working on Shoes and the stuff you might care about.
For example, can this be integrated with the new Shoes console window? But first, I have to write some SQL in Ruby.
One reason I wanted to analyze the logs was to see if Shoes is being downloaded by real people and how many downloads are from packaging. I did discover an easy way to do some of that. Below are the partial URLs and the number of times each was requested by Shoes for the week of data.
/public/select/pkg.rb cnt 120
/public/shoes/shoes-3.2.24-gtk2-x86_64.install cnt 7
/public/shoes/shoes-3.2.24-osx-10.9.tgz cnt 4
/public/shoes/shoes-3.2.24-gtk2-32.exe cnt 16
/public/shoes/shoes-3.2.24-gtk2-i686.install cnt 2
/public/images/dino.jpg cnt 1
/ cnt 4
Unique IP's 42 Total: 154
There are interesting things. 42 people attempted to do something with the packaging. There were 29 real (to me) packaging attempts in that week and 120 presses on the “Select Architecture” button, so 24% (29/120) decided to package. Because of the way Shoes caches downloads, when 3.2.24 was released that week, those 29 had to grab the new version for their cache. Once the new version is loaded in the cache, further packaging attempts only show up in the 120 number.
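For anyone who wants to check my arithmetic, here is a small Ruby sketch that recomputes the totals from the counts listed above. Treating everything under /public/shoes/ as a packaging download is my reading of the numbers, not something the logs label for me.

```ruby
# Counts copied from the weekly listing above.
counts = {
  '/public/select/pkg.rb' => 120,
  '/public/shoes/shoes-3.2.24-gtk2-x86_64.install' => 7,
  '/public/shoes/shoes-3.2.24-osx-10.9.tgz' => 4,
  '/public/shoes/shoes-3.2.24-gtk2-32.exe' => 16,
  '/public/shoes/shoes-3.2.24-gtk2-i686.install' => 2,
  '/public/images/dino.jpg' => 1,
  '/' => 4
}

total = counts.values.sum
selects = counts['/public/select/pkg.rb']
# Assumption: every request under /public/shoes/ is a packaging download.
packaged = counts.select { |path, _| path.start_with?('/public/shoes/') }.values.sum
percent = (100.0 * packaged / selects).round

puts "Total requests: #{total}"
puts "Packaging downloads: #{packaged}"
puts "Packaged after selecting: #{percent}%"
```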
I’m pleased by these numbers. They are higher than I thought. Shoes is being used! I have no way of knowing how many people actually run Shoes. I might be able to figure out how many new shoes were downloaded by real people that week although that’s a lot harder to figure out and has plenty of uncertainty.
P.S. I have no interest or desire to track the IP# back to the origin network or country or to try to identify individual users. That said, there are people or bots that are trying to break in. Those I do hunt down and block.
Almost not related to Shoes: I wanted to discover how many of the hundreds of daily downloads are what I would consider legitimate: real people downloading Shoes or real people using the Shoes packaging. How many are bots, idiot leechers or just evil folks, and what are their IP#s? Some of them need to go into .htaccess as deny entries.
So I created some Ruby scripts to process a collection of Apache (combined format) log files and stuff them into a local SQLite3 db. I’m not a fan of SQL, but it’s the right tool for this. After many syntax errors I’ve got 7 days of shoes.mvmanila.com log entries (3475), which is enough to pick out the evil-doers, the idiots, the clueless and the friends.
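For the curious, parsing one line of an Apache combined log in Ruby can be sketched roughly like this. This is not my actual script, just a minimal illustration. The regular expression assumes the standard combined layout and does not handle escaped quotes inside the referer or user agent, and the sample line is made up.

```ruby
require 'time'

# host ident user [time] "request" status bytes "referer" "user agent"
COMBINED_RE = /\A(?<ip>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<request>[^"]*)" (?<status>\d{3}) (?<bytes>\S+) "(?<referer>[^"]*)" "(?<agent>[^"]*)"\s*\z/

# Returns a hash for a well-formed line, nil for anything else.
# A line with a malformed timestamp would raise ArgumentError here.
def parse_combined(line)
  m = COMBINED_RE.match(line)
  return nil unless m
  verb, path, _protocol = m[:request].split(' ', 3)
  {
    ip: m[:ip],
    time: Time.strptime(m[:time], '%d/%b/%Y:%H:%M:%S %z'),
    verb: verb,
    path: path,
    status: m[:status].to_i,
    agent: m[:agent]
  }
end

# Hypothetical log line, for illustration only.
sample = '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] ' \
         '"GET /public/select/pkg.rb HTTP/1.1" 200 2326 "-" "Ruby"'
entry = parse_combined(sample)
puts entry[:ip]
puts entry[:path]
puts entry[:agent]
```

Each hash can then be turned into one database row; lines that return nil are worth looking at by hand, since broken requests are often the interesting ones.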
Did I mention I don’t like SQL? I barely know what I’m doing. If someone wants to help me analyze these logs, contact me at firstname.lastname@example.org and I’ll share the Ruby scripts and database. Because some of us just like to know things. I share your pain.
sqlite> select * from logentry where browser='Ruby'; shows me the Shoes (or other Ruby) requests, AKA real people using the packager. If I knew SQL better I could probably figure out how to count them, or I could parse the results into a Ruby hash based on the URL (what are people really packaging for?). Yes, it would be butt easy to bin them into a ‘friends’ table and to populate an ‘evil doers’ table (because they POST instead of GET, or they GET things like index.php which just don’t exist at the site). Create another table for script kiddies. Some might end up in .htaccess deny entries.
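On the SQL side, counting per URL would presumably be something like `select path, count(*) from logentry where browser='Ruby' group by path;`, assuming the table has a `path` column (the column name is a guess). The Ruby-hash route can be sketched as below. The rows, the key names and the category symbols are all hypothetical, and the rules are just the ones described above: POST or a GET for index.php means evil-doer, a Ruby user agent means friend.

```ruby
# Hypothetical rows; the keys are assumptions, not the real schema.
entries = [
  { ip: '192.0.2.10',   verb: 'GET',  path: '/public/select/pkg.rb', browser: 'Ruby' },
  { ip: '192.0.2.11',   verb: 'GET',  path: '/public/select/pkg.rb', browser: 'Ruby' },
  { ip: '192.0.2.10',   verb: 'GET',  path: '/public/shoes/shoes-3.2.24-gtk2-32.exe', browser: 'Ruby' },
  { ip: '198.51.100.7', verb: 'POST', path: '/', browser: 'Mozilla' },
  { ip: '198.51.100.8', verb: 'GET',  path: '/index.php', browser: 'Mozilla' },
  { ip: '198.51.100.7', verb: 'GET',  path: '/index.php', browser: 'Mozilla' }
]

# Rules taken from the text above; anything else stays :unknown.
def classify(entry)
  return :evil_doer if entry[:verb] == 'POST'
  return :evil_doer if entry[:path].end_with?('index.php')
  return :friend if entry[:browser] == 'Ruby'
  :unknown
end

# Count requests per URL for the Ruby user agent only.
per_path = Hash.new(0)
entries.each { |e| per_path[e[:path]] += 1 if e[:browser] == 'Ruby' }

# Bin every entry, then list each offending IP once.
bins = entries.group_by { |e| classify(e) }
evil_ips = bins.fetch(:evil_doer, []).map { |e| e[:ip] }.uniq

per_path.each { |path, cnt| puts "#{path} cnt #{cnt}" }
# "Deny from" is the older Apache access-control directive; whether a given
# server still honours it depends on its version and modules.
evil_ips.each { |ip| puts "Deny from #{ip}" }
```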
Yes, I can track user behavior IF they use this site. I don’t care unless you abuse this site. There is a lot of abuse going on and has been ever since I went live. It always will occur but that doesn’t mean I have to accept it forever. I can clean my house and lock the doors when I want.
Available at the usual places
New with 3.2.24 or here (same place)
- Added show_console command for OSX and Linux to match Windows.
Dumber than dumb console. Works with readline if you don’t expect too
much. Although the -w and --console switches do work on the command line
you probably don’t need them now that you can call Shoes::show_console
- OSX: new cshoes script for using Shoes from the command line.
Fixes some annoyances.
wiki: Command line for OSX
Fixed with 3.2.24
- Restore old behavior with ask/alert/confirm auto converting to string
- OSX: Fix issue #08 (again)
- OSX issue #20, 137 – command line incomplete, multiple apps
- dialog works better
- OSX: ask() dialog gets an icon like alert() and confirm()
- Windows: can now find the correct timezone for Time.now
- Windows packaging bug
If you don’t have enough to do, you can try the 3.2.24 beta. Or you could wait a few days and I’ll release 3.2.24. Bug fixes and a feature or two. Mostly a yawn unless it’s your bug or you really like the feature.