Offline Browsing in Linux: wget and some tricks

Ever since I joined Hostgator.com, I’ve been learning a lot of Linux in the hope of switching my career to Linux, hopefully in something forensics related.

My latest problem was downloading a website for offline browsing. I went hunting for an offline browser for Linux and found that I could use wget to mirror a whole website.

For example, say I want to make a copy of blackberrysimunlockcode.com. Here’s how:

wget -m http://blackberrysimunlockcode.com

Here the -m option is telling wget to mirror the website. This is the basic command. But say I need some advanced options. What do I do?

I was trying to get all the script files off a website to save for later study, but all wget downloaded was index.html and robots.txt.
The robots.txt file was blocking the wget user agent. To confirm this I used wget’s debug option:

wget -m -d http://blackberrysimunlockcode.com

You’ll get something like:

Not following http://blackberrysimunlockcode.com/privacy.shtml because robots.txt forbids it.

or

Rejecting path sh/eg/talk.sh.txt because of rule `sh'

or

no-follow in index.html
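
The debug log is long, so it can help to filter it down to the lines that explain a skip. A minimal sketch, assuming the three message fragments quoted above are the ones of interest; robots_skips is a made-up helper name, not part of wget:

```shell
# Keep only the lines of a wget debug log that explain why a URL was skipped.
# The patterns are the message fragments quoted above; other wget versions
# may word these messages differently.
robots_skips() {
    grep -E 'robots\.txt forbids|because of rule|no-follow in'
}

# Example (wget writes its log to stderr, hence the 2>&1):
#   wget -m -d http://blackberrysimunlockcode.com 2>&1 | robots_skips
```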

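I tried using the option --user-agent "Mozilla" ... no luck. wget checks robots.txt itself, so changing the User-Agent header it sends does not make it ignore the rules.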

I tried adding the following to .wgetrc:

## Local settings (for a user to set in his $HOME/.wgetrc). It is
## *highly* undesirable to put these settings in the global file, since
## they are potentially dangerous to “normal” users.
##
## Even when setting up your own ~/.wgetrc, you should know what you
## are doing before doing so.
##

header = Accept-Language: en-us,en;q=0.5
header = Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
header = Accept-Encoding: gzip,deflate
header = Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
header = Keep-Alive: 300
user_agent = Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
referer = http://www.google.com

…Still no luck.

The trick is to use the option -e robots=off, which makes wget ignore robots.txt.

So my new command became:

wget -m -k -e robots=off -w 2 --random-wait -U "Mozilla" -np http://blackberrysimunlockcode.com

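Here’s what the options do:

-m mirrors the website (recursive download with timestamping)
-k converts the links in the saved pages so they point to your local copy (/sh/eg) instead of sending you back to www.blackberrysimunlockcode.com/sh/eg (relative vs absolute)
-e executes a .wgetrc-style command, here robots=off
-w 2 sets the wait between requests to 2 seconds so you don’t overload the server and get your IP blocked
--random-wait varies each wait randomly between 0.5 and 1.5 times the -w value (1 to 3 seconds here)
-U sets the user agent
-np (no parent) keeps wget from climbing into parent directories; without it, a page that links to parent pages could make wget crawl the whole website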
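
For scripts, the same command may be easier to read with wget’s long option names. The sketch below only prints the command so it can be checked before running; mirror_cmd is a hypothetical helper name, and the 2-second wait and "Mozilla" agent are simply the values used above:

```shell
# Print the mirror command from above using wget's long option names.
# mirror_cmd is a hypothetical helper; it prints the command, it does not run wget.
mirror_cmd() {
    url=$1
    printf '%s %s\n' \
        'wget --mirror --convert-links --execute robots=off --wait=2 --random-wait --user-agent="Mozilla" --no-parent' \
        "$url"
}

# Example:
#   mirror_cmd http://blackberrysimunlockcode.com
```

Each long name maps one-to-one onto a short flag from the list above (--mirror is -m, --convert-links is -k, and so on).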

WordPress Hosting Review

Free WordPress Hosting for 3 months with Code: FREE3

They provide 1GB of space and 10GB of bandwidth. It’s reliable shared hosting with an easy 1-click WordPress install through Softaculous. They are a Linux-based hosting provider, most probably running CentOS. The customer support is 24/7 and excellent. Check them out.

I was wondering, does Android-based web hosting exist? I haven’t seen any yet, but it would be interesting if someone started it. So that would be like running an AAMP server, right?


Switched WebHost Providers

Cirtexhosting sucks! Cirtexhosting was just horrible!! I got really crappy uptime. The service was pretty pathetic and I never got a refund under the 99% uptime guarantee. I was losing valuable AdSense revenue. I couldn’t take it anymore. The prices might have been good, especially since I needed FFMPEG hosting, but I had to call it quits.

I switched to a much better host, HawkHost. Prices are reasonable and support is great… so far. The CEO himself helped me with a troubleshooting ticket.

Ad-hoc network on my Styleflying GPAD G10

To get this working, you will first have to root your Styleflying Android tablet.

This tablet is not like other Android devices, which are easily connected to an ad-hoc network by editing tiwlan.ini and wpa_supplicant.conf. In fact, there is no tiwlan file on the device. We will have to use an alternate method, although we will still be patching wpa_supplicant.conf.

The first step is to create an ad-hoc connection on your phone, laptop or desktop. Name the SSID ‘droidhoc’ and do not use a WEP key (it causes unnecessary problems). Set that machine’s IP address to static and use 192.168.0.1 (you can use DHCP instead, but it sometimes causes problems). If you go static, also give the tablet a static IP, like 192.168.0.2, under advanced settings, with the mask set to 255.255.255.0 and DNS to 192.168.0.1.

The next step is to connect the GPAD with adb; I will assume you know how to do that.

Now we will type the following:

adb shell # open a shell on the tablet
su # switch to superuser mode
mount -o rw,remount -t yaffs2 /dev/block/mtdblock3 /system # remount the partition as writable
cp /data/misc/wifi/wpa_supplicant.conf /sdcard/ # copy the configuration file to the sdcard
exit # leave superuser mode
exit # leave the adb shell
adb pull /sdcard/wpa_supplicant.conf / # pull the file off the sdcard to edit it

Now edit the file with your text editor and enter the following after the line ctrl_interface=DIR=/data/system/wpa_supplicant GROUP=system:

eapol_version=2
update_config=1
ap_scan=2

network={
ssid="droidhoc"
scan_ssid=1
key_mgmt=NONE
group=WEP104
auth_alg=OPEN SHARED
priority=99
mode=1
}
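
Before pushing the file back, a quick check that the edit landed can save a reboot cycle. A rough sketch, run on the computer against the local copy; check_adhoc_conf is a hypothetical helper, and it only looks for three of the lines above (ap_scan=2, mode=1 and the ssid):

```shell
# Return 0 only if the edited file contains the key ad-hoc settings.
# check_adhoc_conf is a hypothetical helper: $1 is the path to the local
# copy of wpa_supplicant.conf, $2 is the SSID you expect to find in it.
check_adhoc_conf() {
    conf=$1
    ssid=$2
    grep -q '^ap_scan=2$' "$conf" &&
    grep -q '^[[:space:]]*mode=1$' "$conf" &&
    grep -q "^[[:space:]]*ssid=\"$ssid\"\$" "$conf"
}

# Example:
#   check_adhoc_conf /wpa_supplicant.conf droidhoc && echo "looks fine"
```

The patterns match whole lines, so a file saved with Windows line endings will fail the check.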

Put the wpa_supplicant.conf file back, change its owner and permissions, and reboot:
adb push /wpa_supplicant.conf /sdcard/wpanew.conf
adb shell
su
cp /sdcard/wpanew.conf /data/misc/wifi/wpa_supplicant.conf
cd /data/misc/wifi
chown wifi.wifi wpa_supplicant.conf
chmod 777 wpa_supplicant.conf
reboot

Once your tablet has rebooted, download HydTech’s ad-hoc Wi-Fi app from the Market and run it. (The app costs only $1.99.)

Enjoy your new adhoc connection!

Fixing the Android Market on my Styleflying T72 GPAD G10

When these Android tablets are mass produced, I think all of them are given the same Android ID, so I think Google is blocking them, and that is why downloads get stuck at “starting download”. Also, random notifications pop up saying that the download of such and such app was unsuccessful when you haven’t even tried to download that app. So we need to change our Android ID. Either get one from another phone you are using or create one from an emulator. The problem with using your phone’s ID is that any app you download on the slate is also downloaded to your phone. I used the ID from my HTC Hero.

Here’s what I did (directions from slatedroid.com):

1. Connect the Hero using adb.
2. Type:
adb shell
su
sqlite3 /data/data/com.google.android.googleapps/databases/gls.db "select * from meta";

3. Write down the androidId value somewhere!
4. Connect the GPAD with adb.
5. Type:
adb shell
su
sqlite3 /data/data/com.google.android.googleapps/databases/gls.db "update meta set intvalue=XXXXXXXXXXXXXXXXXXX where name='androidId'";

(Obviously, replace the X’s with your ID.)
6. Restart and enjoy the Market!
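
Since a typo in the ID goes straight into the database, it may be worth building the statement with a small check first. A sketch only: android_id_sql is a hypothetical helper, the ID in the example is made up, and it assumes the ID is a plain decimal integer (optionally negative), which the intvalue column suggests but the steps above do not confirm:

```shell
# Print the UPDATE statement for a given ID, refusing anything that is not
# a decimal integer (assumption: that is what the meta table stores).
android_id_sql() {
    id=$1
    digits=${id#-}            # allow one optional leading minus sign
    case $digits in
        ''|*[!0-9]*)
            echo "android id must be a decimal integer" >&2
            return 1
            ;;
    esac
    printf "update meta set intvalue=%s where name='androidId';\n" "$id"
}

# Example, on the computer (made-up ID):
#   android_id_sql 1234567890123456789
# then paste the printed statement into the sqlite3 command from step 5.
```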