PICList Thread
'[OT] Offline WEB Access'
2005\07\04@024744 by Buehler, Martin

i would like to take some 'complete' web sites with me to places where
i have no web access.

i tried downloading a complete site using quadsucker, which i used
a few years ago with success, but this does not seem to work with '.asp'
pages, which all have the same address with different arguments.
the downloaded pages do not work in either internet explorer or
firefox.

is there a tool for this that 'eats' almost everything?

thanx!
tino

2005\07\04@065112 by Spehro Pefhany

At 08:47 AM 7/4/2005 +0200, you wrote:
>i would like to take some 'complete' web sites with me to places, where
>i have not web access.
>
>i tried downloading a complete site using quadsucker, which i have used
>a few years ago with success, but this does not seem to work with '.asp'
>pages, which all have the same address with different arguments.
>the downloaded pages do not work, neither in internet explorer nor in
>firefox.
>
>is there a tool for doing so, that almost 'eats' everything?
>
>thanx!
>tino

Have you tried Acrobat (full version)?

Best regards,

Spehro Pefhany --"it's the network..."            "The Journey is the reward"
spam_OUTspeffTakeThisOuTspaminterlog.com             Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog  Info for designers:  http://www.speff.com
->> Inexpensive test equipment & parts http://search.ebay.com/_W0QQsassZspeff


2005\07\04@094257 by Randy Glenn

I've had good luck with httrack - http://www.httrack.com/

{Original Message removed}

2005\07\04@104804 by John Ferrell

I have been playing with "Web Extractor", the trial version. I have not made
it work properly.
http://www.esalesbiz.com/extra/

I am going back to the documentation to try to determine what I am doing
wrong. I will probably spend the $30 on the shareware IF I can make the trial work!

My problem seems to be in defining proper boundaries. I cannot seem to keep
it focused where I want it to download. Fortunately it does have a Stop
button. I have about 150G of workspace on a data drive, so that has kept me
out of trouble with my system drive. The program initiates multiple threads
to speed things up, and it can really swamp you with things you don't care
about. If you are unfortunate enough to fall into a porn cascade it is like
stepping into a black hole! The download seems to deliver viruses without
going through Norton.

I am sure that the problems are mine, not the program's, but I have deferred
further testing until I bring up a low-impact machine for it.

Also, my XP Pro machine has the option of "Make website available offline"
sometimes, but I have not investigated this yet. This link
http://www.oz1bxm.dk/PIC/628uart.htm offers the option but does not follow
the links regardless of the depth set.

This page http://www.qsl.net/zl1bpu/micro/Rotator/Index.htm (a project I
would like to duplicate in PIC) does not offer the option for offline
availability. I don't yet know what determines the offline availability. The
pages saved are easily deleted from the same menu that allows the save or
all at once with the disk clean up in system tools. So far I have not
located where the files are kept so that I can archive a copy of a project.


John Ferrell
http://DixieNC.US

{Original Message removed}

2005\07\04@115539 by Matthew Miller

On Mon, Jul 04, 2005 at 08:47:42AM +0200, Buehler, Martin wrote:
> i would like to take some 'complete' web sites with me to places, where
> i have not web access.
>
> i tried downloading a complete site using quadsucker, which i have used
> a few years ago with success, but this does not seem to work with '.asp'
> pages, which all have the same address with different arguments.
> the downloaded pages do not work, neither in internet explorer nor in
> firefox.
>
> is there a tool for doing so, that almost 'eats' everything?

I have had pretty good success with a program called wget. The program has
lots of command-line switches, but it isn't too hard to figure out. Some sites
expect the HTTP request to contain user agent and referer information, but
with wget you can fool those sites pretty easily. ;)

It's free software, so you can use google to find a download site.

Matthew

--
"I do this really moronic thing that the government doesn't want me to
do.  It is called thinking" - George Carlin

2005\07\04@171647 by James Newtons Massmind

If you do find one do NOT use it on piclist.com
http://www.piclist.com/dontripthissite.htm

And I'm NOT going to mention plucker

---
James.



> {Original Message removed}

2005\07\04@172450 by James Newton, Host

> John Ferrell wrote:
> My problem seems to be in defining proper boundaries. I
> cannot seem to keep it focused where I want it to download.

This is a common problem with all these programs and has been the bane of my
existence as a web host. When your site has over 3GB of content, most of it
interrelated with links, web rippers get confused and just keep downloading,
and downloading, and downloading....

And other users can't get on because your server is spending all its time
servicing this one jerk. Then your ISP calls up and says "you went over your
quota again so that will be an extra... $$$" and your wife gets the bill...
And then you have to pay for sex somewhere else... <KIDDING>

PLEASE do watch those things. Don't "fire and forget", assuming they will
stop when you expect them to.

The PICList CD doesn't cost much, and I will drastically lower the price for
volume or group purchases since updating the CD image is the worst of it.

---
James Newton: PICList webmaster/Admin
.....jamesnewtonKILLspamspam.....piclist.com  1-619-652-0593 phone
http://www.piclist.com/member/JMN-EFP-786
PIC/PICList FAQ: http://www.piclist.com



2005\07\04@172853 by James Newton, Host

> I have pretty good success with a program called wget. The
> program has lots of command line switches, but it isn't too
> hard to figure out. Some sites expect the HTTP request to
> contain user agent and referrer information, but with wget you
> can fool those sites pretty easily. ;)

Grumble, grumble. Yes you can... But I... Err... I mean site hosts can still
look for patterns in the way the links are followed, the speed with which
pages are requested, or the number of pages requested over time. I do all of
those: patterns and speed are checked automatically. Page count and patterns
are also checked manually when the server notices more activity than
expected.

If you have to rip a site, wget is one of the better ones. It can be set to
wait a while between requests, which I very much suggest you do to avoid
locking up the server with your requests.
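
As a rough illustration of the "number of pages requested over time" check described above, the sketch below counts requests per client in a web server access log and lists the clients above a limit. It is not the PICList server's actual method. The function name heavy_clients, the limit, and the assumption that the client address is the first field of each log line (as in Common Log Format) are all illustrative.

```shell
#!/bin/sh
# heavy_clients LIMIT
# Reads access-log lines on stdin and prints "count client" for every client
# with more than LIMIT requests, busiest first.
# Assumption: the client address is the first whitespace-separated field.
heavy_clients() {
    awk -v limit="$1" '
        { hits[$1]++ }                      # tally requests per client
        END {
            for (client in hits)
                if (hits[client] > limit + 0)
                    print hits[client], client
        }
    ' | sort -rn
}

# Example (access.log is a hypothetical file name):
# heavy_clients 500 < access.log
```

Feeding it only the log lines from the last hour or day would turn the raw count into a rate, which is closer to what a host would actually watch.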

---
James Newton: PICList webmaster/Admin
EraseMEjamesnewtonspam_OUTspamTakeThisOuTpiclist.com  1-619-652-0593 phone
http://www.piclist.com/member/JMN-EFP-786
PIC/PICList FAQ: http://www.piclist.com



2005\07\04@183935 by Matthew Miller

Hi James,

On Mon, Jul 04, 2005 at 02:28:58PM -0700, James Newton, Host wrote:
> > I have pretty good success with a program called wget. The
> > program has lots of command line switches, but it isn't too
> > hard to figure out. Some sites expect the HTTP request to
> > contain user agent and referrer information, but with wget you
> > can fool those sites pretty easily. ;)
>
> Grumble, grumble. Yes you can... But I... Err... I mean site hosts can still
> look for patterns in the way the links are followed and the speed with which
> pages are requested or the number of pages requested over time. I do all of
> those, patterns and speed are checked automatically. Page count and patterns
> are also manually checked when the server notices more activity than
> expected.
>
> If you have to rip a site, wget is one of the better ones. It can be set to
> wait a while between requests, which I very much suggest you do to avoid
> locking up the server with your requests.

I promise, I've never ripped piclist.com; though I frequently do save pages
to disk. ;^) Here is my wget command line:

# "$1" is the URL to fetch; $host is assumed to be set earlier in the script
wget -r -l 5 -w 4 --random-wait -U "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; T312461)" \
-t 1 --referer="$host" --ignore-length -T 5 "$1"

The switches "-w 4 --random-wait" are what make it server friendly.
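
For readers new to wget, here is a commented variant of the same idea. It is a sketch, not Matthew's actual script: the function name polite_mirror and the DRY_RUN variable are made up for illustration, and the -np, -k and -E switches are real wget options that do not appear in the command above. -np may help with the "boundaries" problem mentioned earlier in the thread, since it keeps wget from climbing above the starting directory.

```shell
#!/bin/sh
# polite_mirror URL
# Prints a rate-limited recursive wget command; runs it only if DRY_RUN=0.
# polite_mirror and DRY_RUN are hypothetical names used for illustration.
polite_mirror() {
    url=$1
    # -r -l 5             recurse, at most 5 links deep
    # -np                 never ascend above the starting directory
    # -w 4 --random-wait  wait about 4 s, randomly varied, between requests
    # -t 1 -T 5           one attempt per file, 5 s network timeout
    # -k -E               rewrite links for local viewing, save pages as .html
    set -- wget -r -l 5 -np -w 4 --random-wait -t 1 -T 5 -k -E "$url"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"            # actually fetch
    else
        echo "$@"       # default: only show what would be run
    fi
}
```

Calling `polite_mirror http://www.example.com/projects/` prints the command so it can be checked first; `DRY_RUN=0 polite_mirror ...` performs the download.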

Matthew.

--
"It is my moral obligation to disobey unjust laws." - MLK, Jr.

2005\07\04@203745 by John Ferrell

I downloaded a fresh trial copy (Web Extractor) and it works like I want it
to, so it looks like I will spring for it. No doubt in my mind that Wget
will do the same for free, but I have grown accustomed to not having to
think about the interface.

I have no need to download big sites, but frequently non-commercial web
sites have projects that are worth a save, especially if you can preserve
images and the intended order.

When I buy a physical product over the internet I like to keep a copy of the
web documentation as well as whatever came with it. It sometimes helps a
couple years down the road!

John Ferrell
http://DixieNC.US

{Original Message removed}


2005\07\04@204719 by John Ferrell

I can see where it is a problem waiting to be solved. I try to respect the
wishes of the host at all sites.

People in the business of brokering information are in an increasingly
hostile environment.

It would help if projects like the PIC list had a solution to the problem.

John Ferrell
http://DixieNC.US

{Original Message removed}

2005\07\04@215724 by William Chops Westfield

On Jul 4, 2005, at 5:50 PM, John Ferrell wrote:
>
> It would help if projects like the PIC list had a solution to the
> problem.
>
PICList distributes a copy of the site contents on CD(s) for a nominal
cost ($20). Isn't that a solution?

BillW

2005\07\04@224140 by John Ferrell

It sounds fair to me.
I gather that there are still those who elect to download rather than buy
the CD copy.
Actually, that sounds like a very good bargain.
I believe my ISP reduces my download speed after I hit some number, but it is
a pretty big number.

John Ferrell
http://DixieNC.US

{Original Message removed}

2005\07\05@013548 by Buehler, Martin

no problem. piclist is not the information i usually take on holiday ;-)

>{Original Message removed}

2005\07\05@023320 by James Newtons Massmind

> no problem. piclist is not the information i usually take on
> holiday ;-)

So what are you saying? The PICList isn't GOOD ENOUGH for you to take on
holiday?

<GRIN>

Sorry, I couldn't help myself.

---
James.



2005\07\05@133452 by Alan Schnittman

I like Teleport Pro.  <www.tenmax.com/teleport/pro/home.htm>.
There's a free evaluation version, but IMHO it's well worth the $40 price.


At 02:47 AM 7/4/2005, "Buehler, Martin" wrote:
>
>i would like to take some 'complete' web sites with me to places, where
>i have not web access.
>
> [snip]
>

2005\07\06@090036 by Howard Winter

> piclist is not the information i usually take on
> holiday ;-)

Lightweight!  :-)))

Cheers,



Howard Winter
St.Albans, England


2005\07\06@091919 by John J. McDonough
----- Original Message -----
From: "Howard Winter"
Subject: RE: [OT] Offline WEB Access


>> piclist is not the information i usually take on
>> holiday ;-)

Well, there's someone who just failed the geek test!

--McD

2005\07\11@221016 by M. Adam Davis

Offline web site access works well with static content and directly linked
content.  Web site rippers don't work well on websites that are backed
by databases, which is probably one of the problems you're having.
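
One likely reason, consistent with the original complaint about '.asp' pages that share an address and differ only in their arguments: if a ripper names local files by path alone, every such page lands in the same file. The sketch below shows a naming scheme that keeps the query string in the file name. The function name url_to_filename is hypothetical, and this is not a description of how any tool named in this thread actually stores pages.

```shell
#!/bin/sh
# url_to_filename URL
# Maps a URL, including its query string, to a flat local file name so that
# page.asp?id=1 and page.asp?id=2 do not overwrite each other.
url_to_filename() {
    printf '%s\n' "$1" |
        sed -e 's|^[a-z]*://||' \
            -e 's|[/?&=]|_|g' \
            -e 's|$|.html|'
}

# Example (www.example.com is a placeholder host):
# url_to_filename 'http://www.example.com/page.asp?id=1'
```

Unique names are only half of the job: links inside the saved pages still have to be rewritten to point at those names before the copy is browsable offline.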

If you can give us an example website that you seem to have trouble with
I'm sure someone will have a few suggestions on how to handle it.

As much as I dislike abusing websites with rippers, it has been good,
especially in cases such as Circuit Cellar Online, where the site
essentially languished.  I got all the PDFs of the articles just before
it went down completely, and it's great content that, apparently, Circuit
Cellar doesn't have the full rights to show on its main site.  It's
in legal copyright limbo as far as I can tell.  Archive.org is good, but
it doesn't get everything...

-Adam
