PICList
Thread
'[OT] Offline WEB Access'
2005\07\04@024744
by
Buehler, Martin
i would like to take some 'complete' web sites with me to places, where
i have no web access.
i tried downloading a complete site using quadsucker, which i have used
a few years ago with success, but this does not seem to work with '.asp'
pages, which all have the same address with different arguments.
the downloaded pages do not work, neither in internet explorer nor in
firefox.
is there a tool for doing so, that almost 'eats' everything?
thanx!
tino
2005\07\04@065112
by
Spehro Pefhany
At 08:47 AM 7/4/2005 +0200, you wrote:
>i would like to take some 'complete' web sites with me to places, where
>i have not web access.
>
>i tried downloading a complete site using quadsucker, which i have used
>a few years ago with success, but this does not seem to work with '.asp'
>pages, which all have the same address with different arguments.
>the downloaded pages do not work, neither in internet explorer nor in
>firefox.
>
>is there a tool for doing so, that almost 'eats' everything?
>
>thanx!
>tino
Have you tried Acrobat (full version)?
Best regards,
Spehro Pefhany --"it's the network..." "The Journey is the reward"
spam_OUTspeffTakeThisOuT
interlog.com Info for manufacturers: http://www.trexon.com
Embedded software/hardware/analog Info for designers: http://www.speff.com
->> Inexpensive test equipment & parts http://search.ebay.com/_W0QQsassZspeff
2005\07\04@094257
by
Randy Glenn
I've had good luck with httrack - http://www.httrack.com/
On 7/4/05, Buehler, Martin <.....Martin.BuehlerKILLspam
@spam@keymile.com> wrote:
> i would like to take some 'complete' web sites with me to places, where
> i have not web access.
>
> i tried downloading a complete site using quadsucker, which i have used
> a few years ago with success, but this does not seem to work with '.asp'
> pages, which all have the same address with different arguments.
> the downloaded pages do not work, neither in internet explorer nor in
> firefox.
>
> is there a tool for doing so, that almost 'eats' everything?
>
> thanx!
> tino
>
> -
2005\07\04@104804
by
John Ferrell
I have been playing with "Web Extractor", the trial version. I have not made
it work properly.
http://www.esalesbiz.com/extra/
I am going back to the documentation to try to determine what I am doing
wrong. I will probably pay the $30 shareware fee IF I can make the trial work!
My problem seems to be in defining proper boundaries. I cannot seem to keep
it focused where I want it to download. Fortunately it does have a Stop
button. I have about 150G of workspace on a data drive so that has kept me
out of trouble with my system drive. The program initiates multiple threads
to speed things up and it can really swamp you with things you don't care
about. If you are unfortunate enough to fall into a porn cascade it is like
stepping into a black hole! The download seems to deliver viruses without
going through Norton.
I am sure that the problems are mine, not the program's, but I have deferred
further testing until I bring up a low impact machine to do any more
testing.
Also, my XP Pro machine has the option of "Make website available offline"
sometimes, but I have not investigated this yet. This link
http://www.oz1bxm.dk/PIC/628uart.htm offers the option but does not follow
the links regardless of the depth set.
This page http://www.qsl.net/zl1bpu/micro/Rotator/Index.htm (a project I
would like to duplicate in PIC) does not offer the option for offline
availability. I don't yet know what determines the offline availability. The
pages saved are easily deleted from the same menu that allows the save or
all at once with the disk clean up in system tools. So far I have not
located where the files are kept so that I can archive a copy of a project.
John Ferrell
http://DixieNC.US
{Original Message removed}
2005\07\04@115539
by
Matthew Miller
On Mon, Jul 04, 2005 at 08:47:42AM +0200, Buehler, Martin wrote:
> i would like to take some 'complete' web sites with me to places, where
> i have not web access.
>
> i tried downloading a complete site using quadsucker, which i have used
> a few years ago with success, but this does not seem to work with '.asp'
> pages, which all have the same address with different arguments.
> the downloaded pages do not work, neither in internet explorer nor in
> firefox.
>
> is there a tool for doing so, that almost 'eats' everything?
I have pretty good success with a program called wget. The program has lots
of commandline switches, but it isn't too hard to figure out. Some sites
expect the HTTP request to contain user agent and referer information, but
with wget you can fool those sites pretty easily. ;)
It's free software, so you can use google to find a download site.
Matthew
--
"I do this really moronic thing that the government doesn't want me to
do. It is called thinking" - George Carlin
2005\07\04@171647
by
James Newtons Massmind
2005\07\04@172450
by
James Newton, Host
> [piclist-bounces
KILLspammit.edu] On Behalf Of John Ferrell
> My problem seems to be in defining proper boundaries. I
> cannot seem to keep it focused where I want it to download.
This is a common problem with all these programs and has been the bane of my
existence as a web host. When your site has over 3GB of content, most of it
interrelated with links, web rippers get confused and just keep downloading,
and downloading, and downloading....
And other users can't get on because your server is spending all its time
servicing this one jerk. Then your ISP calls up and says "you went over your
quota again so that will be an extra... $$$" and your wife gets the bill...
And then you have to pay for sex somewhere else... <KIDDING>
PLEASE do watch those things. Don't "fire and forget" assuming they will
stop when you expect it.
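One way to keep a ripper from wandering, sketched here with wget (which comes up later in the thread). This is only an illustration: the function name, the DRY_RUN switch, the example URL and the default limits are assumptions, not anything posted to the list. The wget options used are --level (link depth), --no-parent (never climb above the start directory) and --quota (stop fetching further files once that much has been downloaded).

```shell
# Hypothetical helper: a recursive download with explicit boundaries.
# Set DRY_RUN=1 to print the command instead of running it.
bounded_mirror() {
    start_url="$1"          # e.g. http://www.example.com/projects/
    max_depth="${2:-2}"     # link levels to follow (assumed default)
    max_total="${3:-50m}"   # download quota, k/m suffixes allowed (assumed default)

    # Rebuild the argument list so the URL is passed as a single word.
    set -- --recursive --level="$max_depth" --no-parent \
           --quota="$max_total" --page-requisites --convert-links "$start_url"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "wget $*"
    else
        wget "$@"
    fi
}

# Check the command first, then run it for real:
#   DRY_RUN=1; bounded_mirror http://www.example.com/projects/ 2 50m
#   DRY_RUN=0; bounded_mirror http://www.example.com/projects/ 2 50m
```

Without --span-hosts wget stays on the starting host, so depth, --no-parent and the quota together put a ceiling on how far one run can go.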
The PICList CD doesn't cost much, and I will drastically lower the price for
volume or group purchases since updating the CD image is the worst of it.
---
James Newton: PICList webmaster/Admin
.....jamesnewtonKILLspam
.....piclist.com 1-619-652-0593 phone
http://www.piclist.com/member/JMN-EFP-786
PIC/PICList FAQ: http://www.piclist.com
2005\07\04@172853
by
James Newton, Host
> I have pretty good success with a program called wget. The
> program has lots of command line switches, but it isn't too
> hard to figure out. Some sites expect the HTTP request to
> contain user agent and referrer information, but with wget you
> can fool those sites pretty easily. ;)
Grumble, grumble. Yes you can... But I... Err... I mean site hosts can still
look for patterns in the way the links are followed and the speed with which
pages are requested or the number of pages requested over time. I do all of
those: patterns and speed are checked automatically. Page count and patterns
are also manually checked when the server notices more activity than
expected.
If you have to rip a site, wget is one of the better ones. It can be set to
wait a while between requests, which I very much suggest you do to avoid
locking up the server with your requests.
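A rough sketch of what "wait a while between requests" can look like in practice. The function names, the example URL and the default numbers are assumptions made for illustration; --wait, --random-wait and --limit-rate are standard wget options. The small calculator is there because pacing has a cost worth knowing up front: total time grows with pages times delay.

```shell
# Rough estimate, in whole minutes (rounded up), of a paced crawl:
# one wait per requested page, ignoring transfer time.
crawl_minutes() {
    pages="$1"
    wait_s="$2"
    echo $(( ($pages * $wait_s + 59) / 60 ))
}

# Hypothetical paced download: pause between requests and cap bandwidth.
paced_wget() {
    url="$1"
    wait_s="${2:-4}"     # seconds between requests (assumed default)
    rate="${3:-20k}"     # bandwidth cap, bytes/s with k or m suffix (assumed default)
    wget --recursive --level=2 --no-parent \
         --wait="$wait_s" --random-wait --limit-rate="$rate" "$url"
}

# Usage:
#   crawl_minutes 300 4      # how long will 300 pages take at 4 s each?
#   paced_wget http://www.example.com/projects/ 4 20k
```

With --random-wait the actual pause varies around the requested value, so the estimate is only a ballpark figure.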
---
James Newton: PICList webmaster/Admin
EraseMEjamesnewtonspam_OUT
TakeThisOuTpiclist.com 1-619-652-0593 phone
http://www.piclist.com/member/JMN-EFP-786
PIC/PICList FAQ: http://www.piclist.com
2005\07\04@183935
by
Matthew Miller
Hi James,
On Mon, Jul 04, 2005 at 02:28:58PM -0700, James Newton, Host wrote:
> > I have pretty good success with a program called wget. The
> > program has lots of command line switches, but it isn't too
> > hard to figure out. Some sites expect the HTTP request to
> > contain user agent and referrer information, but with wget you
> > can fool those sites pretty easily. ;)
>
> Grumble, grumble. Yes you can... But I... Err... I mean site hosts can still
> look for patterns in the way the links are followed and the speed with which
> pages are requested or the number of pages requested over time. I do all of
> those, patterns and speed are checked automatically. Page count and patterns
> are also manually checked when the server notices more activity than
> expected.
>
> If you have to rip a site, wget is one of the better ones. It can be set to
> wait a while between requests, which I very much suggest you do to avoid
> locking up the server with your requests.
I promise, I've never ripped piclist.com; though I frequently do save pages
to disk. ;^) Here is my wget command line:
wget -r -l 5 -w 4 --random-wait -U "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; T312461)" \
     -t 1 --referer="$host" --ignore-length -T 5 "$1"
The switches "-w 4 --random-wait" are what make it server friendly.
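The command line above uses $host and "$1", so it presumably lives inside a script that is not shown. A minimal sketch of such a wrapper follows. It is a guess at the surrounding script, not the poster's own: the function names are made up, and treating $host as the site root (scheme plus host name) is an assumption.

```shell
# Derive "scheme://host/" from a URL, for use as the Referer header.
referer_for() {
    url="$1"
    scheme="${url%%://*}"    # text before "://"
    rest="${url#*://}"       # text after "://"
    host="${rest%%/*}"       # host name, up to the first "/"
    printf '%s://%s/\n' "$scheme" "$host"
}

# Hypothetical wrapper around the command line quoted above.
fetch_site() {
    url="$1"
    wget -r -l 5 -w 4 --random-wait \
         -U "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; T312461)" \
         -t 1 --referer="$(referer_for "$url")" --ignore-length -T 5 "$url"
}

# Usage:
#   fetch_site "http://www.example.com/projects/index.htm"
```

Quoting "$url" matters here: addresses with "?" or "&" in them would otherwise be mangled by the shell.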
Matthew.
--
"It is my moral obligation to disobey unjust laws." - MLK, Jr.
2005\07\04@203745
by
John Ferrell
I downloaded a fresh trial copy (Web Extractor) and it works like I want it
to, so it looks like I will spring for it. No doubt in my mind that Wget
will do the same for free, but I have grown accustomed to not having to
think about the interface.
I have no need to download big sites, but frequently non-commercial web
sites have projects that are worth a save, especially if you can preserve
images and the intended order.
When I buy a physical product over the internet I like to keep a copy of the
web documentation as well as whatever came with it. It sometimes helps a
couple years down the road!
John Ferrell
http://DixieNC.US
----- Original Message -----
From: "John Ferrell" <johnferrell
spam_OUTearthlink.net>
To: "Microcontroller discussion list - Public." <@spam@piclistKILLspam
mit.edu>
Sent: Monday, July 04, 2005 10:51 AM
Subject: Re: [OT] Offline WEB Access
>I have been playing with "Web Extractor", the trial version. I have not
>made it work properly.
> http://www.esalesbiz.com/extra/
2005\07\04@204719
by
John Ferrell
I can see where it is a problem waiting to be solved. I try to respect the
wishes of the host at all sites.
People in the business of brokering information are in an increasingly
hostile environment.
It would help if projects like the PIC list had a solution to the problem.
John Ferrell
http://DixieNC.US
{Original Message removed}
2005\07\04@215724
by
William Chops Westfield
On Jul 4, 2005, at 5:50 PM, John Ferrell wrote:
>
> It would help if projects like the PIC list had a solution to the
> problem.
>
Piclist distributes a copy of the site contents on CD(s) for a nominal
cost ($20.) Isn't that a solution?
BillW
2005\07\04@224140
by
John Ferrell
It sounds fair to me.
I gather that there still are those who elect to download rather than buy
the cd copy.
Actually, that sounds like a very good bargain.
I believe my ISP reduces my download speed after I hit some number but it is
a pretty big number.
John Ferrell
http://DixieNC.US
{Original Message removed}
2005\07\05@013548
by
Buehler, Martin
no problem. piclist is not the information i usually take on holiday ;-)
>{Original Message removed}
2005\07\05@023320
by
James Newtons Massmind
> no problem. piclist is not the information i usually take on
> holiday ;-)
So what are you saying? The PICList isn't GOOD ENOUGH for you to take on
holiday?
<GRIN>
Sorry, I couldn't help myself.
---
James.
2005\07\05@133452
by
Alan Schnittman
I like Teleport Pro. <www.tenmax.com/teleport/pro/home.htm>.
There's a free evaluation version, but IMHO it's well worth the $40 price.
At 02:47 AM 7/4/2005, "Buehler, Martin" <KILLspamMartin.BuehlerKILLspam
keymile.com> wrote:
>
>i would like to take some 'complete' web sites with me to places, where
>i have not web access.
>
> [snip]
>
2005\07\06@090036
by
Howard Winter
> piclist is not the information i usually take on
holiday ;-)
Lightweight! :-)))
Cheers,
Howard Winter
St.Albans, England
2005\07\06@091919
by
John J. McDonough
----- Original Message -----
From: "Howard Winter" <RemoveMEHDRWTakeThisOuT
H2Org.demon.co.uk>
Subject: RE: [OT] Offline WEB Access
>> piclist is not the information i usually take on
> holiday ;-)
Well, there's someone who just failed the geek test!
--McD
2005\07\11@221016
by
M. Adam Davis
Offline web site access works well with static content and directly linked
content. Web site rippers don't work well on websites that are backed
by databases, which is probably one of the problems you're having.
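To make the '.asp' problem concrete: pages such as show.asp?id=1 and show.asp?id=2 share one script name, so a ripper has to turn each full address, arguments included, into its own local file, and give it an extension a browser will open from disk. The sketch below imitates that mapping in a simplified way. The function names and URLs are invented for illustration, and the exact names a real tool produces may differ. The wget options in the second function (--adjust-extension, spelled --html-extension in older releases, plus --convert-links and --restrict-file-names=windows) are real options aimed at this situation.

```shell
# Simplified illustration: map a URL with arguments to a distinct local name.
# Assumes the query string itself contains no "/" characters.
local_name() {
    url="$1"
    file="${url##*/}"                           # last path component plus query
    file=$(printf '%s' "$file" | tr '?' '@')    # "?" is awkward in file names
    case "$file" in
        *.html|*.htm) ;;                        # already opens in a browser
        *) file="$file.html" ;;                 # otherwise add an extension
    esac
    printf '%s\n' "$file"
}

# Hypothetical wget call for a query-driven site.
mirror_dynamic() {
    wget --recursive --level=2 --no-parent --wait=4 \
         --adjust-extension --convert-links \
         --restrict-file-names=windows "$1"
}

# Usage:
#   local_name "http://www.example.com/show.asp?id=1"
#   mirror_dynamic "http://www.example.com/show.asp?id=1"
```

This only helps with pages that are reachable through ordinary links. Content produced by forms or searches has no link for the ripper to follow, so it will not be captured.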
If you can give us an example website that you seem to have trouble with
I'm sure someone will have a few suggestions on how to handle it.
As much as I dislike abusing websites with rippers, it has been good
especially in cases such as Circuit Cellar Online where the site
essentially languished. I got all the PDFs of the articles just before
it went completely down and it's great content that, apparently, Circuit
Cellar doesn't have the full rights to show on their main site. It's
in legal copyright limbo as far as I can tell. Archive.org is good, but
it doesn't get everything...
-Adam
Buehler, Martin wrote:
>i would like to take some 'complete' web sites with me to places, where
>i have not web access.
>
>i tried downloading a complete site using quadsucker, which i have used
>a few years ago with success, but this does not seem to work with '.asp'
>pages, which all have the same address with different arguments.
>the downloaded pages do not work, neither in internet explorer nor in
>firefox.
>
>is there a tool for doing so, that almost 'eats' everything?
>
>thanx!
>tino
>
>
>