Do we need a historian? It's pruning time.

Partial / temporary help in the thread 'image' situation.

Firefox browser has an extension (add-on)
available free on the website called "Text-to-Image"
When turned on
even for the text-only archive I have
it parses the text,
identifies image addresses
& goes out to dl & display them in place of the text address
:cool: coolness
just like Bladeforums does normally.

[this applies only to image addresses posted,
mostly image links from outside bladeforums,
not icons/avatars/page displays.
Just the links]


now if I can find a way to use this or another extension to
parse the text threads & cache all the pics
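
For anyone who wants to script the "identify image addresses" step on a text-only archive, here is a minimal sketch. It is not how the Firefox extension works internally; the pattern and the function name `find_image_urls` are my own assumptions, and it only handles full http/https addresses ending in a common image extension.

```python
import re

# Assumed pattern: an http(s) address that ends in a common image extension.
# Brackets and quotes are excluded so BBCode tags around a link are not swallowed.
IMAGE_URL = re.compile(
    r"https?://[^\s\[\]<>\"']+\.(?:jpe?g|gif|png|bmp)\b",
    re.IGNORECASE,
)


def find_image_urls(text):
    """Return the image addresses found in text, in order, without duplicates."""
    seen = set()
    urls = []
    for match in IMAGE_URL.finditer(text):
        url = match.group(0)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

The resulting list could then be handed to whatever downloader is used to cache the pictures; that part is left out here.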



[color=darkgreen]~
~~~~~~~~~
<>[size=1][i] THEY[/i][/size][size=2] call me [/size][/color][size=4][b] 'Dean' [/b][/size] [color=darkgreen] :)[size=1]-fYI-fWiW-iIRC-JMO-M2C-YMMV-TiA-YW-GL-HH-HBd-IBSCUtWS-theWotBGUaDUaDUaD[/size]
<>[URL=www.bladeforums.com/forums/showthread.php?t=273576] [b]Tips[/b][/url] <> [i][url=www.bahai.org/]Baha'i[/url] [url=www.bahaiprayers.org/indexlong.htm]Prayers[/url] [url=http://www.bahaindex.com/]Links[/url] --[url=www.bahaiprayers.org/indexlong.htm#america]A[/url]--[url=www.bahaiprayers.org/indexlong.htm#tests]T[/url]--[url=www.bahaiprayers.org/indexlong.htm#healing]H[/url]--[url=www.bahaiprayers.org/indexlong.htm#departed]D[/url] [/i][/color]
 
Pics would be nice Dean...but if you even got the HI text, it's a blessing!

.
 
ddean said:
[this applies only to image addresses posted,
mostly image links from outside bladeforums,
not icons/avatars/page displays.
Just the links]


now if I can find a way to use this or another extension to
parse the text threads & cache all the pics

The webcrawler can download images for you if you let it. Just add a filter to accept images - *.jpg, *.gif, etc.
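
To show what an "accept images" filter amounts to, here is a rough sketch in Python. It is not the crawler's own filter code; the pattern list and the name `is_accepted` are assumptions made for illustration, and the check looks only at the path part of the address so a query string does not get in the way.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

# Assumed accept list, in the same wildcard style as *.jpg, *.gif.
ACCEPT_PATTERNS = ["*.jpg", "*.jpeg", "*.gif", "*.png"]


def is_accepted(url, patterns=ACCEPT_PATTERNS):
    """Return True when the path of the address matches one of the wildcard patterns."""
    # Lower-case the path so .JPG and .jpg are treated alike.
    path = urlparse(url).path.lower()
    return any(fnmatch(path, pattern) for pattern in patterns)
```

For example, `is_accepted("http://example.com/pics/Blade.JPG?size=big")` is true, while a thread page ending in `.php` is rejected.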

Downloading the Bladeforums attachments is more difficult. The problem is that you can't see pics if you aren't logged in, and the way webcrawler apps work, I think BF won't recognize the program as a logged-in user. At least, that was a problem my scripts had.
 
ddean said:
I think I've got a complete text-only copy
of all three khuk centered forums/archives.
Well under 300Mb total--
so it'll fit on a cd easily.
Took under 3 hr to dl on my dsl connect

Anyone else successful on any front?
It would be nice to know that we have
at least 3 good complete downloads
so nothing falls thru the cracks.

Good job.

Ddean, Khukuri Monster, anyone else with a backup, please try to verify that you have the content and post confirmation in this thread. Picking several random threads from the index and trying to find them should do for verification.
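
The random spot check could also be scripted. This is only a sketch under assumptions: it supposes the backup holds one file per thread named `<thread id>.html`, which may not match how any particular tool saved the pages, and `spot_check` is a made-up name.

```python
import random
from pathlib import Path


def spot_check(thread_ids, backup_dir, sample_size=10, seed=None):
    """Pick random thread ids and report which of them are missing from the backup."""
    rng = random.Random(seed)
    ids = list(thread_ids)
    sample = rng.sample(ids, min(sample_size, len(ids)))
    backup = Path(backup_dir)
    # Assumed layout: one file per thread, named "<thread id>.html".
    missing = [tid for tid in sample if not (backup / f"{tid}.html").is_file()]
    return sample, missing
```

An empty `missing` list for a decent-sized sample is good evidence, though not proof, that the copy is complete.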

Spark has been very patient and it would be good to get back to him as soon as we can with the information that we have the old threads duplicated. Also, the longer we drag this out, the more bandwidth problems for BF, with multiple people trying to duplicate the site.
 
I'm 98% sure I've got 100% of the text-only threads
from current hi forum & archived hi forum
(I've -only- downloaded the khuk forums )
Will continue checking.

In addition to a cd-rom,
this will fit onto a 512Mb memory chip/card (Palm-top, ramdrive)
& with minimal pruning,
recent dates, etc
may well fit into 256Mb


tried a small test [single thread] mirror last night &
could not get the images to come with the thread
[current thread with available images]
tried several options, but not-a

as mentioned, images are not really a consideration
for the archives.
those pix mostly long, long gone.


anyone trying httrack (& most likely other similar)---
if you -update- a previous site download.....
the default options -delete- old content that
is gone from the original website
to accurately reflect the -current- state of the website.

Always select the option "do not delete old posts"
however it is worded
...under 'build' i think it was

When httrack updates on a site
[unless the server software can tell it
which files are updated or not]
httrack re-downloads almost everything again
so it takes almost as long as the original download.
good reasons for it,
but not for our specific use.
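
For the curious, "the server can tell it which files are updated" usually means a conditional HTTP request: the client sends back the date or tag it saw last time, and the server answers 304 Not Modified if nothing changed. The sketch here shows that general mechanism with Python's standard library; it is not httrack's own code, and the function names are made up for illustration.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, last_modified=None, etag=None):
    """Build a request that asks the server to send the page only if it changed."""
    request = urllib.request.Request(url)
    if last_modified:
        # Value of the Last-Modified header saved from the earlier download.
        request.add_header("If-Modified-Since", last_modified)
    if etag:
        # Value of the ETag header saved from the earlier download.
        request.add_header("If-None-Match", etag)
    return request


def fetch_if_changed(url, last_modified=None, etag=None):
    """Return the new page body, or None when the server answers 304 Not Modified."""
    request = build_conditional_request(url, last_modified, etag)
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return None
        raise
```

When the server does not send those headers, as is common with dynamically generated forum pages, there is nothing to compare against and everything gets fetched again.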


there are options to minimize bandwidth use
by slowing the acquisition to any degree desired


 
Howard Wallace said:
Good job.

Ddean, Khukuri Monster, anyone else with a backup, please try to verify that you have the content and post confirmation in this thread. Picking several random threads from the index and trying to find them should do for verification.

I have a backup of this forum only, not the archives, and only the threads up to about November of 2002. No pictures.

I did verification as part of the procedure to make sure my script was bug-free, so it's all good. There is about 250 MB worth of files, but it will compress down much, much smaller because it's mostly repetitive HTML code, an excellent candidate for any sort of compression algorithm.
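
A quick way to see the effect is to compress a chunk of repetitive markup and compare sizes. The snippet is a sketch using Python's built-in zlib; the sample markup is invented as a stand-in for forum pages, so real archives will compress less dramatically than this artificial case.

```python
import zlib


def compression_ratio(data):
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Invented stand-in for the repeated table markup around every forum post.
row = b"<tr><td class='post'><b>poster</b></td><td>some post text</td></tr>\n"
page = row * 2000

print(f"original: {len(page)} bytes, ratio: {compression_ratio(page):.4f}")
```

The same function can be pointed at the bytes of an actual saved thread to get a realistic figure before zipping the whole set.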

ddean, your text-only stuff ought to compress a great deal also. I bet it would easily go down to 256 MB using Winzip, or possibly even better, WinRAR (www.rarlabs.com)
 
Dean, thanks for the reminder.
I had used the Wayback Machine years ago, then lost the bookmark and forgot the name.

Just did a search on it. It only holds snapshots in time, contrary to my ol' memory, which thought the whole forum was archived.

Sure hope some one can come up with a solution for a DVD or CDs. I'm hopeless in this area.
 
Personally I think it would be a good idea if someone can duplicate Dean's results, always better to have an insurance policy ainnit.;)

I would but I don't have a clue to what Dean is doing or how to go about it. I'm even scared to backup my registry the way Kim Komando says to do it and I'm sure it's safe and foolproof.:rolleyes: :grumpy:

Again in my opinion, if Dean can't copy onto another CD or other media the hard copy needs to be sent to someone who can duplicate it for those wanting a copy.
There should be a small pittance for the one copying and sending the media on to those wanting it for their trouble and to cover their expenses like was done for Nasty when he copied DIJ's CD.

So what do y'all think? :confused: ;) :)
 
Yvsa said:
Again in my opinion, if Dean can't copy onto another CD or other media the hard copy needs to be sent to someone who can duplicate it for those wanting a copy.
There should be a small pittance for the one copying and sending the media on to those wanting it for their trouble and to cover their expenses like was done for Nasty when he copied DIJ's CD.

If we can get zipped archives I can host them for a while on my website. I have the space and extra bandwidth. Then someone can right click on a link to the zipped archive, and "save target as" to a directory on their computer.

Some AOL users can't see my website though, as AOL has some questions about my character. :D Anyone who can see the rotating graphic in my signature (which links to a graphic on my site) should also be able to download from a link to my site. I think those who can't access my site (like Berkley) already know who they are.

This would involve no extra costs on my part, so the "pittance" I would charge would be $0.00. I could conceivably have to shut it down if I ran up against my monthly bandwidth allocation, but I don't think that would happen.
 
I would also be able to host it for free. umich.edu will pick up any bandwidth costs.

When I'm able to get a copy of the archives, I'll put together a little search utility that's more intelligent than the file search, allowing you to search by user name, date, etc.

I kind of wish we had a way to categorize "the best of the best" of the information stored on these archives (or even bladeforums in general!). There is really a lot of good information but it is buried and hard to search. Collect, say, the best information related to sharpening, leather care, etc. (maybe even humor!).

Then, whenever somebody came across a useful bit of information, they could put it in a category so other people would be able to find it easier in the future.

I think the best way to implement something like this would be a system like Wikipedia, where anybody can edit an entry or category.

I guess I got a little off topic there... :D
 
Khukuri Monster said:
I kind of wish we had a way to categorize "the best of the best" of the information stored on these archives (or even bladeforums in general!). There is really a lot of good information but it is buried and hard to search. Collect, say, the best information related to sharpening, leather care, etc. (maybe even humor!).

Then, whenever somebody came across a useful bit of information, they could put it in a category so other people would be able to find it easier in the future.

I think the best way to implement something like this would be a system like Wikipedia, where anybody can edit an entry or category.

I guess I got a little off topic there... :D

Those were basically my thoughts a few years ago when I put together the FAQ. It has evolved since then, but the way I started it was to go through the archives at BF and KF, grabbing the useful content from posts (with the posters' permission) and organizing it.

It took a lot of time. I did get a lot of the old pictures (with permissions) and you can find a lot of material from the old posts there. You may have to look at the tiny links at the bottom of the FAQ index page to find some of the old content.

One option to consider (with Yangdu's permission) would be for someone to update the FAQ with all the good information that has been posted since about 2000. When you sort out the wheat from the chaff the size decreases dramatically and the ability to find stuff increases. We really do have a wealth of information here, even though there is accompanying "chit-chat."

Is this off-topic for a thread titled "do we need a historian?" I don't think so.
 
Howard Wallace said:
One option to consider (with Yangdu's permission) would be for someone to update the FAQ with all the good information that has been posted since about 2000.

I guess what I was thinking differs in one crucial point. Instead of having one person be able to update the FAQ, use a system like a Wiki so that anybody can update it when they find useful info.
 
I'll add my name to the list of people who have no idea what you all are doing, but really appreciate that you are doing it.

Thanks very much guys. It is important.

Bamboo
 
Howard Wallace said:
If we can get zipped archives I can host them for a while on my website. I have the space and extra bandwidth. Then someone can right click on a link to the zipped archive, and "save target as" to a directory on their computer.

Some AOL users can't see my website though, as AOL has some questions about my character. :D Anyone who can see the rotating graphic in my signature (which links to a graphic on my site) should also be able to download from a link to my site. I think those who can't access my site (like Berkley) already know who they are.

This would involve no extra costs on my part, so the "pittance" I would charge would be $0.00. I could conceivably have to shut it down if I ran up against my monthly bandwidth allocation, but I don't think that would happen.


I can see the rotating graphic, but that is it. I get nothing by clicking on it. AOL at it again.
 
lcs37 said:
I can see the rotating graphic, but that is it. I get nothing by clicking on it. AOL at it again.

LMAO! Go into Howard's profile and click on his website, homepage and then see if it works.;)
Neat place to spend some time.:thumbup: :D
 
I've got most of the HI threads. For some reason the archiver didn't grab the first 250 (most recent), but I've got the rest of 'em.

Boy, when they said they were pruning, they weren't kidding...
 
Howard,
I now see your rotating icon, and just logged on to your web page. Maybe vehement cursing works like a form of prayer? :confused:
Berk
 