freenode #wikipedia


2008-01-11 08:28 * Drunken_Idiot slaps brown_cat
2008-01-11 08:28 < lc2> arcimboldo_: that's why i got so many warnings
2008-01-11 08:28 < lc2> arcimboldo_: old revisions of resized FU images are deleted
2008-01-11 08:28 < brown_cat> tanaha: yep, gets that way on the irc (i got an image of leslie, then found out she was a transsexual - that image changed O_O )
2008-01-11 08:28 * brown_cat cuddles kila
2008-01-11 08:28 < arcimboldo_> ok, but trust me that his case is different ...
2008-01-11 08:28 * lc2 shrugs, looks
2008-01-11 08:29 < debian-is-me> Hello
2008-01-11 08:29 < lc2> arcimboldo_: oh i see
2008-01-11 08:29 < debian-is-me> Anyone here?
2008-01-11 08:29 < arcimboldo_> no.
2008-01-11 08:29 < debian-is-me> I want to download and install wikipedia
2008-01-11 08:29 < debian-is-me> How?
2008-01-11 08:29 < kila> brown_cat: hehe get better hearing :P
2008-01-11 08:29 < debian-is-me> I have XAMPP
2008-01-11 08:29 < brown_cat> kila: I can't spell hannah montana (just like the wa education system says, "i tried my best and that's all that matters. Now I will study drama" xD
2008-01-11 08:29 < lc2> arcimboldo_: i see, it is different
2008-01-11 08:29 < lc2> arcimboldo_: shrug, let him delete the warnings
2008-01-11 08:29 < lc2> that won't stop them being deleted
2008-01-11 08:29 < kila> brown_cat: are you talking to Drunken_Idiot
2008-01-11 08:30 < arcimboldo_> well, he deletes the warnings in the images as well ...
2008-01-11 08:30 < lc2> arcimboldo_: okay
2008-01-11 08:30 < lc2> arcimboldo_: hum
2008-01-11 08:30 < lc2> raeeep
2008-01-11 08:31 < Drunken_Idiot> There is a star wars transformers toy :D
2008-01-11 08:38 < debian-is-me> I need to import wikipedia database to my computer
2008-01-11 08:38 < debian-is-me> howto?
2008-01-11 08:38 < lc2> download.wikimedia.org
2008-01-11 08:39 < debian-is-me> Which one?
2008-01-11 08:39 < debian-is-me> I want it in my mysql DB
2008-01-11 08:39 < lc2> you won't get it
2008-01-11 08:40 < lc2> "We don't provide direct dumps of the new 'page', 'revision', and 'text' tables either because aggressive changes to the backend storage make this extra difficult: much data is in fact indirection pointing to another database cluster, and deleted pages which we cannot reproduce may still be present in the raw internal database blobs. The XML dump format provides forward and backward compatibility without requiring authors of third-party dump processing or s
2008-01-11 08:40 < lc2> http://meta.wikimedia.org/wiki/Data_dumps#What_happened_to_the_SQL_dumps.3F
2008-01-11 08:40 < lc2> tl;dr for good reasons
2008-01-11 08:41 < debian-is-me> So I cannot get a slightly older one?
2008-01-11 08:41 < debian-is-me> Not at all?
2008-01-11 08:41 < lc2> the only one you're going to get is the current one with all the old revisions in it
2008-01-11 08:41 < lc2> and that dump's in progress, so you're not going to get that right now, either
2008-01-11 08:41 < debian-is-me> So I cannot have it?
2008-01-11 08:42 < debian-is-me> Because my school has slooow internet
2008-01-11 08:42 < lc2> okay
2008-01-11 08:42 < lc2> ummm
2008-01-11 08:42 < debian-is-me> So I want it on localhost
2008-01-11 08:42 < lc2> you do realise the database dumps are several hundred gigabytes, right
2008-01-11 08:42 < debian-is-me> So I can share, and contribute amongst friends
2008-01-11 08:42 < debian-is-me> That big?
2008-01-11 08:42 < lc2> or some shit like that
2008-01-11 08:42 < debian-is-me> Cannot fit down to 40-50?
2008-01-11 08:42 < lc2> heh
2008-01-11 08:42 < lc2> no.
2008-01-11 08:43 < debian-is-me> I have a few hundred gigs of storage at home
2008-01-11 08:43 < debian-is-me> And I can delete things I don't need
2008-01-11 08:44 < debian-is-me> Can't I use the xml things?
2008-01-11 08:44 < lc2> you have to install mediawiki then import the xml dumps
2008-01-11 08:44 < debian-is-me> I have mediawiki
2008-01-11 08:44 < lc2> downloading the dumps will take forever, importing them alone will probably take several days too
2008-01-11 08:44 < debian-is-me> So which one?
2008-01-11 08:44 < debian-is-me> Link?
2008-01-11 08:44 < lc2> and as i just said
2008-01-11 08:44 < lc2> oh sigh
2008-01-11 08:45 < debian-is-me> http://download.wikimedia.org/backup-index.html
2008-01-11 08:45 < debian-is-me> There
2008-01-11 08:45 < lc2> well done
2008-01-11 08:46 < lc2> http://download.wikimedia.org/enwiki/latest/
2008-01-11 08:46 < lc2> okay
2008-01-11 08:46 < lc2> the abstracts are 1.7 gig compressed, that's just a few sentences from the lead of each article
2008-01-11 08:46 < debian-is-me> oO
2008-01-11 08:47 < debian-is-me> So I need every single one?
2008-01-11 08:47 < lc2> *compressed*, you can multiply that several times when you take into account decompression, storing it in mysql
2008-01-11 08:47 < lc2> no, you don't
2008-01-11 08:47 < debian-is-me> Ok
2008-01-11 08:47 < debian-is-me> Some of them are gzip, some are text
2008-01-11 08:48 < lc2> if your issue is slow loading
2008-01-11 08:48 < debian-is-me> Should I only have one version of every folder?
2008-01-11 08:48 < debian-is-me> file
2008-01-11 08:48 < lc2> uh hold on
2008-01-11 08:48 < Drunken_Idiot> PUZZ 3-D TOWERS MADE-TO-SCALE COLLECTION allows you to see how the world's most impressive towers measure up to one another...and they GLOW in the dark!
2008-01-11 08:49 < lc2> debian-is-me: have you thought about using wapedia.mobi if bandwidth is an issue?
2008-01-11 08:49 < debian-is-me> No
2008-01-11 08:49 < lc2> debian-is-me: because if bandwidth is an issue
2008-01-11 08:49 < lc2> the last thing you want to do is download a wikipedia database dump
2008-01-11 08:49 < debian-is-me> I think FF is trying to open the xml file!
2008-01-11 08:49 < lc2> oh my.
2008-01-11 08:50 < lc2> may i offer thee a protip?
2008-01-11 08:50 < lc2> oh, too late
2008-01-11 08:50 < kenlyric> heh
2008-01-11 08:50 < lc2> you know, if someone needs help *finding* the database dumps (let alone actually installing them)
2008-01-11 08:50 < lc2> then they shouldn't be downloading database dumps
2008-01-11 08:51 < kenlyric> I don't think he is debian...
2008-01-11 08:52 < lc2> wb
2008-01-11 08:52 < debian-is-me> Hello
2008-01-11 08:52 < debian-is-me> It crashed
2008-01-11 08:52 < lc2> debian-is-me: "it"?
2008-01-11 08:52 < debian-is-me> FF
2008-01-11 08:52 < lc2> imagine that
2008-01-11 08:52 < debian-is-me> Can I have the link again?
2008-01-11 08:52 < debian-is-me> It shouldn't have opened the xml file
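[Editor's note] Opening a multi-gigabyte XML dump in a browser loads the whole file at once, which is what went wrong above. A minimal sketch of the alternative, assuming a MediaWiki-style export with `<page>` and `<title>` elements: read the bz2 file as a stream and handle one page at a time. The function names, the sample document, and its namespace string are illustrative assumptions, not taken from the chat or from any official tool.

```python
import bz2
import io
import xml.etree.ElementTree as ET
from typing import IO, Iterator


def iter_page_titles(stream: IO[bytes]) -> Iterator[str]:
    """Yield the text of each <title> element as the parser reaches it."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # Export files put every tag in an XML namespace; strip it before comparing.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "title":
            yield elem.text or ""
        elif tag == "page":
            # Discard the finished page's text and children before moving on.
            elem.clear()


def titles_from_dump(path: str) -> Iterator[str]:
    """Stream titles straight from a .xml.bz2 file without unpacking it to disk."""
    with bz2.open(path, "rb") as stream:
        yield from iter_page_titles(stream)


# Tiny hypothetical sample standing in for a real dump.
SAMPLE = (
    b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">'
    b"<page><title>Alta</title><revision><text>x</text></revision></page>"
    b"<page><title>Norway</title><revision><text>y</text></revision></page>"
    b"</mediawiki>"
)

compressed = io.BytesIO(bz2.compress(SAMPLE))
with bz2.open(compressed, "rb") as demo_stream:
    demo_titles = list(iter_page_titles(demo_stream))
print(demo_titles)
```

For a real download, `titles_from_dump` would be pointed at the local `.xml.bz2` file; importing into MediaWiki itself is a separate step covered by the Help:Import page linked later in the log.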
2008-01-11 08:52 < lc2> debian-is-me: 13:51 < lc2> may i offer thee a protip?
2008-01-11 08:53 < debian-is-me> ok
2008-01-11 08:53 < debian-is-me> what?
2008-01-11 08:53 < lc2> if you need help just *finding* the database dumps, let alone installing them or anything else
2008-01-11 08:53 < lc2> then don't download a database dump
2008-01-11 08:53 < debian-is-me> Why not?
2008-01-11 08:54 < Lycurgus> there ought to be a word for that
2008-01-11 08:54 < lc2> if you need to ask that question, then *definitely* don't download one
2008-01-11 08:54 < debian-is-me> But what else can I do?
2008-01-11 08:54 < lc2> debian-is-me: use wapedia.mobi if you're bandwidth-constrained
2008-01-11 08:54 < Aqwis2> get a new school ;)
2008-01-11 08:55 < debian-is-me> Ok, maybe bandwidth isn't really the biggest issue. At tests we are allowed to use our laptops, but not the internet. We are allowed to use any files we have saved on our computers
2008-01-11 08:56 < debian-is-me> So if I save the whole wikipedia, I have answers to everything
2008-01-11 08:56 < lc2> debian-is-me: haha
2008-01-11 08:56 < lc2> hahaha
2008-01-11 08:56 < Aqwis2> just copy relevant articles to files? you surely won't need every article
2008-01-11 08:56 < debian-is-me> Then I will have to work more than oncd
2008-01-11 08:56 < debian-is-me> once
2008-01-11 08:56 < lc2> "i can't use the internet, so instead, i'll save it to my laptop beforehand"
2008-01-11 08:56 < lc2> srsly
2008-01-11 08:56 < debian-is-me> Yes
2008-01-11 08:56 < lc2> you don't see the problem there?
2008-01-11 08:56 < debian-is-me> It is geniously
2008-01-11 08:56 < Drunken_Idiot> Why do Scotsmen wear kilts? Because the sound of zippers scares the sheep away.
2008-01-11 08:57 < debian-is-me> Have a big encyclopedia on my computer
2008-01-11 08:57 < lc2> debian-is-me: you're basically asking us to help you cheat
2008-01-11 08:57 < Aqwis2> no need to post that in *both* channels, Drunken_Idiot ;(
2008-01-11 08:57 < lc2> all the moar reason i'm not going to help you
2008-01-11 08:57 < Aqwis2> legal, but immoral
2008-01-11 08:57 < Aqwis2> ;<
2008-01-11 08:57 < Lycurgus> hoot Man!
2008-01-11 08:58 < debian-is-me> Immoral? We are allowed to copy anything we please to our harddrives
2008-01-11 08:58 < debian-is-me> And as big a portion of the web as we can
2008-01-11 08:58 < Zach_> WP QOTD: "I'm a doctoral student at NYU, doing research and preparing for a dissertation I'll have to defend at length before a committee of professionals in my field. But I have no clue on how to win an argument on wikipedia."
2008-01-11 08:58 < Aqwis2> then why aren't you allowed to use the internet, debian-is-me?
2008-01-11 08:59 < debian-is-me> Because they want us to prepare
2008-01-11 08:59 < debian-is-me> and read the stuff
2008-01-11 08:59 < Luna-San> Zach_: Heh, too true. ;) We're a tough nut to crack.
2008-01-11 08:59 < Aqwis2> odd.
2008-01-11 08:59 < lc2> debian-is-me: downloading the whole of wikipedia is not "preparing"
2008-01-11 08:59 < Weaselosaurus> why hello
2008-01-11 08:59 < lc2> that's just lazy
2008-01-11 08:59 < lc2> hi Weaselosaurus
2008-01-11 08:59 < Weaselosaurus> :-)
2008-01-11 09:00 < debian-is-me> I will be prepared
2008-01-11 09:00 < lc2> debian-is-me: hokay
2008-01-11 09:00 < Drunken_Idiot> Why would anyone want to download the whole wikipedia .. you cant sell it :p
2008-01-11 09:00 < kenlyric> you know, they have the internet on computers now.
2008-01-11 09:00 < debian-is-me> I can!
2008-01-11 09:00 < lc2> Drunken_Idiot: "preparing" for an exam"
2008-01-11 09:01 < debian-is-me> And I will!
2008-01-11 09:01 < lc2> -!
2008-01-11 09:01 < lc2> -(-!)-"
2008-01-11 09:01 < lc2> goodness
2008-01-11 09:01 < Drunken_Idiot> Wikipedia isn't 100% accurate and right in the articles.
2008-01-11 09:01 < lc2> Drunken_Idiot: he's allowed to use computers, but not the internet, so he's just going to bring the internet with him
2008-01-11 09:01 < debian-is-me> kenlyric: I need it for my exams
2008-01-11 09:01 < Aqwis2> Drunken_Idiot, it's probably far more accurate than needed for his use
2008-01-11 09:01 < debian-is-me> It is
2008-01-11 09:01 < debian-is-me> I only need a clue about what to write
2008-01-11 09:02 < Drunken_Idiot> debian-is-me: You can write on NotACow
2008-01-11 09:02 < Drunken_Idiot> I meant NotASpy
2008-01-11 09:02 < nazgjunk> actually preparing gives clues enough, i'd say
2008-01-11 09:02 < kenlyric> actually, this is fairly devious.
2008-01-11 09:02 < Aqwis2> but i still don't see why you can't just download relevant articles
2008-01-11 09:02 < arcimboldo_> Just choose [[Special:Random]] and write about that topic.
2008-01-11 09:02 < debian-is-me> Then I would have to do that for every single test
2008-01-11 09:02 < Drunken_Idiot> Aqwis2: Upload = Consumption high slows down the server.
2008-01-11 09:02 < Drunken_Idiot> NotASpy: Making you famous :D
2008-01-11 09:02 < kenlyric> I'm waiting for the story after the exam where he failed for cheating :)
2008-01-11 09:02 < Aqwis2> Drunken_Idiot, i'm not sure why you tell me that
2008-01-11 09:02 < debian-is-me> 13mb downloaded, only 3500 left
2008-01-11 09:03 < lc2> debian-is-me: yeah, downloading THE WHOLE OF FUCKING WIKIPEDIA makes more sense than reading one wikipedia article before each test?
2008-01-11 09:03 < debian-is-me> yes
2008-01-11 09:03 < kenlyric> or, you know, actually learning the stuff you're being tested on.
2008-01-11 09:03 < lc2> okay.
2008-01-11 09:03 < debian-is-me> Wikipedia isnt that big
2008-01-11 09:03 < lc2> i don't quite follow your logic, cleric
2008-01-11 09:03 < lc2> debian-is-me: !! it's fucking enormous
2008-01-11 09:03 < debian-is-me> 4.5gb
2008-01-11 09:03 < debian-is-me> my version is 3.3gb
2008-01-11 09:04 < debian-is-me> Not much space
2008-01-11 09:04 * Drunken_Idiot buh bye have to give lectures on wikipedia ( seriously).
2008-01-11 09:04 < lc2> debian-is-me: it's a little bit moar than that.
2008-01-11 09:04 < lc2> i assure you
2008-01-11 09:05 < debian-is-me> You think this is cheating?
2008-01-11 09:05 < kenlyric> it's all text, so it certainly must compress very very small.
2008-01-11 09:05 < Lycurgus> other than the culture of mathematics, and least of all general american culture, I can't think of a single one where the argumentation capability has the significance a rationalist would desire.
2008-01-11 09:05 < lc2> debian-is-me: the stubs alone are 6.2gb
2008-01-11 09:05 < lc2> debian-is-me: and yes, i do
2008-01-11 09:05 < Lycurgus> s/the/
2008-01-11 09:05 < kenlyric> so if it's 3gb compressed...
2008-01-11 09:05 < lc2> (that's 6.2gb compressed)
2008-01-11 09:05 < Lycurgus> s/the//
2008-01-11 09:05 < lc2> kenlyric: if you're finding one which is 3gb you are doing it wrong
2008-01-11 09:05 < kenlyric> I wouldn't be concerned what we think. I'd be concerned what your teacher will think. :)
2008-01-11 09:05 < kenlyric> lc2: he said his was 3gb.
2008-01-11 09:05 < lc2> oops
2008-01-11 09:05 < debian-is-me> no
2008-01-11 09:05 < lc2> kenlyric: yeah, i meant him not you sorry ;\
2008-01-11 09:06 < debian-is-me> google compresses things to 1/10
2008-01-11 09:06 < lc2> debian-is-me: the point of having no internet is so that you can't get easy answers from, say, wikipedia
2008-01-11 09:06 * kenlyric crashes into wall
2008-01-11 09:06 < kenlyric> google?
2008-01-11 09:06 < lc2> wikipedia is the internet
2008-01-11 09:06 < Lycurgus> and the WP troll culture or the snake pit of a general doctoral dissertation certainly not
2008-01-11 09:07 < debian-is-me> lc2: google have stored most of wikipedia
2008-01-11 09:07 < lc2> debian-is-me: you're missing my point
2008-01-11 09:07 < debian-is-me> 1% of wikipedia downloaded
2008-01-11 09:07 < kenlyric> are you sure you're debian?
2008-01-11 09:07 < debian-is-me> ?
2008-01-11 09:08 < debian-is-me> oh
2008-01-11 09:08 < debian-is-me> My nick
2008-01-11 09:09 < debian-is-me> But this is so genious that nobody have thought about it before
2008-01-11 09:09 < kenlyric> hahaha
2008-01-11 09:09 < lc2> that's "genius", and no.
2008-01-11 09:09 < kenlyric> I award you 10 points of fail.
2008-01-11 09:09 < Lycurgus> well give it some credit
2008-01-11 09:09 < lc2> kenlyric: i'll add 5 points of aids to that
2008-01-11 09:09 < Lycurgus> for identification with a product worth identifying with.
2008-01-11 09:09 < debian-is-me> Wish I had an 220TB hd
2008-01-11 09:10 < kenlyric> I once took a test where the teacher allowed us to bring one page filled with notes, formulas, etc.
2008-01-11 09:10 < kenlyric> I printed the page at 3pt font.
2008-01-11 09:10 < lc2> kenlyric: *that* is genius
2008-01-11 09:10 < kenlyric> I think I tried 2 and it all blurred together.
2008-01-11 09:10 < debian-is-me> kenlyric: I got an assignment like that
2008-01-11 09:10 < Lycurgus> was the page size specified?
2008-01-11 09:11 < lc2> Lycurgus: haha
2008-01-11 09:11 < lc2> B0+
2008-01-11 09:11 < lc2> fgj
2008-01-11 09:11 < kenlyric> yes.
2008-01-11 09:11 < debian-is-me> I scanned the entire chapter, and printed it in small writing on both sides
2008-01-11 09:11 < lc2> kenlyric: were microfiche readers allowed?
2008-01-11 09:11 < lc2> ;x
2008-01-11 09:12 < Lycurgus> mindless testing calls for cheating
2008-01-11 09:12 < kenlyric> this wasn't mindless, or cheating.
2008-01-11 09:12 < Lycurgus> like the Kobayashi Maru maneuver
2008-01-11 09:12 < kenlyric> knowing the formulas certainly didn't make the test a breeze :)
2008-01-11 09:13 < debian-is-me> Many of my test preparation sheets can be downloaded from the net
2008-01-11 09:14 < Lycurgus> debian-is-me: is this a US school?
2008-01-11 09:14 < debian-is-me> done Articles, templates, image descriptions, and primary meta-pages.
2008-01-11 09:14 < debian-is-me> * This contains current versions of article content, and is the archive most mirror sites will probably want.
2008-01-11 09:14 < debian-is-me> * pages-articles.xml.bz2 3.2 GB
2008-01-11 09:14 < debian-is-me> No
2008-01-11 09:14 < debian-is-me> Norwegian
2008-01-11 09:14 < debian-is-me> It contains article content
2008-01-11 09:15 < debian-is-me> Which is articles
2008-01-11 09:15 < debian-is-me> And it is only 3.2gb
2008-01-11 09:15 < Aqwis2> debian-is-me, heh
2008-01-11 09:15 < Aqwis2> debian-is-me, what level?
2008-01-11 09:15 < debian-is-me> level?
2008-01-11 09:15 < Lycurgus> of static wikipedia
2008-01-11 09:15 < Aqwis2> like
2008-01-11 09:16 < debian-is-me> 2008-01-09 00:25:32
2008-01-11 09:16 < Aqwis2> debian-is-me, ungdomsskole/vgs/høgskole/universitet
2008-01-11 09:16 < debian-is-me> vgs
2008-01-11 09:16 < Aqwis2> ah
2008-01-11 09:16 < debian-is-me> Norsk
2008-01-11 09:16 < Aqwis2> 1st/2nd/3rd year?
2008-01-11 09:16 < debian-is-me> 1st
2008-01-11 09:16 < Aqwis2> where?
2008-01-11 09:16 < debian-is-me> Lata
2008-01-11 09:16 < debian-is-me> Alta
2008-01-11 09:16 < Aqwis2> ah
2008-01-11 09:16 < debian-is-me> Hope youre not the IT guy...
2008-01-11 09:17 < Aqwis2> no, i've never even been to Alta
2008-01-11 09:17 < debian-is-me> Good
2008-01-11 09:17 < Aqwis2> :)
2008-01-11 09:17 < debian-is-me> He is quite stupid
2008-01-11 09:17 < debian-is-me> He "formatted" my hd, but debian didn't disappear
2008-01-11 09:17 < Aqwis2> eheh
2008-01-11 09:18 < debian-is-me> When a battery failed, he reinstalled windows
2008-01-11 09:18 < debian-is-me> After that, since it didn't solve the problem, he ordered a new HD.
2008-01-11 09:18 < debian-is-me> Dumbass
2008-01-11 09:19 < lc2> debian-is-me: btw, http://en.wikipedia.org/wiki/Help:Import
2008-01-11 09:19 < Viele-baeren> hi
2008-01-11 09:19 < debian-is-me> I cannot understand why they would hire someone that stupid.
2008-01-11 09:19 < Lycurgus> hi Viele-baeren
2008-01-11 09:20 < Lycurgus> vgs is a vocational track?
2008-01-11 09:20 < Viele-baeren> hi Lycurgus
2008-01-11 09:20 < lc2> hi Viele-baeren
2008-01-11 09:20 < debian-is-me> It is the school between university and high school
2008-01-11 09:20 < Viele-baeren> hi lc2
2008-01-11 09:20 < Aqwis2> vgs is high school :p
2008-01-11 09:20 < Lycurgus> oh, like a junior or community college in US
2008-01-11 09:21 < debian-is-me> No, first is primary school
2008-01-11 09:21 < debian-is-me> Then high school
2008-01-11 09:21 < debian-is-me> then college
2008-01-11 09:21 < debian-is-me> then university
2008-01-11 09:21 < Aqwis2> heh, no
2008-01-11 09:21 < debian-is-me> Right?
2008-01-11 09:21 < Aqwis2> vgs is "upper secondary school"
2008-01-11 09:21 < kenlyric> sixth form college
2008-01-11 09:21 < Aqwis2> according to the Ministry of
2008-01-11 09:21 < Aqwis2> Education and Research
2008-01-11 09:22 < debian-is-me> I lived in Scotland a while back
2008-01-11 09:22 < Aqwis2> ages 16-19
2008-01-11 09:22 < kenlyric> I learned it from AbFab!
2008-01-11 09:22 < debian-is-me> ok
2008-01-11 09:22 < debian-is-me> This is brilliant
2008-01-11 09:22 < debian-is-me> Can I update my wikipedia?
2008-01-11 09:23 < debian-is-me> Or would I have to download all over again?
2008-01-11 09:23 < kenlyric> just make your own edits. Create your own reality.
2008-01-11 09:23 < kenlyric> play the jimbo role-playing game
2008-01-11 09:23 < lc2> debian-is-me: download all over again
2008-01-11 09:24 < debian-is-me> Much work
2008-01-11 09:24 < lc2> yes
2008-01-11 09:24 < debian-is-me> How often, once a year?
2008-01-11 09:25 < quanticle> debian-is-me: You have your own local copy of Wikipedia?
2008-01-11 09:25 < lc2> i don't know how often dumps are generated
2008-01-11 09:25 < lc2> quanticle: that's what he's working on
2008-01-11 09:25 < quanticle> lc2: Why?
2008-01-11 09:25 < lc2> quanticle: he's allowed a laptop in tests, but no internet, so he's downloading the internet instead
2008-01-11 09:25 < debian-is-me> exactly
2008-01-11 09:25 < quanticle> lc2: I like it
2008-01-11 09:25 < lc2> quanticle: quite cunning
2008-01-11 09:26 < quanticle> debian-is-me: It's all about the loopholes, isn't it?
2008-01-11 09:26 < debian-is-me> yes
2008-01-11 09:26 < debian-is-me> If I had a 220TB hd I would have downloaded google
2008-01-11 09:26 < lc2> heh
2008-01-11 09:27 < debian-is-me> Or just use the unsecured wireless network
2008-01-11 09:27 < debian-is-me> Both work fine
2008-01-11 09:27 < Aqwis2> you'd need far more than 220TB =/
2008-01-11 09:27 < debian-is-me> Nop
2008-01-11 09:27 < debian-is-me> In 2006
2008-01-11 09:27 < debian-is-me> Google used 220 for the pages
2008-01-11 09:27 < Aqwis2> this is 2008 though ;(
2008-01-11 09:27 < lc2> i find that difficult to believe
2008-01-11 09:28 < debian-is-me> But it is compressed to a 1/10
2008-01-11 09:28 < lc2> and what Aqwis2 said
2008-01-11 09:28 < debian-is-me> so it is really 2200TB
2008-01-11 09:28 < debian-is-me> Yes, but the internet isn't twice the size
2008-01-11 09:28 < lc2> they don't usually operate from disk, anyway
2008-01-11 09:28 < debian-is-me> If I have one 220TB HD I can get another one
2008-01-11 09:28 < Aqwis2> the internet isn't twice the size
2008-01-11 09:28 < Aqwis2> the internet is 1000x the size
2008-01-11 09:28 < Aqwis2> at least.
2008-01-11 09:29 < lc2> debian-is-me: don't bet on it, think of all the spam pages from shitbag "SEOs"
2008-01-11 09:29 < debian-is-me> I hate when google returns search pages in its results
2008-01-11 09:29 < lc2> yeah me too
2008-01-11 09:29 < debian-is-me> google has 450000 servers
2008-01-11 09:30 < debian-is-me> 450000*80=
2008-01-11 09:30 < lc2> that's a conservative estimate
2008-01-11 09:30 < lc2> what's with the 80?
2008-01-11 09:30 < debian-is-me> a cheap hd
2008-01-11 09:30 < lc2> most of their servers do not keep stuff on disk
2008-01-11 09:30 < debian-is-me> that is: 36000000gb
2008-01-11 09:30 < Aqwis2> because google using 80gb hds is so likely
2008-01-11 09:31 < debian-is-me> They use it in ram?
2008-01-11 09:31 < lc2> yes
2008-01-11 09:31 < debian-is-me> That is 36000TB
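[Editor's note] The back-of-envelope estimate in the preceding lines can be checked with a few lines of Python. This is only a sketch of the arithmetic as stated in the chat: the server count and the 80 GB disk size are the chat's own unverified figures, and decimal units (1 TB = 1000 GB) are assumed because that is how the chat converts.

```python
# Figures quoted in the chat; neither is verified here.
SERVERS = 450_000          # claimed number of Google servers
DISK_GB_PER_SERVER = 80    # "a cheap hd", the chat's assumption


def total_storage_gb(servers: int, gb_per_server: int) -> int:
    """Total disk capacity if every server held one disk of the given size."""
    return servers * gb_per_server


def gb_to_tb(gigabytes: int) -> float:
    """Convert using decimal units (1 TB = 1000 GB), as the chat does."""
    return gigabytes / 1000


total_gb = total_storage_gb(SERVERS, DISK_GB_PER_SERVER)
total_tb = gb_to_tb(total_gb)
print(f"{total_gb} GB = {total_tb:.0f} TB")
```

The result is an upper bound on raw disk under those assumptions, not a measurement of the web; as lc2 points out, most of those servers would not be keeping pages on disk anyway.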
2008-01-11 09:31 < debian-is-me> So the web cannot be much bigger than that
2008-01-11 09:31 < lc2> the web is bigger than that
2008-01-11 09:31 < debian-is-me> uncompressed
2008-01-11 09:31 < debian-is-me> Not the text pages
2008-01-11 09:31 < debian-is-me> video might be
2008-01-11 09:32 < lc2> there's plenty that doesn't get indexed
2008-01-11 09:32 < debian-is-me> but not the cached pages without images
2008-01-11 09:32 < Aqwis2> http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/internet.htm
2008-01-11 09:32 < Aqwis2> "Table 8.1: The size of the Internet in terabytes."
2008-01-11 09:32 < Aqwis2> Total: 532,897
2008-01-11 09:32 < debian-is-me> lc2: But I will never find them anyway
2008-01-11 09:32 < kenlyric> google doesn't have all of the web.
2008-01-11 09:32 < Aqwis2> 532 897 tb
2008-01-11 09:32 < kenlyric> robots.txt
2008-01-11 09:32 < Aqwis2> have fun downloading it
2008-01-11 09:32 < debian-is-me> Aqwis: Movies, images, sound, a lot of useless stuff on an exam
2008-01-11 09:32 < lc2> internet.
2008-01-11 09:32 < lc2> srs business.
2008-01-11 09:32 < debian-is-me> Linux distros alone are several TB
2008-01-11 09:33 < kenlyric> Aqwis2: the web is 92,000
2008-01-11 09:33 < Aqwis2> kenlyric, yes i can see that
2008-01-11 09:33 < Aqwis2> lots of emails
2008-01-11 09:33 < Aqwis2> anyway, that's 2002 numbers
2008-01-11 09:33 < Aqwis2> the numbers are probably far higher now
2008-01-11 09:33 < debian-is-me> surface web 160TB
2008-01-11 09:33 < kenlyric> and what, 95% of email is spam?
2008-01-11 09:34 < debian-is-me> That is about right for what I said
2008-01-11 09:34 < Aqwis2> what is "surface web", though?
2008-01-11 09:34 < kenlyric> that a buncha-muchu-cruncha lot of spam.
2008-01-11 09:34 < kenlyric> Aqwis2: what it says.
2008-01-11 09:34 < debian-is-me> instant messaging isn't on the web
2008-01-11 09:34 < lc2> Aqwis2: that which isn't the unindexable deep web
2008-01-11 09:34 < kenlyric> Note that the Web consists of the surface web (fixed web pages) and what Bright Planet calls the deep web (the database driven websites that create web pages on demand).
2008-01-11 09:34 < debian-is-me> It is traffic
2008-01-11 09:34 < lc2> yeah what kenlyric said
2008-01-11 09:34 < Aqwis2> hmz
2008-01-11 09:34 < lc2> i think they're full of shit though
2008-01-11 09:34 < lc2> by definition
2008-01-11 09:34 < lc2> the deep web is that which can't be discovered automatically
2008-01-11 09:35 < debian-is-me> 167TB, isn't that quite close, since my number is 220TB
2008-01-11 09:35 < lc2> so there's no way they can know
2008-01-11 09:35 < debian-is-me> and that was 2003, mine are from 2005
2008-01-11 09:35 < lc2> debian-is-me: ur numbers is sux
2008-01-11 09:35 < lc2> nobody knows how big the internet is
2008-01-11 09:35 < lc2> ^D
2008-01-11 09:35 < debian-is-me> Aqwiz2: Confirmed my statement
2008-01-11 09:35 < Aqwis2> it's Aqwis
2008-01-11 09:35 < Aqwis2> sigh
2008-01-11 09:35 < debian-is-me> Google has 220TB in 2005
2008-01-11 09:35 < lc2> and besides
2008-01-11 09:35 < Aqwis2> use your <tab>
2008-01-11 09:35 < lc2> there's only one site on the internet that actually matters
2008-01-11 09:36 < lc2> WIKI FUCKING PEDIA
2008-01-11 09:36 < debian-is-me> why use tab?
2008-01-11 09:36 < lc2> let's not talk about the others
2008-01-11 09:36 < debian-is-me> lc2:
2008-01-11 09:36 < debian-is-me> lc2:
2008-01-11 09:36 < Aqwis2> http://blog.karmona.com/index.php/2007/09/26/the-size-of-the-internet/
2008-01-11 09:36 < Aqwis2> "= ~1300* Tera = ~1.3 Peta of known/indexed web which might hide a ~600 Peta of deeper web…"
2008-01-11 09:36 < debian-is-me> It only shows lc2
2008-01-11 09:36 < lc2> Aqwis2: [[WP:RS]]? ;P
2008-01-11 09:36 < Aqwis2> very rough though
2008-01-11 09:36 < Aqwis2> indeed, lc2 :p
2008-01-11 09:36 < debian-is-me> lc2:
2008-01-11 09:36 < debian-is-me> lc2: Why does tab write lc2:?
2008-01-11 09:37 < Sfan00> Where was the Wikipedia growth graph?
2008-01-11 09:37 < debian-is-me> "The size of the Web is 800 million pages, and the biggest search engine only covers about 16% of it."
2008-01-11 09:37 < debian-is-me> Bullshit
2008-01-11 09:37 < debian-is-me> Google has 3 billion
2008-01-11 09:37 < kenlyric> written in 1999.
2008-01-11 09:37 < debian-is-me> yahoo has 17 billions
2008-01-11 09:37 < debian-is-me> Oh
2008-01-11 09:37 < Aqwis2> mmm
2008-01-11 09:37 < kenlyric> that's what the "1999" means on the very next line.
2008-01-11 09:38 < Sfan00> Google is in for a nasty shock when the web bubble bursts
2008-01-11 09:38 < lc2> Sfan00: you mean when everyone stops using the web to look for stuff?
2008-01-11 09:38 < lc2> yeah, serious trouble
2008-01-11 09:38 < Aqwis2> hehe
2008-01-11 09:38 < Sfan00> lc2: No, when people actually start to use the web sensibly
2008-01-11 09:38 < Sfan00> ;)
2008-01-11 09:38 < lc2> Sfan00: like going straight to wikipedia?
2008-01-11 09:38 < lc2> you're right
2008-01-11 09:38 < Sfan00> and Search engines are declared public domain ;)
2008-01-11 09:39 < debian-is-me> google searches wikipedia better than wikipedia
2008-01-11 09:39 < kenlyric> Sfan00: so, you mean "never"
2008-01-11 09:39 < lc2> debian-is-me: yes it does
2008-01-11 09:39 < debian-is-me> Results 1 - 50 of about 6,770,000,000 for e [definition]. (0.13 seconds)
2008-01-11 09:39 < lc2> but that's what the "Go" button is about
2008-01-11 09:39 < debian-is-me> It is almost 7 billion pages
2008-01-11 09:39 < debian-is-me> in google
2008-01-11 09:39 < lc2> Results 1 - 10 of about 12,070,000,000 for a.
2008-01-11 09:39 < Aqwis2> :p
2008-01-11 09:39 < debian-is-me> WTF?
2008-01-11 09:40 < debian-is-me> a is used less than e
2008-01-11 09:40 < Aqwis2> only in english and several other germanic languages afaik?
2008-01-11 09:40 < lc2> GOOGLE DOES NOT AGREE WITH YOUR ASSESSMENT
2008-01-11 09:40 < debian-is-me> Results 1 - 50 of about 1,090,000,000 for e OR a. (0.27 seconds)
2008-01-11 09:40 < lc2> debian-is-me: wow, what the shit
2008-01-11 09:40 < debian-is-me> How is that possible?
2008-01-11 09:40 < debian-is-me> less for e or a than for e
2008-01-11 09:41 < debian-is-me> Results 1 - 50 of about 3,450,000,000 for e or a. (0.42 seconds)
2008-01-11 09:41 < debian-is-me> With lowercase it is more
2008-01-11 09:41 < Aqwis2> http://re.search.wikia.com/search#e tbh
2008-01-11 09:41 < debian-is-me> http://re.search.wikia.com/search#wikipedia
2008-01-11 09:42 < debian-is-me> That's a funny result
2008-01-11 09:42 < debian-is-me> the wikipedia.de
2008-01-11 09:42 < Aqwis2> en-wikipedia does not exist :)
2008-01-11 09:42 < debian-is-me> instead of .org
2008-01-11 09:42 < debian-is-me> Wikipedia.de is both first and 5
2008-01-11 09:42 < debian-is-me> th
2008-01-11 09:43 < debian-is-me> Maybe not the best search engine in the world?
2008-01-11 09:43 < datura> lol
2008-01-11 09:43 < Aqwis2> it's the other search engines that suck
2008-01-11 09:44 < Aqwis2> irrelevant results > relevant results
2008-01-11 09:44 < debian-is-me> Nutch is best
2008-01-11 09:44 < debian-is-me> Maybe I should just index the web myself
2008-01-11 09:45 < debian-is-me> First I need a compression method that can compress everything down to 50gb
2008-01-11 09:45 < debian-is-me> 6% of wikipedia downloaded
2008-01-11 09:45 < pengo> debian-is-me: easy. just index wikipedia. the rest of the web is irrelevant anyway
2008-01-11 09:46 < debian-is-me> pengo: I have started the download
2008-01-11 09:46 < debian-is-me> pengo: Done in 11 hours
2008-01-11 09:46 < datura> debian-is-me: but first you need to hack together some 100 stemmers
2008-01-11 09:46 < debian-is-me> stemmers?
2008-01-11 09:47 < debian-is-me> voters?
2008-01-11 09:47 < datura> debian-is-me: http://en.wikipedia.org/wiki/Stemming
2008-01-11 09:47 < debian-is-me> ok
2008-01-11 09:47 < debian-is-me> No
2008-01-11 09:47 < datura> stemming is essential for a search engine.
2008-01-11 09:48 < debian-is-me> I will have the entire wikipedia on my computer in 12 hours
2008-01-11 09:48 < debian-is-me> And if I wanted a search engine I would use open source nutch
2008-01-11 09:49 < debian-is-me> Silence...
2008-01-11 09:50 < debian-is-me> Then I'll go eat
2008-01-11 09:52 < Tony_Sidaway> What's all this fuss about rollback?
2008-01-11 09:59 < lc2> Tony_Sidaway: i think someone just wanted to cause some drama
2008-01-11 09:59 < lc2> seriously, most arguments on wikipedia are just people who like starting fights
2008-01-11 09:59 < lc2> i'm not even going to vote on that shit for fear of feeding them
2008-01-11 10:00 < arcimboldo_> that's not true!!! nobody fights!!!
2008-01-11 10:02 < Tony_Sidaway> I'm really rather surprised. I find it difficult to see rollback powers as controversial or meriting a special process.
2008-01-11 10:02 < Tony_Sidaway> If somebody wants rollback, give it to them. If they abuse it. take it away. Full stop.
2008-01-11 10:03 < lc2> indeed
2008-01-11 10:03 < NotACow> mooooooooo
2008-01-11 10:04 < NotACow> Tony_Sidaway: that's not what it'll turn into though
2008-01-11 10:04 < lc2> oh talking of drama-raisers, hi NotACow
2008-01-11 10:04 < lc2> .. :P
2008-01-11 10:04 < NotACow> Tony_Sidaway: before long RFR will be as controversial as RFA, and a successful RFR will be a full-stop requirement for RFA.
2008-01-11 10:05 < quanticle> NotACow: It's the rule of all societies. At first everything is pretty free and open, but as time goes on, the requirements to get privileges increase
2008-01-11 10:06 < Tony_Sidaway> NotACow: well that's what I mean by "all this fuss". It seems we're getting instruction creep on features that have just been introduced!
2008-01-11 10:07 < quanticle> Tony_Sidaway: How much fuss is there over RFR right now?
2008-01-11 10:07 < Tony_Sidaway> It's being inculcated into the silly pecking order the newbies are organising
2008-01-11 10:08 < Tony_Sidaway> quanticle: there was a six-day discussion (somewhat controversial in itself), the feature was turned on and then somebody started a vote, which turned into an edit war.
2008-01-11 10:08 < datura> *sigh* just drop the damn undo and give everyone a revert button. make it easy to do the right thing for the good editors instead of difficult for the few bad ones.
2008-01-11 10:08 < Tony_Sidaway> Meanwhile there is a request for arbitration (which is stalling)
2008-01-11 10:08 < NotACow> especially since the devs were told that there was consensus based on a 2:1 margin
2008-01-11 10:08 < datura> to do bad things.
2008-01-11 10:08 < NotACow> in general 2:1 is not considered "consensus" on wikipedia
2008-01-11 10:08 < quanticle> Tony_Sidaway: Heh, what do you expect? This is Wikipedia
2008-01-11 10:08 < quanticle> NotACow: Who told the devs that?
2008-01-11 10:09 < Tony_Sidaway> Consensus isn't an issue here as far as I'm concerned. It's a good feature to have, and if somebody doesn't want to use it he doesn't have to.
2008-01-11 10:09 < NotACow> quanticle: ryan posthistlewait and majorly, iirc
2008-01-11 10:09 < NotACow> Tony_Sidaway: my opinion is that rollback should be granted to all autoconfirmed users, and removed when abused.
2008-01-11 10:09 < Tony_Sidaway> It's like asking for consensus on the ability to make edit summaries. They can be abused too, and cause permanent damage unlike rollback.
2008-01-11 10:09 < NotACow> granted automatically
2008-01-11 10:10 < White_Cat> Tony_Sidaway well
2008-01-11 10:10 < White_Cat> all users should have move rights rollback and etc
2008-01-11 10:10 < Tony_Sidaway> If somebody actually held a discussion on edit summaries, I bet it would be a close run thing. :)
2008-01-11 10:10 < White_Cat> this can be disabled on abuse
2008-01-11 10:10 < White_Cat> having a vote over it is stupid
2008-01-11 10:11 < White_Cat> Tony_Sidaway people can revert war w/o this tool
2008-01-11 10:11 < Tony_Sidaway> Yes. If somebody edit wars with rollback, he's asking for a good long block.
2008-01-11 10:11 < White_Cat> 3rr applies
2008-01-11 10:11 < quanticle> White_Cat: I agree. I don't know what Ryanpostlethewait and others are going on about. So what if there wasn't 100% consensus? If you have to wait for 100% consensus before doing anything, you'll be waiting forever...
2008-01-11 10:11 < Tony_Sidaway> Unlike other reverts, a rollback leaves no room for an edit summary.
2008-01-11 10:12 < White_Cat> Tony_Sidaway right
2008-01-11 10:12 < White_Cat> why need an edit summary on a revert
2008-01-11 10:12 < Tony_Sidaway> White_Cat: to explain why you're undoing somebody else's edit.
2008-01-11 10:12 < White_Cat> Reverts should not be used
2008-01-11 10:12 < quanticle> White_Cat: Especially because all legitimate reverts will be used to stop vandals
2008-01-11 10:12 < datura> faster revert for everyone has an additional benefit: editwarriors run into 3rr faster *g*
2008-01-11 10:13 < Doc_glasgow> feel free to add [[User:Doc glasgow/List of ways wikipedia is like a public toilet]]
2008-01-11 10:13 < Tony_Sidaway> Anyway I'm (probably) not going to comment on the wiki because it'll just add to the noise.
2008-01-11 10:13 < White_Cat> Tony_Sidaway: http://en.wikipedia.org/wiki/Help:Reverting#Do_not
2008-01-11 10:13 < lc2> Doc_glasgow: i'm going to vandalise it
2008-01-11 10:13 < White_Cat> if people observe that we dont need an edit summary on reverts
2008-01-11 10:14 < White_Cat> Tony_Sidaway people can post their rationale for the revert on the talk page
2008-01-11 10:14 * Tony_Sidaway adds "cos you can meet lots of nasty men and have pervy sex with them" to Doc_glasgow's page.
2008-01-11 10:14 < White_Cat> Reverting is used primarily for fighting vandalism, or anything very similar to the effects of vandalism.
2008-01-11 10:14 < White_Cat> If you are not sure whether a revert is appropriate, discuss it first rather than immediately reverting or deleting it.
2008-01-11 10:14 < White_Cat> If you feel the edit is unsatisfactory, improve it rather than simply reverting or deleting it.
2008-01-11 10:14 < Doc_glasgow> Tony_Sidaway: as long as you sign it, that's fine with me
2008-01-11 10:14 < White_Cat> does that make sense?
2008-01-11 10:15 < lc2> Doc_glasgow: vandalised
2008-01-11 10:15 < White_Cat> Doc_glasgow you want Tony_Sidaway to sign your body?
2008-01-11 10:15 * NotACow is watching woodporn
2008-01-11 10:15 < lc2> :D
2008-01-11 10:15 < quanticle> NotACow: woodpr0n? Is that like pollination or something?
2008-01-11 10:15 < White_Cat> Tony_Sidaway
2008-01-11 10:16 < White_Cat> where is this discussion
2008-01-11 10:16 < NotACow> quanticle: nah, it's a woodturning show :
2008-01-11 10:16 < NotACow> :)
2008-01-11 10:16 < lc2> Doc_glasgow: goodness, thanks for the message on my talk page
2008-01-11 10:16 < NotACow> quanticle: you'd understand if you were a turner
2008-01-11 10:16 < lc2> IRONY MOTHERFUCKER, DO YOU KNOW IT
2008-01-11 10:16 < Tony_Sidaway> White_Cat: WP:RFR and then follow the trail of angry edits
2008-01-11 10:16 < Doc_glasgow> lc2: you are welcome ;)
2008-01-11 10:18 < lc2> Doc_glasgow: ;x
2008-01-11 10:18 < Doc_glasgow> lc2: I've added some vandalism myself, you can add more if you want
2008-01-11 10:18 < quanticle> Tony_Sidaway: WP:WALMART also works
2008-01-11 10:18 < lc2> yayyy
2008-01-11 10:19 < quanticle> lc2: What?
2008-01-11 10:20 < lc2> quanticle: ?
2008-01-11 10:22 < quanticle> lc2: What are you yay-ing about?
2008-01-11 10:22 < quanticle> :)
2008-01-11 10:22 < White_Cat> Tony_Sidaway yea
2008-01-11 10:22 < White_Cat> just block people abusing it
2008-01-11 10:22 < lc2> at the invitation to vandalise
2008-01-11 10:22 < White_Cat> makes more sense
2008-01-11 10:23 < lc2> Doc_glasgow: that's the closest thing i could find to the crude penis drawings that always show up on walls
2008-01-11 10:24 < lc2> there was no "penis graffiti" category on commons :(
2008-01-11 10:24 < pengo> might have to start one
2008-01-11 10:25 < Triona> When I got my rb permissions yesterday, I kept my revert scripts installed since they let me give a reason for reverts that aren't vandalism.
2008-01-11 10:26 < Triona> it would be nice if the devs made a technical change to be able to supply a reason
2008-01-11 10:26 < Tony_Sidaway> Yes, of course. Non-vandalism must never be reverted by rollback.
2008-01-11 10:27 < Triona> and I do make mistakes in rolling back vandals.
2008-01-11 10:27 < White_Cat> Tony_Sidaway
2008-01-11 10:27 < White_Cat> Suggestion: Have two rollback's one for removing nonsense (vandalism, spam, etc) and second for everything else which leaves a note to check the talk page.
2008-01-11 10:27 < Triona> I've made at least 2 this week...
2008-01-11 10:28 < Triona> comparatively, it's few and far between though
2008-01-11 10:28 < quanticle> White_Cat: Why would you use rollback to remove something that isn't nonsense? I think a traditional revert would be better in that case...
2008-01-11 10:28 < Tony_Sidaway> White_Cat: it's just intended as a quick-and-dirty way of reverting a vandal's edits. We already have two other ways of reverting edits.
2008-01-11 10:28 < Triona> and the edits in question certainly LOOKED like vandalism.
2008-01-11 10:28 < White_Cat> quanticle right
2008-01-11 10:29 < White_Cat> thats what I would do
2008-01-11 10:29 < White_Cat> but this is for people too lazy
2008-01-11 10:29 < Triona> the editor just made them again and provided an edit summary, and they weren't reverted the second time
2008-01-11 10:29 < White_Cat> quanticle I typically do not revert
2008-01-11 10:29 < Tony_Sidaway> Triona: I've been on the verge of reverting an edit that looked like vandalism before, then realised it was news I hadn't yet heard.
2008-01-11 10:29 < Tony_Sidaway> Usually deaths.
2008-01-11 10:29 < White_Cat> Tony_Sidaway people die
2008-01-11 10:29 < Triona> A lot of times I continue to check things after I revert.
2008-01-11 10:29 * White_Cat kills quanticle to prove a point
2008-01-11 10:30 < Tony_Sidaway> White_Cat: on the other hand, I believe Ian Paisley's been killed off by vandals on Wikipedia more than once.
2008-01-11 10:30 < Triona> I'm of the camp that believes every millisecond we can shave off reverting a vandal is for the better though...
2008-01-11 10:31 < White_Cat> Tony_Sidaway they can merely see the inevitable :D
2008-01-11 10:31 < White_Cat> Tony_Sidaway WP:CITE applies :P
2008-01-11 10:31 < Triona> so, I'll revert and then ask questions if it's really questionable editing
2008-01-11 10:31 < Tony_Sidaway> White_Cat: yes but an unsourced edit isn't vandalism. We draw a distinction.
2008-01-11 10:31 < White_Cat> yes
2008-01-11 10:31 < Triona> Tony_Sidaway: there's a lot of unsourced edits that ARE vandalism though.
2008-01-11 10:31 < White_Cat> thats why I quoted WP:CITE and not WP:VANDAL
2008-01-11 10:32 < White_Cat> Tony_Sidaway engineers would not have built anything if they considered every exception
2008-01-11 10:32 < Triona> if someone's changing dates left and right in an article, with no citations
2008-01-11 10:32 < Triona> it's probably vandalism
2008-01-11 10:32 < Triona> of the "sneaky" sort

These logs from a freenode IRC channel were emailed to PIR by anonymous
third parties. They are made available by PIR under Section 230 of the CDA.