Google - Power from the Penguin

Discussion and support for all Linux distributions and Unix flavours (FreeBSD, OpenBSD, etc).
bb_matt
Registered User
Posts: 1652
Joined: 10 Nov 2003, 02:00
Location: Jhb

Post by bb_matt »

What can't you find on Google?
Vital statistics


John Naughton
Sunday April 25, 2004
The Observer

Here's a cheap trick to play on an audience - especially one drawn from the business community. Ask them how many use Microsoft software. Virtually every hand in the room will go up. How many use Apple Macs? One or two - at most. How many use Linux? If the audience is drawn from corporate suits, no hands will show. Now comes the punchline: who uses Google? A forest of hands appears. 'Ah,' you say, 'that's very interesting, because it means you're all Linux users.' Stunned looks all round.

The computing engine that powers Google is the largest cluster of Linux servers in the history of the world. If you talk to computer-science folks, you find that they regard this - rather than the number of web pages indexed - as the most interesting thing about the company. Managing such a vast server-farm is a formidable task. For example, how do you implement security patches and operating-system upgrades (much more frequent in Linux than in proprietary systems from Microsoft or Sun) on thousands of servers without causing disruption to service? Google manages to achieve this with sophisticated techniques for rippling changes through the cluster, yet achieves 100 per cent uptime. This is serious stuff, and there are a lot of IT managers out there who would give their eye-teeth to be able to do it half as well.

Google is famous for being a confident, open company. Its clean, uncluttered search page is supposed to be a metaphor for the organisation behind it. But when you start asking questions about its technology, then the water rapidly becomes murky. More than half the company's 1,000 employees are techies, and they are much in demand as seminar speakers in university computer-science departments, where people are curious about Google's technology. Wall Street - with its beady eye on the forthcoming IPO - wants to know what Google does (and more importantly, what it plans to do next). Computer scientists, in contrast, want to know how Google does it.

The two questions are different but increasingly, it seems, interlinked. At any rate, the technical community has begun to realise that presentations by Google techies have been run through some kind of corporate filter before they make it into PowerPoint. The operation of the filter is erratic (it's difficult for PR flacks effectively to censor geeks at the best of times), but it seems that the overall aim is to understate every aspect of Google's technology and technical performance by several orders of magnitude.

How do we know this? Mainly because of internal inconsistencies in the data provided by Google employees. One university presentation, for example, claimed that Google handled 150 million queries a day, and 1,000 per second at peak times. This prompted Simson Garfinkel of MIT's Technology Review to do some simple calculations. If the system is handling a peak load of 1,000 queries per second, he reasoned, that translates to a peak rate of 86.4 million queries per day - or perhaps 40 million queries per day if you assume that the system spends only half its time at peak capacity. 'No matter how you crank the math', he concluded, 'Google's statistics are not self-consistent'.
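For anyone who wants to check Garfinkel's arithmetic, here is a rough sketch in Python. The variable and function names are made up for illustration, and the "half the day at peak, idle otherwise" model is just the simplification the article describes, not anything Google has published.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

def queries_per_day(peak_qps, peak_fraction=1.0):
    """Daily query count if the system runs at peak_qps for
    peak_fraction of the day and handles nothing the rest of the time
    (a deliberately crude assumption)."""
    return peak_qps * SECONDS_PER_DAY * peak_fraction

claimed_daily = 150_000_000  # figure quoted in the presentation
peak_qps = 1_000             # peak rate quoted in the same presentation

full_peak = queries_per_day(peak_qps)       # at peak all day long
half_peak = queries_per_day(peak_qps, 0.5)  # at peak half the day

# Average rate the system would need to sustain to reach the claimed total.
implied_avg_qps = claimed_daily / SECONDS_PER_DAY

print(f"At peak all day:   {full_peak:,.0f} queries")
print(f"At peak half day:  {half_peak:,.0f} queries")
print(f"Claimed per day:   {claimed_daily:,} queries")
print(f"Implied average:   {implied_avg_qps:,.0f} queries/second")
```

Running at peak all day gives 86.4 million queries, and half the day gives 43.2 million (the article's "perhaps 40 million"). To reach 150 million a day the average rate would have to be roughly 1,736 per second, which is higher than the stated peak, so the two figures can't both be right.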

Or take the number of servers that Google operates. The only figure the company will admit to is '10,000+'. They also claim to have '4+ petabytes' of disk storage, and have let slip that each server is fitted with two 80 gigabyte hard drives. Now a petabyte is 10 to the power of 15 bytes, so if Google had only 10,000 servers, that would come to 400 GB per server, not the 160 GB that two 80 gigabyte drives provide. So again the numbers don't add up. I could go on, but you will get the point. But what it all comes down to is this: Google has far more computing power at its disposal than it is letting on. In fact, there have been rumours in the business for months that the Google cluster actually has 100,000 servers - which if true means that the company's technical competence beggars belief.
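The storage numbers can be checked the same way. This is only a sketch: the names are made up for illustration, and it assumes all of the claimed storage sits on the two 80 gigabyte drives in each server, which the article implies but Google has not confirmed.

```python
PETABYTE = 10**15  # bytes, as defined in the article
GIGABYTE = 10**9   # bytes

total_storage = 4 * PETABYTE  # the '4+ petabytes' claim, taken as exactly 4
admitted_servers = 10_000     # the '10,000+' claim, taken as exactly 10,000
drives_per_server = 2
drive_size = 80 * GIGABYTE

# Storage each server would need if there really were only 10,000 of them.
needed_per_server = total_storage // admitted_servers

# Storage each server actually has, going by the drive figures.
actual_per_server = drives_per_server * drive_size

# Number of servers needed to hold 4 PB at 160 GB each.
implied_servers = total_storage // actual_per_server

print(f"Needed per server: {needed_per_server // GIGABYTE} GB")
print(f"Actual per server: {actual_per_server // GIGABYTE} GB")
print(f"Implied servers:   {implied_servers:,}")
```

On these figures each server would need 400 GB but only has 160 GB, so 4 petabytes works out to at least 25,000 servers. Since both claims carry a '+', that is a floor rather than an estimate.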

Now the interesting question raised by all this is: why the reticence? Most companies lose no opportunity to brag about their technology. (Think of all those Oracle ads.) Is this an example of Google behaving ultra-responsibly - being careful not to hype its prospects prior to an IPO? Or is it a sign of a deeper commercial strategy? The latter is what Garfinkel suspects. 'After all,' he says, 'if Google publicised how many pages it has indexed and how many computers it has in its data centres around the world, search competitors such as Yahoo!, Teoma, and Mooter would know how much capital they had to raise in order to have a hope of displacing the king at the top of the hill.' If truth is the first casualty of war, openness is the first casualty of going public.
http://observer.guardian.co.uk/business ... 22,00.html
Slasher
Registered User
Posts: 7525
Joined: 23 Aug 2003, 02:00
Location: 5th rock from the sun.

Post by Slasher »

wow, impressive...

Many different corporate businesses are, however, moving to Linux in a way...

Especially firms where there are programmers, and with OpenOffice so easily available nowadays, lots of firms move...

I know of quite a few...
Slasher : Former member of www.PCFormat.co.za
I have reached the end of my near 5 year forum life. Farewell good days...

slasher (at) webmail (dot) co (dot) za
Soap
Registered User
Posts: 942
Joined: 14 Apr 2004, 02:00

Post by Soap »

Let's all get RAID5 now :lol: :lol: :lol: :lol: :lol:
Uranium
Registered User
Posts: 275
Joined: 30 Apr 2003, 02:00

Post by Uranium »

Looks like Google could be giving the NSA a run for their money. (The NSA supposedly has access to more raw processing power than any other organisation on the planet)
UR@|\|1U|\/|
My PC:
AMD Athlon64 3500+
2GB DDR400 RAM
GeForce 6800GT 256MB
120 GB SATA HDD
MrG
Registered User
Posts: 98
Joined: 24 Oct 2004, 02:00
Location: Cape Town

Post by MrG »

I used to use Yahoo, but Google is soooo much faster.
HTTP://WWW.MRG.ZA.NET
YOUR PC NEEDS MET
hamin_aus
Forum Moderator
Posts: 18363
Joined: 28 Aug 2003, 02:00
Processor: Intel i7 3770K
Motherboard: GA-Z77X-UP4 TH
Graphics card: Galax GTX1080
Memory: 32GB G.Skill Ripjaws
Location: Where beer does flow and men chunder

Post by hamin_aus »

You mean the guys at Google don't just spend all day designing spiffy new Google banners?
aXe
Registered User
Posts: 27
Joined: 16 Apr 2005, 02:00
Location: Cape Town, South Africa

Post by aXe »

I used Yahoo, then went to Google (much faster), but now use MSN because it delivers many more results that are actually what you want.
MRG
neon_chameleon
Moderator Emeritus
Posts: 6098
Joined: 27 Feb 2004, 02:00
Location: Durban

Post by neon_chameleon »

jamin_za wrote: You mean the guys at Google don't just spend all day designing spiffy new Google banners?
LOL! Necro'd thread here.
Qualifications: BSc Computer Science & Information Technology, BCom Information Systems Honours, ISACA CISA, ISACA CRISC
Experience: Web Design, IT Auditing, IT Governance, Computer Retail, IT Consulting
Interests: Technology, Nutrition, Toasters, BBM, Facebook, Colourful Diagrams