Google
What do people say about Google? What's the freshest news, the brightest comment? Start reading and stay tuned!
 

July 30, 2007

Researcher Buzz

Gigablast Posting Index Numbers

Remember Gigablast? It’s at http://gigablast.com/; it’s a general search engine with a few specialty searches as well. It’s been rather quiet lately, but I’ve noticed that it’s started posting the number of pages indexed in its main search engine.

This is something that all search engines used to do, but which was abandoned a few years ago as some search engines claimed that more wasn’t better, others claimed there were apples-to-oranges comparisons going on, etc. So it was kind of surprising to see Gigablast announcing the size of its index — which at this writing is 12,643,911,680.

(I remember nostalgically the cow I had when Google surpassed a billion pages in its index.)

Knowing that Gigablast is up to 12 billion pages is interesting enough. But it’s also interesting to know that number and then compare the size of Gigablast’s results with the size of Google’s. For example, searching Gigablast for Yahoo brings about 380,719,416 results. Searching Google brings about 707,000,000 results. Searching Gigablast for Obey the Toaster brings 138 results, while Google’s search for the same phrase brings about 364 results. I bet if you ran a couple hundred different searches and compared the results of each of the two engines, you could get a decent idea of how much larger Google’s index is than Gigablast’s.

… but it only matters up to a point. Of course the most important thing about a data pool is how easy it is to search. On the other hand, it’s always good to know how deep that pool is to start with.
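
As a rough illustration of the comparison described above, here is a minimal Python sketch that turns paired result counts into a ratio estimate. The two sample rows are the counts quoted in this post; the function names and the choice of the median as the summary are assumptions made for illustration. Reported result counts are themselves only estimates, so treat the output as a ballpark figure at best.

```python
from statistics import median

# Result counts quoted in the post: (query, Gigablast results, Google results).
SAMPLES = [
    ("Yahoo", 380_719_416, 707_000_000),
    ("Obey the Toaster", 138, 364),
]


def hit_ratios(samples):
    """Return Google/Gigablast result-count ratios, skipping zero-result queries."""
    return [google / gigablast for _, gigablast, google in samples if gigablast > 0]


def estimate_index_ratio(samples):
    """Median ratio: a rough guess at how much larger one index is (assumed summary)."""
    return median(hit_ratios(samples))


if __name__ == "__main__":
    for query, gigablast, google in SAMPLES:
        if gigablast > 0:
            print(f"{query}: Google reports {google / gigablast:.2f}x as many results")
    print(f"median ratio: {estimate_index_ratio(SAMPLES):.2f}")
```

With a couple hundred queries in SAMPLES instead of two, the median should be less sensitive to a handful of lopsided queries than a plain average would be.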

by admin at July 30, 2007 12:13 AM under Search Engines-Google

 

July 29, 2007

Researcher Buzz

Wikia Buys Grub

I’m afraid I’ve never really understood why LookSmart owned Grub. It just didn’t make sense. And apparently LookSmart didn’t understand either, because Grub has been sold to Wikia, which has released it as open source.

For those of you playing along at home: Grub is a distributed computing project aimed at spidering the Web. Like SETI@home, Grub uses spare computer cycles of its users, only in Grub’s case the Web gets crawled instead of aliens searched for. LookSmart bought Grub back in March 2003 and, as far as I can tell, sat on it for over four years. Then Wikia came along and bought it. Wikia, you may know, is the search-engine-from-wiki concept that’s been coming along for the last several months.

Not only has Wikia acquired Grub, but it has also released it under an open source license. Not a lot is going on yet; it seems like the project has been moribund for quite a while. Another one to put on my radar…

by admin at July 29, 2007 11:50 PM under Search Engines

Googling Google

Google set to simultaneously create and dominate a new market?

Adam Bosworth, Google’s brains behind a Google service called “Google Health” that is still in development, has made several presentations about the healthcare industry and how Google may be working to improve it. The proposed service, though still a mystery, caught the attention of Vince Kuraitis back in June — causing him to write [...]

by Garett Rogers at July 29, 2007 10:44 PM under Google Health

Google OS

Google Indexing Many Web Pages in Real-Time

A year ago you had to wait days if not weeks to see your content indexed by Google. Now many web pages are indexed in 5-10 minutes. At least that's the case for many blog homepages, which are updated in almost real time. Here you can see the homepage of a PageRank 4 blog:


...and here's the proof that the blog post was created 11 minutes earlier (the result is from Google Blog Search):


Google could use the ping feature of the blog search engine to get notifications when a site is updated. This doesn't work for all the blogs, so there may still be a prioritization algorithm.
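
For readers who haven't seen a blog ping, here is a minimal Python sketch of the standard weblogUpdates.ping XML-RPC call that blog software sends when a site is updated. The endpoint URL is an assumption based on the blog search pinging service as it was documented at the time and may no longer respond; the function names and the example blog are hypothetical. The first function only builds the request body, so it can be inspected without sending anything.

```python
import xmlrpc.client

# Assumption: the historical Google Blog Search ping endpoint; it may no longer exist.
PING_ENDPOINT = "http://blogsearch.google.com/ping/RPC2"


def build_ping_request(blog_name, blog_url):
    """Serialize a weblogUpdates.ping call to XML without sending it."""
    return xmlrpc.client.dumps((blog_name, blog_url), methodname="weblogUpdates.ping")


def send_ping(blog_name, blog_url, endpoint=PING_ENDPOINT):
    """Send the ping over HTTP; the structure of the reply depends on the server."""
    server = xmlrpc.client.ServerProxy(endpoint)
    return server.weblogUpdates.ping(blog_name, blog_url)


if __name__ == "__main__":
    # Hypothetical blog name and URL, printed rather than sent.
    print(build_ping_request("Example Blog", "http://example.com/"))
```

Whether Google actually feeds these pings into the main web index is the post's speculation, not something this sketch demonstrates.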

Update (30 minutes later):

by Ionut Alex Chitu at July 29, 2007 07:41 PM under Web Search

Digg

Google Will Kill Microsoft With Web Apps

"Steve Ballmer says Microsoft has "no choice" but to embrace the web app revolution." Um... what web app revolution? The real revolution will come from virtualization and open source software. Google will likely succeed with a few useful services, but Microsoft's plan to embed spyware and advertising into Windows & Office will be a dismal failure

July 29, 2007 07:40 PM

Google Sued to De-list Scientology Critics' Site

Don't hate Scientology; they will sue ya, or worse. The site in question was www.xenu.net

July 29, 2007 06:20 PM

Slashdot

Web Contracts Can't Be Changed Without Notice

RZG writes "The U.S. Court of Appeals for the Ninth Circuit ruled on July 18th that contracts posted online cannot be updated without notifying users (PDF of ruling). 'Parties to a contract have no obligation to check the terms on a periodic basis to learn whether they have been changed by the other side,' the court wrote. This ruling has consequences for many online businesses, which took for granted their right to do this (see for example item 19 in Google's Terms of Service)."

by kdawson at July 29, 2007 05:52 PM under court

Google Blogoscoped

Accuracy of Alexa Data

The following chart shows the accuracy of Alexa traffic stats for websites over time:

As you can see, not only is Alexa much more inaccurate than others, its accuracy is also going down over time, and is currently at just 12% overall.

Note that the above statistic uses gut feeling from a selected sample group (me) as its data source. I might switch to the Blogoscoped toolbar as data source in the next release, though I haven’t yet decided which is more reliable.

by Philipp Lenssen at July 29, 2007 05:02 PM under Internet

Google OS

Google's Intranet Search Engine

Google Enterprise Blog shows a screenshot of MOMA Next, an experimental front-end for Google's intranet search. Google uses its own search appliance to index more than 100 million internal documents.

The familiar interface gives Google employees easy access to all kinds of data: contacts, shared bookmarks, refinements. Unfortunately, the design is kind of cluttered and the search takes a lot of time.


MOMA is the name of Google's intranet. An ex-Googler tells its story:

"MOMA was designed by and for engineers and for the first couple of years, its home page was devoid of any aesthetic enhancements that didn't serve to provide information essential to the operation of Google. It was dense and messy and full of numbers that were hard to parse for the uninitiated, but high in nutritional value for the data hungry. MOMA displayed latency times, popular search terms, traffic stats for Google-owned properties and, at the center of it all, a large graph with colored lines labeled with the names of Muppet characters. (...)

I came to take it for granted that any information I needed about Google could be found on the intranet, from the status of products in development to the number of employees at any point in the company's history. (...)

Google eventually clamped down on who had access to the complete state of the business; ostensibly because such information needed to be restricted unless everyone was going to be registered as an insider and restricted from freely buying and selling the company's stock."

Here's another screenshot from a MOMA search for Googlers (credit: The Back Pack Zac Attak).

by Ionut Alex Chitu at July 29, 2007 03:37 PM

New Data in Google Trends

Google Trends has new data. The last update was in March, so Google should allocate more resources to this project and push new data more frequently.

Some trends: the number of searches for Google Maps is growing faster than for Mapquest, the leader in online maps; Facebook has stirred a lot of interest lately, but its number of searches has barely surpassed orkut's; Google Reader has grown a lot since last year's relaunch, while interest in Bloglines is constant; YouTube generates more interest than Google. Can you find some interesting trends for this year?




by Ionut Alex Chitu at July 29, 2007 01:27 PM

 

July 28, 2007

Google Weblog

News: Google launches "Features, Not Products" initiative

Sergey Brin is telling employees to stop making new products and start improving existing ones. "For example, said Chief Executive Eric Schmidt, Google plans to combine its spreadsheet, calendar and word-processing programs into one suite of Web-based applications."

July 28, 2007 09:03 PM

Customize GTalk

New RSS feed!

Check http://www.customizetalk.com for the location of the new RSS feed.

by wumpus at July 28, 2007 08:03 PM

eWeek

Wikia Details Plans for Search Rival to Google

Wikipedia founder Jimmy Wales says he is putting the building blocks in place for a community-developed Web search service that would rival search engines such as Google or Yahoo.

July 28, 2007 08:02 PM

Google OS

Microsoft's Live Search Adds Face Detection

Microsoft's image search engine added a new operator that lets you restrict the results to faces and portraits. You just need to append filter:face or filter:portrait to your query (for example, [larry page filter:portrait]). The search engine uses face detection algorithms that try to see if an image contains human faces, so you shouldn't expect to only find pictures of a certain Larry Page because that would imply face recognition.


Google added a similar option in May: you can find it in the advanced search interface. Unlike Windows Live Search, Google is a little bit smarter and finds pages that contain the exact name. The first result from Microsoft's search engine shows Larry Ellison from Oracle, the third one shows Larry Lloyd (an English football player) and only the sixth image shows Google's Larry Page.

Google makes mistakes too by including a photo of Marissa Mayer as the second result for [Larry Page]. The reason? They both appeared in the same phrase: "Biz Week profiles Google hottie Marissa Mayer but doesn't mention that she's rumored to be Larry Page's girlfriend" (hottie links to Marissa Mayer's photo).


Overall, you may find Microsoft's image search engine more interesting because it includes infinite scrolling so you don't have to click on "next", a list of related people which is fairly accurate, a sidebar for image results so you don't have to go back to the results page and a scratchpad that lets you collect interesting images. Unfortunately, Microsoft's index is much smaller than Google's and the relevancy is often lower.

Another search engine that offers face filtering is Exalead. Even if the results aren't great, you'll love the advanced options: regular expressions, defining the width or the height of an image (you can find all the images related to words that start with summer, have 800 pixels width and less than 600 pixels height).

It's interesting to see image search engines becoming smarter and starting to actually analyze images and not just the filenames and the text that surrounds them. Google's acquisition of Neven Vision and the effort to label all the images from the web are also steps in this direction.

by Ionut Alex Chitu at July 28, 2007 07:52 PM under Image Search

Researcher Buzz

Who’s Reading the Google Blogs?

Now that FeedBurner has been acquired by Google, it’s probably not too surprising that Google has FeedBurner tags on some of its blogs (alas, not all, not yet.) I had some fun wandering around the blogs and seeing what was being read the most. (Being read the most in FeedBurner, of course; this doesn’t count how many people are visiting the Web site, watching the pages, etc.)

The Official Google Blog has, as you might expect, over 440 THOUSAND readers. I’m surprised that the Gmail blog has fewer than 5000 readers (of course, it’s much newer). Gmail’s FeedBurner readership is only about 500 more than the Orkut blog’s, which I would not have expected.

Meanwhile, the Google.org blog has fewer than 300 readers in FeedBurner, which is very surprising! The Google Mashup Editor blog has fewer than 500, which is less surprising since this application is not yet publicly available.

Between the Gmail blog and the Google.org blog, the Public Policy Blog has over 2000 readers.

I can’t wait for the FeedBurner badge to be added to all the Google blogs. How is the Google Reader blog stacking up against Google Book Search? What about Google LatLong or Inside AdWords? The mind boggles…

by admin at July 28, 2007 06:22 PM under Search Engines-Google

Google OS

Is Google Checkout Confusing?

The Banking Unwired blog writes that Google Checkout's problem is that people have to overcome many barriers before using the service. And to do that they need to be really determined to use Google Checkout.

"The benefits to users are many, including a central place to manage all your online purchases, added protection from someone fraudulently using your credit card, and limiting the chance for commercial spam. While this objective remains a noble one, its current incarnation of creating a parallel and optional path for users means a disjointed experience. The benefits of Google Checkout are only truly realized with an all or nothing approach. But getting there might be difficult given the customer experience kinks it has to overcome."

The author finds it strange that you should follow the Google Checkout badge, which may not always be very visible. Most people will choose the default checkout option because it may appear more convenient. They'll also ask questions like: "Who would I call for customer service issues? Where can I track my order or shipping? If I have a payment question or want a refund, where do I go?"

"While the many benefits of Google Checkout outweigh its issues, the challenge of Google Checkout is one of adoption, data integration, branding, and how to provide a seamless customer experience. Having it as an optional add-on checkout option, however, raises the interesting prospect of increasing the confusion quotient, which was the original impetus for the need for Google Checkout."

So Google's main challenges would be to increase awareness of Google Checkout and to make the checkout experience better once you've decided you want to use it (a plug-in or a Google Toolbar option could help). Google is already heavily promoting Checkout in its shopping search engine.

by Ionut Alex Chitu at July 28, 2007 05:53 PM under Google Checkout

Google Blogoscoped

More Google Street View Cars Spotted

   

Gizmodo has a gallery of Google camera vehicles spotted all over the US. They’re likely to be added to Google Maps’ Street View feature one of these days. (Don’t forget to print this sign now, though there’s no official support announced yet...) If you spotted one of these cars too, please share your pic! [Via Googlified.]

by Philipp Lenssen at July 28, 2007 01:38 AM under Search

 

July 27, 2007

Search Engine Journal

Wikia Grabs Grub, Distributed Web Crawler

Wikia’s co-founder, Jimmy Wales, takes a step further towards his vision of the LAMP stack for search by acquiring Grub, the original visionary distributed search project. Wikia has released Grub as open source, and it is now available for testing and downloading.

Wales was excited to announce the positive responses to the acquisition of Grub, during a keynote address:

“The desire to collaborate and support a transparent and open platform for search is clearly deeply exciting to both open source and businesses. Look for other exciting announcements in the coming months as we collectively work to free the judgment of information from invisible rules inside an algorithmic black box.”

Michael Grubb, Senior Vice President, Technology, and Chief Technology Officer of LookSmart, the former owner of Grub, had this to say about the acquisition deal:

“We are pleased to collaborate with Wikia and believe that Grub will thrive under an open source license. We are happy to be able to assist in the movement to make search a more open proposition and look forward to seeing things progress from here.”

With Grub now open source, search technology experts will have a better hand in its development into a full-blown, user-contributed, collaborative and open search system.

by Arnold Zafra at July 27, 2007 11:37 PM under Search Engine News

Googling Google

Google slips: New feature leaked by accident

It sounds like Google will be announcing a Canadian version of Google Finance. They announced it a bit too early and replaced the original article with a “woops, check back soon” statement: “Oops. We hit the button too soon. Watch for news about Google Finance in Canada next Tuesday.” But thanks to the internet (or [...]

by Garett Rogers at July 27, 2007 08:58 PM under Google Finance

Google Blogoscoped

Google Video Upload Currently Unavailable

When you go to the Google Video page and hit the “upload your videos” link, as Paige emailed in, you’re currently being forwarded to a “sorry” page reading:

<<Upload and share your videos... just not right now

Dear Google Video uploaders, we are temporarily unable to accept new uploads. In the meantime, here are some of our favorite videos!>>

The last time I tried to upload something using Google’s web interface I also had interruptions that caused the upload to cancel (it did work with Google’s desktop-based video uploader tool, though). If Google brings this back, I wonder how long they’re going to support two separate video uploading options on their services though, as you can already upload videos with YouTube.

[Thanks Paige!]

Update: It’s back up now, as Paige says. [Thanks Paige!]

by Philipp Lenssen at July 27, 2007 06:44 PM under Search

ZDNet

Executives speak out on software acquisitions

At San Francisco's Churchill Club, moderator Dave Margulius talks to panelists Douglas Merrill, vice president of engineering at Google, and CIOs David Bergen of Levi Strauss, Doug Schwinn of Hasbro and Randall Spratt of McKesson. The chief information officers debate the pros and cons of software industry consolidation and discuss whether these large mergers are beneficial or preventing innovation.

July 27, 2007 05:52 PM under ZDNet News: Video

Search Engine Journal

The Semantic Web & Its Implications on Search Marketing

The phenomenon of personalized search is an important step forward toward the semantic web. In its own way, personalized search creates a mini semantic web that is based on the preferences and behavior over time of its users. Personalization is the direction in which search engines need to move in order to deliver relevant search results as the amount of information on the Internet continues to increase at an enormous rate.

Shared personalization is the next level of personalized search in that one person can benefit from the experience and knowledge of other users related to specific searches. It is akin to a virtual search engine or a search engine’s topic-specific, shared knowledgebase. Major search players are exploring both of these areas, which makes the semantic web something that will likely be realized in the near future and will stay for the long term.

Some Implications:

Businesses will need to have a better understanding of who their target audience is, what they want, and how they think about what they want. The semantic web will be better for consumers because they will more easily find products and services that provide precisely what they need. Additionally, the corporate community stands to benefit by spending less energy and time pursuing the wrong prospects and marketing to the wrong channels, provided that they properly shift their online marketing strategies accordingly.

This poses a paradigm shift for search engine optimization. The prominence that factors like keyword matching, keyword density, and ranking in the search engine results page have had will be a thing of the past.

What will be more important are factors like understanding how an audience thinks and behaves online, and latent or hidden relationships between ideas and the ways people express those ideas. Then search marketers will need to incorporate that into where they market, how they market, and how they craft messages.

Challenges for the Semantic Web:

An old adage in computer science says, “garbage in is garbage out.” In short, this will be one of the main challenges for the semantic web to overcome. As input comes from a multitude of users via tagging, behavioral data, and other forms, there will be at least three issues to watch out for:

* Incorrect tagging,
* Malicious tagging,
* and Spam tagging.

To the extent that results returned to an Internet user are influenced by the relevant data from other users, those results may be skewed due to any of (or a combination of any of) those factors. Tapping into a knowledgebase is great when the contributors know what they’re doing. You can proudly say, “It’s powered by people” and “Power to the people.” However, what if the contributors (or a significant portion of the contributors) don’t know what they’re doing or have misinformed opinions? We might get, “It’s powered by third graders” or “It’s powered by the uneducated,” which is not as promising.

You also have the possibility of many contributors abusing the system. Imagine if you were stuck with the tagline “Powered by hate-mongers,” “Powered by criminals,” or “Powered by spammers.” The semantic web holds much promise but there are also important challenges to overcome.

Sociology of the Semantic Web:

Truth in Advertising or the Postmodernization of Search

To the extent that results returned to an Internet user are influenced by the relevant data from other users, do we lose factual accuracy or objectivity in the search results? In one form of postmodernism or pragmatism, meaning is a product of whatever linguistic community you’re in and there is nothing beyond that which you should seek because there is nothing beyond that to be had – no truth with a capital T. In the semantic web, are the contributors akin to the linguistic community and the accuracy of the results from your search akin to the postmodern notion of meaning; no facts with a capital F, no objectivity, no Truth in advertising?

“There is nothing either good or bad but thinking tagging makes it so” – William Shakespeare [Hamlet Act II, Sc. II] (modified).

People often discuss the impact of the Internet and email on culture. There may be similar discussions about the impact of the semantic web on culture, as the information that people find, hold on to, and make use of may be viewed as accurate and true when it is only a product of the collective musings, however ill-informed, of the masses.

I am sure much of the information on the semantic web will be quite useful and accurate because many heavy users of the Internet are highly educated in general. However, I am also sure there will be areas of the semantic web that are not so accurate in their results because the contributors will not be as accurate in their understanding of the material they tagged.

Michael Marshall is co-founder of Fortune Interactive and creator of SEMLogic, competitive intelligence software for SEO. He has over 19 years' experience in information technology covering a wide range of specialties including web design, software engineering, e-commerce solutions, artificial intelligence, and Internet marketing.

by Michael Marshall at July 27, 2007 02:29 PM under Semantic Search

John Battelle

Think Google Earth Is Cool?

Well, dork that I am, I think Google Earth Enterprise is cooler! More here....

July 27, 2007 05:21 AM under Random, But Interesting

(Googler) Matt Cutts

Short bits

- This one’s kinda fun. Rand Fishkin was in town, so we invited him over to the Googleplex. We arranged a simple trade: we’d feed him if he’d give a talk about search/SEO from his perspective. I think everyone benefited. :) You can read his trip write-up.

- Mitchell Baker is asking for suggestions on how Thunderbird should be organized going forward. It’s true that Firefox gets a lot more attention/3rd party development than Thunderbird. I’m torn, because on one hand I think I’ll be using web mail from now on. The idea of locking my data/email into one computer is too fraught with problems for me now. On the other hand, I still think email is nowhere near where it should be. There are so many archival formats (qmail/maildir, mbox and all its many flavors), yet I haven’t seen that many tools that let you distill email into insights. I can look at my server log traffic in an easy graphical view; why can’t I graph my email volume by day-of-the-week? Or take my history of routing emails and auto-suggest “This email should be routed to this team” or “Historically, this other person is an expert on this subject”?

- It looks like Steve from Feedburner is jumping in with both feet at Google:

So far, it’s been a blast. I’ve always kind of been a tech junkie and Google is tech junkie’s heaven, so there you go. More than that, it’s hard to find a work environment anywhere with such a collection of intelligent, talented people. It’s a big company, so of course there are elements of a big company creeping into the culture, but I have to say it’s less like a bigco than any other bigco at which I’ve ever worked.
….

Part of that has been a full dive into using Google tools. Personally, I don’t know how I lived without Apps before this. Anyone with a small company should definitely look at using this internally.

I’m really excited that Feedburner has joined Google. Feedburner is one of those rare companies where I emailed a bizdev person to say “I don’t know if we’re talking to this company, but every experience I’ve had with them has been really positive, they make a great product, and they seem very cool to boot.” (I harbor absolutely no illusions that my email made any difference, but I felt like I needed to chime in because I was actually paying Feedburner money each month and felt like I was getting a great service.)

- My wife has just handed me Harry Potter and the Deathly Hallows after finishing it herself, so I may be scarce for a few days. :)
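
On the email item above: graphing volume by day of the week is not much code once the mail is in mbox format. What follows is a minimal Python sketch using only the standard library, not a description of any existing tool. The function names are made up for illustration, the chart is plain text, and any mbox path you pass in is your own.

```python
import mailbox
from collections import Counter
from email.utils import parsedate_to_datetime

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_counts(date_headers):
    """Count messages per weekday from raw Date header strings."""
    counts = Counter()
    for header in date_headers:
        try:
            sent = parsedate_to_datetime(header)
        except (TypeError, ValueError):
            # Skip headers that cannot be parsed as dates.
            continue
        counts[DAYS[sent.weekday()]] += 1
    return counts


def mbox_weekday_counts(path):
    """Read an mbox file and count its messages per weekday."""
    box = mailbox.mbox(path)
    return weekday_counts(msg["Date"] for msg in box if msg["Date"])


def text_chart(counts):
    """Render one row per weekday, one '#' per message."""
    return "\n".join(f"{day} {'#' * counts[day]}" for day in DAYS)
```

Calling text_chart(mbox_weekday_counts(path)) on a real archive would want scaling so a busy day does not print thousands of characters, and the weekday here follows each message's own timezone rather than yours.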

by Matt Cutts at July 27, 2007 04:50 AM under Google/SEO

 

July 26, 2007

Googling Google

Digg diggs Microsoft, not Google

An announcement made today by Digg’s founder, Kevin Rose, prepares his users for an upcoming change that I’m sure would have caused a stir if nothing was pre-announced. Google has been used by the company to serve advertisements on their website since it was launched, but today Kevin says that they will be switching [...]

by Garett Rogers at July 26, 2007 02:02 PM under Google AdSense

 

July 25, 2007

(Googler) Matt Cutts

SEO tip: Avoid keyword stuffing

Alex Chiu claims to have invented an immortality device:

Alex Chiu web page

Wow, who wouldn’t want to stay young forever? But there’s a snag. Alex claims that Google doesn’t include alexchiu.com in its index because, you know, Google is trying to suppress the immortality device. Here’s part of what one of his pages says:

Alex Chiu message

I wonder if there could be some other reason that the domain doesn’t show up in Google? If you go back to Alex’s eternal life page and look at the bottom of the page, you’ll notice a very small textarea:

Alex Chiu text area

Hmm. It’s just a few pixels by a few pixels, but it looks like there’s some text in there. So if you view the source of the page… uh oh:

Alex Chiu text

“Internal vaginal aphrodisia doping hardware?” Huh? And what does a “plasma tv advanced chart” have to do with immortality? It looks like about 50KB of keywords are stuffed into that tiny textarea, from celebrity names to complete nonsense like “tupac kazaa hospital” and “alien cemetary.”

If I were wondering why I didn’t show up in Google, I would review our webmaster guidelines and read the information listed under Don’t load pages with irrelevant keywords. As always, webmasters are free to do what they want on their own sites, but Google reserves the right to do what we think is best to maintain the relevance of our search results, and that includes taking action on keyword stuffing.

by Matt Cutts at July 25, 2007 07:51 AM under Google/SEO

 
