Sunday, January 31, 2010

Google synonyms

Google is an integral part of our lives. Anything we need, we search for on Google. But have you ever noticed that a slight change in the wording of your search terms gives you different results? If you have not noticed it before, you can try it now.

Open Google.com and search for "search engine optimisation"; it reports about 35 million results. Now replace the "s" with a "z", i.e. "search engine optimization", and it reports 5.8 billion results.

One of Google's blog posts, "Helping computers understand Human Language", says:

What is a synonym? An obvious example is that "pictures" and "photos" mean the same thing in most circumstances. If you search for [pictures developed with coffee] to see how to develop photographs using coffee grinds as a developing agent, Google must understand that even if a page says "photos" and not "pictures," it's still relevant to the search. While even a small child can identify synonyms like pictures/photos, getting a computer program to understand synonyms is enormously difficult, and we're very proud of the system we've developed at Google.

However, our measurements show that synonyms affect 70 percent of user searches across the more than 100 languages Google supports. We took a set of these queries and analyzed how precise the synonyms were, and were happy with the results. For every 50 queries where synonyms significantly improved the search results, we had only one truly bad synonym.

Another blog post, titled "Understanding the web to make search more relevant", says:

Answer highlighting helps you get to information more quickly by seeking out and bolding the likely answer to your question right in search results. The feature is meant for searches with factual answers, such as [meet john doe director], [john lennon died], or [what was the political party of president ford]. If the pages returned for these queries contain a simple answer, the search snippet will more often include the relevant text and bold it for easy reference.


Tuesday, January 19, 2010

Google Corporate teachings

We all know how Google was started in a garage by Larry Page and Sergey Brin. What we don't know is how they turned that garage company into a multi-billion-dollar company with offices in more than a dozen countries, thousands of employees, and a reputation as one of the best companies to work for worldwide.

Today, we highlight the corporate teachings of Google:
  • Focus on the user and all else will follow.
  • It's best to do one thing really, really well.
  • Fast is better than slow.
  • Democracy on the web works.
  • You don't need to be at your desk to need an answer.
  • You can make money without doing evil.
  • There's always more information out there.
  • The need for information crosses all borders.
  • You can be serious without a suit.
  • Great just isn't good enough.


Monday, January 11, 2010

Google Sites

Many hosting and application providers let you create a site for free. The catch is that you have to host your site on their servers or display their banner. The banner reveals that you built your site with their software, which is quite embarrassing for any business.

Now Google lets you create sites with Google Sites, a free and easy way to create web pages and embed documents, pictures, and videos. There is also a template gallery for Sites: when you create a new site, you can select a template and customize it further according to your needs. The available templates cover a wide selection of specific needs; you’ll find ones for schools, weddings, families, clubs, restaurants, projects and more.

Apart from templates, it also offers automatic site translation. Visitors whose language setting differs from your site's can hit the translate button at the bottom right of the page to translate the whole site into the language of their choice.

You can also set a default location for a page template, which makes it simpler to keep the pages of your site organized. For example, if you have a recipe template in your family site, set /recipes/ as the default location for pages created from that template so that all your recipes show up together within your site.


Monday, January 4, 2010

Google Now Scanning RSS, Atom Feeds, May Experiment with Real-Time Protocols in Future

According to a post on Google's Webmaster Central blog, Google is now discovering web sites by automatically scanning RSS and Atom feeds. This new process will help Google more quickly identify web pages and will allow users to find new content in search results as soon as it goes live. While not exactly "real-time," using feeds to identify updates to websites is an arguably faster method than the traditional crawling techniques Google has used in the past. And Google may get even faster in the near future - the post also notes that the company may soon explore using mechanisms like the real-time protocol PubSubHubbub to identify updated items going forward.

The blog post doesn't say whether or not RSS and Atom discovery is displacing traditional web crawling for sites that are feed-enabled, but it's likely that, if given the choice, Google will opt for the faster method if available. As Vanessa Fox notes on the SearchEngineLand blog, since it's unknown at this time whether Google is using the feeds in place of traditional web crawling, it may make sense to use full feeds rather than partial ones in order to get your content indexed faster by Google's search engine.
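To make the idea of discovering pages through a feed concrete, here is a minimal sketch of how a program could pull page URLs out of an RSS 2.0 feed. It is only an illustration and says nothing about how Google's crawler actually works; the sample feed, the example.com URLs, and the function name `links_from_rss` are all made up for this example.

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical RSS 2.0 feed used only for illustration.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <link>http://example.com/</link>
    <item><title>First post</title><link>http://example.com/first</link></item>
    <item><title>Second post</title><link>http://example.com/second</link></item>
  </channel>
</rss>"""


def links_from_rss(xml_text):
    """Return the <link> of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    # Only <item> elements are visited, so the channel-level <link> is skipped.
    return [item.findtext("link") for item in root.iter("item")]


for url in links_from_rss(SAMPLE_RSS):
    print(url)
```

Each URL found this way is a candidate page to fetch, which is why a feed that lists new posts promptly can surface them sooner than waiting for a periodic crawl.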

Real-Time Web Crawling in the Future?

Although only briefly mentioned in the post, Google hinted that they may begin looking into other mechanisms such as PubSubHubbub, an open protocol that provides near-instant notifications of change updates. No further details were provided beyond the one sentence, but the announcement clearly shows that Google has seen the writing on the wall and knows that the real-time web is the future. This is one trend the company isn't planning to ignore.
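For readers curious what "near-instant notifications" looks like in practice: in PubSubHubbub, a subscriber asks a hub to notify a callback URL whenever a feed (the "topic") changes. The sketch below only builds the form-encoded body of such a subscription request and does not send anything. The URLs and the helper name `build_subscribe_body` are hypothetical, and some versions of the protocol require additional parameters (for example `hub.verify`) that are left out here.

```python
from urllib.parse import urlencode, parse_qs


def build_subscribe_body(topic_url, callback_url):
    """Build the form-encoded body of a PubSubHubbub subscription request.

    Only the core parameters are included; a real hub may require more.
    """
    params = {
        "hub.mode": "subscribe",       # ask the hub to start a subscription
        "hub.topic": topic_url,        # the feed we want updates for
        "hub.callback": callback_url,  # where the hub should send notifications
    }
    return urlencode(params)


# Hypothetical feed and callback URLs, for illustration only.
body = build_subscribe_body(
    "http://example.com/feed.atom",
    "http://subscriber.example.org/callback",
)
print(body)
print(parse_qs(body))
```

The body would be sent as an HTTP POST to the hub's URL; after that, the hub pushes new entries to the callback instead of the subscriber polling the feed.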

The real-time web, heavily influenced by the speed of Twitter and other rapid-fire social networking updates, has created a desire among internet users for faster access to information. This desire has, in turn, led to the creation of new real-time protocols such as the above-mentioned PubSubHubbub and its counterpart RSSCloud. If Google began to use these technologies for scanning the web, their search results wouldn't just be updated faster - they would be updated in real-time. That means information would become available in the search results listings as soon as it was published to the web.

That, of course, would lead to a whole new series of challenges for the search engine - most notably, how to rank the real-time results? Given that Google's search algorithm has been built on top of the concept of PageRank, a way to determine the relevance of a website by what other sites link to it, ranking search results that are so fresh that there is an absence of links could prove a difficult feat. However, Google is already doing this to some extent now. Over time, the PageRank algorithm has evolved and can now reward sites with fresher, more fitting content and, on some occasions, rank them higher than sites with more links. And if anyone can figure out the proper algorithm for mixing in real-time content and ranking it appropriately along with static pages, it's got to be Google. In fact, we'll probably soon see exactly how they plan on addressing this issue, when they incorporate Twitter search results into their index, as announced last week.

...But Until Then, Google Delivering Faster, Fresher Results Instead

Although the PubSubHubbub mention may have been the most exciting part of the announcement, real-time search results aren't here just yet. In the meantime, we just have to be content with sped-up results instead. The post advises website owners who are blocking Googlebot, Google's search bot software, from crawling their RSS/Atom feeds to unblock it via their robots.txt file. If unsure, webmasters can test their feed URLs with the robots.txt tester in Google Webmaster Tools, as the post recommends.
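As a rough local check, separate from the Webmaster Tools tester the post recommends, Python's standard `urllib.robotparser` module can tell you whether a given set of robots.txt rules would block Googlebot from a feed URL. This is a minimal sketch: the two robots.txt samples, the feed URL, and the helper name `googlebot_can_fetch` are invented for illustration, and the module's rule matching may not agree with Googlebot's in every edge case.

```python
from urllib.robotparser import RobotFileParser


def googlebot_can_fetch(robots_txt, url):
    """Return True if the given robots.txt text lets Googlebot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# Hypothetical rules that block Googlebot from everything under /feeds/.
BLOCKING = """\
User-agent: Googlebot
Disallow: /feeds/
"""

# Hypothetical rules that only block a private area for all crawlers.
ALLOWING = """\
User-agent: *
Disallow: /private/
"""

FEED_URL = "http://example.com/feeds/posts/default"

print(googlebot_can_fetch(BLOCKING, FEED_URL))
print(googlebot_can_fetch(ALLOWING, FEED_URL))
```

With the first rule set the feed is off limits to Googlebot, so removing or narrowing that `Disallow` line is the kind of change the post is asking for.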

Written by Sarah Perez
