Posts tagged ‘algorithm’

September 3, 2011

Prediction API Part 2

Motivation

In my initial coverage of the Google Prediction API, I wondered why Google would be so magnanimous as to open up this API for public use. Google's own documentation offers a plausible answer:

We do not describe the actual logic of the Prediction API in these documents, because that system is constantly being changed and improved. Therefore we can’t provide optimization tips that depend on specific implementations of our matching logic, which can change without notice.

An older version of a prediction API

Based on some of the user comments in the Google group for the Prediction API, I would guess that it is one of the more difficult of all Google APIs to understand and use. Similarly, it will probably be challenging to get meaningful results.

Requirements

Google advises that the following are prerequisites for using the Prediction API:

  • an active Google Storage account
  • an APIs Console project with both the Google Prediction API and the Google Storage for Developers API activated

And of course, a Google account! See getting started for further details.
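The workflow the getting-started guide describes boils down to: upload a CSV of training data to Google Storage, train a model on it, then request predictions. A minimal Python sketch of how such requests might be assembled follows; the endpoint paths, payload shape, and the bucket/object names are my assumptions about the v1.2 REST interface, not taken verbatim from Google's documentation:

```python
import json

# Assumed root of the Prediction API v1.2 training resource.
API_ROOT = "https://www.googleapis.com/prediction/v1.2/training"

def train_request(bucket, obj):
    """Build the POST URL that kicks off training on a CSV stored
    in Google Storage as bucket/obj ("/" is URL-escaped as %2F)."""
    return "%s?data=%s%%2F%s" % (API_ROOT, bucket, obj)

def predict_request(bucket, obj, features):
    """Build the URL and JSON body for a prediction against the
    model trained from bucket/obj."""
    url = "%s/%s%%2F%s/predict" % (API_ROOT, bucket, obj)
    body = json.dumps({"input": {"csvInstance": features}})
    return url, body

# Hypothetical usage -- bucket and object names are placeholders.
# The requests themselves would be sent as OAuth-authorized HTTP POSTs.
url, body = predict_request("mybucket", "language_id.csv", ["Je suis fatigue"])
```

The point of the sketch is the shape of the exchange, not the exact URLs: training runs against data that already lives in Google Storage, and predictions are JSON-in, JSON-out against the trained model.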

Free but not forever

Nor is the Prediction API free of charge indefinitely. According to the initial terms, usage is free for all users for the first six months, up to the following limits per project:

  • Predictions: 100 predictions/day
  • Hosted model predictions: Hosted models have a usage limit of 100 predictions/day/user across all models
  • Training: 5 MB trained/day
  • Streaming updates: 100 streaming updates/day
  • Lifetime cap: 20,000 predictions

This free quota expires at the end of the six-month introductory period, which begins the day that Google Prediction is activated for a project in the Google APIs console. Remember to include the associated Google Storage charges when figuring total cost!

Presumably this is an API that Google won't be deprecating without replacement any time soon. However, the Prediction API has its own separate Terms of Service, which does give Google the right to do exactly that. I think that is standard language, though: Google is not contractually bound to support a free (or even paid but unprofitable) service unless explicitly stated.

Conclusion about the Prediction API

A great deal more information is available from the Prediction API developer guide, including an example application for movie recommendations.

The Google Prediction API is probably best used as a sandbox, helpful for deciding whether machine learning suits one's predictive needs. For an application intended for production use, though, there are probably more suitable alternatives.

July 10, 2011

Prediction API

The recent release of the Google Prediction API Version 1.2 seemed oddly, well, magnanimous to me! Given the investment of intellectual capital and resources, I am surprised that Google would be so generous. Opening up the Prediction API means that Google is giving external users access to its in-house machine learning algorithms.


1939 Ford pick-up truck will not likely use the Google Prediction API though other Ford products will

The official Google Code blog post, Every app a smart app, dated 27 April 2011, suggested many possible uses for the Prediction API.

One of the suggested uses has the potential, but not the certainty, of causing serious privacy concerns. I'm guessing that customer feedback analysis based on structured data is another potential use for the API.

I noticed that Ford Motor Company has plans for the Prediction API, specifically for commuters driving electric vehicles (EVs). Apparently there is a fair amount of "EV anxiety" due to the vehicles' limited driving range, and the Prediction API could be used to mitigate those concerns. AutoBlog, an online publication for automobile enthusiasts, featured a great slide show demonstrating how Ford intends to make use of the Google Prediction API.

The Prediction API is available on Google Code. This is not the first release of the Prediction API. I’m uncertain whether versions before 1.2 were restricted in some way. (Google often grants API access to developers initially, and later, after ironing out any bugs or unexpected problems, opens the product to the public.)

Do be aware that a Google Storage account is required for access. Visit the Google API Console to get started.

June 27, 2011

Google Translation Story Continues

Last month, developers whose applications and websites depended on the Google Translate API and the underlying Google machine translation were shocked by an unexpected announcement.

Google Says Translate and other APIs WILL be deprecated

Google APIs are deprecated all the time. Usually they are replaced with comparable services or APIs.

But that morning was not like anything else. That morning became cruel and sad when the world heard the news. The linguists and webmasters were taken aback, shocked and stuttered in disbelief. The world learnt on May 26, 2011 that Google is no longer going to support its free machine translator also known as Google Translate

via Lackuna.com: Slaughtering Machine Translators – Who Is Going To Replace Google? 

The Translate API documentation on Google Code makes the situation very clear:

The Google Translate API has been officially deprecated as of May 26, 2011. Due to the substantial economic burden caused by extensive abuse, the number of requests you may make per day will be limited and the API will be shut off completely on December 1, 2011.


Regional map of India

Google suggests the Translate Element as an alternative to the API for website translation and similar needs.
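For context, the Translate Element is not an API one calls but a JavaScript widget dropped into a page. Its embed code, as generated by Google's setup wizard, has looked roughly like the sketch below; I am reconstructing it from memory, so the exact script URL and option names should be checked against the wizard's actual output:

```html
<!-- Placeholder the widget replaces with a language-selection menu -->
<div id="google_translate_element"></div>
<script type="text/javascript">
  function googleTranslateElementInit() {
    new google.translate.TranslateElement(
        {pageLanguage: 'en'}, 'google_translate_element');
  }
</script>
<script type="text/javascript"
        src="//translate.google.com/translate_a/element.js?cb=googleTranslateElementInit"></script>
```

The trade-off versus the API is clear: the Element translates the rendered page for a human visitor, while the deprecated API returned translated text for a program to consume.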

Welcome to the Indic web

Deprecation of the Google Translate API does not mean an end to human usage of Google Translate.

This becomes very clear with the June 21 announcement on the official Google blog, Google Translate welcomes you to the Indic web. Google Translate announced support for five languages, in alpha* status: Bengali, Gujarati, Kannada, Tamil and Telugu. According to the post,

In India and Bangladesh alone, more than 500 million people speak these five languages.

Special fonts must be downloaded to use Google Translate with these Indic languages. The post has links for obtaining these fonts, free of charge.

It is not clear whether these five alpha languages will be included in the deprecated Translate API before it is taken offline permanently on December 1, 2011.

* Google Translate has introduced nearly a dozen alpha languages since 2009. At present, Google Translate supports 63 languages.

March 12, 2011

War on Content Farms Now in Progress


Content fresh from the farm

Google Declares War on Content Farms:

Google has announced a major algorithmic change to its search engine. Impact on users will be subtle while dramatically improving the quality of Google’s search results…

Google is targeting content farms.

This update is designed to reduce rankings for low-quality sites — sites which copy content from other websites or sites that are just not very useful…. It will provide better rankings for sites with original content, such as research, in-depth reports, thoughtful analysis and so on.

The change should make it easier to find high quality sites.

Google did not give details of the change, which should impact 11.8% of Google’s queries (currently only in the U.S., with plans to roll it out elsewhere over time), but it does say that it will affect the ranking of many sites on the web.

The list of related articles I have hand-selected (just as I pick through string beans to find the very best ones) may be of further interest to those with a sense of humor, or without a personal stake in content farming.

December 9, 2010

Source Meta Tags to Identify Original Publisher Content

In December 2009, the Official Google Webmaster Central Blog responded to publisher concerns about page-rank penalties imposed by Google's search algorithm due to legitimate cross-domain content duplication. Most websites would rarely (if ever) have valid reasons for displaying identical content on multiple, distinct domains.


Journalists in Radio-Canada newsroom, via Wikipedia

However, it is a common occurrence for news media sites with multiple syndication channels to legitimately publish duplicate cross-domain content.

Source Meta Tags

Google announced an extra feature for news publishers to differentiate the first version of a "breaking story" from the redistribution by others that follows. Such redistribution is legitimate, but publishers wanted a reliable way to give credit where credit was due: to the enterprising journalist who broke a given news story. Google responded with this suggestion:

News publishers and readers both benefit when journalists get proper credit for their work. That can be difficult, with news spreading so quickly and many websites syndicating articles to others. That’s why we’re experimenting with two new meta tags for Google News: syndication-source and original-source. Each of these meta tags addresses a different scenario, but for both the aim is to allow publishers to take credit for their work and give credit to other journalists.


Original versus Duplicate Website Content

Further details about Google's introduction of "source" meta tags to help identify original news were covered in the Google News Blog, and an even more in-depth description can be found in an excellent Search Engine Land article about the meta tags, including discussion of a recent algorithm patent granted to Google.
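In concrete terms, the two tags described in the announcement are ordinary meta elements in a page's head. A sketch of how a syndicating outlet might credit a republished story (the URL is a placeholder, not from the announcement):

```html
<head>
  <!-- This page republishes a syndicated story; point to its canonical copy. -->
  <meta name="syndication-source" content="http://example.com/breaking-story.html">

  <!-- Of several versions of a story, credit the outlet that broke it first. -->
  <meta name="original-source" content="http://example.com/breaking-story.html">
</head>
```

Note the two scenarios the tags separate: syndication-source marks a wholesale republication, while original-source credits the first report among independently written versions.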

UPDATE

There is good reason for Google's decision to implement these meta tags on a trial basis. Best practice for bloggers and publishers alike requires attribution when using another source's original work. Until now, most reputable online content producers have credited their sources with a link. However, there is some concern that they could stop doing that and merely use the meta tag instead. That would be a much worse outcome for the original writer, in terms of receiving much-deserved credit for the work.

The meta tags are useful to Google, as they give input to the page rank algorithm (which seeks to reward providers of original content). Yet I do believe that this is a good-faith effort by Google. It would be unfortunate if these new meta tags have the opposite effect from what Google intended.
