November 14, 2014

LOL targeted search

YouTube is something of a cesspool, with pockets of exceptional quality here and there. Even the higher quality videos have an ephemeral aspect, mysteriously vanishing or being marked Private, from one day to the next. Others succumb to a more prosaic fate: account suspended due to multiple copyright violations. Illegal uploads of major recording label artists abound, or did. YouTube is also becoming a go-to destination for low-fidelity live concert recordings.

There’s no shortage of fee-based alternatives, so I’m not complaining.

YouTube LOL search algorithm

Google Research developed an aLOLgorithm, “Quantifying comedy on YouTube: why the number of o’s in your LOL matter” to measure YouTube videos’ hilarity. Let’s just refer to it as the LOLgorithm, for my ease of typing. Initially, I thought it was a prior year’s April Fool’s Day post. It isn’t!

I watched three of the five most LOL inducing videos, as determined by the humor-seeking LOLgorithm. I was pleasantly surprised. The LOLgorithm selected videos with themes having universal appeal: A fisherman arguing with a grizzly bear, Annoying Orange, and a charming (well, sort of) video about an Italian man’s language misunderstandings while vacationing in Malta.

Discovery is challenging

Google began by identifying the humorous videos, which is easier said than done.  YouTube’s search engine is not the greatest. I have two theories about that.

First: YouTube was an acquisition. Yes, I realize that many Google services are. There was, still is, a Google Video media player, which offers a better user experience. YouTube just seems… unstable, kludgy. I think, but am not certain, that it crashes less often now with HTML5 than with Adobe SWF.

Second: The content bar is set low. That is, YouTube channel owners can enter any old thing they want as a title, complete with misspellings or contextual mismatches. My current favorite example of an appalling spelling error is a cover of AC/DC’s Thunderstruck, performed by The Vitamin String Quartet. The title is listed as TUNDERSTRUK. Looks like the LOLgorithm is working, because that’s what I’m doing now.

Another amusing example of contextual/semantic mismatch is a remixed melody from Brittany. The channel owner is from eastern Europe and thought the song’s origin was Scottish. To make matters worse, he labelled it as dubstep but it was actually hardstyle trance. The comments are full of good-natured corrections, in various languages, and alphabets. I haven’t a clue how any algorithm, even the LOLgorithm, could parse that! Admittedly, it is an edge case.

Methodology

Google started with the semantic meaning of the title, designated by the uploader, and the video description and tags if provided. Next, they used viewer reactions, as indicated by comments, to categorize the humor videos into sub-genres.

Viewers emphasize their reaction to funny videos in several ways: capitalization (LOL), elongation (loooooool), repetition (lolololol), exclamation (lolllll!!!!!), and combinations thereof.
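For the curious, here is a minimal Python sketch of how those four emphasis cues might be counted in a comment. It is my own illustration, not Google's code: the regular expression, the function name `lol_features` and the feature names are all assumptions.

```python
import re

# Hypothetical pattern: a "lol"-like word (l, o's, l, optionally more o/l groups)
# that is not the start of a longer word, followed by optional exclamation marks.
LOL_TOKEN = re.compile(r"\b(l+o+l+(?:o+l+)*)(?![a-z])(!*)", re.IGNORECASE)


def lol_features(comment):
    """Return one feature dict per LOL-like token found in the comment."""
    features = []
    for match in LOL_TOKEN.finditer(comment):
        word, bangs = match.group(1), match.group(2)
        lower = word.lower()
        features.append({
            "token": match.group(0),
            "capitalized": word.isupper(),                # LOL
            "longest_o_run": max(len(run) for run in re.findall(r"o+", lower)),  # loooool
            "lo_repeats": lower.count("lo"),              # lolololol
            "exclamations": len(bangs),                   # lol!!!!!
        })
    return features


for comment in ["LOL", "loooooool", "lolololol", "lolllll!!!!!"]:
    print(comment, lol_features(comment))
```

A real system would presumably combine counts like these with many other signals; this only shows that the cues are easy to extract from raw comment text.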

A “loooooool” indicates greater viewer amusement than a “loool”. The final step was ranking the selected videos by relative funniness. Google described their approach as follows:

We then trained a passive-aggressive ranking algorithm using human-annotated pairwise ground truth and a combination of text and audiovisual features.

Raw view count is insufficient as a ranking metric, as it is biased by video age and possibly by prior viewer exposure on an external website.
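The blog post does not say how the ranker was implemented, so the following is only a rough sketch of the general idea: a textbook passive-aggressive (PA-I) update applied to pairs of videos, where a human has said which of the two is funnier. The feature vectors, the `aggressiveness` parameter and the function names are my assumptions, and the training pairs are made up for illustration.

```python
def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def train_pairwise_pa(pairs, n_features, aggressiveness=1.0, epochs=5):
    """Learn weights from (funnier_features, less_funny_features) pairs."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for funnier, less_funny in pairs:
            diff = [a - b for a, b in zip(funnier, less_funny)]
            loss = max(0.0, 1.0 - dot(w, diff))  # hinge loss on this pair
            norm_sq = dot(diff, diff)
            if loss == 0.0 or norm_sq == 0.0:
                continue  # passive: pair is already ranked with enough margin
            tau = min(aggressiveness, loss / norm_sq)  # aggressive: capped step size
            w = [wi + tau * di for wi, di in zip(w, diff)]
    return w


# Hypothetical features per video: [longest run of o's, exclamation marks].
toy_pairs = [
    ([7, 2], [3, 0]),
    ([5, 1], [1, 1]),
]
weights = train_pairwise_pa(toy_pairs, n_features=2)
print(weights)
```

Scoring a video is then just `dot(weights, features)`, and sorting by that score gives a ranking by relative funniness.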

LOLgorithm accuracy

The Google Research blog post is terse. The LOLgorithm seems accurate to me. There’s an alternative explanation, though. Maybe I enjoy the same kinds of videos as many other YouTube viewers, and we’re an easily amused and homogeneous lot? There’s plenty of pre-selection bias. In other words, most viewers of YouTube comedy videos have a not-too-subtle preference profile, myself included. For example, I’ve been an Annoying Orange channel subscriber on YouTube since 2010.

The video about the Italian tourist reminded me of a literary passage that is hilarious.

Have a look. Maybe it will elicit a LOL or two from you.

May 4, 2014

Bing Search: Webmaster Chicken

Google Search has a well-known competitor, Microsoft Bing. Google is first, but Bing (and Yahoo, which now contracts search to Bing) has the second largest share of U.S. domestic internet search volume. Globally, Google is also first, with Baidu, Bing and Yandex in varying relative share positions depending on geographical locale.

Today’s post is about a (no longer) recent entry on one of the five* official Bing blogs.


Old version of Bing Webmaster

Chickless in Seattle

The Chicken Has Landed (5 June 2013) offers guidance on how to improve search rank, website quality and traffic volume. It is applicable to e-commerce, blogs and most publicly accessible websites.


March 21, 2014

Google Research fan behavior

Friendly!

I found a broken link. It was important, being the contact URL on Google Research’s official Twitter account! I told them about it. Google Research wasn’t aloof! I was thrilled.

An invitation to join Google+

Google Research finally joined Google+ in August 2012.

Google Buzz chat

Inviting Google Research to Google+

I tried to coax an earlier arrival in July 2011. Click on the image if you would like to read our conversation. I remember feeling bold, and daring!

Odds and Ends

Indirect Content Privacy Surveys: Measuring Privacy Without Asking About It, Symposium on Usable Privacy and Security (SOUPS), 2011.
Abstract (an excerpt that I extracted from the abstract, that is):

The emotional aspect of privacy makes it difficult to evaluate privacy concern. This effect may be partly responsible for the dramatic privacy concern ratings coming from recent surveys, ratings that often seem to be at odds with user behavior…

This is SO true! Dramatically vocalized privacy concerns are highly inconsistent with actual user behavior! The gist of the article was to figure out a way to get at people’s privacy concerns without asking about privacy directly. Merely broaching the subject tends to cause survey respondents to get skittish, thus impacting their answers.

The article DOI, full text, is in this Google Research post.  If that doesn’t work, try the corresponding entry via Google Research’s profile on Google Buzz. The post was active from June 2011 through January 2012. Good luck finding it now. It is accessible sometimes, but not consistently. Odd, no? Maybe not so odd, as Google Buzz was discontinued a few years ago. I miss it.

Chrome browser crash

I know and love that sad little face too.

Yes, he is a sad guy. When the Chrome browser crashes, I don’t feel annoyed anymore, just disappointed.

April 14, 2013

Search and tell

Hide from cache

If you don’t want web searchers to be able to access a cached version of your page, use the noarchive meta tag like this:

<meta name="robots" content="noarchive">

The page will still be crawled and indexed by Google, but users will not see a cached link in search results.
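If you want to confirm that a page really carries the directive, a small check like the one below might help. It is a sketch using Python's standard `html.parser`; the class and function names are my own, and it only looks at the `robots` meta tag, not at HTTP headers or crawler-specific tags.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noarchive(html):
    """Return True if the HTML contains a robots meta tag with noarchive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


print(has_noarchive('<head><meta name="robots" content="noarchive"></head>'))
```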

Similar to your website

The related: operator displays websites similar to the site you are looking for. It returns the same results as clicking Similar pages next to a result on the search results page.
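As a small convenience, the query can be built programmatically. This sketch assumes the usual Google search URL with a `q` parameter and that the `related:` operator is still supported; `related_search_url` is a name I made up.

```python
from urllib.parse import urlencode


def related_search_url(site):
    """Build a search URL that uses the related: operator for the given site."""
    return "https://www.google.com/search?" + urlencode({"q": f"related:{site}"})


print(related_search_url("example.com"))
```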

I was curious about the results returned by Similar pages, as its intent is to return overlapping resources. Specifically, I was worried whether it indicated anything potentially detrimental, for search engine optimization purposes. According to Google, there’s no need for SEO concern, not for the moment:

The quality of the sites returned has no impact on your ranking or on how Google indexes your site.

Webmaster documentation

Another find: Google recently updated its References for Webmasters.

Fan memorabilia

 

December 29, 2012

Google Zeitgeist Snapshot

This is an especially short post, as it is a high-level summary of an even higher level summary. Of course, we all know how meaningful THAT is :~

Google Zeitgeist 2008

Nostalgia

Zeitgeist is a borrowed word, from an English language point of view. It means “spirit of the times”. Yes, I realize that zeitgeist is singular, but somehow we seem to have made it plural in the process of adoption from German. Or maybe not, as it is sometimes capitalized, as a proper noun, the Zeitgeist. Perhaps it is one of those mysterious, uncountable words?

Quartz News looked a little more deeply into the annual Google Zeitgeist survey, with thankfully human, not machine, translation and analysis.

Methodology

Quartz took the top results for the 34 countries for which there was data in the Zeitgeist “How to…?” category. The writer then rank-ordered them by frequency, chose the most common result for each country, and asked around to ensure that everything was translated correctly.
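The counting step is simple enough to sketch in a few lines of Python. This is only my guess at the mechanics, not Quartz's actual process. The data below is a made-up placeholder, apart from the Netherlands result, which comes from the article.

```python
from collections import Counter


def top_query_per_country(queries_by_country):
    """Pick the most frequent query for each country that has any data."""
    result = {}
    for country, queries in queries_by_country.items():
        if queries:
            result[country] = Counter(queries).most_common(1)[0][0]
    return result


# Placeholder data for illustration only.
sample = {
    "Netherlands": ["how to survive", "how to kiss", "how to survive"],
    "Examplestan": ["how to knit"],
}
print(top_query_per_country(sample))
```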

Do the results accurately capture each country’s national character?


In most instances, I think the answer is, “Yes”.

The number one “How to…?” query for The Netherlands was “How to survive”.


December 16, 2012

Google translation enigma

My Tumblr friend, Mr. Sheeper, shared a page from a Japanese language website, アンティーク アナスタシア. I am always happy to hear from him, as he has remained in Japan since the earthquake and nuclear aftermath. In English, the website name is Antiques Anastasia. The focal point is a lovely 18 kt gold slide pendant, in a style evocative of 19th century France.


The original webpage metadata is リュフォニー作 天の元后 レジナ・チェリ 金無垢ペンダント フランス製アンティーク or Pendant of antique gold: Celi Regina, which means “Queen of Heaven”.

This is religious jewelry. The page includes narrative as well as photographs for context. So far, so good.

Original text, prior to Google Translate, side-by-side Latin and Japanese:

Regina Caeli, laetare, Alleluia,
Quia quem Meruisti Partare, Alleluia,
Resurrexit, sicut dixit, Alleluia.
Ora pro nobis Deum, Alleluia.
天の元后よ、喜び給へ。ハレルヤ。
御身産むを許され給へる御子の、ハレルヤ、
自ら言ひ給へるごとくに蘇へり給へばなり。ハレルヤ。
我らがために神に祈り給へ。ハレルヤ。

After Google Translate, Japanese to English, side-by-side Latin and English:  Continue reading

December 1, 2012

Gmail and mobile service related news

There has been an accumulation of minor activity about Gmail recently.

Email art

Gmail Outage

On 11 December 2012, many Google accounts experienced Gmail unavailability. I did not experience any problems in Arizona. Gmail was definitely offline for at least 45 minutes, though, according to the official Google Apps Status page when I checked.

According to GigaOm, continuous deployment was the problem, and Gmail went down during a routine load balancing update. The GigaOm article is good. It includes a two-page PDF document later released by Google, with a detailed explanation of the incident.

For future reference, I suggest bookmarking the Google Apps Status Dashboard. Despite the “Google Apps” page name, the information is relevant to consumers as well as Google Apps business customers. It lists time and cause for disruptions in Gmail and many other Google services.

Verdict of the Herd

There is an unofficial Is Gmail down? service which culls data from multiple sources. It reminds me of an informal version of Herdict, the “verdict of the herd”. Herdict collects and publicly reports on global incidents of filtering, denial of service attacks, availability, and overall internet infrastructure reliability. Input data is crowd-sourced.

Herdict reports on website inaccessibility regardless of cause. After aggregation and trend analysis, it can be useful for gauging regional blockages of websites known for activism and possibly subject to politically motivated internet censorship. “Is Gmail down” is not intended for anything beyond the convenience of the public, though that is always appreciated! It is not crowd-sourced, nor does it give a comprehensive real-time map of global Internet health. In contrast, Herdict does exactly that. The collected information can even be broken down on a more granular level.

I like the Herdict badge. You can put it on your website to support Herdict activities. Just click on the sheep-shaped image to get one. The Herdict real-time interactive map is fun to watch, and its RSS feed is available for free to anyone who wants to use the data. Herdict is run by the Berkman Center for Internet & Society of Harvard University.

May 24, 2012

Google Drive has arrived

Google’s cloud storage service is finally available. It offers the promise of accessing files, even large ones, from any location or device. With Google Drive, you can create new documents, spreadsheets, presentations and charts, and easily share them with others. There is the suggestion of collaborative work, by two or more people, on the same document simultaneously. In reality, that is rarely feasible. Well, it is difficult to do productively. Shared access is convenient for meetings and small work groups though.


Accessing Google Drive with Win 7

As with Google Docs, one may search by keyword, and filter by file type, owner or file size. Over 30 file types are accessible from your browser, including HD video, Adobe Illustrator and Photoshop. This part is key: file access does not require the program to be installed on your computer!

The first 5 GB of storage is free of charge for all Google accounts.

Availability

Google Drive may be used on a variety of computers and devices, including PCs, Macs and Android. iPhone and iPad support is “coming soon”.

Google Drive

Offline?

The fate of Google Docs

Perhaps you are thinking,

This seems so similar to Google Docs!

I was too. Apparently that was by intent, as word has it that Google Drive will replace Google Docs for all users:

Precisely because Drive is just Docs with a new logo, Docs is being phased out. The site still works for now and will continue to work for months, but Google is pushing users away from the Docs URL and app and towards Drive.

I had noticed Docs was prompting me to try Drive recently. I suspect this was the reason. The URL will change from http://docs.google.com to http://drive.google.com. The final changeover date has not been announced.

March 6, 2012

Cleaning up for Honeycomb

It is year-end, December 31, 2010.

While everyone at Google enjoys the holidays, someone is still working late at night to gear up for Honeycomb.

Who could it be?


Notice the cleaning bucket that hard-working little Android is using. Yes, it is covered with those distinctive Erlenmeyer flasks that Google Labs was so fond of using.

Nostalgia

This predated the closure of Google Labs by nearly a year.

Cleaning up for Honeycomb by Evoreto UG (haftungsbeschränkt):

Our 3D Android is based on work created and shared by Google and used according to terms described in the Creative Commons 3.0 Attribution License.
This is no official ad and neither related nor endorsed by Google.

Music: Dance of the Sugar Plum Fairy, Kevin MacLeod (incompetech.com)
Licensed under CC/licenses/by/3.0/.

March 2, 2012

Google Maps: Foreign affairs and social skirmishes

Google Earth and Google Maps are probably the most popular, free online cartography reference tools for the public.

Google Map art

Map markers away! Fighting the cartographic unknown

Foreign affairs

Popularity is not the same as authority[1]:

The lines that Google draws on maps have no government’s imprimatur.

How an online map almost caused a violent conflict

If Google Maps show borders or place names that are different from official or long-established usage, they can confuse, offend or worse[2]:

On Nov. 3, 2010, a Nicaraguan official justified his country’s incursion into neighboring Costa Rica’s territory by claiming that, contrary to the customary borderline, he wasn’t trespassing. For proof, he [cited] Google Maps.

Google should not be involved in geopolitical disputes

Google DOES try to offer meaningful, accurate maps.
