Archive for 2010

Google’s N-Gram Viewer

Dec 28, 2010


Google’s N-Gram cache takes the company’s near-omniscience, and in particular its knowledge of how language shapes human interaction with search engines, to a new level. Human language and human behavior (read: consumer behavior) intersect in interesting ways on the Internet, and Google has long been the industry leader in mapping and manipulating the site of this interaction. Cultural theorists have long written ‘prophetic’ essays about how the Internet is a kind of incarnation of collective memory or a representation of collective consciousness. Google’s new N-Gram cache and viewer realize that kind of pipe dream in some interesting new ways. At present the N-Gram cache is mostly interesting on a scholarly level; it will not immediately influence the way that businesses compete for search engine rankings. But it gives us some insight into the scope of Google’s long-term ambitions, and for that reason I think it’s worth a blog post.

The N-Gram viewer lets users chart the rising and falling frequency of words as they appear in print over the last five hundred years. A search can be narrowed to any span of years within that period, so you can chart word usage from 1500 to the present or within a shorter window. For example, how often did the word ‘Reagan’ appear in print between 1980 and 1988?

Well, certainly more frequently than it had appeared in the preceding 500 years. No great surprise there. Use of the word ‘Reagan’ begins to pick up in the mid-60’s and spikes steeply in the 1980’s. (In fact, the word ‘Reagan’ appeared in print more frequently than the words ‘Jesus Christ’ from 1980 until mid-year 2000. Go ahead, take a look.) The word ‘Bush’ fared better than ‘Reagan’ in the early centuries of the Early Modern era, experiencing occasional spikes in usage. However, that probably has more to do with the word for shrubbery appearing at the beginning of sentences than with certain members of the oil dynasty from Texas, some of whom have been promoted or elected to various high positions in the United States government in the past 30 years.

Below I’ve called up a comparative n-gram chart of the words ‘God’ and ‘money,’ spanning the past five hundred years.

As we can see, usage of the word ‘God’ in print peaked from the late 1600’s through the early 1700’s. At the end of the 18th century it began a precipitous decline, its frequency gradually approaching an almost perfect statistical convergence with the word ‘money’ not too long after the Industrial Revolution. In our present decade, ‘God’ still appears in print slightly more often than ‘money.’

The words ‘Angelina Jolie’ surpassed the words ‘War in Afghanistan’ in print in early 2002, by a margin that has been growing consistently since that time.

Google has scanned more than 10% of all books ever published, dating back to the invention of the printing press, and the N-Gram cache draws on about 5.2 million of those volumes (roughly 4% of everything ever published). That’s an impressive sample, and it will allow Google to map the evolution of language in print in amazing ways. This, presumably, will ultimately inform the ‘discernment’ of their algorithm in ways and by means that I am not qualified even to hypothesize about.
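
The basic counting behind a chart like this, at least, is easy to sketch. The snippet below is only a toy illustration, not Google’s actual pipeline: the function names and the tiny made-up corpus are my own assumptions. For each year, it computes what share of all n-grams in that year’s text matches a given phrase.

```python
from collections import Counter


def ngrams(text, n):
    """Return the n-word sequences in text (lowercased, split on whitespace)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def yearly_frequency(corpus, phrase):
    """Map each year to the share of that year's n-grams that equal phrase."""
    n = len(phrase.split())
    target = phrase.lower()
    result = {}
    for year, texts in corpus.items():
        counts = Counter()
        for text in texts:
            counts.update(ngrams(text, n))
        total = sum(counts.values())
        result[year] = counts[target] / total if total else 0.0
    return result


# Toy, made-up corpus (hypothetical data): year -> list of text snippets.
TOY_CORPUS = {
    1975: ["the governor spoke today", "reagan gave a speech"],
    1984: ["reagan won again", "president reagan spoke", "reagan and congress"],
}

if __name__ == "__main__":
    print(yearly_frequency(TOY_CORPUS, "Reagan"))
    print(yearly_frequency(TOY_CORPUS, "President Reagan"))
```

A one-word search counts 1-grams and a two-word search counts 2-grams (bi-grams), which is where the viewer gets its name. The real thing presumably handles punctuation, capitalization and scanning errors far more carefully than this sketch does.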

It’s interesting that the N-Gram viewer and Google’s Reading Level filter came out in the same week. At this point, the reading-level filter is not informed by data from the N-Gram cache (it is informed by a group of teachers who graded sites against specific criteria), but we can imagine that as the tool becomes more nuanced, some of the N-Gram data may begin to come into play, framing the way that Google reads websites and how we, in turn, encounter the written word.

Fun fact: Did you know that to harvest the parchment for a single vellum copy of the Gutenberg Bible (the first major book printed in Europe with movable type), some 300 sheep reportedly had to be slaughtered?! In the intervening years, with the invention of blogs and so forth, the dissemination of text to an audience has become much less costly!


Google’s New ‘Reading Level’ Feature

Dec 16, 2010


As of this week, Google’s algorithm has developed a modicum of literary taste. The new ‘Reading Level’ feature grades the text of web pages so that searches can be filtered according to whether the prose in the results is ‘basic,’ ‘intermediate,’ or ‘advanced.’

To use the reading-level filter, click into Advanced Search from Google’s home page.

Once you’re in Advanced Search, open the drop-down labeled ‘Reading Level’ about halfway down the page; it is the first option in the ‘Need more tools?’ column.

You can choose not to display reading levels at all, to annotate your search results with reading levels, or to filter your results so that only pages graded ‘advanced’ (for example) appear.

Google’s project manager, Nundu, explains the development as follows:

“The feature is based primarily on statistical models we built with the help of teachers. We paid teachers to classify pages for different reading levels, and then took their classifications to build a statistical model. With this model, we can compare the words on any webpage with the words in the model to classify reading levels. We also use data from Google Scholar, since most of the articles in Scholar are advanced.”
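
To make that description a little more concrete, here is a minimal sketch of the general idea: learn word statistics from human-graded pages, then score a new page against each level. Google has not published the details of its model, so everything here is an assumption for illustration. The simple naive-Bayes-style word scoring (with equal weight given to each level), the function names and the hand-made training sentences are all mine, standing in for the teacher-graded pages.

```python
import math
from collections import Counter


def train(labeled_pages):
    """Build per-level word counts from (level, text) pairs."""
    model = {}
    for level, text in labeled_pages:
        model.setdefault(level, Counter()).update(text.lower().split())
    return model


def classify(model, text):
    """Return the level whose word distribution best explains the text."""
    vocab = set()
    for counts in model.values():
        vocab.update(counts)
    words = text.lower().split()
    best_level, best_score = None, -math.inf
    for level, counts in model.items():
        total = sum(counts.values())
        # Add-one smoothing so unseen words do not zero out the score.
        score = sum(
            math.log((counts[w] + 1) / (total + len(vocab) + 1)) for w in words
        )
        if score > best_score:
            best_level, best_score = level, score
    return best_level


# Hypothetical training data standing in for teacher-graded pages.
TRAINING = [
    ("basic", "cows eat grass and cows eat hay"),
    ("basic", "a rabbit eats plants"),
    ("advanced", "ruminant herbivores ferment cellulose via microbial symbiosis"),
    ("advanced", "herbivores exhibit specialized dentition and digestive morphology"),
]

if __name__ == "__main__":
    model = train(TRAINING)
    print(classify(model, "cows eat plants"))
    print(classify(model, "herbivores ferment cellulose"))
```

With only four training sentences this is a toy, but it shows why vocabulary alone can carry a lot of the signal: a page full of words that graders mostly saw on ‘advanced’ pages will tend to be scored ‘advanced.’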

Google’s grading is pretty strict, by the way. Only 5% of articles from the New Yorker, the chosen publication of the literati on this side of the Atlantic, were scored as ‘advanced.’ 57% of its articles scored ‘intermediate,’ and 36% scored ‘basic.’ My own favorite publication (which I thought would score higher than the New Yorker) scored even worse.

What demand is this new development responding to? Google explains the utility of this tool in their official blog. “This…new advanced search feature…categorizes results by reading level. For example, if you’re writing a college paper on [herbivores] you can refine search to see only advanced material, or if you’re a grade school teacher preparing for a class on [herbivores] you can refine to see only basic material.”

For years people have complained that the internet has been lowering the quality of public discourse. There has been some legitimate concern that since the internet drives the media, and the internet has not previously been able to refine search according to the quality or intelligence of discourse, the quality of news coverage has suffered.  (This is only one aspect of that whole line of thought.  A more important factor is how difficult it is to monetize news consumption online.)

So: up until now, search engines have been very literal-minded creatures. They have, if you will, been ‘philistines.’ This means that the metrics for attracting audiences have not been able to measure how intelligent the copy on a particular site is. With the introduction of the reading-level tool, the quality of prose on certain types of sites may see a comeback as one of the factors determining readership. For example, we see that the quality of prose on CNN’s news site outclasses Fox by a roughly 63% margin.

Fox News, in fact, is only faring slightly better than Disney’s site.

I would imagine that this tool will become more nuanced as time goes by.

Is this likely to affect business websites? Well, of course, it depends on the website. If you’re a locksmith or a pizza parlor, this is not going to affect the volume of your search traffic. If you’re a chemical engineering lab that has a blog about new developments in chemistry, Google Reading Level’s opinion of your prose may have some effect on your traffic.


The difference between blog and news

Dec 14, 2010


Many of our clients tell us they want a blog, a news section or both on their website. It’s common to opt for a blog simply because “blog” is a buzzword, but it’s important to know the difference between the two, which one is right for your company, and which will have the best impact.

I’ll break it down for you.

The News section of your site should be a factual timeline of your company. This is where you announce information very specific to your company, such as new hires, upcoming events, or changes to your service or product offerings. You can think of your news section as an area for press releases. Just be sure to present your company the way you want people to think of you: your brand.

For example:

  • Announce new products, services or offerings
  • Announce recent achievements or awards
  • Announce upcoming events

A blog serves as a space to discuss topics pertinent to your industry, not just your company. This area of your site allows you to be a thought leader within your industry and should always encourage open dialog and integration across social platforms. Blogging allows you to share your thoughts, opinions and reviews on a plethora of topics; just make sure to keep it interesting and relevant.

For example:

  • Informative – teach how (easy it is) to use your product
  • Editorial – offer opinions and reviews about topics related to your industry
  • Promotional – announce upcoming sales, specials or contests

Remember your audience and your point of view. Your News section is generally going to be more official, and your blog should have a more personal tone. While readers often expect a blog post to name its author, that isn’t necessary for a News post. Additionally, while a blog should always allow discussion through comments, a News section usually doesn’t.

Hopefully this clears up the difference between a blog and news, but there’s one very important thing to keep in mind: you have to update them regularly. Merely having a blog or news page does not make it worthwhile, and letting it go stale can actually send a negative message. Posting regularly keeps your company top of mind and shows that your company is always up to date. Not only that, keeping it fresh makes a significant impact on your Search Engine Optimization!


Google Boost – The New Local Business Advertising Tool

Dec 10, 2010


Google just introduced Boost, its newest advertising tool through Google Places. Currently in beta, Boost is available in only a handful of cities and not yet available in Indianapolis. However, in a recent conversation with a Google employee, I learned that Boost will be expanded to include Indianapolis in the not-so-distant future.

So small business owners, perk up your ears.

Designed for local small business owners, the core idea of Boost is simplicity itself. First, the business owner writes a simple business description, chooses business industry categories and sets a budget. Boost then automatically creates an ad campaign for the business, finding relevant keywords and managing the budget to get the most out of it. Essentially, Boost is a layman’s AdWords, without all the fuss of keyword research, geo-targeting or spending analysis.

Boost ads appear above the search results in the ‘sponsored ads’ section, or to the right of the search results.

On a local search that generates a map insert, Boost ads receive a blue pin on the map instead of the usual red pin.

Boost ads do not change organic search ranking and all analytics data is collected and viewed through the Places dashboard. Each Boost ad budget must be at least $50 monthly, but can be increased at any time. Ads can also be deleted at any point and the business owner will only be charged for the number of clicks that actually occurred during that time period.

More information about Boost billing and the advertising process can also be found in Google Places Help and on the Google Lat Long Blog.


Christmas & E-Commerce

Dec 9, 2010


Love it or hate it: in 21st Century America, Christmas means commerce. Black Friday does not herald the shortening days just before Winter Solstice.  It’s an accounting term that refers to that special time of year when Retail’s annual balance sheets tip from red to black.

Speaking of Retail: in 2010, stores like Best Buy and Walmart still dominate ‘short-tail’ items, but Amazon and other e-commerce sites increasingly dominate long-tail items. (Note to the reader: short-tail items are mainstream items, like plasma screen TVs or whatever Barbie Doll happens to be on the shelf, while long-tail items are niche items, like last year’s plasma screen or a specific Barbie Doll.)

When they said that ‘E-Commerce’ was up and coming in the ’90s, you will have noticed that Target and RadioShack didn’t exactly shutter their doors. But in 2010, e-commerce is an idea whose time has come. In most retail markets, anyone who wants to stay competitive has to have an online arm in addition to their storefront.

Below we’re displaying some of our personal favorite e-commerce websites that we’ve worked on in the past few years. Merry Christmas to all our SmallBox clients! Have some link-juice with your cup of cheer!

OCC Outdoors makes outdoor equipment for parks and urban convenience. Since signing on with SmallBox in early 2010, OCC Outdoors’ total sales have more than doubled. Dave Fagle says that these new sales figures are due, in no small part, to the company’s new internet presence.

Kokomo Opalescent Glass has been a manufacturer of colored glass for art and architecture for the past 122 years. They have worldwide distribution. Since signing on for a complete rebuild with SmallBox this year, KOG has seen an immediate (and sustained) 25% increase in traffic, a 300% increase in views and a doubling of time on site, and is now doing thousands of sales per month via its website. This is bringing fresh revenue into the business and keeping the team busy taking the calls and orders coming in from the website.

After surgical design and development interventions and an SEO overhaul by SmallBox, Kipp Toys’ e-commerce site saw its traffic jump by 290% in the early part of 2010, accompanied by a nearly 30% increase in online sales. In 2010, SmallBox also helped Kipp launch Coasterstone and Lake Art, and they’ve been doing great. Doctor Todd’s: Relief for-the-Feet and Taste of Indiana are other e-commerce sites that SmallBox worked on in 2010 and that have seen really positive results.

Small business doesn’t have to eat Amazon’s dust.  Contact SmallBox if you think that your website could benefit from an e-commerce upgrade in 2011.