Posted by Melanie Phung on Wednesday, October 26, 2005 at 9:33 am
… and I wasn’t invited! The invite-only Google Zeitgeist conference is taking place in Mountain View as I type. Maybe my invite got lost in the mail. Or maybe Google is just bitter because I blew off the Google Dance in August.
Posted by Melanie Phung on Tuesday, October 25, 2005 at 9:49 pm
In response to Danny Sullivan’s piece on why Google Print’s caching is different from reprinting copyrighted work, Dan Thies of SEO Research Labs writes: “The thrust of Danny’s argument, and I agree 100%, is that indexing the content of a book so that it becomes searchable is not the same thing as creating or publishing a copy of the book. He is correct about that, but his post perpetuates a misunderstanding about how search engines work.”
Someone’s saying that Danny Sullivan is presenting incorrect information on how a search engine works?? Them there’s fightin’ words. Okay, to be fair, he’s just clarifying a point in a fairly extensive piece. Both are worth reading if you want some insight into how a search engine queries its index.
Eric Schmidt’s op-ed in the Wall Street Journal about Google Print argues “fair use” from a less technical perspective:
http://googleblog.blogspot.com/2005/10/point-of-google-print.html
In related news: The Open Content Alliance is scheduled today to demonstrate technologies it will use in its own book-digitizing project. The Open Content Alliance — which is backed by Yahoo! — operates under an opt-in premise, rather than Google Print’s opt-out, and I think we’re going to see a pretty big war between the two camps. We’ll see how this plays out. More on this later.
Posted by Melanie Phung on Tuesday, October 25, 2005 at 11:46 am
It must not have seemed momentous because I don’t have a clue about my first time … the very first time I conducted a web search that is. There’s a potentially fun thread getting started over at the Search Engine Watch forum on this topic.
Posted by Melanie Phung on Monday, October 24, 2005 at 12:26 pm
Superb overview of the current blog landscape by Technorati’s Dave Sifry:
http://www.technorati.com/weblog/2005/10/53.html
Among the data he shares:
- The total number of weblogs tracked continues to double about every 5 months
- The blogosphere is now over 30 times as big as it was 3 years ago, with no signs of letup in growth
- About 70,000 new weblogs are created every day
- About one new weblog is created each second
- 2% - 8% of new weblogs per day are fake or spam weblogs
- Between 700,000 and 1.3 million posts are made each day
- About 33,000 posts are created per hour, or 9.2 posts per second
- An additional 5.8% of posts (or about 50,000 posts/day) seen each day are from spam or fake blogs, on average
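For anyone who wants to sanity-check how those rates fit together, here is a back-of-the-envelope sketch. It uses only the rounded figures quoted in the list; the variable and function names are my own for illustration, not anything from Technorati.

```python
# Back-of-the-envelope check of the rates quoted in the list above.
# Inputs are the rounded figures from the post; all names are illustrative.

SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

new_blogs_per_day = 70_000
posts_per_hour = 33_000


def per_second(count, period_seconds):
    """Convert a count over a period into a per-second rate."""
    return count / period_seconds


blogs_per_second = per_second(new_blogs_per_day, SECONDS_PER_DAY)
posts_per_second = per_second(posts_per_hour, SECONDS_PER_HOUR)
posts_per_day = posts_per_hour * 24

print(f"new weblogs per second: {blogs_per_second:.2f}")
print(f"posts per second: {posts_per_second:.1f}")
print(f"posts per day: {posts_per_day:,}")
```

With these inputs, 70,000 new weblogs a day works out to roughly 0.8 per second, and 33,000 posts an hour comes to about 792,000 a day, which sits inside the quoted range of 700,000 to 1.3 million.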
Posted by Melanie Phung on Monday, October 24, 2005 at 11:07 am
YSearchblog posted a great look at how to use Yahoo Video Search to subscribe to podcasts for your video iPod.
And if you’re wondering why it took me so long to even mention Yahoo Search… that’s because, I admit it, I don’t pay as much attention to it. Not because I think #2 doesn’t matter, but because Yahoo handcodes the top 20 results for the top 6% of search terms. Since that 6% includes those terms relevant to my job, it’s just not imperative that I follow Yahoo as closely.
Posted by Melanie Phung on Sunday, October 23, 2005 at 7:05 pm
To make up for yesterday’s bad advice, a real tip: did you know that you can type “inurl:password” into the Google search box to find indexed pages whose URLs contain the word “password”? Hopefully you’re not entering your passwords into unencrypted forms, but the “inurl” operator is just one of many tools available to searchers and hackers alike.
And it’s one of the examples given in the useful but not-too-overwhelming October 4 SEO Chat article: Hiding Your Sensitive Data From Google and the World.
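If you would rather build such a query programmatically, the operator is just a prefix on the search term. The snippet below is a minimal sketch: the base URL and the helper names are my own assumptions for illustration, and it only builds the link rather than sending any request.

```python
from urllib.parse import urlencode

# Base URL is an assumption for illustration; the post only mentions
# typing the query into the Google search box.
SEARCH_URL = "https://www.google.com/search"


def build_query(term, operator=None):
    """Return a query string, optionally prefixed with an operator such as inurl."""
    return f"{operator}:{term}" if operator else term


def build_search_url(query):
    """URL-encode the query into the q parameter of the search URL."""
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


query = build_query("password", operator="inurl")
print(query)
print(build_search_url(query))
```

Searching your own domain this way is a quick check on whether anything sensitive has ended up in your indexed URLs.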
A Richness of Embarrassment
Don’t worry; the powers that be over at the Googleplex are aware of the security issues. In July, CNET published sensitive information on Google exec Eric Schmidt to demonstrate how easy it is to find such information using the Google search engine. In response, the search giant blacklisted CNET reporters for a year. (What got less publicity is that Google eventually backed down after a few months.)
Posted by Melanie Phung on Saturday, October 22, 2005 at 1:50 pm
I think I’m starting to put together the big picture. Google announced it will take 300 years to index all the world’s content. Which is pretty ambitious. In order not to let down shareholders, it will have to destroy all the information it can’t index.
That still leaves quite a bit of content that is capable of being indexed, however, and they are going to have to provide additional tools and vertical searches.
Google already knows what you have on your desktop, in your transcripts, in your email and is keeping its eye on that pesky little Microsoft.
Google Everything Under the Sky is still in beta, but browse the new interface and check what information it doesn’t yet have, like bank account passwords, your prison diaries, etc. … before evidence of your existence is erased permanently. You can make it easy by posting everything on a crawlable webpage and submitting the link to Google. Your information will be harvested within a week.
Disclaimer: This information is provided for amusement purposes only and does not constitute legal advice. If you are considering taking any action in response to this post, please check with your lawyer and your psychiatrist first.
Posted by Melanie Phung on Friday, October 21, 2005 at 3:16 pm
We’ve got people naming their baby Google (good luck to ya, kid) and debates about whether last weekend’s shift in search engine results should be called an “update” or a “tweak” or a mini-“Dance”. In the SEO community, people feel pretty passionate about the semantics and all flock to the forums to engage in debates on the subject.
Personally, it doesn’t make a difference to me. If I did my job right, my sites continue to do well. If the rankings fall due to a change by Google, then it really doesn’t matter to me whether it technically qualifies as an update. It’s just time to figure out what I need to be doing better vis-a-vis the competitor. Same as any other day.
I did learn something interesting in this week’s cyberchatter. But, first, let’s back up. When a Google update is spotted, like a hurricane, it gets named… but only if it’s determined to be a bona fide update (and with a sustained wind speed of 64 knots). SEOs talk about updates the way the public talks about hurricanes. You know where Katrina hit, what the landscape looked like afterwards, and you can imagine the kind of work that needs to be done to clean up the mess. Mention the Florida Update to anyone in the SEO industry; everyone will know exactly what you’re talking about and what happened to their sites as a result.
So back to the trivia: Contrary to what you might think, it’s not Google that gives the updates their names. That falls to a guy who runs WebmasterWorld.
I don’t know why he gets to do this. And I don’t know why he called last weekend’s update Jagger (although there seems to be at least one theory). But he did, so that’s what we all call it now.
Posted by Melanie Phung on Thursday, October 20, 2005 at 1:59 pm
New patents can give us a clue to what a search engine may be up to (although sometimes they’re just red herrings). Yes, I did actually read the entire patent application filed by Google for information retrieval based on historical data, which was released in the spring, but I think I’m going to skip these six:
http://www.cre8asiteforums.com/viewtopic.php?t=28815
Posted by Melanie Phung on Thursday, October 20, 2005 at 10:53 am
Big to-do in Blogland this week. Some are saying Google has allowed the criminals to take over the asylum. Seems someone was crafty enough to find a way to automate the creation of thousands of blogs on Blogspot aka Blogger (a free blog creation and hosting site owned by Google). Those blogs then steal content from elsewhere on the web to lure users to the site and then bombard them with Google AdSense ads.
Since the blogs have no real content and are designed just to make money via commission on the ads, they’re called spam blogs (= splogs). The result? Lotsa (more) crap on the Internet.
Not familiar with splogs? I dare you to click the “next” button on the upper right of this page to get sent to a different blog. Odds are that if you do this a couple of times, you’ll find some pretty obvious examples. Unless Google has cleaned up the mess already.
Recourse
Matt Cutts - a V.V.I.P. over at Google - gives this tip on spotting and reporting splogs:
You see a low-quality site that is running AdSense. If you run across a site that you consider spammy and it has AdSense on it, click on the “Ads by Goooooogle” link and click “Send Google your thoughts on the ads you just saw”. Enter the words spamreport and jagger1 in the comments field.
Updated Oct. 21
Seems like a lot of the spam has been cleared out so it’s not as easy to find. I’m linking to an example so you can see what blog spam might look like.