Tuesday, March 10, 2009

Get up-to-date on Image Search

Webmaster Level: All

Recently at SMX West, I gave an Image Search presentation that I'd like to share with our broader webmaster community. The goal of the presentation was to provide insights into how image search is used, how it works, and how webmasters can optimize their pages for image searchers.

You'll see more information about:
  • Some background on the reach of Image Search
  • Interesting findings on the behavior of image searchers
  • Our efforts at handling multiple image referrers
  • How to best feature images (image quality and placement, relevant surrounding text, etc.)
Take a look and let us know your thoughts in the comments. We'd love to hear from you.



Wednesday, March 4, 2009

Using stats from site: and Sitemap details

Webmaster Level: Beginner to Intermediate

Every now and then in the webmaster blogosphere and forums, this issue comes up: when a webmaster performs a [site:example.com] query on their website, the number of indexed results differs from what is displayed in their Sitemaps report in Webmaster Tools. Such a discrepancy may smell like a bug, but it's actually by design. Your Sitemap report only reflects the URLs you've submitted in your Sitemap file. The site operator, on the other hand, takes into account whatever Google has crawled, which may include URLs not included in your Sitemap, such as newly added URLs or other URLs discovered via links.

Think of the site operator as a quick diagnosis of the general health of your site in Google's index. Site operator results can show you:
  • a rough estimate of how many pages have been indexed
  • one indication of whether your site has been hacked
  • if you have duplicate titles or snippets
Here is an example query using the site operator:



Your Sitemap report provides more granular statistics about the URLs you submitted, such as the number of indexed URLs vs. the number submitted for crawling, and Sitemap-specific warnings or errors that may have occurred when Google tried to access your URLs.

Sitemap report

Feel free to check out our Help Center for more on the site: operator and Sitemaps. If you have further questions or issues, please post to our Webmaster Help Forum, where experienced webmasters and Googlers are happy to help.

Posted by Charlene Perez

Wednesday, February 25, 2009

Canonical Link Element: presentation from SMX West

A little while ago, Google and other search engines announced support for a canonical link element that can help site owners with duplicate content issues. I recreated my presentation from SMX West and you can watch it below:



You can access the slides directly or follow along here:



By the way, Ask just announced that they will support the canonical link element. Read all about it in the Ask.com blog entry.

Thanks again to Wysz for turning this into a great video.

In fact, you might not have seen it, but we recently created a webmaster videos channel on YouTube. If you're interested, you can watch the new webmaster channel. If you subscribe to that channel, you'll always find out about new webmaster-related videos from Google.

Monday, February 23, 2009

Introducing the Google Webmaster Central YouTube Channel

In his State of the Index presentation, Matt Cutts said that one of the things to look for from Google in 2009 is continued communication with webmasters. On the Webmaster Central team, we've found that using video is a great way to reach people. We've shown step-by-step instructions on how to use features of Webmaster Tools, shared our presentations with folks who were unable to make it to conferences, and even taken you through a day in the life of our very own Maile Ohye as she meets with many Googlers involved in webmaster support.

We plan on releasing more videos like these in the future, so we've opened up our own channel on YouTube to host webmaster-related videos. Our first video is already up, and we'll have more to share with you soon. If you want to be the first to know when we release something new, you can subscribe to us using your YouTube account, or grab this RSS feed if you'd like to keep track in your feed reader. Please let us know how you like the channel, and use the comments in this post to share your ideas for future videos.

And while we'll all do our best to make sure Matt Cutts understands that Rick Rolling is so last year, be careful where you click on April 1st.

Friday, February 20, 2009

Best practices against hacking

These days, the majority of websites are built around applications that provide services to their users. In particular, content management systems (CMSs) are widely used to create, edit, and administer content. Due to the interactive nature of these systems, where user input is fundamental, it's important to think about security in order to avoid exploits by malicious third parties and to ensure the best user experience.

Some types of hacking attempts and how to prevent them

There are many different types of attacks hackers can conduct in order to take partial or total control of a website. In general, the most common and dangerous ones are SQL injection and cross-site scripting (XSS).

SQL injection is a technique to inject a piece of malicious code in a web application, exploiting a security vulnerability at the database level to change its behavior. It is a really powerful technique, considering that it can manipulate URLs (query string) or any form (search, login, email registration) to inject malicious code. You can find some examples of SQL injection at the Web Application Security Consortium.

There are definitely some precautions that can be taken to avoid this kind of attack. For example, it's a good practice to add a layer between a form on the front end and the database in the back end. In PHP, the PDO extension is often used to work with parameters (sometimes called placeholders or bind variables) instead of embedding user input in the statement. Another really easy technique is character escaping, where all the dangerous characters that can have a direct effect on the database structure are escaped. For instance, every occurrence of a single quote ['] in a parameter must be replaced by two single quotes [''] to form a valid SQL string literal. These are only two of the most common actions you can take to improve the security of a site and avoid SQL injection. Online you can find many other resources specific to your needs (programming language, web application, etc.).
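To make the parameterized-query approach concrete, here's a minimal Python sketch using the standard library's sqlite3 module (the table, column, and input values are hypothetical; in PHP, the PDO extension offers equivalent bind parameters):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

# Untrusted input, e.g. from a login form -- note the embedded single quote
user_input = "alice' OR '1'='1"

# UNSAFE: string concatenation would let this input rewrite the query:
#   "SELECT * FROM users WHERE name = '" + user_input + "'"

# SAFE: the ? placeholder keeps the input as data, never as SQL
rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)
).fetchall()
print(rows)  # [] -- the injection attempt matches no user
```

The same pattern applies to any form field that ends up in a query: the database driver, not string concatenation, is what combines the statement with the user's data.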

The other technique that we're going to introduce here is cross-site scripting (XSS). XSS is a technique used to inject malicious code in a webpage, exploiting security vulnerabilities of web applications. This kind of attack is possible where the web application is processing data obtained through user input and without any further check or validation before returning it to the final user. You can find some examples of cross-site scripting at the Web Application Security Consortium.

There are many ways of securing a web application against this technique. Some easy actions that can be taken include:
  • Stripping the input that can be inserted in a form (for example, see the strip_tags function in PHP);
  • Using data encoding to avoid direct injection of potentially malicious characters (for example, see the htmlspecialchars function in PHP);
  • Creating a layer between data input and the back end to avoid direct injection of code in the application.
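As a hedged illustration of the encoding approach, here's what the second bullet looks like in Python, using the standard library's html.escape in the same role as PHP's htmlspecialchars (the sample comment is hypothetical):

```python
import html

# Untrusted input, e.g. a comment submitted through a form
comment = '<script>document.location="http://evil.example/steal"</script>'

# Encode the dangerous characters before echoing the input back into a page,
# so the browser renders the text as text instead of executing it as script
safe = html.escape(comment)
print(safe)
```

After escaping, the angle brackets and quotes become HTML entities (&lt;, &gt;, &quot;), so the injected script tag never reaches the browser's parser as markup.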
Some resources about CMS security

SQL injection and cross-site scripting are only two of the many techniques used by hackers to attack and exploit innocent sites. As a general security guideline, it's important to always stay updated on security issues and, in particular when using third party software, to make sure you've installed the latest available version. Many web applications are built around big communities, offering constant support and updates.
To give a few examples, four of the biggest Open Source content management system communities (Joomla, WordPress, PHP-Nuke, and Drupal) offer useful security guidelines on their websites and host big community-driven forums where users can escalate issues and ask for support.
  • WordPress offers comprehensive documentation on how to strengthen the security of its CMS in the Hardening WordPress section of its website.
  • Joomla offers many resources regarding security, in particular a Security Checklist with a comprehensive list of actions webmasters should take to improve the security of a website based on Joomla.
  • Drupal provides information about security issues in the Security section of its site. You can also subscribe to their security mailing list to stay updated on ongoing issues.
  • PHP-Nuke offers some documentation about security in chapter 23 of its How to section, dedicated to system management of the platform. They also have a section called Hacked - Now what? that offers guidelines for resolving hacking-related issues.

Some ways to identify the hacking of your site

As mentioned above, there are many different types of attacks hackers can perform on a site, and there are different methods of exploiting an innocent site. When hackers are able to take complete control of a site, they can deface it (changing the homepage), erase all the content (dropping the tables of your database), or insert malware or cookie stealers. They can also exploit a site for spamming, such as by hiding links pointing to spammy resources or creating pages that redirect to malware sites. When these changes in your application are evident (like defacing), you can easily spot the hacking activity; but for other types of exploits, in particular those with spammy intent, it won't be so obvious. Google, through some of its products, offers webmasters some ways of spotting if a site has been hacked or modified by a third party without permission. For example, by using Google Search you can spot typical keywords added by hackers to your website and identify the pages that have been compromised. Just open google.com and run a site: search query on your website, looking for commercial keywords that hackers commonly use for spammy purposes (such as viagra, porn, mp3, gambling, etc.):

[site:example.com viagra]

If you're not already familiar with the site: search operator, it's a way to query Google by restricting your search to a specific site. For example, the search site:googleblog.blogspot.com will only return results from the Official Google Blog. When adding spammy keywords to this type of query, Google will return all the indexed pages of your website that contain those spammy keywords and that are, with high probability, hacked. To check these suspicious pages, just open the cached version proposed by Google and you will be able to spot the hacked behavior, if any. You could then clean up your compromised pages and also check for any anomalies in the configuration files of your server (for example on Apache web servers: .htaccess and httpd.conf).
If your site doesn't show up in Google's search results anymore, it could mean that Google has already spotted bad practices on your site as a result of the hacking and may have temporarily removed it from the index due to infringement of the webmaster quality guidelines.

In order to constantly keep an eye on the presence of suspicious keywords on your website, you could also use Google Alerts to monitor queries like:

site:example.com viagra OR casino OR porn OR ringtones

You will receive an email alert whenever these keywords are found in the content of your site.

You can also use Google's Webmaster Tools to spot any hacking activity on your site. Webmaster Tools provides statistics about top search queries for your site. This data will help you monitor whether your site is ranking for suspicious, unrelated spammy keywords. The 'What Googlebot sees' data is also useful, since you'll see whether Google is detecting any unusual keywords on your site, regardless of whether you're ranking for them.

If you have a Webmaster Tools account and Google believes that your site has been hacked, often you will be notified according to the type of exploit on your site:
  • If a malicious third party is using your site for spammy behaviors (such as hiding links or creating spammy pages) and it has been detected by our crawler, often you will be notified in the Message Center with detailed information (a sample of hacked URLs or anchor text of the hidden links);
  • If your site is exploited to place malicious software such as malware, you will see a malware warning on the 'Overview' page of your Webmaster Tools account.
Hacked behavior removed, now what?

Has your site been hacked, or is it serving malware? First, clean up the malware mess and then do one of the following:
  • If your site was hacked for spammy purposes, please visit our reconsideration request page through Webmaster Tools to request reconsideration of your site;
  • If your site was serving malware to users, please submit a malware review request on the 'Overview' page of Webmaster Tools.
We hope that you'll find these tips helpful. If you'd like to share your own advice or experience, we encourage you to leave a comment to this blog post. Thanks!

Wednesday, February 18, 2009

State of the Index: my presentation from PubCon Vegas

It seems like people enjoyed when I recreated my Virtual Blight talk from the Web 2.0 Summit late last year, so we decided to post another video. This video recreates the "State of the Index" talk that I did at PubCon in Las Vegas late last year as well.

Here's the video of the presentation:



and if you'd like to follow along, here are the slides:



You can also access the presentation directly. Thanks again to Wysz for recording this video and splicing the slides into the video.

Thursday, February 12, 2009

Specify your canonical

Carpe diem on any duplicate content worries: we now support a format that allows you to publicly specify your preferred version of a URL. If your site has identical or vastly similar content that's accessible through multiple URLs, this format provides you with more control over the URL returned in search results. It also helps to make sure that properties such as link popularity are consolidated to your preferred version.

Let's take our old example of a site selling Swedish fish. Imagine that your preferred version of the URL and its content looks like this:

http://www.example.com/product.php?item=swedish-fish


However, users (and Googlebot) can access Swedish fish through multiple (not as simple) URLs. Even if the key information on these URLs is the same as your preferred version, they may show slight content variations due to things like sort parameters or category navigation:

http://www.example.com/product.php?item=swedish-fish&category=gummy-candy

Or they have completely identical content, but with different URLs due to things such as tracking parameters or a session ID:

http://www.example.com/product.php?item=swedish-fish&trackingid=1234&sessionid=5678

Now, you can simply add this <link> tag to specify your preferred version:

<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />

inside the <head> section of the duplicate content URLs:

http://www.example.com/product.php?item=swedish-fish&category=gummy-candy
http://www.example.com/product.php?item=swedish-fish&trackingid=1234&sessionid=5678


and Google will understand that the duplicates all refer to the canonical URL: http://www.example.com/product.php?item=swedish-fish. Additional URL properties, like PageRank and related signals, are transferred as well.

This standard can be adopted by any search engine when crawling and indexing your site.

Of course you may have more questions. Joachim Kupke, an engineer from our Indexing Team, is here to provide us with the answers:

Is rel="canonical" a hint or a directive?
It's a hint that we honor strongly. We'll take your preference into account, in conjunction with other signals, when calculating the most relevant page to display in search results.

Can I use a relative path to specify the canonical, such as <link rel="canonical" href="product.php?item=swedish-fish" />?
Yes, relative paths are recognized as expected with the <link> tag. Also, if you include a <base> link in your document, relative paths will resolve according to the base URL.
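As an illustration of that resolution logic (not Google's actual code), the standard URL-joining rules can be sketched with Python's urllib.parse.urljoin, reusing the Swedish fish URLs from above:

```python
from urllib.parse import urljoin

page_url = "http://www.example.com/product.php?item=swedish-fish&category=gummy-candy"
canonical_href = "product.php?item=swedish-fish"  # relative path in the <link> tag

# With no <base> tag, the relative href resolves against the page's own URL
print(urljoin(page_url, canonical_href))
# http://www.example.com/product.php?item=swedish-fish

# With a hypothetical <base href="http://www.example.com/candy/">,
# the same relative href resolves against the base URL instead
base_url = "http://www.example.com/candy/"
print(urljoin(base_url, canonical_href))
# http://www.example.com/candy/product.php?item=swedish-fish
```

This is why a stray or unexpected &lt;base&gt; tag can silently change which URL your canonical points at; it's worth double-checking the resolved result.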

Is it okay if the canonical is not an exact duplicate of the content?
We allow slight differences, e.g., in the sort order of a table of products. We also recognize that we may crawl the canonical and the duplicate pages at different points in time, so we may occasionally see different versions of your content. All of that is okay with us.

What if the rel="canonical" returns a 404?
We'll continue to index your content and use a heuristic to find a canonical, but we recommend that you specify existing URLs as canonicals.

What if the rel="canonical" hasn't yet been indexed?
Like all public content on the web, we strive to discover and crawl a designated canonical URL quickly. As soon as we index it, we'll immediately reconsider the rel="canonical" hint.

Can rel="canonical" be a redirect?
Yes, you can specify a URL that redirects as a canonical URL. Google will then process the redirect as usual and try to index it.

What if I have contradictory rel="canonical" designations?
Our algorithm is lenient: we can follow canonical chains, but we strongly recommend that you update links to point to a single canonical page to ensure optimal canonicalization results.

Can this link tag be used to suggest a canonical URL on a completely different domain?
**Update on 12/17/2009: The answer is yes! We now support a cross-domain rel="canonical" link element.**

Previous answer below:
No. To migrate to a completely different domain, permanent (301) redirects are more appropriate. Google currently will take canonicalization suggestions into account across subdomains (or within a domain), but not across domains. So site owners can suggest www.example.com vs. example.com vs. help.example.com, but not example.com vs. example-widgets.com.

Sounds great—can I see a live example?
Yes, wikia.com helped us as a trusted tester. For example, you'll notice that the source code on the URL http://starwars.wikia.com/wiki/Nelvana_Limited specifies its rel="canonical" as: http://starwars.wikia.com/wiki/Nelvana.

The two URLs are nearly identical to each other, except that Nelvana_Limited, the first URL, contains a brief message near its heading. It's a good example of using this feature. With rel="canonical", properties of the two URLs are consolidated in our index and search results display wikia.com's intended version.

Feel free to ask additional questions in our comments below. And if you're unable to implement a canonical designation link, no worries; we'll still do our best to select a preferred version of your duplicate content URLs, and transfer linking properties, just as we did before.

Update: this link-tag is currently also supported by Ask.com, Microsoft Live Search and Yahoo!.

Update: for more information, please see our Help Center articles on canonicalization and rel=canonical.

Wednesday, February 11, 2009

Help us help you

You're a webmaster, right? Well, we love webmasters! To ensure we give you the best support possible, we've set up a survey to get your thoughts on Webmaster Central and our related support efforts. If you have a few extra minutes this week, please click here to give us your honest feedback.

Thanks from all of us on the Webmaster Central Team.

Google Friend Connect introduces the social bar

Update: The described product or service is no longer available.

In our previous Google Friend Connect posts, we've enjoyed connecting with you, the webmasters, and hearing your feedback about Friend Connect. We're now standing on our own two feet -- find us over at the new Social Web Blog where we just announced the new social bar feature.

The social bar packages many of the basic social functions -- sign-in, site activities, comments, and members -- into a single strip that appears at the top or bottom of your website. You can use it alone, or use it to complement your existing social gadgets, by putting it on the top or bottom of as many of your webpages as you want.

For anyone visiting your site, the social bar offers a snapshot of the activity taking place within your website's community. One click on any of these features produces a convenient, interactive drop-down gadget, so users get all the functionality of the Friend Connect gadgets while you save real estate on your website. With the social bar, visitors can:
  • Join or sign in to your site, view and edit their profiles, and change their personal settings.
  • View recent activity on your website, including new members and posts on any of your pages.
  • Post on your wall or read and reply to others' comments.
  • See the other members of your site, check out other people's profiles, and become friends. Users can also find out if any of their existing friends are members of your site.
Watch this quick video to learn how easy it is to add a social bar to your website:


To try out the social bar before deciding whether to add it to your website, visit:
http://www.ossamples.com/socialmussie/

Friday, January 30, 2009

Open redirect URLs: Is your site being abused?

No one wants malware or spammy URLs inserted onto their domain, which is why we all try to follow good security practices. But what if there were a way for spammers to take advantage of your site, without ever setting a virtual foot in your server?

There is, by abusing open redirect URLs.

Webmasters face a number of situations where it's helpful to redirect users to another page. Unfortunately, redirects left open to any arbitrary destination can be abused. This is a particularly onerous form of abuse because it takes advantage of your site's functionality rather than exploiting a simple bug or security flaw. Spammers hope to use your domain as a temporary "landing page" to trick email users, searchers and search engines into following links which appear to be pointing to your site, but actually redirect to their spammy site.

We at Google are working hard to keep the abused URLs out of our index, but it's important for you to make sure your site is not being used in this way. Chances are you don't want users finding URLs on your domain that push them to a screen full of unwanted porn, nasty viruses and malware, or phishing attempts. Spammers will generate links to make the redirects appear in search results, and these links tend to come from bad neighborhoods you don't want to be associated with.

This sort of abuse has become relatively common lately so we wanted to get the word out to you and your fellow webmasters. First we'll give some examples of redirects that are actively being abused, then we'll talk about how to find out if your site is being abused and what to do about it.

Redirects being abused by spammers

We have noticed spammers going after a wide range of websites, from large well-known companies to small local government agencies. The list below is a sample of the kinds of redirects we have seen used. These are all perfectly legitimate techniques, but if they're used on your site you should watch out for abuse.

  • Scripts that redirect users to a file on the server—such as a PDF document—can sometimes be vulnerable. If you use a content management system (CMS) that allows you to upload files, you might want to make sure the links go straight to the file, rather than going through a redirect. This includes any redirects you might have in the downloads section of your site. Watch out for links like this:
example.com/go.php?url=
example.com/ie/ie40/download/?

  • Internal site search result pages sometimes have automatic redirect options that could be vulnerable. Look for patterns like this, where users are automatically sent to any page after the "url=" parameter:
example.com/search?q=user+search+keywords&url=

  • Systems to track clicks for affiliate programs, ad programs, or site statistics might be open as well. Some example URLs include:
example.com/coupon.jsp?code=ABCDEF&url=
example.com/cs.html?url=

  • Proxy sites, though not always technically redirects, are designed to send users through to other sites and therefore can be vulnerable to this abuse. This includes those used by schools and libraries. For example:
proxy.example.com/?url=

  • In some cases, login pages will redirect users back to the page they were trying to access. Look out for URL parameters like this:
example.com/login?url=

  • Scripts that put up an interstitial page when users leave a site can be abused. Lots of educational, government, and large corporate web sites do this to let users know that information found on outgoing links isn't under their control. Look for URLs following patterns like this:
example.com/redirect/
example.com/out?
example.com/cgi-bin/redirect.cgi?

Is my site being abused?

Even if none of the patterns above look familiar, your site may have open redirects to keep an eye on. There are a number of ways to see if you are vulnerable, even if you are not a developer yourself.

  • Check if abused URLs are showing up in Google. Try a site: search on your site to see if anything unfamiliar shows up in Google's results for your site. You can add words to the query that are unlikely to appear in your content, such as commercial terms or adult language. If the query [site:example.com viagra] isn't supposed to return any pages on your site and it does, that could be a problem. You can even automate these searches with Google Alerts.

  • You can also watch out for strange queries showing up in the Top search queries section of Webmaster Tools. If you have a site dedicated to the genealogy of the landed gentry, a large number of queries for porn, pills, or casinos might be a red flag. On the other hand, if you have a drug info site, you might not expect to see celebrities in your top queries. Keep an eye on the Message Center in Webmaster Tools for any messages from Google.

  • Check your server logs or web analytics package for unfamiliar URL parameters (like "=http:" or "=//") or spikes in traffic to redirect URLs on your site. You can also check the pages with external links in Webmaster Tools.

  • Watch out for user complaints about content or malware that you know for sure cannot be found on your site. Your users may have seen your domain in the URL before being redirected and assumed they were still on your site.
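Checking logs for those parameter patterns can be scripted. Here's a minimal, hedged Python sketch that flags requests whose query string embeds an absolute URL; the log lines are hypothetical, and real formats vary by server:

```python
import re

# Hypothetical access-log lines; adapt the parsing to your server's format
log_lines = [
    '1.2.3.4 - - [30/Jan/2009] "GET /index.html HTTP/1.1" 200',
    '5.6.7.8 - - [30/Jan/2009] "GET /out?url=http://spammer.example/ HTTP/1.1" 302',
    '9.9.9.9 - - [30/Jan/2009] "GET /go.php?url=//evil.example HTTP/1.1" 302',
]

# Flag requests where a parameter value starts with an absolute URL
# (the "=http:" or "=//" patterns mentioned above)
suspicious = [line for line in log_lines if re.search(r'=(https?:|//)', line)]
for line in suspicious:
    print(line)
```

Run against a real log, a sudden spike in matches for your redirect script is a strong hint that spammers have found it.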


What you can do

Unfortunately there is no one easy way to make sure that your redirects aren't exploited. An open redirect isn't a bug or a security flaw in and of itself—for some uses they have to be left fairly open. But there are a few things you can do to prevent your redirects from being abused or at least to make them less attractive targets. Some of these aren't trivial; you may need to write some custom code or talk to your vendor about releasing a patch.

  • Change the redirect code to check the referer, since in most cases everyone coming to your redirect script legitimately should come from your site, not a search engine or elsewhere. You may need to be permissive, since some users' browsers may not report a referer, but if you know a user is coming from an external site you can stop or warn them.

  • If your script should only ever send users to an internal page or file (for example, on a page with file downloads), you should specifically disallow off-site redirects.

  • Consider using a whitelist of safe destinations. In this case your code would keep a record of all outgoing links, and then check to make sure the redirect is a legitimate destination before forwarding the user on.

  • Consider signing your redirects. If your website does have a genuine need to provide URL redirects, you can properly hash the destination URL and then include that cryptographic signature as another parameter when doing the redirect. That allows your own site to do URL redirection without opening your URL redirector to the general public.

  • If your site really isn't using the redirect, just disable or remove it. We have noticed a large number of sites where the only use of the redirect is by spammers; it's probably just a feature left turned on by default.

  • Use robots.txt to exclude search engines from the redirect scripts on your site. This won't solve the problem completely, as attackers could still use your domain in email spam. Your site will be less attractive to attackers, though, and users won't get tricked via web search results. If your redirect scripts reside in a subfolder with other scripts that don't need to appear in search results, excluding the entire subfolder may even make it harder for spammers to find redirect scripts in the first place.
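To make the "signing your redirects" idea from the list above concrete, here's a minimal Python sketch using an HMAC over the destination URL. The secret key, parameter names, and URLs are hypothetical; any server-side language offers equivalent primitives:

```python
import hmac
import hashlib

# Hypothetical secret; keep it server-side and never expose it in URLs
SECRET_KEY = b"replace-with-a-long-random-server-side-secret"

def sign_url(destination: str) -> str:
    """Signature your site attaches when it generates a redirect link."""
    return hmac.new(SECRET_KEY, destination.encode("utf-8"),
                    hashlib.sha256).hexdigest()

def redirect_allowed(destination: str, signature: str) -> bool:
    """Verify the signature before redirecting; reject URLs you never signed."""
    expected = sign_url(destination)
    return hmac.compare_digest(expected, signature)

# Your site would emit links like: /out?url=http://partner.example/&sig=<signature>
dest = "http://partner.example/"
sig = sign_url(dest)
print(redirect_allowed(dest, sig))                       # legitimate link
print(redirect_allowed("http://spammer.example/", sig))  # forged destination
```

Because spammers can't produce a valid signature for an arbitrary destination, the redirector stays usable by your own pages while being useless as an open relay.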



Open redirect abuse is a big issue right now but we think that the more webmasters know about it, the harder it will be for the bad guys to take advantage of unwary sites. Please feel free to leave any helpful tips in the comments below or discuss in our Webmaster Help Forum.

Thursday, January 22, 2009

Year in Review

2008 was another great year for the Webmaster Central team. We experienced tremendous user growth with our blogs (97% increase in monthly pageviews), Help Center (25%), Help Forums (225%), and Webmaster Tools (35%). We would like to welcome our new users that joined us in '08, and thank our loyal and passionate user base that has been with us for the last couple of years. We focused on two basic goals for 2008, and here's how we think we did:

Goal #1: Educate and grow our webmaster community
  • We had our first ever online webmaster chat in February '08 to answer your top questions, and followed it up with three more. They have been incredibly successful, and we're planning for more this year.
  • We'd like to send a special thank you to our Bionic Posters, who have played a huge part in supporting our growing community.
  • Localization has been a big focus for us, so we launched our blog and Help Center in additional languages, and made Webmaster Tools available in 40 languages. We hope this makes it easier for people in other parts of the world to adopt our tools and gain a better understanding of how search works.
  • We launched a new Help Forum in English and Polish, with a broader rollout planned in other languages this year.
  • Our SEO starter guide was released and it has been one of our most successful articles to date.
  • We placed an emphasis on sharing material via YouTube and created seven video series totaling two hours of content. We kicked off '09 with a bang on the video front with Matt's "Virtual Blight" presentation.
Goal #2: Iterate early and often on Webmaster Tools
Thank you once again and we hope for another exciting and eventful year!

Thursday, January 15, 2009

Adding a social playlist to your site

As you're building your site, you may be looking for a simple way to provide fresh content that captures the attention of first time visitors and loyal users alike. They say that music brings people together, so what better way to engage your visitors than by inviting them to help build a unique, collaborative soundtrack for your website? Now, social application creator iLike has built a special version of their social playlist gadget for sites using Google Friend Connect.

Visitors can add their favorite songs
iLike's playlist gadget lets you and your visitors shape the site's "musical footprint" as a group. With this application, anyone visiting your website can listen to songs on the playlist, and if they sign in using Friend Connect, they can add their own favorites to the list. Of course, you can also add songs to the playlist, and as the site administrator, you have the ability to remove songs or change the order.

If you already have Friend Connect running on your website, you can add some musical flair in a matter of minutes with just a few clicks. Sign in at www.google.com/friendconnect, click "Social Gadgets," and you'll find the iLike "Playlist gadget" in the gallery.


Select the "Playlist gadget," and Friend Connect will automatically generate a snippet of code for you to copy-and-paste into your website's HTML. While you're there, you may also consider adding the "Wall gadget"—music can be a great conversation starter!

This iLike gadget is fully integrated with your existing Friend Connect account, so you can edit your website's playlist, moderate wall posts, and manage membership all from a single interface.

Like all of the social applications that work with Friend Connect, iLike's application is built using OpenSocial, and it's a great example of how a social application can foster a sense of community around a website. Any site using Friend Connect can host gadgets created by the OpenSocial developer community.

If you're a site owner who wants to begin adding social features to your website, visit Google Friend Connect. No programming is required!

If you're a developer interested in building a social application to run on the tens of thousands of websites that are now using Google Friend Connect, learn more at www.opensocial.org.

Wednesday, January 14, 2009

Seamless verification of Google Sites and Blogger with Webmaster Tools

Note: Verification of Blogger blogs in Webmaster Tools has changed significantly. Please see the more recent blog post "Verifying a Blogger blog in Webmaster Tools" for more details.


Verifying that you own a site is the first step towards accessing all of the great features Webmaster Tools has to offer, such as crawl errors and query statistics. The Google Sites and Blogger teams have worked hard to make site verification as simple as possible. In the following videos, I'll walk you through how to verify sites created in Google Sites and Blogger.

Google Sites:


Blogger:


These videos are available in our Help Center if you have additional questions about verifying a Google Site or Blogger blog with Webmaster Tools. And as always, you can find me and many other Googlers and webmasters in our Webmaster Help Forum.

Tuesday, January 13, 2009

A new Google Sitemap Generator for your website

It's been well over three years since we initially announced the Python Sitemap generator in June 2005. In this time, we've seen lots of people create great third-party Sitemap generators to help webmasters create better Sitemap files. While most Sitemap generators either crawl websites or list the files on a server, we have created a different kind of Sitemap generator that uses several ways to find URLs on your website and then allows you to automatically create and maintain different kinds of Sitemap files.

Google Sitemap Generator screenshot of the admin console

About Google Sitemap Generator


Our new open-source Google Sitemap Generator finds new and modified URLs based on your webserver's traffic, its log files, or the files found on the server. By combining these methods, Google Sitemap Generator can be very fast in finding these URLs and calculating relevant metadata, thereby making your Sitemap files as effective as possible. Once Google Sitemap Generator has collected the URLs, it can create and maintain the appropriate Sitemap files for you.
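For readers curious what the generated files look like, here's a minimal sketch (not part of Google Sitemap Generator itself) that builds a sitemaps.org-compliant Sitemap from a list of URLs; the example URL and lastmod date are placeholders:

```python
# Minimal illustration of the sitemaps.org <urlset> format that Sitemap
# generators produce. Not the actual Google Sitemap Generator code.
from xml.etree import ElementTree as ET

def build_sitemap(entries):
    """Build a sitemaps.org-compliant <urlset> from (loc, lastmod) pairs."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc          # required: page URL
        ET.SubElement(url, "lastmod").text = lastmod  # optional: last change
    return ET.tostring(urlset, encoding="unicode")

xml = build_sitemap([("http://www.example.com/", "2009-01-13")])
```

The real generator adds further optional metadata (such as change frequency) that it infers from server traffic and log files.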

In addition, Google Sitemap Generator can send a ping to Google Blog Search for all of your new or modified URLs. You can optionally include the URLs of the Sitemap files in your robots.txt file as well as "ping" the other search engines that support the sitemaps.org standard.
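The two notification mechanisms mentioned above can be sketched as follows; the domain, file name, and ping endpoint are placeholders for illustration (the ping URL pattern follows the sitemaps.org protocol):

```python
# Hedged sketch of the sitemaps.org ping URL and the robots.txt
# "Sitemap:" directive described above. example.com is a placeholder.
from urllib.parse import urlencode

SITEMAP_URL = "http://www.example.com/sitemap.xml"

def ping_url(ping_base):
    """Build a sitemaps.org-style ping URL, e.g. from a base like
    "http://www.google.com/ping"."""
    return ping_base + "?" + urlencode({"sitemap": SITEMAP_URL})

def robots_txt_line():
    # Listing the Sitemap in robots.txt lets any supporting crawler find it.
    return "Sitemap: %s" % SITEMAP_URL

ping = ping_url("http://www.google.com/ping")
```

Fetching the ping URL (for example with a simple HTTP GET) is what tells the search engine that the Sitemap has changed.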

Sending the URLs to the right Sitemap files is simple thanks to the web-based administration console. This console gives you access to various features that make administration a piece of cake while maintaining a high level of security by default.

Getting started


Google Sitemap Generator is a server plug-in that can be installed on both Linux-based Apache servers and Windows-based Microsoft IIS servers. As with other server-side plug-ins, you will need administrative access to the server to install it. You can find detailed installation instructions in the Google Sitemap Generator documentation.

We're excited to release Google Sitemap Generator with its source code, and we hope this will encourage more web hosts to include this or similar tools in their hosting packages!

Do you have any questions? Feel free to drop by our Help Group for Google Sitemap Generator or ask general Sitemaps questions in our Webmaster Help Forum.

Monday, January 12, 2009

Preventing Virtual Blight: my presentation from Web 2.0 Summit

One of the things I'm thinking about in 2009 is how Google can be even more transparent and communicate more. That led me to a personal goal for 2009: if I give a substantial conference presentation (not just a question and answer session), I'd like to digitize the talk so that people who couldn't attend the conference can still watch the presentation.

In that spirit, here's a belated holiday present. In November 2008 I spoke on a panel about "Preventing Virtual Blight" at the Web 2.0 Summit in San Francisco. A few weeks later I ended up recreating the talk at the Googleplex and we recorded the video. In fact, this is a "director's cut" because I could take a little more time for the presentation. Here's the video of the presentation:



And if you'd like to follow along at home, I'll include the actual presentation as well:



You can also access the presentation directly. By the way, thanks to Wysz for recording this not just on a shoestring budget but for free. I think we've got another video ready to go pretty soon, too.