

Site map
Looking around Google's Webmaster Tools, I found a well-hidden link to Google's help page about site map formatting. After reading the Sitemap protocol (it is just a simple XML file), I set off to write a method that would take my dynamic website and create a nice little map for it. After all, if you cannot find it in Google, you really don't need it.


I am sure I can come up with a general method to do this, but currently it is specific to how I have created my website.

You have to start your XML document. I really like using rexml/document; you will also need the zlib library.<br><pre>
require 'rexml/document'
require 'zlib'
data=REXML::Document.new('<?xml version="1.0" encoding="UTF-8"?>')
base=data.add_element("urlset")
base.attributes["xmlns"]="http://www.google.com/schemas/sitemap/0.84"</pre>
<br>These values are found on the Sitemap protocol website<br><br>
Then I add in my static content values<br><pre>
el=base.add_element("url")
el.add_element("loc").text="http://www.stephenbeckeriv.com/"
#Time needs to be in W3C format. For the life of me, I cannot get Time's w3cdtf to work
el.add_element("lastmod").text=Time.now.strftime("%Y-%m-%d")
el.add_element("changefreq").text="weekly"
el.add_element("priority").text="0.5"
#turns out they do not like svn sites 
#el=base.add_element("url")
#el.add_element("loc").text="http://www.svn.stephenbeckeriv.com/code"
#el.add_element("lastmod").text=Time.now.strftime("%Y-%m-%d")
#el.add_element("changefreq").text="weekly"
#el.add_element("priority").text="0.5"</pre>
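A note on the date format: the Sitemap protocol expects W3C Datetime, and a plain YYYY-MM-DD date is one valid form of it, which is why the strftime call above is accepted. If a full timestamp is wanted instead, Ruby's standard time library provides Time#xmlschema. The following is only a minimal sketch; the sample timestamp is made up for illustration.

```ruby
require 'time'  # adds Time#xmlschema (also available as Time#iso8601)

# Hypothetical timestamp, used only for illustration
stamp = Time.utc(2007, 3, 4, 5, 6, 7)

date_only = stamp.strftime("%Y-%m-%d")  # date-only form, as used in the post
full      = stamp.xmlschema             # full date-and-time form

puts date_only
puts full
```

Either value can be assigned to the lastmod element; the date-only form is usually enough for pages that change at most daily.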
<BR><BR>Then I add in the context part of my site<br>
<pre>
@context_list = Post.find(:all,:select=>"context")
arr=[]
@context_list.each{|x|arr.push(x.context)}
arr.uniq.each{|context| 
   el=base.add_element("url")
   el.add_element("loc").text="http://www.stephenbeckeriv.com/#{context}"
   el.add_element("lastmod").text=Time.now.strftime("%Y-%m-%d")
   el.add_element("changefreq").text="weekly"
   el.add_element("priority").text="0.5" 
} </pre>
<br> I know I could make the results unique in the SQL itself (SELECT DISTINCT does this), but I could not remember the syntax at the time. <br><BR>
Then I add my dynamic content<br><pre> 
      @post=Post.find(:all)
      @post.each{|post| 
        el=base.add_element("url")
        #if you read my first post you know why this works
        el.add_element("loc").text="http://www.stephenbeckeriv.com/#{URI.encode(post.title.gsub(" ","_"))}"
        a=Time.parse("#{post.created}") #it did not like having the date formatting on one line
        el.add_element("lastmod").text=a.strftime("%Y-%m-%d")
        el.add_element("changefreq").text="never"
        el.add_element("priority").text="0.8"
      }</pre>
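One caveat for anyone running this on a current Ruby: URI.encode was removed in Ruby 3.0. A possible stand-in is URI.encode_www_form_component, sketched below. Note that it is a swapped-in technique that escapes more characters than URI.encode did (for example "&" and "/"), so whether the resulting URLs still match this site's routes is an assumption. post_url is a hypothetical helper name, and the second title is made up.

```ruby
require 'uri'

# Hypothetical helper: builds the public URL for a post title.
# Spaces become underscores first, as in the post; the rest is percent-encoded.
def post_url(title, host = "http://www.stephenbeckeriv.com")
  slug = URI.encode_www_form_component(title.gsub(" ", "_"))
  "#{host}/#{slug}"
end

puts post_url("Site map")
puts post_url("Ruby & Rails")  # made-up title containing a reserved character
```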
<br><BR>We now have all the web pages for my site. Next we need to save this to a file in public. There are some limits on the file: it cannot contain more than 50,000 URLs, and it must be smaller than 10MB. If that is a problem, you have to use Sitemap index files, which I do not cover here. <br><br><pre>
      result=""
      data.write(result)
      dir=Dir.pwd
      if dir.include?("/public")
        #server working directory already ends in /public
        path=dir+"/sitemap.xml.gz"
      else
        #local working directory does not have /public
        path=dir+"/public/sitemap.xml.gz"
      end
      Zlib::GzipWriter.open(path){|file|
        file.write result
      }</pre>
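Before submitting the file, it can be worth confirming that it decompresses back to the same XML. The following is a minimal sketch of such a round trip; it writes to a temporary directory rather than public, and the xml string is a stand-in for the real serialized sitemap.

```ruby
require 'zlib'
require 'tmpdir'

# Stand-in for the serialized sitemap built earlier
xml = '<?xml version="1.0" encoding="UTF-8"?><urlset></urlset>'

round_trip = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "sitemap.xml.gz")  # temporary location, not /public

  # Write the compressed file
  Zlib::GzipWriter.open(path) { |gz| gz.write(xml) }

  # Read it back to confirm the contents survived compression
  round_trip = Zlib::GzipReader.open(path) { |gz| gz.read }
end

puts round_trip == xml
```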


And you are done! Make this a method on your admin console and submit the link to Google. A few things I still want to do with this: provide a link to the XML, and create a dynamic site map with Ajax using the same ideas. If someone wanted to present a map of my site, they could take the XML and format it however they like. I am also thinking about creating a general version that looks at the views and routes.rb to try to build the proper links.


Update:
It turns out that when I added my site to Google's Webmaster Tools I used http://stephenbeckeriv.com/, not http://www.stephenbeckeriv.com/, so it does not like my site map because the URLs in it include the www. How fun and flexible.
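A mismatch like this could be caught before submitting with a quick host check. This is only a sketch: mismatched_hosts is a hypothetical helper, and the registered host is passed in by hand rather than read from anywhere.

```ruby
require 'uri'

# Hypothetical helper: returns the URLs whose host differs from the one
# registered with the search engine.
def mismatched_hosts(urls, registered_host)
  urls.reject { |u| URI.parse(u).host == registered_host }
end

urls = [
  "http://www.stephenbeckeriv.com/",
  "http://stephenbeckeriv.com/"
]

puts mismatched_hosts(urls, "stephenbeckeriv.com")
```

Any URL it prints would need its host changed, or the site would need to be registered under the other host name.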