~

Composing URL’s in XSLT

In Uncategorized on September 28, 2010 at 8:11 pm

Composing a complex URL can be very ugly in XSLT, using a very long formula of concatenations and encodings.
E.g the REST URL for this YQL query:

select * from html where url=”http://www.doorstroming.net/index.php/actua/49-qnationalisme-is-nationalismeq.html” and
xpath=’//*[contains(concat( " ", @class, " " ), concat( " ", "MsoNormal", " " ))]‘

Is ugly as sin:

http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22http%3A%2F%2Fwww.doorstroming.net%2Findex.php%2Factua%2F49-qnationalisme-is-nationalismeq.html%22%20and%0A%20%20%20%20%20%20xpath%3D’%2F%2F*%5Bcontains(concat(%20%22%20%22%2C%20%40class%2C%20%22%20%22%20)%2C%20concat(%20%22%20%22%2C%20%22MsoNormal%22%2C%20%22%20%22%20))%5D’&diagnostics=true&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys

Two possible improvements:

Dedicated URL composition template

Create a template that can be called like this:

<xsl:call-template name=”url”>
<xsl:with-param name=”decom”>
<url>
<base>http://query.yahooapis.com/v1/public/yql</base>
<query>
<name>q</name>
<value><xsl:value-of select=”$query”/></value>
</query>
<query>
<name>diagnostics</name>
<value>false</value>
</query>
<query>
<name>env</name>
<value>store://datatables.org/alltableswithkeys</name>
</query>
</url>
</xsl:with-param>
</xsl:call-template>

I’m not elaborating this template here. It should be quite straightforward, but there may be a more generic and even more elegant solution:

String Replacement Template

Using the replace template that was created in the XML Translation project.

<xsl:call-template name=”replace”>
<xsl:with-param name=”string”>
<xsl:text>http://query.yahooapis.com/v1/public/yql?q=$query&diagnostics=true&env=$env</xsl:text>
</xsl:with-param>
<xsl:with-param name=”parameters”>
<query><xsl:value-of select=”encode-for-uri($query)”/></query>
<env><xsl:value-of select=”encode-for-uri(‘store://datatables.org/alltableswithkeys’)”/></env>
</xsl:with-param>
</xsl:call-template>

This is the replace template:

<xsl:template name=”replace”>
<xsl:param name=”string”/>
<xsl:param name=”parameters”/>
<xsl:choose>
<xsl:when test=”count($parameters/*) &gt; 0″>
<xsl:variable name=”string2″>
<xsl:value-of select=”replace($string,concat(‘\$’,name($parameters/*[1])),normalize-space($parameters/*[1]))”/>
</xsl:variable>
<xsl:call-template name=”replace”>
<xsl:with-param name=”string” select=”$string2″/>
<xsl:with-param name=”parameters”>
<xsl:copy-of select=”$parameters/*[position() &gt; 1]“/>
</xsl:with-param>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select=”$string”/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

Vorstalarm

In Uncategorized on September 25, 2010 at 7:59 pm

YQL Open Table Design

use “http://github.com/vicmortelmans/yql-tables/raw/master/weather/weather.frost.xml”  as weather.frost;select * from weather.frost where location = “Ringsaker”

(note: this location is somewhere in northern Scandinavia, where it’s more likely to freeze :)

Google Weather API

Google’s secret weather API:

http://www.google.com/ig/api?weather=antwerpen

YQL Open Table Execute

var url = “http://www.google.com/ig/api?weather=$location“;
url = url.replace(/\$location/,encode-uri(location));
xml = y.query(“select * from xml where url=’http://www.google.com/ig/api?weather=antwerpen‘”).results;
var seq = new String();
for each (var lowtemp in xml..forecast_conditions.low.@data) {
seq += (lowtemp > 32 ? ‘+’ : ‘-’);
}
seq = seq.substr(0,3);
var output = new String();
switch (seq) {
case “+++”:
output = “Aanhoudende dooi in $location”;
break;
case “++-”:
output = “Overmorgen gaat het vriezen in $location”;
break;
case “+-+”:
case “+–”:
output = “Morgen gaat het vriezen in $location”;
break;
case “—”:
case “-+-”:
case “–+”:
output = “Aanhoudende vorst in $location”;
break;
case “-++”:
output = “Morgen dooit het in $location”;
break;
}
output = output.replace(/\$location/,location);

YQL XML open table returns no data when strange characters are in the XML

When running a query on the Google weather API for a city in France, very often the returned XML contains strange characters, like e.g. in “Avignon, Provence-Alpes-Côte d’Azur”. When querying for this xml through YQL XML open table, nothing is returned!

The Google API is not providing encoding information:

<?xml version=”1.0″?>

Based on an octal dump, I’d say it’s UTF8:

0000300   s   -   C 303 264 t   e       d   &   #   3   9   ;   A   z

Yahoo weather YQL tables

select * from weather.woeid where w in (select woeid from geo.places where text=”paris”)

Not very useful… only one day forecast, and e.g. for ‘Paris’ returning all cities with the name.

Creating a pipe works better:

http://pipes.yahoo.com/pipes/pipe.run?_id=4f9865c59104e4f105c331c5fbee4116&_render=rss&location=antwerpen

The pipe first gets the woeid for the location using the YQL woeid open table (see above), then loads the weather info from the Yahoo Weather RSS feed.

The result is formatted html, so extracting the low temp values requires text parsing.

World Weather Online API

Registration required, a simple API, but it won’t take ‘Antwerpen’, but only the english name of this flemish city… not so nice!

http://www.worldweatheronline.com/feed/weather.ashx?q=antwerp&format=xml&num_of_days=3&key=81e058dc89161948102509

So let’s use the YQL geo.places open table for getting the coordinates of our location, and feed this to the longitude/latitude World Weather Online API:

http://www.worldweatheronline.com/feed/weather.ashx?q=51.22,4.40&format=xml&num_of_days=3&key=81e058dc89161948102509

This operation chain is performed in this pipe.

This pipe, on it’s turn, can be used as source for YQL open table execute scripting, via the JSON open table query:

select * from json where url=”http://pipes.yahoo.com/pipes/pipe.run?_id=70a400f5404b6a454caf850de3f62d33&_render=json&location=paris” and itemPath=”//tempMinC”

The final weather.frost open table allows for an extra parameter: the target language.

use “http://github.com/vicmortelmans/yql-tables/raw/master/weather/weather.frost.xml”  as weather.frost;select * from weather.frost where location = “Antwerpen” and language=”nl”

Yahoo Pipe

The wrapper pipe is needed to turn the YQL query output into RSS format.

http://pipes.yahoo.com/pipes/pipe.info?_id=145e579bd145fb0b01f9473f69b7e4c1

RSS:

http://pipes.yahoo.com/pipes/pipe.run?_id=145e579bd145fb0b01f9473f69b7e4c1&_render=rss&language=en&location=Paris

encoded:

http%3A%2F%2Fpipes.yahoo.com%2Fpipes%2Fpipe.run%3F_id%3D145e579bd145fb0b01f9473f69b7e4c1%26_render%3Drss%26language%3Den%26location%3DParis

Todo

It would be nice if the language could be automatically detected, based on the provided city name! (but what about Brussels?).

Google Gadget

The original Frost Alert article refers to the Google Webpage where the Google gadget is published.

Managing the site is done in sites.google.com, but that’s OK.

Editing the gadget is done in the iGoogle Gadget Editor. This is the gadget code:

<?xml version=”1.0″ encoding=”UTF-8″?>
<Module>
<ModulePrefs title=”Vorstalarm!” height=”401″ width=”576″ author=”Vic Mortelmans” author_email=”vic.mortelmans@telenet.be”/>
<Content type=”html”><![CDATA[
<script type="text/javascript">
htmlTemplate = '<a href="http://www.addtoany.com/subscribe?linkname=Vorstalarm!&amp;linkurl=http%3A%2F%2Fpipes.yahoo.com%2Fpipes%2Fpipe.run%3F_id%3D145e579bd145fb0b01f9473f69b7e4c1%26_render%3Drss%26language%3Dnl%26location%3D_LOCATION_" target="_blank"><img src="http://static.addtoany.com/buttons/subscribe_120_16.gif" width="120" height="16" border="0" alt="Subscribe"/></a>';
function displayGetRSS () {
var html = htmlTemplate.replace(/_LOCATION_/,document.getElementById('input').value);
document.getElementById('content_div').innerHTML = html;
}
</script>
<div>
<img src="http://farm4.static.flickr.com/3520/4058180670_ed979a4099_o.jpg" width="576" height="401" alt="breughel_hunters" />
<div style="position: relative; background: white; top: -350px; left: 193px; width: 190px; height: 110px; padding: 10px" >
<div style="text-align: center">
<form>
<p>
<b>Woonplaats:</b>
<br/>
<input id="input" name="location" onkeyup="displayGetRSS()" type="text"/>
</p>
</form>
</div>
<div id="content_div" style="text-align: center">
<img src="http://static.addtoany.com/buttons/subscribe_120_16.gif" width="120" height="16" border="0" alt="Subscribe"/>
</div>
</div>
</div>
]]>
</Content>
</Module>

Note that it seems the gagdet must be published (and re-inserted into the webpage) before changes take effect !?

Google docs spreadsheet as database

In Uncategorized on September 22, 2010 at 7:50 pm

I have a spreadsheet containing a terminology list, translated in multiple languages:
https://spreadsheets.google.com/ccc?key=0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE&hl=nl

Access via list-based feed

Documentation: http://code.google.com/intl/nl/apis/spreadsheets/data/3.0/reference.html#ListFeed

Example:
https://spreadsheets.google.com/feeds/list/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/1/public/basic
(note: public visibility requires the spreadsheet to be published)
(note: the key, I got when publishing the document)

Result:

<feed>

<title type=‘text’>Blad1</title>

<entry>

<title type=‘text’>christmas</title>

<content type=‘text’>nl: kerstmis, fr: noel</content>

</entry>

<title type=‘text’>easter</title>

<content type=‘text’>nl: pasen, fr: paques</content>

</entry>

</feed>

I have my data, but I’ll have to do some more regex-ing to parse the entries of the feed…

A specific row can be selected using a list feed structured query:
https://spreadsheets.google.com/feeds/list/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/1/public/basic?sq=en%3Dchristmas

Access via Google Visualization API Query Language

Documentation:
http://blog.ouseful.info/2009/05/18/using-google-spreadsheets-as-a-databace-with-the-google-visualisation-api-query-language/
http://code.google.com/intl/nl/apis/visualization/documentation/querylanguage.html

Example:
http://spreadsheets.google.com/a/google.com/tq?key=0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE
Hmmm… smells like JSON
http://spreadsheets.google.com/a/google.com/tq?key=0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE&tq=select%20C%20where%20A%20%3D%20′easter’
The column identifier is the column index (‘A’, ‘B’, …) and not the label. That’s a pity. Could be solved by having another table that makes the relationship between the language names and the column indices, but that’s not very clean.

Access via table feed

Documentation:
http://code.google.com/intl/nl/apis/spreadsheets/data/3.0/reference.html#TableFeed

Example:
https://spreadsheets.google.com/feeds/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/tables
https://spreadsheets.google.com/feeds/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/records/0
Nothing there…? This is a bit silly. Seems that you can access the data using this very simple URL-based API, but to
create a table, you must use the DATA API and perform a POST request containing the table definition and  with authenticated header.
So setting a ‘range’ in the GUI is not sufficient (as I first assumed).

Further study of the Spreadsheets Data API is needed.

If you are curious about how the Google Data APIs work at the basic level using XML and HTTP, you can read the Protocol Guide. This guide details the requests and responses that the Data API servers expect and return. To learn more about the structure of these requests and responses, read the Reference Guide. This guide defines the API’s feed types, HTTP request parameters, HTTP response codes, and XML elements.

To make working with the API easier, we have a number of client libraries that abstract the API into a language-specific object model. There are Developer’s Guides for Java, .NET, PHP, and Python as well as sample code.

So I guess I’m left with these alternatives:

  • building a small java application to create the table
  • start fiddling about with http console or curl-like tools to POST the request manually

Let’s not fiddle… And take a tour with the Java client library.

Here  are examples:



Client library download (installation is unzipping).

There’s also an Eclipse plugin provided, so I probably should not waste effort using NetBeans.

  1. Installation of the plugin
  2. TemplateDemo – creating a first project (targeting Spreadsheet rather than Documents)

Got this error: “java.lang.NoClassDefFoundError: com/google/common/collect/Maps”, because I also need the Google Collections (a.k.a Guava).
Installation is unzipping, and adding guava-r06.jar and guava-src-r06.zip to the project’s libraries.

Now the program runs and gives me a list of my spreadsheets:

Getting Spreadsheet entries…

Terminology liturgical days

spiegelserre

krantenwaaier

Lexfeed defects

Kindermissaal

klapluik

multigrade

Total Entries: 7

A few modifications of the example in “create a table” and some browsing through the client’s library javadoc.

// Create a new Spreadsheet service
SpreadsheetService myService = new SpreadsheetService(“My Application”);
myService.setUserCredentials(args[0],args[1]);

TableEntry tableEntry = new TableEntry();

FeedURLFactory factory = FeedURLFactory.getDefault();
URL tableFeedUrl = factory.getTableFeedUrl(“0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE”);

// Delete table if it already exists
TableFeed tableFeed = myService.getFeed(tableFeedUrl, TableFeed.class);
for (TableEntry table : tableFeed.getEntries()) {
String currTitle = table.getTitle().getPlainText();

if (currTitle.equals(“Terminology”)) {

table.delete();

break;

}

}

// Specify a basic table:
tableEntry.setTitle(new PlainTextConstruct(“Terminology”));
tableEntry.setWorksheet(new Worksheet(“Blad1″));
tableEntry.setHeader(new Header(1));

// Specify columns in the table, start row, number of rows.
Data tableData = new Data();
tableData.setNumberOfRows(200);
// Start row index cannot overlap with header row.
tableData.setStartIndex(2);
// This table has only one column.
tableData.addColumn(new Column(“A”, “en”));
tableData.addColumn(new Column(“B”, “fr”));
tableData.addColumn(new Column(“C”, “nl”));

tableEntry.setData(tableData);
myService.insert(tableFeedUrl, tableEntry);

To change the number of rows (if defined as ‘0’, there are no rows – obviously?), I have to remove the table if it already exists (see bold above).
When retrieving the table feed, I must supply a table number, which seems to increment when tables are deleted and added. So now I can retrieve the data from:
https://spreadsheets.google.com/feeds/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/records/1

But damn me! The records are still printed like “fr: kerstmis, nl: noel, en: christmas”…

A specific record can be retrieved using a structured query:
https://spreadsheets.google.com/feeds/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/records/1?sq=en%3Dchristmas

Cell based feed

First I need the worksheets feed

https://spreadsheets.google.com/feeds/worksheets/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/private/full

In the returned feed, this tag can be found:

<id>https://spreadsheets.google.com/feeds/worksheets/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/private/full/od6</id>

The ‘od6’ is the ID of the worksheet, to be used in the cell based feed URL:

https://spreadsheets.google.com/feeds/cells/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/od6/private/full

The return contains individual cells:

A1

dinsdag 21 september 2010 21:36

en

B1

dinsdag 21 september 2010 21:36

fr

C1

dinsdag 21 september 2010 21:36

nl

Or as xml:

<feed>
<entry>
<title type=‘text’>A1</title>
<content type=‘text’>en</content>
<gs:cell row=’1′ col=’1′ inputValue=‘en’>en</gs:cell>
</entry>
<entry>
<title type=‘text’>B1</title>
<content type=‘text’>fr</content>
<gs:cell row=’1′ col=’2′ inputValue=‘fr’>fr</gs:cell>
</entry>
<entry>
<title type=‘text’>C1</title>
<content type=‘text’>nl</content>
<gs:cell row=’1′ col=’3′ inputValue=‘nl’>nl</gs:cell>
</entry>
<entry>
<title type=‘text’>A2</title>
<content type=‘text’>christmas</content>
<gs:cell row=’2′ col=’1′ inputValue=‘christmas’>christmas</gs:cell>
</entry>
<entry>
<content type=‘text’>kerstmis</content>
<gs:cell row=’2′ col=’2′ inputValue=‘kerstmis’>kerstmis</gs:cell>
</entry>
<entry>
<title type=‘text’>C2</title>
<content type=‘text’>noel</content>
<gs:cell row=’2′ col=’3′ inputValue=‘noel’>noel</gs:cell>
</entry>
<entry>
<title type=‘text’>A3</title>
<content type=‘text’>easter</content>
<gs:cell row=’3′ col=’1′ inputValue=‘easter’>easter</gs:cell>
</entry>
<entry>
<title type=‘text’>B3</title>
<content type=‘text’>pasen</content>
<gs:cell row=’3′ col=’2′ inputValue=‘pasen’>pasen</gs:cell>
</entry>
<entry>
<title type=‘text’>C3</title>
<content type=‘text’>paques</content>
<gs:cell row=’3′ col=’3′ inputValue=‘paques’>paques</gs:cell>
</entry>
</feed>

This should be queriable by a clever XPath via YQL.

feed/entry[gs:col/@col = (<column where first row is $language>) and gs:row/@row = (row where first column is $text)]/content

/feed/entry[gs:col/@col = (/feed/entry[gs:cell/@row = ‘1’ and content = $language]/@col) and gs:row/@row = (/feed/entry[gs:cell/@col = ‘A’ and content=$text]/@row)]/content

But -alas!- there seems to be an issue. The XML table isn’t returning any data at all:

select * from xml where url=”https://spreadsheets.google.com/feeds/cells/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/od6/private/full”

Even after putting https in front of the yql URL… (I suspect it’s because the google feed URL is on https)…

> The explanation is easy: this feed requires authentication…

>> Wait a minute… after reading this, I just did ‘Publish as webpage’ and selected ‘Atom’ and ‘Cells’ as options, and got this URL:

https://spreadsheets.google.com/feeds/cells/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/od6/public/basic

This URL is public, so it works in YQL !!

select * from xml where url=”https://spreadsheets.google.com/feeds/cells/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/od6/public/basic”

I can also get the data in CSV format (been reading this and this)

https://spreadsheets.google.com/pub?key=0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE&hl=nl&single=true&gid=0&output=csv

And perform this query on the CSV open table,

select * from csv where url=”https://spreadsheets.google.com/pub?key=0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE&hl=nl&single=true&gid=0&output=csv”

which on it’s turn can be fed into the XML table….
- or I should be able to add to the query ‘and columns = “en,fr,nl” and en=”christmas”’, the column names list being whatever is appropriate for the queried Spreadsheet.

Or wait,

  1. publish the spreadsheet as CSV
  2. turn the CSV into XML using YQL CSV open table
    REST URL
  3. query the XML using YQL XML table

The query:

select * from xml where itemPath = “/query/results/row/*[../*[name(.) = name(../../row[1]/*[. = 'en'])] = ‘easter‘ and name(.) = name(../../row[1]/*[. = 'nl'])]” and  url = “http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20csv%20where%20url%3D%22https%3A%2F%2Fspreadsheets.google.com%2Fpub%3Fkey%3D0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE%26hl%3Dnl%26single%3Dtrue%26gid%3D0%26output%3Dcsv%22&diagnostics=false”

This is equivalent to:

select nl from key where en = ‘easter

Google docs spreadsheet as database

Google docs spreadsheet as database

Access via list-based feed

Access via Google Visualization API Query Language

Access via table feed

Cell based feed

I have a spreadsheet containing a terminology list, translated in multiple languages:

https://spreadsheets.google.com/ccc?key=0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE&hl=nl

Access via list-based feed

Documentation: http://code.google.com/intl/nl/apis/spreadsheets/data/3.0/reference.html#ListFeed

Example:

https://spreadsheets.google.com/feeds/list/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/1/public/basic(note: public visibility requires the spreadsheet to be published)

(note: the key, I got when publishing the document)

Result:

<feed>

<title type=‘text’>Blad1</title>

<entry>

<title type=‘text’>christmas</title>

<content type=‘text’>nl: kerstmis, fr: noel</content>

</entry>

<title type=‘text’>easter</title>

<content type=‘text’>nl: pasen, fr: paques</content>

</entry>

</feed>

I have my data, but I’ll have to do some more regex-ing to parse the entries of the feed…

A specific row can be selected using a list feed structured query:

https://spreadsheets.google.com/feeds/list/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/1/public/basic?sq=en%3Dchristmas

Access via Google Visualization API Query Language

Documentation:

http://blog.ouseful.info/2009/05/18/using-google-spreadsheets-as-a-databace-with-the-google-visualisation-api-query-language/

http://code.google.com/intl/nl/apis/visualization/documentation/querylanguage.html

Example:

http://spreadsheets.google.com/a/google.com/tq?key=0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE

Hmmm… smells like JSON

http://spreadsheets.google.com/a/google.com/tq?key=0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE&tq=select%20C%20where%20A%20%3D%20′easter’

The column identifier is the column index (‘A’, ‘B’, …) and not the label. That’s a pity. Could be solved by having another table that makes the relationship between the language names and the column indices, but that’s not very clean.

Access via table feed

Documentation:

http://code.google.com/intl/nl/apis/spreadsheets/data/3.0/reference.html#TableFeed

Example:

https://spreadsheets.google.com/feeds/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/tables

https://spreadsheets.google.com/feeds/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/records/0
Nothing there…? This is a bit silly. Seems that you can access the data using this very simple URL-based API, but to
create a table, you must use the DATA API and perform a POST request containing the table definition and  with authenticated header.

So setting a ‘range’ in the GUI is not sufficient (as I first assumed).

Further study of the Spreadsheets Data API is needed.

If you are curious about how the Google Data APIs work at the basic level using XML and HTTP, you can read the Protocol Guide. This guide details the requests and responses that the Data API servers expect and return. To learn more about the structure of these requests and responses, read the Reference Guide. This guide defines the API’s feed types, HTTP request parameters, HTTP response codes, and XML elements.

To make working with the API easier, we have a number of client libraries that abstract the API into a language-specific object model. There are Developer’s Guides for Java, .NET, PHP, and Python as well as sample code.

So I guess I’m left with these alternatives:

  1. building a small java application to create the table
  2. start fiddling about with http console or curl-like tools to POST the request manually

Let’s not fiddle… And take a tour with the Java client library.

Here  are examples:

  1. http://code.google.com/p/gdata-java-client/source/browse/trunk/java/sample/spreadsheet/worksheet/WorksheetDemo.java
  2. http://ga-api-java-samples.googlecode.com/svn/trunk/src/v1/SpreadsheetExporter.java

Client library download (installation is unzipping).

There’s also an Eclipse plugin provided, so I probably should not waste effort using NetBeans.

  1. Installation of the plugin
  2. TemplateDemo – creating a first project (targeting Spreadsheet rather than Documents)

Got this error: “java.lang.NoClassDefFoundError: com/google/common/collect/Maps”, because I also need the Google Collections (a.k.a Guava).

Installation is unzipping, and adding guava-r06.jar and guava-src-r06.zip to the project’s libraries.

Now the program runs and gives me a list of my spreadsheets:

Getting Spreadsheet entries…

Terminology liturgical days

spiegelserre

krantenwaaier

Lexfeed defects

Kindermissaal

klapluik

multigrade

Total Entries: 7

A few modifications of the example in “create a table” and some browsing through the client’s library javadoc.

// Create a new Spreadsheet service

SpreadsheetService myService = new SpreadsheetService(“My Application”);

myService.setUserCredentials(args[0],args[1]);

TableEntry tableEntry = new TableEntry();

FeedURLFactory factory = FeedURLFactory.getDefault();

URL tableFeedUrl = factory.getTableFeedUrl(“0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE”);

// Delete table if it already exists

TableFeed tableFeed = myService.getFeed(tableFeedUrl, TableFeed.class);

for (TableEntry table : tableFeed.getEntries()) {

String currTitle = table.getTitle().getPlainText();

if (currTitle.equals(“Terminology”)) {

table.delete();

break;

}

}

// Specify a basic table:

tableEntry.setTitle(new PlainTextConstruct(“Terminology”));

tableEntry.setWorksheet(new Worksheet(“Blad1″));

tableEntry.setHeader(new Header(1));

// Specify columns in the table, start row, number of rows.

Data tableData = new Data();

tableData.setNumberOfRows(200);

// Start row index cannot overlap with header row.

tableData.setStartIndex(2);

// This table has only one column.

tableData.addColumn(new Column(“A”, “en”));

tableData.addColumn(new Column(“B”, “fr”));

tableData.addColumn(new Column(“C”, “nl”));

tableEntry.setData(tableData);

myService.insert(tableFeedUrl, tableEntry);

To change the number of rows (if defined as ‘0’, there are no rows – obviously?), I have to remove the table if it already exists (see bold above).

When retrieving the table feed, I must supply a table number, which seems to increment when tables are deleted and added. So now I can retrieve the data from:

https://spreadsheets.google.com/feeds/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/records/1

But damn me! The records are still printed like “fr: kerstmis, nl: noel, en: christmas”…

A specific record can be retrieved using a structured query:

https://spreadsheets.google.com/feeds/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/records/1?sq=en%3Dchristmas

Cell based feed

First I need the worksheets feed

https://spreadsheets.google.com/feeds/worksheets/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/private/full

In the returned feed, this tag can be found:

<id>https://spreadsheets.google.com/feeds/worksheets/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/private/full/od6</id>

The ‘od6’ is the ID of the worksheet, to be used in the cell based feed URL:

https://spreadsheets.google.com/feeds/cells/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/od6/private/full

The return contains individual cells:

A1

dinsdag 21 september 2010 21:36

en

B1

dinsdag 21 september 2010 21:36

fr

C1

dinsdag 21 september 2010 21:36

nl

Or as xml:

<feed>

<entry>

<title type=‘text’>A1</title>

<content type=‘text’>en</content>

<gs:cell row=’1′ col=’1′ inputValue=‘en’>en</gs:cell>

</entry>

<entry>

<title type=‘text’>B1</title>

<content type=‘text’>fr</content>

<gs:cell row=’1′ col=’2′ inputValue=‘fr’>fr</gs:cell>

</entry>

<entry>

<title type=‘text’>C1</title>

<content type=‘text’>nl</content>

<gs:cell row=’1′ col=’3′ inputValue=‘nl’>nl</gs:cell>

</entry>

<entry>

<title type=‘text’>A2</title>

<content type=‘text’>christmas</content>

<gs:cell row=’2′ col=’1′ inputValue=‘christmas’>christmas</gs:cell>

</entry>

<entry>

<content type=‘text’>kerstmis</content>

<gs:cell row=’2′ col=’2′ inputValue=‘kerstmis’>kerstmis</gs:cell>

</entry>

<entry>

<title type=‘text’>C2</title>

<content type=‘text’>noel</content>

<gs:cell row=’2′ col=’3′ inputValue=‘noel’>noel</gs:cell>

</entry>

<entry>

<title type=‘text’>A3</title>

<content type=‘text’>easter</content>

<gs:cell row=’3′ col=’1′ inputValue=‘easter’>easter</gs:cell>

</entry>

<entry>

<title type=‘text’>B3</title>

<content type=‘text’>pasen</content>

<gs:cell row=’3′ col=’2′ inputValue=‘pasen’>pasen</gs:cell>

</entry>

<entry>

<title type=‘text’>C3</title>

<content type=‘text’>paques</content>

<gs:cell row=’3′ col=’3′ inputValue=‘paques’>paques</gs:cell>

</entry>

</feed>

This should be queriable by a clever XPath via YQL.

feed/entry[gs:col/@col = (<column where first row is $language>) and gs:row/@row = (row where first column is $text)]/content

/feed/entry[gs:col/@col = (/feed/entry[gs:cell/@row = ‘1’ and content = $language]/@col) and gs:row/@row = (/feed/entry[gs:cell/@col = ‘A’ and content=$text]/@row)]/content

But -alas!- there seems to be an issue. The XML table isn’t returning any data at all:

select * from xml where url=”https://spreadsheets.google.com/feeds/cells/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/od6/private/full”

Even after putting https in front of the yql URL… (I suspect it’s because the google feed URL is on https)…

> The explanation is easy: this feed requires authentication…

>> Wait a minute… after reading this, I just did ‘Publish as webpage’ and selected ‘Atom’ and ‘Cells’ as options, and got this URL:

https://spreadsheets.google.com/feeds/cells/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/od6/public/basic

This URL is public, so it works in YQL !!

select * from xml where url=”https://spreadsheets.google.com/feeds/cells/0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE/od6/public/basic”

I can also get the data in CSV format (been reading this and this)

https://spreadsheets.google.com/pub?key=0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE&hl=nl&single=true&gid=0&output=csv

And perform this query on the CSV open table,

select * from csv where url=”https://spreadsheets.google.com/pub?key=0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE&hl=nl&single=true&gid=0&output=csv”

which on it’s turn can be fed into the XML table….

- or I should be able to add to the query ‘and columns = “en,fr,nl” and en=”christmas”’, the column names list being whatever is appropriate for the queried Spreadsheet.

Or wait,

  1. publish the spreadsheet as CSV
  2. turn the CSV into XML using YQL CSV open tableREST URL
  3. query the XML using YQL XML table

The query:

select * from xml where itemPath = “/query/results/row/*[../*[name(.) = name(../../row[1]/*[. = 'en'])] = ‘easter‘ and name(.) = name(../../row[1]/*[. = 'nl'])]” and  url = “http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20csv%20where%20url%3D%22https%3A%2F%2Fspreadsheets.google.com%2Fpub%3Fkey%3D0Au659FdpCliwdG44Q2htMWJEQUxVQ3NfRlZUdlZaalE%26hl%3Dnl%26single%3Dtrue%26gid%3D0%26output%3Dcsv%22&diagnostics=false”

This is equivalent to:

select nl from key where en = ‘easter

Follow

Get every new post delivered to your Inbox.