My Life Is A House

Monday, September 23, 2013

Final report - Google Tasks API support to libgdata/Evolution - there's trees behind forest

So here comes the final summary post about this year's project. This year I took a break from working on GStreamer-related things like Jokosher or libwaveform. It started quite innocently - I really wanted to add Google Tasks support to Evolution. Yeah, I know, many now assume that the NSA reads all our stuff (be serious - it doesn't), but really, my task list for all things open source and free software, or the fact that I have to buy a kitchen sink, is not top secret information. Still, it would be nice to add tasks support for, let's say, ownCloud in the future too. Another huge reason was that I wanted to improve my C knowledge a lot, and for this reason alone these were four months well spent. Thanks to my mentor Philip I also learned a thing or two about writing much cleaner code; thanks to his regular pokes when reviewing commits, some of these rules got into my blood. It's worth remembering that every language, and even every project in the same language, can have a different coding style.

A first look at libgdata gave a false sense of assurance, but in fact I stumbled on a lot of issues I had never thought of before. First we had to choose how to implement general JSON support in libgdata. There are a lot of places which would require a lot of code refactoring if we did it in a fully intrusive style. Instead, we chose to "sneak in" support, sometimes using clever hacks in places where the functionality of the code was the same and adding something new would mean duplicating a lot of stuff. When my last patches are committed to master I'll probably write an in-depth summary of some of these hacks, in case someone wants to join in improving JSON support in the future. I must admit that once the first confusion wore off, the libgdata code seemed very nicely written, and it was quite easy to test my code once I got my tests right. One of the more annoying problems I had was with Google authorization: we looked for ways to get an authorizer created, and it kept coming back with errors. Google Tasks and other newer APIs quite insist on using OAuth 2.0. In the end I used GDataGoaAuthorizer in the test code (which involves manually placing the client id in the code), but it's not a real solution and I will try to get it working with OAuth 1.0 so we can get Google Tasks support tested automatically.

As Google moves to the next generation of protocols, which use JSON, I hope my work will make it easier to add support for them in libgdata. I have my branch published at GitHub here and you can follow GNOME Bugzilla bug #657539, which has all my patches attached.

To be honest, I hadn't researched the Evolution part of my project to the full extent before I started hacking on eds and "friends", and as a result I was a little bit afraid of the complexity when I discovered that adding GT support would be more challenging than I thought. It took a lot of time for me to fully understand how things work in Evolution and EDS, but slowly I pulled the picture together. As I mentioned in my previous blog post, all Google ESource objects reside in modules/google-backend. I added a tasklist to the EGoogleBackend collection object and defined a new backend, 'gtasks', in which the implementation complexity hides. Evolution relies on this backend implementing several virtual functions. Implementing them properly should end with Evolution automatically recognizing the ESource and reading objects from it to show in the task list.
The current state of my Evolution and EDS fixes is not fully ready for "screenshot prime time", however I'm positive I will finish implementing the required changes by the end of September. I finally got my jhbuild working and have been fixing bugs like crazy for the last few days. You can see the final (for GSoC anyway) state of my evolution-data-server branch here.

Friday, September 20, 2013

Report #4 - Google Tasks API support to libgdata/Evolution - hacking Evolution Data Server

As the Google Summer of Code 2013 finish line is closing in, I'm completing the evolution-data-server part of my project and cleaning up the second part of the libgdata changes. The Evolution part of the project turned out more challenging than I expected, as it has many "moving parts" which I have to take into account.

First of all, evolution-data-server has a central daemon, but it also has a factory registry and factories for addressbook and calendar. In Evolution all communication between elements is done via D-Bus. To add another source of information to this factory registry we should use ESource. All related ESources can be stored together in an ECollectionBackend subclass. In my case all Google sources can be found in EGoogleBackend (which in turn is hidden in the modules/google-backend directory), which I have already extended to support tasklists. However, that's not the whole story. Essentially, ESource is just a shell. You add extensions to it to provide different functionality. Each ESource also has a backend, which does all the heavy lifting - actual data retrieval and synchronization with the remote source.

As in many systems, in Evolution tasks are seen as part of the calendaring system. As a result all calendaring backends include support (if possible) for calendars and tasks at once (notes and journals can also be added) and can be found in calendar/backends. This time though I'm making a brand new backend just for Google Tasks, as it has a completely different way of accessing Google APIs and retrieving results. I don't have to worry that a calendar may try to use this backend, because in EGoogleBackend you can specify in code that only tasklists should use 'gtasks'.

Currently I'm trying to get 'gtasks' working with OAuth2 authorization. For this I follow how the Contacts 'google' backend is implemented. I would like to use the GOA extension for the source (which EDS has), but as that means changing the whole Google source collection backend, I will have to stick with OAuth2 for now.

Additionally, to get the backend working in read-only mode, I have to implement the open, get_object, get_object_list and start_view virtual functions. Their functionality is rather simple and I have a bunch of them already implemented. As a starting platform I used the 'caldav' backend, but this caused some issues with fully understanding how everything works (thanks to my mentor Milan for clearing up most things). Synchronization between local storage and the actual server is a little bit more complicated, but it's still doable.
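
To give a feel for what "read-only mode" asks of a backend, here is a toy model in Python. It is not the EDS C API: only the four method names come from the text above, while the signatures, the ReadOnlyTaskBackend and InMemoryTaskBackend classes and the dictionary-based tasks are all made up for illustration.

```python
from abc import ABC, abstractmethod


class ReadOnlyTaskBackend(ABC):
    """Hypothetical interface; method names mirror the virtual functions above."""

    @abstractmethod
    def open(self):
        """Prepare the backend, e.g. fill a local cache."""

    @abstractmethod
    def get_object(self, uid):
        """Return one task by its identifier."""

    @abstractmethod
    def get_object_list(self, predicate):
        """Return all tasks for which predicate(task) is true."""

    @abstractmethod
    def start_view(self, callback):
        """Push every matching task to a live view via callback."""


class InMemoryTaskBackend(ReadOnlyTaskBackend):
    """Stand-in for a real backend; 'remote' is just a list of dicts."""

    def __init__(self, remote_tasks):
        self._remote = remote_tasks
        self._cache = {}

    def open(self):
        # A real backend would talk to the server here.
        self._cache = {task["uid"]: task for task in self._remote}

    def get_object(self, uid):
        return self._cache[uid]

    def get_object_list(self, predicate):
        return [task for task in self._cache.values() if predicate(task)]

    def start_view(self, callback):
        for task in self._cache.values():
            callback(task)
```

Once these four calls answer sensibly, a client has everything it needs to list and display tasks; writing back and synchronization are separate work.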

During the weekend I will write one or two final entries for GSoC with details on how everything works together so far. Then I hope to hand in my branches to my mentors for review and commit. After GSoC all work will continue on the GNOME git server.

Sunday, September 01, 2013

Report #3 - Google Tasks API support to libgdata/Evolution - compiling, testing and believing

Again after a longer pause here's a status update about Google Tasks support in libgdata/Evolution. All of August is behind us, and there has been some good news and some surprises along the way, so it's worth blogging about.

First, in a crazy travel stunt which included traveling by car non-stop for 18 hours there and back with only one driver, my friend Rudolfs Mazurs (and our significant others), in scorching 37C heat in a car without air conditioning, we went to the first GUADEC of my life, in Brno. As I had to miss the first two days because of a theatre performance I had on Friday, we arrived on Saturday evening. After regrouping at the dormitories and arriving late at the conference venue, the Faculty of Information Technology at the Brno University of Technology, we met the GStreamer crowd - Transmageddon hacker and unofficial leader of sorts Christian Schaller, my mentor from last year and current de facto maintainer Sebastian Dröge, and other GStreamer core guys, with the addition of Zeeshan Ali of Boxes and Maps fame - and went hunting for food. After driving 18 hours without sleep it was a nice ending to the day, with chats going around about beautifully geeky stuff like Doctor Who.

The last day consisted of a lot of interesting talks. I missed Sunday's keynote, but got the essence of it later from the buzz and from watching the recording back at home. I attended good presentations such as Stef Walter's provocative and excellent "More secure with less "security"", and Jeff Fortin's nicely delivered "Pitivi and GES, towards 1.x", which included the most precise video clip ever explaining the migration from GStreamer 0.10 to 1.0 and the bug hunting involved (yeah, it was *that* bad :)).

In the middle of all this I also attended the Evolution hackfest and met and chatted with my mentors, libgdata maintainer Philip Withnall and Evolution superhacker Milan Crha. Unfortunately my visit was cut short by a sudden change of plans: wanting to avoid driving back in deadly heat during the daytime, we decided to leave on Monday evening, not Tuesday as planned. Overall GUADEC was a blast, and I really enjoyed it even for such a short visit.

I started to work on libgdata testing during GUADEC, and as usual ran into several interesting things. First of all, when Google recommends something for their APIs and products, you had better use it, because trying other kinda "supported" things will yield inconsistent results. In this case, for the new JSON-based APIs Google "recommends" using OAuth2. As a result, after one week of trial and error I settled on using GNOME Online Accounts as the way to build an Authorizer (using a GDataGoaAuthorizer subclass), which I need to build a working Service object (in this case GDataTasksService) so I can check whether things are really working as they should. That included modifying the GOA code to include the Google Tasks authorization scope in the get_scope method of GoaGoogleProvider, and building it with separate application keys using the --with-google-client-id and --with-google-client-secret configure flags (I needed them because you have to enable Tasks support for your application in the Google API Console, and that's not done yet for the GNOME application, which is managed by Debarshi 'Rishi' Ray). As a result I had my own GOA daemon, which I ran instead of the system-installed one. I also replaced several symlinks to the libgoa and libgoabackend libraries, as I installed GOA in /opt. Now I was ready to run the tests.

Getting my own code to actual testing was kinda fun, because I learned a lot. While I'm not totally new to C and GLib, there are still a lot of things for me to learn. It was the most rewarding part too, because seeing it actually work after a long session of debugging and compiling is a very uplifting moment.

While I was testing the GDataTasks code, Philip started to merge my core libgdata improvements (along with his own additions) into a special branch and, after the 0.14 release, into master. My plan for next week is to complete libgdata Google Tasks support by adding insert/delete/update methods to GDataTasksService for both tasks and tasklists. Before that, though, I will rebase my work on GitHub onto the newest version of master, as it already includes all the JSON support I need.

On Wednesday I will blog about the EDS (aka Evolution Data Server) changes, and what else I will need to get Google Tasks support actually working in Evolution. My GitHub branch for libgdata can be found here and for evolution-data-server here.

Sunday, July 28, 2013

Report #2 - Google Tasks API support to libgdata/Evolution - how to make things stick together

As promised, here comes the second blog post with more details on how I am implementing JSON support. It is still ongoing, as I improve the code while adding Google Tasks support itself. As I already described, libgdata is the GLib wrapper/glue library for the Google GData protocol family, which uses XML. As Google is slowly phasing it out, all the newest APIs use JSON as the communication format. To complicate things a little bit, they do not use unified query parameters (at least not in a documented way), and they have changed the names of these parameters. They also got rid of several key attributes, which presumably were phased out because of a lack of real applications for them. But first of all, let's talk about how parsing of a normal Google API response is done in libgdata.

How to parse JSON data for libgdata

As we know, Google has opened many services to external use through APIs. As a library, libgdata covers most of those that can be used in a desktop environment. For each service the library has its Service, Query, Feed and multiple Entry classes (for more, look at this overview page in the libgdata documentation). Service gets all the action, as it handles connection and querying. Query has all the information about, well, the query itself (and a little bit more, see below). And the Feed and Entry classes (and their subclasses) are where the actual parsing and storing of information takes place.

Requests themselves are done by __gdata_service_query (which is called by gdata_service_query, which in turn is called from any query function in Service), using the libsoup library. My current change when receiving a response is to look at the headers: when receiving application/atom+xml, follow the current code path, and when receiving application/json, parse the content using the support I implemented. Parsing of any content happens using virtual functions, which are chained together. For example, a Tasklist object received from Google has several members which are common to all other objects, and a few which are unique to this object type. First, the feed will be parsed (as a concept it doesn't appear in the Tasks API, but a list acts the same way, although its attribute set is minimal). Then, when parsing the list of items/entries from this feed, libgdata will first turn to the GDataTaskList parse_json virtual function, which will read the members it recognizes and pass any others on to its parent class, which in this case is GDataEntry. Feed classes are similarly chained, although it seems I won't need one for Tasks support. In the end both Feed and Entry pass the torch to their parent class Parsable, which collects any members not handled further down the chain and calls g_debug with a message about the unknown attribute. Finally, when everything is parsed (response into a Feed, objects into Entries), the object is returned to the query function and after that to the library user.
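
To make the chaining more concrete, here is a minimal sketch in Python. It is not the libgdata C code: the class names only echo Parsable, GDataEntry and the tasklist class, the parse_response helper is hypothetical, and which member is handled by which class is an assumption made purely for illustration.

```python
import json


class Parsable:
    """Root of the chain: whatever reaches this point is unknown."""

    def __init__(self):
        self.unknown = []  # names of members nobody recognised

    def parse_json(self, name, value):
        # Stand-in for the g_debug() message about an unknown attribute.
        self.unknown.append(name)
        return True


class Entry(Parsable):
    """Members assumed to be shared by every entry type."""

    def __init__(self):
        super().__init__()
        self.id = None
        self.title = None

    def parse_json(self, name, value):
        if name == "id":
            self.id = value
        elif name == "title":
            self.title = value
        else:
            return super().parse_json(name, value)  # pass to parent
        return True


class TaskList(Entry):
    """Handles its own members first, then chains up."""

    def __init__(self):
        super().__init__()
        self.self_link = None

    def parse_json(self, name, value):
        if name == "selfLink":
            self.self_link = value
            return True
        return super().parse_json(name, value)


def parse_response(content_type, body):
    """Pick a code path from the Content-Type header."""
    if content_type.startswith("application/atom+xml"):
        raise NotImplementedError("existing XML path, not sketched here")
    if not content_type.startswith("application/json"):
        raise ValueError("unexpected content type: " + content_type)
    entries = []
    for item in json.loads(body).get("items", []):
        entry = TaskList()
        for name, value in item.items():
            entry.parse_json(name, value)
        entries.append(entry)
    return entries
```

Each class only needs to know its own members; anything it doesn't recognise travels up the chain until the root records it as unknown.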

Query or not to query

While the Query class itself is rather simple (it is just a collection of query parameters as properties and a set/get function for each), it has an important advanced feature - pagination support. For that, Query stores the 'next' and 'previous' tags from the XML feed. For the new JSON protocols things got a little bit complicated, as the feed returns only a 'nextPageToken' member. Overall, similar to other classes, the Query class is chained through get_query_uri with its parent, which stores general search parameters, and together they build a query URI which is then passed to __gdata_service_query for retrieving the actual data. For now 'previous' support for JSON protocols will be broken, but in the future it could be implemented by storing a history of the previous pages of a query. I also had to disable chaining for Tasks support because, as I said at the beginning, URI parameters are named quite differently. Instead, I just get the value of the property from the parent class GDataQuery and build the URI parameter at the GDataTasksQuery level.
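
The "history of previous pages" idea could look roughly like this in Python. This is only a sketch of the idea, not libgdata code: the PagedQuery class and the example.invalid address are made up, and I'm assuming the Tasks-style parameter names maxResults and pageToken.

```python
from urllib.parse import urlencode


class PagedQuery:
    """Toy query object that remembers which pages it has visited."""

    def __init__(self, base_uri, max_results=None):
        self.base_uri = base_uri
        self.max_results = max_results
        self.page_token = None  # token of the page about to be fetched
        self._history = []      # tokens of pages already visited

    def get_query_uri(self):
        params = {}
        if self.max_results is not None:
            params["maxResults"] = self.max_results
        if self.page_token is not None:
            params["pageToken"] = self.page_token
        if not params:
            return self.base_uri
        return self.base_uri + "?" + urlencode(params)

    def next_page(self, next_page_token):
        """Move forward using nextPageToken from the last response."""
        self._history.append(self.page_token)
        self.page_token = next_page_token

    def previous_page(self):
        """Step back one page; returns False when already on the first."""
        if not self._history:
            return False
        self.page_token = self._history.pop()
        return True
```

The first page has no token at all, so the history starts with None and stepping back from page two simply drops the pageToken parameter again.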

Making it work together

As I wrote previously, it all comes together when the user makes a specific query, for example requesting all task lists available to the authorized user (in this case gdata_tasks_query_all_tasklists in service/gdata-tasks-service.c). Not much has changed in this logic: we pass all parameters to gdata_service_query, receive the output from libsoup and hope that the library will figure out which code path to take after detecting the content type. This and the other query functions are the ones mostly used by the library user - Evolution in my case.

Saturday, July 27, 2013

Late at the party aka first report of Google Tasks support to libgdata/Evolution

This is my first report on my project for this summer, understandably but terribly late - this spring was my last at my dear uni and therefore my schedule got a little bit screwed up. However, now I'm back on track and crunching code as the midterm review closes in.
This time I'm adding Google Tasks support to libgdata and also Evolution. As an avid Google Tasks user myself I always wanted to try my hand at this, and therefore jumped at the first opportunity to do it as this year's GSoC project. My mentors are superhackers Philip Withnall and Milan Crha of Evolution fame. I use a lot of their work every day (Evolution with GMail integration for the last two years, Evolution as my email client for 8 years I think), and it is a little bit strange to delve into code which I have depended on for years. For those who don't know, libgdata is the library which lets you get at all things from Google services using their APIs on the GNOME desktop (although libgdata is very platform-neutral, with a small dependency set). It is written in C using GObject, and currently uses libxml to parse XML responses from Google.
I started by understanding what the Google Tasks API looks like and how it will fit into the libgdata universe. The biggest change that comes with implementing it is the response format change from XML to JSON. There are tons of JSON libraries for C available, however only one works well within the GLib and GObject universe - JSON-GLib. Judging by its version number it doesn't feel finished, however it is universally available and installed by default in all major distributions. It also has what looks like complete documentation.
The main work for the first two weeks was to figure out how to properly implement JSON support without breaking or changing the current API much. I will have to define a lot of new virtual functions for base classes, therefore ABI will be broken anyway. The Google Tasks JSON API is very simple compared to the rest of the newest API family, so while adding general JSON support to libgdata I also have to check other APIs (Calendar, for example), so they will fit in better when added later.
Currently I have started to read and hack on Evolution code - more concretely evolution-data-server - so I can have at least a visual demo for the midterm review and my 4 minutes of fame at GUADEC :) I will write a somewhat more detailed blog post about my changes to the libgdata code base before submitting my midterm evaluation. My current work on libgdata can be found on my GitHub account: https://github.com/Pecisk/libgdata

Monday, November 05, 2012

State of libwaveform after GSoC

Hi everyone, this is a long-awaited status update on libwaveform, a library which aims to provide all the tools necessary to collect, store and draw waveform data. After dealing with my studies I have started to hack again. First, more about where I was at the end of the summer - I had the general reading part, aka WaveformReader, ready, using the GStreamer level element, and also a generic waveform drawing widget, WaveformWidget, using the gathered data. However, drawing had no zoom support at all. So the first thing I wanted to improve and work on post-summer was to provide a way to deal with zooming.

First of all I started to think about the various ways zoom level can be measured. I went in the wrong direction several times, but in the end I settled on the same old 'pixels per second'. So the WaveformWidget object has variables which store the current, default and max/min zoom levels in these units, and methods which increase and decrease the zoom level - currently by simply multiplying and dividing by two - which can be called from callbacks, for example on button clicks (see demo/demo.py to test this out). Now that I have a changeable zoom level, I must decide how to draw the waveform in a given situation. This is what I have in mind for now.
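
As a rough sketch of that zoom state (plain Python, not the actual WaveformWidget code; the ZoomState class and the default, min and max values are placeholders I picked for illustration):

```python
class ZoomState:
    """Zoom level measured in pixels per second."""

    def __init__(self, default_pps=100.0, min_pps=25.0, max_pps=400.0):
        if not min_pps <= default_pps <= max_pps:
            raise ValueError("default zoom must lie between min and max")
        self.default_pps = default_pps
        self.min_pps = min_pps
        self.max_pps = max_pps
        self.current_pps = default_pps

    def zoom_in(self):
        # Double the zoom level, but never go past the maximum.
        self.current_pps = min(self.current_pps * 2, self.max_pps)
        return self.current_pps

    def zoom_out(self):
        # Halve the zoom level, but never go below the minimum.
        self.current_pps = max(self.current_pps / 2, self.min_pps)
        return self.current_pps
```

A button's click callback would then just call zoom_in() or zoom_out() and queue a redraw.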

For example, if I have a waveform data reading interval 0.1 second long (or 100 000 000 ns), and I have a zoom level of 10 pps (pixels per second), that means that a wave will be 1 pixel wide. That is not enough if I have a (usual) waveform where the minimum width of a wave side is 2 pixels, so I will request a reading interval 0.2 seconds long from WaveformData (the next reading interval upwards, which we calculate by adding two 0.1 readings together), and we will use a step of 2 pixels (because 1 sec / 0.2 sec per reading is 5 readings, and 10 pixels / 5 readings is 2 pixels per reading).
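
The same arithmetic as a small Python function. This is a sketch under my own assumptions: the name plan_drawing is hypothetical, and I assume coarser intervals are whole multiples of the base interval (made by merging neighbouring readings), with the smallest sufficient multiple being picked.

```python
NS_PER_SECOND = 1_000_000_000


def plan_drawing(base_interval_ns, pixels_per_second, min_width_px):
    """Return (interval_ns, step_px) wide enough for one wave side."""
    if base_interval_ns <= 0 or pixels_per_second <= 0:
        raise ValueError("interval and zoom level must be positive")
    factor = 1
    while True:
        # Merge 'factor' neighbouring readings into one.
        interval_ns = base_interval_ns * factor
        step_px = pixels_per_second * interval_ns / NS_PER_SECOND
        if step_px >= min_width_px:
            return interval_ns, step_px
        factor += 1


# The example from the text: 0.1 s readings, 10 pps, 2 px minimum width.
print(plan_drawing(100_000_000, 10, 2))
```

For the numbers in the text this picks the 0.2 second interval with a 2 pixel step; at higher zoom levels the base interval is already wide enough and is returned unchanged.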

This works nicely if the waveform's max and min widths are in sync with zoom levels/steps. However, I'm still struggling with the scenario where you get a pixel width of 2.5, 3.13, etc. at some odd zoom level (for example 71 pixels/sec). There are two options - either I enforce particular zoom levels, calculating the "next best thing" from the numbers given by the user (similar to a size request in Gtk+), or I just warn users about not having pixel-precise waveform alignment. For now I will work on the second one.
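
Both options can be sketched in a few lines of Python. Again this is an illustration, not library code: snap_zoom and check_alignment are hypothetical names, and "next best thing" is interpreted here as rounding the step to the nearest whole pixel.

```python
import warnings

NS_PER_SECOND = 1_000_000_000


def step_width_px(pixels_per_second, interval_ns):
    """Width in pixels of one reading at the given zoom level."""
    return pixels_per_second * interval_ns / NS_PER_SECOND


def snap_zoom(pixels_per_second, interval_ns):
    """Option one: nearest zoom level with a whole-pixel step (min 1 px)."""
    step = max(1, round(step_width_px(pixels_per_second, interval_ns)))
    return step * NS_PER_SECOND / interval_ns


def check_alignment(pixels_per_second, interval_ns):
    """Option two: keep the zoom level, warn if steps are fractional."""
    step = step_width_px(pixels_per_second, interval_ns)
    aligned = float(step).is_integer()
    if not aligned:
        warnings.warn("step of %.2f px is not pixel precise" % step)
    return aligned
```

With 0.1 second readings, 71 pixels/sec gives a 7.1 pixel step, so the first option would quietly move the zoom to 70 pixels/sec while the second keeps 71 and warns.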

The next big target is to fix the GStreamer level element to support sub-buffer intervals, otherwise for a wav file, instead of a 100 000 000 ns interval I get 120 000 000 ns, and so on. Currently the code has a condition test for the given interval size, but it doesn't work and the actual interval gets rounded to a whole number of buffers.
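
A toy model of that rounding effect in Python (not the level element's actual code; the helper name is hypothetical and the 40 000 000 ns buffer duration is an assumed value, chosen only because it reproduces the 120 000 000 ns figure):

```python
def effective_interval_ns(requested_ns, buffer_duration_ns):
    """Interval you actually get if results only appear at buffer ends."""
    if requested_ns <= 0 or buffer_duration_ns <= 0:
        raise ValueError("durations must be positive")
    # Ceiling division: number of whole buffers needed to cover the request.
    buffers = -(-requested_ns // buffer_duration_ns)
    return buffers * buffer_duration_ns


# Assumed 40 ms buffers: a 100 ms request becomes three buffers long.
print(effective_interval_ns(100_000_000, 40_000_000))
```

Only when the requested interval is an exact multiple of the buffer duration does it come back unchanged, which is why sub-buffer support matters.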

For feedback, my first question is - what do you think about the zoom model I offer here and how I plan to draw the waveform according to it? What could the potential pitfalls be? (I have thought of several corner cases, but a fresh view wouldn't hurt.) I would also like to know the common values you use for default, min and max zoom levels.

How to get libwaveform - the current development branch is 'postsummer' on https://github.com/Pecisk/libwaveform. It's buildable on Fedora 18 without any big dependencies (you need the usual suspects from Gtk+/GLib land, also gobject-introspection-devel if I'm not mistaken); just use --prefix=/usr with ./autogen.sh so it gets installed correctly on your system. For other systems it is probably best to build and install it with jhbuild. To check out recent functionality see demo/demo.py.

Tuesday, August 21, 2012

Libwaveform final Google Summer Of Code report

Finally it's the last day of this year's Google Summer of Code - time to calm down and look back at what I have done and what's still on my todo list. As I plan to carry on developing this library, the end of the summer project is merely a milestone, albeit a very important one.

Before this summer I had been working with Gtk+ and GObject/GLib on various projects (mainly on Jokosher, which I'm still trying to port to Gst 1.0), however never as seriously as in this project (and never using C). It took some time to get into the full swing of how things are done in the C/GLib/GObject combo. As usual, C means taking nothing for granted, and successful compiling doesn't mean the code really works as it should. Seriously improving my knowledge of C, GObject, Gtk+ and how to debug all this is my major gain from this project.

My project's aim was/is to create a usable set of elements which together provide a widget that draws a waveform. Those elements are WaveformReader, WaveformData and WaveformDrawing (as I have named them in my code for now). Reader handles reading the sound levels from files (also decoding them accordingly, all using GStreamer 1.0), Data provides the data model and a method to add information to it (so far), and Drawing is a generic widget (derived from GtkDrawingArea) which goes through the data in the data model and draws it using the cairo library.

I have done most of the stuff I initially planned for Reader, however new ideas and issues popped up. I have one method to read levels, which takes the source URI, interval, start and finish time as parameters, and it works as expected. However, while testing this I got unexpected behaviour from the 'level' element - as I found out, it can't parse data for an interval smaller than a buffer, so even if you set an interval different from the default, you can't expect to get results for precisely that interval size. This was too hard for me to fix during the project, but I plan to do so with help from my mentor Sebastian. I also need to extend the 'level' element with a sort of 'lowest peak' value for the buffer, as it helps to draw nicer waveforms. This is also on my todo list for post-summer coding.
For error handling in the library I chose GError. For Reader I added several errors and have tested them through Python exception handling - you can see an example in the demo/demo.py script.

I had a lot of ideas and theories about how the data model should look, however I settled on a quite simple one for now (see PDF here, I will add an improved version later). The most difficult part is how to treat the more detailed data needed for zooming in. I'm planning to return to improving the data model when I get a fully working stack within a theoretical application (like Pitivi or Jokosher). I have started a TODO file, which naturally and slowly is turning into a design document. Currently Data has an add method which detects intervals and times and puts readings in the right places. For storing readings for each channel I use a boxed structure (which is quite buggy, I admit, and will get cleanup treatment post-summer) named WaveformLevelReading. In the GI annotations the GList of readings is never transferred fully, as it's quite a chunk of data and copying it wouldn't be very smart resource-wise. In the current state of things Reader creates a GList of readings, and when it is added to the data model it practically becomes a part of the data model, without copying.

For Drawing - the widget part - I first got stuck at creating the widget itself as an object in Python. For some strange reason a manually written annotation with (transfer full) for the constructor method failed and the object wasn't transferred, running into a segfault wall the next second when I tried to access it. Thankfully, just removing the annotation worked. It was strange, as it was copy-pasted from another object where it worked. After that I tried to follow instructions from various tutorials - ways to create custom GtkWidgets have changed considerably over the years, so it would probably be wise to create a general intro tutorial for doing it in C/Vala/Python - and finally succeeded in creating an object which didn't segfault and which I could add to a generic GtkWindow. After that I started to dig into Cairo stuff, which resulted in the very simple waveform I have now. I tried to replicate the Jokosher waveform style with the bottom part filled in, but haven't succeeded so far; I will have to look at that later. It's clear that the widget part will need the most work: style support, customization, and above all allowing subclasses so people can draw overlays. Post-summer my first job will be making zooming work, and also implementing async methods for updating waveform data and drawing.

As usual, the biggest issues were theoretical ones - understanding how some of the more complex stuff works required more time than I had planned. However, acknowledging that it's better to dig through them now than later, I did several rewrites and studied a lot of GStreamer code (thanks to Sebastian for giving references and reviews with comments), as I will need to provide patches to Gst or understand how to write async methods. I also have to admit that I was overoptimistic about the project's timetable - as usual, studies made things difficult at the beginning of the summer.

Overall, it was a very refreshing experience (although I didn't finish a lot of things I wanted to - especially implementing the zoom logic) and I'm happy that I got some foundations (and beefed-up knowledge) in place so I can continue to hack on this library. In a more general sense it was definitely worth it because of the increased knowledge, which will help me grow as a coder and enable me to help other projects like GStreamer and Gtk+ with code and patches in the future.

The project can be found at github.com/Pecisk/libwaveform; see the 'postsummer' branch for further changes for now (until I have submitted the code to Google). I will post more updates after a week's break which I need for my studies. Stay tuned.