Training in Tanzania

On the last Monday of April, I found myself nervously standing in a room of about 15 people from the e-Government Agency and the National Bureau of Statistics in Dar es Salaam. They were waiting for me to start training them in Python and CKAN. I’ve been programming in Python since 2011, but I had never actually trained people in it. On the first day, I didn’t have any slides. All I had was one [PDF][pdf] from Wikibooks, which I was using as material, and I didn’t even cover all of it. By the end of the day, though, I could sense that it was sinking in with the attendees a bit.

It all started with an email from my manager asking me if I was available to do a training in Tanzania in April. After lots of back and forth, we settled on a date and a trainer to assist with the trainings, and I flew in. Dar es Salaam, strangely, reminded me of growing up in Salalah. I got in a day early to prep for the week and settle in. The trainer looking groggy on a Monday does not bode well!

People who train often don’t tell you this: trainings are exhausting. You’re most likely to be on your feet all day, walking around the room helping people who are lagging behind. Looking back, the training was both fun and exhausting. I enjoyed talking about Python, though I feel like I need more practice to do it well. I was pretty satisfied with the outcome of the CKAN training: by the end of the week, the folks from the e-Gov Agency went in and set up a server with CKAN!

Note to self: Write these posts immediately after the trip before I forget 🙂

Migration Update – 1

About 2 weeks ago, I kicked off my “evil” plan to move as many things as possible off Google Apps. I’ve managed to move my Contacts, Calendar, and files off Google services so far.

I set up ownCloud for contacts, calendar, and files. It was incredibly painless to set up. I have the ownCloud, CalDAV-Sync, and CardDAV-Sync apps installed on my Android phone and they seem to work great. Good enough that the only thing I’m syncing from my Google account is email.

The ownCloud app was straightforward. I checked the option to instantly upload pictures. This allowed me to disable picture syncing with Google Photos.

The next app I tried was CardDAV-Sync. I tried the free one first. It didn’t actually sync anything to my server. Searching around a bit suggested that I might have to import the contacts to the server first. So, I backed up the contacts to a file and synced that to the ownCloud instance. When I clicked on the VCF file on the ownCloud server, it let me import the contacts from it immediately. The problem with Google syncing all my contacts is that there were 1000+ contacts that I had to clean up and purge to finally arrive at a far shorter list. I should delete more, but I haven’t had spare time to do that.

CalDAV was fairly easy: I exported the calendars, imported them into ownCloud, installed the app, and removed the Google calendars from being displayed.

Here’s the status so far on my roadmap:

  1. [Done] Sign up for fastmail.
  2. [Ongoing] Move all the Gtalk contacts to Jabber on fastmail.
  3. [Done] Set up ownCloud for docs, contacts, and calendar.
  4. [Done] Copy documents, contacts, and calendar entries to ownCloud.
  5. [Todo] Set up mutt to use with fastmail.
  6. [Todo] Archive emails from gmail.
  7. [Todo] Turn lights off at Google Apps account.
  8. [Todo] Set reply-to headers for gmail.com account to fastmail account.
  9. [Todo] Regular backup/archiving strategy for ownCloud.

I’m using a Google Spreadsheet to track my budget and this is where I anticipate trouble. I haven’t found an online tool that I can use as well as I have managed with this spreadsheet, which I’ve perfected over the last few years. If anyone has suggestions, please let me know.

Additionally, this is not cheaper than using Google for sure. I’m definitely paying more in terms of server space and backup space for this.

Git Tips You Probably Didn’t Know

I’ve been using `git` for quite a while now and some of its features continue to amaze me. Here are a few things I learned recently.

Finding only your changes to master.

When you’ve made a change against a master that’s moving often, you’ll find that simply running `git diff master` doesn’t give the right diff. It shows you the difference between your branch and the current master, which is not what you want in most cases: master has changed since you branched, so the command also shows those changes. GitHub does the right thing, and the right command in this case is the following:

git diff master... 
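The difference is easier to see on a scratch repository. The sketch below is only an illustration: the helper names, branch name `feature`, and file names are made up, and it assumes `git` is on the PATH. It builds a tiny repository where master moves on after branching, then compares plain `git diff master` with `git diff master...`, which behaves like diffing against the merge base of the two branches.

```python
import os
import shutil
import subprocess
import tempfile


def git(repo, *args):
    """Run a git command inside repo and return its stdout."""
    result = subprocess.run(
        ["git", "-c", "user.name=Demo", "-c", "user.email=demo@example.com", *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout


def commit_file(repo, name, content, message):
    """Create a file and commit it on the current branch."""
    with open(os.path.join(repo, name), "w") as handle:
        handle.write(content)
    git(repo, "add", name)
    git(repo, "commit", "-q", "-m", message)


def compare_diffs():
    """Return the file lists reported by the two-dot, three-dot and explicit diffs."""
    repo = tempfile.mkdtemp()
    try:
        git(repo, "init", "-q")
        # Make sure the first branch is called master, whatever the local default is.
        git(repo, "symbolic-ref", "HEAD", "refs/heads/master")
        commit_file(repo, "base.txt", "base\n", "initial commit")

        git(repo, "checkout", "-q", "-b", "feature")
        commit_file(repo, "mine.txt", "my change\n", "my change")

        # Meanwhile, master moves on.
        git(repo, "checkout", "-q", "master")
        commit_file(repo, "theirs.txt", "someone else\n", "master moves on")
        git(repo, "checkout", "-q", "feature")

        two_dot = git(repo, "diff", "--name-only", "master").split()
        three_dot = git(repo, "diff", "--name-only", "master...").split()

        # The three-dot form is shorthand for diffing against the merge base.
        merge_base = git(repo, "merge-base", "master", "HEAD").strip()
        explicit = git(repo, "diff", "--name-only", merge_base, "HEAD").split()
        return sorted(two_dot), sorted(three_dot), sorted(explicit)
    finally:
        shutil.rmtree(repo, ignore_errors=True)


if __name__ == "__main__":
    two_dot, three_dot, explicit = compare_diffs()
    print("git diff master    :", two_dot)
    print("git diff master... :", three_dot)
    print("merge-base diff    :", explicit)
```

In this setup the plain diff also lists the file that only changed on master, while the three-dot form lists only the file changed on the branch.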

Copying your changes in one branch into another

Recently, our designer Sam Smith started working on making CKAN more responsive. His branch was based off master and I wanted to make a new branch based off release-v2.2 with his changes on top of it. My instinct was to make a patch.

git diff master...responsive > ../responsive.patch 

Then, apply the patch onto a different branch. This would surely work, but I’m using a version control system! It should be smart about this. The good folks in #git pointed me in the right direction.

First, find the last commit that you don’t want to copy, that is, the parent of the oldest commit that exists only on the responsive branch.

git checkout responsive
git log master..responsive

The last commit in that log is the oldest commit you want to copy. In this case, it’s 7587c6e8fe49c809ef7357b6f88496bd06ac93b9, so now you want to run `git log 7587c6e8fe49c809ef7357b6f88496bd06ac93b9^`. The first commit in that output is the one you don’t want to keep.

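git checkout responsive
git checkout -b responsive-2.2
# Replay everything after the commit you don't want to keep (the parent, hence the ^) onto release-v2.2
git rebase --onto release-v2.2 7587c6e8fe49c809ef7357b6f88496bd06ac93b9^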

Thus, responsive-2.2 is a new branch with responsive changes based on top of release-v2.2!
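The same transplant can be tried safely on a throwaway repository first. The script below is a rough sketch, not the commands run against CKAN: the branch names `release`, `feature`, and `feature-release`, the file names, and the helper functions are all invented for the demo, and it assumes `git` is installed. Instead of reading the hash out of `git log`, it asks `git merge-base` for the commit where the feature branch left master and uses that as the "don't copy this" commit.

```python
import os
import shutil
import subprocess
import tempfile


def git(repo, *args):
    """Run a git command inside repo and return its stdout."""
    result = subprocess.run(
        ["git", "-c", "user.name=Demo", "-c", "user.email=demo@example.com", *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout


def commit_file(repo, name, message):
    """Create a small file and commit it on the current branch."""
    with open(os.path.join(repo, name), "w") as handle:
        handle.write(message + "\n")
    git(repo, "add", name)
    git(repo, "commit", "-q", "-m", message)


def transplant_branch():
    """Copy the feature commits onto the release branch with rebase --onto."""
    repo = tempfile.mkdtemp()
    try:
        git(repo, "init", "-q")
        git(repo, "symbolic-ref", "HEAD", "refs/heads/master")
        commit_file(repo, "base.txt", "initial commit")

        # A release branch that forked from master early on.
        git(repo, "checkout", "-q", "-b", "release")
        commit_file(repo, "release.txt", "release fix")

        # master moves on, and the feature branch starts from the newer master.
        git(repo, "checkout", "-q", "master")
        commit_file(repo, "master_only.txt", "master-only work")
        git(repo, "checkout", "-q", "-b", "feature")
        commit_file(repo, "f1.txt", "feature 1")
        commit_file(repo, "f2.txt", "feature 2")

        # New branch carrying the feature commits, to be moved onto release.
        git(repo, "checkout", "-q", "-b", "feature-release")
        last_unwanted = git(repo, "merge-base", "master", "feature-release").strip()
        git(repo, "rebase", "--onto", "release", last_unwanted)

        subjects = git(repo, "log", "--format=%s", "release..feature-release").splitlines()
        files = sorted(git(repo, "ls-tree", "-r", "--name-only", "HEAD").split())
        return subjects, files
    finally:
        shutil.rmtree(repo, ignore_errors=True)


if __name__ == "__main__":
    subjects, files = transplant_branch()
    print(subjects)
    print(files)
```

After the rebase, the new branch holds the release commit plus the two feature commits, and the master-only file is absent.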

The more you know

Better Problem Definition

I’m a core developer on CKAN, the most widely used data catalog software, at Open Knowledge. Early this year, we released version 2.2 of CKAN with a complete overhaul of the filestore. Amusingly, right after that, we started getting more and more complaints on the ckan-dev list about data loss from the old filestore. One of the people reporting it helped narrow it down to a particular file called persisted_state.json.

This file is created by a library called ofs. Every time a new file is added to the filestore, OFS does the following:

  • Read the persisted_state.json file.
  • Convert the JSON to a Python dict.
  • Add an element to this dict with the metadata of the new file.
  • Convert the dict back to JSON.
  • Write this new JSON to the persisted_state.json file.

This caused concurrency problems when files were added to the filestore at high frequency, and eventually led to data loss. Oh joy.
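To make the failure mode concrete, here is a minimal sketch of that read-modify-write cycle. It is not the actual ofs code: the helper names and the metadata fields are made up for illustration, and the two "writers" are interleaved by hand rather than run as real threads, so the lost update happens every time.

```python
import json
import os
import tempfile


def read_state(path):
    """Read the state file and convert the JSON to a dict."""
    with open(path) as handle:
        return json.load(handle)


def write_state(path, state):
    """Convert the dict back to JSON and overwrite the state file."""
    with open(path, "w") as handle:
        json.dump(state, handle)


def demo_lost_update():
    """Interleave two writers so that one update silently disappears."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "persisted_state.json")
        write_state(path, {})

        # Both writers read the file before either of them writes it back.
        state_a = read_state(path)
        state_b = read_state(path)

        # Writer A adds its file and saves.
        state_a["a.csv"] = {"size": 1}  # hypothetical metadata
        write_state(path, state_a)

        # Writer B never saw A's entry, so saving its copy overwrites it.
        state_b["b.csv"] = {"size": 2}  # hypothetical metadata
        write_state(path, state_b)

        return read_state(path)


if __name__ == "__main__":
    print(demo_lost_update())
```

The final state only contains writer B's entry. Writer A's file may still be on disk, but the index no longer knows about it, which is what "data loss" looks like from the user's side.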

Technically, this wasn’t a bug in CKAN’s codebase. We had already solved the core problem at this point by switching to a new filestore which did not use ofs. We couldn’t abandon our users, though, and I volunteered to find a fix. I read through the ofs code and thought of solving the problem there. After an hour or two of reading up on concurrency and the Python documentation, I still didn’t have a working solution. Eventually, I asked myself what I was looking to solve.

My original problem: “OFS is not thread-safe, causing data loss”. I then realized that’s not what I wanted to solve. A better problem to solve was: “OFS is not thread-safe, causing data loss. Our users need their data.” So, I wrote a script that would regenerate the persisted_state.json file with just enough metadata to start working. It wasn’t a complete fix, but it was a productive fix. The script was “dramatically” called ofs-hero.
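The real ofs-hero script isn’t reproduced here, but the idea can be sketched roughly as follows. Everything specific in it is an assumption: the function name `rebuild_state`, the use of relative paths as keys, and the single `size` field are invented, since the actual layout of persisted_state.json isn’t shown in this post. The sketch just walks the storage directory and writes a fresh index of whatever files it finds.

```python
import json
import os


def rebuild_state(storage_dir, state_name="persisted_state.json"):
    """Rebuild a minimal state file from whatever is on disk.

    Hypothetical sketch: the keys and metadata fields are invented and do
    not claim to match the format that ofs actually uses.
    """
    state = {}
    for root, _dirs, files in os.walk(storage_dir):
        for name in files:
            if name == state_name:
                continue  # never index the state file itself
            full_path = os.path.join(root, name)
            key = os.path.relpath(full_path, storage_dir)
            # "Just enough" metadata; an invented field for illustration.
            state[key] = {"size": os.path.getsize(full_path)}

    # Write to a temporary file and rename it into place, so a crash
    # mid-write cannot leave a half-written state file behind.
    state_path = os.path.join(storage_dir, state_name)
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(state, handle, indent=2, sort_keys=True)
    os.replace(tmp_path, state_path)
    return state
```

Regenerating the index from the files themselves sidesteps the thread-safety question entirely: it doesn’t stop the index from being clobbered, but it gets users their data back.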

Lesson Learnt: Defining the problem properly helps you solve it better.

Pycon India 2013

Finally, I made it to a Pycon India! The last 2 years, I’ve been pulling a sankarshan. I walked in just in time for Kenneth Reitz’s keynote. Kenneth talked about writing an API, drawing on his experience of writing and maintaining requests. There was a good deal of information about coding practices, managing contributors, and avoiding burnout. Key things I remember: documentation, documentation (yes, I’m repeating it, because it’s important!), clean APIs, extensibility, and learning to say no (without being a dick).

A conference is not just about the content; it is also a great opportunity to catch up with people I’ve known online and don’t meet that often. For instance, I met haseeb at the registration counter. Later, I met Runa and Sankarshan. I don’t think I’ve met Runa since conf.kde.in in 2011! And I hadn’t met Sankarshan at all. Other usual suspects include Kushal Das, who I’m glad to report did not have any sort of untoward accident (and I hope I didn’t jinx it), Souvik, Sneha, Noufal, Devi, Anand, Ramki, Vivek, and so many more people that I can’t even remember all their names!

As is usual, I spent more time in hallway conversations than in actual sessions. After the keynote, I sat in on two sessions. The first one was Applications of Python in Robotics by Lentin Joseph. He actually had a robot on the table while he was presenting, which got me hooked on the talk. I don’t have a lot of experience with hardware, so all the information was a little dry for me. Lentin did a demo at a conference AND IT WORKED! Well, sort of. The robot tracked the yellow ball and moved (albeit slowly, because the tablecloth didn’t offer a lot of traction).

The other interesting session was Let’s talk testing with Selenium by Anisha. I’ve known about the Selenium project ever since I started contributing to Mozilla, though I haven’t actually used it. Anisha’s session was information-packed and made it look easy as well. In the coming weeks, I’m going to take a look at it to see how I can use it at the day job.

Before I left Pycon, I was convinced to join the PSSI (Python Software Society of India), though I couldn’t stay for the AGM.

Ah, I forgot to mention that hanging out with friends in town for Pycon started on Friday evening. I met, among others, Sengupta and Harshad of Instamojo (they’re hiring btw, if you’re into Python, get in touch with them!), Jaidev, Parth, Bala, and Nivedita. We met at Egg Factory, where we chatted and made twss jokes.

Overall, Pycon 2013 was a great conference and I look forward to many more years of attending it 🙂

Pulling a Sankarshan (verb): The act of purchasing a conference ticket and not attending the conference. Tickets may be refunded.