summaryrefslogtreecommitdiff
path: root/src/SW
diff options
context:
space:
mode:
authordakkar <dakkar@thenautilus.net>2013-04-12 18:46:08 +0100
committerdakkar <dakkar@thenautilus.net>2013-04-12 18:46:08 +0100
commit0f0ff3d16ee9ebfaac26e8e899ae160ffe4cb855 (patch)
tree05712753fe1e4a5733e903311a1c5b562b17a6f8 /src/SW
parentfix shebang (diff)
downloadthenautilus-0f0ff3d16ee9ebfaac26e8e899ae160ffe4cb855.tar.gz
thenautilus-0f0ff3d16ee9ebfaac26e8e899ae160ffe4cb855.tar.bz2
thenautilus-0f0ff3d16ee9ebfaac26e8e899ae160ffe4cb855.zip
oyster docs
Diffstat (limited to 'src/SW')
-rw-r--r--src/SW/oyster/.gitignore2
-rw-r--r--src/SW/oyster/document.en.rest.txt107
l---------src/SW/oyster/du2html.xsl1
3 files changed, 110 insertions, 0 deletions
diff --git a/src/SW/oyster/.gitignore b/src/SW/oyster/.gitignore
new file mode 100644
index 0000000..08d33d4
--- /dev/null
+++ b/src/SW/oyster/.gitignore
@@ -0,0 +1,2 @@
+/document.en.du.xml
+/document.it.du.xml
diff --git a/src/SW/oyster/document.en.rest.txt b/src/SW/oyster/document.en.rest.txt
new file mode 100644
index 0000000..99698ec
--- /dev/null
+++ b/src/SW/oyster/document.en.rest.txt
@@ -0,0 +1,107 @@
+===========================
+ Logging your Oyster trips
+===========================
+:CreationDate: 2013-04-12 16:53:02
+:Id: SW/oyster
+:tags: - perl
+ - software
+
+Since I moved to London at the start of 2010, I've been using a Oyster
+card to pay for my journeys on buses and Underground trains. I've
+bought annual travelcards, because they free me from having to think
+about "topping up", and also because they seemed to be cost-effective
+with respect to the "pay as you go" option.
+
+Then, about a year ago, I noticed that I could get a list of my recent
+trips from the TfL website. It was thus necessary that I wrote a
+program to save that data on a machine under my control. So I did.
+
+Storing the journeys
+--------------------
+
+Let's look at the `source for the |oyster| program`_.
+
+It uses a SQLite database to store both the actual journey data and
+the credentials for the website account, so I don't have to write them
+on the command line, and I don't need a separate configuration
+file. The database, by default, is called ``oyster.db`` and resides in
+the same directory as the script (line 16). Command-line options
+(lines 18-31) allow to specify a different database, and to set the
+credentials on the first run (lines 76-83).
+
+The check for the presence of the database (lines 39-51) is not
+particularly smart: try to read data, and if it fails, assume the db
+is new. The database schema (lines 54-75) is also very
+straightforward: a table for the credentials, and a table for the
+journeys. The columns for the latter come from the CSV that TfL
+provides.
+
+To get to the CSV, the script needs to pretend to be a person browsing
+the website, so we use ``WWW::Mechanize`` and log into the site (lines
+85-103). After reaching the "Journey History" page (line 105), we
+should click a link that runs some JavaScript, which is hard to do
+without an actual browser. Instead, we extract the URL from the
+JavaScript that would be run (lines 107-113), and download it directly
+(lines 116-118).
+
+Now we only have to parse the CSV: we use ``Text::CSV_XS``, and
+``DateTime::Format::Strptime`` for the times (lines 120-129). Looping
+over all the lines, we convert the times into ``DateTime`` object,
+adjusting for a few minor issues: bus journeys don't have end times,
+since you don't "tap out" of buses (line 139); also, if you tap in
+before midnight, but tap out after, the CSV still shows a single date,
+but the end time will appear to be about 23 hours before the start
+time (line 144-146).
+
+Inserting into the database is trivial (lines 149-158). Since we set a
+``unique on conflict replace`` constraint on the ``journeys`` table,
+we don't have to worry about inserting twice the same trip: SQLite
+will take care of it.
+
+Using the data
+--------------
+
+As interesting as it is to just look back at where I've been, I found
+a better use for all those database records: I wanted to check how
+much I would have spent if I had not been using the annual travelcard.
+
+Calculating that is not exactly easy: the rules are convoluted, not
+obvious to find, and probably not completely documented. Also, I may
+have interpreted them wrong. Anyway, I wrote `another program`_ to
+read the database and give me the total cost.
+
+Of course it uses the same defaults and command line options as the
+other one (lines 130-148).
+
+The main execution flow loops over all the journeys (lines 174-200),
+grouping them by day; for each day, we store the journey and some
+aggregate information: the minimum and maximum zones travelled, if
+there were any tube journeys, and if there were any journeys during
+peak time. Parsing the "journey description" is annoying but simple
+(lines 185-192); extracting the day of a journey is trickier (lines
+15-24), since TfL considers a day to start at 4:30am. Similarly, a
+journey is during peak time (lines 26-35) if it starts between 4:30am
+and 9:30am. I could find no mention of an evening peak time on TfL's
+website. Mapping a station to its zone should be easy, but TfL does
+not provide a nice text file with this information, so I had to write
+it by hand (lines 37-118) taking the data from Wikipedia; the program
+currently does not account for those stations that, being on the
+border between two zones, count differently depending on which side
+you're arriving from.
+
+After that first loop, we compute the cost of each day (lines
+257-287): we add the cost of each journey in the day, using the bus
+fare or the tube fare (peak or non-peak) for the zones travelled
+(fares on lines 202-254 taken for the website). If the sum for a day
+is above the daily cap for the set of journeys (only bus, or non-peak
+for the zone range travelled in the day, or peak &c), we cap
+it. Adding all the daily costs gives us the total.
+
+I'll have to get a couple more months of data, since I moved from zone
+3 to zone 5 last June, but it looks like that, for the journeys I
+make, the annual travelcard is about 15% more expensive than the "pay
+as you go" option. Either that, or I got my calculations wrong…
+
+.. |oyster| replace:: ``oyster``
+.. _`source for the |oyster| program`: http://www.thenautilus.net/cgit/oyster/tree/oyster
+.. _`another program`: http://www.thenautilus.net/cgit/oyster/tree/calc-price
diff --git a/src/SW/oyster/du2html.xsl b/src/SW/oyster/du2html.xsl
new file mode 120000
index 0000000..e2487e0
--- /dev/null
+++ b/src/SW/oyster/du2html.xsl
@@ -0,0 +1 @@
+../../../templates/du2html.xsl \ No newline at end of file