Friday, July 4, 2014

Real-Time Limit Order Book

In order to begin looking into market microstructure I implemented a Limit Order Book. The (very rough) implementation has 2 modes of operation:

1) It continuously reads from a stream of buy/sell order events (new order, modify order, cancel order) to produce a diabolical ascii chart of order book depth along with time and sales. (shown in the below .gif)

2) It can re-construct an order book from archived order book events. In both cases, a log file is kept from which 225 indicators can be extracted into a csv file. The intention of this project was simply to derive this .csv file for use elsewhere, hence the mess. I intend it to be a starting point for something better.

I do not have access to Level 2 quotes for a traditional ECN (typically this is very expensive due to the sheer volume of data and the potential advantage it can provide). So this implementation is based on data from Bitcoin exchanges, specifically streaming data via a websocket from Bitstamp. New (planned) implementation will abstract this away to a higher level protocol, FIX protocol?.

Divided into 2 sections, the left hand side of the book shows the depth of BUY limit orders (bids), while the right hand side shows the depth of SELL (asks) limit orders. At the top of bid side of the book on the left is the current best bid, while the top of the ask side of the book is the current best ask. Each row is a price level, decreasing/increasing for the bid/ask side respectively. Each level consists of 5 columns: The percentage difference from the best bid/ask (the depth), the cumulative sum of volume/liquidity up until and including the price level, the number of orders enqueued at this price level, the amount of volume/liquidity available at this individual price level, and finally, the actual price of this this level. For example, on the bid side, an entry of: 0.42%, 157.40063272, 1, 59.68000000, 656.26, indicates that this price level is 0.42% less than the current best bid, 157.4 units of volume would need to be sold before this level is reached, there is 1 order at this level (to buy 59.68 units) @ 656.26 dollars.

Trades are shown on the right hand side, in the third column. The purpose of this implementation is to reconstruct/infer trades given a stream of bid/ask limit order events. These trades are the result of this inference: showing the trade direction (BUY = a buy market order hit an ask limit order) (SELL = a sell market order hit a bid limit order). The implementation can sometimes match makers to takers: A maker is an order to buy/sell placed at a price limit _in_ the order book - i.e., the trader is providing liquidity, whereas a taker is an order to buy/sell that will not be placed in the order book (it will consume maker orders until it is filled) - the trader is said to be removing liquidity. As such they pay the price: the market spread + the Volume Weighted Average Price for the market impact.

As mentioned, the code is _very_ rough. However for anyone interested (it is useful for monitoring the bitstamp order book in real time), the code + a detailed description of the derived indicators is available here

Tuesday, June 3, 2014

2-Opt Algorithm for TSP Art

While working on an optimisation problem involving very large scale instances of the Travelling Salesman Problem, I thought it would be an interesting expermient to generate a tour over a set of points obtained from image dithering.

It turned out that this had been done before by Craig S. Kaplan and Robert Bosch, who produced excellent results using a combination of Voronoi Stippling and the Concorde TSP Solver. Nonetheless, having started the project; thought it would be fun to continue.

I first implemented the 2-Opt local search heuristic which is used in the Fast Local Search component of the Guided Local Search algorithm. The resulting optimised code (available at the end of this post if anyone is interested) produced the following "2-Optimal" tour over the 9,882 cities in Greece in ~2.4 seconds. A tour is 2-optimal if no 2 edges can be swapped to yield an improvement on the overall tour length.

To generate a set of points from an image for use as input to the 2-opt algorithm, I implemented the Floyd-Steinberg dithering algorithm which given as input an image, outputs a dithered image. For example, using the famous Lenna test image as input, the following dithered image is produced:

Next, using the filled pixels from some select images (shown below) as points, I applied the 2-Opt heuristic to produce some tours.

Some results are shown below along with a time-lapse video (25fps) showing the optimisation process. I'm using 2 Mucha artworks as input here simply because I thought they would produce interesting results.

The first example (Poetry, 1898) was chosen for it's contrast, which the algorithm reproduces nicely.

The second example (Le Pater, 1899) was chosen for it's shadow - I thought it may produce a Chiaroscuro like effect. However, the initial dithering results in very dense regions of points, which makes Floyd-Steinberg better suited to the higher contrast example in this case. Note that Bosch and Kaplan's results are superior to mine as they use Voronoi Stippling which results in a reduced and better distribution of points, which is good from both a visual and computational point of view.

This (horribly encoded) video, shows the 2-opt algorithm optimising a tour over the points in the first Mucha image. Initially the tour is random. The algorithm then progressively improves the tour using 2-opt moves, swapping edges until the total distance, if one travelled from point to point, visiting each point exactly once, can no longer be minimised.


Fast Local Search Java implementation:
Fast Local Search Lisp implementation:
Dithering implementation (Lisp):
TSP Art code (Lisp + Bash):

Sunday, December 1, 2013

Trading intensity on bitcoin exchanges

Like the FX markets (although obviously with only an insignificant tiny fraction of traded volume), bitcoin is traded 24/7. I was interested to determine a "trading day"/ open-close for 3 exchanges: mtgox (i am guessing mainly US based traders), bitstamp (europe), and finally, btc-china.

First, I obtained time-and-sales/trade (timestamp/price/volume) data for these 3 exchanges. Second, I split the trade data into 1 hour bins for the past -5 days from today (2013-12-01), resulting in 24*5 bins containing the total volume traded for each hour for each of the 3 exchanges in question. Next, as shown on the left in the below graph, I plotted the total volume along with a correlogram showing the autocorrelation of the hour by hour volume for the btc-china, mtgox and bitstamp exchanges (in that order) from top to bottom.

Immediately, it is obvious that the trading intensity varies depending on the time of day (as expected), and this intensity follows some kind of pattern or signal, the presence of which is confirmed in the left hand side correlograms for each of the exchanges.

The data was then split into 24*1 hour bins for each of the 3 exchanges, such that for each exchange a 24 hour cross section of volume data is created, where each 1-hour bin contains the average of the total traded volume for that hour for the past -500 days. For example, the first bin (12am - 1am) contains the average of all volume traded between 12am and 1am for the past -500 days. The result of this aggregation is shown in the right hand side of the above graph, colour coded by exchange. Here each series has been smoothed with a spline function, showing the relationship between the trading intensities of each of the 3 exchanges. As expected, the cross section of the traded volume over the 24 hour period corresponds roughly to the geographical location of exchange clientele. Yielding the following "trading days":

- mtgox: 10am-6pm UTC (5am-1pm central standard time)
- bitstamp: 4am-4pm UTC (5am-5pm central european time)
- btc-china: 9pm-2pm UTC (5am-10pm china standard time)

The shorter trading day for mtgox could be because the assumption that most of the clientele are US based is wrong, and that the data includes volume from elsewhere, including europe, which may explain the intersection of the peaks for mtgox and bitstamp between 4-6pm. An interesting observation from the btc-china cross-section is the existence of an intra-day peak at around midday localtime, and then an afternoon dip followed by a peak of activity at around 10pm local time indicative of 2 trading sessions: morning and evening.

Next, I was interested to see if in addition to an intra-day, hour-by-hour cyclical pattern, there existed an inter-day (day-by-day) pattern of trading intensity. To visualise this, I "expanded" the 3 cross-sections into 7*24*1 hour bins, such that the first bin contains the average of exchange traded volume on sundays 12am-1am. Resulting in the following graph:

In this graph (apologies for the lack of axes: x = hour of week, y = volume), it is possible to see the lagged relationship between the 3 exchanges, for example, the trough in bitstamp trading activity preceeds the mtgox troughs (european trading starts first). In addition, it is relatively clear that on all 3 exchanges, trading activity declines at the weekend (left and right (sunday,saturday,...,saturday,sunday) tails of the graph) and seems to peak on wednesdays (in the centre of the graph).

The trading session of bitstamp (4am-4pm) best suites me (Side note: I would consider mtgox, but after examination of that particular exchange it seems highly amateur, and the fact that it was written apparently in PHP? is highly suspicious.) As such, here is a closer look at the volume cross section of bitstamp:

In the above graph, which shows the weekly cross section of volume for the past -500 days (24*7 1 hour bins) for the bitstamp exchange, the mid-week peak trading activity is more obvious. Also, more interesting is the slight peak of activity (in comparison to the previous day) on friday's at around noon - this could be due to traders closing out weekly positions? Of course closer examination is needed to answer that question.. Collapsing the weekly cross section into 24 1 hour bins for weekday activity and another 24 hour cross section for weekend activity, yields the following graphs:

Here, the left graph shows the cross section of hourly averages excluding weekend data, whereas the right-hand-side graph is restricted to weekend data. Both graphs depict volume traded only on the bitstamp exchange. It is interesting to note the differences between the 2 cross sections. Firstly, the weekday data is more pronounced, and arguably more characteristic of some kind of "wave", whereas the weekend data, in addition to exhibiting less traded volume on average, is flatter and peaks slightly later during the day. (noted that of course the weekend data contains less data points..., so it is not really a "fair" comparison)

Random thoughts

Of course these graphs simply serve as visulisations of the trading data, although for the purposes of determining approximate trading sessions and quickly visualising the time lag relationships they suffice. It would be interesting to have a look to see if a cross section of intra-day volatility (quantified as standard deviation of period returns) would result in similar characteristics and if the well known "Volatility smile" is present as on conventional markets.

Although expected, the fact that there even exists a pattern in the intra-day traded volume is interesting in it's own right. The ACF/correlogram graphs show the presence of a signal which is most obvious in the final graph depicting the weekday 1 hour average volume cross section data for the bitstamp exchange.

The signal resembles a harmonic, which I think is probably the Circadian rhythm - In other words, since human activity is synchronised with the harmonic of the rising and setting sun ..with a few exceptions ;) , the signal reveals the presence of human activity on the exchanges.

If the presence of this harmonic is interpreted as the tell-tale-sign of human activity (interestingly the same harmonic is present in the cross section of traffic congestion in an earlier, non-related, post on on Athens traffic), then it would be interesting to see if it's presence decays over time in proportion to the presence of increased algorithmic trading, as I predict will soon rapidly account for the majority of bitcoin trading activity.

Since humans are likely to be acting on some kind of knowledge ("informed trading"), Perhaps the presence of this signal could be quantified into an additional "Order Flow Toxicity" indicator.

Friday, February 8, 2013

fun with gnuplot

gnuplot> plot 'tmp' using 3 with boxes linecolor rgb "#000333" fill solid, 'tmp' using 2 with boxes linecolor rgb "#336699", 'tmp' using 1 with points lt 1 linecolor rgb "#99ccff", 'ts.dat' with lines lw 3 linecolor rgb "white"

gnuplot> plot 'tmp' using 3 with boxes fill solid lt 3, 'tmp' using 2 with boxes fill solid lt 2, 'tmp' using 1 with boxes fill solid lt 1, 'ts.dat' with lines lw 3 linecolor rgb "white"

Sunday, January 20, 2013 v2 - City Public Transport Explorer

This is the second version of, brings significant improvements over the first version including updated routing data, incorporation of the original taxi app, estimated live departures, timetables and spider diagrams (all transit options to/from a specific location).

By far the most useful feature of, missing from traditional trip planners, is the Route Explorer function. Trip planners do a great job of getting you from A to B, however there are situations where data should be presented in a different way. In particular, the main "problem" of public transport in any city is that riders may feel they are not in control, and the transit system, especially in the eyes of a visitor, may seem to be a mystery. In other words, the rider should be offered a similar level of control as if they were driving a car, free from the uncertainty of scheduled services and the ability to orientate themselves by means of exploration.

The Route Explorer tries to address this need by offering the rider the ability to explore the public transport system even without prior-knowledge of the city or known destination. All routes can be visualised (plotted on the road and not just a straight line between stops) and explored, enabling the rider to become orientated and knowledgeable of their options as opposed to ridged, time dependent itineraries.

Spider diagrams

Sometimes you may not be interested in planning a trip from A to B, you may wish to simply see your options. Inspired by the bus spider diagrams at London underground stations, this tool helps you visualise all of your options from any given point on the map. Ultimately, this means you can never get lost.

The idea is that you can quickly see all of your options with 1 click. The tool scans all options within a 500m radius of the specified point. Routes are colour coded and weighted according to frequency. Higher frequency routes are colour coded in Black, average frequency routes in Blue and low frequency routes in red. In fact, all routes on are ordered by frequency with 1/7 of routes colour coded black, 2/7 of routes blue, and 4/7 routes in red. 14% + 28% + 58% = 100%. This has the effect that the more important routes stand out, and helps to cluster the information in a meaningful way. In the example above, it is immediately clear that if you were interested in getting to the centre of Athens, you should take the 040 bus or number 1 trolley. The lower frequency routes tend to be local services, as shown above, and can be ignored if you are interested in traveling to another district.

The spider diagram tool also works in reverse, showing all transit options to a specific location. This can be very useful, if for example, you are looking for a convenient meet-up point with friends or wish to show customers directions to your place of business. The spider diagram is also interactive: A list of routes are displayed along with the diagram on the right hand side of the map. Routes are grouped by nearby bus stops or stations, and moving the mouse over any route will cause it to be highlighted on the map - this is very useful for more complicated diagrams.

Stop information

Clicking on a stop from the map or route details will show all outgoing services from that stop on the map along with a queue of next departures.

The example above shows the services from bus stop ΡΗΓΙΛΛΗΣ in the Kolonaki district of Athens. The diagram shown on the map is similar to the transit option spider diagram, except that it shows services from a stop rather than from a location on the map. The purpose of the diagram is to help you quickly understand spatially where you can go from a particular stop; I believe this is much more useful than a simple list of services. In this example, it is immediately obvious that the 815 goes to Tavros, although it is colour coded red (as opposed to more frequent Black), so you are likely to be waiting quite some time.

Monday, January 14, 2013

How far from Omonia / Athens?

OpenTripPlanner have released OpenTripPlanner Analyst to analyse city transportation network GTFS data. The tool provides a visualisation of public transport travel times from a specific location. The following map shows how far you can travel from the centre of Athens by bus and metro in 30,60,90,120 minutes. Plan to include this as an overlay on at some point.

Friday, May 18, 2012

real time stocks with gnuplot

Just for fun -- snatch real time price quotes from NASDAQ, plot the results, create audible alert on new trend or if low/high of day is breached: Here, I'm watching $VELT

$ cat >

echo "NASDAQ"
echo "======"
lynx --source |egrep "_[A-Z].*span" |sed 's/.*id="//g; s/\">/ /g; s/\" style.* / /g; s/<\/span>.*$//g' |egrep -v "_NO|_Market" |sed 's/_TradeTimeStamp /Time Stamp\t=\t/; s/_LastSale /Last\t\t=\t/; s/_NetChange /Change\t\t=\t/; s/_PctChange /%\t\t=\t/; s/_Volume /Volume\t\t=\t/; s/_PreviousClose /PreC\t\t=\t/; s/_TodaysHigh /HOD\t\t=\t/; s/_TodaysLow /LOD\t\t=\t/' >velt.tmp
cat velt.tmp
last="`tail -1 velt.log |awk '{print $3}'`"
echo `cat velt.tmp |grep "Time" |awk '{print $4 " " $5}'` `cat velt.tmp |egrep "Last" |awk '{print $3}'` >>velt.log

echo "(`cat velt.log |grep "." |tail -20 |awk '{print $3"+"}' |tr -d '\n' |sed 's/\+$//'`)/20" |bc -l >ma20.tmp
echo "(`cat velt.log |grep "." |tail -50 |awk '{print $3"+"}' |tr -d '\n' |sed 's/\+$//'`)/50" |bc -l >ma50.tmp

ma20=`cat ma20.tmp`
ma20=`printf "%.2f" $ma20`
ma50=`cat ma50.tmp`
ma50=`printf "%.2f" $ma50`

hod=`cat velt.tmp |grep HOD |awk '{print $3}'`
lod=`cat velt.tmp |grep LOD |awk '{print $3}'`
result=$(awk -vn1="$last" -vn2="$hod" 'BEGIN{print (n1>n2)?1:0 }')

if [ $result -eq 1 ]; then
        echo "exceeded high of day" |festival --tts 2>/dev/null &
        result=$(awk -vn1="$last" -vn2="$lod" 'BEGIN{print (n1>n2)?1:0 }')
        if [ $result -eq 0 ]; then
               echo "low of day breached" |festival --tts 2>/dev/null &

if [ $ma20 != $ma50 ];then
result=$(awk -vn1="$ma20" -vn2="$ma50" 'BEGIN{print (n1>n2)?1:0 }')

if [ $result -eq 1 ]; then
        if [ -e velt.trend ] ; then
                trend=`cat velt.trend`
                if [ $trend -eq 0 ] ; then
                        echo "positive trend. " $ma20 " over " $ma50 |festival --tts 2>/dev/null &
                        echo "1" >velt.trend
                echo "positive trend. " $ma20 " over " $ma50 |festival --tts 2>/dev/null &
                echo "1" >velt.trend
        if [ -e velt.trend ] ; then
                trend=`cat velt.trend`
                if [ $trend -eq 1 ] ; then
                        echo "negative trend. " $ma20 " under " $ma50 |festival --tts 2>/dev/null &
                        echo "0" >velt.trend
                echo "negative trend. " $ma20 " under " $ma50 |festival --tts 2>/dev/null &
                echo "0" >velt.trend

Fire this up and run every 5 seconds:
$ watch -n5 ./

Watch with gnuplot:
$ echo "pause 1; replot; reread;" >loop_forever.gnu
$ gnuplot -e set label ''; set key off; set obj 1 rectangle behind from screen 0,0 to screen 1,1; set obj 1 fillstyle solid 1.0 fillcolor rgbcolor 'black'; plot 'velt.log' using 3 with lines, 'velt.log' using 3 with lines smooth bezier lw 3;load 'loop_forever.gnu' -p

Monday, January 16, 2012

Athens 24 Hour time lapsed transit video

Using the data from I've made this 24 hour time-lapsed visualization of the public transit system in Athens from Saturday 4am to Sunday 4am. The blue dots represent the metro vehicles, bus = red, trolley = yellow, tram = orange and proastiakos = green.

I made it by taking a screenshot every 5 seconds in order to check the performance of a new feature I am working on for my app and it turned out to look pretty interesting, so I used some encoding tools to create a video:

sudo ../../ ;xset b off ;xset s off ;xset -dpms ;for((i=0;;i++)) ;do import -window root $i.png ;sleep 5 ;done
for((i=9;i<=15149;i++)) ;do convert $i.png -resize 84% -crop 1280x720+6+75 $i.png ;done
for((i=9,j=1;i<=15149;i++,j++)) ;do cp -v $i.png img`printf %05d $j`.png ;done
ffmpeg -r 55 -i img%05d.png -i song.mp3 -acodec copy -s hd720 -vcodec libx264 -bf 0 -crf 16 -threads 0 -b 5000 athens.mp4

Check it out on youtube here in high-definition :)

Saturday, January 7, 2012

Athens public transport planner

I've finally created a Beta version of my Athens public transport trip planner, schedule and routing app.

The app supports the following main features:
  • Browse the Athens public transport system by route - A service list under the "Network" tab is automatically populated depending on the intersection of routes in the bounding box of the map view. (Shown above) - Services are sized according to trip frequency.
  • Estimated live departure times from each stop in Athens (shown in the popup window above) - This is useful if you want to minimise the time waiting at bus stops.
  • Browse the stops of each route including the first and last departures depending on the current day (Metro line 2 shown above).
  • Quickly plan a route by right clicking on a source and destination position on the map (shown above). Alternative routes become highlighted on-mouse-over. More advanced planning features are also available, such as the ability to specify departure/arrival times and exclude specific modes of transport.
I started the app in 2009 when I moved to Athens (I've recently left). I found it difficult to make any sense of the public transport system. With the exception of the Metro, the rest of the transport means seemed a mystery to me. The only available information at the time were lists of stops and streets from the OASA website. Furthermore, there were even inconsistencies between time table data in Greek and English. The only semi-useful information seemed to be some outdated pdf route maps probably created some time around the Athens olympics.

At the time, I thought it would be a nice idea to create a trip planning app using data which I planned to scrape and create myself. This seemed pretty straight forward at the time and I estimated I could complete the task by devoting a few weekends to it over a period of 4-6 months. However, given that I'm posting this now (January 2012) its pretty obvious that I underestimated the complexity of the job. I ended up working on this app for about 3 years, spread over the odd weekend here and there whenever I had nothing to do, which is quite rare if you live in Athens :)

I began by focusing on harvesting the data I would need for the app. This involved designing a database schema to hold routes, time tables and stop information. I populated this schema by defining the Metro, ISAP and Proastiakos stops and routes by hand - creating line-strings and stop locations. I then populated the Proastiakos time table data by hand (literally) by inputting time table data I found in the stations. The metro and ISAP data was a little more interesting, since the only available data was frequency based. So given frequencies, first and last departures, I created some throw away code to calculate complete schedules for all stations by simulating the minimal number of vehicles needed to maintain the published frequencies given the length of each route.

The next step was daunting, and having spent quite some time to generate just the rail data, I knew that generating data for the entire Athens bus network would be the most time consuming part. Nonetheless, I continued with the hope that I could find a nice way to automate and speed up the data generation process. I started by snatching and parsing route descriptions from OASA which described each route by a textual ordered list of stops and corresponding list of streets. Given this raw data, the next task was to somehow generate a set of line-strings and stop locations for every single route.

To do this, I created a kind of bus route creation tool (shown above) using google maps api (at the time v2) directions and geocoding services. The tool worked by geocoding the list of roads for each route, and then with the help of google's directions service, extract the path for the set of geocoded roads. Then, for each automatically generated path I estimated the bus stop locations by evenly distributing the number of stops for each route over each path. The next phase involved checking each estimated route and manually redefining stops and line-strings where the automated generation had failed.

Eventually I ended up with the prototype shown above - this simply allowed viewing of individual routes and stops. At this point I dropped the project for some time to work on other things. When I returned I was happy to learn that OASA, via the Greek government open geo-data initiative had released time table data in GTFS format. This really helped me, since I had not yet started on creating time table data for buses as I had done earlier with the train network. This, in my opinion, is a great move since it allows developers like me to concentrate more on the creation of an app rather than harvesting and structuring scrapped data. This video I re-posted some time ago on this blog pretty much sums up the benefits.

I also believe the initiative to publish official feeds makes more sense than spending millions(?) on projects such as Attica Routing Portal and the more recent Opti-Trans project, since once these contracted projects are complete, they are likely to lay rotting due to the fact that funds are required to maintain them and eventually just serve as relics based on depreciated routing data.

Although a good move, the GTFS feed itself is not at the moment in good shape. I've been told that OASA are working to fix it, but in the mean time if you are interested in using this data there are a few things to note: (1) The feed is invalid - E.g., it does not conform to the GTFS spec. (2) Quite a few of the stops are in the wrong order. (3) The schedule data for intermediate stops is very wrong for some routes, showing buses traveling greater than the speed of light!

Thursday, January 5, 2012

Genetic Algorithm for Image Evolution

Some time ago I came across this, this and this - an interesting idea to reproduce an image given a minimal set of polygons, utilising evolutionary search. The original idea used a hill climbing strategy to randomly mutate a collection of polygons, keeping a mutation only if the change yielded an improvement, defined by the sum of pixel by pixel differences between the original image and the collection of polygons in the new image.

I was curious if the method could be improved by using a genetic algorithm (using a population of candidate solutions instead of just 1) and ended up with this . Using this image as input, the objective is to evolve a new image constrained by a maximum number of polygons. Here is the result:

The algorithm initialises a population of candidate images where each image contains 1 randomly coloured and positioned polygon. At each evolutionary step, each individual (candidate image) in the population is evaluated according to a fitness function which determines the closeness of the candidate image from the original and then assigned a fitness score. Then, individuals from the population are selected using fitness proportionate selection such that individuals with greater fitness (closer proximity to original image) have a greater chance of mating. Selected individuals then produce offspring using a genetic crossover technique and are then subject to mutation. Crossover involved copying polygons from each parent to form new offspring, while mutation involved random changes to polygon structure and colour as well as the possibility to add or remove a polygon from the newly created offspring.

The main difficulty with using a GA for this problem is the choice of crossover function. I tried a number of schemes and found them to be destructive - Ultimately mutation led the search for image improvements, however this simply results in random search. The second difficulty is the quality of the fitness function: I tried the sum of differences between the original image and the candidate image over each RGB value, histogram comparison and a combination of both. Ultimately, I could not really improve on random search so am thinking of new ways to encode the problem so that the search space is better defined.

Friday, October 14, 2011

Athens: The City Before

A neighboring balcony from a newly constructed modern apartment building partially demolishes a neoclassical building in Athens. 

Based on a sequence of frames extracted from a 1980s film "Here is Athens…..the city before" (Nikos Grammatikopoulos), documenting the destruction of neoclassical architecture in Athens.

Friday, May 6, 2011

Taxi calculation reloaded

Since September last year my taxi calculator app has been used by 44 thousand people looking to estimate the cost of their taxi journey in Athens. After a lot of feedback, last weekend I decided to re-implement it and introduce some new features.

The first version of the app had many limitations. It was only intended to work in Athens, it did not include "Tariff 2" in any calculations (double tariff for part of trip outside of city limits), it did not include the new flat rate 35-50 euro fare to and from Athens airport and Athens centre (specifically inside the Daktylios Athinon), and most importantly, did not take into account the fare incurred when the taxi is stationary (in traffic or lights).

Finally, it turned out that the google directions API I was using to calculate the possible taxi route did not work for the center/daktylios ring of Athens. This is because the centre of Athens operates a traffic restriction policy based on the day of week and parity of number plate, and due to its complication, google or teleatlas have marked the area as pedestrian only. As a consequence, google directions do not work at all for Athens centre and are sub optimal for the Athens greater area because of the diversion from the pedestrian only zone.

The new version now has the following features:
  • Works for most of Greece, including Greek islands (fares differ on islands due to different VAT/FPA)
  • Tariff 2 fare now included for trips originating from Athens or Thessaloniki if the trip exceeds city limits
  • Includes fixed tariff for trips to and from Athens centre and Athens airport
  • 24 hour Forecast of traffic conditions in Athens or Thessaloniki and impact on taxi fare
  • New routing/directions service based on OpenStreetMap data
  • Toll calculation for all of Greece
  • JSON API for taxi cost calculation, toll calculation and dynamic traveling times (contact me if you would like to use it in your own app)
The original app was 100% client side (Javascript) - this is great for scalability, since the client does all of the work, however for the new version I wanted to pull most of the calculation server-side - specifically because the new features require a number of geometry calculations and quite a lot of geospatial data.

In my day to day work I primarily use Java and for my own projects I tend to dump the bloat of xml and overweight frameworks for something more productive, powerful and less suffocating - Specifically a functional language - I've been playing around with Lisp, Erlang quite a lot (at a novice level) and starting to appreciate Javascript as a very powerful language and considered using nodeJs for the application re-write. However, I came across Play Framework which is a full stack web app framework for Java inspired by Ruby on rails. A little skeptical at first, I thought I'd give it a spin, and quickly found it to be a productive and high quality framework.

Eventually I ended up with this stack:
The MapQuest open directions API is a very good open routing service exposed via JSON/XML and makes use of data from the open street map project. The documentation and support is top-quality enabling me to make use of this web service in my app in next to no time.

The most significant addition to the taxi app is the inclusion of a traffic model for Athens and Thessaloniki. As a separate side project I have been working on a forecasting/simulation system for traffic flow which will eventually make use of an implementation I started a few years ago of "NEAT" (Neuro Evolution of Augmenting Topologies) as a forecasting technique.

For now, the model is very crude and is based on the assumption that the deviation from the average optimal driving speed (average driving speed with no-traffic) is very strongly correlated with trip duration. To model this in the taxi app, I use a matrix of weights which represent the deviation from optimal driving speed for an averaged 24 hour period in Athens.

During the hours between midnight and ~5am, the traveling conditions are "optimal", meaning average driving speed will be >= the average of speed limits inside the city - this has a weight of "1" (no deviation). During the first peak/rush hour, the weight is 1.55, or according to my assumption, total (moving) traveling time will be 1.55 times more (on average during weekdays) and total (stationary) time/traffic lights/gridlock with also be 1.55 time more (likely).

The graph above shows my weights which have been derived from actual observations. The 5 labels (A to E) are very interesting and say quite a lot about life in Athens (or indeed many cities). It is obvious that in (A) congestion is much higher from ~7am, peaking exactly at 9am (B). Between 8 and 9, the situation rapidly deteriorates - People are going to work, and evidently the majority of the population want to be there at 9. The sudden peak is due to the size of the city - it is small, you can go quite far in 1 hour and most people will leave for work during this period. Label (C) shows that during weekdays most people will be at work and some perhaps heading out for lunch. Label (C) also shows a peak during saturday, which is probably people who are working on saturdays (having again arrived at 9 - smaller peak below B) and are leaving earlier at around 1pm. Label (C) also marks an interesting point - At around ~2pm the traffic suddenly starts to rise again up until it's peak at ~5pm. This long-tailed rise shows that unlike the sharp 9am peak, where most people want to be at work for 9, the time for people leaving work is obviously quite variable, but peaks around 5pm and long-tails off to settle down at around 8pm. Perhaps I'm in the wrong job because most people i know (working as programmers in Athens) have 0 chance of leaving work before 8. The final label (E) shows a peak of Saturday traffic at around 9pm - tsiporisation :)

Monday, January 3, 2011

Monday, November 8, 2010

Athens transit weighted graph I

Experiment, code->paint: Bash, MySQL Geospatial, Java, Linestring weighted edge clustering, Oil on canvas.
Central Athens bus network, edges weighted by OASA frequency.

Sunday, October 10, 2010

Visualisation of Athens Public Transport

"Unfortunately, no one can be told what The Matrix is. You have to see it for yourself."

Using the new styled map functionality of Google maps API v3, I've created these visualisations of the bus network in Athens.

The network is made up of 569 bus routes covering a combined distance of 6500 Km - enough to stretch to central China... Each route is shown here in red - where bright red indicates a higher concentration of buses.

Central Athens - This detail demonstrates how well the centre of Athens is connected. It puzzles me why anyone would need to use a car (or taxi ;)) I'm in favor of banning all cars in the central zone...

Connections to the port from central Athens

Port of Piraeus

Agia Paraskevi

Monday, September 20, 2010

Athens Taxi Cost Estimation

Tired of being overcharged for taxi trips in Athens? At the weekend I made this simple tool to help estimate the cost of traveling through Athens by taxi: The costs have been calculated according to the most up to date laws (2010) and include a full itemisation of charges you can expect from your taxi trip. The charges also include the 11% VAT introduced for Athens taxi services in July.

To use the tool, simply right-click on the map to specify your starting and destination address or use the input fields in the left hand side menu. The screenshot below shows a trip from the port to the airport including additional port, toll-road (Attikh Odos) and airport fees.

The tool was made using Google's Beta Maps API v3, Google geocoding and directions services and some very ugly hacked up css/jquery/javascript. Improvements coming soon :)

Open Transport Data

A Case for Open Data in Transit from Streetfilms on Vimeo.

Saturday, May 1, 2010

Eastern Telegraph Co Ltd

This interesting looking building was once the hub of telecommunications for Greece, established c,1854 (I think). It is the site where the first submarine cables were laid, in 1858, connecting Syros to mainland Greece via Pieria. Later, cables were laid connecting Syros and Chios, subsequently providing the first telecommunications to Constantinople and Alexandria.

According to this reference Greece under King George, "There are 8,958 kilometres of wire, and 186 telegraph- offices. The most important service is that of the Eastern Telegraph Company, with its head-quarters at Syros." The building later housed the Merchant Navy School.

The interior of the building is fascinating. Strange mechanical devices lay dormant amidst the call of squabbling pigeons.

The structure was severely damaged during the Second World War, although restored in 1947 to house the TTT (Panhellenic Telephony Telegraphy & Postal Service).


Next to the Eastern Telegraph Company building lay the ruins of "Limokathartirio"; a large complex, constructed in 1839 by the Bavarian architect & painter, Wilhelm von Weiler. Weiler was also the architect of the Old Military Hospital in Athens, constructed earlier in 1834. The Lazaretta/Limokathartirio building in Syros is considered to be Weiler's masterpiece. Initially, the complex was used as Quarantine until the late 19th century.

The main part of the building consists of 32 rooms (all equipped with a kitchen, wash area and lavatory), in which mainly Eastern travelers would be required to stay for a minimum of 7 days "clearing period" before entering Ermoupoli.

Until the late 19th century, this clearing period was implemented as a measure to protect the local population of Syros from major diseases of the time, such as the deadly cholera pandemics. From 1908, the complex was used as an asylum for the insane. And later still, up until 1961, used as a prison holding political prisoners en-route to be expelled to deserted islands.

The building has been divided up according to function (Quarantine, Asylum, Prison). It is possible to identify the purpose of each area by looking at the colour of the walls. Saffron is the quarantine area, while Red is the Asylum area and Grey, the prison.

It is difficult to find any specific historical events (although there must be many) related to the occupants of the building. Although, through a little searching, it is apparent that a Greek priest, intellectual & revolutionary (Theophilos Kairis) died in the building (assuming the building was also a prison at the time) on the 10th January, 1853.

In modern Greek history, Theophilos Kairis (b. 1784) is an important figure. On the 10th May, 1821 he declared the War of Independence by raising the Greek flag on the island of Andros (See wikipedia article). He also imported revolutionary thinking, such as western science & liberalism in to Greece and is notably credited with introducing the first telescope to Greece.

Despite this, his "radical" ideas did not impress the Greek Orthodox Church. Consequently, the never ending obstacle of rational thought that is - The Church -, declared him as a heretic & sent him to Syros for trial. Theophilos Kairis, along with his collaborators were tried on the 21st December 1852 in Ermopoli, where he was sentenced to 2 years imprisonment. Being 68 years old and frail, he perished after only 10 days of imprisonment in the Lazaretta building.

He was buried without any ceremony in the yard of the building; Later, priest's, along with followers of the church, desecrated his grave by exhuming his body, cutting open his belly and stuffing it with Lime in protest of his "heresy".

Off topic note: according to this reference, the desecration happened as above - I find this a little hard to believe. It is known that Theophilos Kairis died of "natural causes" - after only 10 days in prison. He died in a time of plague & disease, he was buried without ceremony, and further still, in a building which was also used as quarantine. If these facts are taken into consideration, along with the strange use of Lime, I believe: Lime (Quick lime / powder - Calcium Oxide), in times of plague, has historically been used to cover corpses. This is because calcium oxide/lime speeds up the decomposition process, thereby minimising the risk of disease to the living population. Lime, in it's raw form, was also used to mask the smell of death. Given this, it's more probable that since he perished of "natural causes", it may have been suspected that he died of sickness, and had to be buried quickly - hense the lack of ceremony.

It's a fascinating structure, with a strange, and as is the case with Theophilos Kairis; a dark history. Unfortunately, many of the red brick arches have become the victim of developers (one of whom is known locally as "The Dragon") who have apparently removed many bricks by hammering away at the building under the cover of night in order to build fireplaces in modern day homes.

In 1976, a developer removed over 50 bricks from von Weiler's masterpiece to be used in reinforced concrete for a patio for a holiday villa in the nearby village of Finika. Subsequently, the magnificent arches at the front of the building collapsed. Soon after this, in 1978, the Greek state recognised the building as an official monument; although since then, it has been left to ruin. More unfortunate are the plans to use the site as a casino :( -- I believe this to be a shame, given the site's notable architecture and its importance in modern Greek history.