Space Exploration

Hello? Hello? Is this thing still on?

Hello internet. It’s been quite a while. It’s been a very busy year personally and professionally for me, encompassing changes, big highs and a few sad lows. In truth, I’ve found it difficult to find something worth writing about, even when I found the time. I took an interest in Kodi before finding out how frustrating it can be, and setting up a gaming PC in the Amazon EC2 cloud looked fun, but I never find time to play games these days (although I’ll probably give that another go at some stage).

Tumbleweeds passed by. And then Local Guides for Google Maps came along and piqued my interest.

To Boldly Go

So what is Local Guides? Simply put, it’s a points based rewards scheme for Google Maps. You make contributions, you earn points. Earn enough points, and you level up. Level up, and you get rewards. The rewards are as follows:

  • Level 1 (0 – 4 points): Enter exclusive contests in select countries.
  • Level 2 (5 – 49 points): Get early access to new Google products and features.
  • Level 3 (50 – 199 points): Show up in the Google Maps app with your official Local Guides badge.
  • Level 4 (200 – 499 points): Receive a free 1 TB upgrade of your Drive storage for 2 years.
  • Level 5 (500+ points): Become eligible to apply to attend the annual Local Guides summit in 2016; meet other top Guides from around the world, explore the Google campus, and get the latest info about Google Maps.

To be honest, at first glance, the rewards are pretty meh. However, Level 4 is the headline grabber for me – 1 TB of free storage. Just as well – I got 100Gb of free storage when I bought my Chromebook, and that expires in a few months. I could renew this for as little as $2 a month, but nothing beats free!

So my mission – to explore strange new worlds, to seek out new life and new civilizations, to boldly go where no one has gone before. And in the process, get those 200 points.

Point Scoring

You can contribute to Google Maps in five different ways, each way earns you 1 point:

1 – Add a New Location

Maybe a local business is missing from maps. Add it in using Google Map Maker a for an easy point. New locations are subject to peer review, you’ll get points only if they are published.

01_mapmaker

2 – Edit an Existing Location

Maybe an existing location has details missing (think opening hours, phone number, website), or has incorrect details (location on map is incorrect, classified as “restaurant” when it should be “coffee shop”), or details to be refined (change “restaurant” to “Chinese restaurant” if it better describes a location). Again, subject to peer review, and will earn you a point. Note that you can only earn 1 edit point per location – multiple edits of the same place do not accrue more points.

03_tescoedit

3 – Post a Review

Easy point here. Click any location on Google Maps and give it a rating of 1-5 stars. Add a comment. Job done.

04_review

4 – Add a Photo

Following on from the review point above, add a photo for another easy point. I’m one of those annoying people who take photos of my food at restaurants, so I have plenty of these. Note that you can only earn one point per location here – multiple photos do not get multiple points (though it is nice to see how many views your photos get).

photosGreat

5 – Answer Questions

This is only available from the smartphone app. For any place you have reviewed, you can answer a few yes / no / not sure questions about it. Again, only one point per location, not per question. This is available from two places in the app – from the “Contributions” tab, or by selecting a location in maps and clicking on the “Do you know this place?” link.

questions

How Many Locations?

It’s possible to earn up to 5 points per location, but in practical terms it’s highly unlikely you’ll get 5 points for everything (unless you add in 40 brand new locations that no-one else has ever heard of in the Amazonian rainforests or under the sea). A good question is what is the average points score per location. Here’s my experience:

  • For every location I review, I get 1 point.
  • I answer questions on each location – 1 point.
  • I add a photo roughly every 1 in five locations – 0.2 point.
  • I add a new location roughly every 1 in 10 locations – 0.1 point.
  • I edit a location roughly every 1 in 5 locations (pro tip – add a location and then edit it with further detail after it’s been published) – 0.2 point.

This gives a weighted average of 2.5 points per location. For 200 points, this equates to 80 locations. 80 Locations is my new target.

80 Locations? That’s Loads!

Nope. It’s not really. Think of different categories of places, and think of different places you’ve been to within those categories. Here’s 10 categories for starters:

  • Restaurants – easy to rhyme off places I’ve been to for family or work functions (lunches and dinners), or places I fancied trying by myself. Some I frequent, others I’ve only been once. Quick review on the food, service, maybe a photo.
  • Takeaways – a quick review of the local burrito place, burger bar, chip shop…etc. along with a pic of what I’ve bought.
  • Cafés – there is no shortage of coffee shops nowadays. Are they all the same? Differentiate by talking about service, cleanliness, convenience of location, maybe a photo of a view if there is one.
  • Pubs – can I name 8 bars I’ve had a pint in over the years? Is that a rhetorical question? I don’t usually post pub photos unless it’s a shot of the outside of the building, maybe a holiday snap. Quick comment about atmosphere, cleanliness, beers on offer…etc.
  • Hotels – reviews on anywhere I’ve stayed while on holiday, or wedding venues I’ve been to. Usually have a photo for these.
  • Supermarkets – okay, I don’t take photos of these places, but can easily rhyme off 8 supermarkets (there are 4 within a mile radius of my house alone). Do they stock the products I like? Are they clean? What are the staff like?
  • Convenience Stores – again, no photo ops here really. Same deal as when reviewing supermarkets, although these stores are much more plentiful.
  • Petrol Stations – no photos, simple reviews – do they have adequate parking, are they conveniently located, is there a shop in store, are there toilet facilities…etc.
  • Parks / Zoos / Museums – loads of photos here. Plenty to say about each location as there’s such a variety of scope here.
  • Music / Sporting Venues – some photo opportunities here. I’ve been to plenty of live music shows and have favourites venues – the atmosphere, the sound… good memories. Likewise, I have my favourite sports venues too for similar reasons (Monza in 2006 – what a day that was).

Welcome to Level 4

10-15 minutes a day over lunch or in the evening, and I’d easily amassed 200 points in less than 2 weeks. Truth be told, it became quite addictive. Then the waiting game – it can take up to 5 weeks before you get an email to offer you the 1 TB of storage, but I got mine in a few days. Job done!

06_tb

 

 

 

 

 

 

Taking Matters into Your Own Hands – Android Apps on Chrome

Chrome OS Revisited

Just over nine months ago, I wrote this post outlining why I had purchased a Chromebook. In the time since, I’ve lived quite comfortably without a home Windows environment. Just to recap:

  • I can browse the web, email, shop, bank and pay bills online.
  • Google Docs, Sheets and Slides provide me with word processing, spreadsheet and presentation software respectively.
  • I have 100Gb of Google Drive storage (and I can now add OneDrive and Dropbox into the mix as well).
  • Photos / Google+ provides fantastic photo backup and sharing service, along with lightweight editing facilities.
  • All my music is in the cloud with Google Music (the free storage limit was recently extended from 20,000 to 50,000 songs!).
Hello again!

Hello again!

My HP11 Chromebook is light (just over a kilo), fast (7 second boot) and cheap (£180 last year!), with other similarly specced models available in the same price range. For those things that my Chromebook can’t do, I have my trusty Nexus 4 phone. The Play Store has a plethora of apps to keep me happy, more so than the Chrome Web Store.

And yet… I’d like my Chromebook to do those things for me. Chrome is great, but I want it to do more…

Android Apps on Chrome OS

At Google I/O last year, there was a lot of excitement about the upcoming “Android L” release (which we now know and love as “Lollipop”), including integration with Chrome OS. One showcased feature was Android apps running on a Chromebook – notably Vine and Evernote.

Apps! Apps! Apps!

Apps! Apps! Apps!

I was pretty excited about this – complimenting the Chrome Web Store with the Play Store would add a lot of new functionality to the humble Chromebook – literally thousands and thousands of new apps to really make your Chromebook exactly the way you want it. But it didn’t pan out like that.

I don’t know if I was alone in assuming that apps would “just work”, but it turns out each app has to be ported across to the new platform (i.e. from a touch based phone/tablet to a laptop form factor). This may involve some rewrite on the part of the app developer. Admittedly, some ported apps do “just work”, but I’ll come to that later.

Introducing Twerk

Not long after the first trickle of Android apps for Chrome OS came out, ways and means surfaced on how it was possible to port apps across to Chrome OS. They were far too experimental for my liking, and lead to a complete wipe of the machine too. It was something I decided to live without.

And the Twerk came along. Simply put, it allows me to package up and Android app as a Chrome Extension with a few button clicks and no major surgery involved. I’ve been able to package up and install a few apps that I like, enhancing the functionality that I already have (especially the offline apps). So far, I have ported:

Skype. On a Chromebook. Note the nice little icon in the shelf too!

Skype. On a Chromebook. Note the nice little icon on the shelf too!

Not just for Chrome OS – Android on Windows, Linux and Mac!

In reading up on ported apps, I stumbled upon a reddit community outlining what apps had been tested, which have bugs, which work well…etc. They maintain a list here. Along the way, I learned that such functionality is available on Windows, Linux and Mac as long as the Chrome browser is installed.

That’s right – you can run and test Android apps without owning an Android device.

And Now the How

I’m not a magician, so I’m happy to explain how my tricks are done. Before porting any apps, there are two steps to do:

  • Install ARChon Runtime for Chrome
  • Install Twerk

ARChon is a package that allows Chrome to run Android apps. It’s available here – pay attention to which OS version you require. Once downloaded, unpack the zip file and then go to chrome://extensions in the address bar:

Configuring the runtime

Configuring the runtime

Click the checkbox to enable “Developer mode” checkbox, and then click the “Load unpacked extension” button (both shown above). From the popup displayed, select the directory where you unzipped the ARChon zip file to. ARChon is now “installed”.

Alternatively, If you are using Chrome OS, you can simply install the Evernote app – before the Evernote app is installed, the Android Runtime for Chrome (a suitable alternative runtime) is automatically downloaded and installed first.

Twerk is a much smaller download. Both installs here are one-offs.

For each app we wish to port, we must do the following:

  • Obtain APK
  • Create app extension with Twerk
  • Load the new extension in Chrome

Here’s One I Prepared Earlier

So here’s a step by step guide to how I created my Chrome SNES app. It’s an offline app, meaning that I don’t need an internet connection for it to work (which is a sticking point for a lot of people when it comes to Chrome OS).

Firstly, go to the Play Store and browse to find the app you want, then copy the URL in the address bar. Next, go to APK downloader and paste in the URL of the app you just copied. Click to generate the download link, and the download your APK:

Obtaining the APK

Obtaining the APK

Now that you have the APK you need, launch Twerk. Drag and drop your APK file as shown, and configure as you like:

Configuring your app settings

Configuring your app settings

Note the settings I have chosen. Furthermore, clicking on the default package icon allows you to set a new icon – I set mine to something more appropriate that I just downloaded from Google images. Click the button at the bottom to create and save the new Chrome extension – my OCD necessitates that I save all my extensions in a special “Extensions” folder, but that’s just me.

To load your new extension in Chrome, go to chrome://extensions/ again, and click on “Load Unpacked Extension”. This time, select the folder of the extension you have created. Once loaded, you should see the following:

Configuraing the new extension

Configuraing the new extension

Click on “Launch” to launch your newly packaged extension. After the first launch, an icon is created in your app drawer, so there is no need to go back to the Extensions tab each time. Anyhow, the running app initially looks like this:

Loading screen...

Loading screen…

As it’s an offline app, you will be asked to specify a folder where all offline data is stored the first time the app is loaded. Again, my OCD means that I have a separate “AppData” folder where I save all offline application application data.

And then my app is good to go! Here is Super Mario World running natively on my Chromebook (ROMs obtained separately).

25 years ago...

25 years ago…

Go Forth and Experiment!

I’ve ported a handful of apps in the last week. Some work, some don’t. I’m keen to try more emulators, as well as install more offline apps and utilities. I often check in on the reddit list to see what has been tested, and sometimes whole extensions are packages up ready for download.

Chrome demonstrates once more that it is no ordinary browser. By extension, my Chromebook is no ordinary computer – fast, functional and cheap! Missing Windows? Not me.

One to Rule Them All – Custom Search Engines with Google Chrome

Searching for Answers

Tax avoiding billionaire celebrity bicycle faller-offer Bono once sang: “I still haven’t found what looking for”. Perhaps he wasn’t looking in the right place. I think he’d have had more luck if he used a custom search engine.

Normal search engines will scour the web to find content relevant to what you are looking for. There are 2 problems with this:

  • Not all of the web is accessible via search engine
  • Sometimes too much content is returned

Consider the first point raised here – if you use webmail such as GMail or Outlook, you can’t use a search engine to find that email you’re looking for, and rightly so. Instead, you must use the search facility provided by that mail provider.

Regarding the second point – in retrieving relevant content, a search engine may return several pages of results, and the user is faced with an exercise in separating the wheat from the chaff. Consequently, the user may restrict the search to content from a specific site – whilst this is possible using the site keyword, it’s a somewhat hamfisted approach in comparison to using a sites native search functionality if this exists – consider using Google to search for a forum post rather than forum search capabilities. The forum search functionality will be much more accurate, but it’d be nice to have the simplicity of a single search textbox that most search engines provide.

XDA Developer Forum - Not short on criteria

XDA Developer Forum – Not short on criteria

The Chrome Omnibox – Powerful yet Simple

If you’re a Chrome user, you’re familiar with the Omnibox – the address bar come search engine at the top of the window:

Web address on left, search criteria on right. Note the change in icons also.

Web address on left, search criteria on right. Note the change in icons also.

Type in a valid URL, hit enter and you’re taken to that webpage. Type anything else, hit enter and you’re taken to a page of search results for the criteria you just entered.

Such a simple, useful interface.

Let’s make it more useful, and add the ability to search webmail.

Custom Search Engine – GMail

Going back to my earlier point, I’d like to be able to use the Omnibox to search my GMail account. GMail already has this search functionality – I’d just like to be able to access it via the Omnibox.

Right click on the Omnibox and choose “Edit Search Engines”

edit

From the popup that appears, scroll down to “Other Search Engines”:

custom

Add a new row here with the following values:

New Search Engine GMail
Keyword gm
URL mail.google.com/u/0/search/%s

Then click the “Done” button.

Now click in the Omnibox, type gm then press Tab. Notice how your custom search engine is activated:

omniGmail

Type in your search criteria and hit enter – provided you are logged into Gmail (or Chrome) you will be redirected to the page of results in GMail:

omniGmail2

Custom Search Engine – Forum

Now on to the second point – how to use forum search functionality inside a nice simple interface like the Omnibox?

To lay some context, I’m an F1 fan who regularly visits PlanetF1.com. They have a user forum filled with commentary, rumour, opinions – the usual stuff. Quite often, stories will appear in the forum before the appear in the main site – forum users have other sources of information which they make public before any official announcement. They may or may not be true.

If you’ve never heard of Ross Brawn, you don’t know F1. He’s a highly respected F1 team boss who has achieved success with Benetton, Ferrari and even his own team, Brawn GP. He went on sabbatical a few years ago, seemingly leaving F1 and never to return. But there are always rumours of a comeback.

In short, I want a quick way to search the PlanetF1 forum for any topics that match the following criteria:

  • Any topic from the last 3 months
  • Within a specific subforum
  • Where my search text appears in the first post of the topic
  • Ordered most recent post first

Following the previous steps, I create a custom search engine with the following values:

New Search Engine PlanetF1
Keyword pf1
URL forum.planet-f1.com/search.php?keywords=%s&
terms=all&author=&fid%5B%5D=3
&sc=1&sf=firstpost&sk=t&sd=d&
sr=topics&st=90&ch=300&t=0&submit=Search

To use this search engine:

  • Click in the Omnibox
  • Enter the keyword (pf1)
  • Hit the Tab key to activate the custom search engine
  • Enter my search criteria (Ross Brawn)

rossbrawn2

Bingo!

GETting to the point – How does this work?

Pick any site you know of that provides search functionality. For the sake of argument, lets try BBC news, and search for something topical, such as “Sony hack”:

sonyhack

You are redirected to a page of results, but take note of the URL of this page:

http://www.bbc.co.uk/search?q=Sony%20hack

Let’s search for “Christmas Sales”. Again, note the URL of the page your are redirected to:

http://www.bbc.co.uk/search?q=Christmas%20sales

The URLs only differ by the search criteria entered here. This is an example of an http GET request, where the search criteria appears in the URL (as opposed to an http POST, where the search criteria is sent separately to the page request and thus does not appear in the URL).

The idea here is to create a template which uses characters that we enter from the Omnibox. In the case above, we substitute our search criteria with the %s wildcard thus:

New Search Engine BBC News
Keyword bbc
URL bbc.co.uk/search?q=%s

And hey presto – a custom BBC news search engine!

For Your Convenience…

I’ve compiled a list of custom search engine URLs that you may find useful. Note that you are free to name them as you like and set whatever keyword you fancy.

Social Media
Facebook facebook.com/search/results/?init=quick&q=%s
Twitter twitter.com/search?src=typd&q=%
Google+ plus.google.com/s/%s
Yammer yammer.com/#/Threads/Search?type=following&search=%s&search_source=global
LinkedIn linkedin.com/vsearch/f?type=all&keywords=%s
Webmail
GMail mail.google.com/mail/u/0/#search/%s
Inbox for GMail inbox.google.com/u/0/search/%s
Outlook dub125.mail.live.com/?fid=flsearch&srch=1&skws=%s&sdr=4&satt=0
News and Reference
BBC News bbc.co.uk/search?q=%s
Wikipedia en.wikipedia.org/w/index.php?search=%s&title=Special%3ASearch&go=Go
IMDb imdb.com/find?ref_=nv_sr_fn&q=%s&s=all
Buying Online
Amazon UK amazon.co.uk/s/ref=nb_sb_noss_1?url=search-alias%3Daps&field-keywords=%s
eBay UK ebay.co.uk/sch/i.html?_nkw=%s&_sacat=0
Streaming Media
YouTube youtube.com/results?search_query=%s
BBC iPlayer bbc.co.uk/iplayer/search?q=%s
Other
Google Drive drive.google.com/drive/#search?q=%s
Google Translate to French translate.google.co.uk/#auto/fr/%s
Google Translate to Spanish translate.google.co.uk/#auto/es/%s
Google Translate to German translate.google.co.uk/#auto/de/%s
Google Translate to Italian translate.google.co.uk/#auto/it/%s
Google Translate to Irish translate.google.co.uk/#auto/ga/%s

Go Forth and Multiply!

That’s just a handful of custom search engines I can think of off the top of my head. Use as many or as few as you like. Better yet, come up with a few of your own 🙂

Lights! Camera! Action! Data Mining the IMDb Top 250

Hi ho!

As discussed in the previous post, data mining is the analysis of a dataset for the purpose of identifying patterns. In turn, these patterns give us a better understanding as to the nature of the data we are recording. Mining a weather dataset can help predict long term weather forecasts and plan accordingly; mining a sales dataset can help identity best selling products / sales groups and address stock levels / marketing strategy accordingly.

In this post, I will be mining the IMDb Top 250 dataset from the previous post. Whilst the information obtained herein is unlikely to be of any financial benefit to the reader, the purpose will be to impart practical examples of data mining techniques that the reader may then apply to their own situation.

Querying and Filtering a Single Table

A lot can be learned from a few queries on a single table. Let’s look at the Film table and find all those films in the top 250 that have been release from the year 2000 onwards:

top250

How many films is that though?

The answer here is 80 films. COUNT() is an example of an aggregate function – that is, one that returns a single value based on rows of results, rather than a value per row. Some other handy aggregate functions include:

  • MIN() – the lowest value of a column
  • MAX() – the maximum value of a column
  • AVG() – the mean average value of a column
  • SUM() – the sum of all values of a column

I haven’t included a SUM() example above (as summing ratings is pretty meaningless). That said, if our Film table had a “Total Ticket Sales” column, it would be interesting to sum that and compare different years to see the rise and fall of sales over the years, and potentially chart that against the state of the economy at that time. Just saying…

Going Wild!

If we are searching by text criteria, we can specify a generic data range using the % wildcard. The following query lists all those films that begin with the word “The”:

TheFilm

However, we may wish to search for any film where a specific word appears anywhere in the title. The following query lists all those films with the word “City” in the title:

CityFilm

A word of warning here though – adding the percentage wildcard to the beginning of a string search will negate any indexing done on this column. This will result in queries that take much longer to return than expected, especially if we query a large dataset!

Cross Table Queries

A lot more can be learned by interrogating multiple tables at once using joins. By joining the following three tables:

  • Actor
  • lnk_Actor_Film
  • Film

We can determine the cast of a film:

lebowski2

Or find out what films an actor has been in that are in the top 250:

ellenpage

Grouping Films by Genre

I’m a big Sci-Fi fan. Using the following query, I can find the list of Sci-Fi films in the top 250, sorted by descending rating:

scifi

What’s also interesting would be to find what the most popular genres are – that is, the number of films in the top 250 per genre. This query will combine a row value (Genre.GenreName) with an aggregate value (the number of films that belong to the genre specified in Genre.GenreName). To do this, we use GROUP BY thus:

genres

Note that the sum of GenreCount will exceed 250 here, as films can be of more than one genre. Speaking multiple genres, it’s also useful to know what sort of films a specified actor normally appears in:

portmangenres

Robert De Niro and Joe Pesci explain Unions, Intersects and Exceptions

Robert De Niro has quite a few films in the top 250, and so does Joe Pesci.

JoeBob

Here’s how to find a list of all films in which either of them have appeared, using UNION:

DeNiroPesci1

Worth noting here is that UNION will eliminate duplicates – i.e. films that both actors have appeared in together (UNION ALL lists duplicates). Speaking of films starring both actors, we can find this out using INTERSECT:

DeNiroPesci2

Also, we might want to find those films that Robert De Niro is in, but Joe Pesci is not – we can do this using EXCEPT:

DeNiroPesci3

Clint Eastwood and Morgan Freeman are Cannier than Robert De Niro and Joe Pesci

The trouble with UNION, INTERSECT AND EXCEPT above is that we end up scanning the same table (Film) twice – once for Robert De Niro, and once for Joe Pesci. It’s not efficient, especially if we are scanning a large table. Clint Eastwood and Morgan Freeman also have a lot of films in the top 250, and having been around longer, they know a few things.

MorganClint

To get a list of films that either Clint Eastwood or Morgan Freeman have appeared in, we can use the following query:

ClintMorgan1

Here, we use an “OR” condition in our “WHERE” clause, so the Film table is only scanned once. The addition of the DISTINCT keyword ensures that we remove duplicate films (i.e. films that Clint Eastwood and Morgan Freeman have appeared in together will only be listed once).

Speaking of films that star both actors, consider the following query:

ClintMorgan2

This is interesting because we use aliasing to work with two copies of the Actor and lnk_Actor_Film tables. Moreover, we join the lnk_Actor_Film table to a copy of itself! Conceptually, it looks like this:

dbstuff2

Whilst we are scanning the Actor table twice, we are doing that with the INTERSECT version of this query anyhow.

Lastly, we consider the films that Clint Eastwood is in, but Morgan Freeman is not. Have a look at the following query:

ClintMorgan3

There are a few interesting points here. Firstly, we create a temporary dataset FreemanFilmIds that only exists for the execution of the query. This table stores the Ids of Films that Morgan Freeman has appeared in, without having to scan the Film table at all (we’re only interested in the ID here).

The next interesting point is that we left join our Clint Eastwood film data with our FreemanFildIds dataset. Conceptually, it looks like this:

LeftJoin

We then filter on FreemanFildIds.FilmId is null – that is, those films that Clint Eastwood is in, but Morgan Freeman is not.

Filtering by Aggregate Functions

Consider the following query to find the top 20 Actors by highest average film rating of films they have appeared in:

Here are the results:

top20

Who are these people? And all the same score? Where is Morgan Freeman?

On investigation, it turns out that all of the above people have only appeared in 1 film in the top 250 – The Shawshank Redemption – which has a score of 9.2 (the highest score) meaning that film extras or minor part actors have a very high average rating. I did like Clancy Brown in this film, and he’s great in The Highlander (not in the top 250 tho!), but he’s no Morgan Freeman.

As such, I rewrite the above query with the stipulation that an actor must appear in at least 5 films in the top 250. To filter on an aggregate function (namely COUNT()), I must use the HAVING keyword thus:

Here are the results:

top20-5

 

A lot of familiar names there.

I Didn’t Like Seeing Donny Go…

And so we reach the final curtain. I hope I’ve provided ample examples of  functions and techniques for your benefit. If so, please comment, like, favourite, retweet, +1, share…etc. Full script of SQL queries available here.

Everything in its Right Place – Database Normalisation

Flat File Tables – Everything at Once

I’m currently working on a project to replace an Excel based filesystem with a .NET / SQL Server solution. Part of the development effort necessitates the conversion of Excel lookup tables to relational database tables. Often, the Excel lookup tables are “flat file” in nature, and require normalisation.

To illustrate the point without discussing project data, consider the following dataset (data taken from IMDb Top 250 as of date of posting):

dataset

This dataset represents a list of films and the actors who were in them. If I wanted to find the cast of a specific film using Excel, I can use a filter on the film name:

lebowski

The problem here is that a lot of data is repeated unnecessarily – if a film name was entered incorrectly, it would need to be changed for all occurrences in the film column. A similar issue arises for actor names.

There are clearly 2 distinct datasets here – Films and Actors. The link between these two datasets is known as a relationship; namely what films an actor has appeared in and vice versa.

Combining all aspects into a single dataset (i.e. the Cartesian product) produces what is known as a “flat file” database – a single table with all info.

Why do people store data like this unnecessarily? Well, it usually starts with a need to record all data about a relationship – the film details, the actor details…etc – but the subsequent analysis of the data into distinct groups is never done, and the table grows in the fashion because “that’s the way the data has always been stored”. The process of splitting the datasets up and identifying the relationships between them is known as normalisation. Here are some of the techniques I have used in my project work applied to the above dataset.

Importing Excel Data

The first thing to to is to import my Excel flat file dataset into a similar flat file table in my database. Normally, I would save my dataset off as a comma delimited values file, but there is an issue here as film titles can include commas as well, which would break a subsequent data import. To get around this, I use the method defined here to save my file as pipe delimited instead.

I then create a database table that will store all our flat file data thus:

Lastly, I run a bulk insert to add all our CSV data to the database:

Deriving our Film Dataset

Using the flat file table as a base, I create a new table of the list of distinct films:

I populate the film table thus:

I then modify the flat file table to add a link to the film table:

And populate this column thus:

Furthermore, we index this column as follows:

I no longer have need for the film columns on my flat file table, so those column are dropped:

Deriving our Actor Dataset

A similar series of scripts is applied to create an Actor table:

We are left with FilmId, ActorId and the property of this relationship (i.e. the character that the actor played in the film). Just to tidy things up, we’ll rename the table from FilmData to lnk_Actor_Film:

The Film Genres table – Getting Clever with Functions and Cursors

To create a Genre table (and the relationship between Film and Genre),  a little more work is required due to the way film genres are recorded in the flat file.

That is, a film has a comma delimited string of genres stored against it; ideally, we wish to get the complete list of distinct genres. To do this, we need the following custom function, which will split a single comma delimited string up and return a table of its constituent parts:

Next, we will use a cursor to iterate over all our films and add genres to new table thus:

Now we want the a table of the distinct genres, which we create and populate as follows:

Now for the relationship table to link Films and Genre. Again, we use another cursor to iterate over our films – this time we find all GenreIds that apply to the film, and add entries in the link table, and then drop the “Genres” column from the Film table:

So our Entity-Relationship Model looks like this:

ERDiagram

Hi Ho! Data Mining… Coming Soon

Normalised databases are perfect for Data mining i.e. the analysis of a dataset for the purpose of identifying patterns. These patterns are used to give us a better understanding as to the nature of the data we are recording. I’m going to follow up with another post on this topic in a few days. In the meantime, the both the CSV file and my MS SQL scripts are attached if you’d like to play around with the above dataset:

 

 

 

 

Upcycling – A New Use for an Old Smartphone

Hoard Mode

I confess, I’m a hoarder. It’s a habit that baffles family and friends; why do I retain items that other (saner) people would have thrown out as junk long ago?

In most cases, there is a misguided belief that an item will become useful again at some point in the future. In some cases, there is a sentimental attachment that prevents me disposing of said item. A few favourite exhibits from my extensive collection / pile are shown below.

L-R: Megadrive circa 1990; assorted F1 memorabilia from races attended last decade;  the mighty Scorpionok; one of many old phones

L-R: Sega Megadrive circa 1990; assorted F1 memorabilia from races attended last decade; the mighty Scorpionok; one of many old phones

The last item in the above picture is a Samsung Galaxy Ace, my first smartphone. I have really fond memories of this phone – it could do so much without costing a fortune, and it started my interest in Android, but it’s light years behind my Nexus 4. How could it be put to use? A backup / emergency phone? I have plenty of those already…

Things to do with an old Droid

Consider this: a smartphone is a mini computer, with built in WiFi, camera, Bluetooth, GPS, sensors, solid state storage – it’s a digital Swiss army knife.

The protocol droid is fluent in over six million forms of communication, despite having its mind wiped

The protocol droid is fluent in over six million forms of communication, despite having its mind wiped

There are several articles detailing tonnes of uses for an old Android, such as:

  • Media server
  • Internet connected security camera
  • In car GPS and navigation system
  • Pedometer
  • Torrentbox
  • VPN Server
  • Bitcoin miner
  • Internet radio

In time, I think I’ll attempt all of the above, but for today I’m writing about a torrentbox / media server. The instructions below show how to create either, or indeed both (which is what I did).

Torrenting for Beginners through the media of Tetris and Food

For those not in the know, torrenting is a very popular way of downloading larges files on the internet. Instead of downloading a large file from a single server, a user downloads file fragments from multiple servers asynchronously, which are then assembled in the correct order on the client machine. A separate, smaller file (the torrent file) tells the users computer which fragments to fetch, and the order in which they are to be assembled.

Notice that there is only one straight line piece in this diagram. Entirely intentional.

Notice that there is only one straight line piece in this diagram. Entirely intentional.

In essence, a torrent file is like a recipe – it details the ingredients (fragments) required to make a meal (large file) and the ordered steps to recreate it.

The advantage of the multi-server download approach presented by torrenting is that if one server goes down, the user just gets the fragments from somewhere else (a bit like buying some ingredients from Sainsbury’s if Tesco are out of stock, rather than waiting for Tesco to restock).

Disclaimer – you should not download copyrighted material. Please see this helpful video explaining why.

Creating a Torrentbox

Torrent users will search for torrent files on the internet. Those files are then opened with a torrent client program, and downloading begins. The download is left to progress in the background while the user does something else on their computer, or maybe just walks off and does something else entirely.

A torrentbox is a computer used only for the purpose of downloading files in this manner. Lots of old computers find new life as a torrentbox, why not old smartphones? They have disk storage, internet access, and require a minimal amount of power. By nature, they are highly portable as well.

And so I installed tTorrent on my Galaxy Ace.

I chose tTorrent because it’s one of the few torrent clients that has a “watch directory” function – that is, any torrent files dropped into a specified watch folder will automatically be opened and downloading will begin.

And so I installed Dropsync on my Galaxy Ace.

Dropsync allows me to synchronise a folder within my Dropbox account with a local folder on the phone. The idea being that I search for torrent files on my Nexus 4, my Chromebook, my Nexus 7 – any device with a browser – and drop them into my Dropbox folder. The Galaxy Ace syncs the folder, picks up the torrent file and begins downloading.

And all ready by the time I get home

And all ready by the time I get home

Unfortunately, the Galaxy Ace does not support USB On-The-Go, although most old phones running Android 4+ should (try it out if you have an old Galaxy S2 lying around). With USB OTG, I could attach an external hard drive to the phone for much more storage than an SD card could provide. That said, I’d recommend plugging both the smartphone and the external hard drive into an externally powered USB hub – that way the phone can access the hard disk without having to sacrifice an external power source and resort to the battery.

Creating a Media Server

The purpose of a media server will be to share out content stored on the phone with other devices on the same network. This may be content downloaded via torrenting, or any other media accessible to the device (via SD card or connected hard disk, say).

Just one app to install here – BubbleUPnP has a free version with limitations on how much video content may be streamed per day, but the licence is a little over £3 at present. This will make the phone and its content accessible to any other device on the network with a DLNA client:

  • If you have an Android phone or tablet, install BubbleUPnP on it (disable local content share in settings)
  • If you have an iPhone / iPad, install GoodPlayer
  • Windows phone / tablet, install MediaMonkey
  • Windows computers, Macs, XBox 360, XBox One, PS3 and PS4 already have built in clients
From has-been to centre of attention

From has-been to centre of attention

A New Life for an Old Friend

Okay, this setup will never rival a NAS box for performance, but it does give pretty good functionality for essentially nothing given that I had the parts lying around anyway, and power requirements are very low (it uses less energy in a day than it takes to boil an average kettle).

If I did want to spend money, these old phones will only get cheaper as new ones are released. In fact, I’ll probably end up inheriting a lot of them when family members upgrade to new phones.

But I’ll probably use those phones to try out some of the other ideas above.

It’s good to hoard after all.

Chromebooks – Taking the Plunge

Who owns a Chromebook anyway? Well… me

In October of last year, I received a gift voucher from work for completing 5 years of service. After almost a year, I finally got round to spending it – two weeks ago I bought a Chromebook, namely the HP11 shown below.

Action shots with obligatory coffee mug

Action shots with obligatory coffee mug

Why did you do that?!

There are lots of good reasons not to own a Chromebook. They’re not for everyone. Imagine a laptop where the only available program is a browser, and you’re halfway to understanding what a Chromebook is. My HP11 has no optical drive, so I can’t play CDs or DVDs. The hard disk is a meager 16Gb (as in sixteen, there are no missing zeroes there). Functionality is further limited when I don’t have a WiFi connection.

On the software side of things, I’ll be blunt. Anything you can do in a Chromebook you can already do on another computer just by installing the Chrome browser. But there’s lots more that other computers can do that the Chromebook just can’t.

And yet… I still bought a Chromebook. And not on a whim either – I took the best part of a year to decide on what I wanted to spend my gift voucher on. Why?

Making Decisions

My home computer is a Windows 7 laptop I bought 4 years ago. I’ve come to tire of the endless swathes of Windows updates and ever increasing boot times, so a buying a new laptop has been at the back of my mind for a while now. That said, a new Windows machine means Windows 8 (which I’m no fan of), and a new Mac would mean spending twice as much as I would on a Windows laptop. What are my alternatives?

Here we go again...

Here we go again…

I began to think about what I use my home laptop for. Browsing the web. Sending email. Online shopping. Online banking. Paying my bills. Social networking. I back up my photographs to the cloud. Occasionally, I watch catch up TV services, or dabble with Netflix. These are all browser based activities, so I’m not asking for much in a new laptop when it comes to hardware.

On that point, I’m not a PC gamer (I have an Xbox), so cutting edge hardware has never been a requirement of mine. I began to wonder what hardware I didn’t really need.

I can’t remember the last time I installed software from my DVD drive (I download), or if I’ve ever watched a DVD on my laptop. As for CDs – I converted all my CDs to MP3s years ago and pushed them to the cloud. Nowadays, I don’t buy physical media – all my music purchases are digital. In short, I don’t need an optical drive.

I’ve always wondered how much hard disk space I actually need. My laptop has a 500Gb hard disk, of which I have quite a bit of free space. In the interests of portability, I moved larger media files to an external hard disk and photos and music to the cloud. The idea of having my media confined to a single device now seems archaic to me.

What I actually want is a laptop that boots up and performs quickly for all my regular tasks.

So I started looking at Chromebooks, and began mulling over whether I could live with one.

Chrome – More than just a browser

Yes, the Chromebook is like a laptop that only has a browser installed, but that browser is Chrome!

Apps, extensions and themes galore...

Apps, extensions and themes galore…

A quick browse of the Chrome web store shows you the wide range of free apps and extensions you can add to Chrome, including a full suite of Office Software that is fully compatible with MS Office:

All of the above have a share functionality that allows multiple users to view / edit a document at the same time. The document owner can decide what level of access other users have. It’s a useful collaborative tool makes multiple users sending each other different copies of the same document a thing of the past.

I also recommend the following:

  • Gliffy for any flowcharts, organisational charts, UML diagrams …etc.
  • Save to Google Drive – allows you to right click and save web pages / images directly to your cloud storage.
  • MightyText – pair with your phone and send / receive SMS on your computer.
  • AccessToGo – remote desktop client I use to remote to an Amazon EC2 instance and do my Android development work.
  • Google+ Photos – use to backup and share your photos (Chromebook only).
  • Hangouts – use as you would use Skype

There is a selection of games too, including

No need to run out and buy a Chromebook for any of this – install Chrome on your computer and have a look round the store and install whatever you like!

Point of Sale

After building up a suitable level of intrigue (and seeing the HP11 on sale for a £50 discount), I decided to take a gamble and bought my Chromebook.

I held off on buying any additional storage to compliment the 16Gb hard disk – the HP11 comes with 100Gb of free Google Drive storage for 2 years ($1.99 a month thereafter). There are 2 USB ports, so I can pop in a thumbdrive if I really need one.

Sadly I couldn’t avail of the 2 months free unlimited music at Google Music All Access, as I already have a monthly subscription anyway.

Build Quality

There are some things about the HP11 that stand out straight away as great, some less so, and others things that take a bit of getting used to.

The build quality is superb, and not just superb for the price bracket it’s in. It’s light (just over a kilo), silent (ARM processor means no fans and thus Red October levels of silence), and the keyboard is comfortable and responsive. Interestingly, the HP11 speakers are underneath the keyboard, so the sound comes straight up from there. Sound quality is good too.

Another nice feature is the charger – the HP11 uses a standard USB Micro charger, just like many phones, tablets and other assorted gadgets today. Battery life is pretty good, although I can’t get the claimed 6 hours of use (5 at a push for me). Still, much more than adequate.

The screen is 11.6″, which is huge for a tablet, but smaller than what I’m used to with a laptop. The loss in screen space means that while perfectly usable, webpages appear more cluttered than I’m used to – there’s less padding on screen and a bit more scrolling to do – but to be fair, I do work with twin 24″ monitors at work. I’m wondering if I should have spent a little bit more on a HP14, but I like the size and weight of the HP11. A happy medium would be an external monitor to allow me to avail of greater screen space, although I can’t take that with me on the move. I could use a Chromecast to cast my HP11 screen to a TV, but I wouldn’t be able to run the TV on a different resolution to the monitor. Physical size aside, the screen sports a 1366 * 768 resolution and it just shines. It’s a beautiful thing.

Shiny!

Shiny!

The trackpad is taking me some time to get used to. Coming from a Windows laptop, I’m used to a trackpad with left and right buttons, just like a mouse. The HP11 has no such buttons – tap with one finger to left click and two fingers to right click. Use two fingers to scroll, or swipe with three fingers to see all open screens. If you’re a Mac user, this is all second nature to you. I do have a free USB port to attach a mouse, but I think this is one that I’ll persevere and get used to.

Ready, Set, Go! Well, sort of…

The HP11 will boot in 7 seconds (I haven’t left any zeroes out there either). This is light years ahead of my Windows 7 laptop. The reduced hardware set and solid state hard drive means that the processor has less to do on startup.

That said, the first time I started up, I was told that the latest version of Chrome OS was being downloaded and installed – this took around 5 minutes. Chrome OS will always stay up to date and silently download updates in the background, which are installed on reboot.

Existing login on left; new user login on right

Existing login on left; new user login on right

Users log in with their Google account (you have one already if you use GMail, but you can create a new one from the sign in screen if you want). If you use Chrome on other devices (I have Chrome on my work laptop, home laptop and my Nexus 4 phone), all your bookmarks and settings are synced. For me, I can start typing up a document on Google docs at work and then finish it off at home on the Chromebook. I could even proofread it on my phone on the commute to work the next day.

Simpler still – I take photos with my Nexus 4 phone, which backs them up to the cloud. When I log in to my Chromebook, they are there for me to browse, edit, download to USB, set as wallpaper…etc.

I can remove and add users to the Chromebook as I like. As such, anyone with a Google account can sign into my Chromebook and use it as it were their own (note they cannot see files from other accounts on the same Chromebook). When a user is removed, all their details and settings are scrubbed from the machine. Potentially, this makes reselling easier, but I’m not selling mine…

Going off the Grid

Time to get to the elephant in the room. Without an active WiFi connection, Chromebook functionality is heavily limited.

That said, several apps offer offline capabilities (all the office software listed above have an “offline mode”, which will merge any changes into cloud documents on reconnection).

However, Chromebooks have their full potential when they are connected to the internet. This is a cloud centric machine after all.

But then again, without an internet connection, my Windows 7 laptop can’t do most of the things I use it for anyway either…

Chrome OS and Android L

I’m pretty excited about the upcoming “L” release of the Android OS, particularly its’ integration with Chrome OS and the new unified look and feel between the two platforms.

Material design goodness will unify Android and Chrome OS interfaces

Material design goodness will unify Android and Chrome OS interfaces

A new proximity API will sign you into your Chromebook automatically if your Android phone is nearby.

Phone notifications will now appear on the Chromebook – no more missed calls or urgent SMS because I had headphones in at the time of the call.

But the biggest news of all if the capability to run Android apps natively on the Chromebook. The Chrome web store is good, but the Google Play Store is much bigger and better, with a lot more choice and a lot more games (perhaps Minecraft fans will be able to get their fix). I’m keen to see how this pans out as this could take Chromebooks to the next level.

Horses for Courses

You may be reading this and asking yourself “are Chromebooks for me?”.

Maybe they are. Maybe they’re not. As an everyday computer for home use, the HP11 is great. I use Microsoft Office 365 at work, so I’m still able to access my email, work calendar, Lync messenger and OneDrive storage on my Chromebook. I can create MS Office documents online also. Of course, I can fill in my timesheets and holiday requests too.

I have a few options when it comes to doing development work – Codenvy is a browser based IDE for developing and deploying apps to the cloud, but it only caters for a limited set of languages (there’s no .NET languages unsurprisingly) – as hinted earlier, one solution is to remote onto a virtual machine. I do this for Android apps that I write from time to time; in this way I pay for development functionality on a “pay as you go” basis, rather than paying for all the necessary hardware in an upfront one-off cost. I haven’t tried gaming on a VM yet, but have a feeling it won’t be great.

In short, while Chromebooks aren’t for everybody, I would say that they are for most of us.