Data Science - Small Business Investment in London

Capstone project on Jupyter Notebook

SMALL BUSINESS INVESTMENT IN LONDON


London's finance industry is based in the City of London and Canary Wharf, the two major business districts of the city. London is one of the pre-eminent financial centres of the world and one of the most important locations for international finance. In this notebook we discuss which venues are the most promising to invest in. We present an analysis made in Python and show, step by step, how to determine the top 10 venues for the best 3 boroughs, based on demographic data and unemployment statistics.
The content of this work has five sections:
  1. Getting the data
  2. Data wrangling
  3. Methodology
  4. Data analysis
  5. Results and Discussion

London is one of the most attractive cities in the world in which to do business. It is the capital of both England and the UK. In this notebook we show how to study UK data to understand how and where to invest money in a small business.

1. Getting the data

First of all, we use IPython with Python 3.6 to do the analysis.

1.1 Setup

In the setup part we simply load all the libraries needed for this study:
In [2]:
# library for BeautifulSoup, for web scraping
from bs4 import BeautifulSoup
# library to handle data in a vectorized manner
import numpy as np
# library for data analysis
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
# library to handle JSON files
import json
print('numpy, pandas, ..., imported...')
!pip -q install geopy
print('geopy installed...')
# convert an address into latitude and longitude values
from geopy.geocoders import Nominatim
print('Nominatim imported...')
# library to handle requests
import requests
print('requests imported...')
# transform JSON file into a pandas dataframe
from pandas.io.json import json_normalize
print('json_normalize imported...')
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
print('matplotlib imported...')
# import k-means from clustering stage
from sklearn.cluster import KMeans
print('Kmeans imported...')
# install the Geocoder
!pip -q install geocoder
import geocoder
# import time
import time
!pip -q install folium
print('folium installed...')
import folium # map rendering library
print('folium imported...')
from pandas import ExcelWriter
from pandas import ExcelFile
print('...Done')
numpy, pandas, ..., imported...
geopy installed...
Nominatim imported...
requests imported...
json_normalize imported...
matplotlib imported...
Kmeans imported...
folium installed...
folium imported...
...Done
In [3]:
import warnings
warnings.filterwarnings('ignore')

1.2 Selecting the data

To determine the area of London in which to invest, we take data from London's Poverty Profile 2017 and from the Annual Population Survey via Nomis (ONS), published by Trust for London [ 1 ] and the New Policy Institute [ 2 ].
The unemployment ratio is the proportion of the working-age population that is unemployed. The unemployment ratio by London borough is shown in the following figure, which is obtained with these commands:
In [5]:
unemployment_ratio_df = pd.read_csv("unemployment_ratio.csv") 
unemployment_ratio_df2=unemployment_ratio_df.dropna()
unemployment_ratio_df2.rename({'Unnamed: 1': '2001', 'Unnamed: 2': '2011'}, axis=1, inplace=True)
unemployment_ratio_df2.reset_index(drop=True, inplace=True)
unemployment_ratio_df2['2011'] = unemployment_ratio_df2['2011'].str[:-1].astype(float)
unemployment_ratio_df2['2001'] = unemployment_ratio_df2['2001'].str[:-1].astype(float)
unemployment_ratio_df3=unemployment_ratio_df2.sort_values(by='2011', ascending=True)
unemployment_ratio_df3.plot(x="Unemployment ratio by borough", y=["2001", "2011"], kind="bar")
Out[5]:
<matplotlib.axes._subplots.AxesSubplot at 0x1e838fa1cc0>

This graph shows that the ratio has come down significantly in almost all London boroughs in a relatively short timescale.
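The ratios are stored in the CSV as strings ending in a percent sign, which is why the cell above drops the last character before casting to float. A minimal sketch of that conversion, assuming a small hand-made frame (the name `sample` and the `change` column are illustrative only; the figures are the ones reported for these two boroughs in this notebook):

```python
import pandas as pd

# Hypothetical two-row sample mimicking the raw CSV, where ratios are strings ending in '%'
sample = pd.DataFrame({
    "Borough": ["Hammersmith and Fulham", "Wandsworth"],
    "2001": ["5.7%", "6.3%"],
    "2011": ["3.5%", "3.5%"],
})

# Strip the trailing percent sign and convert each column to float
for col in ["2001", "2011"]:
    sample[col] = sample[col].str.rstrip("%").astype(float)

# Percentage-point change between the two years (negative means the ratio fell)
sample["change"] = sample["2011"] - sample["2001"]
print(sample)
```

Using `str.rstrip("%")` rather than slicing off the last character leaves values without a percent sign untouched, which may be a little safer if the file is not perfectly uniform.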

Considering the lowest unemployment ratio by borough we have the following table:
In [6]:
unemployment_ratio_df3.head()
Out[6]:
    Unemployment ratio by borough  2001  2011
31  Hammersmith and Fulham         5.7   3.5
30  Wandsworth                     6.3   3.5
29  Kingston upon Thames           5.5   3.7
28  Camden                         5.4   3.9
27  Sutton                         5.8   3.9
The top 3 boroughs with the lowest unemployment ratio in 2011 are:
  1. Hammersmith and Fulham
  2. Wandsworth
  3. Kingston upon Thames
Now we compare this with the change in unemployment ratio from 2011-13 to 2014-16:
In [7]:
change_unemployment_ratio_df = pd.read_csv("change_unemployment_ratio.csv") 
change_unemployment_ratio_df=change_unemployment_ratio_df.dropna()
change_unemployment_ratio_df.head()
change_unemployment_ratio_df.rename({'Change in unemployment ratio 2011-13 to 2014-16': 'Borough', 'Unnamed: 1': 'Percentual'}, axis=1, inplace=True)
change_unemployment_ratio_df['Percentual'] = change_unemployment_ratio_df['Percentual'].str[:-1].astype(float)
change_unemployment_ratio_df2=change_unemployment_ratio_df.sort_values(by='Percentual', ascending=True)
change_unemployment_ratio_df2.plot(kind='bar',x='Borough',y='Percentual')
Out[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x1e8398bc208>
In [8]:
change_unemployment_ratio_df2.head(10)
Out[8]:
    Borough               Percentual
25  Newham                -3.825
8   Croydon               -3.699
9   Ealing                -3.664
10  Enfield               -3.528
22  Lambeth               -3.031
2   Barking and Dagenham  -2.945
23  Lewisham              -2.804
32  Wandsworth            -2.787
17  Hillingdon            -2.741
5   Brent                 -2.724
The largest falls in the unemployment ratio are in the boroughs:
  1. Newham
  2. Croydon
  3. Ealing
But this fact does not guarantee that these will be good places to invest. So, to be more confident about which borough is the best to invest in, we consider the average percentage change, which can be a good indication of which borough to choose.
In [9]:
change_unemployment_ratio_df2["Percentual"].mean()
Out[9]:
-1.9897500000000001
Therefore, boroughs whose change is around the mean of about -1.99% should be stable:
  1. Hammersmith and Fulham: -2.200%
  2. Haringey: -2.134%
  3. Waltham Forest: -2.036%
These are our three candidates. Because Hammersmith and Fulham also has the lowest unemployment ratio in 2011 (3.5%, tied with Wandsworth), it is a good candidate for investment. It is located in the South West and North West of London.
Thus we are going to explore which kinds of small business are trending in the South West of London.
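One possible way to make "around the mean" concrete is to rank boroughs by their absolute distance from the mean change. The sketch below assumes that reading; it uses only the boroughs and figures quoted in this notebook, and takes the overall mean from the output above rather than recomputing it, so the ranking is illustrative and not a result over all boroughs.

```python
import pandas as pd

# Changes in unemployment ratio (percentage points) quoted in this notebook
change = pd.Series({
    "Newham": -3.825,
    "Croydon": -3.699,
    "Hammersmith and Fulham": -2.200,
    "Haringey": -2.134,
    "Waltham Forest": -2.036,
})

# Mean over all boroughs as reported in Out[9] (not recomputed from this subset)
overall_mean = -1.98975

# Absolute distance of each borough from the overall mean, smallest first
distance = (change - overall_mean).abs().sort_values()
closest = distance.head(3).index.tolist()
print(closest)
```

Under this assumption the three candidates named above come out ahead of Newham and Croydon, whose large falls place them far from the mean.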

One of the strongest regions and cities on the planet is London. It’s highly attractive, and it offers one of the best prospects for business development and innovative thinking. London is the showcase for our work throughout Europe and is paramount in our global strategy.

2. Data wrangling

Data wrangling, sometimes referred to as data munging, is the process of transforming and mapping data from one "raw" form into another format, with the intent of making it more appropriate and valuable for a variety of downstream purposes such as analytics.
In this section we extract more data about London from different sources and transform it into dataframes in order to perform the analytics.

2.1 Exploration of the city of London

To transform data from websites into data that we can process, we use a great tool called BeautifulSoup. We load the library and get the list of areas of London from Wikipedia.
In [10]:
# library for BeautifulSoup
from bs4 import BeautifulSoup
wikipedia_link = 'https://en.wikipedia.org/wiki/List_of_areas_of_London'
wikipedia_page = requests.get(wikipedia_link)
In [11]:
# Parse the HTML page
soup = BeautifulSoup(wikipedia_page.content, 'html.parser')
# Extract the "tbody" of the first table whose class is "wikitable sortable"
table = soup.find('table', {'class':'wikitable sortable'}).tbody
# Extract all "tr" (table rows) within the table above
rows = table.find_all('tr')
# Extract the column headers from the "th" tags, removing any '\n'
columns = [i.text.replace('\n', '')
           for i in rows[0].find_all('th')]
# Collect the cell values of every data row. The first row (rows[0]) is skipped
# because it is the header; rows whose cell count does not match the header are ignored.
records = []
for row in rows[1:]:
    tds = row.find_all('td')
    values = [td.text.replace('\n', '').replace('\xa0', '') for td in tds]
    if len(values) == len(columns):
        records.append(values)
# Build the dataframe in one step (DataFrame.append was removed in pandas 2.0)
df = pd.DataFrame(records, columns = columns)
In [12]:
df.head(5)
Out[12]:
   Location     London borough                     Post town       Postcode district  Dial code  OS grid ref
0  Abbey Wood   Bexley, Greenwich [2]              LONDON          SE2                020        TQ465785
1  Acton        Ealing, Hammersmith and Fulham[3]  LONDON          W3, W4             020        TQ205805
2  Addington    Croydon[3]                         CROYDON         CR0                020        TQ375645
3  Addiscombe   Croydon[3]                         CROYDON         CR0                020        TQ345665
4  Albany Park  Bexley                             BEXLEY, SIDCUP  DA5, DA14          020        TQ478728
In [13]:
df.columns = ['Location', 'Borough', 'Post-town', 'Postcode',
       'Dial-code', 'OSgridref']
In [14]:
df.columns
Out[14]:
Index(['Location', 'Borough', 'Post-town', 'Postcode', 'Dial-code',
       'OSgridref'],
      dtype='object')
In [15]:
df.head()
Out[15]:
   Location     Borough                            Post-town       Postcode   Dial-code  OSgridref
0  Abbey Wood   Bexley, Greenwich [2]              LONDON          SE2        020        TQ465785
1  Acton        Ealing, Hammersmith and Fulham[3]  LONDON          W3, W4     020        TQ205805
2  Addington    Croydon[3]                         CROYDON         CR0        020        TQ375645
3  Addiscombe   Croydon[3]                         CROYDON         CR0        020        TQ345665
4  Albany Park  Bexley                             BEXLEY, SIDCUP  DA5, DA14  020        TQ478728
We have to perform some cleaning with the following command:
In [16]:
# Remove Borough reference numbers with []
df['Borough'] = df['Borough'].map(lambda x: x.rstrip(']').rstrip('0123456789').rstrip('['))
In [17]:
df.head()
Out[17]:
   Location     Borough                         Post-town       Postcode   Dial-code  OSgridref
0  Abbey Wood   Bexley, Greenwich               LONDON          SE2        020        TQ465785
1  Acton        Ealing, Hammersmith and Fulham  LONDON          W3, W4     020        TQ205805
2  Addington    Croydon                         CROYDON         CR0        020        TQ375645
3  Addiscombe   Croydon                         CROYDON         CR0        020        TQ345665
4  Albany Park  Bexley                          BEXLEY, SIDCUP  DA5, DA14  020        TQ478728
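The chained `rstrip` calls above remove a trailing reference marker such as `[2]` but can leave a trailing space behind. As an alternative sketch, a regular expression can strip the marker and any surrounding whitespace in one step. The helper name `strip_reference` is hypothetical and not part of the notebook:

```python
import re

def strip_reference(borough: str) -> str:
    """Remove a trailing Wikipedia-style reference marker such as '[2]'."""
    # \s* also consumes whitespace before and after the marker
    return re.sub(r"\s*\[\d+\]\s*$", "", borough)

examples = ["Bexley, Greenwich [2]", "Croydon[3]", "Bexley"]
cleaned = [strip_reference(b) for b in examples]
print(cleaned)
```

Applied to the dataframe, this could be used as `df['Borough'].map(strip_reference)`.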
We now make some assumptions to reduce the amount of data to process.
Where a location has several postcodes, they are spread over multiple rows, each row keeping the same values in the other columns.
In [18]:
df0 = df.drop('Postcode', axis=1).join(df['Postcode'].str.split(',', expand=True).stack().reset_index(level=1, drop=True).rename('Postcode'))
In [19]:
df0.head()
Out[19]:
   Location    Borough                         Post-town  Dial-code  OSgridref  Postcode
0  Abbey Wood  Bexley, Greenwich               LONDON     020        TQ465785   SE2
1  Acton       Ealing, Hammersmith and Fulham  LONDON     020        TQ205805   W3
1  Acton       Ealing, Hammersmith and Fulham  LONDON     020        TQ205805   W4
2  Addington   Croydon                         CROYDON    020        TQ375645   CR0
3  Addiscombe  Croydon                         CROYDON    020        TQ345665   CR0
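The split, stack and join chain used above can also be written with `DataFrame.explode`, which is arguably easier to read. This is only a sketch on a two-row sample and assumes pandas 0.25 or later, where `explode` is available; the frame names are illustrative:

```python
import pandas as pd

# Two sample rows taken from the table above
sample = pd.DataFrame({
    "Location": ["Abbey Wood", "Acton"],
    "Postcode": ["SE2", "W3, W4"],
})

# Split on commas, give each postcode its own row, and trim stray spaces
exploded = (
    sample.assign(Postcode=sample["Postcode"].str.split(","))
          .explode("Postcode")
)
exploded["Postcode"] = exploded["Postcode"].str.strip()
exploded = exploded.reset_index(drop=True)
print(exploded)
```

Stripping whitespace here also removes the leading space that a split on "," leaves in front of the second postcode.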
From the data, only the ‘Location’, ‘Borough’, ‘Postcode’, ‘Post-town’ will be used.
In [20]:
df1 = df0[['Location', 'Borough', 'Postcode', 'Post-town']].reset_index(drop=True)
In [21]:
df1.head()
Out[21]:
   Location    Borough                         Postcode  Post-town
0  Abbey Wood  Bexley, Greenwich               SE2       LONDON
1  Acton       Ealing, Hammersmith and Fulham  W3        LONDON
2  Acton       Ealing, Hammersmith and Fulham  W4        LONDON
3  Addington   Croydon                         CR0       CROYDON
4  Addiscombe  Croydon                         CR0       CROYDON
Now, only the boroughs whose post town is LONDON are used in our search for a location. Therefore, all rows with a different post town are dropped.
In [22]:
df2 = df1 # assigns df1 to df2
df21 = df2[df2['Post-town'].str.contains('LONDON')]
In [23]:
df21.shape
Out[23]:
(380, 4)
In [24]:
df3 = df21[['Location', 'Borough', 'Postcode']].reset_index(drop=True)
In [25]:
df3.head()
Out[25]:
   Location    Borough                         Postcode
0  Abbey Wood  Bexley, Greenwich               SE2
1  Acton       Ealing, Hammersmith and Fulham  W3
2  Acton       Ealing, Hammersmith and Fulham  W4
3  Aldgate     City                            EC3
4  Aldwych     Westminster                     WC2
According to our study of the boroughs, Hammersmith and Fulham lies in the North West and South West areas of London. For this project, the South West is considered for our analysis. The South West areas have postcodes starting with SW.
In [26]:
df_london = df3 # re-assigns to df_london
# Strips whitespaces before postcode
df_london.Postcode = df_london.Postcode.str.strip()
# New dataframe for South West London postcodes - df_sw
df_sw = df_london[df_london['Postcode'].str.startswith(('SW'))].reset_index(drop=True)
In [27]:
df_sw.head(10)
Out[27]:
   Location   Borough                                       Postcode
0  Balham     Wandsworth                                    SW12
1  Barnes     Richmond upon Thames                          SW13
2  Battersea  Wandsworth                                    SW11
3  Belgravia  Westminster                                   SW1
4  Brixton    Lambeth                                       SW2
5  Brixton    Lambeth                                       SW9
6  Brompton   Kensington and ChelseaHammersmith and Fulham  SW3
7  Castelnau  Richmond upon Thames                          SW13
8  Chelsea    Kensington and Chelsea                        SW3
9  Clapham    Lambeth, Wandsworth                           SW4
We are interested in the demography of London, in particular the percentage of White residents in each borough.
In [28]:
demograph_link = 'https://en.wikipedia.org/wiki/Demography_of_London'
demograph_page = requests.get(demograph_link)
soup1 = BeautifulSoup(demograph_page.content, 'html.parser')
table1 = soup1.find('table', {'class':'wikitable sortable'}).tbody
rows1 = table1.find_all('tr')
columns1 = [i.text.replace('\n', '')
            for i in rows1[0].find_all('th')]
# Collect the data rows (the header row rows1[0] is skipped);
# rows whose cell count does not match the header are ignored
records1 = []
for row1 in rows1[1:]:
    tds1 = row1.find_all('td')
    values1 = [td1.text.replace('\n', '').replace('\xa0', '') for td1 in tds1]
    if len(values1) == len(columns1):
        records1.append(values1)
demo_london = pd.DataFrame(records1, columns = columns1)
demo_london
Out[28]:
    Local authority         White  Mixed  Asian  Black  Other
0   Barnet                  64.1   4.8    18.5   7.7    4.9
1   Barking and Dagenham    58.3   4.2    15.9   20     1.6
2   Bexley                  81.9   2.3    6.6    8.5    0.8
3   Brent                   36.3   5.1    34.1   18.8   5.8
4   Bromley                 84.3   3.5    5.2    6      0.9
5   Camden                  66.3   5.6    16.1   8.2    3.8
6   City of London          78.6   3.9    12.7   2.6    2.1
7   Croydon                 55.1   6.6    16.4   20.2   1.8
8   Ealing                  49     4.5    29.7   10.9   6
9   Enfield                 61     5.5    11.2   17.2   5.1
10  Greenwich               62.5   4.8    11.7   19.1   1.9
11  Hackney                 54.7   6.4    10.5   23.1   5.3
12  Haringey                60.5   6.5    9.5    18.8   4.7
13  Harrow                  42.2   4      42.6   8.2    2.9
14  Havering                87.7   2.1    4.9    4.8    0.6
15  Hammersmith and Fulham  68.1   5.5    9.1    11.8   5.5
16  Hillingdon              60.6   3.8    25.3   7.3    3
17  Hounslow                51.4   4.1    34.4   6.6    3.6
18  Islington               68.2   6.5    9.2    12.8   3.4
19  Kensington and Chelsea  70.6   5.7    10     6.5    7.2
20  Kingston upon Thames    74.5   3.9    16.3   2.5    2.7
21  Lambeth                 57.1   7.6    6.9    25.9   2.4
22  Lewisham                53.5   7.4    9.3    27.2   2.6
23  Merton                  64.9   4.7    18.1   10.4   1.9
24  Newham                  29     4.5    43.5   19.6   3.5
25  Redbridge               42.5   4.1    41.8   8.9    2.7
26  Richmond upon Thames    86     3.6    7.3    1.5    1.6
27  Southwark               54.3   6.2    9.4    26.9   3.3
28  Sutton                  78.6   3.8    11.6   4.8    1.3
29  Tower Hamlets           45.2   4.1    41.1   7.3    2.3
30  Waltham Forest          52.2   5.3    21.1   17.3   4.1
31  Wandsworth              71.4   5      10.9   10.7   2.1
32  Westminster             61.7   5.2    14.5   7.5    11.1
In [29]:
#converting string to float
In [30]:
demo_london['White'] = demo_london['White'].astype('float')
demo_london_sorted = demo_london.sort_values(by='White', ascending = False)
In [31]:
demo_london_sorted.head(10)
Out[31]:
    Local authority         White  Mixed  Asian  Black  Other
14  Havering                87.7   2.1    4.9    4.8    0.6
26  Richmond upon Thames    86.0   3.6    7.3    1.5    1.6
4   Bromley                 84.3   3.5    5.2    6      0.9
2   Bexley                  81.9   2.3    6.6    8.5    0.8
6   City of London          78.6   3.9    12.7   2.6    2.1
28  Sutton                  78.6   3.8    11.6   4.8    1.3
20  Kingston upon Thames    74.5   3.9    16.3   2.5    2.7
31  Wandsworth              71.4   5      10.9   10.7   2.1
19  Kensington and Chelsea  70.6   5.7    10     6.5    7.2
18  Islington               68.2   6.5    9.2    12.8   3.4
In [32]:
demo_london_sorted["White"].mean()
Out[32]:
61.584848484848486
Considering the six boroughs whose share of White residents lies just above the average of about 61.58%, we have the following list:
  1. Hammersmith and Fulham 68.1%
  2. Camden 66.3%
  3. Merton 64.9%
  4. Barnet 64.1%
  5. Greenwich 62.5%
  6. Westminster 61.7%
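The selection above can be reproduced with a few lines of plain Python. The sketch below assumes the rule is "the six boroughs closest to the mean from above"; the percentages are copied from the demography table, while the variable names are illustrative only:

```python
from statistics import mean

# 'White' percentages by borough, copied from the demography table above
white_share = {
    "Barnet": 64.1, "Barking and Dagenham": 58.3, "Bexley": 81.9,
    "Brent": 36.3, "Bromley": 84.3, "Camden": 66.3,
    "City of London": 78.6, "Croydon": 55.1, "Ealing": 49.0,
    "Enfield": 61.0, "Greenwich": 62.5, "Hackney": 54.7,
    "Haringey": 60.5, "Harrow": 42.2, "Havering": 87.7,
    "Hammersmith and Fulham": 68.1, "Hillingdon": 60.6, "Hounslow": 51.4,
    "Islington": 68.2, "Kensington and Chelsea": 70.6,
    "Kingston upon Thames": 74.5, "Lambeth": 57.1, "Lewisham": 53.5,
    "Merton": 64.9, "Newham": 29.0, "Redbridge": 42.5,
    "Richmond upon Thames": 86.0, "Southwark": 54.3, "Sutton": 78.6,
    "Tower Hamlets": 45.2, "Waltham Forest": 52.2, "Wandsworth": 71.4,
    "Westminster": 61.7,
}

avg = mean(white_share.values())

# Keep boroughs above the mean, sorted from the lowest share upwards
above = sorted((share, name) for name, share in white_share.items() if share > avg)

# The six boroughs immediately above the mean
just_above = [name for share, name in above[:6]]
print(round(avg, 2), just_above)
```

The result lists the same six boroughs as above, in ascending rather than descending order of share.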
In [33]:
df_sw_top = df_sw[df_sw['Borough'].isin(['Hammersmith and Fulham','Camden', 'Merton', 'Barnet','Greenwich', 'Westminster'])].reset_index(drop=True)
In [34]:
df_sw_top
Out[34]:
    Location         Borough                 Postcode
0   Belgravia        Westminster             SW1
1   Colliers Wood    Merton                  SW19
2   Fulham           Hammersmith and Fulham  SW6
3   Knightsbridge    Westminster             SW1
4   Merton Park      Merton                  SW19
5   Millbank         Westminster             SW1
6   Parsons Green    Hammersmith and Fulham  SW6
7   Pimlico          Westminster             SW1
8   Raynes Park      Merton                  SW20
9   Sands End        Hammersmith and Fulham  SW6
10  South Wimbledon  Merton                  SW19
11  St James's       Westminster             SW1
12  Westminster      Westminster             SW1
13  Wimbledon        Merton                  SW19
14  Wimbledon        Merton                  SW20
In [35]:
df_sw_top.shape
Out[35]:
(15, 3)
In [36]:
# Geocoder starts here
# Define a helper function, get_latlng(), that geocodes a postcode
def get_latlng(arcgis_geocoder):
    
    # Initialize the Location (lat. and long.) to "None"
    lat_lng_coords = None
    
    # While loop helps to create a continous run until all the location coordinates are geocoded
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, London, United Kingdom'.format(arcgis_geocoder))
        lat_lng_coords = g.latlng
    return lat_lng_coords
# Geocoder ends here
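The `while` loop in `get_latlng` keeps retrying for as long as the geocoder returns nothing, so a postcode that can never be resolved would make it loop forever. A bounded variant is sketched below. The function `get_latlng_with_retry`, its parameters and the stand-in `fake_lookup` are all hypothetical names introduced for illustration; no real geocoding service is called here.

```python
import time
from typing import Callable, List, Optional

def get_latlng_with_retry(
    query: str,
    lookup: Callable[[str], Optional[List[float]]],
    max_attempts: int = 5,
    pause_seconds: float = 0.0,
) -> Optional[List[float]]:
    """Call `lookup` until it returns coordinates or the attempts run out."""
    for _ in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords
        # Wait a little before trying again
        time.sleep(pause_seconds)
    return None

# Stand-in for the real geocoder call: fails twice, then succeeds
calls = {"n": 0}

def fake_lookup(query: str) -> Optional[List[float]]:
    calls["n"] += 1
    return [51.47772, -0.20145] if calls["n"] >= 3 else None

result = get_latlng_with_retry("SW6, London, United Kingdom", fake_lookup)
print(result, calls["n"])
```

In the notebook, `lookup` could be something like `lambda q: geocoder.arcgis(q).latlng`, which mirrors the call already used in `get_latlng`.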
In [37]:
sample = get_latlng('SW6')
In [38]:
sample
Out[38]:
[51.47772000000003, -0.2014499999999657]
In [39]:
ga = geocoder.geocodefarm(sample, method = 'reverse')
ga
Out[39]:
<[OK] Geocodefarm - Reverse [687b Fulham Road, London, SW6 5UJ, United Kingdom]>
In [40]:
start = time.time()
postal_codes = df_sw_top['Postcode']    
coordinates = [get_latlng(postal_code) for postal_code in postal_codes.tolist()]
end = time.time()
print("Time of execution: ", end - start, "seconds")
Time of execution:  11.869275093078613 seconds
Then we proceed to store the location data (latitude and longitude) as follows. The obtained coordinates are joined to df_sw_top to create a new data frame.
In [41]:
df_sw_loc = df_sw_top
# The obtained coordinates (latitude and longitude) are joined with the dataframe as shown
df_sw_coordinates = pd.DataFrame(coordinates, columns = ['Latitude', 'Longitude'])
df_sw_loc['Latitude'] = df_sw_coordinates['Latitude']
df_sw_loc['Longitude'] = df_sw_coordinates['Longitude']
df_sw_loc.head(5)
Out[41]:
   Location       Borough                 Postcode  Latitude  Longitude
0  Belgravia      Westminster             SW1       51.49713  -0.13829
1  Colliers Wood  Merton                  SW19      51.42170  -0.20796
2  Fulham         Hammersmith and Fulham  SW6       51.47772  -0.20145
3  Knightsbridge  Westminster             SW1       51.49713  -0.13829
4  Merton Park    Merton                  SW19      51.42170  -0.20796
In [42]:
df_sw_loc.shape
Out[42]:
(15, 5)
To use Foursquare, credentials are needed; they are saved in the file credential.json.
In [43]:
import json
filename = 'credential.json'
with open(filename) as f:
    data = json.load(f)
In [44]:
CLIENT_ID = data['CLIENT_ID'] # Foursquare (FS) ID
CLIENT_SECRET = data['CLIENT_SECRET'] # FS Secret
VERSION = data['VERSION'] # FS API version
In [45]:
#LIMIT = 30
#print('Your credentails:')
#print('CLIENT_ID: ' + CLIENT_ID)
#print('CLIENT_SECRET:' + CLIENT_SECRET)

3. Methodology

With the Foursquare API, we obtain a data set of venues around specific locations and store it in a data frame, which makes the analysis possible.
First, we explore and collect the data for a single neighbourhood; then we repeat the same procedure for multiple neighbourhoods, storing the results in a data frame.
The next step is to cluster the venues by location and identify the most common venues in each cluster.

3.1 Data Exploration

Single Neighbourhood

An initial exploration of a single neighbourhood within the London area was done to examine the Foursquare results.
In [46]:
# Resets the current index to a new
sw_df = df_sw_loc.reset_index().drop('index', axis = 1)
sw_df.loc[sw_df['Location'] == 'Fulham']
Out[46]:
   Location  Borough                 Postcode  Latitude  Longitude
2  Fulham    Hammersmith and Fulham  SW6       51.47772  -0.20145
Let’s use Fulham, at index location 2, as shown below:
In [47]:
Fulham_lat = sw_df.loc[2, 'Latitude']
Fulham_long = sw_df.loc[2, 'Longitude']
Fulham_loc = sw_df.loc[2, 'Location']
Fulham_postcode = sw_df.loc[2, 'Postcode']
print('The latitude and longitude values of {} with postcode {}, are {}, {}.'.format(Fulham_loc, Fulham_postcode, Fulham_lat, Fulham_long))
The latitude and longitude values of Fulham with postcode SW6, are 51.47772000000003, -0.2014499999999657.
Let’s explore the top 50 venues that are within a 1500-metre radius of Fulham. To do so, we first build the GET request URL and store it in the variable url.
In [48]:
# Credentials are provided already for this part
LIMIT = 50 # limit of number of venues returned by Foursquare API
radius = 1500 # define radius
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    Fulham_lat, 
    Fulham_long, 
    radius, 
    LIMIT)
# displays URL
url
Out[48]:
'https://api.foursquare.com/v2/venues/explore?&client_id=<CLIENT_ID>&client_secret=<CLIENT_SECRET>&v=20180604&ll=51.47772000000003,-0.2014499999999657&radius=1500&limit=50'
In [49]:
results = requests.get(url).json()
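The request URL above is assembled with string formatting. A slightly more defensive sketch, using only the standard library, lets `urllib.parse.urlencode` take care of the escaping. The function name `build_explore_url` and the placeholder credentials are assumptions made for illustration; the endpoint and parameter names are the ones already used in this notebook.

```python
from urllib.parse import urlencode

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the venues/explore URL with properly encoded query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)

# Placeholder credentials; real ones should come from credential.json
url = build_explore_url("YOUR_ID", "YOUR_SECRET", "20180604", 51.47772, -0.20145, 1500, 50)
print(url)
```

Note that `urlencode` percent-encodes the comma in `ll`, which a server would normally decode back. Because the URL carries the client secret, it is better not to display it in a shared notebook.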
From the results, the necessary information needs to be obtained from the items key. To do this, we use the get_category_type function, which extracts the category name from a Foursquare venue record.
In [50]:
# function that extracts the category of the venue
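def get_category_type(row):
    # the key is 'categories' for raw venue records and
    # 'venue.categories' after json_normalize has flattened them
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # a venue may have no category at all
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']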
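To see what the category extraction does, it can help to run the same logic on a few hand-written records. The sketch below uses plain dictionaries as stand-ins for venue rows; the helper `category_name` and the sample records are hypothetical and only mimic the two key layouts handled above.

```python
def category_name(item):
    """Return the first category name of a venue record, or None if absent."""
    # Accept both the raw key and the flattened 'venue.categories' key
    categories = item.get("categories") or item.get("venue.categories") or []
    return categories[0]["name"] if categories else None

# Flattened record, as produced after normalising the JSON
flat = {"venue.name": "Local Hero", "venue.categories": [{"name": "Café"}]}
# Raw record with a plain 'categories' key
nested = {"name": "Parsons Green", "categories": [{"name": "Park"}]}
# Record without any category
empty = {"name": "Unnamed", "categories": []}

print(category_name(flat), category_name(nested), category_name(empty))
```

Only the first category is kept, so a venue tagged with several categories is counted under its primary one.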
The result is then cleaned up from json to a structured pandas dataframe as shown below:
In [51]:
import numpy as np
import pandas as pd
In [52]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON
# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]
# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)
# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
nearby_venues.head(10)
Out[52]:
   name                    categories         lat        lng
0  GAIL's Bakery           Bakery             51.476830  -0.202668
1  Local Hero              Café               51.476264  -0.204713
2  Baileys Fish & Chips    Fish & Chips Shop  51.480452  -0.203987
3  The Power Yoga Company  Yoga Studio        51.474317  -0.204456
4  Nuntee Thai Cuisine     Thai Restaurant    51.476795  -0.203420
5  The White Horse         Pub                51.474319  -0.200482
6  Parsons Green           Park               51.474922  -0.206278
7  Shot Espresso           Coffee Shop        51.480614  -0.198148
8  The Climbing Hangar     Climbing Gym       51.476383  -0.199611
9  Chairs & Coffee         Coffee Shop        51.479713  -0.198987
In [53]:
nearby_venues_Fulham_unique = nearby_venues['categories'].value_counts().to_frame(name='Count')
The most common venues in Fulham are:
In [54]:
nearby_venues_Fulham_unique.head(5)
Out[54]:
                   Count
Café               9
Pizza Place        4
Pub                3
Grocery Store      2
Fish & Chips Shop  2
So in Fulham the top nearby venues are café, pizza place, pub, grocery store and fish & chips shop.
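The counting step behind `value_counts()` can be illustrated without pandas by using `collections.Counter`. The sketch below uses only the ten categories visible in the first rows of the Fulham table, so its counts differ from the full 50-venue result reported above:

```python
from collections import Counter

# Categories of the first ten Fulham venues shown above
categories = [
    "Bakery", "Café", "Fish & Chips Shop", "Yoga Studio", "Thai Restaurant",
    "Pub", "Park", "Coffee Shop", "Climbing Gym", "Coffee Shop",
]

counts = Counter(categories)

# Most frequent category among these ten rows
top = counts.most_common(1)
print(top)
```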

Multiple Neighbourhoods

To repeat the previous procedure for multiple neighbourhoods, we define a function:
In [55]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
and we use this function by using our data frame sw_df as follows:
In [56]:
sw_venues = getNearbyVenues(names=sw_df['Location'],
                                   latitudes=sw_df['Latitude'],
                                   longitudes=sw_df['Longitude']
                                  )
Belgravia
Colliers Wood
Fulham
Knightsbridge
Merton Park
Millbank
Parsons Green
Pimlico
Raynes Park
Sands End
South Wimbledon
St James's
Westminster
Wimbledon
Wimbledon
In [57]:
sw_venues.shape
Out[57]:
(714, 7)
In [58]:
sw_venues.head(5)
Out[58]:
   Neighborhood  Neighborhood Latitude  Neighborhood Longitude  Venue                                       Venue Latitude  Venue Longitude  Venue Category
0  Belgravia     51.49713               -0.13829                Taj 51 Buckingham Gate Suites & Residences  51.498598       -0.137404        Hotel
1  Belgravia     51.49713               -0.13829                Curzon Victoria                             51.497473       -0.136744        Movie Theater
2  Belgravia     51.49713               -0.13829                Iris & June                                 51.496791       -0.136011        Coffee Shop
3  Belgravia     51.49713               -0.13829                Run & Become                                51.498128       -0.135426        Sporting Goods Shop
4  Belgravia     51.49713               -0.13829                Quilon                                      51.498772       -0.137522        Indian Restaurant
The number of venues returned for each neighbourhood is then explored as follows:
In [59]:
sw_venues.groupby('Neighborhood').count()
Out[59]:
                 Neighborhood Latitude  Neighborhood Longitude  Venue  Venue Latitude  Venue Longitude  Venue Category
Neighborhood
Belgravia        50                     50                      50     50              50               50
Colliers Wood    50                     50                      50     50              50               50
Fulham           50                     50                      50     50              50               50
Knightsbridge    50                     50                      50     50              50               50
Merton Park      50                     50                      50     50              50               50
Millbank         50                     50                      50     50              50               50
Parsons Green    50                     50                      50     50              50               50
Pimlico          50                     50                      50     50              50               50
Raynes Park      32                     32                      32     32              32               32
Sands End        50                     50                      50     50              50               50
South Wimbledon  50                     50                      50     50              50               50
St James's       50                     50                      50     50              50               50
Westminster      50                     50                      50     50              50               50
Wimbledon        82                     82                      82     82              82               82
Then we check how many unique categories there are among all the returned venues:
In [60]:
print('There are {} unique categories.'.format(len(sw_venues['Venue Category'].unique())))
There are 79 unique categories.
In [61]:
sw_venue_unique_count = sw_venues['Venue Category'].value_counts().to_frame(name='Count')
In [62]:
sw_venue_unique_count.head()
Out[62]:
                  Count
Coffee Shop       59
Hotel             38
Sushi Restaurant  28
Pub               27
Theater           24

3.2 Clustering

In [63]:
address = 'London, United Kingdom'
geolocator = Nominatim(user_agent="ln_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of London are {}, {}.'.format(latitude, longitude))
The geographical coordinates of London are 51.5073219, -0.1276474.
In [71]:
map_london = folium.Map(location = [latitude, longitude], zoom_start = 11)
map_london
Out[71]:
In [72]:
# Adding markers to map
for lat, lng, borough, loc in zip(sw_df['Latitude'], 
                                  sw_df['Longitude'],
                                  sw_df['Borough'],
                                  sw_df['Location']):
    label = '{} - {}'.format(loc, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_london)  
    
display(map_london)
We see four points, which correspond to the four postcodes that we want to analyse.
In [73]:
sw_df
Out[73]:
    Location         Borough                 Postcode  Latitude  Longitude
0   Belgravia        Westminster             SW1       51.49713  -0.13829
1   Colliers Wood    Merton                  SW19      51.42170  -0.20796
2   Fulham           Hammersmith and Fulham  SW6       51.47772  -0.20145
3   Knightsbridge    Westminster             SW1       51.49713  -0.13829
4   Merton Park      Merton                  SW19      51.42170  -0.20796
5   Millbank         Westminster             SW1       51.49713  -0.13829
6   Parsons Green    Hammersmith and Fulham  SW6       51.47772  -0.20145
7   Pimlico          Westminster             SW1       51.49713  -0.13829
8   Raynes Park      Merton                  SW20      51.41117  -0.22623
9   Sands End        Hammersmith and Fulham  SW6       51.47772  -0.20145
10  South Wimbledon  Merton                  SW19      51.42170  -0.20796
11  St James's       Westminster             SW1       51.49713  -0.13829
12  Westminster      Westminster             SW1       51.49713  -0.13829
13  Wimbledon        Merton                  SW19      51.42170  -0.20796
14  Wimbledon        Merton                  SW20      51.41117  -0.22623

4. Data analysis

In this section, the objective is to check and explore the venues in each neighbourhood.
In [74]:
# one hot encoding
sw_onehot = pd.get_dummies(sw_venues[['Venue Category']], prefix = "", prefix_sep = "")
Then the Neighborhood column is added back to the dataframe.
In [75]:
# add neighborhood column back to dataframe
sw_onehot['Neighborhood'] = sw_venues['Neighborhood']
In [76]:
# move neighborhood column to the first column
fixed_columns = [sw_onehot.columns[-1]] + list(sw_onehot.columns[:-1])
sw_onehot = sw_onehot[fixed_columns]
In [78]:
#sw_onehot.head()
In [79]:
# To check the Bakery:
#sw_onehot.loc[sw_onehot['Bakery'] != 0]
Regrouping and Category Statistics
In [80]:
sw_grouped = sw_onehot.groupby('Neighborhood').mean().reset_index()
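The one-hot encoding followed by a grouped mean turns the venue list into a table of category frequencies per neighbourhood. A minimal sketch on a six-row sample (the rows are made up for illustration and are not Foursquare results) shows how the numbers arise:

```python
import pandas as pd

# Hypothetical venue list: four venues in Fulham, two in Belgravia
venues = pd.DataFrame({
    "Neighborhood": ["Fulham", "Fulham", "Fulham", "Fulham", "Belgravia", "Belgravia"],
    "Venue Category": ["Café", "Café", "Pub", "Park", "Hotel", "Coffee Shop"],
})

# One column per category, with 1.0 where the venue belongs to it
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# The mean of the indicator columns is the share of venues in each category
freq = onehot.groupby("Neighborhood").mean()
top_fulham = freq.loc["Fulham"].idxmax()
print(freq)
print(top_fulham)
```

Because each row has exactly one indicator set, the frequencies in every neighbourhood row add up to 1.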
Top 10 most common venues for each neighbourhood:
In [81]:
num_top_venues = 10 # Top common venues needed
for hood in sw_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = sw_grouped[sw_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue', 'freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending = False).reset_index(drop = True).head(num_top_venues))
    print('\n')
----Belgravia----
                  venue  freq
0                 Hotel  0.12
1           Coffee Shop  0.08
2               Theater  0.08
3        Sandwich Place  0.06
4             Juice Bar  0.04
5      Sushi Restaurant  0.04
6  Gym / Fitness Center  0.02
7    Falafel Restaurant  0.02
8  Fast Food Restaurant  0.02
9     French Restaurant  0.02


----Colliers Wood----
                  venue  freq
0      Sushi Restaurant  0.08
1           Coffee Shop  0.08
2                   Pub  0.08
3                   Bar  0.06
4     Indian Restaurant  0.04
5      Stationery Store  0.04
6          Burger Joint  0.04
7         Grocery Store  0.04
8           Pizza Place  0.02
9  Gym / Fitness Center  0.02


----Fulham----
                venue  freq
0                Café  0.12
1         Coffee Shop  0.10
2                 Pub  0.06
3  Italian Restaurant  0.06
4         Yoga Studio  0.04
5                Park  0.04
6       Grocery Store  0.04
7           Gastropub  0.04
8   French Restaurant  0.04
9        Climbing Gym  0.04


----Knightsbridge----
                  venue  freq
0                 Hotel  0.12
1           Coffee Shop  0.08
2               Theater  0.08
3        Sandwich Place  0.06
4             Juice Bar  0.04
5      Sushi Restaurant  0.04
6  Gym / Fitness Center  0.02
7    Falafel Restaurant  0.02
8  Fast Food Restaurant  0.02
9     French Restaurant  0.02


----Merton Park----
                  venue  freq
0      Sushi Restaurant  0.08
1           Coffee Shop  0.08
2                   Pub  0.08
3                   Bar  0.06
4     Indian Restaurant  0.04
5      Stationery Store  0.04
6          Burger Joint  0.04
7         Grocery Store  0.04
8           Pizza Place  0.02
9  Gym / Fitness Center  0.02


----Millbank----
                  venue  freq
0                 Hotel  0.12
1           Coffee Shop  0.08
2               Theater  0.08
3        Sandwich Place  0.06
4             Juice Bar  0.04
5      Sushi Restaurant  0.04
6  Gym / Fitness Center  0.02
7    Falafel Restaurant  0.02
8  Fast Food Restaurant  0.02
9     French Restaurant  0.02


----Parsons Green----
                venue  freq
0                Café  0.12
1         Coffee Shop  0.10
2                 Pub  0.06
3  Italian Restaurant  0.06
4         Yoga Studio  0.04
5                Park  0.04
6       Grocery Store  0.04
7           Gastropub  0.04
8   French Restaurant  0.04
9        Climbing Gym  0.04


----Pimlico----
                  venue  freq
0                 Hotel  0.12
1           Coffee Shop  0.08
2               Theater  0.08
3        Sandwich Place  0.06
4             Juice Bar  0.04
5      Sushi Restaurant  0.04
6  Gym / Fitness Center  0.02
7    Falafel Restaurant  0.02
8  Fast Food Restaurant  0.02
9     French Restaurant  0.02


----Raynes Park----
                  venue  freq
0              Bus Stop  0.12
1              Platform  0.09
2     Indian Restaurant  0.06
3  Fast Food Restaurant  0.06
4         Grocery Store  0.06
5           Coffee Shop  0.06
6              Pharmacy  0.06
7                   Pub  0.03
8  Gym / Fitness Center  0.03
9                 Hotel  0.03


----Sands End----
                venue  freq
0                Café  0.12
1         Coffee Shop  0.10
2                 Pub  0.06
3  Italian Restaurant  0.06
4         Yoga Studio  0.04
5                Park  0.04
6       Grocery Store  0.04
7           Gastropub  0.04
8   French Restaurant  0.04
9        Climbing Gym  0.04


----South Wimbledon----
                  venue  freq
0      Sushi Restaurant  0.08
1           Coffee Shop  0.08
2                   Pub  0.08
3                   Bar  0.06
4     Indian Restaurant  0.04
5      Stationery Store  0.04
6          Burger Joint  0.04
7         Grocery Store  0.04
8           Pizza Place  0.02
9  Gym / Fitness Center  0.02


----St James's----
                  venue  freq
0                 Hotel  0.12
1           Coffee Shop  0.08
2               Theater  0.08
3        Sandwich Place  0.06
4             Juice Bar  0.04
5      Sushi Restaurant  0.04
6  Gym / Fitness Center  0.02
7    Falafel Restaurant  0.02
8  Fast Food Restaurant  0.02
9     French Restaurant  0.02


----Westminster----
                  venue  freq
0                 Hotel  0.12
1           Coffee Shop  0.08
2               Theater  0.08
3        Sandwich Place  0.06
4             Juice Bar  0.04
5      Sushi Restaurant  0.04
6  Gym / Fitness Center  0.02
7    Falafel Restaurant  0.02
8  Fast Food Restaurant  0.02
9     French Restaurant  0.02


----Wimbledon----
                  venue  freq
0           Coffee Shop  0.07
1                   Pub  0.06
2      Sushi Restaurant  0.05
3              Bus Stop  0.05
4     Indian Restaurant  0.05
5         Grocery Store  0.05
6                   Bar  0.04
7          Burger Joint  0.04
8              Platform  0.04
9  Fast Food Restaurant  0.02


Creating a new dataframe

To put the most common venues into a pandas dataframe, the following function, return_most_common_venues, sorts the venue categories of one neighbourhood in descending order of frequency and returns the top entries.
In [82]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending = False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
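As a quick check of what the function returns, here is a minimal sketch on a toy frequency table. The dataframe toy_grouped and its values are hypothetical and only mimic the layout assumed for sw_grouped (a name column first, then one frequency column per venue category).

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighbourhood name) and keep the frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# hypothetical toy data, not taken from the notebook
toy_grouped = pd.DataFrame({
    'Neighborhood': ['A', 'B'],
    'Hotel': [0.50, 0.10],
    'Pub': [0.30, 0.20],
    'Park': [0.20, 0.70],
})

top_a = list(return_most_common_venues(toy_grouped.iloc[0, :], 2))
top_b = list(return_most_common_venues(toy_grouped.iloc[1, :], 2))
print(top_a)
print(top_b)
```

The function returns category names, not frequencies, so the result can be written straight into the "Most Common Venue" columns.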
Then we create a new pandas dataframe with the 10 most common venues for each neighbourhood, as shown below:
In [83]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']
# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
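    try:
        # 1st, 2nd, 3rd use the explicit suffixes above
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # every later position falls back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))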
# create a new dataframe
neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted['Neighborhood'] = sw_grouped['Neighborhood']
for ind in np.arange(sw_grouped.shape[0]):
    neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(sw_grouped.iloc[ind, :], num_top_venues)
neighbourhoods_venues_sorted.head(5)
Out[83]:
| | Neighborhood | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Belgravia | Hotel | Theater | Coffee Shop | Sandwich Place | Juice Bar | Sushi Restaurant | Indian Restaurant | Movie Theater | Modern European Restaurant | Clothing Store |
| 1 | Colliers Wood | Pub | Coffee Shop | Sushi Restaurant | Bar | Indian Restaurant | Stationery Store | Burger Joint | Grocery Store | Movie Theater | Mexican Restaurant |
| 2 | Fulham | Café | Coffee Shop | Italian Restaurant | Pub | Yoga Studio | Grocery Store | Wine Shop | Park | French Restaurant | Climbing Gym |
| 3 | Knightsbridge | Hotel | Theater | Coffee Shop | Sandwich Place | Juice Bar | Sushi Restaurant | Indian Restaurant | Movie Theater | Modern European Restaurant | Clothing Store |
| 4 | Merton Park | Pub | Coffee Shop | Sushi Restaurant | Bar | Indian Restaurant | Stationery Store | Burger Joint | Grocery Store | Movie Theater | Mexican Restaurant |
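The column names above are built from a fixed list of three suffixes, which is enough for 10 venues but would produce "21th" if num_top_venues were ever raised past 20. A small sketch of a more general naming helper follows; the function name ordinal is hypothetical and not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th, 13th and the rest of the teens always take 'th'
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i))
    for i in range(1, num_top_venues + 1)
]
print(columns[:4])
```

For num_top_venues = 10 this yields the same column names as the notebook cell.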
Clustering of Neighbourhoods

We build the feature table for clustering by dropping the neighbourhood name column, as shown below:
In [84]:
# keep only the venue frequency columns (axis=1 drops a column, not a row)
sw_grouped_clustering = sw_grouped.drop('Neighborhood', axis=1)
We then use k-means to group the neighbourhoods into 5 clusters:
In [85]:
# set number of clusters
kclusters = 5
# run k-means clustering
kmeans = KMeans(n_clusters = kclusters, random_state=0).fit(sw_grouped_clustering)
# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]
Out[85]:
array([0, 1, 2, 0, 1, 0, 2, 0, 3, 2])
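Several neighbourhoods in the frequency listing share identical venue profiles (Knightsbridge and Millbank, for example), so it can be worth counting the distinct rows of the feature table before fixing the number of clusters, since k-means cannot usefully separate more groups than there are distinct rows. The sketch below does this on a hypothetical toy table; toy_clustering stands in for sw_grouped_clustering and its values are made up.

```python
import pandas as pd

# hypothetical frequency table: rows are neighbourhoods, columns are venue categories
toy_clustering = pd.DataFrame({
    'Hotel': [0.12, 0.12, 0.00, 0.00],
    'Pub':   [0.00, 0.00, 0.08, 0.06],
    'Cafe':  [0.00, 0.00, 0.00, 0.12],
})

# rows 0 and 1 are identical, so only three distinct profiles remain
n_distinct_profiles = toy_clustering.drop_duplicates().shape[0]
print(n_distinct_profiles)
```

The same call on the real feature table gives an upper bound for a sensible value of kclusters.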
Now we create a new dataframe that includes the cluster labels as well as the top 10 venues for each neighbourhood.
In [86]:
# add clustering labels
neighbourhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
sw_merged = sw_df
# match/merge SW London data with latitude/longitude for each neighborhood
sw_merged_latlong = sw_merged.join(neighbourhoods_venues_sorted.set_index('Neighborhood'), on = 'Location')
sw_merged_latlong.head(5)
Out[86]:
| | Location | Borough | Postcode | Latitude | Longitude | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Belgravia | Westminster | SW1 | 51.49713 | -0.13829 | 0 | Hotel | Theater | Coffee Shop | Sandwich Place | Juice Bar | Sushi Restaurant | Indian Restaurant | Movie Theater | Modern European Restaurant | Clothing Store |
| 1 | Colliers Wood | Merton | SW19 | 51.42170 | -0.20796 | 1 | Pub | Coffee Shop | Sushi Restaurant | Bar | Indian Restaurant | Stationery Store | Burger Joint | Grocery Store | Movie Theater | Mexican Restaurant |
| 2 | Fulham | Hammersmith and Fulham | SW6 | 51.47772 | -0.20145 | 2 | Café | Coffee Shop | Italian Restaurant | Pub | Yoga Studio | Grocery Store | Wine Shop | Park | French Restaurant | Climbing Gym |
| 3 | Knightsbridge | Westminster | SW1 | 51.49713 | -0.13829 | 0 | Hotel | Theater | Coffee Shop | Sandwich Place | Juice Bar | Sushi Restaurant | Indian Restaurant | Movie Theater | Modern European Restaurant | Clothing Store |
| 4 | Merton Park | Merton | SW19 | 51.42170 | -0.20796 | 1 | Pub | Coffee Shop | Sushi Restaurant | Bar | Indian Restaurant | Stationery Store | Burger Joint | Grocery Store | Movie Theater | Mexican Restaurant |
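The join above matches on the neighbourhood name, so any location without venue data would receive NaN in every venue column and its cluster label would become a float. A minimal sketch of how this can be checked is given below; the dataframes locations and venues_sorted are hypothetical stand-ins for sw_df and neighbourhoods_venues_sorted, and 'Nowhere' is an invented unmatched row.

```python
import pandas as pd

# hypothetical toy data; 'Nowhere' has no venue row on purpose
locations = pd.DataFrame({
    'Location': ['Fulham', 'Pimlico', 'Nowhere'],
    'Borough': ['Hammersmith and Fulham', 'Westminster', 'Unknown'],
})
venues_sorted = pd.DataFrame({
    'Cluster Labels': [2, 0],
    'Neighborhood': ['Fulham', 'Pimlico'],
    '1st Most Common Venue': ['Café', 'Hotel'],
})

# same join pattern as the notebook: index of the right table vs. a column on the left
merged = locations.join(venues_sorted.set_index('Neighborhood'), on='Location')

# rows that found no match carry NaN cluster labels
unmatched = merged[merged['Cluster Labels'].isna()]

# drop them and restore integer labels so they can be used as list indices
merged_clean = merged.dropna(subset=['Cluster Labels']).astype({'Cluster Labels': int})
print(unmatched['Location'].tolist())
print(merged_clean['Cluster Labels'].tolist())
```

Integer labels matter later on, because the map cell indexes a Python list of colours with the cluster label.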

5. RESULTS

To visualize the clusters on a map, we use the following code:
In [87]:
sw_clusters=sw_merged_latlong
In [88]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)
# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
In [89]:
# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(sw_clusters['Latitude'], sw_clusters['Longitude'], sw_clusters['Location'], sw_clusters['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=20,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
In [90]:
display(map_clusters)
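In the marker cell, rainbow[cluster-1] maps cluster label 0 to the last colour in the list, which works but makes the colour order harder to follow. The sketch below shows one alternative, a direct label-to-colour lookup. It uses the standard library colorsys module instead of the matplotlib colour map used in the notebook, and the helper name cluster_palette is hypothetical.

```python
import colorsys

def cluster_palette(n_clusters):
    """Return n_clusters evenly spaced hex colours (hypothetical helper)."""
    palette = []
    for i in range(n_clusters):
        # spread the hues evenly around the colour wheel
        r, g, b = colorsys.hsv_to_rgb(i / n_clusters, 0.8, 0.9)
        palette.append('#{:02x}{:02x}{:02x}'.format(
            int(r * 255), int(g * 255), int(b * 255)))
    return palette

kclusters = 5
palette = cluster_palette(kclusters)

# cluster label k always gets palette[k]
colour_for_label = {label: palette[label] for label in range(kclusters)}
print(colour_for_label)
```

With such a lookup, the marker colour could be passed as colour_for_label[cluster] in place of rainbow[cluster-1].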
The dataframe for each individual cluster can be obtained with the following code (it is better to run each cluster individually):
In [91]:
# Cluster 1
cluster1=sw_clusters.loc[sw_clusters['Cluster Labels'] == 0, sw_clusters.columns[[1] + list(range(5, sw_clusters.shape[1]))]]
In [92]:
# Cluster 2
cluster2=sw_clusters.loc[sw_clusters['Cluster Labels'] == 1, sw_clusters.columns[[1] + list(range(5, sw_clusters.shape[1]))]]
In [93]:
# Cluster 3
cluster3=sw_clusters.loc[sw_clusters['Cluster Labels'] == 2, sw_clusters.columns[[1] + list(range(5, sw_clusters.shape[1]))]]
In [94]:
# Cluster 4
cluster4=sw_clusters.loc[sw_clusters['Cluster Labels'] == 3, sw_clusters.columns[[1] + list(range(5, sw_clusters.shape[1]))]]
In [95]:
# Cluster 5
cluster5=sw_clusters.loc[sw_clusters['Cluster Labels'] == 4, sw_clusters.columns[[1] + list(range(5, sw_clusters.shape[1]))]]
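The five cells above differ only in the cluster label, so the same selection can also be written once with a groupby. The following is a sketch on hypothetical toy data; toy_clusters mimics the column layout of sw_clusters (five location columns, then the label and venue columns) and its values are invented.

```python
import pandas as pd

# hypothetical toy data with the same column order as sw_clusters
toy_clusters = pd.DataFrame({
    'Location': ['A', 'B', 'C', 'D'],
    'Borough': ['X', 'Y', 'X', 'Z'],
    'Postcode': ['P1', 'P2', 'P1', 'P3'],
    'Latitude': [51.5, 51.4, 51.5, 51.3],
    'Longitude': [-0.1, -0.2, -0.1, -0.3],
    'Cluster Labels': [0, 1, 0, 2],
    '1st Most Common Venue': ['Hotel', 'Pub', 'Hotel', 'Park'],
})

# same column selection as the notebook: Borough plus everything from column 5 on
keep = toy_clusters.columns[[1] + list(range(5, toy_clusters.shape[1]))]

# one dataframe per cluster label, keyed by the label
clusters = {
    label: group[keep]
    for label, group in toy_clusters.groupby('Cluster Labels')
}
print(clusters[0])
```

Here clusters[0] plays the role of cluster1, clusters[1] of cluster2, and so on.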
In [96]:
cluster1
Out[96]:
| | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Westminster | 0 | Hotel | Theater | Coffee Shop | Sandwich Place | Juice Bar | Sushi Restaurant | Indian Restaurant | Movie Theater | Modern European Restaurant | Clothing Store |
| 3 | Westminster | 0 | Hotel | Theater | Coffee Shop | Sandwich Place | Juice Bar | Sushi Restaurant | Indian Restaurant | Movie Theater | Modern European Restaurant | Clothing Store |
| 5 | Westminster | 0 | Hotel | Theater | Coffee Shop | Sandwich Place | Juice Bar | Sushi Restaurant | Indian Restaurant | Movie Theater | Modern European Restaurant | Clothing Store |
| 7 | Westminster | 0 | Hotel | Theater | Coffee Shop | Sandwich Place | Juice Bar | Sushi Restaurant | Indian Restaurant | Movie Theater | Modern European Restaurant | Clothing Store |
| 11 | Westminster | 0 | Hotel | Theater | Coffee Shop | Sandwich Place | Juice Bar | Sushi Restaurant | Indian Restaurant | Movie Theater | Modern European Restaurant | Clothing Store |
| 12 | Westminster | 0 | Hotel | Theater | Coffee Shop | Sandwich Place | Juice Bar | Sushi Restaurant | Indian Restaurant | Movie Theater | Modern European Restaurant | Clothing Store |
In [97]:
cluster2
Out[97]:
| | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Merton | 1 | Pub | Coffee Shop | Sushi Restaurant | Bar | Indian Restaurant | Stationery Store | Burger Joint | Grocery Store | Movie Theater | Mexican Restaurant |
| 4 | Merton | 1 | Pub | Coffee Shop | Sushi Restaurant | Bar | Indian Restaurant | Stationery Store | Burger Joint | Grocery Store | Movie Theater | Mexican Restaurant |
| 10 | Merton | 1 | Pub | Coffee Shop | Sushi Restaurant | Bar | Indian Restaurant | Stationery Store | Burger Joint | Grocery Store | Movie Theater | Mexican Restaurant |
In [98]:
cluster3
Out[98]:
| | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | Hammersmith and Fulham | 2 | Café | Coffee Shop | Italian Restaurant | Pub | Yoga Studio | Grocery Store | Wine Shop | Park | French Restaurant | Climbing Gym |
| 6 | Hammersmith and Fulham | 2 | Café | Coffee Shop | Italian Restaurant | Pub | Yoga Studio | Grocery Store | Wine Shop | Park | French Restaurant | Climbing Gym |
| 9 | Hammersmith and Fulham | 2 | Café | Coffee Shop | Italian Restaurant | Pub | Yoga Studio | Grocery Store | Wine Shop | Park | French Restaurant | Climbing Gym |
In [99]:
cluster4
Out[99]:
| | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | Merton | 3 | Bus Stop | Platform | Grocery Store | Fast Food Restaurant | Coffee Shop | Pharmacy | Indian Restaurant | BBQ Joint | Trail | Bakery |
In [100]:
cluster5
Out[100]:
| | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 13 | Merton | 4 | Coffee Shop | Pub | Grocery Store | Bus Stop | Sushi Restaurant | Indian Restaurant | Bar | Burger Joint | Platform | Gym / Fitness Center |
| 14 | Merton | 4 | Coffee Shop | Pub | Grocery Store | Bus Stop | Sushi Restaurant | Indian Restaurant | Bar | Burger Joint | Platform | Gym / Fitness Center |

Conclusion

According to the latest Land Registry figures, Westminster is one of the most expensive boroughs in London in which to buy a home. The average property price in the City of Westminster is £990,896, slightly less than a million pounds. However, given the chance of a view of the London Eye or Big Ben over breakfast, house prices in this area seem reasonable. In Hammersmith and Fulham the average property price is £784,613.
The conclusion is therefore to invest in one of the most common venue types in each borough:

HAMMERSMITH AND FULHAM

  1. Café
  2. Coffee Shop
  3. Italian Restaurant
  4. Pub
  5. Yoga Studio
  6. Grocery Store
  7. Wine Shop
  8. Park
  9. French Restaurant
  10. Climbing Gym

WESTMINSTER

  1. Hotel
  2. Theater
  3. Coffee Shop
  4. Sandwich Place
  5. Juice Bar
  6. Sushi Restaurant
  7. Indian Restaurant
  8. Movie Theater
  9. Modern European Restaurant
  10. Clothing Store

MERTON

  1. Pub
  2. Coffee Shop
  3. Sushi Restaurant
  4. Bar
  5. Indian Restaurant
  6. Stationery Store
  7. Burger Joint
  8. Grocery Store
  9. Movie Theater
  10. Mexican Restaurant

The Merton list is based on cluster 2.
