Design Pattern

Now that we know the objective of this project, let's analyze what we already know.

We will first learn a few design patterns, using examples from theme parks such as Disney, Universal, or Magic Mountain, which manage crowds so well.

For example, take a section of the map from a Disney or Universal theme park.

Figure: Disney theme park map

Figure: Universal Studios theme park map


Pattern

Let's first analyze a few basic characteristics of theme parks.

To start with, assume that all theme park tickets are either sold in advance or bought on the same day, and that most of the time fewer tickets are sold than the maximum occupancy the park's capacity allows.

Remember that a crowd smaller than the maximum allowed occupancy may not be the case for other types of gatherings, such as protests, political rallies, or festive gatherings. We will address this later in the vision IoT section, where it becomes an important factor in detecting anomalies.

  • Create Graph, Vertices and Edges (relationships)

    Let's break down each ride in the park into its entities and characteristics (i.e., attributes).

  • Gathering Visitor, Food Supply and other data

    Create and load the visitor information register and other data.

  • IOTs climate data

    Gather IoT (Internet of Things) data from sensors.

  • Analyzing patterns
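
As a rough illustration of the first step, the sketch below models a ride and a guest as vertices and a "rides" relationship as a directed edge, using plain Python dataclasses. The class names, the snake_case field names, and the `rides_per_ride` helper are assumptions made for this example only; the actual schema is defined in GSQL later in this chapter.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Guest:
    """A vertex: one visitor with a few of the attributes used later."""
    id: int
    name: str
    age: int


@dataclass(frozen=True)
class Ride:
    """A vertex: one ride with a few of its characteristics."""
    id: int
    name: str
    indoor: bool
    avg_wait_time: int


@dataclass(frozen=True)
class Rides:
    """A directed edge Guest -> Ride, stamped with the time of the ride."""
    guest_id: int
    ride_id: int
    ride_time: datetime


def rides_per_ride(edges):
    """Count how many guest visits each ride received."""
    counts = {}
    for edge in edges:
        counts[edge.ride_id] = counts.get(edge.ride_id, 0) + 1
    return counts


guest = Guest(1, "Last First Name M.", 28)
coaster = Ride(1, "Joy Ride", indoor=False, avg_wait_time=82)
edges = [Rides(guest.id, coaster.id, datetime(2020, 4, 6, 10, 30))]
print(rides_per_ride(edges))
```

Keeping the relationship as its own record, rather than as a list stored on the guest, mirrors how a graph database stores edges with their own attributes.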


Create Graph, Vertices and Edges

# we are using the Julia language for graph analysis
# TigerGraph provides REST API endpoints, GSQL and GraphStudio to connect to TigerGraph
#######################################################################
# pyTigerGraph is a Python library to connect to the graph database and run GSQL
# we will use the Julia PyCall package to call the pyTigerGraph library
#######################################################################
## **perhaps, some day I will re-write pyTigerGraph package in Julia ##
#######################################################################

# open the Julia REPL, Jupyter or your favorite Julia IDE and run the following

# first add all packages required to support our data analysis
# the rest of this chapter assumes that the packages below are installed
import Pkg
Pkg.add("DataFrames")
Pkg.add("CSV")
Pkg.add("Distributions")  # used later to draw the sample ticket prices
Pkg.add("PyCall")
Pkg.build("PyCall");

# you will also need to install pyTigerGraph in your Python environment
# !pip install -U pyTigerGraph
    Updating registry at `~/.julia/registries/General`
    Updating git-repo `https://github.com/JuliaRegistries/General.git`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.7/Project.toml`
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`
    Building Conda ─→ `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/6e47d11ea2776bc5627421d59cdcc1296c058071/build.log`
    Building PyCall → `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/1fc929f47d7c151c839c5fc1375929766fb8edcc/build.log`
Info

Before proceeding any further, please set up a TigerGraph server instance at tgcloud.io. Please don't expect the credentials below to work for you, as there is a cost involved in keeping the instance running.

hostName = "https://p2p.i.tgcloud.io"

userName = "tigercloud"

password = "tigercloud"

graphName = "HazardAhead"

conn = tg.TigerGraphConnection(host=hostName, username=userName, password=password, graphname=graphName)


Now that the TigerGraph and Julia environments are set up, let's create a sample graph with vertices and edges to get the hang of the tools.

import Pkg
# you may not need to add Conda and pyTigerGraph
# if you already have Python set up
# these instructions are specific to the Julia setup
Pkg.add("Conda")
# PyCall reads ENV["PYTHON"] when it is built, so rebuild PyCall after changing it
ENV["PYTHON"] = "/usr/bin/python3"
using PyCall
using Conda
Conda.pip_interop(true)
# Conda.pip_interop(true; [env::Environment="/usr/bin/python3"])
# Conda.pip installs into Conda.jl's own environment; if PyCall uses the system
# Python set above, run `pip install -U pyTigerGraph` in that Python instead
Conda.pip("install", "pyTigerGraph")
Conda.add("pyTigerGraph")
tg = pyimport("pyTigerGraph")
# please don't expect the credentials below to work for you; sign up at tgcloud
hostName = "https://p2p.i.tgcloud.io"
userName = "amit"
password = "password"
graphName = "HazardAhead"
conn = tg.TigerGraphConnection(host=hostName, username=userName, password=password, graphname=graphName)
# conn.gsql(getSchema)
PyObject <pyTigerGraph.pyTigerGraph.TigerGraphConnection object at 0x7f9fac7796d0>
Warning

Operations that DO NOT need a Token

Viewing the schema of your graph using functions such as getSchema and getVertexTypes does not require you to have an authentication token. A token is also not required to run gsql commands through pyTigerGraph.

Sample Connection

conn = tg.TigerGraphConnection(host='https://pytigergraph-demo.i.tgcloud.io', username='tigergraph', password='password', graphname='DemoGraph')

Operations that DO need a Token

A token is required to view or modify any actual DATA in the graph. Examples are: upserting data, deleting edges, and getting stats about any loaded vertices. A token is also required to get version data about the TigerGraph instance.

Sample Connection

conn = tg.TigerGraphConnection(host='https://pytigergraph-demo.i.tgcloud.io', username='tigergraph', password='password', graphname='DemoGraph', apiToken='av1im8nd2v06clbnb424jj7fp09hp049')

Note

The code below is executed directly in a Python environment.

First, you will need to install pyTigerGraph in your Python environment:

!pip install -U pyTigerGraph

Then execute the following commands to create the graph on TGCloud:

import pyTigerGraph as tg
hostName = "https://p2p.i.tgcloud.io"
userName = "amit"
password = "password"
graphName = "HazardAhead"
conn = tg.TigerGraphConnection(host=hostName, username=userName, password=password, graphname=graphName)

conn.gsql("ls")
conn.gsql('''USE GLOBAL
DROP ALL
''')

conn.gsql('''
  USE GLOBAL
  CREATE VERTEX Guest (PRIMARY_ID id INT, bookDate DATETIME, name STRING, phoneNo INT, age INT, gender STRING, checkIn DATETIME, checkOut DATETIME, specialNeeds BOOL, race STRING, price STRING, accompanies INT, family BOOL, localResident BOOL, ADDRESS STRING) WITH primary_id_as_attribute="true"

  CREATE VERTEX Ride (PRIMARY_ID id INT, name STRING, indoor BOOL, inlets INT, outlets INT, temperature INT, avgWaitTime INT, popularityRating INT, rideType STRING, rideClass STRING, maturityRating STRING, numExits INT, area INT, numEmployees INT) WITH primary_id_as_attribute="true"

  CREATE VERTEX FoodCourt (PRIMARY_ID id INT, name STRING, indoor BOOL, inlets INT, outlets INT, temperature INT, avgWaitTime INT, popularityRating INT, foodType STRING, numExits INT, area INT, numEmployees INT) WITH primary_id_as_attribute="true"

  CREATE DIRECTED EDGE rides (FROM Guest, TO Ride, rideTime DATETIME)
  CREATE DIRECTED EDGE eats (FROM Guest, TO FoodCourt, eatTime DATETIME)
  CREATE UNDIRECTED EDGE accompanied (FROM Guest, TO Guest)

''')
results = conn.gsql('CREATE GRAPH HazardAhead(Guest, Ride, FoodCourt, rides, eats, accompanied)')

Graph 1

Loading Data

conn.gsql('''
USE GLOBAL
USE GRAPH HazardAhead
CREATE LOADING JOB HazardAhead_PATH FOR GRAPH HazardAhead {
DEFINE FILENAME file1 = "sampleData/visitors.csv";
DEFINE FILENAME file2 = "sampleData/ride.csv";
DEFINE FILENAME file3 = "sampleData/foodcourt.csv";
DEFINE FILENAME file4 = "sampleData/rides.csv";
DEFINE FILENAME file5 = "sampleData/eats.csv";
DEFINE FILENAME file6 = "sampleData/accompanied.csv";
LOAD file1 TO VERTEX Guest VALUES ($0, $1,,....) USING header="true", separator=",";
LOAD file2 TO VERTEX Ride VALUES ($0, $1,,....) USING header="true", separator=",";
LOAD file3 TO VERTEX FoodCourt VALUES ($0, $1,,....) USING header="true", separator=",";
LOAD file4 TO EDGE rides VALUES ($0, $1,,....) USING header="true", separator=",";
LOAD file5 TO EDGE eats VALUES ($0, $1,,....) USING header="true", separator=",";
LOAD file6 TO EDGE accompanied VALUES ($0, $1,,....) USING header="true", separator=",";
}
''')

results = conn.gsql('RUN LOADING JOB HazardAhead_PATH USING file1="sampleData/visitors.csv", file2="sampleData/ride.csv", ...')
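
The loading job pairs each file with one vertex or edge type, so its text can also be generated from a small table instead of being typed by hand. The following is only a sketch under stated assumptions: `TARGETS` and `build_loading_job` are hypothetical names, and the column counts assume each CSV lists the id column(s) first, followed by every attribute in schema order (15 columns for Guest, 14 for Ride, 12 for FoodCourt, and source, target, timestamp for the two directed edges).

```python
# Hypothetical helper: builds the GSQL loading-job text from a small table.
# Assumption: each CSV has the id column(s) first, then every attribute
# in the same order as the schema definition.
TARGETS = [
    # (file variable, path, kind, target name, number of CSV columns)
    ("file1", "sampleData/visitors.csv", "VERTEX", "Guest", 15),
    ("file2", "sampleData/ride.csv", "VERTEX", "Ride", 14),
    ("file3", "sampleData/foodcourt.csv", "VERTEX", "FoodCourt", 12),
    ("file4", "sampleData/rides.csv", "EDGE", "rides", 3),
    ("file5", "sampleData/eats.csv", "EDGE", "eats", 3),
    ("file6", "sampleData/accompanied.csv", "EDGE", "accompanied", 2),
]


def build_loading_job(job_name, graph_name, targets):
    """Return the CREATE LOADING JOB statement as a single string."""
    lines = [f"CREATE LOADING JOB {job_name} FOR GRAPH {graph_name} {{"]
    for var, path, _kind, _name, _ncols in targets:
        lines.append(f'  DEFINE FILENAME {var} = "{path}";')
    for var, _path, kind, name, ncols in targets:
        # $0, $1, ... refer to the CSV columns by position.
        values = ", ".join(f"${i}" for i in range(ncols))
        lines.append(
            f"  LOAD {var} TO {kind} {name} VALUES ({values}) "
            'USING header="true", separator=",";'
        )
    lines.append("}")
    return "\n".join(lines)


job = build_loading_job("HazardAhead_PATH", "HazardAhead", TARGETS)
print(job)
```

The resulting string could then be passed to `conn.gsql(...)` after `USE GRAPH HazardAhead`. Generating it this way keeps each file variable tied to exactly one target, which avoids copy-paste slips between `TO VERTEX` and `TO EDGE` lines.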

Graph 2

Graph 3

Graph 4


Gathering Visitor, Food Supply and other data

##############################################
# let's create 1000 visitors in visit register
##############################################
using DataFrames, CSV, Dates, Distributions
sampleSizeVisitor = 1000
visitorDF = DataFrame(
    id = 1:1:sampleSizeVisitor,
    bookDate = rand(Date("2020-04-01", dateformat"y-m-d"): Day(1): Date("2020-04-10", dateformat"y-m-d"), sampleSizeVisitor),
    name = "Last First Name M.",
    phoneNo = rand(1110000000:1:9988800000, sampleSizeVisitor),
    age = rand(9:1:78, sampleSizeVisitor),
    gender = rand(["Male","Female","Others","NA"], sampleSizeVisitor),
    checkIn = rand(Date("2020-04-01", dateformat"y-m-d"): Day(1): Date("2020-04-10", dateformat"y-m-d"), sampleSizeVisitor),
    checkOut = rand(Date("2020-04-01", dateformat"y-m-d"): Day(1): Date("2020-04-10", dateformat"y-m-d"), sampleSizeVisitor),
    specialNeeds = rand([0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1], sampleSizeVisitor), # biased distributions, mostly false
    race = "na",
    price = rand(Normal(100, 2), sampleSizeVisitor),
    accompanies = rand([1,2,3,4], sampleSizeVisitor),
    family = rand([0,1], sampleSizeVisitor),
    localResident = rand([0,1], sampleSizeVisitor),
    ADDRESS = "Not available",
    )

first(visitorDF,5)

5 rows × 15 columns

Row  id     bookDate    name                phoneNo     age    gender  checkIn     checkOut    specialNeeds  race    price    accompanies  family  localResident  ADDRESS
     Int64  Date        String              Int64       Int64  String  Date        Date        Int64         String  Float64  Int64        Int64   Int64          String
1    1      2020-04-04  Last First Name M.  2900446033  28     Others  2020-04-06  2020-04-09  0             na      100.882  4            1       1              Not available
2    2      2020-04-05  Last First Name M.  6309075693  25     Female  2020-04-03  2020-04-06  0             na      104.687  1            1       1              Not available
3    3      2020-04-02  Last First Name M.  7549585449  52     Female  2020-04-03  2020-04-10  0             na      101.423  1            1       1              Not available
4    4      2020-04-03  Last First Name M.  6502426069  53     Male    2020-04-08  2020-04-09  0             na      100.103  1            1       1              Not available
5    5      2020-04-02  Last First Name M.  6220180785  23     Male    2020-04-10  2020-04-08  0             na      96.9288  4            1       0              Not available

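
Because bookDate, checkIn and checkOut are drawn independently, some sample rows end up with a checkOut date earlier than checkIn (row 5 above is one such case). If consistent dates matter for later analysis, one option is to draw three dates and sort them. The sketch below shows the idea in Python with the standard library; `random_stay` is a hypothetical helper name, and the date window is taken from the sample code above.

```python
import random
from datetime import date, timedelta


def random_stay(rng, first_day, last_day):
    """Return (bookDate, checkIn, checkOut) with bookDate <= checkIn <= checkOut."""
    span = (last_day - first_day).days
    # Draw three day offsets and sort them so the order is always valid.
    offsets = sorted(rng.randint(0, span) for _ in range(3))
    book, check_in, check_out = (first_day + timedelta(days=o) for o in offsets)
    return book, check_in, check_out


rng = random.Random(42)  # fixed seed so the sample is reproducible
stays = [random_stay(rng, date(2020, 4, 1), date(2020, 4, 10)) for _ in range(1000)]
print(stays[0])
```

Sorting keeps all three dates inside the same window. It does allow same-day booking, check-in and check-out, which matches the earlier assumption that tickets may be bought on the day of the visit.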
##############################################
# let's create 20 Rides in Park
##############################################
using DataFrames, CSV, Dates, Distributions
sampleSize = 20
rideDF = DataFrame(
    id = 1:1:sampleSize,
    name = "Joy Ride",
    indoor = rand([0,1], sampleSize),
    inlets = rand([1,2,3,4], sampleSize),
    outlets = rand([1,2,3,4], sampleSize),
    temperature = rand(64:1:94, sampleSize),
    avgWaitTime = rand(5:1:110, sampleSize),
    popularityRating = rand(1:1:10, sampleSize),
    rideType = rand(["Adult","Teen","Kids", "YoungAdult"], sampleSize),
    rideClass = rand(["Luxury", "Special"], sampleSize),
    maturityRating = rand(1:1:10, sampleSize),
    numExits = rand([1,2,3,4], sampleSize),
    area = rand(5000:5:15000, sampleSize),
    numEmployees = rand(1:1:5, sampleSize)
    )

first(rideDF, 5)

5 rows × 14 columns

Row  id     name      indoor  inlets  outlets  temperature  avgWaitTime  popularityRating  rideType  rideClass  maturityRating  numExits  area   numEmployees
     Int64  String    Int64   Int64   Int64    Int64        Int64        Int64             String    String     Int64           Int64     Int64  Int64
1    1      Joy Ride  0       4       3        80           82           7                 Teen      Luxury     3               1         12055  2
2    2      Joy Ride  1       2       2        67           33           6                 Adult     Special    10              3         6090   1
3    3      Joy Ride  0       4       2        74           86           7                 Kids      Special    8               2         9840   1
4    4      Joy Ride  0       4       3        80           97           10                Adult     Special    3               3         7320   1
5    5      Joy Ride  1       2       2        78           31           5                 Adult     Special    10              3         14250  5

##############################################
# let's create 20 Food Courts in Park
##############################################
using DataFrames, CSV, Dates, Distributions
sampleSize = 20
foodcourtDF = DataFrame(
    id = 1:1:sampleSize,
    name = "Joy Ride",  # placeholder name (same string as in the ride sample)
    indoor = rand([0,1], sampleSize),
    inlets = rand([1,2,3,4], sampleSize),
    outlets = rand([1,2,3,4], sampleSize),
    temperature = rand(64:1:94, sampleSize),
    avgWaitTime = rand(5:1:110, sampleSize),
    popularityRating = rand(1:1:10, sampleSize),
    foodType = rand(["Fast","Formal","Snacks"], sampleSize),
    numExits = rand([1,2,3,4], sampleSize),
    area = rand(5000:5:15000, sampleSize),
    numEmployees = rand(1:1:15, sampleSize)
    )

first(foodcourtDF, 5)

5 rows × 12 columns

Row  id     name      indoor  inlets  outlets  temperature  avgWaitTime  popularityRating  foodType  numExits  area   numEmployees
     Int64  String    Int64   Int64   Int64    Int64        Int64        Int64             String    Int64     Int64  Int64
1    1      Joy Ride  0       2       2        92           55           2                 Formal    1         7015   9
2    2      Joy Ride  0       2       1        91           24           1                 Formal    4         8045   12
3    3      Joy Ride  0       1       1        85           107          5                 Fast      1         10030  14
4    4      Joy Ride  0       2       3        76           64           2                 Formal    2         10110  6
5    5      Joy Ride  0       4       4        76           8            10                Fast      4         13085  11
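
The loading job also expects edge files such as sampleData/rides.csv, which the samples above do not generate. As a minimal sketch, assuming 1000 guests, 20 rides, at most three rides per guest, and a 9:00 to 21:00 opening window (all illustrative choices, as are the `make_ride_edges` name and the CSV header), the edge rows could be produced like this in Python:

```python
import csv
import io
import random
from datetime import date, datetime, timedelta


def make_ride_edges(rng, guest_ids, ride_ids, day, max_rides=3):
    """Yield (guest id, ride id, rideTime) rows for one day in the park."""
    opening = datetime.combine(day, datetime.min.time()) + timedelta(hours=9)
    for guest in guest_ids:
        n = rng.randint(1, min(max_rides, len(ride_ids)))
        # sample() picks distinct rides, so a guest never repeats a ride here.
        for ride in rng.sample(ride_ids, n):
            # Random minute within the assumed 9:00-21:00 opening window.
            when = opening + timedelta(minutes=rng.randrange(12 * 60))
            yield guest, ride, when.strftime("%Y-%m-%d %H:%M:%S")


rng = random.Random(7)  # fixed seed so the sample is reproducible
rows = list(make_ride_edges(rng, list(range(1, 1001)), list(range(1, 21)),
                            date(2020, 4, 6)))

# In-memory buffer for the sketch; swap for
# open("sampleData/rides.csv", "w", newline="") to write the real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["guest", "ride", "rideTime"])
writer.writerows(rows)
print(buffer.getvalue().splitlines()[0])  # header line: guest,ride,rideTime
```

The same pattern would apply to eats.csv with food court ids in place of ride ids. The timestamp format is an assumption and should be checked against what the DATETIME attribute accepts in your TigerGraph version.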

IOTs climate data

##############################################
# let's create weather data
##############################################
using DataFrames, CSV, Dates, Distributions
sampleSize = 365
weatherDF = DataFrame(
    cityid = 1:1:sampleSize,
    state = rand(["LA","LA","FL"], sampleSize),
    indoorTemp = rand(64:1:94, sampleSize),
    outdoorTemp = rand(64:1:94, sampleSize),
    wind = rand(5:1:30, sampleSize),
    humidity = rand(30:1:70, sampleSize),
    precipitation = rand(0:1:5, sampleSize)
    )

first(weatherDF, 5)

5 rows × 7 columns

Row  cityid  state   indoorTemp  outdoorTemp  wind   humidity  precipitation
     Int64   String  Int64       Int64        Int64  Int64     Int64
1    1       FL      64          94           15     32        5
2    2       LA      84          85           12     38        4
3    3       FL      66          76           8      50        4
4    4       LA      73          90           9      52        3
5    5       LA      78          69           7      61        3

Analyzing patterns

Graph 1

Graph 2

Graph 3

Graph 4