GraphFrames is a package for Apache Spark that provides DataFrame-based graphs. It provides
high-level APIs in Scala, Java, and Python. It aims to provide both the functionality of GraphX
and extended functionality taking advantage of Spark DataFrames. This extended functionality
includes motif finding, DataFrame-based serialization, and highly expressive graph queries.
What are GraphFrames?
GraphX is to RDDs as GraphFrames are to DataFrames.
GraphFrames represent graphs: vertices (e.g., users) and edges (e.g., relationships between
users). If you are familiar with GraphX, then GraphFrames will be easy to learn. The key
difference is that GraphFrames are based upon Spark DataFrames, rather than RDDs.
GraphFrames also provide powerful tools for running queries and standard graph algorithms.
With GraphFrames, you can easily search for patterns within graphs, find important vertices, and
more. Refer to the User Guide for a full list of queries and algorithms.
Creating a graph and running the PageRank algorithm

# Assumes an existing SQLContext named sqlContext (e.g., from the pyspark shell)
from graphframes import GraphFrame
# Create a Vertex DataFrame with unique ID column "id"
v = sqlContext.createDataFrame([
    ("a", "Alice", 34),
    ("b", "Bob", 36),
    ("c", "Charlie", 30),
], ["id", "name", "age"])
# Create an Edge DataFrame with "src" and "dst" columns
e = sqlContext.createDataFrame([
    ("a", "b", "friend"),
    ("b", "c", "follow"),
    ("c", "b", "follow"),
], ["src", "dst", "relationship"])
# Create a GraphFrame
g = GraphFrame(v, e)
# Query: Get in-degree of each vertex.
g.inDegrees.show()
# Query: Count the number of "follow" connections in the graph.
g.edges.filter("relationship = 'follow'").count()
# Run PageRank algorithm, and show results.
results = g.pageRank(resetProbability=0.01, maxIter=20)
results.vertices.select("id", "pagerank").show()
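
The two simple queries in the listing above can be checked by hand without a Spark cluster. The following is a minimal plain-Python sketch, not GraphFrames code, that recomputes the in-degrees and the number of "follow" edges for the same three-edge toy graph. The variable names are illustrative only.

```python
from collections import Counter

# Same toy edge list as the GraphFrames example: (src, dst, relationship)
edges = [
    ("a", "b", "friend"),
    ("b", "c", "follow"),
    ("c", "b", "follow"),
]

# In-degree: number of edges arriving at each vertex.
in_degrees = Counter(dst for _, dst, _ in edges)

# Number of edges whose relationship is "follow".
follow_count = sum(1 for _, _, rel in edges if rel == "follow")

print(dict(in_degrees))  # {'b': 2, 'c': 1}
print(follow_count)      # 2
```

Vertex "a" has no incoming edges, so it does not appear in the in-degree result at all.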


Figure 4.2: NetworkX logo
NetworkX is a Python package for the creation, manipulation, and study of the structure,
dynamics, and functions of complex networks.
• Data structures for graphs, digraphs, and multigraphs
• Many standard graph algorithms
• Network structure and analysis measures
• Generators for classic graphs, random graphs, and synthetic networks
• Nodes can be "anything" (e.g., text, images, XML records)
• Edges can hold arbitrary data (e.g., weights, time-series)
• Open source 3-clause BSD license
• Well tested with over 90% code coverage
• Additional benefits from Python include fast prototyping, ease of teaching, and multi-platform support
sudo apt-get install python-pip python-virtualenv
virtualenv venv
source venv/bin/activate
pip install networkx
Algorithm
PageRank computes a ranking of the nodes in the graph G based on the structure
of the incoming links. It was originally designed as an algorithm to rank web pages.
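
To make the idea of ranking by incoming links concrete, here is a minimal power-iteration sketch in plain Python. It is an illustration under stated assumptions, not the NetworkX or GraphFrames implementation: the function name `pagerank`, the damping factor of 0.85 (a conventional choice), and the stopping tolerance are all chosen for this example. It runs on the three-edge toy graph from the GraphFrames example.

```python
def pagerank(edges, damping=0.85, max_iter=200, tol=1e-9):
    """Power-iteration PageRank on a directed edge list of (src, dst) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start from a uniform ranking
    for _ in range(max_iter):
        # Rank held by nodes without outgoing links is spread over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Each node passes its rank in equal shares along its outgoing links.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        converged = sum(abs(new_rank[v] - rank[v]) for v in nodes) < tol
        rank = new_rank
        if converged:
            break
    return rank


edges = [("a", "b"), ("b", "c"), ("c", "b")]
ranks = pagerank(edges)
for node, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 4))
```

Node "b" receives links from both "a" and "c" and therefore ends up with the highest score, while "a", which has no incoming links, keeps only the baseline share.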
Graph types
• Undirected Simple
• Directed Simple
• With Self-loops
• With Parallel edges
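
The graph types listed above map onto NetworkX graph classes. The short sketch below assumes NetworkX 2.x or later is installed; the node names and weights are made up for illustration.

```python
import networkx as nx

# Undirected simple graph: adding (b, a) after (a, b) does not create a new edge.
G = nx.Graph()
G.add_edge("a", "b")
G.add_edge("b", "a")

# Directed simple graph: (a, b) and (b, a) are two distinct edges.
D = nx.DiGraph()
D.add_edge("a", "b")
D.add_edge("b", "a")

# Self-loop: an edge that starts and ends at the same node.
L = nx.Graph()
L.add_edge("a", "a")

# Parallel edges require a multigraph class (MultiGraph or MultiDiGraph).
M = nx.MultiGraph()
M.add_edge("a", "b", weight=1.0)
M.add_edge("a", "b", weight=2.5)

print(G.number_of_edges())       # 1
print(D.number_of_edges())       # 2
print(nx.number_of_selfloops(L)) # 1
print(M.number_of_edges())       # 2
```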


Figure 4.1: OSMnx map of Manhattan
OSMnx: retrieve, construct, analyze, and visualize street networks from OpenStreetMap.
OSMnx is a Python package that lets you download spatial geometries and construct, project,
visualize, and analyze street networks from OpenStreetMap's APIs. Users can download and
construct walkable, drivable, or bikable urban networks with a single line of Python code, and then
easily analyze and visualize them.
• Download street networks anywhere in the world with a single line of code
• Download other infrastructure network types, place polygons, or building footprints as well
• Download by city name, polygon, bounding box, or point/address + network distance
• Get drivable, walkable, bikable, or all street networks
• Visualize the street network as a static image or leaflet web map
• Simplify and correct the network's topology to clean and consolidate intersections
• Save networks to disk as shapefiles or GraphML
• Conduct topological and spatial analyses to automatically calculate dozens of indicators
• Calculate and plot shortest-path routes as a static image or leaflet web map
• Plot figure-ground diagrams of street networks and/or building footprints
• Download node elevations and calculate edge grades
• Visualize travel distance and travel time with isoline and isochrone maps
• Calculate and visualize street bearings and orientations
sudo apt-get install python-pip python-virtualenv
virtualenv venv
source venv/bin/activate
pip install osmnx
import osmnx as ox
# Download the drivable street network for the named place
G = ox.graph_from_place('Punjab, India', network_type='drive')
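
The graph returned by OSMnx is a NetworkX graph whose edges carry a length attribute, which is what makes the shortest-path routing in the feature list possible. The sketch below illustrates that idea offline with Dijkstra's algorithm in plain Python. It does not call OSMnx, and the intersections "A" to "D" and their edge lengths in meters are invented for this example.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; graph maps node -> {neighbor: length_in_meters}."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry for an already settled node
        visited.add(u)
        if u == target:
            break
        for v, length in graph.get(u, {}).items():
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, float("inf")  # target is unreachable
    # Walk the predecessor links back from the target to rebuild the route.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


# Hypothetical one-way street segments between four intersections.
streets = {
    "A": {"B": 120.0, "C": 300.0},
    "B": {"C": 100.0, "D": 400.0},
    "C": {"D": 150.0},
    "D": {},
}
route, meters = shortest_path(streets, "A", "D")
print(route, meters)  # ['A', 'B', 'C', 'D'] 370.0
```

The direct segments A to C (300 m) and B to D (400 m) are both longer than the detours through B and C, so the route visits all four intersections.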

OpenStreetMap (OSM)

OpenStreetMap (OSM) is a collaborative project to create a free editable map of the world.
The creation and growth of OSM has been motivated by restrictions on use or availability of map
information across much of the world, and the advent of inexpensive portable satellite navigation
devices.

OSM is considered a prominent example of volunteered geographic information.
Created by Steve Coast in the UK in 2004, it was inspired by the success of Wikipedia and
the predominance of proprietary map data in the UK and elsewhere. Since then, it has grown
to over 2 million registered users, who can collect data using manual survey, GPS devices, aerial
photography, and other free sources.

This crowdsourced data is then made available under the Open Database Licence. The site is supported by the OpenStreetMap Foundation, a non-profit
organisation registered in England and Wales.

Rather than the map itself, the data generated by the OpenStreetMap project is considered its
primary output. The data is available for use in traditional applications, such as replacing
Google Maps in Craigslist, OsmAnd, Geocaching, MapQuest Open, JMP statistical software, and
Foursquare, and in more unusual roles, such as replacing the default data included with
GPS receivers. OpenStreetMap data has been favourably compared with proprietary data sources,
though data quality varies worldwide.

Map usage
The map is available on the following platforms.

• Web browser Data provided by the OpenStreetMap project can be viewed in a web browser
with JavaScript support via Hypertext Transfer Protocol (HTTP) on its official website.
• OsmAnd OsmAnd is free software for Android and iOS mobile devices that can use offline
vector data from OSM. It also supports layering OSM vector data with prerendered raster map
tiles from OpenStreetMap and other sources.

• is free software for Android and iOS mobile devices that provides offline
maps based on OSM data.
• GNOME Maps GNOME Maps is a graphical front-end written in JavaScript and introduced
in GNOME 3.10. It finds the user’s location with the help of GeoClue, finds directions via
GraphHopper, and can return a list of results in answer to queries.
• Marble Marble is a KDE virtual globe application which received support for OpenStreetMap.
• FoxtrotGPS FoxtrotGPS is a GTK+-based map viewer that is especially suited to touch
input. It is available in the SHR or Debian repositories.
• Emerillon Another GTK+-based map viewer.
• The web site provides a slippy map interface based on the Leaflet
JavaScript library (and formerly built on OpenLayers), displaying map tiles rendered by
the Mapnik rendering engine, and tiles from other sources including
• Custom maps can also be generated from OSM data with various software, including Jawg
Maps, Mapnik, Mapbox Studio, and Mapzen’s Tangrams.
• OpenStreetMap maintains lists of online and offline routing engines available, such as the
Open Source Routing Machine. OSM data is popular with routing researchers, and is also
available to open-source projects and companies to build routing applications (or for any
other purpose).