How do I create a Sankey diagram in Power BI (or any other program) that shows only employee transfers between company locations?


I'd like to create a Sankey diagram using the information from the table below. I'd like each source location (Previous Location) to flow directly to its destination (New Location), so that there is a single one-to-one flow for each row. The issue I'm having is that when I create this chart in Power BI, it builds multiple layers, where one location flows into another location, which then flows into yet another, e.g. Berlin to London to Hong Kong. This is not what I want. Any advice on how I could create a Sankey diagram where one point flows to just one other point? I don't mind which program this is done in.

[Image: table of employee transfers]

1 answer

Answer by fam-woodpecker:

I'm not convinced a Sankey diagram can work like that, simply by design. You would need multiple copies of the same node, which could clutter the plot to the point where it is no longer readable (if I understand your request correctly).
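To make the "multiple copies of nodes" idea concrete, here is a minimal sketch of one way it could be done: relabel each city once as an origin and once as a destination, so a city that appears in both columns becomes two separate nodes and no flow can chain through it. The sample rows, the `(from)`/`(to)` suffixes and the helper name `two_column_links` are all made up for illustration; they are not taken from the question's table.

```python
from collections import Counter

# Hypothetical sample rows: (previous location, new location).
transfers = [
    ("Berlin", "London"),
    ("London", "Hong Kong"),
    ("London", "Hong Kong"),
]


def two_column_links(pairs):
    """Return (labels, link) with origins and destinations as separate nodes."""
    counts = Counter(pairs)  # number of employees per (origin, destination) pair
    # dict.fromkeys removes duplicates while keeping first-seen order.
    sources = list(dict.fromkeys(src for src, _ in pairs))
    targets = list(dict.fromkeys(dst for _, dst in pairs))
    # Suffix the labels so "London" as an origin and "London" as a
    # destination are two different nodes.
    labels = [f"{c} (from)" for c in sources] + [f"{c} (to)" for c in targets]
    src_index = {c: i for i, c in enumerate(sources)}
    dst_index = {c: len(sources) + i for i, c in enumerate(targets)}
    link = {
        "source": [src_index[s] for s, _ in counts],
        "target": [dst_index[d] for _, d in counts],
        "value": list(counts.values()),
    }
    return labels, link


labels, link = two_column_links(transfers)
print(labels)
print(link)
```

The resulting `labels` and `link` follow the index-based node/link layout that Plotly's `go.Sankey(node=dict(label=labels), link=link)` accepts, if you want to try plotting it. The same suffixing idea should also apply in Power BI by renaming the values in the two columns before passing them to the Sankey visual, though with many locations the result may well be as cluttered as described above.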

Maybe a circular flow diagram would be better? It is basically a Sankey in a circle.

There's a Python library, pycirclize, that does this, as shown below:

# Install the library first with: pip install pycirclize
import pandas as pd
from pycirclize import Circos

cities = [
    "London",
    "Milan",
    "Tokyo",
    "Brussels",
    "Berlin",
    "Riyadh",
    "Shanghai",
    "Hong Kong",
    "Seoul",
    "Dubai",
    "New York",
    "Sydney",
    "Singapore",
    "Brisbane",
    "Melbourne",
    "Kuala Lumpur",
    "Perth",
]

# create route matrix
row_names = cities
col_names = cities
matrix_data = [
    [0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
]
matrix_df = pd.DataFrame(matrix_data, index=row_names, columns=col_names)

# Initialize Circos from matrix for plotting Chord Diagram
circos = Circos.initialize_from_matrix(
    matrix_df,
    space=5,
    cmap="tab10",
    label_kws=dict(size=12),
    link_kws=dict(ec="black", lw=0.5, direction=1),
)

circos.savefig("circlize_plot.png")

You can trim row_names and col_names down to only the cities that appear in the From and To columns respectively; I just did a rough version here. You could also build this matrix automatically from the input table.

[Image: Flights Circular Flow Plot]
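As a rough sketch of how the matrix could be built automatically rather than typed by hand, the snippet below counts each (from, to) pair and lays the counts out as a square matrix. The sample rows and the helper name `build_matrix` are hypothetical; swap in the rows from your own table.

```python
from collections import Counter

# Hypothetical sample rows: (previous location, new location).
transfers = [
    ("Berlin", "London"),
    ("London", "Hong Kong"),
    ("Kuala Lumpur", "Tokyo"),
    ("Kuala Lumpur", "Tokyo"),
]


def build_matrix(pairs):
    """Return (labels, matrix); matrix[i][j] counts moves from labels[i] to labels[j]."""
    counts = Counter(pairs)
    # Every city that appears in either column, in first-seen order.
    labels = list(dict.fromkeys(city for pair in pairs for city in pair))
    # Counter returns 0 for pairs that never occur, so missing routes become 0.
    matrix = [[counts[(src, dst)] for dst in labels] for src in labels]
    return labels, matrix


labels, matrix = build_matrix(transfers)
for name, row in zip(labels, matrix):
    print(name, row)
```

The result can then replace the hand-written lists in the code above, e.g. `pd.DataFrame(matrix, index=labels, columns=labels)`.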