Can you mix SQL Server graph tables with non-graph tables?

97 views Asked by At

I have been playing with SQL Server graph tables, and you can do some pretty interesting things with them, such as treating them as if they were relational tables:

CREATE TABLE Person 
(
    ID INTEGER PRIMARY KEY,
    [Name] VARCHAR(100)
) AS NODE;

CREATE TABLE City 
(
    ID INTEGER PRIMARY KEY,
    [Name] VARCHAR(100),
    [StateName] VARCHAR(50)
) AS NODE;

INSERT INTO Person (ID, name)
VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Chrissy')

INSERT INTO City (ID, name, stateName)
VALUES (1, 'San Francisco', 'CA'),
       (2, 'Dallas', 'TX'),
       (3, 'Redmond', 'WA');

CREATE TABLE LivesIn AS EDGE

INSERT livesIn
VALUES ((SELECT $node_id FROM Person WHERE ID = 1), 
        (SELECT $node_id FROM City WHERE ID = 1)),
       ((SELECT $node_id FROM Person WHERE ID = 2), 
        (SELECT $node_id FROM City WHERE ID = 2)),
       ((SELECT $node_id FROM Person WHERE ID = 3), 
        (SELECT $node_id FROM City WHERE ID = 3))    

-- basic graph join - works
SELECT *
FROM Person,livesin,city
WHERE MATCH(Person-(livesin)->city)

-- traditional (but nonsensical) relational join on nodes - also works
select * 
from Person p 
inner join city c on p.id = c.id

Now, the 64 dollar question is: Is it possible to mix these two models within the same query? I have been experimenting with syntax to try this out, but without luck. I couldn't find any resources online either.

The reason I ask, it would be very interesting to be able to mix your data structures to take advantage of the best of both worlds. Imagine you are working on a legacy system that was built using traditional RDB techniques. You get asked to add in a org chart, so you drop in some graph nodes and edges, and you want to tie these relationships to the existing employee tables to reduce duplicate data. I could see you needing to write a query to join tabular table data with graph data to make this work.

MS themselves claim (https://learn.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-overview?view=sql-server-ver15) that you can: 'Query across graph and relational data in a single query.'.

The best I have been able to come up with so far is something like this:

DECLARE @Buffer TABLE 
(
    UserName VARCHAR(100),
    Friend VARCHAR(100)
)

INSERT INTO @Buffer
    SELECT
        Person1.[name] AS UserName,
        Person2.[name] AS Friend
    FROM
        Person AS Person1,
        friendOf,
        Person AS Person2
    WHERE MATCH(Person1-(friendOf)->Person2)

-- SomeOtherRdbTable could be any Rdb table with a compatible column to join on
SELECT * 
FROM @Buffer b
INNER JOIN [SomeOtherRdbTable] s ON s.FirstName = b.UserName

This trickery works by just staging the results of the graph search into an in-memory temp table that can be joined to a normal RDB table in following steps. It isn't a terrible solution, but it could certainly be streamlined away if there was a method to directly join the graph call with another non-node table.

I suspect that you cannot (currently) do this sort of join directly, as there are probably separate distinct execution paths that are followed for differing table types. The ability to directly integrate these calls would be an amazing step forward to enable adoption of mixed DB types.

1

There are 1 answers

5
Martin Smith On

The MATCH clause can only be used in a WHERE clause rather than an ON condition.

I did try

FROM
    Person AS Person1 INNER JOIN [SomeOtherRdbTable] s ON s.FirstName = Person1.[name],
    friendOf,
    Person AS Person2
WHERE MATCH(Person1-(friendOf)->Person2)

But that gives error

Identifier 'Person1' in a MATCH clause is used with a JOIN clause or APPLY operator. JOIN and APPLY are not supported with MATCH clauses.

If you need an INNER JOIN you could revert to ANSI 89 Join syntax

SELECT
    Person1.[name] AS UserName,
    Person2.[name] AS Friend
FROM
    Person AS Person1,
    friendOf,
    Person AS Person2,
    [SomeOtherRdbTable] s
WHERE MATCH(Person1-(friendOf)->Person2) and s.FirstName = Person1.[name]

But this has various disadvantages (can't convert to outer join, join conditions separated from the table in the query text and so less readable and easy to do inadvertent cross joins) so probably best to use a common table expression or similar

WITH Buffer AS
(
    SELECT
        Person1.[name] AS UserName,
        Person2.[name] AS Friend
    FROM
        Person AS Person1,
        friendOf,
        Person AS Person2
    WHERE MATCH(Person1-(friendOf)->Person2)


)
SELECT * 
FROM Buffer b
INNER JOIN [SomeOtherRdbTable] s ON s.FirstName = b.UserName