Graph Database Modeling by Thakur Kanika & Singh Ajit

Graph Database Modeling by Thakur Kanika & Singh Ajit

Author:Thakur, Kanika & Singh, Ajit [Thakur, Kanika]
Language: eng
Format: epub
Publisher: UNKNOWN
Published: 2020-08-23T16:00:00+00:00


Thepreviousfigurebreaksupthetournamentwinnerstableintotwotables,onewithplayer detailsandonewiththetournamentdetails.Theactualruleson"functionaldependencies"and "nonprimeattributes"arehardtoremember,buttheprocessofsplittingandmergingtables comesintuitivelywithexperience.Forexample,iftherewasanexistingtablewhichhadone rowperplayer,we'dprobablymovethe"dateofbirth"tothattable.

Transformationrulesthatproduceequivalentschemas

Thissectionlistssometransformationrulesthatproduceequivalentgraphschemas.Agraph schemaisequven toanothergraphschemaifthedatastoredinoneschema,alongwith theapplicationsthataccessit,canbeportedtotheotherschema,andviceversa.Theserules arelikesplittingandmergingtablesinrelationalmodels.

Thetransformationrulesinthissectioncanbemechanicallyappliedtoanyschema,andhas nothingtodowithitssemantics.Byapplyingacombinationoftheserules,youcouldsimplify thesemanticsandimprovetheusabilityofyourgraphmodel.

RuleA:Renamingpropertiesandlabels

Thisruleconsistsofthreetransformationsthatresultinequivalentschemas:

Anyvertexlabelcanberenamed,solongasthenewnamedoesn'trefertoanexisting vertexlabel.

Anyedgelabelcanberenamed,solongasthenewnamedoesn'trefertoanexistingedge labelbetweentheoutandinvertextypes.

Anyvertex/edgepropertycanberenamedsolongthenewnamedoesn'trefertoan existingpropertyofthevertex/edgetype.

Thefollowingfigureillustratessomeexampleapplicationsofthisruleonvertexandedgelabels:

Fgure10:Renamngproperesandabes

Theschemashowninthetopisasimplegraphschemashowingfamilyrelationships.This schemaistransformedtotheschemashowninthebottom ofthefigureusingthe followingtransformations:

VertexlabelsManandWomanarerenamedtoMaleandFemale.

Edgelabelsmother(2instances),father(2instances)arerenamedtoparent.

Eventhoughitseemslikesomeinformationislostbyrenamingmother/fathertoparent,this isn'ttruebecausethevertexlabelsattheendpoints(Male/Female)havethatinformation.This sametransformationwouldn'tbesoobviouswhilelookingataninstanceofthisgraphlikethe Kennedyfamilytree.

Notethatyoucannotrename'wife'to'parent'inthebottomschema.Thisisbecausethere alreadyexistsaparentedgetypefromMaletoFemale.

RuleB:Reversingedgedirections

Thisrulestatesthatanedgetypecanbereversedprovideditisaselfloop,orthereisnoedge typewiththesamelabelinthereversedirection.Thecardinalityoftheedgetypeisreversedas well.

Fgure 1:Reverngedgedrecons

Thefollowingfigureillustratesanexampletransformationusingthisruleandtheprevious one.Thetransformationinvolvesthefollowingsteps:

The'wife'edgeisrenamedto'husband'(ruleA)andthenreversed.

Eachparentedgeisrenamedto'son'or'daughter'andreversed.

Notethatthereversalisdoneinthegraphinstanceaswellastheschema.Inotherwords, JFKJrparent>JFKSr.becomesJFKSr.son>JFKSr.

Youcouldalwaysrenamethefour'son'and'daughter'edgetypes,to'child'usingruleA. Again,noinformationislostsincethevertexlabelsarestillunique.

Youwould,however,notbeabletorename'husband'to'son'.Youcouldrename'husband' to'daughter'(thoughabsurd).Theapplicationwillhavetointerpret"maledaughters"as husbands.Butafteryourenamehusbandtodaughter,youwouldnotbeabletoreverseits direction.

Asyoucanseealready,someapplicationsoftheserulesmaybequitehardtoderiveifyouare thinkingintermsofgraphinstances,ratherthangraphschemas.

RuleC:Propertydisplacement

Fgure12:Propertydsacement

Thisrulestatesthatapropertyonanedgetypecanbemovedtoeitheradjacentvertextype, provideditslookacrosscardinalityis1.Thereverserulestatesthatapropertyinavertextype canbemovedtoanadjacentedgetypewithlookacrosscardinalityof1,providedtheedge alwaysexistswhenthepropertyexists.

Theadjoiningfigureclarifiestherule,wherethe'dateOfBirth'propertyismovedtothe 'mother'relationshipbecausethereisexactlyonemotherrelationshipperMan/Womanandit isdefinedwhenthedateOfBirthisdefined.Ifyourename'dateOfBirth'to'deliveryDate',one couldarguethatthepropertybelongsintheedgeandnotthevertex.

NotethataMan'sdateOfBirthcannotbedisplacedtothewiferelationshipbecausethat wouldmeanthatthedataOfBirthcannotbestoredunlessthepersonismarried.Similarly, thedateOfBirthintheedgetypelabeled'mother'fromMantoWomaninthebottomschema, cannotbemovedtoWomanbecauseofthecardinalityrestrictionsintherule.

Usingthisrule,youcanmovethepropertiesaroundtheschematocomeupwithabetter lookingdesign.Thisruleisalsousefulinsatisfyingindexingrequirementsofvariousgraph databases.Forexample,ifagraphdatabaseonlysupportsindexesonvertexproperties,you couldmovesearchablepropertiesfromtheedgestovertices.Similarly,ifagraphdatabase supportsvertexcentricindexesbasedonpropertiesonadjacentedges/vertices,youcanuse thisruletobringtheindexedpropertyclosertothevertextypeofinterest.

RuleD:Specializationandgeneralization

Thisrulestatesthat:

AnyvertextypecanbedividedintotwodisjointvertextypesbasedonaBooleanteston thepropertiesandadjacentedgelabelsofavertexbelongingtothattype.

AnyedgetypecanbedividedintotwodisjointedgetypesbasedonaBooleanteston thepropertiesandadjacentvertexlabelsofanedgebelongingtothattype. Fgure13:Generizaon

Inotherwords,ifweprovideabooleanfunctionthatcangiveaT/Fresultgivenavertex/edge, wecanusethatfunctiontodivideavertex/edgetypeintotwodifferenttypes.

Thereverserulestatesthat:

Anyvertex/edgetypecanbemergedintoanothervertex/edgetypeprovidedthereisa Booleantestthatcandistinguishitsvertices/edgesfromthemergedvertices.

Theadjoiningfigureshowsanexampletransformationinvolvingthefollowingsteps:

MaleandFemalearegeneralizedasPerson,becausethebooleantest,sexequals'M', candistinguishMalefromFemale.

Afterthat,sonanddaughteredgetypesaregeneralizedaschildbecausethebooleantest, sexofinvertexequals'M',candistinguishsonfromdaughter.

Thisruleisusefulinincreasingthespecificity,orreducingthecomplexityofthegraphschema. Asageneralprinciple,itisbettertousethisruleforspecialization,we.e.,increasingthe specificity,becausethatallowsthedifferentvertexandedgetypestoembracedifferent behaviorintermsofpropertiesandadjacentedges.However,thereareinstanceswherethe differencesbetweenthevertextypesaresominorthatspecializationonlyresultsinapplication complexity.ThisargumentcouldapplytotheabovegeneralizationofMaleandFemaleto Person.

RuleE:Edgepromotion

Fgure14:Edgepromoon

Thisrulestatesthatanedgetypecanbe promoted toavertextypebyaddingtwo"out"edge typestotheendpoints.Thepropertiesofthevertextypebecomepropertiesoftheedgetype. ThecardinalityofthenewedgetypesareN:1or1:1dependingonthelookacrosscardinalityof theoriginalendpointvertex'stype.

Notethatthedirectionofthenewedgetypescanbechangedusingontherenameand reverserulesresp.Weonlymentionthe"out"directiontosimplifythewayinwhich cardinalityforthenewedgestypesisderived.

Theadjoiningfigureshowsthehusbandedgepromotedtoavertextypecalled'Marriage'. Theedgetypes'husband'and'wife'pointtothetwoendpointsofthevertextype.

TheedgepromotionruleisusefulinapreparingbinaryrelationshiptobecomeanNary relationship.

Thereverserulestatesthatanyvertextypewithtwopropertylessedgetypes,withsameside cardinalityofexactly1,canbedemoted toanedgebetweentheadjacentvertices.Thisprocess isusefultosimplifyschemas.Youcanusethepropertydisplacementrule(ruleC)tomove propertiesoutofedges.

RuleF:Propertypromotion

Fgure15:Propertypromoon

Thisrulestatesthatanygroupofpropertiescanbepromotedtoanewvertextypewiththose properties,providedthenewvertextypehasedgesconnectingittoallexistingvertextypesthat includethepropertygroup.Thesamesidecardinalityofthenewedgetypeis1.

Theadjoiningfigureshowsthe'sex'propertyconvertedtoanewvertextype.Thisvertextype willhaveexactlytwonodescorrespondingtomaleandfemale.Soinotherwords,everyperson inthenewgraphwillhaveanoutgoing'isa'edgetooneofthetwonewvertices.

Thisruleisequivalenttothesplittingofarelationintotworelations,asshowninthefirst figureofthissection.Anygroupofproperties,typicallyonesthatrepeat,canbepromotedtoa vertex.

Whileapplyingthisrule,itisbettertoincludeallvertextypesthathavethesamegroupof properties.Forexample,ifthereisa'sex'propertyinadifferentAnimaltype,itisbetterto pointthattothenewSexvertextypeaswell.Ifyouhaveedgetypeswiththepropertygroup, youcanfirstpromotethoseedgetypestovertices.

Thereverseofthisruleisthatavertextypethathaspropertylessedgetypeswithsameside cardinalityof1,canbedemotedtothegroupofpropertiesthatitholds.Thesepropertiesmust beaddedtoeveryadjacentvertextype.Thisistheequivalentofthedenormalizationofatable intherelationalmodel,whichisusefultoreducethenumberofjoins(ortraversalsinthecase ofgraphdatabases).

RuleG:Propertyexpansion

Fgure16:Propertyexpanson

Thisrulestatesthatapropertyofavertextypethatrepresentsalistofvaluescanbemoved toaseparatevertextypewhichstoreseachvalue.Thenewvertextypemusthavean"in" edgetypefromtheexistingvertextypewithcardinality1:N.Theadjoiningfigureshowsthis ruleappliedtothenicknamepropertywhichholdsalistofStrings.

Thereverserulestatesthatanyvertextypewithexactlyonepropertylessedgetypewith lookacrosscardinalityofexactly1canberemovedaftermovingitspropertiestoalistinthe adjacentvertextype.

Thisistheequivalentof1NFintherelationalmodel.Unlikerelationaldatabases,however,many graphdatabasessupportlistsasavalidtypeforpropertyvalues.Sothechoiceofstoring nicknamesasaListoraseparatevertextypeisuptothedesigner.

Summary

Rulebasedschematransformationsaretoolsthatadatamodeldesignercanusetorewritea graphschema,withoutlosinganyinformationintheprocess.Inotherwords,adatamodel designercanusetheserulestoselectthedirectionsofedges,thenamesofdifferentlabels andkeys,thelocationsofvariousproperties,andsoon.Thesechangesdon'tmatterfroman pureinformationperspective,butcouldmakeabigdifferenceintheusabilityandefficiency. Inthatsense,adatamodeldesignercangobacktoCodd'soriginalgoalsfornormalization designingschemasthatareeasytomodify,easytoextend,informativetousersand supportiveofvariousquerypatterns.

Chapter5

One metare fornormalization

Theprevioussectionlistedsevenrulebasedschematransformationssuchasrenaminglabels, reversingedges,promotingedgesandpropertiestovertices,andsoon.Suchrulebased transformationscanbemechanicallyappliedto anygraphschema,withoutlosingany informationintheprocess.Usingtheserules,agraphdatabasedesignercanstartwitha designgeneratedfromanentityrelationshipmodelandtweakittogetafinaldesign.

Thissectiondescribesasingle metare from whichthesevenpreviouslydescribed rulescanbederived.Italsoformalizessomeoftheideaspresentedintheprevious sectionsusingsettheory.

Schemasandconstraints

Fgure17:Exampegraphschema

Theabovefigureshowsanexamplegraphschemadescribingconstraintsonthegraphdata modelsuchas:

Whatarethelegallabelsforvertices?

Whatarethelegaledgelabelsbetweentwovertextypes?

Whatarethelegalpropertykeysandvaluetypesateachedgeorvertextype?

Thereality,however,isthatagraphmodelcouldhaveotherconstraintsthataren'texpressed intheschema.Forexample,the'inviter'edgeineveryInvitationmustbetotheUserwhohas an'owns'edgetothe'page'edgeoftheInvitation.Thisconstraintisn'tcapturedintheabove schema.

Thequestionis:Howcanwemodelcompexconstrntsnagraphmode?

Graphuniverses,transformationsandequivalence

Agraphunverse Uisasetofgraphs,typicallyaninfiniteset.Agraphuniverserepresentsa datamodelinthesensethatitcaptureseveryvalidgraphthatbelongstothedatamodel.

AgraphuniverseUis compabe withagraphschemaS,ifeverygraphintheuniverseis compatiblewithS.Inotherwords,althoughthegraphuniverseisaprecisedescriptionofthe model,itcanstillbeunderstoodasarefinementofamorelooselydefinedgraphschema.

Redenngequvenceusngtransformaonfuncons

Fgure18:Annverbefuncon

AgraphtransformationTisafunctionthattakesgraphsfromoneuniverseUto anotheruniverseV.Inshort,T:U→ V.

AuniverseUisequivalenttoauniverseVifthereisatransformationfunctionT:U → V, whereTisinvertible.Invertibleandbijectivearetermstocharacterizefunctionsthat establishaonetoonecorrespondencebetweentwosets,whichinthiscasearegraph universes.

Inotherwords,givenanygraphG∈U,wecanuseT(G)togetagraphG'∈V.Thenwecanusethe inversefunctionT1 (G')togetbackG.Henceestablishingequivalenceofthetwouniverses.

Aprogrammngperspecve

Ifweareupgradingfrom onegraphmodeltoanother,thetransformationfunctionisthe upgradescp thatwewouldimplementtomovetothenewmodel.Ifwecanalsowritea downgradescp ,thenwehavetwoequivalentmodels(oruniverses).Inotherwords,two graphmodels,representedasuniversesorschemas,areequivalentiftheyareforwardand backwardcompabe .

Derivedtypes

ConsideragraphuniverseUthatiscompatiblewithaschemaS.AvertextypeinScanbecalleda devedvertextypenU ,ifeverygraphG∈Uissuchthatitsvertices(andadjacentedges)belongingto thevertextypecanbecalculatedfromtherestofthegraph.

Inotherwords,givenanygraphintheuniverseU,afterweremoveallverticescorrespondingtothe derivedvertextype,thereshouldbeawaytocalculatethoseverticesagain.Derivededgeand propertytypescanbedefinedsimilarly.Notethatallderivedelementtypesaredefinedingraph schemas,butarespecifictographuniversesthatarecompatiblewiththatschema.

Metarule:Addingandremovingderivedtypes

Finally,hereisthemetarulebehindallschematransformations:

GivenanygraphuniverseUcompatiblewithaschemaS,wecanaddaderivedvertex/edge/propertytypetoproducean equivalentgraphuniverseVcompatiblewiththeschemaS∪{derivedtype}.

Thereverserulestatesthat:

GivenanygraphuniverseUcompatiblewithaschemaS,wecanremoveaderived vertex/edge/propertytypetoproduceanequivalentgraphuniverseVcompatible withtheschemaS{derivedtype}.

Fgure19:Modfedgraphschema

The'invitee'edgetypeinthegraphschemashowninthefirstfigureisaderivededgetype. Thisisbecausethe'invitee'edgescanbecalculatedbygoingfrom theInvitationverticesto thePageandbacktotheuserthrough'owns'edge(reversedirection).Wecansimplifythe originalschematotheversionshownintheadjoiningfigurebyapplyingtheserules:

(Metarule)Removederivededgetype'invitee'

(Edgepromotion)DemotethebinaryrelationshipInvitationtoanedgecalled'invited'.

Asyoucansee,theupdatedschemaissimplerthantheoriginalschemaderivedfrom anER diagram.

Provingthemetarule

Themetaruleiseasyto provebecauseofthewayderivedtypesaredefined.The transformationfunctiontoremoveaderivedtypesimplyremovesallelementsthatbelong tothattype.Theinversefunctioncalculatesthederivedtypesfrom theremaininggraph. Hencetheuniversewiththederivedtypeisequivalenttotheuniversewithoutit.

Provingthe7rules:Renaming,Reversing,PropertyDisplacement,...

Considertheexampleofrenaminganedgetype.Thisrulewasstatedinthelastsectionas:

Anyedgelabelcanberenamed,solongasthenewnamedoesn'trefertoanexisting edgelabelbetweentheoutandinvertextypes.

Wecanprovethisintwosteps:

Addderivededgetypewiththenewnameasacopyoftheoldedgetype. Removetheoldedgetypewhichisnowderivablefromthenewedgetype.

Ofcourse,step1requiresthattheedgetypewiththenewnamedoesn'talreadyexistinthe schema.Otherwise,alledgesoftheedgetypecan'tbederived.Hencethecondition"aslong asthenewnamedoesn'trefertoanexistingedgelabel."

Inthismanner,wecanproveeachrulebyperformingsomestepstofirstaddnewderived typesandthenremovetheexistingtypeswhichbecomederivedtypesthemselves.

Beyondtransformationrules

Thinkingintermsofgraphuniverses,derivedtypesandtransformationfunctionsletsusdo moreradicaltransformationstoourgraphmodel.Themannerwithwhichweapplytheserules ortransformationsdependsonouroverallstrategyfordatamodeling.

Onestrategyistominimizethenumberofimplicitconstraintsnotcapturedbytheschema.For instance,theschemashowninthesecondfiguredoesn'thavetheimplicitconstraintonthe'invitee' edgetypeshowninthefirstfigure.Generally,fewerimplicitconstraintsmeanslessduplicationof dataandlesschanceofbugswhileupdatingthedatabase.Thisissimilartonormalizationin relationaldatabases.

Adifferentstrategyistotunethegraphforitsspecificqueryingneeds.Suchapproacheshave beenpopularizedby"denormalization"techniquessuchasdimensionalmodeling.Forinstance, wecouldadda"shortcut"derivededgetypecalled'latest'from UsertoPagetoshowthelast createdpageforeachuser.Theimportantthingthenistoensurethatanychangetotherestof thegraphisaccuratelyreflectedinthederivedelementtypes.Thecodethatoperatesonthe graphmustbedesignedwiththeseconstraintsinmind.

Summary

Thissectionintroducedsettheoreticrepresentationsofgraphmodelscalledgraphuniverses, whicharemorepowerfulthangraphschemas.Secondly,thissectionshowedthattwograph universesareequivalentifthereisaninvertiblegraphtransformationfunctionbetweenthem. Finally,thissectionshowedthatallschematransformationrulespresentedintheearlier sectioncanbederivedfromonemetarulethatdealswithaddingandremovingderivedtypes.

Validatinggraphschemas

Thelastfew sectionshavediscussedhow propertygraphschemascanhelpdesigngraph databasesfrom ERmodelsandrefinethedatamodelthroughschemamanipulations.After readingthisthreadontheGremlinusersgroup,werealizedthatitiseasytovalidategraphs againstschemaswithGremlinandGroovy.

Fgure20:Tnkergraphschema

ThisgistonGithubshowshowyoucantakeaninstancegraphandchecktoseeifitiscompatible withaschemagraph.Theschemagraphhasverticesandedgescorrespondingtovertexandedge types.Here'sthecodetocreateaschemagraphinsideaGremlinshellfortheclassicTinkerpop schemashownhere:

sg=newTinkerGraph()

person=sg.addVertex()

person.setProperty('_label','person')

person.setProperty('name','java.lang.String')

person.setProperty('age','java.lang.Integer')

software=sg.addVertex()

software.setProperty('_label','software')

software.setProperty('name','java.lang.String')

software.setProperty('lang','java.lang.String')

knows=person.addEdge('knows',person)

knows.setProperty('weight','java.lang.Float')

created=person.addEdge('created',software)

created.setProperty('weight','java.lang.Float')

created.setProperty('_minIn',1)//Someonemustcreatethesoftware

ThepropertieshavevaluescorrespondingtotheJavaClassofthepropertyvaluesinthe instancegraph.Thepropertykeyscanendwith'?'toindicatethepropertyisoptional.The edgesintheschemagraphcanhave4specialproperties,viz._minIn,_maxIn,_minOutand _maxOuttoindicatecardinalityrestrictionsforvariousedgetypes.

Anyinstancegraph,g,canbevalidatedagainsttheschemastoredinsg,usingtheGremlin script:

g.V.filter({checkVertex(it,sg)})

YoucanlookatthefullGithubgisttoseehowthevalidationisdone.

ThecurrentversionofTinkerpopdoesn'tsupportvertexlabels.Sothemappingfrom the vertextothevertextypeisspecifictothegraph,likethis:

vertexType={v,sg>.age?sg.V('_label','person').next():sg.V('_label','software').next()}

Mostgraphschemastypicallyhaveapropertynamed'type'thatwouldmakethismapping easier.

HoweverwithTinkerpop3,thismethodcanbestandardizedtousethelabel: vertexType={v,sg>sg.V('label',v.label).next()}

Pixy:Firstorderlogicongraphdatabases

TheprevioussectionshaveshownthatanyERmodelcanbeconvertedtoapropertygraph schema,andthattheschemacanbenormalizedusingrules.However,onekeyquestion remains:

Dographdatabasesofferthesamequerngcapalitesasraonaldatabases?

Inotherwords,anydatathatfitsinanERmodelcanbestuffedintoagraphdatabase.But candatastuffedinthisfashionbequeriedeffectively?Thisisthesubjectofthissection.

Background

OnSQL

SQListhequerystandardforrelationaldatabases.Itfirstappearedinthe1970sandwas standardizedinthe80sand90s.ThetheoreticalfoundationofSQLisrelationalalgebra.Codd showedthatrelationalalgebraisequivalenttorelationalcalculus,aformoffirstorderlogic.His theoremisthebedrockofSQL'sexpressivepower.

Onfirstorderlogic

Usingrelationalalgebra,wecanwriteanyqueryoftheform"FindallrowsfromtablesA,B,C,..., matching somepredcat ",aslongasthepredicatecanbeexpressedinfirstorderlogic. Specifically,thepredicateisformedusing:

variouscomparisonsonrowsandcolumns, logicaloperations"and"(∧),"or"(∨)and"not"(¬),and

theuniversal"forevery"(∀)andexistential"thereexists"quantifiers(∃)thatoperateonrowsofagiventable.

Let'sconsidertablesnamedperson,carandticket.Wecouldexpressaquerylike"findme peoplewhoownonlyBMW cars,buthaveatleastonespeedingticket".Thepredicatecanbe writtenas:

my_query(person)=(∀car,personownscar∧car.make='BMW')∧(∃ticket,personhasticket)

OnGremlin

Gremlinisastandardgraphtraversallanguage.ItispartoftheTinkerpopstackandworks acrossallBlueprintscompatibledatabases.YoucanreadmoreaboutGremlinhere:

GremlinWikionGithub

GremlinDocs

ThePathologicalGremlin(presentation)

Gremlinisgreatforstepbasedqueries.Fore.g.,somethinglike"findthefriendofafriendof vertexv"canbewrittenasv.out('friend').out('friend').Thisstyleoftraversalwithverticesand edgesisn'tnaturalinSQLwithtuples.

ThedeclarativequeryingstyleofSQLis,however,differentfrom Gremlin.TheSQL2Gremlin tutorialgoesthroughsomeexamples.Butyoucanseethatthetranslationisn'tobvious.

Pixy:FirstorderlogicwithGremlin

Pixyisabridgefrom firstorderlogictoGremlin.ThefirstorderlogicofPixyoperateson verticesandedges.Wecanaskquestionslike"Findverticesandedgesthatmatchsome precat "wherethepredicateisformedby

variouscomparisonsonvertexandedgeproperties, logicaloperations"and"(∧),"or"(∨)and"not"(¬),and

theuniversal"forevery"(∀)andexistential"thereexists"quantifiers(∃)thatoperateon verticesandedges.

PixyqueriesareexpressedusingPrologrules,notSQL.RulesinPrologareexpressedasHorn clauses.

ProloglikeSQLhasthefullexpressivepoweroffirstorderlogic.

Let'stakethepredicatefromtheearlierdiscussion,

my_query(person)=(∀car,personownscar∧car.make='BMW')∧(∃ticket,personhasticket)

Let'ssaythatwerepresentpeopleasverticeswithoutgoingedgetypesnamed'car'and'ticket' toverticesrepresentingcarsandtickets.Now,wecouldexpresstheabovepredicateusing Hornclausesasfollows:

my_query(Person,Ticket):out(Person,'ticket',Ticket),

not(not_all_bmw(Person)).

not_all_bmw(Person):out(Person,'car',Car),

property(Car,'make',Make),

Make<>'BMW'.

NotethatoutandpropertyarepredefinedpredicatesinPixy.Youcanseethatthe ∃partofthequeryiseasy.Thisisamatteroffindinga ticket.The∀partofthequeryisimplementedusingtwonots.Inotherwords,saying"everycarisaBMW"isthesameassaying"thereisno carthatisn'taBMW".

ERmodelsinPixy

IfyouuseanERmodelasastartingpointforyourdesign,youcanreconstitutetheERmodel from thefinalgraphschemausingPixy.ConsiderthepreviouslyreferencedERmodelwith entitiesnamedUser,PageandTagandrelationshipsnamedOwns,InvitesandTaggedAs.

Fgure21:ERmodelforUserPageTagappcaon

Thiswastranslatedtoagraphschemawithfourtypesofvertices,viz.User,Page,Tagand Invitation.

Fgure 2:GraphschemaforUserPageTagappcaon

Now,wecanreconstitutetheERmodelfrom thegraphschemausingPixywiththefollowing clauses:

%Entities

user(User,Name,Login):property(User,'name',Name),property(User,'login',Login). page(Page,Uri,Html,CreateTs):property(Page,'uri',Uri),...

tag(Tag,Hashtag,Description):property(Tag,'hashtag',Hashtag),...

%Relationships

owns(User,Page):out(User,'owns',Page). taggedAs(Page,Tag):out(Page,'taggedas',Tag).

invites(Invitation,Inviter,Invitee,Page):

out(Invitation,'invitee',Invitee),

out(Invitation,'inviter',Inviter),

out(Invitation,'page',Page).

Everypredicatecorrespondstoanentityorarelationship.Thepredicateoperatesonvertices, edgesandpropertiesthatbelongtothegraphschema.Now,yougetthefullpowerof firstorderlogicontheERmodel.Inotherwords,anyfirstorderpredicatethatappliesto entitiesandrelationshipscanbewrittenasaPixyquerythatusestheaboveclauses.

Let'stakeanexamplepredicatethatmatchesallusersinvitedtopagestagged'tinkerpop' createdin2014.Youcouldexpressthisasfollows:

tinkerpop_invitee(User,Page):invites(_,_,User,Page),

page(Page,_,_,CreateTs),

CreateTs>1388534400L,%Unixtimestampfor1/1/2014

taggedAs(Page,Tag),

tag(Tag,'tinkerpop'). Notethat'_'isusedtorepresentanonymousvariables.

Queryrequirementsdon'tusuallymatterwhilemodeling

Itisn'tsurprisingthatqueriesinfirstorderlogiccanbecompiledtoGremlin,sinceGremlinis Turingcomplete.ThesurprisingthingisthatPixyconvertsanyfirstorderlogicqueryonan ERmodeltosomethingthatexecutes"efficiently"onthecorrespondinggraphdatabase.

By"efficiently",wemeanthatthePixy/Gremlinquerywillalwaystraverseedgestogofrom oneentity/relationshiptoanother.Edgetraversaloperationsingraphdatabasesare typicallyordersofmagnitudefasterthanindexbasedjoinsinrelationaldatabases.

Queriesonproperties,willofcourse,needindexesforefficientquerying.Butaslongasyour startingERmodelisaccurate,yourapplicationwillnothavetosimulatejoinsusingthese propertyindexes.Inthatsense,thegraphschemadesignisindependentofthequery requirements.

Conclusion

Inthisbook,ihaveshownthatthepropertygraphmodelisatheoreticalstrongmodelcapable ofbeingasflexibleandpowerfulastherelationalmodel.Specifically,wehaveshownthatthe threepillarsoftherelationalmodelarematchedbythepropertygraphmodel:

Expressivepower:ERdiagramscanbeconvertedtopropertygraphschemas.

Strongdesignguidelines,akanormalization:Graphschemascanbenormalizedusing the7normalizationrules,orthe1normalizationmetarule.

Apowerfulquerylanguage:Pixy,likeSQL,canmodelqueriesexpressibleinfirstorderlogic andconvertthesequeriestoefficient“JOINless”Gremlintraversals.

Tosummarize…

Fgure23:Agraphsummazngtsdocument

References

InternationalStandardISO/IEC13250TopicMaps,December1999.

S.Abiteboul.QueryingSemiStructuredData.InProc.ofthe6thInt.Conf. onDatabaseTheory(ICDT),volume1186ofLNCS,pages1–18. Springer,Jan1997.

S.AbiteboulandR.Hull.IFO:AFormalSemanticDatabaseModel.InProc. ofthe3thSymposium onPrinciplesofDatabaseSystems(PODS), pages119–132.ACMPress,1984.

S.Abiteboul,D.Quass,J.McHugh,J.Widom,andJ.L.Wiener.TheLorel querylanguageforsemistructured data.InternationalJournalon DigitalLibraries(JODL),1(1):68–88,1997.

S.AbiteboulandV.Vianu.QueriesandComputationontheWeb.InProc. ofthe6thInt.Conf.onDatabaseTheory(ICDT),volume1186ofLNCS, pages262–275.Springer,Jan1997.

R.AgrawalandH.V.Jagadish.EfficientSearchinVeryLargeDatabases.In Proc.ofthe14thInt.Conf.onVeryLargeDataBases(VLDB),pages 407–418.MorganKaufmann,AugSept1988.

R.AgrawalandH.V.Jagadish.MaterializationandIncrementalUpdateof PathInformation.InProc.ofthe5thInt.Conf.onDataEngineering(ICDE), pages374–383.IEEEComputerSociety,Feb1989.

R.AgrawalandH.V.Jagadish.AlgorithmsforSearchingMassiveGraphs. IEEE Transactions on Knowledge and Data Engineering (TKDE), 6(2):225–238,1994.

R.AlbertandA.L.Barabasi.Statisticalmechanicsofcomplexnetworks. ReviewsofModernPhysics,74:47,Jan2002.

N.Alechina,S.Demri,andM.deRijke.AModalPerspectiveonPathConstraints. JournalofLogicandComputation,13(6):939–956,2003.

B.AmannandM.Scholl.Gram:AGraphDataModelandQueryLanguage.In European Conference on HypertextTechnology (ECHT),pages 201–211.ACM,NovDec1992.

M.Andries and G.Engels.A Hybrid QueryLanguage foran Extended EntityRelationshipModel.TechnicalReportTR9315,InstituteofAdvanced ComputerScience,UniversiteitLeiden,May1993.

M.Andries,M.Gemis,J.Paredaens,I.Thyssens,andJ.V.denBussche. ConceptsforGraphOrientedObjectManipulation.InProc.ofthe3rdInt. Conf.onExtendingDatabaseTechnology(EDBT),volume580ofLNCS, pages21–38.Springer,March1992.

R.AnglesandC.Gutierrez.QueryingRDFDatafrom aGraphDatabase Perspective.InProc.2ndEuropeanSemanticWebConference(ESWC), number3532inLNCS,pages346–360,2005.

M.A.AufaurePortierand C.Tr´epied.A SurveyofQueryLanguages for GeographicInformationSystems.InProc.ofthe3rdInt.Workshopon InterfacestoDatabases,pages431–438,July1976.

M.AzmoodehandH.Du.GQL,AGraphicalQueryLanguageforSemantic Databases.InProc.ofthe4thInt.Conf.onScientificandStatistical Database Management(SSDBM),volume 339 ofLNCS,pages 259–277.Springer,June1988.

C.Beeri.DataModelsandLanguagesforDatabases.InProc.ofthe2ndInt. Conf.onDatabaseTheory(ICDT),volume326ofLNCS,pages19–40. Springer,AugSept1988.

G.Benk¨o,C.Flamm,andP.F.Stadler.A GraphBasedToyModelof Chemistry.JournalofChemicalInformationandComputerSciences (JCISD),43(1):1085–1093,Jan2003.

C.Berge.GraphsandHypergraphs.NorthHolland,Amsterdam,1973.

U.Brandes.NetworkAnalysis.Number3418inLNCS.SpringerVerlag,2005.

T.Bray,J.Paoli,andC.M.SperbergMcQueen.ExtensibleMarkupLanguage (XML) 1.0, W3C Recommendation 10 February 1998. http://www.w3.org/TR/1998/RECxml19980210.

A.Broder,R.Kumar,F.Maghoul,P.Raghavan,S.Rajagopalan,R.Stata,A. Tomkins,andJ.Wiener.GraphstructureintheWeb.In

Proc.ofthe9thInt.WorldWideWebconferenceonComputernetworks:the internationaljournalofcomputerandtelecommunicationsnetworking,pages 309–320.NorthHollandPublishingCo.,2000.

P.Buneman.SemistructuredData.InProc.ofthe16thSymposium on PrinciplesofDatabaseSystems(PODS),pages117–121.ACMPress, May1997.



Download



Copyright Disclaimer:
This site does not store any files on its server. We only index and link to content provided by other sites. Please contact the content providers to delete copyright contents if any and email us, we'll remove relevant links or contents immediately.