Graph Database Modeling by Thakur Kanika & Singh Ajit
Author:Thakur, Kanika & Singh, Ajit [Thakur, Kanika]
Language: eng
Format: epub
Publisher: UNKNOWN
Published: 2020-08-23T16:00:00+00:00
Thepreviousfigurebreaksupthetournamentwinnerstableintotwotables,onewithplayer detailsandonewiththetournamentdetails.Theactualruleson"functionaldependencies"and "nonprimeattributes"arehardtoremember,buttheprocessofsplittingandmergingtables comesintuitivelywithexperience.Forexample,iftherewasanexistingtablewhichhadone rowperplayer,we'dprobablymovethe"dateofbirth"tothattable.
Transformationrulesthatproduceequivalentschemas
Thissectionlistssometransformationrulesthatproduceequivalentgraphschemas.Agraph schemaisequven toanothergraphschemaifthedatastoredinoneschema,alongwith theapplicationsthataccessit,canbeportedtotheotherschema,andviceversa.Theserules arelikesplittingandmergingtablesinrelationalmodels.
Thetransformationrulesinthissectioncanbemechanicallyappliedtoanyschema,andhas nothingtodowithitssemantics.Byapplyingacombinationoftheserules,youcouldsimplify thesemanticsandimprovetheusabilityofyourgraphmodel.
RuleA:Renamingpropertiesandlabels
Thisruleconsistsofthreetransformationsthatresultinequivalentschemas:
Anyvertexlabelcanberenamed,solongasthenewnamedoesn'trefertoanexisting vertexlabel.
Anyedgelabelcanberenamed,solongasthenewnamedoesn'trefertoanexistingedge labelbetweentheoutandinvertextypes.
Anyvertex/edgepropertycanberenamedsolongthenewnamedoesn'trefertoan existingpropertyofthevertex/edgetype.
Thefollowingfigureillustratessomeexampleapplicationsofthisruleonvertexandedgelabels:
Fgure10:Renamngproperesandabes
Theschemashowninthetopisasimplegraphschemashowingfamilyrelationships.This schemaistransformedtotheschemashowninthebottom ofthefigureusingthe followingtransformations:
VertexlabelsManandWomanarerenamedtoMaleandFemale.
Edgelabelsmother(2instances),father(2instances)arerenamedtoparent.
Eventhoughitseemslikesomeinformationislostbyrenamingmother/fathertoparent,this isn'ttruebecausethevertexlabelsattheendpoints(Male/Female)havethatinformation.This sametransformationwouldn'tbesoobviouswhilelookingataninstanceofthisgraphlikethe Kennedyfamilytree.
Notethatyoucannotrename'wife'to'parent'inthebottomschema.Thisisbecausethere alreadyexistsaparentedgetypefromMaletoFemale.
RuleB:Reversingedgedirections
Thisrulestatesthatanedgetypecanbereversedprovideditisaselfloop,orthereisnoedge typewiththesamelabelinthereversedirection.Thecardinalityoftheedgetypeisreversedas well.
Fgure 1:Reverngedgedrecons
Thefollowingfigureillustratesanexampletransformationusingthisruleandtheprevious one.Thetransformationinvolvesthefollowingsteps:
The'wife'edgeisrenamedto'husband'(ruleA)andthenreversed.
Eachparentedgeisrenamedto'son'or'daughter'andreversed.
Notethatthereversalisdoneinthegraphinstanceaswellastheschema.Inotherwords, JFKJrparent>JFKSr.becomesJFKSr.son>JFKSr.
Youcouldalwaysrenamethefour'son'and'daughter'edgetypes,to'child'usingruleA. Again,noinformationislostsincethevertexlabelsarestillunique.
Youwould,however,notbeabletorename'husband'to'son'.Youcouldrename'husband' to'daughter'(thoughabsurd).Theapplicationwillhavetointerpret"maledaughters"as husbands.Butafteryourenamehusbandtodaughter,youwouldnotbeabletoreverseits direction.
Asyoucanseealready,someapplicationsoftheserulesmaybequitehardtoderiveifyouare thinkingintermsofgraphinstances,ratherthangraphschemas.
RuleC:Propertydisplacement
Fgure12:Propertydsacement
Thisrulestatesthatapropertyonanedgetypecanbemovedtoeitheradjacentvertextype, provideditslookacrosscardinalityis1.Thereverserulestatesthatapropertyinavertextype canbemovedtoanadjacentedgetypewithlookacrosscardinalityof1,providedtheedge alwaysexistswhenthepropertyexists.
Theadjoiningfigureclarifiestherule,wherethe'dateOfBirth'propertyismovedtothe 'mother'relationshipbecausethereisexactlyonemotherrelationshipperMan/Womanandit isdefinedwhenthedateOfBirthisdefined.Ifyourename'dateOfBirth'to'deliveryDate',one couldarguethatthepropertybelongsintheedgeandnotthevertex.
NotethataMan'sdateOfBirthcannotbedisplacedtothewiferelationshipbecausethat wouldmeanthatthedataOfBirthcannotbestoredunlessthepersonismarried.Similarly, thedateOfBirthintheedgetypelabeled'mother'fromMantoWomaninthebottomschema, cannotbemovedtoWomanbecauseofthecardinalityrestrictionsintherule.
Usingthisrule,youcanmovethepropertiesaroundtheschematocomeupwithabetter lookingdesign.Thisruleisalsousefulinsatisfyingindexingrequirementsofvariousgraph databases.Forexample,ifagraphdatabaseonlysupportsindexesonvertexproperties,you couldmovesearchablepropertiesfromtheedgestovertices.Similarly,ifagraphdatabase supportsvertexcentricindexesbasedonpropertiesonadjacentedges/vertices,youcanuse thisruletobringtheindexedpropertyclosertothevertextypeofinterest.
RuleD:Specializationandgeneralization
Thisrulestatesthat:
AnyvertextypecanbedividedintotwodisjointvertextypesbasedonaBooleanteston thepropertiesandadjacentedgelabelsofavertexbelongingtothattype.
AnyedgetypecanbedividedintotwodisjointedgetypesbasedonaBooleanteston thepropertiesandadjacentvertexlabelsofanedgebelongingtothattype. Fgure13:Generizaon
Inotherwords,ifweprovideabooleanfunctionthatcangiveaT/Fresultgivenavertex/edge, wecanusethatfunctiontodivideavertex/edgetypeintotwodifferenttypes.
Thereverserulestatesthat:
Anyvertex/edgetypecanbemergedintoanothervertex/edgetypeprovidedthereisa Booleantestthatcandistinguishitsvertices/edgesfromthemergedvertices.
Theadjoiningfigureshowsanexampletransformationinvolvingthefollowingsteps:
MaleandFemalearegeneralizedasPerson,becausethebooleantest,sexequals'M', candistinguishMalefromFemale.
Afterthat,sonanddaughteredgetypesaregeneralizedaschildbecausethebooleantest, sexofinvertexequals'M',candistinguishsonfromdaughter.
Thisruleisusefulinincreasingthespecificity,orreducingthecomplexityofthegraphschema. Asageneralprinciple,itisbettertousethisruleforspecialization,we.e.,increasingthe specificity,becausethatallowsthedifferentvertexandedgetypestoembracedifferent behaviorintermsofpropertiesandadjacentedges.However,thereareinstanceswherethe differencesbetweenthevertextypesaresominorthatspecializationonlyresultsinapplication complexity.ThisargumentcouldapplytotheabovegeneralizationofMaleandFemaleto Person.
RuleE:Edgepromotion
Fgure14:Edgepromoon
Thisrulestatesthatanedgetypecanbe promoted toavertextypebyaddingtwo"out"edge typestotheendpoints.Thepropertiesofthevertextypebecomepropertiesoftheedgetype. ThecardinalityofthenewedgetypesareN:1or1:1dependingonthelookacrosscardinalityof theoriginalendpointvertex'stype.
Notethatthedirectionofthenewedgetypescanbechangedusingontherenameand reverserulesresp.Weonlymentionthe"out"directiontosimplifythewayinwhich cardinalityforthenewedgestypesisderived.
Theadjoiningfigureshowsthehusbandedgepromotedtoavertextypecalled'Marriage'. Theedgetypes'husband'and'wife'pointtothetwoendpointsofthevertextype.
TheedgepromotionruleisusefulinapreparingbinaryrelationshiptobecomeanNary relationship.
Thereverserulestatesthatanyvertextypewithtwopropertylessedgetypes,withsameside cardinalityofexactly1,canbedemoted toanedgebetweentheadjacentvertices.Thisprocess isusefultosimplifyschemas.Youcanusethepropertydisplacementrule(ruleC)tomove propertiesoutofedges.
RuleF:Propertypromotion
Fgure15:Propertypromoon
Thisrulestatesthatanygroupofpropertiescanbepromotedtoanewvertextypewiththose properties,providedthenewvertextypehasedgesconnectingittoallexistingvertextypesthat includethepropertygroup.Thesamesidecardinalityofthenewedgetypeis1.
Theadjoiningfigureshowsthe'sex'propertyconvertedtoanewvertextype.Thisvertextype willhaveexactlytwonodescorrespondingtomaleandfemale.Soinotherwords,everyperson inthenewgraphwillhaveanoutgoing'isa'edgetooneofthetwonewvertices.
Thisruleisequivalenttothesplittingofarelationintotworelations,asshowninthefirst figureofthissection.Anygroupofproperties,typicallyonesthatrepeat,canbepromotedtoa vertex.
Whileapplyingthisrule,itisbettertoincludeallvertextypesthathavethesamegroupof properties.Forexample,ifthereisa'sex'propertyinadifferentAnimaltype,itisbetterto pointthattothenewSexvertextypeaswell.Ifyouhaveedgetypeswiththepropertygroup, youcanfirstpromotethoseedgetypestovertices.
Thereverseofthisruleisthatavertextypethathaspropertylessedgetypeswithsameside cardinalityof1,canbedemotedtothegroupofpropertiesthatitholds.Thesepropertiesmust beaddedtoeveryadjacentvertextype.Thisistheequivalentofthedenormalizationofatable intherelationalmodel,whichisusefultoreducethenumberofjoins(ortraversalsinthecase ofgraphdatabases).
RuleG:Propertyexpansion
Fgure16:Propertyexpanson
Thisrulestatesthatapropertyofavertextypethatrepresentsalistofvaluescanbemoved toaseparatevertextypewhichstoreseachvalue.Thenewvertextypemusthavean"in" edgetypefromtheexistingvertextypewithcardinality1:N.Theadjoiningfigureshowsthis ruleappliedtothenicknamepropertywhichholdsalistofStrings.
Thereverserulestatesthatanyvertextypewithexactlyonepropertylessedgetypewith lookacrosscardinalityofexactly1canberemovedaftermovingitspropertiestoalistinthe adjacentvertextype.
Thisistheequivalentof1NFintherelationalmodel.Unlikerelationaldatabases,however,many graphdatabasessupportlistsasavalidtypeforpropertyvalues.Sothechoiceofstoring nicknamesasaListoraseparatevertextypeisuptothedesigner.
Summary
Rulebasedschematransformationsaretoolsthatadatamodeldesignercanusetorewritea graphschema,withoutlosinganyinformationintheprocess.Inotherwords,adatamodel designercanusetheserulestoselectthedirectionsofedges,thenamesofdifferentlabels andkeys,thelocationsofvariousproperties,andsoon.Thesechangesdon'tmatterfroman pureinformationperspective,butcouldmakeabigdifferenceintheusabilityandefficiency. Inthatsense,adatamodeldesignercangobacktoCodd'soriginalgoalsfornormalization designingschemasthatareeasytomodify,easytoextend,informativetousersand supportiveofvariousquerypatterns.
Chapter5
One metare fornormalization
Theprevioussectionlistedsevenrulebasedschematransformationssuchasrenaminglabels, reversingedges,promotingedgesandpropertiestovertices,andsoon.Suchrulebased transformationscanbemechanicallyappliedto anygraphschema,withoutlosingany informationintheprocess.Usingtheserules,agraphdatabasedesignercanstartwitha designgeneratedfromanentityrelationshipmodelandtweakittogetafinaldesign.
Thissectiondescribesasingle metare from whichthesevenpreviouslydescribed rulescanbederived.Italsoformalizessomeoftheideaspresentedintheprevious sectionsusingsettheory.
Schemasandconstraints
Fgure17:Exampegraphschema
Theabovefigureshowsanexamplegraphschemadescribingconstraintsonthegraphdata modelsuchas:
Whatarethelegallabelsforvertices?
Whatarethelegaledgelabelsbetweentwovertextypes?
Whatarethelegalpropertykeysandvaluetypesateachedgeorvertextype?
Thereality,however,isthatagraphmodelcouldhaveotherconstraintsthataren'texpressed intheschema.Forexample,the'inviter'edgeineveryInvitationmustbetotheUserwhohas an'owns'edgetothe'page'edgeoftheInvitation.Thisconstraintisn'tcapturedintheabove schema.
Thequestionis:Howcanwemodelcompexconstrntsnagraphmode?
Graphuniverses,transformationsandequivalence
Agraphunverse Uisasetofgraphs,typicallyaninfiniteset.Agraphuniverserepresentsa datamodelinthesensethatitcaptureseveryvalidgraphthatbelongstothedatamodel.
AgraphuniverseUis compabe withagraphschemaS,ifeverygraphintheuniverseis compatiblewithS.Inotherwords,althoughthegraphuniverseisaprecisedescriptionofthe model,itcanstillbeunderstoodasarefinementofamorelooselydefinedgraphschema.
Redenngequvenceusngtransformaonfuncons
Fgure18:Annverbefuncon
AgraphtransformationTisafunctionthattakesgraphsfromoneuniverseUto anotheruniverseV.Inshort,T:U→ V.
AuniverseUisequivalenttoauniverseVifthereisatransformationfunctionT:U → V, whereTisinvertible.Invertibleandbijectivearetermstocharacterizefunctionsthat establishaonetoonecorrespondencebetweentwosets,whichinthiscasearegraph universes.
Inotherwords,givenanygraphG∈U,wecanuseT(G)togetagraphG'∈V.Thenwecanusethe inversefunctionT1 (G')togetbackG.Henceestablishingequivalenceofthetwouniverses.
Aprogrammngperspecve
Ifweareupgradingfrom onegraphmodeltoanother,thetransformationfunctionisthe upgradescp thatwewouldimplementtomovetothenewmodel.Ifwecanalsowritea downgradescp ,thenwehavetwoequivalentmodels(oruniverses).Inotherwords,two graphmodels,representedasuniversesorschemas,areequivalentiftheyareforwardand backwardcompabe .
Derivedtypes
ConsideragraphuniverseUthatiscompatiblewithaschemaS.AvertextypeinScanbecalleda devedvertextypenU ,ifeverygraphG∈Uissuchthatitsvertices(andadjacentedges)belongingto thevertextypecanbecalculatedfromtherestofthegraph.
Inotherwords,givenanygraphintheuniverseU,afterweremoveallverticescorrespondingtothe derivedvertextype,thereshouldbeawaytocalculatethoseverticesagain.Derivededgeand propertytypescanbedefinedsimilarly.Notethatallderivedelementtypesaredefinedingraph schemas,butarespecifictographuniversesthatarecompatiblewiththatschema.
Metarule:Addingandremovingderivedtypes
Finally,hereisthemetarulebehindallschematransformations:
GivenanygraphuniverseUcompatiblewithaschemaS,wecanaddaderivedvertex/edge/propertytypetoproducean equivalentgraphuniverseVcompatiblewiththeschemaS∪{derivedtype}.
Thereverserulestatesthat:
GivenanygraphuniverseUcompatiblewithaschemaS,wecanremoveaderived vertex/edge/propertytypetoproduceanequivalentgraphuniverseVcompatible withtheschemaS{derivedtype}.
Fgure19:Modfedgraphschema
The'invitee'edgetypeinthegraphschemashowninthefirstfigureisaderivededgetype. Thisisbecausethe'invitee'edgescanbecalculatedbygoingfrom theInvitationverticesto thePageandbacktotheuserthrough'owns'edge(reversedirection).Wecansimplifythe originalschematotheversionshownintheadjoiningfigurebyapplyingtheserules:
(Metarule)Removederivededgetype'invitee'
(Edgepromotion)DemotethebinaryrelationshipInvitationtoanedgecalled'invited'.
Asyoucansee,theupdatedschemaissimplerthantheoriginalschemaderivedfrom anER diagram.
Provingthemetarule
Themetaruleiseasyto provebecauseofthewayderivedtypesaredefined.The transformationfunctiontoremoveaderivedtypesimplyremovesallelementsthatbelong tothattype.Theinversefunctioncalculatesthederivedtypesfrom theremaininggraph. Hencetheuniversewiththederivedtypeisequivalenttotheuniversewithoutit.
Provingthe7rules:Renaming,Reversing,PropertyDisplacement,...
Considertheexampleofrenaminganedgetype.Thisrulewasstatedinthelastsectionas:
Anyedgelabelcanberenamed,solongasthenewnamedoesn'trefertoanexisting edgelabelbetweentheoutandinvertextypes.
Wecanprovethisintwosteps:
Addderivededgetypewiththenewnameasacopyoftheoldedgetype. Removetheoldedgetypewhichisnowderivablefromthenewedgetype.
Ofcourse,step1requiresthattheedgetypewiththenewnamedoesn'talreadyexistinthe schema.Otherwise,alledgesoftheedgetypecan'tbederived.Hencethecondition"aslong asthenewnamedoesn'trefertoanexistingedgelabel."
Inthismanner,wecanproveeachrulebyperformingsomestepstofirstaddnewderived typesandthenremovetheexistingtypeswhichbecomederivedtypesthemselves.
Beyondtransformationrules
Thinkingintermsofgraphuniverses,derivedtypesandtransformationfunctionsletsusdo moreradicaltransformationstoourgraphmodel.Themannerwithwhichweapplytheserules ortransformationsdependsonouroverallstrategyfordatamodeling.
Onestrategyistominimizethenumberofimplicitconstraintsnotcapturedbytheschema.For instance,theschemashowninthesecondfiguredoesn'thavetheimplicitconstraintonthe'invitee' edgetypeshowninthefirstfigure.Generally,fewerimplicitconstraintsmeanslessduplicationof dataandlesschanceofbugswhileupdatingthedatabase.Thisissimilartonormalizationin relationaldatabases.
Adifferentstrategyistotunethegraphforitsspecificqueryingneeds.Suchapproacheshave beenpopularizedby"denormalization"techniquessuchasdimensionalmodeling.Forinstance, wecouldadda"shortcut"derivededgetypecalled'latest'from UsertoPagetoshowthelast createdpageforeachuser.Theimportantthingthenistoensurethatanychangetotherestof thegraphisaccuratelyreflectedinthederivedelementtypes.Thecodethatoperatesonthe graphmustbedesignedwiththeseconstraintsinmind.
Summary
Thissectionintroducedsettheoreticrepresentationsofgraphmodelscalledgraphuniverses, whicharemorepowerfulthangraphschemas.Secondly,thissectionshowedthattwograph universesareequivalentifthereisaninvertiblegraphtransformationfunctionbetweenthem. Finally,thissectionshowedthatallschematransformationrulespresentedintheearlier sectioncanbederivedfromonemetarulethatdealswithaddingandremovingderivedtypes.
Validatinggraphschemas
Thelastfew sectionshavediscussedhow propertygraphschemascanhelpdesigngraph databasesfrom ERmodelsandrefinethedatamodelthroughschemamanipulations.After readingthisthreadontheGremlinusersgroup,werealizedthatitiseasytovalidategraphs againstschemaswithGremlinandGroovy.
Fgure20:Tnkergraphschema
ThisgistonGithubshowshowyoucantakeaninstancegraphandchecktoseeifitiscompatible withaschemagraph.Theschemagraphhasverticesandedgescorrespondingtovertexandedge types.Here'sthecodetocreateaschemagraphinsideaGremlinshellfortheclassicTinkerpop schemashownhere:
sg=newTinkerGraph()
person=sg.addVertex()
person.setProperty('_label','person')
person.setProperty('name','java.lang.String')
person.setProperty('age','java.lang.Integer')
software=sg.addVertex()
software.setProperty('_label','software')
software.setProperty('name','java.lang.String')
software.setProperty('lang','java.lang.String')
knows=person.addEdge('knows',person)
knows.setProperty('weight','java.lang.Float')
created=person.addEdge('created',software)
created.setProperty('weight','java.lang.Float')
created.setProperty('_minIn',1)//Someonemustcreatethesoftware
ThepropertieshavevaluescorrespondingtotheJavaClassofthepropertyvaluesinthe instancegraph.Thepropertykeyscanendwith'?'toindicatethepropertyisoptional.The edgesintheschemagraphcanhave4specialproperties,viz._minIn,_maxIn,_minOutand _maxOuttoindicatecardinalityrestrictionsforvariousedgetypes.
Anyinstancegraph,g,canbevalidatedagainsttheschemastoredinsg,usingtheGremlin script:
g.V.filter({checkVertex(it,sg)})
YoucanlookatthefullGithubgisttoseehowthevalidationisdone.
ThecurrentversionofTinkerpopdoesn'tsupportvertexlabels.Sothemappingfrom the vertextothevertextypeisspecifictothegraph,likethis:
vertexType={v,sg>.age?sg.V('_label','person').next():sg.V('_label','software').next()}
Mostgraphschemastypicallyhaveapropertynamed'type'thatwouldmakethismapping easier.
HoweverwithTinkerpop3,thismethodcanbestandardizedtousethelabel: vertexType={v,sg>sg.V('label',v.label).next()}
Pixy:Firstorderlogicongraphdatabases
TheprevioussectionshaveshownthatanyERmodelcanbeconvertedtoapropertygraph schema,andthattheschemacanbenormalizedusingrules.However,onekeyquestion remains:
Dographdatabasesofferthesamequerngcapalitesasraonaldatabases?
Inotherwords,anydatathatfitsinanERmodelcanbestuffedintoagraphdatabase.But candatastuffedinthisfashionbequeriedeffectively?Thisisthesubjectofthissection.
Background
OnSQL
SQListhequerystandardforrelationaldatabases.Itfirstappearedinthe1970sandwas standardizedinthe80sand90s.ThetheoreticalfoundationofSQLisrelationalalgebra.Codd showedthatrelationalalgebraisequivalenttorelationalcalculus,aformoffirstorderlogic.His theoremisthebedrockofSQL'sexpressivepower.
Onfirstorderlogic
Usingrelationalalgebra,wecanwriteanyqueryoftheform"FindallrowsfromtablesA,B,C,..., matching somepredcat ",aslongasthepredicatecanbeexpressedinfirstorderlogic. Specifically,thepredicateisformedusing:
variouscomparisonsonrowsandcolumns, logicaloperations"and"(∧),"or"(∨)and"not"(¬),and
theuniversal"forevery"(∀)andexistential"thereexists"quantifiers(∃)thatoperateonrowsofagiventable.
Let'sconsidertablesnamedperson,carandticket.Wecouldexpressaquerylike"findme peoplewhoownonlyBMW cars,buthaveatleastonespeedingticket".Thepredicatecanbe writtenas:
my_query(person)=(∀car,personownscar∧car.make='BMW')∧(∃ticket,personhasticket)
OnGremlin
Gremlinisastandardgraphtraversallanguage.ItispartoftheTinkerpopstackandworks acrossallBlueprintscompatibledatabases.YoucanreadmoreaboutGremlinhere:
GremlinWikionGithub
GremlinDocs
ThePathologicalGremlin(presentation)
Gremlinisgreatforstepbasedqueries.Fore.g.,somethinglike"findthefriendofafriendof vertexv"canbewrittenasv.out('friend').out('friend').Thisstyleoftraversalwithverticesand edgesisn'tnaturalinSQLwithtuples.
ThedeclarativequeryingstyleofSQLis,however,differentfrom Gremlin.TheSQL2Gremlin tutorialgoesthroughsomeexamples.Butyoucanseethatthetranslationisn'tobvious.
Pixy:FirstorderlogicwithGremlin
Pixyisabridgefrom firstorderlogictoGremlin.ThefirstorderlogicofPixyoperateson verticesandedges.Wecanaskquestionslike"Findverticesandedgesthatmatchsome precat "wherethepredicateisformedby
variouscomparisonsonvertexandedgeproperties, logicaloperations"and"(∧),"or"(∨)and"not"(¬),and
theuniversal"forevery"(∀)andexistential"thereexists"quantifiers(∃)thatoperateon verticesandedges.
PixyqueriesareexpressedusingPrologrules,notSQL.RulesinPrologareexpressedasHorn clauses.
ProloglikeSQLhasthefullexpressivepoweroffirstorderlogic.
Let'stakethepredicatefromtheearlierdiscussion,
my_query(person)=(∀car,personownscar∧car.make='BMW')∧(∃ticket,personhasticket)
Let'ssaythatwerepresentpeopleasverticeswithoutgoingedgetypesnamed'car'and'ticket' toverticesrepresentingcarsandtickets.Now,wecouldexpresstheabovepredicateusing Hornclausesasfollows:
my_query(Person,Ticket):out(Person,'ticket',Ticket),
not(not_all_bmw(Person)).
not_all_bmw(Person):out(Person,'car',Car),
property(Car,'make',Make),
Make<>'BMW'.
NotethatoutandpropertyarepredefinedpredicatesinPixy.Youcanseethatthe ∃partofthequeryiseasy.Thisisamatteroffindinga ticket.The∀partofthequeryisimplementedusingtwonots.Inotherwords,saying"everycarisaBMW"isthesameassaying"thereisno carthatisn'taBMW".
ERmodelsinPixy
IfyouuseanERmodelasastartingpointforyourdesign,youcanreconstitutetheERmodel from thefinalgraphschemausingPixy.ConsiderthepreviouslyreferencedERmodelwith entitiesnamedUser,PageandTagandrelationshipsnamedOwns,InvitesandTaggedAs.
Fgure21:ERmodelforUserPageTagappcaon
Thiswastranslatedtoagraphschemawithfourtypesofvertices,viz.User,Page,Tagand Invitation.
Fgure 2:GraphschemaforUserPageTagappcaon
Now,wecanreconstitutetheERmodelfrom thegraphschemausingPixywiththefollowing clauses:
%Entities
user(User,Name,Login):property(User,'name',Name),property(User,'login',Login). page(Page,Uri,Html,CreateTs):property(Page,'uri',Uri),...
tag(Tag,Hashtag,Description):property(Tag,'hashtag',Hashtag),...
%Relationships
owns(User,Page):out(User,'owns',Page). taggedAs(Page,Tag):out(Page,'taggedas',Tag).
invites(Invitation,Inviter,Invitee,Page):
out(Invitation,'invitee',Invitee),
out(Invitation,'inviter',Inviter),
out(Invitation,'page',Page).
Everypredicatecorrespondstoanentityorarelationship.Thepredicateoperatesonvertices, edgesandpropertiesthatbelongtothegraphschema.Now,yougetthefullpowerof firstorderlogicontheERmodel.Inotherwords,anyfirstorderpredicatethatappliesto entitiesandrelationshipscanbewrittenasaPixyquerythatusestheaboveclauses.
Let'stakeanexamplepredicatethatmatchesallusersinvitedtopagestagged'tinkerpop' createdin2014.Youcouldexpressthisasfollows:
tinkerpop_invitee(User,Page):invites(_,_,User,Page),
page(Page,_,_,CreateTs),
CreateTs>1388534400L,%Unixtimestampfor1/1/2014
taggedAs(Page,Tag),
tag(Tag,'tinkerpop'). Notethat'_'isusedtorepresentanonymousvariables.
Queryrequirementsdon'tusuallymatterwhilemodeling
Itisn'tsurprisingthatqueriesinfirstorderlogiccanbecompiledtoGremlin,sinceGremlinis Turingcomplete.ThesurprisingthingisthatPixyconvertsanyfirstorderlogicqueryonan ERmodeltosomethingthatexecutes"efficiently"onthecorrespondinggraphdatabase.
By"efficiently",wemeanthatthePixy/Gremlinquerywillalwaystraverseedgestogofrom oneentity/relationshiptoanother.Edgetraversaloperationsingraphdatabasesare typicallyordersofmagnitudefasterthanindexbasedjoinsinrelationaldatabases.
Queriesonproperties,willofcourse,needindexesforefficientquerying.Butaslongasyour startingERmodelisaccurate,yourapplicationwillnothavetosimulatejoinsusingthese propertyindexes.Inthatsense,thegraphschemadesignisindependentofthequery requirements.
Conclusion
Inthisbook,ihaveshownthatthepropertygraphmodelisatheoreticalstrongmodelcapable ofbeingasflexibleandpowerfulastherelationalmodel.Specifically,wehaveshownthatthe threepillarsoftherelationalmodelarematchedbythepropertygraphmodel:
Expressivepower:ERdiagramscanbeconvertedtopropertygraphschemas.
Strongdesignguidelines,akanormalization:Graphschemascanbenormalizedusing the7normalizationrules,orthe1normalizationmetarule.
Apowerfulquerylanguage:Pixy,likeSQL,canmodelqueriesexpressibleinfirstorderlogic andconvertthesequeriestoefficient“JOINless”Gremlintraversals.
Tosummarize…
Fgure23:Agraphsummazngtsdocument
References
InternationalStandardISO/IEC13250TopicMaps,December1999.
S.Abiteboul.QueryingSemiStructuredData.InProc.ofthe6thInt.Conf. onDatabaseTheory(ICDT),volume1186ofLNCS,pages1–18. Springer,Jan1997.
S.AbiteboulandR.Hull.IFO:AFormalSemanticDatabaseModel.InProc. ofthe3thSymposium onPrinciplesofDatabaseSystems(PODS), pages119–132.ACMPress,1984.
S.Abiteboul,D.Quass,J.McHugh,J.Widom,andJ.L.Wiener.TheLorel querylanguageforsemistructured data.InternationalJournalon DigitalLibraries(JODL),1(1):68–88,1997.
S.AbiteboulandV.Vianu.QueriesandComputationontheWeb.InProc. ofthe6thInt.Conf.onDatabaseTheory(ICDT),volume1186ofLNCS, pages262–275.Springer,Jan1997.
R.AgrawalandH.V.Jagadish.EfficientSearchinVeryLargeDatabases.In Proc.ofthe14thInt.Conf.onVeryLargeDataBases(VLDB),pages 407–418.MorganKaufmann,AugSept1988.
R.AgrawalandH.V.Jagadish.MaterializationandIncrementalUpdateof PathInformation.InProc.ofthe5thInt.Conf.onDataEngineering(ICDE), pages374–383.IEEEComputerSociety,Feb1989.
R.AgrawalandH.V.Jagadish.AlgorithmsforSearchingMassiveGraphs. IEEE Transactions on Knowledge and Data Engineering (TKDE), 6(2):225–238,1994.
R.AlbertandA.L.Barabasi.Statisticalmechanicsofcomplexnetworks. ReviewsofModernPhysics,74:47,Jan2002.
N.Alechina,S.Demri,andM.deRijke.AModalPerspectiveonPathConstraints. JournalofLogicandComputation,13(6):939–956,2003.
B.AmannandM.Scholl.Gram:AGraphDataModelandQueryLanguage.In European Conference on HypertextTechnology (ECHT),pages 201–211.ACM,NovDec1992.
M.Andries and G.Engels.A Hybrid QueryLanguage foran Extended EntityRelationshipModel.TechnicalReportTR9315,InstituteofAdvanced ComputerScience,UniversiteitLeiden,May1993.
M.Andries,M.Gemis,J.Paredaens,I.Thyssens,andJ.V.denBussche. ConceptsforGraphOrientedObjectManipulation.InProc.ofthe3rdInt. Conf.onExtendingDatabaseTechnology(EDBT),volume580ofLNCS, pages21–38.Springer,March1992.
R.AnglesandC.Gutierrez.QueryingRDFDatafrom aGraphDatabase Perspective.InProc.2ndEuropeanSemanticWebConference(ESWC), number3532inLNCS,pages346–360,2005.
M.A.AufaurePortierand C.Tr´epied.A SurveyofQueryLanguages for GeographicInformationSystems.InProc.ofthe3rdInt.Workshopon InterfacestoDatabases,pages431–438,July1976.
M.AzmoodehandH.Du.GQL,AGraphicalQueryLanguageforSemantic Databases.InProc.ofthe4thInt.Conf.onScientificandStatistical Database Management(SSDBM),volume 339 ofLNCS,pages 259–277.Springer,June1988.
C.Beeri.DataModelsandLanguagesforDatabases.InProc.ofthe2ndInt. Conf.onDatabaseTheory(ICDT),volume326ofLNCS,pages19–40. Springer,AugSept1988.
G.Benk¨o,C.Flamm,andP.F.Stadler.A GraphBasedToyModelof Chemistry.JournalofChemicalInformationandComputerSciences (JCISD),43(1):1085–1093,Jan2003.
C.Berge.GraphsandHypergraphs.NorthHolland,Amsterdam,1973.
U.Brandes.NetworkAnalysis.Number3418inLNCS.SpringerVerlag,2005.
T.Bray,J.Paoli,andC.M.SperbergMcQueen.ExtensibleMarkupLanguage (XML) 1.0, W3C Recommendation 10 February 1998. http://www.w3.org/TR/1998/RECxml19980210.
A.Broder,R.Kumar,F.Maghoul,P.Raghavan,S.Rajagopalan,R.Stata,A. Tomkins,andJ.Wiener.GraphstructureintheWeb.In
Proc.ofthe9thInt.WorldWideWebconferenceonComputernetworks:the internationaljournalofcomputerandtelecommunicationsnetworking,pages 309–320.NorthHollandPublishingCo.,2000.
P.Buneman.SemistructuredData.InProc.ofthe16thSymposium on PrinciplesofDatabaseSystems(PODS),pages117–121.ACMPress, May1997.
Download
This site does not store any files on its server. We only index and link to content provided by other sites. Please contact the content providers to delete copyright contents if any and email us, we'll remove relevant links or contents immediately.
Algorithms of the Intelligent Web by Haralambos Marmanis;Dmitry Babenko(8300)
Azure Data and AI Architect Handbook by Olivier Mertens & Breght Van Baelen(6739)
Building Statistical Models in Python by Huy Hoang Nguyen & Paul N Adams & Stuart J Miller(6715)
Serverless Machine Learning with Amazon Redshift ML by Debu Panda & Phil Bates & Bhanu Pittampally & Sumeet Joshi(6591)
Data Wrangling on AWS by Navnit Shukla | Sankar M | Sam Palani(6376)
Driving Data Quality with Data Contracts by Andrew Jones(6325)
Machine Learning Model Serving Patterns and Best Practices by Md Johirul Islam(6089)
Learning SQL by Alan Beaulieu(5995)
Weapons of Math Destruction by Cathy O'Neil(5779)
Big Data Analysis with Python by Ivan Marin(5363)
Data Engineering with dbt by Roberto Zagni(4361)
Solidity Programming Essentials by Ritesh Modi(4009)
Time Series Analysis with Python Cookbook by Tarek A. Atwan(3867)
Pandas Cookbook by Theodore Petrou(3578)
Blockchain Basics by Daniel Drescher(3294)
Hands-On Machine Learning for Algorithmic Trading by Stefan Jansen(2905)
Feature Store for Machine Learning by Jayanth Kumar M J(2814)
Learn T-SQL Querying by Pam Lahoud & Pedro Lopes(2796)
Mastering Python for Finance by Unknown(2744)
