Practical Data Science by Mario Rojas
Author: Mario Rojas
Language: eng
Format: epub, pdf
Publisher: UNKNOWN
Published: 2021-01-14T00:00:00+00:00
Kingdom Animalia, Class Mammalia, Order Carnivora, Family Canidae
Genus Canis
Species lupus familiaris
He is classified as Canis lupus familiaris.
You must look at an object as revealed by recently discovered knowledge. Now, I will guide you through a Python code session to convert the flat file into a graph with knowledge. The only information you have is in the file Animals.csv, in the directory ../VKHCG/03-Hillman/00-RawData.
The format is:
ItemLevel  ParentID  ItemID  ItemName
0          0         50      Bacteria
0          0         202422  Plantae
1          50        956096  Negibacteria
1          50        956097  Posibacteria
The fields have the following meanings:
• ItemLevel is how far the specific item is from the top node in the classification.
• ParentID is the ItemID for the parent of the item listed.
• ItemID is the unique identifier for the item.
• ItemName is the full name of the item.
The data fits together as a considerable tree of classifications. You must create a graph that gives you the following:
Bacteria -> Negibacteria and Bacteria -> Posibacteria
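As a minimal sketch of that target relationship, the following uses only the standard library rather than the pandas workflow that follows. The `SAMPLE` text and the `parent_child_pairs` helper are illustrative names, and the sketch assumes a comma-separated layout and that the four names pair with the four sample rows in order.

```python
import csv
import io

# Sample rows in the Animals.csv layout (assumed comma-separated here).
SAMPLE = """ItemLevel,ParentID,ItemID,ItemName
0,0,50,Bacteria
0,0,202422,Plantae
1,50,956096,Negibacteria
1,50,956097,Posibacteria
"""

def parent_child_pairs(text):
    """Return (parent name, child name) pairs from flat classification rows."""
    rows = list(csv.DictReader(io.StringIO(text)))
    # Map each ItemID to its name so a ParentID can be resolved to a name.
    names = {row["ItemID"]: row["ItemName"] for row in rows}
    return [
        (names[row["ParentID"]], row["ItemName"])
        for row in rows
        if row["ParentID"] in names  # top-level rows (ParentID 0) have no parent row
    ]

pairs = parent_child_pairs(SAMPLE)
for parent, child in pairs:
    print(parent, "->", child)
```

Rows whose ParentID does not match any ItemID are treated as top-level items and produce no edge.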
Following is the code to transform it. You will perform a few sections of data preparation and data storage for the Retrieve and Assess supersteps, and then we will complete the Process step into the data vault. You start with the standard framework, so please transfer the code to your Python editor. First, let's set up the data:
################################################################
# -*- coding: utf-8 -*-
################################################################
import sys
import os
import pandas as pd
import networkx as nx
import sqlite3 as sq
import numpy as np
################################################################
if sys.platform == 'linux':
    Base=os.path.expanduser('~') + '/VKHCG'
else:
    Base='C:/VKHCG'
print('################################')
print('Working Base :',Base, ' using ', sys.platform)
print('################################')
################################################################
ReaderCode='SuperDataScientist'
Please replace the 'Practical Data Scientist' in the next line with your name.
ReaderName='Practical Data Scientist'
You now set up the locations of all the deliverables of the code.
################################################################
Company='03-Hillman'
InputRawFileName='Animals.csv'
EDSRetrieveDir='01-Retrieve/01-EDS'
InputRetrieveDir=EDSRetrieveDir + '/02-Python'
InputRetrieveFileName='Retrieve_All_Animals.csv'
EDSAssessDir='02-Assess/01-EDS'
InputAssessDir=EDSAssessDir + '/02-Python'
InputAssessFileName='Assess_All_Animals.csv'
InputAssessGraphName='Assess_All_Animals.gml'
You now create the locations of all the deliverables of the code.
################################################################
sFileRetrieveDir=Base + '/' + Company + '/' + InputRetrieveDir
if not os.path.exists(sFileRetrieveDir):
    os.makedirs(sFileRetrieveDir)
################################################################
sFileAssessDir=Base + '/' + Company + '/' + InputAssessDir
if not os.path.exists(sFileAssessDir):
    os.makedirs(sFileAssessDir)
################################################################
sDataBaseDir=Base + '/' + Company + '/03-Process/SQLite'
if not os.path.exists(sDataBaseDir):
    os.makedirs(sDataBaseDir)
################################################################
sDatabaseName=sDataBaseDir + '/Hillman.db'
conn = sq.connect(sDatabaseName)
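The three directory checks follow the same pattern, so they could also be written as one loop. The following is only a sketch of that alternative, using `pathlib` in place of `os.path`; the `ensure_dirs` helper is a hypothetical name, and a temporary directory stands in for the VKHCG base so the example does not touch your real working folders.

```python
import tempfile
from pathlib import Path

def ensure_dirs(base, company, sub_dirs):
    """Create base/company/<sub_dir> for each entry and return the paths."""
    created = []
    for sub_dir in sub_dirs:
        path = Path(base) / company / sub_dir
        # exist_ok=True makes the call safe to repeat on later runs.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created

# A temporary directory stands in for the VKHCG base (an assumption for the demo).
with tempfile.TemporaryDirectory() as demo_base:
    dirs = ensure_dirs(
        demo_base,
        "03-Hillman",
        [
            "01-Retrieve/01-EDS/02-Python",
            "02-Assess/01-EDS/02-Python",
            "03-Process/SQLite",
        ],
    )
    all_exist = all(d.is_dir() for d in dirs)
    print(all_exist)
```

Passing `exist_ok=True` removes the need for a separate existence check before each call.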
################################################################
# Raw to Retrieve
################################################################
You upload the CSV file with the flat structure.
sFileName=Base + '/' + Company + '/00-RawData/' + InputRawFileName
print('###########')
print('Loading :',sFileName)
AnimalRaw=pd.read_csv(sFileName,header=0,low_memory=False, encoding = "ISO-8859-1")
AnimalRetrieve=AnimalRaw.copy()
print(AnimalRetrieve.shape)
################################################################
You store the Retrieve steps data now.
sFileName=sFileRetrieveDir + '/' + InputRetrieveFileName
print('###########')
print('Storing Retrieve :',sFileName)
AnimalRetrieve.to_csv(sFileName, index = False)
You store the Assess steps data now.
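################################################################
# Retrieve to Assess
################################################################
# Fill missing values with the string '0', then drop rows whose name is missing.
AnimalGood1=AnimalRetrieve.fillna('0', inplace=False)
# Compare against '0' (a string) to match the fill value; .copy() avoids
# pandas' SettingWithCopyWarning on the assignment below.
AnimalGood2=AnimalGood1[AnimalGood1.ItemName!='0'].copy()
AnimalGood2[['ItemID','ParentID']]=AnimalGood2[['ItemID','ParentID']].astype(np.int32)
AnimalAssess=AnimalGood2
print(AnimalAssess.shape)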
################################################################
sFileName=sFileAssessDir + '/' + InputAssessFileName
print('###########')
print('Storing Assess :',sFileName)
AnimalAssess.to_csv(sFileName, index = False)
################################################################
print('################')
sTable='All_Animals'
print('Storing :',sDatabaseName,' Table:',sTable)
AnimalAssess.to_sql(sTable, conn, if_exists="replace")
print('################')
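The Assess step fills missing values with a sentinel, drops rows that have no name, and casts the identifiers to integers. The sketch below mirrors that logic in plain Python as a stand-in for the pandas calls; the `clean_rows` helper and the three sample rows (including their missing values) are invented for illustration and are not taken from Animals.csv.

```python
def clean_rows(rows, sentinel="0"):
    """Fill missing values with a sentinel, drop unnamed rows, cast IDs to int."""
    cleaned = []
    for row in rows:
        filled = {
            key: (sentinel if value in (None, "") else value)
            for key, value in row.items()
        }
        # Compare against the same type that was used as the fill value.
        if filled["ItemName"] == sentinel:
            continue
        filled["ItemID"] = int(filled["ItemID"])
        filled["ParentID"] = int(filled["ParentID"])
        cleaned.append(filled)
    return cleaned

# Invented sample rows: one missing ParentID and one missing ItemName.
raw = [
    {"ItemLevel": "0", "ParentID": "", "ItemID": "50", "ItemName": "Bacteria"},
    {"ItemLevel": "1", "ParentID": "50", "ItemID": "956096", "ItemName": ""},
    {"ItemLevel": "1", "ParentID": "50", "ItemID": "956097", "ItemName": "Posibacteria"},
]
good = clean_rows(raw)
print(len(good), good[0]["ParentID"])
```

The key detail is that the sentinel and the value it is compared against share a type: a string fill value is never equal to the integer 0, so a mismatched comparison would silently keep every row.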
You start with the Process steps, to process the flat data into a graph. You can now extract the nodes, as follows:
################################################################
print('################')
sTable='All_Animals'
print('Loading Nodes :',sDatabaseName,' Table:',sTable)
sSQL=" SELECT DISTINCT"
sSQL=sSQL+ " CAST(ItemName AS VARCHAR(200)) AS NodeName,"
sSQL=sSQL+ " CAST(ItemLevel AS INT) AS NodeLevel"
sSQL=sSQL+ " FROM"
sSQL=sSQL+ " " + sTable + ";"
AnimalNodeData=pd.read_sql_query(sSQL, conn)
print(AnimalNodeData.shape)
You have now successfully extracted the nodes. Well done. You can now extract the edges. You will start with the Process step, to convert the data into an appropriate graph structure.
################################################################
print('################')
sTable='All_Animals'
print('Loading Edges :',sDatabaseName,' Table:',sTable)
sSQL=" SELECT DISTINCT"
sSQL=sSQL+ " CAST(A1.ItemName AS VARCHAR(200)) AS Node1,"
sSQL=sSQL+ " CAST(A2.ItemName AS VARCHAR(200)) AS Node2"
sSQL=sSQL+ " FROM"
sSQL=sSQL+ " " + sTable + " AS A1"
sSQL=sSQL+ " JOIN"
sSQL=sSQL+ " " + sTable + " AS A2"
sSQL=sSQL+ " ON"
sSQL=sSQL+ " A1.ItemID=A2.ParentID;"
AnimalEdgeData=pd.read_sql_query(sSQL, conn)
print(AnimalEdgeData.shape)
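To see what the self-join returns, here is a small sketch that runs an equivalent query against an in-memory SQLite database. It assumes the four sample rows from the file format description, with the names paired to the rows in order; the `children` mapping is an illustrative addition and not part of the original code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE All_Animals (ItemLevel INT, ParentID INT, ItemID INT, ItemName TEXT)"
)
conn.executemany(
    "INSERT INTO All_Animals VALUES (?, ?, ?, ?)",
    [
        (0, 0, 50, "Bacteria"),
        (0, 0, 202422, "Plantae"),
        (1, 50, 956096, "Negibacteria"),
        (1, 50, 956097, "Posibacteria"),
    ],
)

# Self-join: A1 is the parent row, A2 is any row whose ParentID points at it.
edge_sql = """
    SELECT DISTINCT A1.ItemName AS Node1, A2.ItemName AS Node2
    FROM All_Animals AS A1
    JOIN All_Animals AS A2 ON A1.ItemID = A2.ParentID
    ORDER BY Node1, Node2;
"""
edges = conn.execute(edge_sql).fetchall()

# Collect the edges into a parent -> children mapping.
children = {}
for parent, child in edges:
    children.setdefault(parent, []).append(child)

print(children)
conn.close()
```

Top-level rows carry a ParentID of 0, which matches no ItemID, so they appear only as parents (Node1) and never as children.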
You have now extracted the edges. So, let's build a graph.
################################################################
G=nx.Graph()
t=0
G.add_node('world', NodeName='World')
################################################################
You add the nodes first.
GraphData=AnimalNodeData
print(GraphData)
################################################################
m=GraphData.shape[0]
for i in range(m):
    t+=1
    sNode0Name=str(GraphData['NodeName'][i]).strip()
    print('Node :',t,' of ',m,sNode0Name)
    sNode0=sNode0Name.