Open
104 commits
92c4bd8
Made some changes in file mdb.py at line 10 and 11.
Dec 12, 2022
645e6ff
Created notes.txt, it contains notes for issue 1.
Dec 12, 2022
315dddc
working on issue 1 (NOT).
Dec 13, 2022
a94b852
Baby steps.
Dec 17, 2022
09deb24
notes.txt updated.
Dec 19, 2022
b981e40
added few lines in notes.txt
Dec 19, 2022
a9adf80
removed some white lines.
Dec 19, 2022
f5aa070
Just updated some notes.
Jan 1, 2023
9ceb2ae
Updated evaluate_where_clause() method - not operator
Jan 1, 2023
ddcc912
Updated evaluate_where_clause() method - not, between, or, and operators
Jan 2, 2023
4098513
working on evaluate_where_clause
panagiota02 Jan 27, 2023
65df01c
working on evaluate_where_clause
panagiota02 Jan 27, 2023
2d2581d
Added some notes
panagiota02 Jan 27, 2023
27f8ee4
Added 'not' operator in evaluate_where_clause() and done with complex…
Feb 1, 2023
1317e7a
we removed useless code and completed 'not' operator.
kostas96674 Feb 1, 2023
db36975
completed query plan!
kostas96674 Feb 3, 2023
73dc618
making progress on mulitple query plans!
kostas96674 Feb 6, 2023
23615b1
added some notes.
Feb 6, 2023
1d9cbe3
working on execute_dic.
Feb 6, 2023
7583f40
working on where function.
Feb 6, 2023
f7e23d4
removed some white lines.
Feb 6, 2023
43b1108
Made some small changes in lines 194-196 and line 391.
Feb 6, 2023
3d14153
Added/removed some notes.
Feb 7, 2023
285b627
Made some changes to evaluate_where_clause.
Feb 7, 2023
217e6e7
nothing special, just reading the code.
Feb 7, 2023
d366029
Made some changes to evaluate_where_clause.
Feb 7, 2023
b9346f7
Changed 2 lines of code in evaluate_where_clause.
Feb 7, 2023
fea9c25
Changed 2 lines of code in evaluate_where_clause.
Feb 7, 2023
b58c96f
progress on optimiser
kostas96674 Feb 7, 2023
40a16ee
Merge pull request #1 from dimitrisstyl7/issue_1
dimitrisstyl7 Feb 7, 2023
ead43fb
Contains the function we may use for executing where_clause (working …
Feb 7, 2023
685b97e
Removed a function (plans changed).
Feb 7, 2023
2d10fee
Made some changes to _select_where().
Feb 7, 2023
46cee81
Working on select().
Feb 8, 2023
95d7ead
Working on _select_where()
Feb 8, 2023
a9806f7
Changed one if statement in execute_dic.
Feb 9, 2023
984fd65
Working on _select_where().
Feb 9, 2023
681b5bc
We made some changes in interpret() and added code in create_query_pl…
panagiota02 Feb 10, 2023
c839aa5
We changed the name of function update_table() to update().
panagiota02 Feb 10, 2023
5272eb3
We added a new function with name find_rows_by_condition(). Also, we …
panagiota02 Feb 10, 2023
ba048d8
We don't need it anymore!
Feb 10, 2023
7826a7b
Merge branch 'issue_1' into issue_1_dimitris
dimitrisstyl7 Feb 10, 2023
c9cdf8c
Merge pull request #2 from dimitrisstyl7/issue_1_dimitris
dimitrisstyl7 Feb 10, 2023
5015bc1
Removed some new lines (\n), for better code reading.
Feb 11, 2023
2f8d19f
Issue_1 completed succesfully!
Feb 11, 2023
83e1a97
Merge branch 'issue_3' into issue_1
dimitrisstyl7 Feb 12, 2023
eb356ea
Merge pull request #3 from dimitrisstyl7/issue_1
dimitrisstyl7 Feb 12, 2023
ca17610
made some changes
kostas96674 Feb 12, 2023
bf555ed
Merge branch 'issue_3' of https://github.com/dimitrisstyl7/miniDB int…
kostas96674 Feb 12, 2023
137449c
Removed the code which is not part of issue_1 (is part of issue_3) an…
Feb 15, 2023
beb636f
This file is not part of issue_1 (is part of issue_3), so I deleted it.
Feb 15, 2023
ba76a2e
Nothing special, just did few changes which don't afffect the code.
Feb 15, 2023
a8b3da7
Working on mdb.py, adding 'unique' constraint.
Feb 16, 2023
d3e4d6e
Working on 'unique' constraint.
Feb 16, 2023
9f14f8b
Added 'unique' constraint in create table 'instructor' (column 'name'…
Feb 16, 2023
268a987
Working on 'unique' constraint.
Feb 16, 2023
d4cffbc
Updated notes.
Feb 17, 2023
5eed51b
Create index added in function create_query_plan().
Feb 17, 2023
64210d2
Working on ' BTree index over unique
Feb 17, 2023
99ae198
almost finished optimiser
kostas96674 Feb 17, 2023
2967c81
We just tried to export a table.
Feb 17, 2023
c354b2b
Notes updated.
Feb 17, 2023
d10e6b0
Working on BTree index over unique (non-PK) columns - issue_2a.
Feb 17, 2023
37d8413
working on optimiser
kostas96674 Feb 18, 2023
79bd727
Merge branch 'issue_2' into issue_3
Feb 18, 2023
f4b326d
Merge mdb.py
Feb 18, 2023
6f980b1
Merge branch 'issue_3' of https://github.com/dimitrisstyl7/miniDB int…
Feb 18, 2023
55b55aa
Updated notes.
Feb 18, 2023
a7f05ce
Nothing special.
Feb 18, 2023
0494ae7
Working on issue_2.
Feb 18, 2023
bad89c1
Working on issue_2.
Feb 18, 2023
d1c9565
Working on issue_2a.
Feb 18, 2023
6e1dce7
Working on issue_2a.
Feb 18, 2023
5c4f1f4
working on optimiseeeer
kostas96674 Feb 18, 2023
bea605e
Merge remote-tracking branch 'origin/issue_2' into issue_3
Feb 18, 2023
51b3b39
Simple way to load, save and calculate statistics for each table added.
Feb 18, 2023
4b00205
Just moved some functions.
Feb 18, 2023
f80aa30
progress on optimiser
kostas96674 Feb 19, 2023
1b9c209
Merge branch 'issue_3' of https://github.com/dimitrisstyl7/miniDB int…
kostas96674 Feb 19, 2023
9ffb7de
Notes updated.
Feb 20, 2023
1e72cad
Working on extendible hashing (issue_2b).
Feb 20, 2023
7299127
Working on extendible hashing (issue_2b).
Feb 20, 2023
71801f7
Working on extendible hashing (issue_2b).
Feb 20, 2023
a5d1ccb
i think optimiser finished :)
kostas96674 Feb 20, 2023
78b0429
Notes updated.
Feb 20, 2023
f68458c
Nothing special.
Feb 20, 2023
53bd438
Issue_2b completed succesfully.
Feb 20, 2023
3229982
Working on issue_3, statistics.
Feb 20, 2023
ae27e1b
Merge pull request #5 from dimitrisstyl7/issue_2
dimitrisstyl7 Feb 20, 2023
80f250d
Issue_3 almost done.
Feb 20, 2023
ef18b3e
OPTIMISER FINIIIIISHED!!!!
kostas96674 Feb 20, 2023
7dacae0
Issue_3 finished!
Feb 20, 2023
82b01de
nothing special.
Feb 20, 2023
0dd2bf5
Delete person.csv
dimitrisstyl7 Feb 20, 2023
922c1b2
Merge pull request #6 from dimitrisstyl7/issue_3
dimitrisstyl7 Feb 20, 2023
cfe6469
Update mdb.py
dimitrisstyl7 Feb 20, 2023
9cd4140
Update query_plans.py
dimitrisstyl7 Feb 20, 2023
aad9f03
nothing special.
Feb 20, 2023
cf76310
Merge branch 'master' of https://github.com/dimitrisstyl7/miniDB
Feb 20, 2023
c08f11c
Delete notes.txt
dimitrisstyl7 Feb 20, 2023
ca6f37b
Update table.py
dimitrisstyl7 Feb 20, 2023
a0d7325
Code summary
dimitrisstyl7 Nov 3, 2023
ea80764
Update and rename summary.md to code_summary.md
dimitrisstyl7 Nov 3, 2023
7c9932e
Update code_summary.md
dimitrisstyl7 Nov 12, 2023
268 changes: 268 additions & 0 deletions code_summary.md

Large diffs are not rendered by default.

232 changes: 213 additions & 19 deletions mdb.py
@@ -5,10 +5,13 @@
import readline
import traceback
import shutil

sys.path.append('miniDB')

from database import Database
from table import Table
from miniDB.database import Database
from miniDB.table import Table
from miniDB.query_plans import multiple_query_plans
from miniDB.evaluate_query_plans import evaluate_query_plans
# art font is "big"
art = '''
_ _ _____ ____
@@ -20,6 +23,7 @@
'''



def search_between(s, first, last):
'''
Search in 's' for the substring that is between 'first' and 'last'
@@ -37,7 +41,6 @@ def in_paren(qsplit, ind):
'''
return qsplit[:ind].count('(')>qsplit[:ind].count(')')


def create_query_plan(query, keywords, action):
'''
Given a query, the set of keywords that we expect to be present and the overall action, return the query plan for this query.
@@ -66,8 +69,6 @@ def create_query_plan(query, keywords, action):
ql.pop(i+1)
kw_positions.append(i)
i+=1



for i in range(len(kw_in_query)-1):
dic[kw_in_query[i]] = ' '.join(ql[kw_positions[i]+1:kw_positions[i+1]])
@@ -78,6 +79,9 @@ def create_query_plan(query, keywords, action):
if action=='select':
dic = evaluate_from_clause(dic)

if dic['where'] is not None:
dic = evaluate_where_clause(dic)

if dic['distinct'] is not None:
dic['select'] = dic['distinct']
dic['distinct'] = True
@@ -96,16 +100,26 @@ def create_query_plan(query, keywords, action):
if action=='create table':
args = dic['create table'][dic['create table'].index('('):dic['create table'].index(')')+1]
dic['create table'] = dic['create table'].removesuffix(args).strip()
arg_nopk = args.replace('primary key', '')[1:-1]
arglist = [val.strip().split(' ') for val in arg_nopk.split(',')]
temp_args = args.replace('primary key', '')[1:-1] # remove primary key and parentheses
temp_args = temp_args.replace('unique', '') # remove unique
arglist = [val.strip().split(' ') for val in temp_args.split(',')]
dic['column_names'] = ','.join([val[0] for val in arglist])
dic['column_types'] = ','.join([val[1] for val in arglist])

if 'primary key' in args:
arglist = args[1:-1].split(' ')
dic['primary key'] = arglist[arglist.index('primary')-2]
else:
dic['primary key'] = None


if 'unique' in args:
arglist = args[1:-1].split(',')
arglist = [val.strip().split(' ') for val in arglist]
column_names = [val[0] for val in arglist if len(val)>2 and val[2]=='unique']
dic['unique_columns'] = ','.join(column_names)
else:
dic['unique_columns'] = None

if action=='import':
dic = {'import table' if key=='import' else key: val for key, val in dic.items()}
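
The `unique` handling in this hunk can be tried in isolation. The following is a minimal sketch, assuming column definitions of the form `name type [primary key|unique]`; `parse_unique_columns` is a hypothetical helper written for illustration and is not part of miniDB.

```python
def parse_unique_columns(args):
    """Return the comma-joined names of columns declared 'unique', or None.

    `args` is the parenthesised column list of a CREATE TABLE statement,
    e.g. '(id int primary key, name str unique)'.
    """
    if 'unique' not in args:
        return None
    # Drop the surrounding parentheses, then split into per-column token lists.
    columns = [col.strip().split(' ') for col in args[1:-1].split(',')]
    # A column counts as unique when its third token is the keyword 'unique'.
    names = [col[0] for col in columns if len(col) > 2 and col[2] == 'unique']
    return ','.join(names)


print(parse_unique_columns('(id int primary key, name str unique, dept str)'))
```

Like the code in the diff, this relies on a substring test for `unique`, so a column whose name merely contains that word would pass the first check and yield an empty string rather than `None`.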

@@ -121,9 +135,29 @@ def create_query_plan(query, keywords, action):
else:
dic['force'] = False

return dic

if action=='delete from':
if dic['where'] is not None:
dic = evaluate_where_clause(dic)
else:
dic['where'] = None

if action=='update':
if dic['where'] is not None:
dic = evaluate_where_clause(dic)
else:
dic['where'] = None

if action=='create index':
# Check if 'on' clause is not None and if is of the form 'table_name (column_name)'
if dic['on'] is not None and '(' in dic['on'] and ')' in dic['on'] and dic['on'].count('(') == dic['on'].count(')') == 1:
on_clause = dic['on'].split('(')
table_name = on_clause[0].strip()
column_name = on_clause[1][:-1].strip()
dic['on'] = { 'table_name': table_name, 'column_name': column_name }
else:
raise ValueError('\nWrong syntax: "on" clause must be of the form "table_name (column_name)"\n')

return dic
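
The `create index` branch above splits an `on` clause of the form `table_name (column_name)`. A minimal standalone sketch of that step follows; `parse_on_clause` is a hypothetical name, and the extra check that the clause ends with `)` is an assumption added here, not something the diff does.

```python
def parse_on_clause(on):
    """Split 'table_name (column_name)' into a two-key dict."""
    well_formed = (
        on is not None
        and on.count('(') == 1
        and on.count(')') == 1
        and on.rstrip().endswith(')')  # assumption: not checked in the diff
    )
    if not well_formed:
        raise ValueError('"on" clause must be of the form "table_name (column_name)"')
    table_part, column_part = on.split('(')
    return {
        'table_name': table_part.strip(),
        # Strip trailing whitespace, then the closing parenthesis.
        'column_name': column_part.rstrip()[:-1].strip(),
    }


print(parse_on_clause('instructor (name)'))
```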

def evaluate_from_clause(dic):
'''
@@ -148,7 +182,11 @@ def evaluate_from_clause(dic):
join_dic['join'] = 'inner'
join_dic['left'] = ' '.join(from_split[:join_idx])
join_dic['right'] = ' '.join(from_split[join_idx+1:on_idx])
join_dic['on'] = ''.join(from_split[on_idx+1:])
and_idx = [i+on_idx+1 for i,word in enumerate(from_split[on_idx+1:]) if word=='and']
if and_idx:
join_dic['on'] = {'and':{'left':' '.join(from_split[on_idx+1:and_idx[0]]),'right':' '.join(from_split[and_idx[0]+1:])}}
else:
join_dic['on'] = ''.join(from_split[on_idx+1:])

if join_dic['left'].startswith('(') and join_dic['left'].endswith(')'):
join_dic['left'] = interpret(join_dic['left'][1:-1].strip())
@@ -160,6 +198,144 @@ def evaluate_from_clause(dic):

return dic
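
The join change above turns an `on` clause containing `and` into a nested dictionary and leaves single conditions as a string. A minimal sketch of that behaviour, assuming the clause is already split on spaces; `parse_join_condition` is a hypothetical helper for illustration only.

```python
def parse_join_condition(tokens):
    """Turn the tokens after 'on' into a string or an {'and': {...}} dict."""
    and_positions = [i for i, word in enumerate(tokens) if word == 'and']
    if not and_positions:
        # Single condition: tokens are concatenated without spaces, as in the diff.
        return ''.join(tokens)
    first_and = and_positions[0]
    return {
        'and': {
            'left': ' '.join(tokens[:first_and]),
            'right': ' '.join(tokens[first_and + 1:]),
        }
    }


print(parse_join_condition('a.id = b.id and a.x > 5'.split(' ')))
```

Only the first `and` is used as a split point, so any further `and` stays inside the `right` string.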

def evaluate_where_clause(dic):
'''
Evaluate the part of the query (argument or subquery) that is supplied as the 'where' argument.
'''
def convert_list_to_dict(lst):
'''
Converting the list of dictionaries to the desired dictionary,
where each key ('OPR') will later be replaced with an operator ('and' or 'or').
'''
if len(lst) == 1:
return {'OPR': lst[0]}
else:
return {'OPR': {'left': lst[0]['left'], 'right': convert_list_to_dict(lst[1:])}}

def build_list(oprt_idx, where_split):
'''
Building a list of dictionaries, where each dictionary contains
the left and right sides of the operator ('and' or 'or').
'''
if oprt_idx:
oprt_dic = {}

if ' '.join(where_split[:oprt_idx[0]]).startswith('not '):
oprt_dic['left'] = ' '.join(where_split[:oprt_idx[0]])
else:
oprt_dic['left'] = ' '.join(where_split[oprt_idx[0]-1:oprt_idx[0]])
if oprt_dic['left'] == ')':
pos = where_split[:oprt_idx[0]].index('(')
oprt_dic['left'] = ' '.join(where_split[pos:oprt_idx[0]])
if 'between' in where_split[:oprt_idx[0]]:
btwn_idx = where_split[:oprt_idx[0]].index('between')
if not in_paren(where_split, btwn_idx):
raise ValueError('\nWrong syntax: "between" clause must be in parentheses.\n')

oprt_dic['left'] = evaluate_where_clause( { 'where': oprt_dic['left'] } )['where']
oprt_dic['right'] = ' '.join(where_split[oprt_idx[0]+1:])
oprt_dic['right'] = evaluate_where_clause( { 'where': oprt_dic['right'] } )['where']
List.append(oprt_dic)

def put_paren_in_oprt_and(where_split):
'''
Placing parentheses around the 'and' operator, if it is not already within parentheses, when an 'or' operator exists,
helps ensure the proper priority of the operators ('and' and 'or'). This also works for some 'between' clauses,
but it is recommended to always put parentheses around the 'between' clause.
'''
def find_idx():
'''
Finding the indices of the 'and' and 'or' operators.
'''
and_idx = [i for i,word in enumerate(where_split) if word=='and' and not in_paren(where_split,i)]
or_idx = [i for i,word in enumerate(where_split) if word=='or' and not in_paren(where_split,i)]
oprt_idx = and_idx + or_idx
oprt_idx.sort()
return oprt_idx, and_idx, or_idx

oprt_idx, and_idx, or_idx = find_idx()
previous_and = False
idx_for_paren = 0
while len(oprt_idx) > 0 and and_idx and or_idx:
idx = oprt_idx.pop(0)
if where_split[idx] == 'and' and not previous_and:
previous_and = True
where_split.insert(idx_for_paren, '(')
if len(oprt_idx) == 0:
where_split.append(')')
break
elif where_split[idx+2] == 'not':
where_split.insert(idx+4, ')')
else:
where_split.insert(idx+3, ')')
oprt_idx, and_idx, or_idx = find_idx()
elif where_split[idx] == 'and' and previous_and:
where_split.pop(idx-1)
where_split.insert(idx+1, ')')
else:
previous_and = False
idx_for_paren = idx+1
return where_split

where_split = put_paren_in_oprt_and(dic['where'].split(' '))

'''
not/between/and/or operators not in parentheses.
'''
not_idx = [i for i, word in enumerate(where_split) if word == 'not' and not in_paren(where_split, i)]
btwn_idx = [i for i, word in enumerate(where_split) if word == 'between' and not in_paren(where_split, i)]
oprt_idx = [i for i,word in enumerate(where_split) if (word=='or' or word=='and') and not in_paren(where_split,i)] # oprt_idx contains the indices of operators 'and' and 'or'.

'''
Checking if the 'where' clause is within parentheses or if it starts with
'not' operator outside of parentheses and is followed by parentheses.
'''
if not oprt_idx and (where_split[0] == '(' or where_split[0] == 'not' and where_split[1] == '(') and where_split[-1] == ')':
if not_idx:
dic['where'] = { 'not': evaluate_where_clause( { 'where': ' '.join(where_split).removeprefix('not ') } )['where'] }
return dic
else:
dic['where'] = evaluate_where_clause( { 'where': ' '.join(where_split[1:-1]) } )['where']
return dic

if btwn_idx and len(oprt_idx) == 1:
'''
If the 'between' operator exists and there is only one operator ('and'),
evaluate the 'between' clause and return the result.
'''
if len(where_split) < 5:
raise Exception('\nWrong syntax of "between" clause.\n')
dic['where'] = { 'column': where_split[btwn_idx[0]-1], 'between': evaluate_where_clause( { 'where': ' '.join(where_split[btwn_idx[0]+1:]) } )['where'] }
return dic

if oprt_idx:
'''
If there are operators ('and' or 'or'), we build a list of dictionaries and then convert it into a dictionary.
The 'OPR' string is used as a placeholder for the actual operator and then is replaced with the correct operator.
'''
List = []
build_list(oprt_idx, where_split)
oprt_dic = str(convert_list_to_dict(List))
oprt_words = [where_split[i] for i in oprt_idx]

'''
Replace the 'OPR' string with the actual operator ('and' or 'or').
'''
oprt_dic = oprt_dic.replace('OPR', oprt_words[0], 1)
dic['where'] = dict(eval(oprt_dic))
return dic

if not_idx:
'''
If the simple 'not' operator exists, create a dictionary with the 'not' operator
where the key is 'not' and the value is the right side of the 'not' operator.
'''
dic['where'] = {'not': where_split[not_idx[0]+1]}
return dic

dic['where'] = ''.join(where_split)
return dic
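
`evaluate_where_clause()` builds its nested dictionary by formatting a string and passing it to `eval`. As a point of comparison, here is a sketch of a different technique, a small recursive-descent parser, that produces a similarly shaped dictionary without `eval`. It is not the PR's method. It assumes that each condition is a single token such as `a>1`, that parentheses are space-separated tokens, and it does not handle `between`; `parse_where` is a hypothetical name.

```python
def parse_where(tokens):
    """Parse a token list into a nested dict; 'and' binds tighter than 'or'."""
    pos = 0

    def parse_or():
        nonlocal pos
        left = parse_and()
        if pos < len(tokens) and tokens[pos] == 'or':
            pos += 1
            return {'or': {'left': left, 'right': parse_or()}}
        return left

    def parse_and():
        nonlocal pos
        left = parse_not()
        if pos < len(tokens) and tokens[pos] == 'and':
            pos += 1
            return {'and': {'left': left, 'right': parse_and()}}
        return left

    def parse_not():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError('unexpected end of where clause')
        if tokens[pos] == 'not':
            pos += 1
            return {'not': parse_not()}
        if tokens[pos] == '(':
            pos += 1
            inner = parse_or()
            if pos >= len(tokens) or tokens[pos] != ')':
                raise ValueError('missing closing parenthesis')
            pos += 1
            return inner
        # Anything else is treated as a single-token condition.
        condition = tokens[pos]
        pos += 1
        return condition

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError(f'unexpected token: {tokens[pos]}')
    return tree


print(parse_where('a>1 or b<2 and not c=3'.split()))
```

Operator priority comes from the call order (`parse_or` calls `parse_and`, which calls `parse_not`), so no parentheses need to be inserted into the token list beforehand.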

def interpret(query):
'''
Interpret the query.
@@ -174,12 +350,13 @@ def interpret(query):
'lock table': ['lock table', 'mode'],
'unlock table': ['unlock table', 'force'],
'delete from': ['delete from', 'where'],
'update table': ['update table', 'set', 'where'],
'update': ['update', 'set', 'where'],
'create index': ['create index', 'on', 'using'],
'drop index': ['drop index'],
'create view' : ['create view', 'as']
}


if query[-1]!=';':
query+=';'

@@ -196,14 +373,14 @@ def execute_dic(dic):
Execute the given dictionary
'''
for key in dic.keys():
if isinstance(dic[key],dict):
if isinstance(dic[key], dict) and key == 'from':
dic[key] = execute_dic(dic[key])

action = list(dic.keys())[0].replace(' ','_')
return getattr(db, action)(*dic.values())
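
The change to `execute_dic()` restricts recursive execution to the `from` key, so that a parsed `where` dictionary is passed to the database method as data instead of being executed as a sub-plan. A minimal sketch of that dispatch pattern, using a hypothetical `FakeDatabase` stand-in whose methods and signatures are assumptions, not miniDB's real `Database` API:

```python
class FakeDatabase:
    """Hypothetical stand-in for a database object, for illustration only."""

    def join(self, mode, left, right, condition):
        return f'({left} {mode} join {right} on {condition})'

    def select(self, columns, table, where=None):
        return ('select', columns, table, where)


def execute(db, plan):
    # Only a nested 'from' plan is executed eagerly; other dict values
    # (for example a parsed 'where') are passed through untouched.
    for key, value in plan.items():
        if key == 'from' and isinstance(value, dict):
            plan[key] = execute(db, value)
    # The first key names the action; spaces become underscores.
    action = next(iter(plan)).replace(' ', '_')
    return getattr(db, action)(*plan.values())


plan = {
    'select': '*',
    'from': {'join': 'inner', 'left': 'a', 'right': 'b', 'on': 'a.id=b.id'},
    'where': {'not': 'x>1'},
}
print(execute(FakeDatabase(), plan))
```

The dispatch depends on dictionary insertion order: the action must be the first key, and the remaining values must line up with the method's positional parameters.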

def interpret_meta(command):
"""
'''
Interpret meta commands. These commands are used to handle DB stuff, something that can not be easily handled with mSQL given the current architecture.

The available meta commands are:
@@ -212,7 +389,7 @@ def interpret_meta(command):
lstb - list tables
cdb - change/create database
rmdb - delete database
"""
'''
action = command.split(' ')[0].removesuffix(';')

db_name = db._name if search_between(command, action,';')=='' else search_between(command, action,';')
@@ -251,9 +428,8 @@ def remove_db(db_name):
dbname = os.getenv('DB')

db = Database(dbname, load=True)

#db.print_statistics() # uncomment to print statistics


if fname is not None:
for line in open(fname, 'r').read().splitlines():
if line.startswith('--'): continue
@@ -280,19 +456,37 @@ def remove_db(db_name):
line+=';'
except (KeyboardInterrupt, EOFError):
print('\nbye!')
try:
db.calculate_tables_statistics() # calculate statistics before exiting.
except Exception:
pass # if the database is not loaded or does not exist, do not calculate statistics.
break
try:
if line=='exit':
if line=='exit;':
print('\nbye!')
try:
db.calculate_tables_statistics() # calculate statistics before exiting.
except Exception:
pass # if the database is not loaded or does not exist, do not calculate statistics.
break
if line.split(' ')[0].removesuffix(';') in ['lsdb', 'lstb', 'cdb', 'rmdb']:
interpret_meta(line)
elif line.startswith('explain'):
dic = interpret(line.removeprefix('explain '))


pprint(dic, sort_dicts=False)
else:
dic = interpret(line)

if 'select' in dic.keys() and not (isinstance(dic['from'],str) and dic['from'].startswith('meta')):
queries, is_valid = multiple_query_plans(dic)
if is_valid:
dic = evaluate_query_plans(db,queries)

result = execute_dic(dic)
if isinstance(result,Table):
result.show()
except Exception:
except Exception as e:
print(traceback.format_exc())
print(e)