The problem is stated here.
I will first attempt explain the challenge simply, since from my recent experience, it seems that a lot of software engineering exercises have as much to do with reading comprehension as with actual programming skills.
Then I will describe my approach and also present an alternative quick solution.
Finally I will paste the annotated code and make the script available for download.
You get a 2 lists of directories – one is a list of “existing” directories, the second is a list of directories we want to “create”. The file system in question is imaginary, as we are dealing simply with the logical aspect of creating directories as per the unix directory structure standard (“dir/subdir/subdir”).
Once the program understands the existing tree, it will then calculate how many “mkdir”‘s it will take to satisfy the list of “to create” dirs.
Coming from a SysAdmin background, my initial solution was very straightforward, and used basic GNU tools to get the job done, however it had very little to do with programming. It was more like mkdir -pv /test/dir/subdir/ | wc -l. With a bit of wrangling that basic piece of logic should actually be able to handle the whole thing.
However, as the problem also includes the input parsing, and in fact, what was at question here are meaningful engineering skills, rather than creative workarounds (those have their place as well don’t get me wrong), I went on to write the following script:
### Google Code Challenge "File Fix-it" ###
### Solution by Tom Goren ###
# Input: T - No. of test cases
# N - No. of lines that pertain to dirs already existing
# M - No. of lines that are dirs we want to create
# Output: Case No. and no. of moves it takes to create
##############
# Example: #
# Input #
# #
# 2 # T = 2
# 0 2 # N=0, M=2
# /fiz/boo # No dirs 'already existing', need to create 2 (/fiz and /fiz/boo)
# /fiz/bot # /fiz already created in last move, now only need to create /fiz/bot
# 1 1 # N=1, M=1
# /ox/shuk # /ox/shuk 'already exists'
# /ox/shmi # only need to create /ox/shmi - 1 move
##############
# Output #
# #
# Case #1: 3 #
# Case #2: 1 #
# #
# #
##############
import sys
# define helpful function, for nice output
printf = sys.stdout.write
# Read input file
#input_file = "A-large-practice.in" # Google's large set
input_file = "A-small-practice.in" # Google's small set
infile = open(input_file)
def tree_build(base_tree,dirs):
""" helper function for building the dir tree """
move_count = 0
tree = base_tree # python dictionaries are special in assignment (this is the trick)
for subdir in dirs:
if subdir not in tree:
tree[subdir] = {} # dir is top level
move_count += 1
tree = tree[subdir] # next traverse will be inside element
return move_count
def mkdir_moves(existing, to_create):
""" Build multi-dimensional array with dictionaries for the 'existing' dirs,
and then check what it will take to create 'non-existent' ones.
Return no. of moves """
base_tree = {}
for exist in existing:
exist = exist.strip().split("/") # cleanup
exist.pop(0) # more cleanup
tree_build(base_tree,exist) # pass empty dict and dir in form of list
move_count = 0
for create in to_create:
create = create.strip().split("/")
create.pop(0)
move_count += tree_build(base_tree,create)
return move_count
# start parsing
N = 0
M = 0
case_index = 0
line_counter = 0
case_counter = 1
existing_lines = []
create_lines = []
for line in infile:
if line_counter == 0 and int(line) > 0:
no_of_cases = line
line_spl = line.split()
if len(line_spl) > 1: # splitting by spaces - only case definitions should have a space in them ('N M')
N, M = int(line_spl[0]), int(line_spl[1])
if N > 100 or N < 0 or M > 100 or M < 0:
#printf("bad input for line: %s (bad case definitions)" % line_counter)
#sys.exit(1)
continue
case_index = line_counter # for lines within each case
# printf("DEBUG N: %s, M: %s\n" % (N, M))
printf("\nCase #%s: "% case_counter) # Output
existing_lines = [] # reset list
create_lines = [] # reset list
case_counter += 1 # advance to next case counter
elif len(line) > 100:
#printf("bad input for line: %s (longer than 100 chars)" % line_counter)
#sys.exit(1)
continue
else:
case_length = N + M
# printf("DEBUG case length: %s\n" % case_length)
delta = line_counter - case_index
if delta <= N:
""" lines that exist """
existing_lines += [line]
elif delta > N and delta < case_length:
""" lines to create """
create_lines += [line]
elif delta == case_length:
create_lines += [line]
get_moves = mkdir_moves(existing_lines, create_lines)
# printf("DEBUG existing_lines: %s\n create_lines: %s\n" % (existing_lines, create_lines) )
printf("%s" % get_moves)
line_counter += 1
You may notice that there are two meaningless clauses checking the validity of the parsed lines (see “bad input”). This is due to some seeming incongruity between the challenge definition and the provided test cases.
If these clauses are ignored, the script solves the problem to their checking mechanism’s satisfaction – feel free to comment them out and test it.
Also let me know if there is something I misunderstood in the definition that “No path will have more than 100 characters in it.“.
Anyhow, here is the script for download, along with the provided test cases.
Thanks for your post, this was my attempt:
def __init__(self):
self.existingdirs = []
self.newdirs = []
self.mkdirs = 0
class Dir:
def __init__(self, fullpath):
self.fullpath = fullpath
def parent(self):
return self.fullpath[:self.fullpath.rfind('/')]
#inputfile = 'A-small-practice.in'
inputfile = 'A-large-practice.in'
with open(inputfile) as f:
theinput = f.readlines()
numtests = int(theinput.pop(0))
for i in range(0,numtests):
t = Test()
numexistingdirs, numnewdirs = theinput.pop(0).split()
for i1 in range(0,int(numexistingdirs)):
t.existingdirs.append(theinput.pop(0).strip())
for i1 in range(0,int(numnewdirs)):
d = Dir(theinput.pop(0).strip())
t.newdirs.append(d)
for currentdir in t.newdirs:
direxists = False
if not currentdir.fullpath in t.existingdirs:
while currentdir.parent() and not direxists:
if currentdir.parent() in t.existingdirs:
direxists = True
else:
t.mkdirs += 1
t.existingdirs.append(currentdir.fullpath)
currentdir = Dir(currentdir.parent())
if currentdir.fullpath not in t.existingdirs:
t.mkdirs += 1
t.existingdirs.append(currentdir.fullpath)
print('Case #%s: %s' % (i + 1, t.mkdirs))
Link | April 12th, 2012 at 23:35
Ok tab spaces are ignored…oh well
Link | April 12th, 2012 at 23:35
Fixed that for you and added formatting [didn't do anything else
]
Link | April 13th, 2012 at 00:30
I like it! Concise, readable and object oriented. Awesome job, thanks for sharing!
Link | April 13th, 2012 at 00:33